From: "Stefan Kanthak" <stefan.kanthak@nexgo.de>
To: "Szabolcs Nagy" <nsz@port70.net>
Cc: <musl@lists.openwall.com>
Subject: Re: [musl] [Patch] src/math/i386/remquo.s: remove conditional branch, shorter bit twiddling
Date: Wed, 4 Aug 2021 12:02:58 +0200
Message-ID: <DFEFCEFB42FD4CBB9CD45CD321B57F1A@H270>
In-Reply-To: <20210803202735.GA37904@port70.net>

"Szabolcs Nagy" <nsz@port70.net> wrote:
>* Stefan Kanthak <stefan.kanthak@nexgo.de> [2021-08-01 17:59:52 +0200]:
>> Halve the number of instructions (from 12 to 6) to fetch the
>> (3-bit partial) quotient from the FPU flags C0:C3:C1, and
>> perform its negation without conditional branch.
>
> i haven't tested it but it looks good.
This is basically well-tested code I wrote about 20 years ago
for my own NOMSVCRT.LIB: I always found the bit twiddling of
J.T. Conklin's code rather awful.
> i think we should not tweak x87 asm code too much though.
> it can introduce bugs and there are not many users of it.
> i think only the size saving can justify keeping any i386
> math code at all.
From your own FAQ <http://www.musl-libc.org/faq.html>
| When will it be finished?
| When there's nothing left to remove.
The change just follows that motto by removing 6 LOC/instructions .-)
> but i'm not against committing this.
> thanks for the patch.
regards
Stefan
>> --- -/math/i386/remquo.s
>> +++ +/math/i386/remquo.s
>> @@ -2,49 +2,44 @@
>> .type remquof,@function
>> remquof:
>> mov 12(%esp),%ecx
>> + mov 8(%esp),%eax
>> + xor 4(%esp),%eax
>> flds 8(%esp)
>> flds 4(%esp)
>> - mov 11(%esp),%dh
>> - xor 7(%esp),%dh
>> - jmp 1f
>> + jmp 0f
>>
>> .global remquol
>> .type remquol,@function
>> remquol:
>> mov 28(%esp),%ecx
>> + mov 24(%esp),%eax
>> + xor 12(%esp),%eax
>> + cwtl
>> fldt 16(%esp)
>> fldt 4(%esp)
>> - mov 25(%esp),%dh
>> - xor 13(%esp),%dh
>> - jmp 1f
>> + jmp 0f
>>
>> .global remquo
>> .type remquo,@function
>> remquo:
>> mov 20(%esp),%ecx
>> + mov 16(%esp),%eax
>> + xor 8(%esp),%eax
>> fldl 12(%esp)
>> fldl 4(%esp)
>> - mov 19(%esp),%dh
>> - xor 11(%esp),%dh
>> +0: cltd
>> 1: fprem1
>> fnstsw %ax
>> sahf
>> jp 1b
>> fstp %st(1)
>> - mov %ah,%dl
>> - shr %dl
>> - and $1,%dl
>> - mov %ah,%al
>> - shr $5,%al
>> - and $2,%al
>> - or %al,%dl
>> - mov %ah,%al
>> - shl $2,%al
>> - and $4,%al
>> - or %al,%dl
>> - test %dh,%dh
>> - jns 1f
>> - neg %dl
>> -1: movsbl %dl,%edx
>> - mov %edx,(%ecx)
>> + adc %al,%al
>> + shl $2,%ah
>> + adc %al,%al
>> + shl $5,%ah
>> + adc %al,%al
>> + and $7,%eax
>> + xor %edx,%eax
>> + sub %edx,%eax
>> + mov %eax,(%ecx)
>> ret
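
For readers who do not read x87 assembly fluently, the two steps of the
patch can be modelled in C. This is only an illustrative sketch, not part
of the patch: the function names quo_bits, sign_mask and negate_if are
hypothetical, the bit positions assume the usual x87 status word layout
(C0 = bit 8, C1 = bit 9, C3 = bit 14), and x_hi/y_hi stand for the 32-bit
words holding the sign bits of the two arguments.

```c
#include <stdint.h>

/* Assumed x87 status word layout: C0 = bit 8, C1 = bit 9, C3 = bit 14. */
#define X87_C0 (1u << 8)
#define X87_C1 (1u << 9)
#define X87_C3 (1u << 14)

/*
 * Assemble the 3-bit partial quotient Q2:Q1:Q0 = C0:C3:C1.
 * Each step shifts the accumulator left and appends one flag,
 * which is what each "adc %al,%al" in the patch does with the carry.
 */
static unsigned quo_bits(uint16_t sw)
{
    unsigned q = 0;
    q = (q << 1) | ((sw & X87_C0) != 0);
    q = (q << 1) | ((sw & X87_C3) != 0);
    q = (q << 1) | ((sw & X87_C1) != 0);
    return q;
}

/*
 * Hypothetical helper: 0 if both sign bits agree, -1 (all ones) if they
 * differ. This mirrors the xor of the two high words followed by cltd.
 */
static int32_t sign_mask(uint32_t x_hi, uint32_t y_hi)
{
    return -(int32_t)((x_hi ^ y_hi) >> 31);
}

/*
 * Branchless conditional negation: with mask == 0 the value is returned
 * unchanged; with mask == -1 the xor flips all bits and the subtraction
 * adds 1, giving the two's complement negation (xor/sub in the patch).
 */
static int32_t negate_if(int32_t v, int32_t mask)
{
    return (v ^ mask) - mask;
}
```

With these definitions, negate_if((int32_t)quo_bits(sw), sign_mask(x_hi, y_hi))
corresponds to the value the patched routine stores through the quotient
pointer. For example, quo_bits(0x4300) is 7, since C0, C3 and C1 are all set.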