From: "Stefan Kanthak"
To: "Alexander Monakov"
Cc: "Szabolcs Nagy"
Reply-To: musl@lists.openwall.com
Date: Fri, 6 Aug 2021 12:17:12 +0200
Organization: Me, myself & IT
Message-ID: <6C4DCCC86B014B68877D73C798F54180@H270>
References: <04BD4026EE364FF7AFBAF8C593E9A2E7@H270> <20210803202735.GA37904@port70.net>
Subject: Re: [musl] [Patch] src/math/i386/remquo.s: remove conditional branch, shorter bit twiddling

Alexander Monakov wrote:

> On Wed, 4 Aug 2021, Stefan Kanthak wrote:
>> The change just follows by removing 6 LOC/instructions.-)
>
> Have you considered collecting the three bits in one go via a multiplication?

No. My mind is not that twisted;-)

> You can first isolate the necessary bits with 'and $0x4300, %eax', then do
> 'imul $0x910000, %eax, %eax' to put the required bits in EAX[31:29] in the
> right order, then shift right by 29. Three instructions, 14 bytes.

Thanks, VERY NICE! How did you come up with it?

Revised patch with shorter bit twiddling attached.

Stefan

Attachment: remquo.patch

--- -remquo.s
+++ +remquo.s
@@ -2,49 +2,41 @@
 .type remquof,@function
 remquof:
 	mov 12(%esp),%ecx
+	mov 8(%esp),%eax
+	xor 4(%esp),%eax
 	flds 8(%esp)
 	flds 4(%esp)
-	mov 11(%esp),%dh
-	xor 7(%esp),%dh
-	jmp 1f
+	jmp 0f
 
 .global remquol
 .type remquol,@function
 remquol:
 	mov 28(%esp),%ecx
+	mov 24(%esp),%eax
+	xor 12(%esp),%eax
+	cwtl
 	fldt 16(%esp)
 	fldt 4(%esp)
-	mov 25(%esp),%dh
-	xor 13(%esp),%dh
-	jmp 1f
+	jmp 0f
 
 .global remquo
 .type remquo,@function
 remquo:
 	mov 20(%esp),%ecx
+	mov 16(%esp),%eax
+	xor 8(%esp),%eax
 	fldl 12(%esp)
 	fldl 4(%esp)
-	mov 19(%esp),%dh
-	xor 11(%esp),%dh
+0:	cltd
 1:	fprem1
 	fnstsw %ax
 	sahf
 	jp 1b
 	fstp %st(1)
-	mov %ah,%dl
-	shr %dl
-	and $1,%dl
-	mov %ah,%al
-	shr $5,%al
-	and $2,%al
-	or %al,%dl
-	mov %ah,%al
-	shl $2,%al
-	and $4,%al
-	or %al,%dl
-	test %dh,%dh
-	jns 1f
-	neg %dl
-1:	movsbl %dl,%edx
-	mov %edx,(%ecx)
+	and $0x4300,%eax
+	imul $0x910000,%eax,%eax
+	shr $29,%eax
+	xor %edx,%eax
+	sub %edx,%eax
+	mov %eax,(%ecx)
 	ret
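
The multiplication trick in the patch can be checked exhaustively with a small C model. This is only a sketch and not part of the patch: the names `gather_shifts`, `gather_mul`, `apply_sign` and `check_gather` are hypothetical, and the model assumes the usual x87 status-word layout (C0 = bit 8, C1 = bit 9, C3 = bit 14), with fprem1 leaving the low quotient bits as Q2 = C0, Q1 = C3, Q0 = C1. One way to read the constant: 0x910000 has bits 16, 20 and 23 set, so the product is the sum of three shifted copies of the masked value. C3 lands at 14+16 = 30, C1 at 9+20 = 29 and C0 at 8+23 = 31. The stray copies land at bits 24, 25 and 28 or overflow past bit 31, so no two copies collide, no carry is generated, and the shift by 29 discards the strays.

```c
#include <assert.h>
#include <stdint.h>

/* x87 condition flags as they appear in AX after fnstsw %ax. */
#define SW_C0 (1u << 8)
#define SW_C1 (1u << 9)
#define SW_C3 (1u << 14)

/* Model of the removed lines: move each flag into place separately. */
static uint32_t gather_shifts(uint32_t sw)
{
    uint32_t ah = (sw >> 8) & 0xffu;  /* AH = status-word bits 15..8 */
    uint32_t q0 = (ah >> 1) & 1u;     /* shr %dl;    and $1,%dl  (C1) */
    uint32_t q1 = (ah >> 5) & 2u;     /* shr $5,%al; and $2,%al  (C3) */
    uint32_t q2 = (ah << 2) & 4u;     /* shl $2,%al; and $4,%al  (C0) */
    return q0 | q1 | q2;
}

/* Model of the added lines: mask, multiply, shift. */
static uint32_t gather_mul(uint32_t eax)
{
    eax &= 0x4300u;     /* and  $0x4300,%eax */
    eax *= 0x910000u;   /* imul $0x910000,%eax,%eax (low 32 bits kept) */
    return eax >> 29;   /* shr  $29,%eax */
}

/* Model of cltd / xor / sub: negate q when bit 31 of sign_word is set.
 * The final cast assumes the usual two's-complement conversion. */
static int32_t apply_sign(uint32_t q, uint32_t sign_word)
{
    uint32_t mask = (sign_word >> 31) ? 0xffffffffu : 0u; /* cltd */
    return (int32_t)((q ^ mask) - mask);  /* xor %edx,%eax; sub %edx,%eax */
}

/* Compare both variants for every 16-bit status word. The upper half of
 * EAX is filled with ones to model leftover bits that the mask must drop. */
static void check_gather(void)
{
    for (uint32_t sw = 0; sw <= 0xffffu; sw++)
        assert(gather_mul(sw | 0xffff0000u) == gather_shifts(sw));
}
```

Calling `check_gather()` from a test driver compares the two variants over all 65536 status words. `apply_sign` models the branchless replacement for the `test`/`jns`/`neg` sequence: with an all-ones mask, `(q ^ mask) - mask` is the two's-complement negation of `q`, and with a zero mask it leaves `q` unchanged.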