Date: Fri, 6 Aug 2021 10:27:02 -0400
From: Rich Felker
To: Stefan Kanthak
Cc: Alexander Monakov, Szabolcs Nagy, musl@lists.openwall.com
Message-ID: <20210806142702.GV13220@brightrain.aerifal.cx>
References: <04BD4026EE364FF7AFBAF8C593E9A2E7@H270> <20210803202735.GA37904@port70.net> <6C4DCCC86B014B68877D73C798F54180@H270>
In-Reply-To: <6C4DCCC86B014B68877D73C798F54180@H270>
Subject: Re: [musl] [Patch] src/math/i386/remquo.s: remove conditional branch, shorter bit twiddling

On Fri, Aug 06, 2021 at 12:17:12PM +0200, Stefan Kanthak wrote:
> Alexander Monakov wrote:
>
> > On Wed, 4 Aug 2021, Stefan Kanthak wrote:
> >> The change just follows by removing 6 LOC/instructions.-)
> >
> > Have you considered collecting the three bits in one go via a multiplication?
>
> No. My mind is not that twisted;-)
>
> > You can first isolate the necessary bits with 'and $0x4300, %eax', then do
> > 'imul $0x910000, %eax, %eax' to put the required bits in EAX[31:29] in the
> > right order, then shift right by 29. Three instructions, 14 bytes.
>
> Thanks, VERY NICE!
> How did you come up to it?
>
> Revised patch with shorter bit twiddling attached.

The path forward for all the math asm is moving it to inline asm in C
files, with no flow control or bit/register shuffling in the asm, only
using asm for the single instructions. See how Alexander Monakov did
x86_64 remquol in commit 19f870c3a68a959c7c6ef1de12086ac908920e5e.

I haven't read the mul trick here in detail but I believe it should be
duplicable with plain C * operator.

I really do not want to review/merge asm changes that keep this kind
of complex logic in asm when there's no strong motivation for it (like
fixing an actual bug, vs just reducing size or improving speed). The
risk to reward ratio is just not reasonable.

Rich
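
For illustration, here is a rough sketch of how the quoted and/imul/shr sequence might look with the plain C * operator. It is not taken from the patch or from musl. The function names are hypothetical, and it assumes the argument is the x87 status word as stored by fnstsw after fprem/fprem1 has completed, with the low quotient bits in C0 (bit 8, Q2), C3 (bit 14, Q1) and C1 (bit 9, Q0) as documented for those instructions.

```c
#include <stdint.h>

/* Hypothetical helper: gather Q2:Q1:Q0 from an x87 status word with one
 * multiplication, mirroring 'and $0x4300' / 'imul $0x910000' / 'shr $29'. */
static unsigned quo_bits_mul(uint16_t fpsw)
{
    /* Keep only C3 (bit 14), C1 (bit 9) and C0 (bit 8). */
    uint32_t x = fpsw & 0x4300u;

    /* 0x910000 has bits 23, 20 and 16 set. Modulo 2^32 the product moves
     * C0 to bit 31, C3 to bit 30 and C1 to bit 29. The other partial
     * products land on distinct lower bits or overflow out of the
     * 32-bit result, so no carry disturbs the top three bits. */
    x *= 0x910000u;

    /* The top three bits are now Q2, Q1, Q0 in that order. */
    return x >> 29;
}

/* Straightforward bit-by-bit version, for comparison only. */
static unsigned quo_bits_ref(uint16_t fpsw)
{
    unsigned q2 = (fpsw >> 8) & 1u;   /* C0 */
    unsigned q1 = (fpsw >> 14) & 1u;  /* C3 */
    unsigned q0 = (fpsw >> 9) & 1u;   /* C1 */
    return q2 << 2 | q1 << 1 | q0;
}
```

In a C implementation along the lines described above, only the fprem1 and fnstsw steps would need inline asm. The bit gathering would stay in C, where the compiler is free to pick the imul form or something else.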