From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Patches for math subtree
Date: Sun, 8 Dec 2019 09:54:15 -0500
Message-ID: <20191208145415.GE1666@brightrain.aerifal.cx>
References: <20191207203804.GC1666@brightrain.aerifal.cx>
In-Reply-To: <20191207203804.GC1666@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Sat, Dec 07, 2019 at 03:38:04PM -0500, Rich Felker wrote:
> On Sat, Dec 07, 2019 at 09:15:34PM +0100, Stefan Kanthak wrote:
> > Just some optimisations.
> >
> > --- -/src/math/i386/remquo.s
> > +++ +/src/math/i386/remquo.s
> > @@ -23,23 +23,17 @@
> >  remquo:
> >      mov 20(%esp),%ecx
> >      fldl 12(%esp)
> >      fldl 4(%esp)
> >      mov 19(%esp),%dh
> >      xor 11(%esp),%dh
> >  1:  fprem1
> >      fnstsw %ax
> >      sahf
> >      jp 1b
> >      fstp %st(1)
> > -    mov %ah,%dl
> > -    shr %dl
> > -    and $1,%dl
> > -    mov %ah,%al
> > -    shr $5,%al
> > -    and $2,%al
> > -    or %al,%dl
> > -    mov %ah,%al
> > -    shl $2,%al
> > -    and $4,%al
> > -    or %al,%dl
> > +    setc %dl
> > +    shl $2,%ah
> > +    adc %dl,%dl
> > +    shl $5,%ah
> > +    adc %dl,%dl
> >      test %dh,%dh
> >
> > --- -/src/math/ceil.c
> > +++ +/src/math/ceil.c
> > @@ -18,10 +18,10 @@
> > +    /* special case because of non-nearest rounding modes */
> > +    if (e < 0x3ff) {
> > +        FORCE_EVAL(y);
> > +        return u.i >> 63 ? -0.0 : 1.0;
> > +    }
> >      /* y = int(x) - x, where int(x) is an integer neighbor of x */
> >      if (u.i >> 63)
> >          y = x - toint + toint - x;
> >      else
> >          y = x + toint - toint - x;
> > -    /* special case because of non-nearest rounding modes */
> > -    if (e <= 0x3ff-1) {
> > -        FORCE_EVAL(y);
> > -        return u.i >> 63 ? -0.0 : 1;
> > -    }
> >
> > --- -/src/math/floor.c
> > +++ +/src/math/floor.c
> > @@ -18,10 +18,10 @@
> > +    /* special case because of non-nearest rounding modes */
> > +    if (e < 0x3ff) {
> > +        FORCE_EVAL(y);
> > +        return u.i >> 63 ? -1.0 : 0.0;
> > +    }
> >      /* y = int(x) - x, where int(x) is an integer neighbor of x */
> >      if (u.i >> 63)
> >          y = x - toint + toint - x;
> >      else
> >          y = x + toint - toint - x;
> > -    /* special case because of non-nearest rounding modes */
> > -    if (e <= 0x3ff-1) {
> > -        FORCE_EVAL(y);
> > -        return u.i >> 63 ? -1 : 0;
> > -    }
> >

Do you have any explanation of why these are optimizations?
Specifically, the x86 asm one looks like it probably is, but I haven't
read closely enough to verify.
If it's a measurable improvement I'll try to take a look at it soon,
but at some point all of the .s files here are slated for removal and
replacement with inline asm in .c files that avoids all of the delicate
flow/logic in asm and just uses the x87 instructions needed, so I don't
want to spend a lot of effort on improving and validating improvements
to them.

For the latter two, the patches as written are wrong. They evaluate an
uninitialized variable y. And I think these functions are required to
set the status flags, so you can't just remove that. Maybe there's some
alternate way to do it that would be faster, like just evaluating
x±toint rather than the whole expression, but I'm not sure it helps.

Rich