Date: Tue, 10 Aug 2021 23:34:55 +0200
From: Szabolcs Nagy
To: Stefan Kanthak
Cc: musl@lists.openwall.com
Message-ID: <20210810213455.GB37904@port70.net>
References: <0C6AAAD55DA44C6189B2FF4F5FB2C3E7@H270>
In-Reply-To: <0C6AAAD55DA44C6189B2FF4F5FB2C3E7@H270>
Subject: Re: [musl] [PATCH] Properly simplified nextafter()

* Stefan Kanthak [2021-08-10 08:23:46 +0200]:
>
> has quite some superfluous statements:
>
> 1. there's absolutely no need for 2 uint64_t holding |x| and |y|;
> 2. IEEE-754 specifies -0.0 == +0.0, so (x == y) is equivalent to
>    (ax == 0) && (ay == 0): the latter 2 tests can be removed;

you replaced 4 int cmps with 4 float cmps (among other things).

it's target dependent if float compares are fast or not.

(the i386 machine where i originally tested this preferred int
cmp and float cmp was very slow in the subnormal range and iirc
it also raises the non-standard input denormal exception, which
is fine i guess. of course soft float abis much prefer int cmp
so your code is likely much slower and bigger there).
but i'm not against the change, it is likely better on modern
machines. did you try to benchmark it? or check the code size?

> 3. there's absolutely no need to compare the signs of x and y
>    with the sign of the direction: it's sufficient to test that
>    direction and sign of x match;
> 4. a proper compiler/optimizer should be able to reuse the results
>    of the comparison (x == y) for (x < y) or (x > y) and
>    (x == 0.0) for (x < 0.0) or (x > 0.0).
>
> JFTR: if ((x < 0.0) == (x < y)) is equivalent to
>       if ((x > 0.0) == (x > y))
>
> --- -/src/math/nextafter.c
> +++ +/src/math/nextafter.c
> @@ -3,20 +3,15 @@
>  double nextafter(double x, double y)
>  {
>  	union {double f; uint64_t i;} ux={x}, uy={y};
> -	uint64_t ax, ay;
>  	int e;
>
>  	if (isnan(x) || isnan(y))
>  		return x + y;
> -	if (ux.i == uy.i)
> +	if (x == y)
>  		return y;
> -	ax = ux.i & -1ULL/2;
> -	ay = uy.i & -1ULL/2;
> -	if (ax == 0) {
> -		if (ay == 0)
> -			return y;
> +	if (x == 0.0)
>  		ux.i = (uy.i & 1ULL<<63) | 1;
> -	} else if (ax > ay || ((ux.i ^ uy.i) & 1ULL<<63))
> +	else if ((x < 0.0) == (x < y))
>  		ux.i--;
>  	else
>  		ux.i++;
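
for anyone who wants to poke at the branch logic without building libc, here is a minimal standalone sketch of the patched code path. the names nextafter_sketch and check_examples are made up for this example. the hunk above stops at ux.i++, so the rest of the real function (including whatever it does with e) is not reproduced and the sketch simply returns ux.f. the expected values are derived from the binary64 layout, not taken from any benchmark or test run in this thread.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* hypothetical name; mirrors only the branches visible in the patch.
 * the tail of the real function is not part of the quoted hunk and is
 * deliberately left out, so no fp exception flags are raised here. */
double nextafter_sketch(double x, double y)
{
	union {double f; uint64_t i;} ux={x}, uy={y};

	if (isnan(x) || isnan(y))
		return x + y;
	if (x == y)		/* float compare, so -0.0 == +0.0 lands here */
		return y;
	if (x == 0.0)		/* smallest subnormal carrying the sign of y */
		ux.i = (uy.i & 1ULL<<63) | 1;
	else if ((x < 0.0) == (x < y))
		ux.i--;		/* magnitude shrinks: step toward zero */
	else
		ux.i++;		/* magnitude grows: step away from zero */
	return ux.f;
}

/* spot checks for each branch, using hex float literals (C99) */
void check_examples(void)
{
	assert(nextafter_sketch(1.0, 2.0) == 1.0 + 0x1p-52);
	assert(nextafter_sketch(1.0, 0.0) == 1.0 - 0x1p-53);
	assert(nextafter_sketch(-1.0, 0.0) == -(1.0 - 0x1p-53));
	assert(nextafter_sketch(-1.0, -2.0) == -(1.0 + 0x1p-52));
	assert(nextafter_sketch(0.0, 1.0) == 0x1p-1074);
	assert(nextafter_sketch(-0.0, -1.0) == -0x1p-1074);
	assert(nextafter_sketch(0x1p-1074, 0.0) == 0.0);
	assert(nextafter_sketch(INFINITY, 0.0) == 0x1.fffffffffffffp+1023);
	assert(nextafter_sketch(0x1.fffffffffffffp+1023, INFINITY) == INFINITY);
	assert(signbit(nextafter_sketch(0.0, -0.0)));	/* returns y */
	assert(isnan(nextafter_sketch(NAN, 1.0)));
}
```

calling check_examples() from a small main() exercises every branch. it says nothing about speed or code size, which still need measuring per target.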