From: "Stefan Kanthak"
To: "Szabolcs Nagy"
Reply-To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH #2] Properly simplified nextafter()
Date: Sun, 15 Aug 2021 17:19:05 +0200
Message-ID: <1F3569BD7D6E45889B7518DC9BE5004B@H270>
In-Reply-To: <20210815145614.GI37904@port70.net>
References: <0C6AAAD55DA44C6189B2FF4F5FB2C3E7@H270> <20210810213455.GB37904@port70.net> <20210814234612.GH37904@port70.net> <367A4018B58A4E308E2A95404362CBFB@H270> <20210815145614.GI37904@port70.net>

Szabolcs Nagy wrote:
> * Stefan Kanthak [2021-08-15 09:04:55 +0200]:
>> Szabolcs Nagy wrote:
>>> you should benchmark, but the second best is to look
>>> at the longest dependency chain in the hot path and
>>> add up the instruction latencies.
>>
>> 1 billion calls to nextafter(), with random from, and to either 0 or +INF:
>> run 1 against glibc,                         8.58 ns/call
>> run 2 against musl original,                 3.59
>> run 3 against musl patched,                  0.52
>> run 4 the pure floating-point variant from   0.72
>>       my initial post in this thread,
>> run 5 the assembly variant I posted.         0.28 ns/call
>
> thanks for the numbers. it's not the best measurement

IF YOU DON'T LIKE IT, PERFORM YOUR OWN MEASUREMENT!

> but shows some interesting effects.

It clearly shows that musl's current implementation SUCKS, at least
on AMD64.

>>
>> Now hurry up and patch your slowmotion code!
>>
>> Stefan
>>
>> PS: I cheated a very tiny little bit: the isnan() macro of musl patched is
>>
>> #ifdef PATCH
>> #define isnan(x) ( \
>>   sizeof(x) == sizeof(float) ? (__FLOAT_BITS(x) << 1) > 0xff000000U : \
>>   sizeof(x) == sizeof(double) ? (__DOUBLE_BITS(x) << 1) > 0xffe0000000000000ULL : \
>>   __fpclassifyl(x) == FP_NAN)
>> #else
>> #define isnan(x) ( \
>>   sizeof(x) == sizeof(float) ? (__FLOAT_BITS(x) & 0x7fffffff) > 0x7f800000 : \
>>   sizeof(x) == sizeof(double) ? (__DOUBLE_BITS(x) & -1ULL>>1) > 0x7ffULL<<52 : \
>>   __fpclassifyl(x) == FP_NAN)
>> #endif // PATCH
>
> i think on x86 this only changes an and to an add
> (or nothing at all if the compiler is smart)

BETTER THINK TWICE: where does the mask needed for the and come from?
Does it need an extra register?
How do you (for example) build it on ARM?

> if this is measurable that's an uarch issue of your cpu.

ARGH: it's not the and that makes the difference!
JFTR: movabs $0x7ff0000000000000, %r*x is a 10 byte instruction

I recommend reading Intel's and AMD's processor optimisation manuals
and learning just a little bit!

[braindead fullquote removed]

Stefan
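
For reference, a minimal sketch of the two isnan() formulations discussed in the PS, written as plain functions so they can be compared side by side. The helper names (float_bits, double_bits, isnan_mask_*, isnan_shift_*, check_isnan_variants) are hypothetical stand-ins, not musl's actual __FLOAT_BITS/__DOUBLE_BITS macros, and the float threshold is assumed to be 0xff000000U, i.e. 0x7f800000 shifted left by one. The sketch only checks that both variants classify the same values as NaN; it says nothing about their relative speed.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for musl's internal bit-access macros. */
static uint32_t float_bits(float f)   { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static uint64_t double_bits(double d) { uint64_t u; memcpy(&u, &d, sizeof u); return u; }

/* Mask variant: clear the sign bit with an AND, compare against +INF.
   The AND needs the mask constant to be materialised somewhere. */
static int isnan_mask_f(float f)  { return (float_bits(f) & 0x7fffffffU) > 0x7f800000U; }
static int isnan_mask_d(double d) { return (double_bits(d) & (-1ULL >> 1)) > (0x7ffULL << 52); }

/* Shift variant: shift the sign bit out, compare against (+INF << 1).
   Assumed float threshold: 0x7f800000 << 1 == 0xff000000. */
static int isnan_shift_f(float f)  { return (uint32_t)(float_bits(f) << 1) > 0xff000000U; }
static int isnan_shift_d(double d) { return (double_bits(d) << 1) > 0xffe0000000000000ULL; }

/* Check that both variants agree with the C library's isnan()
   on a few representative values, in both float and double. */
static void check_isnan_variants(void)
{
    const double samples[] = {
        0.0, -0.0, 1.0, -1.5, 1e-30, INFINITY, -INFINITY, NAN, -NAN
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        double d = samples[i];
        float  f = (float)d;
        assert(isnan_mask_d(d)  == !!isnan(d));
        assert(isnan_shift_d(d) == !!isnan(d));
        assert(isnan_mask_f(f)  == !!isnan(f));
        assert(isnan_shift_f(f) == !!isnan(f));
    }
}
```

Calling check_isnan_variants() from a test program exercises both variants. Comparing the compiler's output for the two (for example with -O2 -S) is one way to see where the mask constant comes from on a given target.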