Date: Sat, 21 Mar 2020 13:53:51 -0400
From: Rich Felker
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Message-ID: <20200321175351.GJ11469@brightrain.aerifal.cx>
In-Reply-To: <20200107130605.7618-1-amonakov@ispras.ru>
References: <20200107130605.7618-1-amonakov@ispras.ru>
Subject: Re: [musl] [PATCH] math: move i386 sqrt to C

On Tue, Jan 07, 2020 at 04:06:05PM +0300, Alexander Monakov wrote:
> ---
> Since union ldshape does not have a dedicated field for 32 least significant
> bits of the x87 long double mantissa, keeping the original approach with
>
>     ux.i.m -= (fpsr & 0x200) - 0x100;
>
> would lead to a 64-bit subtraction that is not trivial for the compiler to
> optimize to 32-bit subtraction as done in the original assembly. Therefore
> I have elected to change the approach and use
>
>     ux.i.m ^= (fpsr & 0x200) + 0x200;
>
> which is easier to optimize to a 32-bit rather than 64-bit xor.
>
> Thoughts?

I'm getting test failures with sqrt and this seems to be the culprit --
I don't think it's equivalent. The original version could offset the
value by +0x100 or -0x100 before rounding, and offsets in the opposite
direction of the rounding that already occurred. Your version can only
offset it by +0x200 or -0x400.

The (well, one) particular failing case is:

src/math/ucb/sqrt.h:49: RU sqrt(0x1.fffffffffffffp+1023) want 0x1p+512 got 0x1.fffffffffffffp+511 ulperr -0.250 = -0x1p-1 + 0x1p-2

Here the mantissa is fffffffffffffc00 and offset by -0x400 yields:

fffffffffffff800

which has exactly 53 bits and therefore does not round up like it
should.

I still like your approach better if there's a way to salvage it. Do
you see one?

Rich
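
The arithmetic behind the failing case can be checked with a small standalone sketch. It is not the musl code: the mantissa is modelled as a bare uint64_t rather than a field of union ldshape, and the names adjust_sub, adjust_xor, low11 and show are hypothetical helpers introduced only for illustration. It assumes, as in the case above, that the low 11 bits of the 64-bit mantissa are exactly 0x400 and that bit 0x200 of the status word reports that the x87 result was rounded up.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: the subtraction form from the original approach.
 * Offsets the mantissa by -0x100 if bit 0x200 of fpsr is set, +0x100 if not. */
uint64_t adjust_sub(uint64_t m, unsigned fpsr)
{
	int64_t d = (int64_t)(fpsr & 0x200) - 0x100;
	return m - (uint64_t)d;
}

/* Hypothetical helper: the xor form from the patch.
 * Flips bit 0x400 if bit 0x200 of fpsr is set, bit 0x200 if not. */
uint64_t adjust_xor(uint64_t m, unsigned fpsr)
{
	return m ^ (uint64_t)((fpsr & 0x200) + 0x200);
}

/* Bits that are discarded when a 64-bit mantissa is rounded to 53 bits.
 * If they are all zero the value is exact in double and no rounding occurs. */
unsigned low11(uint64_t m)
{
	return (unsigned)(m & 0x7ff);
}

/* Print both adjustments for one halfway-case mantissa. */
void show(uint64_t m, unsigned fpsr)
{
	/* The adjustment is only meant for the exact halfway pattern. */
	assert(low11(m) == 0x400);
	printf("sub: %016" PRIx64 " (low bits %#x)\n",
	       adjust_sub(m, fpsr), low11(adjust_sub(m, fpsr)));
	printf("xor: %016" PRIx64 " (low bits %#x)\n",
	       adjust_xor(m, fpsr), low11(adjust_xor(m, fpsr)));
}
```

Calling show(0xfffffffffffffc00, 0x200) makes the difference visible: the subtraction form leaves nonzero low bits, so the final conversion to double still has something to round upward, while the xor form clears them and the conversion becomes exact.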