From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Using float_t and double_t in math functions
Date: Wed, 8 May 2013 21:43:27 -0400
Message-ID: <20130509014327.GA6338@brightrain.aerifal.cx>
To: musl@lists.openwall.com

Hi all,

Today I've been doing some experimenting on the relative math performance of musl and glibc. After eliminating a lot of bogus results (the gcc 4.4 on my test machine (x86) was causing musl's configure to use -ffloat-store, which kills performance) things mostly look good. Aside from sqrt (which is more costly on musl because glibc's violates the requirement of correct rounding), everything I'm testing seems faster, in some cases up to five times faster.

While debugging the slowdown from -ffloat-store, one thing I ran across is that a lot of functions end up performing store/load pairs to drop excess precision when storing intermediate results. The situation is much worse with -ffloat-store, but persists with modern gcc because of -fexcess-precision=standard, which is implied anyway by -std=c99. As far as I can tell, in most of the affected code, keeping excess precision does not hurt the accuracy of the result, and it might even improve the results.

Thus, nsz and I discussed (on IRC) the possibility of changing intermediate variables in functions that can accept excess precision from float and double to float_t and double_t. This would not affect the generated code at all on machines without excess precision, but on x86 (without SSE) it eliminates all the costly store/load pairs. As an example (on my test machine), it dropped the cost of sinf(0.25) from 180 cycles to 130 cycles (glibc takes 140 cycles, the main difference apparently being that glibc's math library updates errno).

Unless there are objections, I think we should change float and double intermediate variables in the implementations of math functions to float_t and double_t, except where it's actually important to avoid excess precision.

Comments?

Rich
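
For illustration, the kind of change proposed above might look roughly like the sketch below. It is not taken from musl's source: the function names (sin_small_float, sin_small_float_t, eval_method) are made up for this example, and the coefficients are plain Taylor terms rather than the tuned constants a real libm would use. The only difference between the two kernels is the type of the intermediate variables.

```c
#include <assert.h>
#include <float.h>
#include <math.h>   /* provides float_t and double_t (C99) */

/* Taylor coefficients for sin(x); illustrative only, not musl's constants. */
static const float S1 = -1.0f/6, S2 = 1.0f/120, S3 = -1.0f/5040;

/* Reports how the compiler evaluates floating-point expressions:
 * 0 = in the declared type, 2 = in long double (classic x87 case).
 * When it is 0, float_t is the same type as float. */
int eval_method(void)
{
    return FLT_EVAL_METHOD;
}

/* Intermediates declared as float. With excess precision and
 * -fexcess-precision=standard, each assignment has to round to float,
 * which on x87 typically costs a store/load pair. */
float sin_small_float(float x)
{
    assert(x >= -0.5f && x <= 0.5f);   /* sketch is only meant for small |x| */
    float z = x*x;
    float r = S1 + z*(S2 + z*S3);
    return (float)(x + x*z*r);
}

/* Same computation with float_t intermediates. Any excess precision is
 * kept until the single explicit cast at the end. */
float sin_small_float_t(float x)
{
    assert(x >= -0.5f && x <= 0.5f);
    float_t z = x*x;
    float_t r = S1 + z*(S2 + z*S3);
    return (float)(x + x*z*r);
}
```

On a target where eval_method() returns 0 the two kernels should compile to the same code, which matches the claim that the change is free on machines without excess precision. Whether the float_t variant is actually faster elsewhere depends on the compiler and flags, so it would need to be measured rather than assumed.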