From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] x86: optimize fp_arch.h
Date: Wed, 24 Apr 2019 22:01:08 -0400
Message-ID: <20190425020108.GP23599@brightrain.aerifal.cx>
References: <20190424235106.GH26605@port70.net>
In-Reply-To: <20190424235106.GH26605@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Thu, Apr 25, 2019 at 01:51:06AM +0200, Szabolcs Nagy wrote:
> tested on x86_64 and i386
> From 5f97370ff3e94bea812ec123a31d7482965a3b1b Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy
> Date: Wed, 24 Apr 2019 23:29:05 +0000
> Subject: [PATCH] x86: optimize fp_arch.h
>
> Use fp register constraint instead of volatile store when sse2 math is
> available, and use memory constraint when only x87 fpu is available.
> ---
>  arch/i386/fp_arch.h   | 31 +++++++++++++++++++++++++++++++
>  arch/x32/fp_arch.h    | 25 +++++++++++++++++++++++++
>  arch/x86_64/fp_arch.h | 25 +++++++++++++++++++++++++
>  3 files changed, 81 insertions(+)
>  create mode 100644 arch/i386/fp_arch.h
>  create mode 100644 arch/x32/fp_arch.h
>  create mode 100644 arch/x86_64/fp_arch.h
>
> diff --git a/arch/i386/fp_arch.h b/arch/i386/fp_arch.h
> new file mode 100644
> index 00000000..b4019de2
> --- /dev/null
> +++ b/arch/i386/fp_arch.h
> @@ -0,0 +1,31 @@
> +#ifdef __SSE2_MATH__
> +#define FP_BARRIER(x) __asm__ __volatile__ ("" : "+x"(x))
> +#else
> +#define FP_BARRIER(x) __asm__ __volatile__ ("" : "+m"(x))
> +#endif

I guess for float and double you need the "m" constraint to ensure
that a broken compiler doesn't skip dropping of precision (although I
still wish we didn't bother with complexity to support that, and just
relied on cast working correctly), but at least for long double
couldn't we use an x87 register constraint to avoid the spill to
memory?

Rich
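
For illustration only, here is a minimal sketch of what an x87 register
constraint for long double could look like. It assumes GCC-style extended
asm, where "t" names the top of the x87 stack, st(0). The names
FP_BARRIERL, barrier_ld and check_barrier are hypothetical and do not
appear in the patch; the non-x86 branch is just a volatile round-trip so
the sketch builds elsewhere.

```c
#include <assert.h>

/* Hypothetical macro name; not taken from the patch under discussion. */
#if defined(__i386__) || defined(__x86_64__)
/* "+t": read-write operand held in st(0), so no spill to memory. */
#define FP_BARRIERL(x) __asm__ __volatile__ ("" : "+t"(x))
#else
/* Assumed fallback for other targets: round-trip through a volatile. */
#define FP_BARRIERL(x) do { volatile long double v_ = (x); (x) = v_; } while (0)
#endif

/* Pass a long double through the barrier and return it. */
long double barrier_ld(long double x)
{
    FP_BARRIERL(x);
    return x;
}

/* Usage check: the barrier hides the value from the optimizer
 * but must not change it. */
void check_barrier(void)
{
    long double x = 1.0L / 3.0L;
    assert(barrier_ld(x) == x);
}
```

Since an x87 register already holds the full long double format, no
precision is dropped here, which is why the "m" constraint discussed for
float and double would not be needed in this case.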