Date: Mon, 20 Jan 2020 23:22:31 -0500
From: Rich Felker
To: musl@lists.openwall.com
Message-ID: <20200121042231.GO30412@brightrain.aerifal.cx>
References: <20200116161427.GO30412@brightrain.aerifal.cx> <20200116193343.GP30412@brightrain.aerifal.cx> <20200117145350.GR30412@brightrain.aerifal.cx>
Subject: Re: [musl] Considering x86-64 fenv.s to C

On Tue, Jan 21, 2020 at 02:53:53PM +1100, Damian McGuckin wrote:
>
> On Thu, 16 Jan 2020, Rich Felker wrote:
>
> >Would you be interested in assessing what kind of abstraction makes
> >sense here?
>
> I think it is quite difficult, but eventually feasible.
>
> Even having one abstract version for i386/x32 and x86_64 is not easy.
>
> My thoughts were to do an abstraction that works for at least those three,
> simplify this to be even more abstract, and then see how well it
> works for say something else.
> The i386/x32 and x86 are arguably among the worst as
> they effectively have 2 lots of status and control registers which are
> not synced on-chip but that need to be for MUSL.

It's possible that the x86 variants are actually the worst fit for the
abstraction, and should be left separate, while the rest are unified.

> The only assembler in which I have even limited skills is Sparc32/64
> which is not terribly useful for MUSL but in terms of an
> abstraction, may be as good as anything. I will be investing in an
> ARM soon but my skills will be starting from a base of none.

If you don't feel ready to do unification or work on archs you're
unfamiliar with, I think it's okay to either (1) only do the x86 work
now, with no unification, or (2) start the unification in
src/fenv/*.c, but with the arch files left in place in
src/fenv/*/*.[csS] for all the archs that haven't been converted yet.
I don't want to block improvement of the x86 versions just because
the bigger task is too big.

> On Fri, 17 Jan 2020, Rich Felker wrote:
>
> >As you said above, updating x87 status register is expensive because
> >the only way to write it is to read-modify-write the whole fenv. But
> >since we know on x86_64 we have sse registers, we can just move all
> >the flags to the sse register, then use fnclex to clear the x87 one
> >inexpensively, and the effective set of raised flags remains the same.
> >
> >I think we could do this on i386 too with a couple tricks:
> >
> >1. Do the same thing if sse is available (hwcap check).
>
> Yes.
>
> >2. If sse is not available, clear all flags then re-raise the desired
> >set via arithmetic operations.
>
> That works. That said, based on a comment earlier today, my
> thoughts are to use an arithmetic expression for the case where only
> a single exception was active, including the pairs INEXACT/OVERFLOW
> and INEXACT/UNDERFLOW, and use a fegetenv/set-register/fesetenv for
> anything more complex.

I think arithmetic should be far better for *any* case it works on.

Another really stupid but perhaps very efficient idea is to just
emulate the flags. Add a TLS slot for an fexcept_t value, move
exceptions there as needed, and OR it into the result when reading
back the current exceptions. This would also make it dirt cheap for
the math library to raise any exception it wants, without needing
arithmetic, and it would make it possible to have the math library
return errors via exception flags even on softfloat archs.

Rich