Re: [PATCH] fmax(), fmaxf(), fmaxl(), fmin(), fminf(), fminl() simplified

mailing list of musl libc
 help / color / mirror / code / Atom feed

From: Rich Felker <dalias@libc.org>
To: Stefan Kanthak <stefan.kanthak@nexgo.de>
Cc: Szabolcs Nagy <nsz@port70.net>, musl@lists.openwall.com
Subject: Re: [PATCH] fmax(), fmaxf(), fmaxl(), fmin(), fminf(), fminl() simplified
Date: Wed, 11 Dec 2019 16:30:39 -0500	[thread overview]
Message-ID: <20191211213039.GP1666@brightrain.aerifal.cx> (raw)
In-Reply-To: <438FA5B9D0E8497FA998E44E99C7CC8B@H270>

On Wed, Dec 11, 2019 at 10:17:09PM +0100, Stefan Kanthak wrote:
> "Szabolcs Nagy" <nsz@port70.net> wrote:
> 
> 
> >* Stefan Kanthak <stefan.kanthak@nexgo.de> [2019-12-11 13:33:44 +0100]:
> >> "Szabolcs Nagy" <nsz@port70.net> wrote:
> >> >* Stefan Kanthak <stefan.kanthak@nexgo.de> [2019-12-11 10:55:29 +0100]:
> >> > these two are not equivalent for snan input, but we dont care
> >> > about snan, nor the compiler by default, so the compiler can
> >> > optimize one to the other (although musl uses explicit int
> >> > arithmetics instead of __builtin_isnan so it's a bit harder).
> >> 
> >> The latter behaviour was my reason to use (x != x) here: I attempt
> >> to replace as many function calls as possible with "normal" code,
> >> and also try to avoid transfers to/from FPU/SSE registers to/from
> >> integer registers if that does not result in faster/shorter code.
> > 
> > why not just change the definition of isnan then?
> 
> Because I did not want to introduce such a global change; until now my
> patches are just local (peephole) optimisations.
> 
> > #if __GNUC__ > xxx
> > #define isnan(x) sizeof(x)==sizeof(float) ? __builtin_isnanf(x) : ...
> 
> This is better than my proposed change, as it also avoids the side-
> effect of (x != x) which can raise exceptions, and gets rid of the
> explicit transfer to integer registers, which can hurt performance.
> 
> The macros isinf(), isnormal(), isfinite(), signbit() should of
> course be implemented in a similar way too, and the (internal only?)
> functions __FLOAT_BITS() and __DOUBLE_BITS() removed completely!

Not removed because the public headers support non-GNUC (or older GCC?
I forget when these were introduced) compilers that may not provide
these. Having the portable definitions present as the fallback case is
still desirable.

> PS: the following is just a "Gedankenspiel", extending the idea to
>     avoid transfers from/to SSE registers.
>     On x86-64, functions like isunordered(), copysign() etc. may be
>     implemented using SSE intrinsics _mm_*() as follows:
> 
> #include <immintrin.h>
> 
> int signbit(double argument)
> {
>     return /* 1 & */ _mm_movemask_pd(_mm_set_sd(argument));
> }

This is just a missed optimization the compiler should be able to do
without intrinsics, on any arch where floating point types are kept in
vector registers that can also do integer/bitmask operations.

> uint32_t lrint(double argument)
> {
>     return _mm_cvtsd_si32(_mm_set_sd(argument));
> }

This is already done (on x86_64 where it's valid). It's in an asm
source file but should be converted to a C source file with __asm__
and proper constraint, not intrinsics, because __asm__ is a compiler
feature we require support for and intrinsics aren't (and also they
have some really weird semantics with respect to how they interface
with C aliasing rules).

> double copysign(double magnitude, double sign)
> {
>     return _mm_cvtsd_f64(_mm_or_pd(_mm_and_pd(_mm_set_sd(-0.0), _mm_set_sd(sign)),
>                                    _mm_andnot_pd(_mm_set_sd(-0.0), _mm_set_sd(magnitude))));
> }

I don't think we have one like this for x86_64, but ideally the C
would compile to something like it. (See above about missed
optimization.)

>     Although the arguments and results are all held in SSE registers,
>     there's no way to use them directly; it's but necessary to
>     transfer them using _mm_set_sd() and _mm_cvtsd_f64(), which may
>     result in superfluous instructions emitted by the compiler.

I don't see why you say that. They should be used in-place if possible
just by virtue of how the compiler's IR works. Certainly for the
__asm__ form they will be used in-place.

>     If you but cheat and "hide" these functions from the compiler
>     by placing them in a library, you can implement them as follows:
> 
> __m128d fmin(__m128d x, __m128d y)
> {
>     __m128d mask = _mm_cmp_sd(x, x, _CMP_ORD_Q);
> 
>     return _mm_or_pd(_mm_and_pd(mask, _mm_min_sd(y, x)),
>                      _mm_andnot_pd(mask, y));
> }

Yes, this kind of thing (hacks with declaring functions with wrong
type to achieve an ABI result) is not something we really do in musl.
But it shouldn't be needed here.

Rich

next prev parent reply	other threads:[~2019-12-11 21:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-11  9:55 Stefan Kanthak
2019-12-11 10:49 ` Szabolcs Nagy
2019-12-11 12:33   ` Stefan Kanthak
2019-12-11 13:16     ` Szabolcs Nagy
2019-12-11 13:25       ` Rich Felker
2019-12-11 21:17       ` Stefan Kanthak
2019-12-11 21:30         ` Rich Felker [this message]
2019-12-11 22:25           ` Stefan Kanthak
2019-12-11 22:14         ` Damian McGuckin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191211213039.GP1666@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    --cc=nsz@port70.net \
    --cc=stefan.kanthak@nexgo.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).