mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [musl] qsort
Date: Sat, 11 Feb 2023 08:35:33 -0500
Message-ID: <20230211133532.GD4163@brightrain.aerifal.cx>
In-Reply-To: <CQFM48UU024L.3F72QJSEDJMQ@sumire>

On Sat, Feb 11, 2023 at 10:06:02AM +0100, alice wrote:
> On Sat Feb 11, 2023 at 9:39 AM CET, Joakim Sindholt wrote:
> > On Sat, 11 Feb 2023 06:44:29 +0100, "alice" <alice@ayaya.dev> wrote:
> > > based on the glibc profiling, glibc also has its natively loaded,
> > > CPU-specific optimisations, the _avx_ functions in your case. musl
> > > doesn't implement any SIMD optimisations, so this is a bit
> > > apples-to-oranges unless musl implements the same kind of native
> > > per-arch optimisation.
> > > 
> > > you should rerun these with GLIBC_TUNABLES, from something in:
> > > https://www.gnu.org/software/libc/manual/html_node/Hardware-Capability-Tunables.html
> > > which should let you disable them all (if you just want to compare C to C code).
> > > 
> > > ( unrelated, but has there been some historic discussion of implementing
> > >   something similar in musl? i feel like i might be forgetting something. )
> >
> > There already are arch-specific asm implementations of functions like
> > memcpy.
> 
> apologies, i wasn't quite clear: the difference between
> src/string/x86_64/memcpy.s and the glibc fiesta is that the latter
> utilises subarch-specific SIMD (as you explain below), e.g. AVX as in
> the benchmarks above. comparing against a baseline x86_64 asm would be
> fairer game if the difference is as significant as it is for memcpy :)

Folks are missing the point here. This has nothing to do with AVX, or
even with glibc's memcpy, making glibc faster here. Rather, it's that
glibc is *not calling memcpy* at all for 4-byte element sizes (and
likely a bunch of other specialized cases). Either they special-case
them manually, or the compiler (due to the lack of -ffreestanding, and
likely -O3 or something) is inlining the memcpy.
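
As a rough illustration of the inlining point (a sketch only, not code
from either libc): a memcpy whose size is a small compile-time constant
is normally expanded inline by the compiler, so no call is ever
emitted, unless something like -ffreestanding or -fno-builtin
suppresses that transformation.

#include <string.h>

/* Hypothetical helper. Built hosted at -O2/-O3, the constant-size
 * memcpy below typically compiles to one 32-bit load and one 32-bit
 * store; there is no call to memcpy in the object code. */
static void copy4(void *dst, const void *src)
{
	memcpy(dst, src, 4);
}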

Based on the profiling data, I would predict an instant 2x speed boost
from special-casing small element sizes to swap directly, with no
memcpy call.
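
A minimal sketch of that special-casing (assumed names and structure,
not musl's actual qsort internals): check the element width once and
swap common small widths directly, falling back to a generic byte-wise
swap for everything else. The fixed-size memcpy calls in the fast
paths are expected to be inlined as plain loads and stores, so no
library call happens there either.

#include <stdint.h>
#include <string.h>

/* Illustrative swap routine a sort loop might call; "swap_elems" and
 * the chosen widths are assumptions for the sketch, not existing API. */
static void swap_elems(unsigned char *a, unsigned char *b, size_t width)
{
	if (width == 4) {
		uint32_t t;
		memcpy(&t, a, 4);
		memcpy(a, b, 4);
		memcpy(b, &t, 4);
	} else if (width == 8) {
		uint64_t t;
		memcpy(&t, a, 8);
		memcpy(a, b, 8);
		memcpy(b, &t, 8);
	} else {
		/* generic fallback: swap one byte at a time */
		while (width--) {
			unsigned char t = *a;
			*a++ = *b;
			*b++ = t;
		}
	}
}

The cutoff widths here are arbitrary; the point is only that the
small-size paths never reach an external memcpy.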

Incidentally, our memcpy is almost surely at least as fast as glibc's
for 4-byte copies. It's at very large sizes that performance is likely
to diverge.

Rich

Thread overview: 34+ messages
2023-01-20  1:49 Guy
2023-01-20 12:55 ` alice
2023-01-30 10:04   ` [musl] " David Wang
2023-02-01 18:01     ` Markus Wichmann
2023-02-02  2:12       ` [musl] " David Wang
2023-02-03  5:22         ` [musl] " David Wang
2023-02-03  8:03           ` Alexander Monakov
2023-02-03  9:01             ` [musl] " David Wang
2023-02-09 19:03       ` Rich Felker
2023-02-09 19:20         ` Alexander Monakov
2023-02-09 19:52           ` Rich Felker
2023-02-09 20:18             ` Rich Felker
2023-02-09 20:27               ` Pierpaolo Bernardi
2023-02-10  4:10             ` Markus Wichmann
2023-02-10 10:00         ` [musl] " David Wang
2023-02-10 13:10           ` Rich Felker
2023-02-10 13:45             ` [musl] " David Wang
2023-02-10 14:19               ` Rich Felker
2023-02-11  5:12                 ` [musl] " David Wang
2023-02-11  5:44                   ` alice
2023-02-11  8:39                     ` Joakim Sindholt
2023-02-11  9:06                       ` alice
2023-02-11  9:31                         ` [musl] " David Wang
2023-02-11 13:35                         ` Rich Felker [this message]
2023-02-11 17:18                           ` David Wang
2023-02-16 15:15       ` David Wang
2023-02-16 16:07         ` Rich Felker
2023-02-17  1:35           ` [musl] " David Wang
2023-02-17 13:17           ` Alexander Monakov
2023-02-17 15:07             ` Rich Felker
2023-02-11  9:22     ` [musl] " Markus Wichmann
2023-02-11  9:36       ` [musl] " David Wang
2023-02-11  9:51       ` David Wang
2023-01-20 13:32 ` [musl] qsort Valery Ushakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save this message as an mbox file, import it into your mail client,
  and reply-to-all from there.

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230211133532.GD4163@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header via
  mailto: links, you can reply that way as well. Be sure your reply has
  a Subject: header at the top and a blank line before the message
  body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox; see the mirroring instructions for how to
clone and mirror all data and code used for this inbox, as well as the
URLs for its NNTP newsgroup(s).