mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Markus Wichmann <nullplan@gmx.net>
To: musl@lists.openwall.com
Subject: Re: [musl] Re:Re: [musl] qsort
Date: Sat, 11 Feb 2023 10:22:49 +0100	[thread overview]
Message-ID: <20230211092249.GC1903@voyager> (raw)
In-Reply-To: <4d290220.36d6.1860222ca46.Coremail.00107082@163.com>

On Mon, Jan 30, 2023 at 06:04:39PM +0800, David Wang wrote:
> On average, qsort with musl is significantly slower than c++ sort,
> while with glibc the average performance of qsort is bettern than c++
> sort.

Getting back to this, with the benchmarks recently performed, I notice
one area where C++ is able to significantly outperform C, and that is in
the area of generic functions. C++ std::sort is a template function, and
the compiler can optimize the code for the situation. For example, in
your case, you only ever need to swap two elements by way of copying
four bytes, which the compiler can just inline.

In C, the same code can only call memcpy(), and a call to cycle() will
involve many calls to memcpy(), and they cannot be inlined since the
compiler will not know the copy size at the time it is compiling the
calls. We end up with tons of calls to memcpy() with a size of four,
when memcpy() is optimized for large sizes.

Merge sort, as glibc implements it, allows at least a few memcpy() calls
with larger sizes. Glibc also avoids calling memcpy by duplicating the
code for common sizes, only resorting to memcpy() for the default case.
That may be an avenue worth pursuing, at least for cycle().

I'm not quite certain why glibc chose the sizes they chose for
msort_with_tmp(), but they are branching out for 32-bit integers, 64-bit
integers, longs (which in a System V ABI are always one of the former
two) and void* (which in a System V ABI are always the same as a long).
I think special casing 32-bit and 64-bit integers ought to get us most
of the way there, since those are the most common sizes that can still
be copied simply inline.

Also note that glibc further reduces the need for memcpy() by sorting
indirectly if the size of a single element exceeds 32 bytes. That is of
course something they can do, since they already are allocating memory
and have a fall-back strategy in place. Not sure if this is worthwhile
for musl. At least in the static linking case, it would suddenly pull
malloc() and free() into executables that previously did not have those.

That last optimization would also not have any bearing on the current
batch of benchmarks, since in those, a size of four is set in stone.

Ciao,
Markus

  parent reply	other threads:[~2023-02-11  9:23 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-20  1:49 Guy
2023-01-20 12:55 ` alice
2023-01-30 10:04   ` [musl] " David Wang
2023-02-01 18:01     ` Markus Wichmann
2023-02-02  2:12       ` [musl] " David Wang
2023-02-03  5:22         ` [musl] " David Wang
2023-02-03  8:03           ` Alexander Monakov
2023-02-03  9:01             ` [musl] " David Wang
2023-02-09 19:03       ` Rich Felker
2023-02-09 19:20         ` Alexander Monakov
2023-02-09 19:52           ` Rich Felker
2023-02-09 20:18             ` Rich Felker
2023-02-09 20:27               ` Pierpaolo Bernardi
2023-02-10  4:10             ` Markus Wichmann
2023-02-10 10:00         ` [musl] " David Wang
2023-02-10 13:10           ` Rich Felker
2023-02-10 13:45             ` [musl] " David Wang
2023-02-10 14:19               ` Rich Felker
2023-02-11  5:12                 ` [musl] " David Wang
2023-02-11  5:44                   ` alice
2023-02-11  8:39                     ` Joakim Sindholt
2023-02-11  9:06                       ` alice
2023-02-11  9:31                         ` [musl] " David Wang
2023-02-11 13:35                         ` Rich Felker
2023-02-11 17:18                           ` David Wang
2023-02-16 15:15       ` David Wang
2023-02-16 16:07         ` Rich Felker
2023-02-17  1:35           ` [musl] " David Wang
2023-02-17 13:17           ` Alexander Monakov
2023-02-17 15:07             ` Rich Felker
2023-02-11  9:22     ` Markus Wichmann [this message]
2023-02-11  9:36       ` [musl] Re:Re: [musl] " David Wang
2023-02-11  9:51       ` David Wang
2023-01-20 13:32 ` [musl] qsort Valery Ushakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230211092249.GC1903@voyager \
    --to=nullplan@gmx.net \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).