mailing list of musl libc
From: "David Wang" <00107082@163.com>
To: musl@lists.openwall.com
Subject: [musl] Re:Re: [musl] qsort
Date: Sun, 12 Feb 2023 01:18:18 +0800 (CST)
Message-ID: <c0682e6.1a.186417c1d3a.Coremail.00107082@163.com>
In-Reply-To: <20230211133532.GD4163@brightrain.aerifal.cx>




At 2023-02-11 21:35:33, "Rich Felker" <dalias@libc.org> wrote:

>Based on the profiling data, I would predict an instant 2x speed boost
>special-casing small sizes to swap directly with no memcpy call.
>

I made some experimental changes, using a different cycle function for widths 4, 8 and 16:
--- a/src/stdlib/qsort.c
+++ b/src/stdlib/qsort.c
...
+static void cyclex1(unsigned char* ar[], int n)
+{
+    unsigned char tmp[32];
+    int i;
+    int *p1, *p2;
+    if(n < 2) {
+        return;
+    }
+    ar[n] = tmp;
+    p1 = (int*)ar[n];
+    p2 = (int*)ar[0];
+    *p1=*p2;
+    for(i = 0; i < n; i++) {
+        p1 = (int*)ar[i];
+        p2 = (int*)ar[i+1];
+        p1[0]=p2[0];
+    }
+}
+static void cyclex2(unsigned char* ar[], int n)
+{
+    unsigned char tmp[32];
+    int i;
+    long long *p1, *p2;
+    if(n < 2) {
+        return;
+    }
+    ar[n] = tmp;
+    p1 = (long long*)ar[n];
+    p2 = (long long*)ar[0];
+    *p1=*p2;
+    for(i = 0; i < n; i++) {
+        p1 = (long long*)ar[i];
+        p2 = (long long*)ar[i+1];
+        p1[0]=p2[0];
+    }
+}
+static void cyclex4(unsigned char* ar[], int n)
+{
+    unsigned char tmp[32];
+    int i;
+    long long *p1, *p2;
+    if(n < 2) {
+        return;
+    }
+    ar[n] = tmp;
+    p1 = (long long*)ar[n];
+    p2 = (long long*)ar[0];
+    *p1++=*p2++;
+    *p1++=*p2++;
+    for(i = 0; i < n; i++) {
+        p1 = (long long*)ar[i];
+        p2 = (long long*)ar[i+1];
+        p1[0]=p2[0];
+        p1[1]=p2[1];
+    }
+}
+

-       cycle(width, ar, i);
+    if (width==4) cyclex1(ar, i);
+    else if (width==8) cyclex2(ar, i);
+    else if (width==16) cyclex4(ar, i);
+    else cycle(width, ar, i);
---
I am not experienced in writing high-performance code; the above is what I could come up with for now.
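
The casts to int* and long long* above assume the elements are suitably aligned for those types, which qsort's interface does not promise. A possible alternative, shown here only as a rough sketch, keeps memcpy but passes it a literal size, so that an optimizing compiler can usually expand it into plain loads and stores. Whether that expansion happens depends on the compiler and flags (GCC's -fno-builtin and -ffreestanding disable it, for example). The names cycle_small and cycle_dispatch are made up for illustration and are not part of musl:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not musl code): rotate the n elements addressed by
 * ar[0..n-1] one step, so ar[i] receives ar[i+1] and ar[n-1] receives the
 * old ar[0]. Like cycle(), it uses ar[n] as a scratch slot, so ar must
 * have room for n+1 pointers. Requires width <= 16. */
static inline void cycle_small(size_t width, unsigned char *ar[], int n)
{
	unsigned char tmp[16];
	int i;

	if (n < 2) return;

	ar[n] = tmp;
	memcpy(ar[n], ar[0], width);
	for (i = 0; i < n; i++)
		memcpy(ar[i], ar[i + 1], width);
}

/* Hypothetical dispatcher: each call passes a literal width so that, after
 * inlining, every memcpy has a compile-time-constant size. Returns 0 when
 * the width is not special-cased and the caller should fall back to the
 * generic cycle(). */
static int cycle_dispatch(size_t width, unsigned char *ar[], int n)
{
	switch (width) {
	case 4:  cycle_small(4, ar, n);  return 1;
	case 8:  cycle_small(8, ar, n);  return 1;
	case 16: cycle_small(16, ar, n); return 1;
	default: return 0;
	}
}
```

The call site would then read something like `if (!cycle_dispatch(width, ar, i)) cycle(width, ar, i);`. I have not measured this variant, so it may or may not match the numbers below.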

A rough timing report follows:
+-------------------------+-----------+----------+-----------+
|        item size        |   glibc   |   musl   |  opt musl |
+-------------------------+-----------+----------+-----------+
|          4 int          | 0m15.794s | 1m 7.52s | 0m 37.27s |
|          8 long         | 0m16.351s | 1m 2.92s | 0m 45.12s |
| 16 struct{ long k, v; } | 0m23.262s | 1m 9.74s | 0m 55.07s |
+-------------------------+-----------+----------+-----------+
(128 rounds, each running qsort on 1<<20 randomly shuffled items)
The test code for the 16-byte case:

#include <stdio.h>
#include <stdlib.h>


typedef struct { long long k, v; } VNode;
int mycmp(const void *a, const void *b) {
    long long d = ((const VNode*)a)->v - ((const VNode*)b)->v;
    if (d>0) return 1;
    else if (d<0) return -1;
    return 0;
}

#define MAXN (1<<20)
VNode vs[MAXN];

int main() {
    int i, j, k, n;
    long long t;
    for (k=0; k<128; k++) {
        for (i=0; i<MAXN; i++) vs[i].v=i;
        for (n=MAXN; n>1; n--) {
            i=n-1; j=rand()%n;
            if (i!=j) { t=vs[i].v; vs[i].v=vs[j].v; vs[j].v=t; }
        }
        qsort(vs, MAXN, sizeof(vs[0]), mycmp);
        for (i=0; i<MAXN; i++) if (vs[i].v!=i) { printf("error\n"); return 1; }
    }
    return 0;
}
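
The test program above also spends time on shuffling and on verifying the result in every round. To time only the sort itself, one option is to wrap the qsort call with the standard clock() function. This is just a sketch; qsort_cpu_seconds and cmp_ll are hypothetical names, and clock() reports CPU time rather than wall-clock time:

```c
#include <stdlib.h>
#include <time.h>

/* Example comparator for long long keys, used only to exercise the helper. */
static int cmp_ll(const void *a, const void *b)
{
	long long x = *(const long long *)a;
	long long y = *(const long long *)b;
	return (x > y) - (x < y);
}

/* Hypothetical helper: CPU seconds spent in a single qsort() call. */
static double qsort_cpu_seconds(void *base, size_t nmemb, size_t width,
                                int (*cmp)(const void *, const void *))
{
	clock_t start = clock();
	qsort(base, nmemb, width, cmp);
	clock_t end = clock();
	return (double)(end - start) / CLOCKS_PER_SEC;
}
```

In the loop above, the qsort line could become `total += qsort_cpu_seconds(vs, MAXN, sizeof(vs[0]), mycmp);` with `total` declared as a double and printed at the end.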


The largest improvement is for sorting 4-byte ints (roughly 1.8x, versus roughly 1.4x for 8-byte and 1.3x for 16-byte items in the table above); as the data size increases, the impact of the memcpy call overhead decreases.




>Incidentally, our memcpy is almost surely at least as fast as glibc's
>for 4-byte copies. It's very large sizes where performance is likely
>to diverge.
>
>Rich
