From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-3.1 required=5.0 tests=DKIM_INVALID,DKIM_SIGNED, MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 7436 invoked from network); 9 Mar 2021 03:44:32 -0000 Received: from mother.openwall.net (195.42.179.200) by inbox.vuxu.org with ESMTPUTF8; 9 Mar 2021 03:44:32 -0000 Received: (qmail 32619 invoked by uid 550); 9 Mar 2021 03:44:28 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 32577 invoked from network); 9 Mar 2021 03:44:28 -0000 X-Virus-Scanned: Debian amavisd-new at disroot.org From: =?UTF-8?q?=C3=89rico=20Nogueira?= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=disroot.org; s=mail; t=1615261454; bh=N5gYea/v7lgUK3YfiFSk0CALmQ69q4XeQ4Ro8t3JHlE=; h=From:To:Cc:Subject:Date; b=i6OPew1+tG6tr6NDUylVasBqx9xAW3o1vB8xVHfKMeYH9T3zWRgJZYyVhhYfFpL/Y 4OY311K0k74IJouvFb6/CZw4dM/2OPt1Hbj1i1mwMq7nl4k+Xb2hXzX/jUVPawgcYi 6SpoGqo14W2DY04v06L6SuT6jCw9zKxoV8ZKMSTWgzT0Vopn7h+w2DYVbZXWguysAR vEixAieAItFytt0PTN/xcEDtnlr0DiXaNdLZZqlCW6FgbMAYMhJtA0OB/OBYg8A76m H38rNeVTperX8T08PRNVze5r34WCSRrre1x6VUkNpUnxEKMdiW+xUs4vRa4a3djvYw FVmcrI1ctp6gg== To: musl@lists.openwall.com Cc: =?UTF-8?q?=C3=89rico=20Nogueira?= Date: Tue, 9 Mar 2021 00:44:01 -0300 Message-Id: <20210309034401.22201-1-ericonr@disroot.org> Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: [musl] [PATCH] add qsort_r. since most discussion around the addition of this function has centered around the possible code duplication it requires or that qsort would become much slower if implemented as a wrapper around qsort_r, this commit uses an impl file that's included with the appropriate definitions in order to generate qsort_r or the old qsort. this avoids code duplication and the efficiency loss, at the cost of greater binary size - size(1) reports 602282 vs 600673. this commit requires that the extra argument used by qsort_r always be called 'arg', and it should be the first argument of the function, which allows the usage of variadic macros to call the implementation internal functions, instead of requiring the addition of multiple macros with hard to read definitions. it also requires the comparison function always be called 'cmp'. some of the other functions which don't deal with 'cmp' or 'arg' can probably be factored out into a separate file, allowing for possibly smaller code size, but that hasn't been tried yet. --- The commit diff is, however, rather large. Unfortunately, I'm not sure the status around qsort_r has changed much. The points raised in [1] are mostly still valid and the linked bugs are still open, including Leah's summary in [2]. The FreeBSD bug [3] seems to be stalled since the end of 2018. The Austin Group bug [4] mentions another implementation possibility, where qsort would become a qsort_r wrapper: void qsort(void *base, size_t nel, size_t width, int (*compar)(const void *, const void *)) { qsort_r(base, nel, width, (int (*)(const void*, const void*, void*))compar, NULL); } but it would be undefined, per Rich's comment in [5]. [1] https://inbox.vuxu.org/musl/20180903205705.GA7639@localhost/ [2] https://inbox.vuxu.org/musl/87lg8h8dw1.fsf@vuxu.org/ [3] https://reviews.freebsd.org/D17083 [4] https://www.austingroupbugs.net/view.php?id=900 [5] https://inbox.vuxu.org/musl/20181216015704.GD23599@brightrain.aerifal.cx/ Despite all these negatives, my hope is that musl implementing the function with the glibc signature (which seems to be the agreed upon way forward) might move the bureacracy some. For now, musl distros have managed to deal with the lack of qsort_r reasonably fine, but I don't know what the silent cost (as in software that couldn't be easily packaged or used and no one complained) has been. For the "louder" cost that I know of: - util-linux loses some small functionality in libfdisk if qsort_r isn't available - elfutils required a patch that removed qsort_r usage up until very recently, and might in the future come to require it again, if the place where it's used is made threaded (for better or for worse, using qsort_r is prettier than using a __thread global :) include/stdlib.h | 1 + src/stdlib/qsort.c | 219 +------------------------------------ src/stdlib/qsort_impl.h | 232 ++++++++++++++++++++++++++++++++++++++++ src/stdlib/qsort_r.c | 4 + 4 files changed, 239 insertions(+), 217 deletions(-) create mode 100644 src/stdlib/qsort_impl.h create mode 100644 src/stdlib/qsort_r.c diff --git a/include/stdlib.h b/include/stdlib.h index b54a051f..0c0ced5f 100644 --- a/include/stdlib.h +++ b/include/stdlib.h @@ -158,6 +158,7 @@ struct __locale_struct; float strtof_l(const char *__restrict, char **__restrict, struct __locale_struct *); double strtod_l(const char *__restrict, char **__restrict, struct __locale_struct *); long double strtold_l(const char *__restrict, char **__restrict, struct __locale_struct *); +void qsort_r (void *, size_t, size_t, int (*)(const void *, const void *, void *), void *); #endif #if defined(_LARGEFILE64_SOURCE) || defined(_GNU_SOURCE) diff --git a/src/stdlib/qsort.c b/src/stdlib/qsort.c index da58fd31..2f9e024f 100644 --- a/src/stdlib/qsort.c +++ b/src/stdlib/qsort.c @@ -1,218 +1,3 @@ -/* Copyright (C) 2011 by Valentin Ochs - * - * Permission is hereby granted, free of charge, to any person obtaining a copy - * of this software and associated documentation files (the "Software"), to - * deal in the Software without restriction, including without limitation the - * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or - * sell copies of the Software, and to permit persons to whom the Software is - * furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - */ - -/* Minor changes by Rich Felker for integration in musl, 2011-04-27. */ - -/* Smoothsort, an adaptive variant of Heapsort. Memory usage: O(1). - Run time: Worst case O(n log n), close to O(n) in the mostly-sorted case. */ - -#include -#include -#include - -#include "atomic.h" -#define ntz(x) a_ctz_l((x)) - typedef int (*cmpfun)(const void *, const void *); - -static inline int pntz(size_t p[2]) { - int r = ntz(p[0] - 1); - if(r != 0 || (r = 8*sizeof(size_t) + ntz(p[1])) != 8*sizeof(size_t)) { - return r; - } - return 0; -} - -static void cycle(size_t width, unsigned char* ar[], int n) -{ - unsigned char tmp[256]; - size_t l; - int i; - - if(n < 2) { - return; - } - - ar[n] = tmp; - while(width) { - l = sizeof(tmp) < width ? sizeof(tmp) : width; - memcpy(ar[n], ar[0], l); - for(i = 0; i < n; i++) { - memcpy(ar[i], ar[i + 1], l); - ar[i] += l; - } - width -= l; - } -} - -/* shl() and shr() need n > 0 */ -static inline void shl(size_t p[2], int n) -{ - if(n >= 8 * sizeof(size_t)) { - n -= 8 * sizeof(size_t); - p[1] = p[0]; - p[0] = 0; - } - p[1] <<= n; - p[1] |= p[0] >> (sizeof(size_t) * 8 - n); - p[0] <<= n; -} - -static inline void shr(size_t p[2], int n) -{ - if(n >= 8 * sizeof(size_t)) { - n -= 8 * sizeof(size_t); - p[0] = p[1]; - p[1] = 0; - } - p[0] >>= n; - p[0] |= p[1] << (sizeof(size_t) * 8 - n); - p[1] >>= n; -} - -static void sift(unsigned char *head, size_t width, cmpfun cmp, int pshift, size_t lp[]) -{ - unsigned char *rt, *lf; - unsigned char *ar[14 * sizeof(size_t) + 1]; - int i = 1; - - ar[0] = head; - while(pshift > 1) { - rt = head - width; - lf = head - width - lp[pshift - 2]; - - if((*cmp)(ar[0], lf) >= 0 && (*cmp)(ar[0], rt) >= 0) { - break; - } - if((*cmp)(lf, rt) >= 0) { - ar[i++] = lf; - head = lf; - pshift -= 1; - } else { - ar[i++] = rt; - head = rt; - pshift -= 2; - } - } - cycle(width, ar, i); -} - -static void trinkle(unsigned char *head, size_t width, cmpfun cmp, size_t pp[2], int pshift, int trusty, size_t lp[]) -{ - unsigned char *stepson, - *rt, *lf; - size_t p[2]; - unsigned char *ar[14 * sizeof(size_t) + 1]; - int i = 1; - int trail; - - p[0] = pp[0]; - p[1] = pp[1]; - - ar[0] = head; - while(p[0] != 1 || p[1] != 0) { - stepson = head - lp[pshift]; - if((*cmp)(stepson, ar[0]) <= 0) { - break; - } - if(!trusty && pshift > 1) { - rt = head - width; - lf = head - width - lp[pshift - 2]; - if((*cmp)(rt, stepson) >= 0 || (*cmp)(lf, stepson) >= 0) { - break; - } - } - - ar[i++] = stepson; - head = stepson; - trail = pntz(p); - shr(p, trail); - pshift += trail; - trusty = 0; - } - if(!trusty) { - cycle(width, ar, i); - sift(head, width, cmp, pshift, lp); - } -} - -void qsort(void *base, size_t nel, size_t width, cmpfun cmp) -{ - size_t lp[12*sizeof(size_t)]; - size_t i, size = width * nel; - unsigned char *head, *high; - size_t p[2] = {1, 0}; - int pshift = 1; - int trail; - - if (!size) return; - - head = base; - high = head + size - width; - - /* Precompute Leonardo numbers, scaled by element width */ - for(lp[0]=lp[1]=width, i=2; (lp[i]=lp[i-2]+lp[i-1]+width) < size; i++); - - while(head < high) { - if((p[0] & 3) == 3) { - sift(head, width, cmp, pshift, lp); - shr(p, 2); - pshift += 2; - } else { - if(lp[pshift - 1] >= high - head) { - trinkle(head, width, cmp, p, pshift, 0, lp); - } else { - sift(head, width, cmp, pshift, lp); - } - - if(pshift == 1) { - shl(p, 1); - pshift = 0; - } else { - shl(p, pshift - 1); - pshift = 1; - } - } - - p[0] |= 1; - head += width; - } - - trinkle(head, width, cmp, p, pshift, 0, lp); - - while(pshift != 1 || p[0] != 1 || p[1] != 0) { - if(pshift <= 1) { - trail = pntz(p); - shr(p, trail); - pshift += trail; - } else { - shl(p, 2); - pshift -= 2; - p[0] ^= 7; - shr(p, 1); - trinkle(head - lp[pshift] - width, width, cmp, p, pshift + 1, 1, lp); - shl(p, 1); - p[0] |= 1; - trinkle(head - width, width, cmp, p, pshift, 1, lp); - } - head -= width; - } -} +#define call_cmp(v1,v2) ((*cmp)(v1,v2)) +#include "qsort_impl.h" diff --git a/src/stdlib/qsort_impl.h b/src/stdlib/qsort_impl.h new file mode 100644 index 00000000..c8d76e50 --- /dev/null +++ b/src/stdlib/qsort_impl.h @@ -0,0 +1,232 @@ +/* Copyright (C) 2011 by Valentin Ochs + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +/* Minor changes by Rich Felker for integration in musl, 2011-04-27. */ + +/* Smoothsort, an adaptive variant of Heapsort. Memory usage: O(1). + Run time: Worst case O(n log n), close to O(n) in the mostly-sorted case. */ + +#include +#include +#include + +#include "atomic.h" +#define ntz(x) a_ctz_l((x)) + +static inline int pntz(size_t p[2]) { + int r = ntz(p[0] - 1); + if(r != 0 || (r = 8*sizeof(size_t) + ntz(p[1])) != 8*sizeof(size_t)) { + return r; + } + return 0; +} + +static void cycle(size_t width, unsigned char* ar[], int n) +{ + unsigned char tmp[256]; + size_t l; + int i; + + if(n < 2) { + return; + } + + ar[n] = tmp; + while(width) { + l = sizeof(tmp) < width ? sizeof(tmp) : width; + memcpy(ar[n], ar[0], l); + for(i = 0; i < n; i++) { + memcpy(ar[i], ar[i + 1], l); + ar[i] += l; + } + width -= l; + } +} + +/* shl() and shr() need n > 0 */ +static inline void shl(size_t p[2], int n) +{ + if(n >= 8 * sizeof(size_t)) { + n -= 8 * sizeof(size_t); + p[1] = p[0]; + p[0] = 0; + } + p[1] <<= n; + p[1] |= p[0] >> (sizeof(size_t) * 8 - n); + p[0] <<= n; +} + +static inline void shr(size_t p[2], int n) +{ + if(n >= 8 * sizeof(size_t)) { + n -= 8 * sizeof(size_t); + p[0] = p[1]; + p[1] = 0; + } + p[0] >>= n; + p[0] |= p[1] << (sizeof(size_t) * 8 - n); + p[1] >>= n; +} + +#ifdef QSORT_R +# define sift(...) sift_impl(arg, __VA_ARGS__) +static void sift_impl(void *arg, unsigned char *head, size_t width, cmpfun cmp, int pshift, size_t lp[]) +#else +# define sift sift_impl +static void sift_impl(unsigned char *head, size_t width, cmpfun cmp, int pshift, size_t lp[]) +#endif +{ + unsigned char *rt, *lf; + unsigned char *ar[14 * sizeof(size_t) + 1]; + int i = 1; + + ar[0] = head; + while(pshift > 1) { + rt = head - width; + lf = head - width - lp[pshift - 2]; + + if(call_cmp(ar[0], lf) >= 0 && call_cmp(ar[0], rt) >= 0) { + break; + } + if(call_cmp(lf, rt) >= 0) { + ar[i++] = lf; + head = lf; + pshift -= 1; + } else { + ar[i++] = rt; + head = rt; + pshift -= 2; + } + } + cycle(width, ar, i); +} + +#ifdef QSORT_R +# define trinkle(...) trinkle_impl(arg, __VA_ARGS__) +static void trinkle_impl(void *arg, unsigned char *head, size_t width, cmpfun cmp, size_t pp[2], int pshift, int trusty, size_t lp[]) +#else +# define trinkle trinkle_impl +static void trinkle_impl(unsigned char *head, size_t width, cmpfun cmp, size_t pp[2], int pshift, int trusty, size_t lp[]) +#endif +{ + unsigned char *stepson, + *rt, *lf; + size_t p[2]; + unsigned char *ar[14 * sizeof(size_t) + 1]; + int i = 1; + int trail; + + p[0] = pp[0]; + p[1] = pp[1]; + + ar[0] = head; + while(p[0] != 1 || p[1] != 0) { + stepson = head - lp[pshift]; + if(call_cmp(stepson, ar[0]) <= 0) { + break; + } + if(!trusty && pshift > 1) { + rt = head - width; + lf = head - width - lp[pshift - 2]; + if(call_cmp(rt, stepson) >= 0 || call_cmp(lf, stepson) >= 0) { + break; + } + } + + ar[i++] = stepson; + head = stepson; + trail = pntz(p); + shr(p, trail); + pshift += trail; + trusty = 0; + } + if(!trusty) { + cycle(width, ar, i); + sift(head, width, cmp, pshift, lp); + } +} + +#ifdef QSORT_R +void qsort_r(void *base, size_t nel, size_t width, cmpfun cmp, void *arg) +#else +void qsort(void *base, size_t nel, size_t width, cmpfun cmp) +#endif +{ + size_t lp[12*sizeof(size_t)]; + size_t i, size = width * nel; + unsigned char *head, *high; + size_t p[2] = {1, 0}; + int pshift = 1; + int trail; + + if (!size) return; + + head = base; + high = head + size - width; + + /* Precompute Leonardo numbers, scaled by element width */ + for(lp[0]=lp[1]=width, i=2; (lp[i]=lp[i-2]+lp[i-1]+width) < size; i++); + + while(head < high) { + if((p[0] & 3) == 3) { + sift(head, width, cmp, pshift, lp); + shr(p, 2); + pshift += 2; + } else { + if(lp[pshift - 1] >= high - head) { + trinkle(head, width, cmp, p, pshift, 0, lp); + } else { + sift(head, width, cmp, pshift, lp); + } + + if(pshift == 1) { + shl(p, 1); + pshift = 0; + } else { + shl(p, pshift - 1); + pshift = 1; + } + } + + p[0] |= 1; + head += width; + } + + trinkle(head, width, cmp, p, pshift, 0, lp); + + while(pshift != 1 || p[0] != 1 || p[1] != 0) { + if(pshift <= 1) { + trail = pntz(p); + shr(p, trail); + pshift += trail; + } else { + shl(p, 2); + pshift -= 2; + p[0] ^= 7; + shr(p, 1); + trinkle(head - lp[pshift] - width, width, cmp, p, pshift + 1, 1, lp); + shl(p, 1); + p[0] |= 1; + trinkle(head - width, width, cmp, p, pshift, 1, lp); + } + head -= width; + } +} diff --git a/src/stdlib/qsort_r.c b/src/stdlib/qsort_r.c new file mode 100644 index 00000000..e0500dcb --- /dev/null +++ b/src/stdlib/qsort_r.c @@ -0,0 +1,4 @@ +typedef int (*cmpfun)(const void *, const void *, void *); +#define call_cmp(v1,v2) ((*cmp)(v1,v2,arg)) +#define QSORT_R +#include "qsort_impl.h" -- 2.30.1