From: jason <jason@insomnia247.nl>
To: musl@lists.openwall.com
Cc: jason@insomnia247.nl
Subject: [musl] Improvements to qsort
Date: Fri, 9 Jul 2021 13:26:28 +0200 (CEST) [thread overview]
Message-ID: <20210709112629.024302221844@gateway02.insomnia247.nl> (raw)
I've been trying to understand ngmalloc, and it's a struggle, so for
relaxation I had a look at qsort, and have a few suggestions.
To avoid burying the lead, the following reduces comparisons by
about 4% with no downside:
--- a/qsort.c
+++ b/qsort.c
@@ -100,18 +100,16 @@ static void sift(unsigned char *head, size_t width, cmpfun cmp, int pshift, size
rt = head - width;
lf = head - width - lp[pshift - 2];
- if((*cmp)(ar[0], lf) >= 0 && (*cmp)(ar[0], rt) >= 0) {
- break;
- }
if((*cmp)(lf, rt) >= 0) {
- ar[i++] = lf;
head = lf;
pshift -= 1;
} else {
- ar[i++] = rt;
head = rt;
pshift -= 2;
}
+ if((*cmp)(ar[0], head) >= 0)
+ break;
+ ar[i++] = head;
}
cycle(width, ar, i);
}
Basically, instead of doing two comparisons to see if ar[0] is greater
than both the left and right children and then comparing the children
with each other to find the greater one, find the greater child and
compare ar[0] with that one.
I have several other less-significant changes that I wonder if musl is
interested in:
* I notice than pntz() is always used in the following context:
trail = pntz(p);
shr(p, trail);
pshift += trail;
both pntz() and shr() have branches conditional on whether the shift
is more or less than 8*sizeof(size_t) bits. I have replaced this
by a single function, which allows the conditional branch to be
shared:
/*
* Re-normalize p[]. Ignore the low-order bit (which must be set!),
* find the position of the next-least-significant bit, and shift p[]
* down by that much. Return the number of bits shifted.
*
* There must be a penultimate trailing bit; i.e. p[] must not be {1, 0}.
*/
static inline int normalize(size_t p[2])
{
int s;
size_t t = p[0] - 1;
if (t == 0) {
t = p[1];
s = ntz(t);
p[0] = t >> s;
p[1] = 0;
s += 8*sizeof(size_t);
} else {
s = ntz(t);
p[0] = t >> s | p[1] << (8*sizeof(size_t) - s);
p[1] >>= s;
}
return s;
}
which is used simply as:
pshift += normalize(p);
* As part of my learning about the code, I have added a lot of comments,
particularly large block comments about the smoothsort algorithm
in general, the meaning of p[] and pindex, what a "stepson" is,
when it's okay to use sift() instead of trinkle(), what the "trusty"
flag means, how cycle() is used, etc. Would these be worth cleaning
up and submitting?
* In my private copy, I have changed the type of "trusty" to bool,
a data type I'm quite fond of as it concisely documents the limited
range of values. I notice, however, that musl is completely devoid of
any use of <stdbool.h>, although it uses other C99 features. Is this
a conscious decision?
* Likewise, I notice that most musl code has a space in "keyword (condition)"
like "if (foo != 0)", "while (bar < N)" and "for (i=0; i < N; i++)".
But the qsort() code, with one single exception, omits the space.
Likewise, most msul code is quite happy to "if (condition) break;"
on a single line, with no braces, but the original sift() function
above puts the break on a second line and a closing brace on a third.
The musl code base is inconsistent enough that it's hard to infer code
formatting conventions. Are these discrepancies in qsort a problem
worth fixing?
* The way the heap-building loop maintains its loop invariants is a trifle
peculiar. It begins with the element at *head included in p[] and pindex,
but excluded from the heap ordering invariants. Then it establishes
the ordering invariants (with sift() or trinkle()) and immediately adds
another (un-ordered) element to p[] and pindex.
After the main loop exits, a standalone call to trinkle() places the
final element in heap order.
I have a more conventional (add, then sift) ordering working which
seems to simplify the code:
while(head < high) {
+ bool notrinkle = false;
+
+ /* Append an element to the heap */
+ head += width;
if((p[0] & 3) == 3) {
- sift(head, width, cmp, pshift, lp);
+ /* Two trees of adjacent sizes: merge with head */
shr(p, 2);
pshift += 2;
- } else {
- if(lp[pshift - 1] >= high - head) {
- trinkle(head, width, cmp, p, pshift, 0, lp);
- } else {
+ if((size_t)(high - head) > lp[pshift - 1]) {
sift(head, width, cmp, pshift, lp);
+ notrinkle = true;
}
-
- if(pshift == 1) {
- shl(p, 1);
- pshift = 0;
- } else {
- shl(p, pshift - 1);
- pshift = 1;
- }
+ } else if(pshift == 1) {
+ /* order-1 exists; new tree is order-0 */
+ shl(p, 1);
+ pshift = 0;
+ notrinkle = (high != head);
+ } else {
+ /* new tree is order-1 */
+ shl(p, pshift - 1);
+ pshift = 1;
+ notrinkle = ((size_t)(high - head) > lp[0]);
}
-
p[0] |= 1;
- head += width;
+
+ /* Re-establish the heap using trinkle */
+ if(!notrinkle)
+ trinkle(head, width, cmp, p, pshift, false, lp);
}
- trinkle(head, width, cmp, p, pshift, 0, lp);
If the original ordering is wanted, then the loop initialization
can be changed to start with two elements (p[0] = 3, pindex = 0,
head = base + width), since two elements automatically fulfil the
required loop invariants.
Does anyone have a preference?
* Since the first test in qsort() excludes width == 0, the loop over
width in cycle() can be changed to "do { ... } while (width);".
* The ar[] array in sift() only has to be as large as the lp[]
array in qsort(), not 1/6 larger.
* The ar[] array in trinkle() only has to be half as large, plus one.
* The pindex > 1 case in the heap-consuming loop contains some
redundant left-then-right shifting:
shl(p, 2);
pshift -= 2;
p[0] ^= 7;
shr(p, 1);
This can be optimized.
* There are several cases where variables can be moved into
smaller local scopes, particularly the "stepson", "rt" and "lf"
variables in trinkle(). This is another change I've made privately
as I think it makes the code more reasonable. Worth submitting?
* GCC emits a couple of signed/unsigned comparison warnings with
-W -Wall. Worth fixing?
next reply other threads:[~2021-07-09 13:38 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-09 11:26 jason [this message]
2021-07-10 0:20 ` Rich Felker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210709112629.024302221844@gateway02.insomnia247.nl \
--to=jason@insomnia247.nl \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).