>
>Before we consider any such change though we should probably see
>profiling data to determine where the time is being spent in
>smoothsort, in case it's stupid stuff that's fixable.
>
>Rich

I made a rough profiling against current implementation of  qsort in musl  (Just copy the source code and recompile it to expose symbols)


The function call counters are as following:

main: 931387
qsort2: 917148  <----this is qsort
__qsort_r: 911594
trinkle: 811314
sift: 537263
wrapper_cmp: 257403  <--- wrapper cmp function
mycmp: 167410  <---real cmp function
cycle: 105809
shr: 41585
pntz: 27833
_init: 14925
a_ctz_64: 11341
a_ctz_l: 9127
shl: 8333


1.  wrapper_cmp took about 25%,   overhead (257403-167410) seems  high when the real comp functions is very simple(which just compares integer values), could be optimized out for qsort
2.  most "qsort" time spend in "trinkle", and most "trinkle" time spend in "sift".


A tree-view profiling report is attached.  (There may be several incorrect call-chains collected, but I think the proportion among function-call counters  is correct,  with high probability....)


FYI
David