> >Before we consider any such change though we should probably see >profiling data to determine where the time is being spent in >smoothsort, in case it's stupid stuff that's fixable. > >Rich I made a rough profiling against current implementation of qsort in musl (Just copy the source code and recompile it to expose symbols) The function call counters are as following: main: 931387 qsort2: 917148 <----this is qsort __qsort_r: 911594 trinkle: 811314 sift: 537263 wrapper_cmp: 257403 <--- wrapper cmp function mycmp: 167410 <---real cmp function cycle: 105809 shr: 41585 pntz: 27833 _init: 14925 a_ctz_64: 11341 a_ctz_l: 9127 shl: 8333 1. wrapper_cmp took about 25%, overhead (257403-167410) seems high when the real comp functions is very simple(which just compares integer values), could be optimized out for qsort 2. most "qsort" time spend in "trinkle", and most "trinkle" time spend in "sift". A tree-view profiling report is attached. (There may be several incorrect call-chains collected, but I think the proportion among function-call counters is correct, with high probability....) FYI David