From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level: 
X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI,
	RCVD_IN_MSPIKE_H2 autolearn=ham autolearn_force=no version=3.4.4
Received: (qmail 23442 invoked from network); 10 Feb 2023 13:11:02 -0000
Received: from second.openwall.net (193.110.157.125)
  by inbox.vuxu.org with ESMTPUTF8; 10 Feb 2023 13:11:02 -0000
Received: (qmail 14239 invoked by uid 550); 10 Feb 2023 13:10:59 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
Received: (qmail 14173 invoked from network); 10 Feb 2023 13:10:58 -0000
Date: Fri, 10 Feb 2023 08:10:45 -0500
From: Rich Felker <dalias@libc.org>
To: David Wang <00107082@163.com>
Cc: musl@lists.openwall.com, Markus Wichmann <nullplan@gmx.net>
Message-ID: <20230210131044.GZ4163@brightrain.aerifal.cx>
References: <CAAEi2GextYuWRK-JKtpCLxewyJ2u380m5+s=M_0P=ZBDxyX-xA@mail.gmail.com>
 <CPX182QJK5D6.2Y2DE5NACZ1I@sumire>
 <4d290220.36d6.1860222ca46.Coremail.00107082@163.com>
 <20230201180115.GB2626@voyager>
 <20230209190316.GU4163@brightrain.aerifal.cx>
 <75d9cfae.35eb.1863ac4e3c0.Coremail.00107082@163.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <75d9cfae.35eb.1863ac4e3c0.Coremail.00107082@163.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [musl] Re:Re: [musl] Re:Re: [musl] qsort

On Fri, Feb 10, 2023 at 06:00:27PM +0800, David Wang wrote:
> >
> >Before we consider any such change though we should probably see
> >profiling data to determine where the time is being spent in
> >smoothsort, in case it's stupid stuff that's fixable.
> >
> >Rich
> 
> I made a rough profiling against current implementation of qsort in
> musl (Just copy the source code and recompile it to expose symbols)
> 
> 
> The function call counters are as following:
> 
> main: 931387
> qsort2: 917148  <----this is qsort
> __qsort_r: 911594
> trinkle: 811314
> sift: 537263
> wrapper_cmp: 257403  <--- wrapper cmp function
> mycmp: 167410  <---real cmp function
> cycle: 105809
> shr: 41585
> pntz: 27833
> _init: 14925
> a_ctz_64: 11341
> a_ctz_l: 9127
> shl: 8333
> 
> 
> 1. wrapper_cmp took about 25%, overhead (257403-167410) seems high
> when the real comp functions is very simple(which just compares
> integer values), could be optimized out for qsort
> 2. most "qsort" time spend in "trinkle", and most "trinkle" time
> spend in "sift".
> 
> 
> A tree-view profiling report is attached. (There may be several
> incorrect call-chains collected, but I think the proportion among
> function-call counters is correct, with high probability....)

What tool was used for this? gprof or anything else invasive is not
meaningful; for tiny functions, the entire time measured will be the
profiling overhead. perf(1) is the only way I know to get meaningful
numbers.

In particular, it makes no sense that significant time was spent in
wrapper_cmp, which looks like (i386):

   0:   ff 64 24 0c             jmp    *0xc(%esp)

or (x86_64):

   0:   ff e2                   jmpq   *%rdx

or (arm):

   0:   4710            bx      r2

but I can imagine it being (relatively) gigantic with a call out to
profiling code.

Rich