From: Daniel Cegiełka
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: musl perf, 20% slower than native build?
Date: Thu, 9 Apr 2015 08:50:24 +0200
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Cc: John Mudd

2015-04-08 22:59 GMT+02:00 Paul Schutte:
> Hi Daniel,
>
> Pardon my stupidity, but with what did you replace the memcpy ?

I use a memcpy better suited to my CPU. memcpy latency was very
important for me because it had a big impact on the total latency (in
my code). I suppose that most of the latency problems have their cause
in musl's memcpy.

This is quite a complex topic, because the optimal memcpy code depends
on how large the copied blocks of memory are. Sometimes SSE2 will be
faster and sometimes AVX2, but heavily optimized code (e.g. AVX2) is
not portable, and this is a problem. Fast memcpy implementations
usually use CPUID to choose the right code, but such code is bloated
and ugly.

Daniel

> Regards
> Paul
>
> On Wed, Apr 8, 2015 at 9:28 PM, Daniel Cegiełka wrote:
>>
>> 2015-04-08 21:10 GMT+02:00 John Mudd:
>>
>> > Here's output from perf record/report for libc. This looks consistent
>> > with the 5% longer run time.
>> >
>> > native:
>> > 2.20% python libc-2.19.so [.] __memcpy_ssse3
>> >
>> > musl:
>> > 4.74% python libc.so [.] memcpy
>>
>> I was able to get twice speed-up (in my code) just by replacing memcpy
>> in the musl.
>>
>> Daniel
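
For illustration, the CPUID-based selection mentioned in the reply above could look roughly like the sketch below. It is not musl's code and not the replacement memcpy used in this thread. The names copy_generic, copy_sse2, copy_avx2, select_copy and dispatch_copy are hypothetical, and the three "variants" simply forward to memcpy so that the example stays portable. It assumes GCC or Clang, whose x86 builtin __builtin_cpu_supports wraps the CPUID query.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical stand-ins: a real build would put hand-tuned SSE2/AVX2
 * loops here. In this sketch they all forward to the libc memcpy. */
static void *copy_generic(void *d, const void *s, size_t n) { return memcpy(d, s, n); }
static void *copy_sse2(void *d, const void *s, size_t n)    { return memcpy(d, s, n); }
static void *copy_avx2(void *d, const void *s, size_t n)    { return memcpy(d, s, n); }

/* Pick a variant from the CPU feature bits. The feature checks are only
 * compiled on x86 with GCC/Clang; everywhere else the generic copy is used. */
static copy_fn select_copy(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return copy_avx2;
    if (__builtin_cpu_supports("sse2"))
        return copy_sse2;
#endif
    return copy_generic;
}

static copy_fn chosen;

/* Resolve the variant on first use, then call through the cached pointer.
 * The lazy initialization is not synchronized; a real implementation
 * would resolve once at startup (for example with an ifunc resolver). */
void *dispatch_copy(void *d, const void *s, size_t n)
{
    if (!chosen)
        chosen = select_copy();
    assert(chosen != NULL);
    return chosen(d, s, n);
}
```

The extra indirect call and the feature check are part of what makes such code heavier than a single portable routine. A size-based choice (small versus large blocks) would add yet another branch inside each variant.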