From: Andre Renaud
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Thinking about release
Date: Tue, 9 Jul 2013 17:06:21 +1200
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
References: <20130613012517.GA5859@brightrain.aerifal.cx> <20130613014314.GC29800@brightrain.aerifal.cx>
In-Reply-To: <20130613014314.GC29800@brightrain.aerifal.cx>

Hi Rich,

> I think the first step should be benchmarking on real machines.
> Somebody tried the asm that was posted and claimed it was no faster
> than musl's C code; I don't know the specific hardware they were using
> and I don't even recall right off who made the claim or where it was
> reported, but I think before we start writing or importing code we
> need to have a good idea how the current C code compares in
> performance to other "optimized" implementations.

In the interests of furthering this discussion (and because I'd like
to start using musl as the basis for some of our projects, but the
current speed degradation is noticeable), I've created some patches
that enable memcmp, memcpy & memmove ARM optimisations. I've ignored
the str* functions, as these are generally not used on the same bulk
data as the mem* functions, and as such the performance issue is less
noticeable.

Using a fairly rudimentary test application, I've benchmarked it as
having the following speed improvements (this is all on an actual ARM
board - 400MHz arm926ejs):

memcpy: 160%
memmove: 162%
memcmp: 272%

These numbers bring musl in line with glibc (at least on ARMv5).
memcmp in particular seems to be faster (90MB/s vs 75MB/s on my
platform).

I haven't looked at using the __hwcap feature at this stage to swap
between these implementations and NEON-optimised versions. I assume
this can come later.

From a code size point of view (this is all with -O3), memcpy goes
from 1996 to 1680 bytes, memmove goes from 2592 to 2088 bytes, and
memcmp goes from 1040 to 1452, for a total increase of 224 bytes.
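
For anyone wanting to reproduce numbers of this kind, a throughput test
for the mem* functions can be sketched roughly as below. This is not the
test application used for the figures above, only a minimal illustration:
the helper names copy_throughput and compare_throughput, the use of
clock(), and the MB = 10^6 bytes convention are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Signature shared by memcpy and memmove. */
typedef void *(*copy_fn)(void *, const void *, size_t);
/* Signature of memcmp. */
typedef int (*cmp_fn)(const void *, const void *, size_t);

/* Convert a clock() interval and a byte count to MB/s (1 MB = 10^6 bytes). */
static double to_mb_per_sec(clock_t start, clock_t end, double bytes)
{
    double secs = (double)(end - start) / CLOCKS_PER_SEC;

    if (secs <= 0.0)
        secs = 1.0 / CLOCKS_PER_SEC; /* clamp to the timer resolution */
    return bytes / secs / 1e6;
}

/* Hypothetical helper: throughput of a copy routine over `iters` runs. */
static double copy_throughput(copy_fn fn, unsigned char *dst,
                              const unsigned char *src, size_t len, int iters)
{
    copy_fn volatile call = fn;      /* volatile: force a real call each time */
    volatile unsigned char sink = 0; /* consume the result of every copy */
    clock_t start, end;
    int i;

    assert(len > 0 && iters > 0);
    start = clock();
    for (i = 0; i < iters; i++) {
        call(dst, src, len);
        sink ^= dst[(size_t)i % len];
    }
    end = clock();
    return to_mb_per_sec(start, end, (double)len * iters);
}

/* Hypothetical helper: throughput of memcmp on two buffers. The buffers
 * should be equal so that every call scans all `len` bytes. */
static double compare_throughput(const unsigned char *a,
                                 const unsigned char *b, size_t len, int iters)
{
    cmp_fn volatile call = memcmp;
    volatile int sink = 0;
    clock_t start, end;
    int i;

    assert(len > 0 && iters > 0);
    start = clock();
    for (i = 0; i < iters; i++)
        sink += call(a, b, len);
    end = clock();
    return to_mb_per_sec(start, end, (double)len * iters);
}
```

A caller would allocate two buffers, fill the source, and call for
example copy_throughput(memmove, dst, src, len, iters). clock() reports
CPU time, and the result depends heavily on whether len fits in cache,
so runs at several sizes are worth comparing.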

The code is from NetBSD and Android (essentially unmodified), and it
is all BSD 2-clause licensed.

The git tree is available here:
https://github.com/AndreRenaud/musl/commit/713023e7320cf45b116d1c29b6155ece28904e69

Does anyone have any comments on the suitability of this code, or what
kind of more rigorous testing could be applied?

Regards,
Andre
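
On the testing question, one possible approach is to check each routine
against a plain byte-wise reference over a range of source and
destination alignments and lengths, including overlapping memmove calls
and the sign of memcmp results. The sketch below is only an
illustration: the names check_copy, check_overlap and check_compare, and
the limits MAX_LEN, MAX_ALIGN and GUARD, are assumptions and not part of
the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN   64  /* assumed longest length tested */
#define MAX_ALIGN 8   /* assumed range of start offsets */
#define GUARD     16  /* guard bytes on each side of the destination */
#define WINDOW    (2 * MAX_ALIGN + MAX_LEN)

typedef void *(*copy_fn)(void *, const void *, size_t);
typedef int (*cmp_fn)(const void *, const void *, size_t);

/* Fill a buffer with a simple position-dependent byte pattern. */
static void fill(unsigned char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        p[i] = (unsigned char)(i * 7 + 1);
}

/* Non-overlapping copies. Returns the number of failed checks. */
static int check_copy(copy_fn fn)
{
    unsigned char src[MAX_ALIGN + MAX_LEN];
    unsigned char dst[GUARD + MAX_ALIGN + MAX_LEN + GUARD];
    size_t soff, doff, len, i;
    int failures = 0;

    fill(src, sizeof src);
    for (soff = 0; soff < MAX_ALIGN; soff++)
    for (doff = 0; doff < MAX_ALIGN; doff++)
    for (len = 0; len <= MAX_LEN; len++) {
        size_t first = GUARD + doff; /* index of the first copied byte */
        memset(dst, 0xAA, sizeof dst);
        if (fn(dst + first, src + soff, len) != dst + first)
            failures++;
        /* Copied bytes must match; everything else must be untouched. */
        for (i = 0; i < sizeof dst; i++) {
            int inside = i >= first && i < first + len;
            unsigned char want = inside ? src[soff + i - first] : 0xAA;
            if (dst[i] != want)
                failures++;
        }
    }
    return failures;
}

/* Overlapping copies within one buffer (valid for memmove only). */
static int check_overlap(copy_fn fn)
{
    unsigned char buf[WINDOW], expect[WINDOW], tmp[MAX_LEN];
    size_t s, d, len, i;
    int failures = 0;

    for (s = 0; s < 2 * MAX_ALIGN; s++)
    for (d = 0; d < 2 * MAX_ALIGN; d++)
    for (len = 0; len <= MAX_LEN; len++) {
        fill(buf, sizeof buf);
        fill(expect, sizeof expect);
        for (i = 0; i < len; i++) /* reference result via a temporary */
            tmp[i] = expect[s + i];
        for (i = 0; i < len; i++)
            expect[d + i] = tmp[i];
        fn(buf + d, buf + s, len);
        for (i = 0; i < sizeof buf; i++)
            if (buf[i] != expect[i])
                failures++;
    }
    return failures;
}

/* Equality, and the sign of the result when one byte differs. */
static int check_compare(cmp_fn fn)
{
    unsigned char a[MAX_ALIGN + MAX_LEN], b[MAX_ALIGN + MAX_LEN];
    size_t aoff, boff, len, pos;
    int failures = 0;

    fill(a, sizeof a);
    for (aoff = 0; aoff < MAX_ALIGN; aoff++)
    for (boff = 0; boff < MAX_ALIGN; boff++)
    for (len = 1; len <= MAX_LEN; len++) {
        unsigned char *pa = a + aoff, *pb = b + boff;
        for (pos = 0; pos < len; pos++)
            pb[pos] = pa[pos];
        if (fn(pa, pb, len) != 0)
            failures++;
        for (pos = 0; pos < len; pos++) {
            int r;
            pb[pos] ^= 0x80; /* make exactly one byte differ */
            r = fn(pa, pb, len);
            if (pa[pos] < pb[pos] ? r >= 0 : r <= 0)
                failures++;
            pb[pos] ^= 0x80; /* restore */
        }
    }
    return failures;
}
```

Each function returns a failure count, so a test program would expect
zero from check_copy(memcpy), check_copy(memmove),
check_overlap(memmove) and check_compare(memcmp) when linked against the
library under test. Larger MAX_LEN values would be needed to exercise
any bulk-copy loops in the assembly.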