From: Andre Renaud
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Thinking about release
Date: Wed, 10 Jul 2013 09:28:21 +1200
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
In-Reply-To: <20130709053711.GO29800@brightrain.aerifal.cx>
References: <20130613012517.GA5859@brightrain.aerifal.cx>
 <20130613014314.GC29800@brightrain.aerifal.cx>
 <20130709053711.GO29800@brightrain.aerifal.cx>

Hi Rich,

> I think that's a reasonable place to begin. I do mildly question the
> relevance of memmove to performance, so if we end up having to do a
> lot of review or changes to get the asm committed, it might make sense
> to leave memmove for later.

I wasn't too sure on memmove, but I've seen a reasonable amount of
code which just uses memmove as standard (rather than memcpy), to
avoid the possibility of overlapping regions. Not a great policy, but
still. I'm fine with dropping it at this stage.

> At first glance, this looks like a clear improvement, but have you
> compared it to much more naive optimizations? My _general_ experience
> with optimized memcpy asm that's complex like this and that goes out
> of its way to deal explicitly with cache lines and such is that it's
> no faster than just naively moving large blocks at a time. Of course
> this may or may not be the case for ARM, but I'd like to know if
> you've done any tests.
>
> The basic principle in my mind here is that a complex solution is not
> necessarily wrong if it's a big win in other ways, but that a complex
> solution which is at most 1-2% faster than a much simpler solution is
> probably not the best choice.

Certainly if there was a more straightforward C implementation that
achieved similar results that would be superior.
However, the existing musl C memcpy code is already optimised to some
degree (doing 32-bit rather than 8-bit copies), and I've found it
difficult to convince gcc to emit the load-multiple and store-multiple
instructions from C without resorting to pretty horrible code. That
may still be preferable to the assembler, though. At this stage I
haven't benchmarked this; I'll see if I can come up with something.

> It's an open question whether it's better to sync something like this
> with an 'upstream' or adapt it to musl coding conventions. Generally
> musl uses explicit instructions rather than pseudo-instructions/macros
> for prologue and epilogue, and does not use named labels.

Given that most of the other systems do some form of compile-time
optimisation (which we're trying to avoid), and that these are not
functions that see a lot of code churn, I don't think it's too bad to
have it adapted to musl's style. I haven't really done that so far.

>> Does anyone have any comments on the suitability of this code, or what
>
> If nothing else, it fails to be armv4 compatible. Fixing that should
> not be hard, but it would require a bit of an audit. The return
> sequences are the obvious issue, but there may be other instructions
> in use that are not available on armv4 or maybe not even on armv5...?

Rob Landley mentioned a while ago that armv4 has issues with the EABI
stuff. Is armv4 a definite lower bound for musl support, as opposed to
armv4t or armv5?

Regards,
Andre