From: Harald Becker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: ARM memcpy post-0.9.12-release thread
Date: Wed, 31 Jul 2013 05:13:47 +0200
Message-ID: <20130731051347.7d8340ac@ralda.gmx.de>
In-Reply-To: <20130731022631.GA6655@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
Cc: musl@lists.openwall.com, dalias@aerifal.cx

Hi Rich!

30-07-2013 22:26 Rich Felker:

> Some rough times (128k copy repeated 10000 times):
>
> Aligned case:
> Current C code: 1.2s
> My best-attempt C code: 0.75s
> My best-attempt inline asm: 0.57s
> Bionic asm: 0.63s
> Bionic asm without prefetch: 0.57s
>
> Misaligned case:
> Current C code: 4.7s
> My best-attempt inline asm: 2.9s
> Bionic asm: 1.1s

I'd like to throw in a question, as my two cents on this topic: don't modern C compilers try to align all data types? If so, in most cases aligned data structures are used, and copying them around usually hits the aligned case. The misaligned case happens mostly when working with strings, and those are usually short. Can't we consider other misaligned cases a fault of the programmer or code generator?

If so, I would prefer the best-attempt inline asm version, or even the best-attempt C code, over arch-specific asm versions ... and add a warning about the performance loss on misaligned data to the documentation, giving a rough percentage for this loss. Those who really need to work with misaligned data may follow the link and consider adding an optimized memcpy to their work.

Maybe the musl archive or web site could hold a contribution directory with such optimized replacement functions, (nearly) ready for inclusion in other projects, but as officially unmaintained code.

-- 
Harald
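
P.S. To make the aligned-versus-misaligned distinction concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: the names is_aligned, struct record, copy_record and copy_suffix are hypothetical and not part of musl, and the pointer-to-integer cast assumes a flat address space as on common ARM and x86 ABIs. It shows that copying a whole typed object gives memcpy aligned pointers, while copying from the middle of a string can hand it a source on an arbitrary byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Alignment the compiler requires for a 32-bit word (4 on common ABIs). */
#define WORD_ALIGN _Alignof(uint32_t)

/* Hypothetical helper: returns 1 if p is a multiple of `align`, which must
   be a power of two. Assumes uintptr_t reflects the byte address. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical example type. The compiler places every object of this type
   at an address suitable for its most strictly aligned member. */
struct record {
    uint32_t id;
    uint32_t len;
    char name[24];
};

/* Copies a whole struct. Both pointers refer to properly typed objects, so
   memcpy gets the aligned case. Returns 1 if both pointers are aligned. */
static int copy_record(struct record *dst, const struct record *src)
{
    memcpy(dst, src, sizeof *dst);
    return is_aligned(dst, _Alignof(struct record)) &&
           is_aligned(src, _Alignof(struct record));
}

/* Copies the rest of a string starting at byte offset `skip`, including the
   terminating NUL. The source is text + skip, so its alignment depends on
   skip. Returns 1 if the source pointer happened to be word-aligned. */
static int copy_suffix(char *dst, const char *text, size_t skip)
{
    const char *src = text + skip;
    memcpy(dst, src, strlen(src) + 1);
    return is_aligned(src, WORD_ALIGN);
}
```

With a word-aligned buffer holding "key=value", copy_suffix with skip 1 reads from a misaligned source, while skip 4 happens to land on a word boundary again. This matches the point above: struct copies stay in the fast path, and it is mostly short string handling that reaches the misaligned path.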