From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: crypt* files in crypt directory
Date: Fri, 10 Aug 2012 14:06:13 -0400
Message-ID: <20120810180613.GA27715@brightrain.aerifal.cx>
In-Reply-To: <20120810170435.GA29839@openwall.com>

On Fri, Aug 10, 2012 at 09:04:35PM +0400, Solar Designer wrote:
> > > why increase ptr at the beginning?
> > > it seems the idiomatic way would be
> > >
> > > *ptr++ = L;
> > > *ptr++ = R;
> >
> > For me, making this change makes it 5% faster. I suspect the
> > difference comes from the fact that gcc is not smart enough to move
> > the ptr+=2; across the rest of the loop body, and the fact that it
> > gets spilled to the stack and reloaded for *both* points of usage
> > rather than just one. The original version may perform better on
> > machines with A LOT more registers, but I'm doubtful...
>
> The spilling theory makes sense to me, but it does not fully explain the
> 5% difference - I think it could explain a 1% difference or so. More
> likely there's some change in register allocation overall, not only for
> ptr - or something like it.

Indeed, that's possible too. I haven't read the asm diff.

> Anyhow, this does not match my test results so far, for different
> revisions of this code. What compiler, options, architecture, CPU?

gcc 4.6.3, -O3, generic/i486 code generation, no tuning for my cpu,
which is Atom.

Rich
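
For illustration, the two store patterns discussed in this thread can be
sketched roughly as below. This is a minimal sketch, not the actual crypt
code: mix(), fill_early() and fill_idiomatic() are hypothetical names, and
mix() is only a stand-in for the real per-iteration work. Both variants
write the same values; they differ only in where ptr is advanced, which is
what may affect spilling and register allocation on a register-starved
target such as i486.

```c
#include <stdint.h>

/* Hypothetical stand-in for the real per-iteration work (assumption). */
static void mix(uint32_t *L, uint32_t *R)
{
    *L = (*L ^ (*R << 3)) + 0x9e3779b9u;
    *R = (*R ^ (*L >> 5)) + 0x7f4a7c15u;
}

/* Variant A: ptr is advanced at the top of the loop body, so it is
 * live across mix() and the stores use negative offsets. */
void fill_early(uint32_t *ptr, uint32_t *end, uint32_t L, uint32_t R)
{
    do {
        ptr += 2;
        mix(&L, &R);
        ptr[-2] = L;
        ptr[-1] = R;
    } while (ptr < end);
}

/* Variant B: the idiomatic form suggested above; ptr is only touched
 * at the stores, after the heavy work is done. */
void fill_idiomatic(uint32_t *ptr, uint32_t *end, uint32_t L, uint32_t R)
{
    do {
        mix(&L, &R);
        *ptr++ = L;
        *ptr++ = R;
    } while (ptr < end);
}
```

Whether one variant is faster depends on the compiler version, options and
CPU, so the only reliable check is to benchmark both and compare the
generated asm.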