From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: crypt* files in crypt directory
Date: Wed, 8 Aug 2012 23:16:40 -0400
Message-ID: <20120809031639.GM27715@brightrain.aerifal.cx>
In-Reply-To: <20120808160810.731cec78@newbook>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Wed, Aug 08, 2012 at 04:08:10PM -0700, Isaac Dunham wrote:
> On Wed, 8 Aug 2012 17:48:55 -0400
> Rich Felker wrote:
>
> > > > Maybe you could support -DFAST_CRYPT or the like. It could enable
> > > > forced inlining and manual unrolls in crypt_blowfish.c.
> ...
> > Unless there's a really compelling reason to do so, I'd like to avoid
> > having multiple alternative versions of the same code in a codebase.
> > It makes it so there's more combinations you have to test to be sure
> > the code works and doesn't have regressions.
> >
> > As it stands, the code I posted with the manual unrolling removed
> > performs _better_ than the manually unrolled code with gcc 4 on x86_64
> > when optimized for speed, and it's 33% smaller when optimized for
> > size.
>
> Per your own tests?
> I say this because the test previously mentioned shows the
> opposite:

OK, I misread the units as c=cycles and s=?? instead of c=crypts and
s=sec. But of course that doesn't make sense..

> > > The impact on x86-64 is less. With Ubuntu 12.04's gcc 4.6.3 on
> > > FX-8120 I get 490 c/s for the original code, 450 c/s for your code
> > > without inlining/unrolling, and somehow only 430 c/s with
> > > -finline-functions -funroll-loops.
>
> that's :
> Raw      %speed  version
> 490 c/s  100%    original
> 450 c/s  92%     rich's version
> 430 c/s  88%     rich's version, unrolled by compiler
> Higher is faster.
> IE, unrolling is actually slowing your version down more.
>
> GCC 3/x86 is getting 80% with rich's version, optimized.
>
> Also, how much "bloat" does solar designer's proposal (unroll inside
> BF_body) add?

Source bloat, even worse than either version. It requires completely
duplicating the whole function (once unrolled, once straight). I have
no idea how much binary bloat it adds; anybody care to try it?

My principal hesitation to even go there is that it (1) makes really
ugly source bloat, and (2) perhaps cuts the binary bloat savings in
half or even worse, making the savings marginal and arguably no longer
worth the cost of the source bloat from having 2 copies of the same
code.

Rich
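
To make the "2 copies of the same code" concern concrete, here is a
minimal toy sketch. It is not the Blowfish code from crypt_blowfish.c:
toy_step, toy_key, toy_rolled, toy_unrolled, toy_rounds and
toy_selfcheck are all hypothetical names, and the FAST_CRYPT switch is
only modelled on the -DFAST_CRYPT suggestion quoted above. It shows
that once a straight and a manually unrolled variant coexist, every
change has to be made twice and each build variant needs its own
check that the two still agree.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ROUNDS 4

/* Hypothetical round keys, chosen only for illustration. */
static const uint32_t toy_key[TOY_ROUNDS] = {
    0x243f6a88u, 0x85a308d3u, 0x13198a2eu, 0x03707344u
};

/* Hypothetical toy mixing step; NOT the Blowfish round function. */
static uint32_t toy_step(uint32_t x, uint32_t k)
{
    x ^= k;
    x = (x << 5) | (x >> 27);   /* rotate left by 5 bits */
    return x + 0x9e3779b9u;     /* unsigned add, wraps modulo 2^32 */
}

/* Straight version: a single loop over the rounds. */
uint32_t toy_rolled(uint32_t x)
{
    for (int i = 0; i < TOY_ROUNDS; i++)
        x = toy_step(x, toy_key[i]);
    return x;
}

/* Manually unrolled copy of the same logic. Any change to the loop
 * above has to be repeated here by hand. */
uint32_t toy_unrolled(uint32_t x)
{
    x = toy_step(x, toy_key[0]);
    x = toy_step(x, toy_key[1]);
    x = toy_step(x, toy_key[2]);
    x = toy_step(x, toy_key[3]);
    return x;
}

/* Hypothetical build-time switch between the two copies. */
#ifdef FAST_CRYPT
#define toy_rounds toy_unrolled
#else
#define toy_rounds toy_rolled
#endif

/* The extra testing burden: both copies must be kept in agreement. */
void toy_selfcheck(void)
{
    static const uint32_t inputs[] = { 0u, 1u, 0xdeadbeefu, 0xffffffffu };
    for (unsigned i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
        assert(toy_rolled(inputs[i]) == toy_unrolled(inputs[i]));
}
```

Callers would use toy_rounds(x) and get whichever copy the build
selected. Calling toy_selfcheck() in both the default and the
-DFAST_CRYPT build is the kind of doubled test matrix the message
argues against.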