From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/1452
Path: news.gmane.org!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: crypt* files in crypt directory
Date: Wed, 8 Aug 2012 01:28:44 -0400
Message-ID: <20120808052844.GF27715@brightrain.aerifal.cx>
References: <20120808022421.GE27715@brightrain.aerifal.cx> <20120808044235.GA22470@openwall.com>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: dough.gmane.org 1344403694 8318 80.91.229.3 (8 Aug 2012 05:28:14 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Wed, 8 Aug 2012 05:28:14 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-1453-gllmg-musl=m.gmane.org@lists.openwall.com Wed Aug 08 07:28:15 2012
Return-path:
Envelope-to: gllmg-musl@plane.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1SyyoY-00072Q-Nd for gllmg-musl@plane.gmane.org; Wed, 08 Aug 2012 07:28:14 +0200
Original-Received: (qmail 9289 invoked by uid 550); 8 Aug 2012 05:28:13 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
Original-Received: (qmail 9263 invoked from network); 8 Aug 2012 05:28:08 -0000
Content-Disposition: inline
In-Reply-To: <20120808044235.GA22470@openwall.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Xref: news.gmane.org gmane.linux.lib.musl.general:1452
Archived-At:

On Wed, Aug 08, 2012 at 08:42:35AM +0400, Solar Designer wrote:
> On Tue, Aug 07, 2012 at 10:24:21PM -0400, Rich Felker wrote:
> > First, the compatibility code for the sign extension bug. How
> > important is it to keep this?
>
> Not very important, but nice to keep musl's code revision closer to
> upstream.
> [...]
> > I'm uncertain whether there's any portion of musl's user base that
> > this would be useful to.
>
> Maybe not.

After further reading, the cost is near zero. The compat hack is done
at the same time useful data is being computed. I see no reason to
disable/remove this feature unless the goal is to force people to stop
using old hashes that are likely-vulnerable.

> > Second, what can be done to reduce size?
>
> I felt the size was acceptable already. However, if you must, the
> instances of BF_ENCRYPT that are outside of BF_body may be made slower
> with little impact on overall speed. For example, they may be made a
> function rather than a macro, and the function would only be inlined in
> builds optimized for speed rather than size.
>
> > I think the first step is
> > replacing the giant macros (BF_ROUND, BF_ENCRYPT, etc.) with
> > functions so that the code doesn't get generated in duplicate unless
> > aggressive inlining is enabled by CFLAGS.
>
> I see that you did this - and I think you took it too far. The code
> became twice slower on Pentium 3 when compiling with gcc 3.4.5 (approx.
> 140 c/s down to 77 c/s). Adding -finline-functions
> -fold-unroll-all-loops regains only a fraction of the speed (112 c/s);
> less aggressive loop unrolling results in lower speeds.

Can you compare with a more modern gcc? 3.x is known to be horrible at
optimizing. It can't even peephole-optimize bswaps.

> The impact on x86-64 is less. With Ubuntu 12.04's gcc 4.6.3 on FX-8120
> I get 490 c/s for the original code, 450 c/s for your code without
> inlining/unrolling, and somehow only 430 c/s with -finline-functions
> -funroll-loops.

Actually this is a lot closer to what I expected. I think you'll find
similar results on 32-bit with gcc 4.6.3 too. The modern expectation
is that manually unrolling loops will give worse performance than
letting the compiler decide what to do. Certainly there are exceptions
to the expected result, but on average, it's the right decision.

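To illustrate the macro-versus-function trade-off discussed above, here
is a minimal sketch. It is not the crypt_blowfish code: MIX,
ROUND_MACRO, round_fn, encrypt_macro, encrypt_fn and variants_agree are
hypothetical names, the mixing step is a toy stand-in for a real cipher
round, and only four rounds are shown. The macro variant pastes the
round body at every use (manual unrolling); the function variant keeps
one copy and leaves inlining and unrolling to the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Toy mixing step standing in for a cipher round. This is NOT Blowfish. */
#define MIX(l, k) ((uint32_t)((((l) << 5) | ((l) >> 27)) + (k)))

/* Macro style: every use expands the full round body in the caller. */
#define ROUND_MACRO(L, R, K) do { (R) ^= MIX((L), (K)); } while (0)

/* Function style: one copy of the body; the compiler decides on inlining. */
static uint32_t round_fn(uint32_t l, uint32_t r, uint32_t k)
{
    return r ^ MIX(l, k);
}

/* Arbitrary demo subkeys, chosen only for illustration. */
static const uint32_t demo_key[4] = {
    0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u
};

/* Manually unrolled: four expansions of the round body. */
static void encrypt_macro(uint32_t *l, uint32_t *r)
{
    uint32_t L = *l, R = *r;
    ROUND_MACRO(L, R, demo_key[0]);
    ROUND_MACRO(R, L, demo_key[1]);
    ROUND_MACRO(L, R, demo_key[2]);
    ROUND_MACRO(R, L, demo_key[3]);
    *l = L;
    *r = R;
}

/* Rolled loop calling the function; same rounds in the same order. */
static void encrypt_fn(uint32_t *l, uint32_t *r)
{
    uint32_t L = *l, R = *r;
    for (int i = 0; i < 4; i += 2) {
        R = round_fn(L, R, demo_key[i]);
        L = round_fn(R, L, demo_key[i + 1]);
    }
    *l = L;
    *r = R;
}

/* Returns 1 when both variants produce the same output block. */
static int variants_agree(uint32_t l, uint32_t r)
{
    uint32_t l1 = l, r1 = r, l2 = l, r2 = r;
    encrypt_macro(&l1, &r1);
    encrypt_fn(&l2, &r2);
    return l1 == l2 && r1 == r2;
}

/* Sanity check that the two styles are interchangeable. */
static void check_variants(void)
{
    assert(variants_agree(0x01234567u, 0x89abcdefu));
}
```

Since both variants compute the same result, comparing the object size
and speed of each under -Os and -O3 shows how much of the difference
comes from the source-level expansion rather than from the compiler.
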
> I think you should revert the changes for the instance of BF_ENCRYPT
> that is inside of BF_body.
>
> I also think that this code should be optimized for speed even when the
> rest of musl is optimized for size. In this case, better speed may mean
> better security, because it lets the sysadmin configure a higher
> iteration count for new passwords.

Even if it's twice as slow, that should only be the cost of
incrementing the (logarithmic) iteration count by one. The size
difference between the versions is roughly 50% (7k vs 11.5k with -Os
and roughly 9k vs 13.5k with -O3). Yes, one can argue that the
difference doesn't matter for one particular component they especially
care about, but everyone cares about something different, and in the
end the whole library ends up 50% larger if you follow that to its
logical end. I'd much rather stick with letting the compiler do the
bloating-up for performance purposes if the user wants it, so that the
choice is left to them.

Rich
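
On the "logarithmic" iteration count mentioned above: in bcrypt-style
hashes the cost setting is the base-2 logarithm of the iteration count,
so each increment doubles the work. The sketch below makes that
arithmetic concrete. The function names are hypothetical, and the
reference cost of 5 is an assumption for illustration only; the thread
does not say at which cost the c/s figures were measured.

```c
#include <assert.h>
#include <stdint.h>

/* Iterations implied by a log2 cost setting; cost must be below 64. */
uint64_t iterations_for_cost(unsigned cost)
{
    return (uint64_t)1 << cost;
}

/*
 * Estimated seconds per hash at `cost`, given a rate (hashes per
 * second) measured at `ref_cost`. Assumes time scales linearly with
 * the iteration count.
 */
double seconds_per_hash(double rate_at_ref, unsigned ref_cost, unsigned cost)
{
    double scale = (double)iterations_for_cost(cost)
                 / (double)iterations_for_cost(ref_cost);
    return scale / rate_at_ref;
}

/* Sanity check: one cost step doubles the iteration count. */
void check_doubling(void)
{
    assert(iterations_for_cost(6) == 2 * iterations_for_cost(5));
}
```

Taking the quoted Pentium 3 figures as rates at the assumed reference
cost, seconds_per_hash(77.0, 5, 5) is about 1/77 s while
seconds_per_hash(140.0, 5, 6) is 2/140 = 1/70 s, so the slower build at
a given cost takes slightly less time than the faster build at one cost
step higher.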