From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/1113
Path: news.gmane.org!not-for-mail
From: Rich Felker <dalias@aerifal.cx>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: FreeSec crypt()
Date: Wed, 13 Jun 2012 10:53:18 -0400
Message-ID: <20120613145318.GC163@brightrain.aerifal.cx>
References: <20120612235113.GA21296@openwall.com>
 <20120613011842.GA163@brightrain.aerifal.cx>
 <20120613120754.GA21900@openwall.com>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: dough.gmane.org 1339599472 32619 80.91.229.3 (13 Jun 2012 14:57:52 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Wed, 13 Jun 2012 14:57:52 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-1114-gllmg-musl=m.gmane.org@lists.openwall.com Wed Jun 13 16:57:51 2012
Return-path: <musl-return-1114-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@plane.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by plane.gmane.org with smtp (Exim 4.69)
	(envelope-from <musl-return-1114-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1Sep0x-0007yN-E5
	for gllmg-musl@plane.gmane.org; Wed, 13 Jun 2012 16:57:43 +0200
Original-Received: (qmail 14163 invoked by uid 550); 13 Jun 2012 14:57:43 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
Original-Received: (qmail 14152 invoked from network); 13 Jun 2012 14:57:42 -0000
Content-Disposition: inline
In-Reply-To: <20120613120754.GA21900@openwall.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Xref: news.gmane.org gmane.linux.lib.musl.general:1113
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/1113>

On Wed, Jun 13, 2012 at 04:07:54PM +0400, Solar Designer wrote:
> > I also removed
> > some unused code for things like keeping salt/keys between multiple
> > runs since the data is not preserved across multiple runs anyway,
> 
> It can be preserved if you make the (tiny) struct _crypt_extended_local
> static.

I suppose this is worth considering. I was assuming it would be
difficult to preserve (require otherwise-unneeded locking), but for
crypt_r each caller could be responsible for its own copy.

However, on the other hand I'm not sure I see the benefit. Why would
you call crypt or crypt_r multiple times with the same key/salt except
possibly for password cracking (in which case you'll want a highly
optimized-for-speed implementation)?

> > and the lookup tables for bitshifts.
> 
> As discussed on IRC, I had tried that, but reverted my changes because
> they resulted in code size increase (greater than the savings from not
> having those tables).  Apparently, the table lookups helped avoid mov's
> (on 2-operand archs like x86) and add/sub (becomes part of effective
> address calculation).  Did you check how this affected total size in
> your case?  Or do you prefer cleaner source code anyway (makes sense)?

I didn't compare, but yes I do prefer cleaner source. I'm a bit
surprised that you found the difference in code size to be >136 bytes
though. Are you sure the code size changed by that much?

> > > Also, we could want to add a runtime self-test, which would detect
> > > possible miscompiles.
> > 
> > I understand your motivation for doing this with security-critical
> > things, but really most/all of libc is security-critical, and we can't
> > have runtime miscompilation tests all over the place. Moreover, the
> > vast majority of cases of GCC "miscompiling" something turn out to be
> > code that's invoking undefined behavior; the only non-UB example I've
> > encountered while working on musl is gcc 4.7's bad breakage, which is
> > so bad you can't even get programs to start...
> 
> I agree that the vast majority of cases may involve UB, but so what - we
> can't be 100% certain we don't have any cases of UB in our source code
> even if we review it very carefully.

For purely computational code that doesn't recurse or loop arbitrarily
long based on the input or read and write all over memory, static
analysis can determine the absence of UB. Having no UB, and having the
code tested with both signed and unsigned char (easy with gcc -f
options) should give an extremely high level of confidence that cases
which have been tested on one system/compiler will behave identically
on another system or another compiler.

> We might miss things, and besides
> source code changes over time may introduce UB (e.g., changes in
> variable declarations may result in code that continues to work
> correctly, but actually depends on undefined behavior).

And the same is true elsewhere in much more critical code, like
snprintf, strcpy, strchr, etc. all of which could create huge vulns if
miscompiled. What I was saying is that if you're going to call into
question the compiler, I think you need some basis for deciding which
functions need to double-check that their results are correct, unless
your intent is to make ALL of your code fault-tolerant in the form of
double-checking its results. The latter would certainly be an
interesting project in itself, but I think it goes way beyond the
scope (and contrary to many of the goals, including size, speed, and
simplicity) of musl. If one really wants to make such a system, I
think it would make a lot more sense to have the checking
automatically generated by the compiler/toolchain (e.g. use 2 separate
compilers and compare the results constantly).

> We have a fairly large chunk of code here (a few KB), which we can have
> self-tested by a few hundred bytes of code more.  I think this is worth
> it.

I'm open to seeing them if they're really that small and non-invasive.

> Besides, the self-test may serve to zeroize sensitive data - and in
> a more reliable way than a memset() would (which may be optimized out,
> and it would not necessarily zeroize stack and registers).  In fact, I
> am using this approach to zeroization in crypt_blowfish.  Yes, the
> self-test needs to run _after_ hashing the real password, and then
> depending on self-test result you either return the originally computed


> value or you return an error indication (what to return from crypt() on
> error is a separate non-trivial topic).

I saw the notes on this. What real-world code breaks from the
conformant behavior?

> >  * This version is derived from the original implementation of FreeSec
> ....
> 
> Can you please add a comment documenting your changes, too?

Sure, I'll do that before committing. What I sent to the list was just
a quick draft, and I certainly don't want to misrepresent my modified
version as if it were your original, especially if it has bugs. I
really should have added a comment before even sending it...

> I notice that you fully dropped the tests - are you going to have
> compile-time tests elsewhere?

I just did it while shuffling around the code -- I think I moved them
and the adjacent table-gen code I wrote to a separate file at the same
time. I'm fairly indifferent to putting them in the file, but I'd much
rather have a test for the external API in an external test module as
the actual tests we use.

Rich