[musl] crypt: dependency needed between main and test run?

mailing list of musl libc
 help / color / mirror / code / Atom feed

* [musl] crypt: dependency needed between main and test run?
@ 2023-05-27  9:52 Markus Wichmann
  2023-05-27 13:55 ` Rich Felker
  0 siblings, 1 reply; 3+ messages in thread
From: Markus Wichmann @ 2023-05-27  9:52 UTC (permalink / raw)
  To: musl

Hi all,

continuing my theme of not providing patches but rather asking
questions, I noticed something about all of the crypt() backends: They
all use a self test, and have no dependency between main and test calls
to their respective backends. E.g. in __crypt_sha512():

|p = sha512crypt(key, setting, output);
|/* self test and stack cleanup */
|q = sha512crypt(testkey, testsetting, testbuf);

The backend is itself a pure function. For any given input, it will
always make the same output. So I don't see a reason why the compiler
can't reorder these calls. The backend is a static function (and only
calling static functions) defined in the same file, so all of this is
available for the compiler to see.

Indeed the compiler might inline the second call, then keep constant
folding and loop unrolling until it is entirely evaluated at compile
time, and only the result remains. Constant folding and loop unrolling
aren't new ideas, but sha512 (and actually all of the hash functions)
would require the compiler to be a lot more agressive about them than
it normally is at this time. But that doesn't mean it cannot change in
future.

I'd propose to add an explicit dependency from the second to the first
call. Something like this:

p = sha512crypt(key, setting, output);
__asm__("" : "=x"(q) : "0"(testsetting), "x"(p));
q = sha512crypt(testkey, q, testbuf);

The asm statement basically tells the compiler that

q = f(testsetting, p);

where f is some unknown pure function. The fact that that function
happens to be the identity is unknown to the compiler.

Doing this has two purposes: One, it makes the second call depend on the
result of the first, and two, it obscures to the compiler that all
inputs to the second call are constant, and thus makes it extremely
unlikely that the test backend call would be entirely eliminated.

What do you guys think?

Ciao,
Markus

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [musl] crypt: dependency needed between main and test run?
  2023-05-27  9:52 [musl] crypt: dependency needed between main and test run? Markus Wichmann
@ 2023-05-27 13:55 ` Rich Felker
  2023-05-29 19:35   ` Solar Designer
  0 siblings, 1 reply; 3+ messages in thread
From: Rich Felker @ 2023-05-27 13:55 UTC (permalink / raw)
  To: Markus Wichmann; +Cc: musl, Solar Designer

On Sat, May 27, 2023 at 11:52:23AM +0200, Markus Wichmann wrote:
> Hi all,
> 
> continuing my theme of not providing patches but rather asking
> questions, I noticed something about all of the crypt() backends: They
> all use a self test, and have no dependency between main and test calls
> to their respective backends. E.g. in __crypt_sha512():
> 
> 
> |p = sha512crypt(key, setting, output);
> |/* self test and stack cleanup */
> |q = sha512crypt(testkey, testsetting, testbuf);
> 
> The backend is itself a pure function. For any given input, it will
> always make the same output. So I don't see a reason why the compiler
> can't reorder these calls. The backend is a static function (and only
> calling static functions) defined in the same file, so all of this is
> available for the compiler to see.
> 
> Indeed the compiler might inline the second call, then keep constant
> folding and loop unrolling until it is entirely evaluated at compile
> time, and only the result remains. Constant folding and loop unrolling
> aren't new ideas, but sha512 (and actually all of the hash functions)
> would require the compiler to be a lot more agressive about them than
> it normally is at this time. But that doesn't mean it cannot change in
> future.
> 
> I'd propose to add an explicit dependency from the second to the first
> call. Something like this:
> 
> p = sha512crypt(key, setting, output);
> __asm__("" : "=x"(q) : "0"(testsetting), "x"(p));
> q = sha512crypt(testkey, q, testbuf);
> 
> The asm statement basically tells the compiler that
> 
> q = f(testsetting, p);
> 
> where f is some unknown pure function. The fact that that function
> happens to be the identity is unknown to the compiler.
> 
> Doing this has two purposes: One, it makes the second call depend on the
> result of the first, and two, it obscures to the compiler that all
> inputs to the second call are constant, and thus makes it extremely
> unlikely that the test backend call would be entirely eliminated.
> 
> What do you guys think?

This is probably the right thing to do. I've CC'd Solar in case he has
any good ideas on this, since it was originally his design to do the
tests this way.

Rich

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [musl] crypt: dependency needed between main and test run?
  2023-05-27 13:55 ` Rich Felker
@ 2023-05-29 19:35   ` Solar Designer
  0 siblings, 0 replies; 3+ messages in thread
From: Solar Designer @ 2023-05-29 19:35 UTC (permalink / raw)
  To: Rich Felker; +Cc: Markus Wichmann, musl

Hi,

On Sat, May 27, 2023 at 09:55:13AM -0400, Rich Felker wrote:
> On Sat, May 27, 2023 at 11:52:23AM +0200, Markus Wichmann wrote:
> > continuing my theme of not providing patches but rather asking
> > questions, I noticed something about all of the crypt() backends: They
> > all use a self test, and have no dependency between main and test calls
> > to their respective backends. E.g. in __crypt_sha512():
> > 
> > 
> > |p = sha512crypt(key, setting, output);
> > |/* self test and stack cleanup */
> > |q = sha512crypt(testkey, testsetting, testbuf);
> > 
> > The backend is itself a pure function. For any given input, it will
> > always make the same output. So I don't see a reason why the compiler
> > can't reorder these calls. The backend is a static function (and only
> > calling static functions) defined in the same file, so all of this is
> > available for the compiler to see.
> > 
> > Indeed the compiler might inline the second call, then keep constant
> > folding and loop unrolling until it is entirely evaluated at compile
> > time, and only the result remains. Constant folding and loop unrolling
> > aren't new ideas, but sha512 (and actually all of the hash functions)
> > would require the compiler to be a lot more agressive about them than
> > it normally is at this time. But that doesn't mean it cannot change in
> > future.
> > 
> > I'd propose to add an explicit dependency from the second to the first
> > call. Something like this:
> > 
> > p = sha512crypt(key, setting, output);
> > __asm__("" : "=x"(q) : "0"(testsetting), "x"(p));
> > q = sha512crypt(testkey, q, testbuf);
> > 
> > The asm statement basically tells the compiler that
> > 
> > q = f(testsetting, p);
> > 
> > where f is some unknown pure function. The fact that that function
> > happens to be the identity is unknown to the compiler.
> > 
> > Doing this has two purposes: One, it makes the second call depend on the
> > result of the first, and two, it obscures to the compiler that all
> > inputs to the second call are constant, and thus makes it extremely
> > unlikely that the test backend call would be entirely eliminated.
> > 
> > What do you guys think?
> 
> This is probably the right thing to do. I've CC'd Solar in case he has
> any good ideas on this, since it was originally his design to do the
> tests this way.

Yes, I agree that we have the theoretical issues here that Markus
described, and the dummy inline asm is a way to avoid them.  Perhaps the
volatile keyword could be another way to avoid compile-time computation
of the self-test, but it might not prevent the reordering.

In crypt_blowfish.c, where I had originally introduced this design, we
partially mitigate these issues by having a dependency on the first
call's return value to decide between one of two test vectors for the
second call.  Theoretically, the compiler can still precompute the test
hashes for both possible inputs, but this is less likely.

While we're at it, I suggest usage of (almost?) the lowest supported
settings in the self-test - so for sha*crypt use rounds=1000 or e.g.
rounds=1001, not 1234.  In fact, in crypt_blowfish.c we override the
minimum for the self-test, to make the self-test's performance cost
negligible.  Maybe we should in sha*crypt, too.

Alexander

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-05-29 19:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-27  9:52 [musl] crypt: dependency needed between main and test run? Markus Wichmann
2023-05-27 13:55 ` Rich Felker
2023-05-29 19:35   ` Solar Designer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).