From: David Laight <David.Laight@ACULAB.COM>
To: "'Jason A. Donenfeld'" <Jason@zx2c4.com>,
Eric Biggers <email@example.com>
Cc: Linux Crypto Mailing List <firstname.lastname@example.org>,
WireGuard mailing list <email@example.com>,
LKML <firstname.lastname@example.org>, bpf <email@example.com>,
"Geert Uytterhoeven" <firstname.lastname@example.org>,
Theodore Ts'o <email@example.com>,
"Greg Kroah-Hartman" <firstname.lastname@example.org>,
Jean-Philippe Aumasson <email@example.com>,
Ard Biesheuvel <firstname.lastname@example.org>,
"Herbert Xu" <email@example.com>
Subject: RE: [PATCH crypto 1/2] lib/crypto: blake2s-generic: reduce code size on small systems
Date: Wed, 12 Jan 2022 21:27:40 +0000
Message-ID: <d7e206a5a03d46a69c0be3b8ed651518@AcuMS.aculab.com>
From: Jason A. Donenfeld
> Sent: 12 January 2022 18:51
> On Wed, Jan 12, 2022 at 7:32 PM Eric Biggers <firstname.lastname@example.org> wrote:
> > How about unrolling the inner loop but not the outer one? Wouldn't that give
> > most of the benefit, without hurting performance as much?
> > If you stay with this approach and don't unroll either loop, can you use 'r' and
> > 'i' instead of 'i' and 'j', to match the naming in G()?
> All this might work, sure. But as mentioned earlier, I've abandoned
> this entirely, as I don't think this patch is necessary. See the v3
> patchset instead:
I think you mentioned in another thread that the buffers (e.g. for IPv6
addresses) are actually often quite short.
For short buffers the 'rolled-up' loop may be of similar performance
to the unrolled one because of the time taken to read all the instructions
into the I-cache and decode them.
If the loop ends up small enough it will fit into the 'decoded loop
buffer' of a modern Intel x86 CPU and won't even need decoding on
each iteration.
I really suspect that the heavily unrolled loop is only really fast
for big buffers and/or when it is already in the I-cache.
In real life I wonder how often that actually happens?
Especially for the uses the kernel is making of the code.
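To make the rolled/unrolled distinction concrete, here is a minimal sketch.
The mixing step is a made-up add-rotate-xor operation, not the real BLAKE2s
G function, and all names (toy_step, toy_mix_rolled, toy_mix_unrolled) are
hypothetical. Both variants compute the same value; they differ only in how
much instruction text has to be fetched and decoded.

```c
#include <stdint.h>

/* Rotate right; n must be in the range 1..31. */
static inline uint32_t ror32(uint32_t v, unsigned int n)
{
	return (v >> n) | (v << (32 - n));
}

/* One toy add-rotate-xor step (an assumption, NOT the BLAKE2s G function). */
static inline uint32_t toy_step(uint32_t s, uint32_t m, uint32_t r)
{
	s += m + r;
	return ror32(s, 7) ^ m;
}

/* Rolled: one copy of the body, so a small instruction footprint. */
uint32_t toy_mix_rolled(uint32_t s, const uint32_t m[4])
{
	for (unsigned int r = 0; r < 4; r++)
		s = toy_step(s, m[r], r);
	return s;
}

/* Unrolled by hand: no loop branch, but four copies of the body to fetch. */
uint32_t toy_mix_unrolled(uint32_t s, const uint32_t m[4])
{
	s = toy_step(s, m[0], 0);
	s = toy_step(s, m[1], 1);
	s = toy_step(s, m[2], 2);
	s = toy_step(s, m[3], 3);
	return s;
}
```

Note that an optimising compiler may unroll the rolled version itself, so
the generated code has to be checked before drawing conclusions about size.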
You need to benchmark single executions of the function
(doable on x86 with the performance monitor cycle counter)
to get typical/best clocks/byte figures rather than a
big average for repeated operation on a long buffer.
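A rough userspace sketch of that kind of measurement follows. It times each
call on its own and keeps the first and the fastest result instead of an
average. As an assumption for portability it uses clock_gettime() rather
than the x86 performance monitor cycle counter, so it reports nanoseconds,
not clocks; toy_hash and time_single_calls are hypothetical names, and
toy_hash is only a stand-in for the function under test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef void (*bench_fn)(const uint8_t *buf, size_t len);

struct single_call_stats {
	uint64_t first_ns; /* first call, when the code is least likely to be cached */
	uint64_t best_ns;  /* fastest single call observed */
};

/* Monotonic timestamp in nanoseconds (stand-in for a cycle counter). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Stand-in workload; the volatile sink keeps the loop from being removed. */
static volatile uint32_t toy_sink;

void toy_hash(const uint8_t *buf, size_t len)
{
	uint32_t h = 0;

	for (size_t i = 0; i < len; i++)
		h = h * 31u + buf[i];
	toy_sink = h;
}

/*
 * Time 'runs' individual calls of fn on a short buffer.
 * With runs == 0, best_ns stays at UINT64_MAX and first_ns at 0.
 */
struct single_call_stats time_single_calls(bench_fn fn, const uint8_t *buf,
					   size_t len, unsigned int runs)
{
	struct single_call_stats st = { 0, UINT64_MAX };

	for (unsigned int i = 0; i < runs; i++) {
		uint64_t t0 = now_ns();
		fn(buf, len);
		uint64_t dt = now_ns() - t0;

		if (i == 0)
			st.first_ns = dt;
		if (dt < st.best_ns)
			st.best_ns = dt;
	}
	return st;
}
```

Calling time_single_calls(toy_hash, buf, 16, 100) on a 16-byte buffer gives
a first-call figure and a best-case figure. The first call is only an
approximation of a cold I-cache, and the timer overhead is included in both
numbers, so for short buffers a cycle counter gives much better resolution.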