Development discussion of WireGuard
From: David Laight <David.Laight@ACULAB.COM>
To: "'Jason A. Donenfeld'" <Jason@zx2c4.com>,
	Herbert Xu <herbert@gondor.apana.org.au>
Cc: "geert@linux-m68k.org" <geert@linux-m68k.org>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"wireguard@lists.zx2c4.com" <wireguard@lists.zx2c4.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"tytso@mit.edu" <tytso@mit.edu>,
	 "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"jeanphilippe.aumasson@gmail.com"
	<jeanphilippe.aumasson@gmail.com>,
	"ardb@kernel.org" <ardb@kernel.org>
Subject: RE: [PATCH crypto v3 0/2] reduce code size from blake2s on m68k and other small platforms
Date: Tue, 18 Jan 2022 12:44:57 +0000
Message-ID: <ad862f5ad048404ab452e25bba074824@AcuMS.aculab.com>
In-Reply-To: <CAHmME9ogAW0o2PReNtsD+fFgwp28q2kP7WADtbd8kA7GsnKBpg@mail.gmail.com>

From: Jason A. Donenfeld
> Sent: 18 January 2022 11:43
> 
> On 1/18/22, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > As the patches that triggered this weren't part of the crypto
> > tree, this will have to go through the random tree if you want
> > them for 5.17.
> 
> Sure, will do.

I've rammed the code through godbolt... https://godbolt.org/z/Wv64z9zG8

Some things I've noticed:

1) There is no point having all the inline functions.
   Far better to have real functions to do the work.
   Given the cost of hashing 64 bytes of data the extra
   function call won't matter.
   Indeed for repeated calls it will help because the required
   code will be in the I-cache.

2) The compilers I tried do manage to remove the blake2_sigma[][]
   lookups when unrolling everything - which is a slight gain for the
   full unroll. But I doubt it is that significant if the access can
   get sensibly optimised.
   For non-x86 that might require all the values to be multiplied by 4
   (so they can be used directly as byte offsets).

3) Although G() is a massive register dependency chain the compiler
   knows that G(,[0-3],) are independent and can execute in parallel.
   This does help execution time on multi-issue CPUs (like x86).
   With care it ought to be possible to use the same code for G(,[4-7],)
   without stopping the compiler interleaving all the instructions.

4) I strongly suspect that using a loop for the rounds will have
   minimal impact on performance - especially if the first call is
   'cold cache'.
   But I've not got time to test the code.
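
For illustration, here is a minimal sketch of what points 1, 3 and 4
could look like when taken together: G() as a real function, one loop
over the ten rounds, and the same G() call used for both the column
step G(,[0-3],) and the diagonal step G(,[4-7],). It follows the
BLAKE2s compression function as described in RFC 7693. The names
blake2s_g() and blake2s_compress_sketch() are hypothetical and are not
the kernel's functions. The counter is limited to 32 bits for brevity,
and nothing here has been benchmarked, so it makes no claim about how
well a given compiler interleaves the four calls.

```c
#include <stdint.h>

static const uint32_t blake2s_iv[8] = {
	0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
	0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19
};

/* Message word schedule, kept as a table because the rounds are a loop. */
static const uint8_t blake2s_sigma[10][16] = {
	{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 },
	{ 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 },
	{ 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 },
	{ 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 },
	{ 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 },
	{ 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 },
	{ 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 },
	{ 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 },
	{ 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 },
	{ 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 },
};

static uint32_t ror32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/* The mixing function G, written once as an ordinary function. */
static void blake2s_g(uint32_t v[16], unsigned int a, unsigned int b,
		      unsigned int c, unsigned int d, uint32_t x, uint32_t y)
{
	v[a] += v[b] + x;
	v[d] = ror32(v[d] ^ v[a], 16);
	v[c] += v[d];
	v[b] = ror32(v[b] ^ v[c], 12);
	v[a] += v[b] + y;
	v[d] = ror32(v[d] ^ v[a], 8);
	v[c] += v[d];
	v[b] = ror32(v[b] ^ v[c], 7);
}

/*
 * Compress one 64-byte block (already loaded as 16 little-endian words).
 * 't' is the byte counter (low 32 bits only in this sketch) and 'last'
 * is non-zero for the final block.
 */
static void blake2s_compress_sketch(uint32_t h[8], const uint32_t m[16],
				    uint32_t t, int last)
{
	uint32_t v[16];
	unsigned int r, i;

	for (i = 0; i < 8; i++) {
		v[i] = h[i];
		v[i + 8] = blake2s_iv[i];
	}
	v[12] ^= t;
	if (last)
		v[14] = ~v[14];

	for (r = 0; r < 10; r++) {
		const uint8_t *s = blake2s_sigma[r];

		/* Column step: four independent G() calls. */
		for (i = 0; i < 4; i++)
			blake2s_g(v, i, 4 + i, 8 + i, 12 + i,
				  m[s[2 * i]], m[s[2 * i + 1]]);
		/* Diagonal step: same code, rotated indices. */
		for (i = 0; i < 4; i++)
			blake2s_g(v, i, 4 + ((i + 1) & 3), 8 + ((i + 2) & 3),
				  12 + ((i + 3) & 3),
				  m[s[8 + 2 * i]], m[s[9 + 2 * i]]);
	}

	for (i = 0; i < 8; i++)
		h[i] ^= v[i] ^ v[i + 8];
}
```

To hash a short unkeyed message with this sketch, h[] would start as a
copy of blake2s_iv[] with h[0] XORed with 0x01010000 | outlen, followed
by a single call with 'last' set and 't' equal to the message length.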

	David

