From: Baptiste Jonglez <baptiste@bitsofnetworks.org>
To: "René van Dorst" <opensource@vdorst.com>
Cc: wireguard@lists.zx2c4.com
Subject: Re: [WireGuard] News about MIPS and ARM optimized code?
Date: Fri, 9 Sep 2016 15:52:02 +0200
Message-ID: <20160909135202.GA32666@lud.imag.fr>
In-Reply-To: <20160909134611.Horde.d1CtbRQrioV8yr-kI71aUI3@www.vdorst.com>
Nice work! I had tried to write chacha20_generic_block in MIPS assembly,
but I got confused with endianness issues and the code didn't work in the
end.
Is your code available somewhere? I'd be happy to test on a variety of
MIPS routers.
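On the endianness confusion mentioned above: as far as the ChaCha20 specification (RFC 7539) goes, the rounds operate on 32-bit words in native byte order, and byte order only matters when key and nonce bytes are loaded into the state and when keystream words are written out, both of which are little-endian. Below is a minimal C sketch of that split, not the code from the WireGuard tree. The helper names (load32_le, store32_le, rotl32, chacha_quarter_round) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word byte by byte. This gives the same
 * result on big- and little-endian hosts and needs no alignment. */
static uint32_t load32_le(const uint8_t *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit word as four little-endian bytes. */
static void store32_le(uint8_t *p, uint32_t w)
{
	assert(p != NULL);
	p[0] = (uint8_t)(w >> 0);
	p[1] = (uint8_t)(w >> 8);
	p[2] = (uint8_t)(w >> 16);
	p[3] = (uint8_t)(w >> 24);
}

/* Rotate left; n must be in 1..31. */
static uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on native-order words. No byte swapping
 * happens here; only load32_le/store32_le care about byte order. */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
				 uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

With this layout, an assembly version of the block function would only need byte-order handling at its input and output edges, which may make it easier to check against the C version on a big-endian router.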
On Fri, Sep 09, 2016 at 01:46:11PM +0000, René van Dorst wrote:
> Misaligned data fetches in functions like poly1305 cause a regression on
> MIPS.
>
> h0 += (le32_to_cpuvp(src + 0) >> 0) & 0x3ffffff;
> h1 += (le32_to_cpuvp(src + 3) >> 2) & 0x3ffffff;
> h2 += (le32_to_cpuvp(src + 6) >> 4) & 0x3ffffff;
> h3 += (le32_to_cpuvp(src + 9) >> 6) & 0x3ffffff;
> h4 += (le32_to_cpuvp(src + 12) >> 8) | hibit;
>
>
> Had 26 Mbit/s, now 42+ Mbit/s.
>
> root@lede:~# iperf3 -c 10.0.0.1 -i 10
> Connecting to host 10.0.0.1, port 5201
> [ 4] local 10.0.0.2 port 36216 connected to 10.0.0.1 port 5201
> [ ID] Interval Transfer Bandwidth Retr Cwnd
> [ 4] 0.00-10.08 sec 51.2 MBytes 42.7 Mbits/sec 0 171 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Retr
> [ 4] 0.00-10.08 sec 51.2 MBytes 42.7 Mbits/sec 0 sender
> [ 4] 0.00-10.08 sec 51.2 MBytes 42.7 Mbits/sec receiver
>
> iperf Done.
> root@lede:~# iperf3 -c 10.0.0.1 -u -b 1G -i 10
> Connecting to host 10.0.0.1, port 5201
> [ 4] local 10.0.0.2 port 60714 connected to 10.0.0.1 port 5201
> [ ID] Interval Transfer Bandwidth Total Datagrams
> [ 4] 0.00-10.00 sec 56.3 MBytes 47.2 Mbits/sec 7209
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
> [ 4] 0.00-10.00 sec 56.3 MBytes 47.2 Mbits/sec 0.034 ms 0/7209 (0%)
> [ 4] Sent 7209 datagrams
>
> iperf Done.
> root@lede:~#
>
>
> The work is not done yet, but it is a good start.
>
> Greetings,
>
> René van Dorst.
>
> Quoting René van Dorst <opensource@vdorst.com>:
>
> >I tried writing some MIPS32r2 code.
> >I wrote chacha20_keysetup, chacha20_generic_block and
> >poly1305_generic_blocks in assembly.
> >I tried to keep all needed variables in registers, which should reduce
> >the memory overhead.
> >But it is very difficult for me to profile the code or to isolate it
> >and build benchmark programs like SUPERCOP.
> >So testing was simple: cross-compile the code, copy and load the module on
> >the target, then run the setup script and iperf.
> >
> >#ifdef CONFIG_CPU_MIPS32_R2
> >asmlinkage void chacha20_keysetup(struct chacha20_ctx *ctx,
> >	const u8 key[static 32], const u8 nonce[static 8]);
> >asmlinkage void chacha20_generic_block(struct chacha20_ctx *ctx);
> >asmlinkage unsigned int poly1305_generic_blocks(struct poly1305_ctx *ctx,
> >	const u8 *src, unsigned int srclen, u32 hibit);
> >#endif
> >
> >But the speed is equal or lower on my TP WR1043ND device, which is a
> >big-endian MIPS32r2 24Kc.
> >So GCC does a good job. Also, the 24Kc has no special coprocessors or FPU.
> >
> >The biggest improvement I got was from changing the buildroot default
> >optimization from -Os to -O2.
> >This gives around a 1-3% speed improvement.
> >
> >ideas:
> >- Remove the little-endian conversions on MIPS.
> > Of course, the other side would have to do the same.
> > On this device I can't switch endianness.
> > But I did not see any improvement. Swapping a 32-bit register takes
> >2 instructions.
> > After a quick calculation, it could save around 0.4%, which is ~0.1 Mbit/s
> >on this device.
> >
> >Greetings,
> >
> >René van Dorst.
> >
> >_______________________________________________
> >WireGuard mailing list
> >WireGuard@lists.zx2c4.com
> >http://lists.zx2c4.com/mailman/listinfo/wireguard
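A note on the five poly1305 loads quoted above: they start at byte offsets 0, 3, 6, 9 and 12, so at least three of them are not 4-byte aligned no matter how src itself is aligned. One portable way to do such loads is to assemble each word from individual bytes, which also yields the little-endian value directly on a big-endian host. The following is a minimal C sketch of that idea. The names poly1305_add_block and load32_le are hypothetical; this is neither René's assembly nor the kernel's le32_to_cpuvp helper, and whether it is faster on a 24Kc has not been measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-wise little-endian load: safe at any alignment and on any
 * host byte order. */
static uint32_t load32_le(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Split one 16-byte block into five 26-bit limbs and add them to the
 * accumulator h, mirroring the five statements from the quoted mail.
 * hibit is OR-ed into the top limb as in the original. */
static void poly1305_add_block(uint32_t h[5], const uint8_t src[16],
			       uint32_t hibit)
{
	assert(h != NULL && src != NULL);
	h[0] += (load32_le(src + 0) >> 0) & 0x3ffffff;
	h[1] += (load32_le(src + 3) >> 2) & 0x3ffffff;
	h[2] += (load32_le(src + 6) >> 4) & 0x3ffffff;
	h[3] += (load32_le(src + 9) >> 6) & 0x3ffffff;
	h[4] += (load32_le(src + 12) >> 8) | hibit;
}
```

A caller would zero a uint32_t h[5] accumulator once and then call poly1305_add_block(h, block, 1 << 24) for each full 16-byte block; the multiplication by r and the reduction are left out of this sketch.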