Hello,

I've been evaluating WireGuard as a replacement for a setup that uses OpenVPN. Initial tests look promising in terms of system resources required (much less CPU than OpenVPN), but I'm encountering a fair amount of packet loss and I can't see why.

The scenario is a public API endpoint that devices ping with a reasonably hefty payload. The payload is received by nginx which proxies it over a tunnel (via public network) to a server downstream.

WireGuard version is 0.0.20190406-1.

The test server is an Intel i5-4460 running Debian, with 4.19.0-5-amd64 kernel.

load average: 2.18, 2.12, 2.12
%Cpu(s): 25.2 us, 3.0 sy, 0.0 ni, 68.3 id, 0.0 wa, 0.0 hi, 3.5 si, 0.0 st

So basically, the traffic isn't exceptionally heavy and it is pretty stable in terms of volume, and the machine is not doing anything else.

Looking at the wg0 interface, I see it dropping a fair amount of RX packets. Doing some maths with /sys/class/net/wg0/statistics, it shows the interface is receiving about 600KB/sec and around 5000pps. The RX dropped counter is rising at about 120-150pps (roughly 2-3%), and this shows up as an error to the sender, which then has to explicitly retry (this is how I became aware of the problem in the first place).
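
For anyone wanting to reproduce those figures, the arithmetic amounts to sampling the counters twice and dividing the deltas by the interval. Below is a minimal Python sketch of that, not the exact commands I ran; the helper names (read_counters, rates, sample) are made up for illustration, and it assumes the standard rx_bytes, rx_packets and rx_dropped files under /sys/class/net/<iface>/statistics.

```python
import time
from pathlib import Path

# Counter files read from /sys/class/net/<iface>/statistics
COUNTERS = ("rx_bytes", "rx_packets", "rx_dropped")


def read_counters(iface, root="/sys/class/net"):
    """Return the current RX counters for an interface as integers."""
    stats = Path(root) / iface / "statistics"
    return {name: int((stats / name).read_text()) for name in COUNTERS}


def rates(before, after, seconds):
    """Turn two counter snapshots taken `seconds` apart into per-second rates."""
    pps = (after["rx_packets"] - before["rx_packets"]) / seconds
    dropped = (after["rx_dropped"] - before["rx_dropped"]) / seconds
    return {
        "kB_per_s": (after["rx_bytes"] - before["rx_bytes"]) / seconds / 1000,
        "pps": pps,
        "dropped_pps": dropped,
        # Drops relative to received packets, the same ratio quoted above
        "drop_pct": 100 * dropped / pps if pps else 0.0,
    }


def sample(iface="wg0", interval=10):
    """Snapshot the counters, wait `interval` seconds, and report the rates."""
    first = read_counters(iface)
    time.sleep(interval)
    return rates(first, read_counters(iface), interval)
```

Calling sample("wg0", 10) on the server prints nothing by itself; it returns a dict with the four rates. With the numbers above, 150 dropped out of 5000 received per second works out to 3%.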

The underlying eth0 interface isn't seeing a single packet dropped or any errors.

eth0 mtu is 1500, wg0 mtu is 1420 (haven't touched these).

I've tried raising txqueuelen, and raising net.core.rmem_max and net.core.rmem_default to stupidly high values, with no difference.

I've tried setting net.ipv4.tcp_rmem='16384 33554432 67108864', and increasing net.core.netdev_max_backlog and net.ipv4.udp_mem, but nothing changes. So rather than try even more random changes, I'm wondering if anybody recognizes the symptoms, and what the fix is? I think that covers it, but feel free to ask for other metrics.

The exact same machine using OpenVPN dropped nothing (although user cpu was closer to 60%).

Thanks,
Ian.


