CPU round-robin and isolated cores - Charles-François Natali

Development discussion of WireGuard

From: "Charles-François Natali" <cf.natali@gmail.com>
To: wireguard@lists.zx2c4.com
Subject: CPU round-robin and isolated cores
Date: Tue, 29 Mar 2022 23:16:45 +0100
Message-ID: <CAH_1eM3ASzMSFcCrutjkP135G567JKdcb++2yMpVV3a-fTPodA@mail.gmail.com>

Hi!

We've run into an issue where wireguard doesn't play nice with
isolated cores (`isolcpus` kernel parameter).

Basically we use `isolcpus` to isolate cores and explicitly bind our
low-latency processes to those cores, in order to minimize latency due
to the kernel and userspace.

It worked great until we started using wireguard. The problem seems to
come from the way work is allocated to the workqueues created here:
https://github.com/torvalds/linux/blob/ae085d7f9365de7da27ab5c0d16b12d51ea7fca9/drivers/net/wireguard/device.c#L335

I'm not familiar with the wireguard code at all so might be missing
something, but looking at e.g.
https://github.com/torvalds/linux/blob/ae085d7f9365de7da27ab5c0d16b12d51ea7fca9/drivers/net/wireguard/receive.c#L575
and https://github.com/torvalds/linux/blob/ae085d7f9365de7da27ab5c0d16b12d51ea7fca9/drivers/net/wireguard/queueing.h#L176
it seems that the RX path uses round-robin to dispatch the
packets to all online CPUs, including isolated ones:

```
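static inline int wg_queue_enqueue_per_device_and_peer(
    struct crypt_queue *device_queue, struct prev_queue *peer_queue,
    struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu)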
{
[...]
    /* Then we queue it up in the device queue, which consumes the
     * packet as soon as it can.
     */
    cpu = wg_cpumask_next_online(next_cpu);
    if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb)))
        return -EPIPE;
    queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work);
    return 0;
}
```

Where `wg_cpumask_next_online` is defined like this:
```
static inline int wg_cpumask_next_online(int *next)
{
    int cpu = *next;

    while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask)))
        cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
    *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
    return cpu;
}
```
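
To make the behaviour concrete, here is a minimal userspace sketch of the same round-robin logic over a plain 64-bit bitmask. It is only a model: `NR_CPUS`, `mask_test_cpu`, `mask_next` and `model_next_online` are hypothetical stand-ins for the kernel helpers, not kernel code, and the mask is assumed to be non-empty.

```c
#include <stdint.h>

#define NR_CPUS 8  /* hypothetical machine size for this model */

/* Stand-in for cpumask_test_cpu(): is bit `cpu` set in `mask`? */
static int mask_test_cpu(int cpu, uint64_t mask)
{
    return cpu >= 0 && cpu < NR_CPUS && ((mask >> cpu) & 1);
}

/* Stand-in for cpumask_next(): first set bit strictly after `cpu`,
 * or NR_CPUS if there is none.
 */
static int mask_next(int cpu, uint64_t mask)
{
    for (int n = cpu + 1; n < NR_CPUS; n++)
        if ((mask >> n) & 1)
            return n;
    return NR_CPUS;
}

/* Same structure as the quoted wg_cpumask_next_online(), over a bitmask.
 * `online` must have at least one bit set, otherwise this never returns.
 */
static int model_next_online(int *next, uint64_t online)
{
    int cpu = *next;

    while (!mask_test_cpu(cpu, online))
        cpu = mask_next(cpu, online) % NR_CPUS;
    *next = mask_next(cpu, online) % NR_CPUS;
    return cpu;
}
```

With all 8 CPUs online (`online = 0xFF`), successive calls return every CPU in turn. The only input is the online mask, so a CPU that is isolated but online is selected just as often as any other.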

This is a problem for us because it causes significant latency. For
example, this ftrace output shows a kworker bound to an isolated core
spending over 240 usec inside wg_packet_decrypt_worker (we've seen much
higher, up to 500 usec or even more):

```
kworker/47:1-2373323 [047] 243644.756405: funcgraph_entry: | process_one_work() {
kworker/47:1-2373323 [047] 243644.756406: funcgraph_entry: | wg_packet_decrypt_worker() {
[...]
kworker/47:1-2373323 [047] 243644.756647: funcgraph_exit: 0.591 us | }
kworker/47:1-2373323 [047] 243644.756647: funcgraph_exit: ! 242.655 us | }
```

With a physical NIC, we would typically set the IRQ affinity to avoid
those isolated cores, which would also keep the corresponding softirqs
off those cores and avoid this latency.

However it seems that there's currently no way to tell wireguard to
avoid those cores.

I was wondering if it would make sense for wireguard to ignore
isolated cores to avoid this kind of issue. As far as I can tell it should
be a matter of replacing usages of `cpu_online_mask` with
`housekeeping_cpumask(HK_TYPE_DOMAIN)`, possibly intersected with
`housekeeping_cpumask(HK_TYPE_WQ)` (the `HK_TYPE_*` values are enum
members rather than flags, so they cannot be OR-ed together).
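
As a rough illustration of the idea, and not a proposed patch, the selection could be modelled in userspace as below. The sketch assumes the housekeeping mask is simply "online CPUs minus isolated CPUs" and falls back to all online CPUs if nothing is left. All names (`MODEL_NR_CPUS`, `model_mask_next`, `model_next_allowed`, `model_next_housekeeping`) are hypothetical.

```c
#include <stdint.h>

#define MODEL_NR_CPUS 8  /* hypothetical machine size for this model */

/* First set bit strictly after `cpu` in `mask`, or MODEL_NR_CPUS if none. */
static int model_mask_next(int cpu, uint64_t mask)
{
    for (int n = cpu + 1; n < MODEL_NR_CPUS; n++)
        if ((mask >> n) & 1)
            return n;
    return MODEL_NR_CPUS;
}

/* Round-robin over `allowed`; the caller must pass a non-empty mask. */
static int model_next_allowed(int *next, uint64_t allowed)
{
    int cpu = *next;

    while (cpu < 0 || cpu >= MODEL_NR_CPUS || !((allowed >> cpu) & 1))
        cpu = model_mask_next(cpu, allowed) % MODEL_NR_CPUS;
    *next = model_mask_next(cpu, allowed) % MODEL_NR_CPUS;
    return cpu;
}

/* Hypothetical housekeeping-aware pick: online CPUs minus isolated ones.
 * Falls back to every online CPU if that leaves nothing to choose from.
 * `online` must have at least one bit set.
 */
static int model_next_housekeeping(int *next, uint64_t online,
                                   uint64_t isolated)
{
    uint64_t allowed = online & ~isolated;

    if (!allowed)
        allowed = online;
    return model_next_allowed(next, allowed);
}
```

For example, with `online = 0xFF` and `isolated = 0xC0` (CPUs 6 and 7), successive calls cycle through CPUs 0 to 5 only. The fallback is a design choice of this sketch, so that a machine with every CPU isolated still has somewhere to run the work.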

We could potentially run with a patched kernel but would very much
prefer using an upstream fix if that's acceptable.

Thanks in advance!

Charles
