Development discussion of WireGuard
 help / color / mirror / Atom feed
From: Fatih USTA <fatihusta86@gmail.com>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: WireGuard mailing list <wireguard@lists.zx2c4.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: wg-crypt-wg0 process
Date: Thu, 31 Dec 2020 08:47:28 +0300	[thread overview]
Message-ID: <73b49753-da23-eef3-890b-92e85278d882@gmail.com> (raw)
In-Reply-To: <CAHmME9oORiNMN6EyAWOy3Fxzo33etkQxQtQbKdPqw_fWdw9wtw@mail.gmail.com>

Hi Jason,

Thanks for the detailed research and explanation.That's ok for me.

Regards.

Fatih USTA

On 30.12.2020 15:39, Jason A. Donenfeld wrote:
> Hi Fatih,
>
> Thanks for the report and the detailed test case. From what I can see,
> this behavior presents itself both with the explicit ip link del and
> without. When running with debugging enabled, I can see this in dmesg:
>
> [558758.361056] wireguard: wg0: Keypair 244 destroyed for peer 21
> [558758.546649] wireguard: wg0: Peer 21 (10.150.150.2:51820) destroyed
> [558758.563317] wireguard: wg0: Interface destroyed
> [558758.567803] wireguard: wg0: Keypair 243 destroyed for peer 22
> [558758.733287] wireguard: wg0: Peer 22 (10.150.150.1:51820) destroyed
> [558758.749991] wireguard: wg0: Interface destroyed
>
> The fact that I see "Interface destroyed" for both interfaces means
> that wg_destruct() is being called, which includes these calls:
>
>          destroy_workqueue(wg->handshake_receive_wq);
>          destroy_workqueue(wg->handshake_send_wq);
>          destroy_workqueue(wg->packet_crypt_wq);
>
> In doing so, the output of ps changes from:
>
> $ ps aux|grep wg0
> root      200479  0.0  0.0      0     0 ?        I    13:06   0:00
> [kworker/0:2-wg-crypt-wg0]
> root      201226  0.0  0.0      0     0 ?        I    13:08   0:00
> [kworker/1:4-wg-crypt-wg0]
> root      201476  0.0  0.0      0     0 ?        I<   13:11   0:00
> [wg-crypt-wg0]
> root      201484  0.0  0.0      0     0 ?        I<   13:11   0:00
> [wg-crypt-wg0]
>
> to:
>
> $ ps aux|grep wg0
> root      200479  0.0  0.0      0     0 ?        I    13:06   0:00
> [kworker/0:2-wg-crypt-wg0]
> root      201226  0.0  0.0      0     0 ?        I    13:08   0:00
> [kworker/1:4-wg-crypt-wg0]
>
> What I suspect is happening is that destroying the workqueue does not
> actually destroy the kthreads that they were using, so that they can
> be reused (and eventually relabeled) by other drivers. Looking at the
> stack of those indicates this is probably the case:
>
> $ cat /proc/200479/stack
> [<0>] worker_thread+0xba/0x3c0
> [<0>] kthread+0x114/0x130
> [<0>] ret_from_fork+0x1f/0x30
>
> So it's just hanging out there idle waiting to be scheduled by
> something new. In fact, while I was writing this email, that worker
> already seems to have been reclaimed by another driver:
>
> $ cat /proc/200479/comm
> kworker/0:2-events
>
> Now it's called "events".
>
> This is happening because the kthread isn't actually destroyed, and
> task->comm is being hijacked. In proc_task_name we have:
>
>          if (p->flags & PF_WQ_WORKER)
>                 wq_worker_comm(tcomm, sizeof(tcomm), p);
>         else
>                 __get_task_comm(tcomm, sizeof(tcomm), p);
>
> That top condition holds for workqueue workers, and wq_worker_comm
> winds up scnprintf'ing the current worker description in there:
>
>                          /*
>                          * ->desc tracks information (wq name or
>                          * set_worker_desc()) for the latest execution.  If
>                          * current, prepend '+', otherwise '-'.
>                          */
>                         if (worker->desc[0] != '\0') {
>                                 if (worker->current_work)
>                                         scnprintf(buf + off, size - off, "+%s",
>                                                   worker->desc);
>                                 else
>                                         scnprintf(buf + off, size - off, "-%s",
>                                                   worker->desc);
>
> But worker->desc isn't set until process_one_work is called:
>
>          /*
>          * Record wq name for cmdline and debug reporting, may get
>          * overridden through set_worker_desc().
>          */
>         strscpy(worker->desc, pwq->wq->name, WORKER_DESC_LEN);
>
> And it's never unset after the work is done and it's waiting idle in
> worker_thread for the scheduler to reschedule it and eventually call
> process_one_work on a new work unit.
>
> It would be easy to just null out that string after the work is done
> with something like:
>
>          worker->current_func(work);
>          worker->desc[0] = '\0';
>
> But I guess this has never sufficiently bothered anyone before. I
> suppose I could submit a patch and see how it's received. But it also
> looks like the scnprintf above in wq_worker_comm distinguishes these
> cases anyway. If there's a + it means that the work is active and if
> there's a - it means that it's an old leftover thread. So maybe this
> is fine as-is?
>
> Jason

      reply	other threads:[~2020-12-31  5:47 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-30  8:19 Fatih USTA
2020-12-30  9:29 ` John Sager
2020-12-30 12:39 ` Jason A. Donenfeld
2020-12-31  5:47   ` Fatih USTA [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=73b49753-da23-eef3-890b-92e85278d882@gmail.com \
    --to=fatihusta86@gmail.com \
    --cc=Jason@zx2c4.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=wireguard@lists.zx2c4.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).