* wg-crypt-wg0 process @ 2020-12-30 8:19 Fatih USTA 2020-12-30 9:29 ` John Sager 2020-12-30 12:39 ` Jason A. Donenfeld 0 siblings, 2 replies; 4+ messages in thread From: Fatih USTA @ 2020-12-30 8:19 UTC (permalink / raw) To: WireGuard mailing list Hi I'm playing wireguard with the namespace. I think I caught a litle problem. If I delete netns directly, everything is removed, but wg-crypt-wg0 process is still alive. root 8127 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] root 8143 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] root 8449 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] root 8454 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] If I delete first wireguard interface from the netns, everthing works fine. wg_version: 1.0.20201221 kernel_version: 3.16.85-1 #!/bin/bash case $1 in remove) ip link del dev bridge0 || { echo "Please add first."; exit 1; } ip link del dev veth1 ip link del dev veth2 #ip netns exec ns1 ip link del dev wg0 #ip netns exec ns2 ip link del dev wg0 ip netns del ns1 ip netns del ns2 iptables -D FORWARD -i bridge0 -o bridge0 -j ACCEPT rm -f /tmp/private-ns1 /tmp/private-ns2 /tmp/public-ns1 /tmp/public-ns2 ;; add) ip link add name bridge0 type bridge || { echo "Please remove first."; exit 1; } ip link set dev bridge0 up ip netns add ns1 ip netns add ns2 ip link add name veth1 type veth peer name eth0 netns ns1 ip link add name veth2 type veth peer name eth0 netns ns2 ip link set dev veth1 up master bridge0 ip link set dev veth2 up master bridge0 ip netns exec ns1 ip link set dev lo up ip netns exec ns1 ip link set dev eth0 up ip netns exec ns1 ip addr add 10.150.150.1/24 dev eth0 ip netns exec ns2 ip link set dev lo up ip netns exec ns2 ip link set dev eth0 up ip netns exec ns2 ip addr add 10.150.150.2/24 dev eth0 ( umask 0077; wg genkey | \ tee /tmp/private-ns1 | \ wg pubkey > /tmp/public-ns1 wg genkey | \ tee /tmp/private-ns2 | \ wg pubkey > /tmp/public-ns2 ) ip netns exec ns1 ip link add name wg0 type wireguard ip netns exec ns1 ip addr add 172.16.1.1/24 dev wg0 ip netns exec ns2 ip link add name wg0 type wireguard ip netns exec ns2 ip addr add 172.16.1.2/24 dev wg0 ip netns exec ns1 wg set wg0 private-key /tmp/private-ns1 listen-port 51820 ip netns exec ns1 ip link set wg0 up ip netns exec ns2 wg set wg0 private-key /tmp/private-ns2 listen-port 51820 ip netns exec ns2 ip link set wg0 up ip netns exec ns1 wg set wg0 peer "$(</tmp/public-ns2)" allowed-ips 172.16.1.0/24 endpoint 10.150.150.2:51820 ip netns exec ns2 wg set wg0 peer "$(</tmp/public-ns1)" allowed-ips 172.16.1.0/24 endpoint 10.150.150.1:51820 iptables -I FORWARD -i bridge0 -o bridge0 -j ACCEPT ip netns exec ns1 wg ip netns exec ns2 wg ip netns exec ns1 ping -i 0.3 -c 2 172.16.1.2 &>/dev/null && \ echo -e "\n\nWorked" || \ echo -e "\n\nFailed" ;; *)echo "$(basename $0) add|remove" ;; esac -- Fatih USTA ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: wg-crypt-wg0 process 2020-12-30 8:19 wg-crypt-wg0 process Fatih USTA @ 2020-12-30 9:29 ` John Sager 2020-12-30 12:39 ` Jason A. Donenfeld 1 sibling, 0 replies; 4+ messages in thread From: John Sager @ 2020-12-30 9:29 UTC (permalink / raw) To: wireguard The posted script works for me, Xubuntu 20.04 kernel 5.4.0-38-generic x86_64. The first time I ran it, it deleted both [wg-crypt-wg0] instances but left one kworker process: [kworker/0:0-wg-crypt-wg0]. I then ran it again and no wg kernel processes were left. regards, John On 30/12/2020 08:19, Fatih USTA wrote: > Hi > > I'm playing wireguard with the namespace. I think I caught a litle problem. > > If I delete netns directly, everything is removed, but wg-crypt-wg0 process > is still alive. > > root 8127 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] > root 8143 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] > root 8449 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] > root 8454 0.0 0.0 0 0 ? S< 07:26 0:00 [wg-crypt-wg0] > > If I delete first wireguard interface from the netns, everthing works fine. > > wg_version: 1.0.20201221 > kernel_version: 3.16.85-1 > > #!/bin/bash > > case $1 in > remove) > ip link del dev bridge0 || { echo "Please add first."; exit 1; } > ip link del dev veth1 > ip link del dev veth2 > #ip netns exec ns1 ip link del dev wg0 > #ip netns exec ns2 ip link del dev wg0 > ip netns del ns1 > ip netns del ns2 > iptables -D FORWARD -i bridge0 -o bridge0 -j ACCEPT > rm -f /tmp/private-ns1 /tmp/private-ns2 /tmp/public-ns1 > /tmp/public-ns2 > ;; > add) > ip link add name bridge0 type bridge || { echo "Please remove > first."; exit 1; } > ip link set dev bridge0 up > > ip netns add ns1 > ip netns add ns2 > ip link add name veth1 type veth peer name eth0 netns ns1 > ip link add name veth2 type veth peer name eth0 netns ns2 > ip link set dev veth1 up master bridge0 > ip link set dev veth2 up master bridge0 > > ip netns exec ns1 ip link set dev lo up > ip netns exec ns1 ip link set dev eth0 up > ip netns exec ns1 ip addr add 10.150.150.1/24 dev eth0 > > ip netns exec ns2 ip link set dev lo up > ip netns exec ns2 ip link set dev eth0 up > ip netns exec ns2 ip addr add 10.150.150.2/24 dev eth0 > > ( umask 0077; > wg genkey | \ > tee /tmp/private-ns1 | \ > wg pubkey > /tmp/public-ns1 > > wg genkey | \ > tee /tmp/private-ns2 | \ > wg pubkey > /tmp/public-ns2 > ) > > ip netns exec ns1 ip link add name wg0 type wireguard > ip netns exec ns1 ip addr add 172.16.1.1/24 dev wg0 > > ip netns exec ns2 ip link add name wg0 type wireguard > ip netns exec ns2 ip addr add 172.16.1.2/24 dev wg0 > > ip netns exec ns1 wg set wg0 private-key /tmp/private-ns1 > listen-port 51820 > ip netns exec ns1 ip link set wg0 up > > ip netns exec ns2 wg set wg0 private-key /tmp/private-ns2 > listen-port 51820 > ip netns exec ns2 ip link set wg0 up > > ip netns exec ns1 wg set wg0 peer "$(</tmp/public-ns2)" allowed-ips > 172.16.1.0/24 endpoint 10.150.150.2:51820 > ip netns exec ns2 wg set wg0 peer "$(</tmp/public-ns1)" allowed-ips > 172.16.1.0/24 endpoint 10.150.150.1:51820 > > iptables -I FORWARD -i bridge0 -o bridge0 -j ACCEPT > > ip netns exec ns1 wg > ip netns exec ns2 wg > ip netns exec ns1 ping -i 0.3 -c 2 172.16.1.2 &>/dev/null && \ > echo -e "\n\nWorked" || \ > echo -e "\n\nFailed" > ;; > *)echo "$(basename $0) add|remove" ;; > esac > > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: wg-crypt-wg0 process 2020-12-30 8:19 wg-crypt-wg0 process Fatih USTA 2020-12-30 9:29 ` John Sager @ 2020-12-30 12:39 ` Jason A. Donenfeld 2020-12-31 5:47 ` Fatih USTA 1 sibling, 1 reply; 4+ messages in thread From: Jason A. Donenfeld @ 2020-12-30 12:39 UTC (permalink / raw) To: Fatih USTA; +Cc: WireGuard mailing list, LKML Hi Fatih, Thanks for the report and the detailed test case. From what I can see, this behavior presents itself both with the explicit ip link del and without. When running with debugging enabled, I can see this in dmesg: [558758.361056] wireguard: wg0: Keypair 244 destroyed for peer 21 [558758.546649] wireguard: wg0: Peer 21 (10.150.150.2:51820) destroyed [558758.563317] wireguard: wg0: Interface destroyed [558758.567803] wireguard: wg0: Keypair 243 destroyed for peer 22 [558758.733287] wireguard: wg0: Peer 22 (10.150.150.1:51820) destroyed [558758.749991] wireguard: wg0: Interface destroyed The fact that I see "Interface destroyed" for both interfaces means that wg_destruct() is being called, which includes these calls: destroy_workqueue(wg->handshake_receive_wq); destroy_workqueue(wg->handshake_send_wq); destroy_workqueue(wg->packet_crypt_wq); In doing so, the output of ps changes from: $ ps aux|grep wg0 root 200479 0.0 0.0 0 0 ? I 13:06 0:00 [kworker/0:2-wg-crypt-wg0] root 201226 0.0 0.0 0 0 ? I 13:08 0:00 [kworker/1:4-wg-crypt-wg0] root 201476 0.0 0.0 0 0 ? I< 13:11 0:00 [wg-crypt-wg0] root 201484 0.0 0.0 0 0 ? I< 13:11 0:00 [wg-crypt-wg0] to: $ ps aux|grep wg0 root 200479 0.0 0.0 0 0 ? I 13:06 0:00 [kworker/0:2-wg-crypt-wg0] root 201226 0.0 0.0 0 0 ? I 13:08 0:00 [kworker/1:4-wg-crypt-wg0] What I suspect is happening is that destroying the workqueue does not actually destroy the kthreads that they were using, so that they can be reused (and eventually relabeled) by other drivers. Looking at the stack of those indicates this is probably the case: $ cat /proc/200479/stack [<0>] worker_thread+0xba/0x3c0 [<0>] kthread+0x114/0x130 [<0>] ret_from_fork+0x1f/0x30 So it's just hanging out there idle waiting to be scheduled by something new. In fact, while I was writing this email, that worker already seems to have been reclaimed by another driver: $ cat /proc/200479/comm kworker/0:2-events Now it's called "events". This is happening because the kthread isn't actually destroyed, and task->comm is being hijacked. In proc_task_name we have: if (p->flags & PF_WQ_WORKER) wq_worker_comm(tcomm, sizeof(tcomm), p); else __get_task_comm(tcomm, sizeof(tcomm), p); That top condition holds for workqueue workers, and wq_worker_comm winds up scnprintf'ing the current worker description in there: /* * ->desc tracks information (wq name or * set_worker_desc()) for the latest execution. If * current, prepend '+', otherwise '-'. */ if (worker->desc[0] != '\0') { if (worker->current_work) scnprintf(buf + off, size - off, "+%s", worker->desc); else scnprintf(buf + off, size - off, "-%s", worker->desc); But worker->desc isn't set until process_one_work is called: /* * Record wq name for cmdline and debug reporting, may get * overridden through set_worker_desc(). */ strscpy(worker->desc, pwq->wq->name, WORKER_DESC_LEN); And it's never unset after the work is done and it's waiting idle in worker_thread for the scheduler to reschedule it and eventually call process_one_work on a new work unit. It would be easy to just null out that string after the work is done with something like: worker->current_func(work); worker->desc[0] = '\0'; But I guess this has never sufficiently bothered anyone before. I suppose I could submit a patch and see how it's received. But it also looks like the scnprintf above in wq_worker_comm distinguishes these cases anyway. If there's a + it means that the work is active and if there's a - it means that it's an old leftover thread. So maybe this is fine as-is? Jason ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: wg-crypt-wg0 process 2020-12-30 12:39 ` Jason A. Donenfeld @ 2020-12-31 5:47 ` Fatih USTA 0 siblings, 0 replies; 4+ messages in thread From: Fatih USTA @ 2020-12-31 5:47 UTC (permalink / raw) To: Jason A. Donenfeld; +Cc: WireGuard mailing list, LKML Hi Jason, Thanks for the detailed research and explanation.That's ok for me. Regards. Fatih USTA On 30.12.2020 15:39, Jason A. Donenfeld wrote: > Hi Fatih, > > Thanks for the report and the detailed test case. From what I can see, > this behavior presents itself both with the explicit ip link del and > without. When running with debugging enabled, I can see this in dmesg: > > [558758.361056] wireguard: wg0: Keypair 244 destroyed for peer 21 > [558758.546649] wireguard: wg0: Peer 21 (10.150.150.2:51820) destroyed > [558758.563317] wireguard: wg0: Interface destroyed > [558758.567803] wireguard: wg0: Keypair 243 destroyed for peer 22 > [558758.733287] wireguard: wg0: Peer 22 (10.150.150.1:51820) destroyed > [558758.749991] wireguard: wg0: Interface destroyed > > The fact that I see "Interface destroyed" for both interfaces means > that wg_destruct() is being called, which includes these calls: > > destroy_workqueue(wg->handshake_receive_wq); > destroy_workqueue(wg->handshake_send_wq); > destroy_workqueue(wg->packet_crypt_wq); > > In doing so, the output of ps changes from: > > $ ps aux|grep wg0 > root 200479 0.0 0.0 0 0 ? I 13:06 0:00 > [kworker/0:2-wg-crypt-wg0] > root 201226 0.0 0.0 0 0 ? I 13:08 0:00 > [kworker/1:4-wg-crypt-wg0] > root 201476 0.0 0.0 0 0 ? I< 13:11 0:00 > [wg-crypt-wg0] > root 201484 0.0 0.0 0 0 ? I< 13:11 0:00 > [wg-crypt-wg0] > > to: > > $ ps aux|grep wg0 > root 200479 0.0 0.0 0 0 ? I 13:06 0:00 > [kworker/0:2-wg-crypt-wg0] > root 201226 0.0 0.0 0 0 ? I 13:08 0:00 > [kworker/1:4-wg-crypt-wg0] > > What I suspect is happening is that destroying the workqueue does not > actually destroy the kthreads that they were using, so that they can > be reused (and eventually relabeled) by other drivers. Looking at the > stack of those indicates this is probably the case: > > $ cat /proc/200479/stack > [<0>] worker_thread+0xba/0x3c0 > [<0>] kthread+0x114/0x130 > [<0>] ret_from_fork+0x1f/0x30 > > So it's just hanging out there idle waiting to be scheduled by > something new. In fact, while I was writing this email, that worker > already seems to have been reclaimed by another driver: > > $ cat /proc/200479/comm > kworker/0:2-events > > Now it's called "events". > > This is happening because the kthread isn't actually destroyed, and > task->comm is being hijacked. In proc_task_name we have: > > if (p->flags & PF_WQ_WORKER) > wq_worker_comm(tcomm, sizeof(tcomm), p); > else > __get_task_comm(tcomm, sizeof(tcomm), p); > > That top condition holds for workqueue workers, and wq_worker_comm > winds up scnprintf'ing the current worker description in there: > > /* > * ->desc tracks information (wq name or > * set_worker_desc()) for the latest execution. If > * current, prepend '+', otherwise '-'. > */ > if (worker->desc[0] != '\0') { > if (worker->current_work) > scnprintf(buf + off, size - off, "+%s", > worker->desc); > else > scnprintf(buf + off, size - off, "-%s", > worker->desc); > > But worker->desc isn't set until process_one_work is called: > > /* > * Record wq name for cmdline and debug reporting, may get > * overridden through set_worker_desc(). > */ > strscpy(worker->desc, pwq->wq->name, WORKER_DESC_LEN); > > And it's never unset after the work is done and it's waiting idle in > worker_thread for the scheduler to reschedule it and eventually call > process_one_work on a new work unit. > > It would be easy to just null out that string after the work is done > with something like: > > worker->current_func(work); > worker->desc[0] = '\0'; > > But I guess this has never sufficiently bothered anyone before. I > suppose I could submit a patch and see how it's received. But it also > looks like the scnprintf above in wq_worker_comm distinguishes these > cases anyway. If there's a + it means that the work is active and if > there's a - it means that it's an old leftover thread. So maybe this > is fine as-is? > > Jason ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-12-31 5:47 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-12-30 8:19 wg-crypt-wg0 process Fatih USTA 2020-12-30 9:29 ` John Sager 2020-12-30 12:39 ` Jason A. Donenfeld 2020-12-31 5:47 ` Fatih USTA
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).