* soft lockup - may be related to wireguard (backported)
@ 2020-05-04  3:55 Wang Jian
  2020-05-04  5:26 ` Jason A. Donenfeld
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread

From: Wang Jian @ 2020-05-04 3:55 UTC (permalink / raw)
To: WireGuard mailing list

The following is the kernel log of the soft lockup.

# uname -a
Linux gw 5.5.0-2-amd64 #1 SMP Debian 5.5.17-1 (2020-04-15) x86_64 GNU/Linux
# cat /sys/module/wireguard/version
1.0.0

This is a Linux box running Debian testing; I think the wireguard module is
backported from 5.6. The box had a CPU usage spike during the soft lockup,
and after that wireguard stopped working: no handshake. "ifdown wg0; ifup wg0"
was not able to recover wireguard; still no handshake.

--

May  4 01:02:17 gw kernel: [27929.504771] rcu: INFO: rcu_sched self-detected stall on CPU
May  4 01:02:17 gw kernel: [27929.504859] rcu: 2-....: (5582 ticks this GP) idle=c02/1/0x4000000000000002 softirq=300509/300511 fqs=2470
May  4 01:02:17 gw kernel: [27929.504937] (t=5250 jiffies g=518541 q=1110)
May  4 01:02:17 gw kernel: [27929.504975] NMI backtrace for cpu 2
May  4 01:02:17 gw kernel: [27929.505010] CPU: 2 PID: 25094 Comm: kworker/2:1 Not tainted 5.5.0-2-amd64 #1 Debian 5.5.17-1
May  4 01:02:17 gw kernel: [27929.505076] Hardware name: /DN2800MT, BIOS MTCDT10N.86A.0169.2013.0923.1101 09/23/2013
May  4 01:02:17 gw kernel: [27929.505152] Workqueue: wg-crypt-wg0 wg_packet_tx_worker [wireguard]
May  4 01:02:17 gw kernel: [27929.505205] Call Trace:
May  4 01:02:17 gw kernel: [27929.505233] <IRQ>
May  4 01:02:17 gw kernel: [27929.505263] dump_stack+0x66/0x90
May  4 01:02:17 gw kernel: [27929.505297] nmi_cpu_backtrace.cold+0x14/0x53
May  4 01:02:17 gw kernel: [27929.505339] ? lapic_can_unplug_cpu.cold+0x3e/0x3e
May  4 01:02:17 gw kernel: [27929.505381] nmi_trigger_cpumask_backtrace+0xf6/0xf8
May  4 01:02:17 gw kernel: [27929.505427] rcu_dump_cpu_stacks+0x8f/0xbd
May  4 01:02:17 gw kernel: [27929.505466] rcu_sched_clock_irq.cold+0x1b2/0x3ab
May  4 01:02:17 gw kernel: [27929.505509] update_process_times+0x24/0x50
May  4 01:02:17 gw kernel: [27929.505549] tick_sched_handle+0x22/0x60
May  4 01:02:17 gw kernel: [27929.505585] tick_sched_timer+0x38/0x80
May  4 01:02:17 gw kernel: [27929.505620] ? tick_sched_do_timer+0x60/0x60
May  4 01:02:17 gw kernel: [27929.505660] __hrtimer_run_queues+0xf6/0x270
May  4 01:02:17 gw kernel: [27929.505700] hrtimer_interrupt+0x10e/0x240
May  4 01:02:17 gw kernel: [27929.505740] smp_apic_timer_interrupt+0x6c/0x130
May  4 01:02:17 gw kernel: [27929.505782] apic_timer_interrupt+0xf/0x20
May  4 01:02:17 gw kernel: [27929.505818] </IRQ>
May  4 01:02:17 gw kernel: [27929.505843] RIP: 0010:queue_work_on+0x27/0x40
May  4 01:02:17 gw kernel: [27929.505883] Code: 00 00 00 0f 1f 44 00 00 53 9c 58 0f 1f 44 00 00 48 89 c3 fa 66 0f 1f 44 00 00 f0 48 0f ba 2a 00 73 12 45 31 c0 48 89 df 57 9d <0f> 1f 44 00 00 44 89 c0 5b c3 e8 ea fb ff ff 41 b8 01 00 00 00 eb
May  4 01:02:17 gw kernel: [27929.506019] RSP: 0018:ffffb071c2eefaa8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
May  4 01:02:17 gw kernel: [27929.506080] RAX: 0000000000000001 RBX: 0000000000000286 RCX: 0000000000000000
May  4 01:02:17 gw kernel: [27929.506135] RDX: 00001966d8343dfd RSI: 0000000000000083 RDI: 0000000000000286
May  4 01:02:17 gw kernel: [27929.506191] RBP: ffff8b61aa1c0000 R08: 0000000000000001 R09: ffff8b61abdad510
May  4 01:02:17 gw kernel: [27929.506246] R10: ffff8b61a467ee98 R11: ffff8b61abdacb78 R12: ffff8b61abc43000
May  4 01:02:17 gw kernel: [27929.506301] R13: ffff8b60fef63500 R14: 0000000000000003 R15: ffff8b61a9623904
May  4 01:02:17 gw kernel: [27929.506367] wg_packet_send_staged_packets+0x320/0x5d0 [wireguard]
May  4 01:02:17 gw kernel: [27929.506426] wg_xmit+0x324/0x490 [wireguard]
May  4 01:02:17 gw kernel: [27929.506469] dev_hard_start_xmit+0x8d/0x1e0
May  4 01:02:17 gw kernel: [27929.506508] __dev_queue_xmit+0x721/0x8e0
May  4 01:02:17 gw kernel: [27929.506549] ip_finish_output2+0x19b/0x590
May  4 01:02:17 gw kernel: [27929.506604] ? nf_confirm+0xcb/0xf0 [nf_conntrack]
May  4 01:02:17 gw kernel: [27929.506648] ip_output+0x76/0xf0
May  4 01:02:17 gw kernel: [27929.506681] ? __ip_finish_output+0x1c0/0x1c0
May  4 01:02:17 gw kernel: [27929.506720] iptunnel_xmit+0x174/0x210
May  4 01:02:17 gw kernel: [27929.506761] send4+0x120/0x390 [wireguard]
May  4 01:02:17 gw kernel: [27929.506806] wg_socket_send_skb_to_peer+0x98/0xb0 [wireguard]
May  4 01:02:17 gw kernel: [27929.506860] wg_packet_tx_worker+0xa9/0x210 [wireguard]
May  4 01:02:17 gw kernel: [27929.506907] process_one_work+0x1b4/0x370
May  4 01:02:17 gw kernel: [27929.506944] worker_thread+0x50/0x3c0
May  4 01:02:17 gw kernel: [27929.506980] kthread+0xf9/0x130
May  4 01:02:17 gw kernel: [27929.510322] ? process_one_work+0x370/0x370
May  4 01:02:17 gw kernel: [27929.514059] ? kthread_park+0x90/0x90
May  4 01:02:17 gw kernel: [27929.517774] ret_from_fork+0x1f/0x40
May  4 01:02:45 gw kernel: [27957.473746] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:25094]
May  4 01:02:45 gw kernel: [27957.476765] Modules linked in: nf_conntrack_netlink sctp wireguard pppoe pppox ppp_generic slhc curve25519_x86_64 libcurve25519 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha cfg80211 rfkill 8021q garp stp mrp llc xt_mark xt_set nft_chain_nat xt_MASQUERADE xt_TCPMSS xt_nat nf_nat nft_counter xt_tcpudp ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack xt_multiport nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink binfmt_misc nls_ascii nls_cp437 vfat intel_powerclamp fat coretemp snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio hid_generic snd_hda_codec_hdmi evdev efi_pstore snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core gma500_gfx usbhid snd_hwdep snd_pcm hid iTCO_wdt snd_timer efivars pcspkr iTCO_vendor_support drm_kms_helper snd sg watchdog soundcore i2c_algo_bit acpi_cpufreq button tcp_bbr sch_cake nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 loop drm efivarfs ip_tables x_tables autofs4
May  4 01:02:45 gw kernel: [27957.476843] ext4 crc16 mbcache jbd2 btrfs xor zstd_decompress zstd_compress raid6_pq crc32c_generic libcrc32c sd_mod ahci libahci uhci_hcd ehci_pci libata ehci_hcd e1000e scsi_mod usbcore ptp i2c_i801 lpc_ich pps_core mfd_core usb_common video [last unloaded: wireguard]
May  4 01:02:45 gw kernel: [27957.501504] CPU: 2 PID: 25094 Comm: kworker/2:1 Not tainted 5.5.0-2-amd64 #1 Debian 5.5.17-1
May  4 01:02:45 gw kernel: [27957.505274] Hardware name: /DN2800MT, BIOS MTCDT10N.86A.0169.2013.0923.1101 09/23/2013
May  4 01:02:45 gw kernel: [27957.509046] Workqueue: wg-crypt-wg0 wg_packet_tx_worker [wireguard]
May  4 01:02:45 gw kernel: [27957.512801] RIP: 0010:__xfrm_decode_session+0xd/0x800
May  4 01:02:45 gw kernel: [27957.516563] Code: 00 00 00 e8 55 be ce ff eb 8c e8 be 46 95 ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 <41> 89 ce 41 55 41 54 49 89 fc 53 48 89 f3 48 83 ec 10 83 fa 02 74
May  4 01:02:45 gw kernel: [27957.520497] RSP: 0018:ffffb071c2eefb28 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
May  4 01:02:45 gw kernel: [27957.524422] RAX: 0000000000000000 RBX: ffff8b618d13ed80 RCX: 0000000000000000
May  4 01:02:45 gw kernel: [27957.528366] RDX: 0000000000000002 RSI: ffffb071c2eefb58 RDI: ffff8b60fe81ce00
May  4 01:02:45 gw kernel: [27957.532285] RBP: ffffb071c2eefb38 R08: 0000000000000000 R09: ffff8b60ff8e5f00
May  4 01:02:45 gw kernel: [27957.536216] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff92ae7f80
May  4 01:02:45 gw kernel: [27957.540097] R13: ffffb071c2eefb58 R14: 0000000000000000 R15: 0000000000000001
May  4 01:02:45 gw kernel: [27957.543929] FS: 0000000000000000(0000) GS:ffff8b61abd00000(0000) knlGS:0000000000000000
May  4 01:02:45 gw kernel: [27957.547734] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May  4 01:02:45 gw kernel: [27957.551506] CR2: 00007fa15c342fb8 CR3: 000000010c80a000 CR4: 00000000000006e0
May  4 01:02:45 gw kernel: [27957.555248] Call Trace:
May  4 01:02:45 gw kernel: [27957.558920] ip_route_me_harder+0x1ce/0x250
May  4 01:02:45 gw kernel: [27957.562614] nf_route_table_hook4+0xee/0x110 [nf_tables]
May  4 01:02:45 gw kernel: [27957.566288] nf_hook_slow+0x40/0xb0
May  4 01:02:45 gw kernel: [27957.569916] __ip_local_out+0xd4/0x140
May  4 01:02:45 gw kernel: [27957.573500] ? ip_forward_options.cold+0x23/0x23
May  4 01:02:45 gw kernel: [27957.577098] ip_local_out+0x18/0x50
May  4 01:02:45 gw kernel: [27957.580669] iptunnel_xmit+0x174/0x210
May  4 01:02:45 gw kernel: [27957.584210] send4+0x120/0x390 [wireguard]
May  4 01:02:45 gw kernel: [27957.587726] wg_socket_send_skb_to_peer+0x98/0xb0 [wireguard]
May  4 01:02:45 gw kernel: [27957.591377] wg_packet_tx_worker+0xa9/0x210 [wireguard]
May  4 01:02:45 gw kernel: [27957.594986] process_one_work+0x1b4/0x370
May  4 01:02:45 gw kernel: [27957.598526] worker_thread+0x50/0x3c0
May  4 01:02:45 gw kernel: [27957.602063] kthread+0xf9/0x130
May  4 01:02:45 gw kernel: [27957.605583] ? process_one_work+0x370/0x370
May  4 01:02:45 gw kernel: [27957.609112] ? kthread_park+0x90/0x90
May  4 01:02:45 gw kernel: [27957.612633] ret_from_fork+0x1f/0x40

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04  3:55 soft lockup - may be related to wireguard (backported) Wang Jian
@ 2020-05-04  5:26 ` Jason A. Donenfeld
  2020-05-04 12:49   ` Wang Jian
  2020-05-04 13:49   ` Alex Xu (Hello71)
  2020-05-04 10:47 ` Serge Belyshev
  ` (2 subsequent siblings)
  3 siblings, 2 replies; 10+ messages in thread

From: Jason A. Donenfeld @ 2020-05-04 5:26 UTC (permalink / raw)
To: Wang Jian; +Cc: WireGuard mailing list

Are you routing wireguard over wireguard, or something interesting like that?

Is ipsec being used?

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04  5:26 ` Jason A. Donenfeld
@ 2020-05-04 12:49   ` Wang Jian
  2020-05-04 13:49   ` Alex Xu (Hello71)
  1 sibling, 0 replies; 10+ messages in thread

From: Wang Jian @ 2020-05-04 12:49 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list

Jason A. Donenfeld <Jason@zx2c4.com> wrote on Mon, May 4, 2020 at 1:26 PM:
>
> Are you routing wireguard over wireguard, or something interesting like that?
>
> Is ipsec being used?

I don't think I have any fancy use cases. But wireguard over pppoe?

Other details:
- nftables' iptables-compatible mode is used, along with ipset
- the pppoe link is the default route, and wg-quick is configured to install
  an additional default route into a newly created routing table (2000)
- ipset matches are used to MARK traffic to specific destinations in the
  mangle table, PREROUTING & OUTPUT, for both v4 and v6
- ip rules are added to match the fwmark and look up the new routing table
  (2000, so traffic goes out via the wireguard interface) for both forwarded
  traffic and output traffic

Here is the list of loaded kmods.
# lsmod
Module                  Size  Used by
pppoe                  20480  2
pppox                  16384  1 pppoe
ppp_generic            49152  6 pppox,pppoe
slhc                   20480  1 ppp_generic
wireguard              94208  0
curve25519_x86_64      49152  1 wireguard
libcurve25519          49152  2 curve25519_x86_64,wireguard
libchacha20poly1305    16384  1 wireguard
chacha_x86_64          28672  1 libchacha20poly1305
poly1305_x86_64        28672  1 libchacha20poly1305
ip6_udp_tunnel         16384  1 wireguard
udp_tunnel             16384  1 wireguard
libblake2s             16384  1 wireguard
blake2s_x86_64         20480  1 libblake2s
libblake2s_generic     20480  1 blake2s_x86_64
libchacha              16384  1 chacha_x86_64
cfg80211              839680  0
rfkill                 28672  1 cfg80211
8021q                  40960  0
garp                   16384  1 8021q
stp                    16384  1 garp
mrp                    20480  1 8021q
llc                    16384  2 stp,garp
xt_mark                16384  4
xt_set                 16384  4
nft_chain_nat          16384  8
xt_MASQUERADE          20480  3
xt_TCPMSS              16384  5
xt_nat                 16384  2
nf_nat                 53248  3 xt_nat,nft_chain_nat,xt_MASQUERADE
nft_counter            16384  25
xt_tcpudp              20480  7
ipt_REJECT             16384  3
nf_reject_ipv4         16384  1 ipt_REJECT
xt_state               16384  0
xt_conntrack           16384  4
xt_multiport           20480  2
nft_compat             20480  34
nf_tables             176128  93 nft_compat,nft_counter,nft_chain_nat
ip_set_hash_ip         45056  2
ip_set_hash_net        53248  3
ip_set                 57344  3 ip_set_hash_ip,xt_set,ip_set_hash_net
nfnetlink              16384  4 nft_compat,nf_tables,ip_set
binfmt_misc            24576  1
nls_ascii              16384  1
intel_powerclamp       20480  0
coretemp               20480  0
snd_hda_codec_realtek 126976  1
nls_cp437              20480  1
vfat                   20480  1
snd_hda_codec_generic  94208  1 snd_hda_codec_realtek
fat                    86016  1 vfat
ledtrig_audio          16384  2 snd_hda_codec_generic,snd_hda_codec_realtek
snd_hda_codec_hdmi     73728  1
snd_hda_intel          53248  0
snd_intel_dspcfg       24576  1 snd_hda_intel
snd_hda_codec         163840  4 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec_realtek
evdev                  28672  3
gma500_gfx            221184  1
efi_pstore             16384  0
snd_hda_core          102400  5 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek
snd_hwdep              16384  1 snd_hda_codec
snd_pcm               131072  4 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_core
drm_kms_helper        233472  1 gma500_gfx
pcspkr                 16384  0
efivars                20480  1 efi_pstore
snd_timer              40960  1 snd_pcm
iTCO_wdt               16384  0
snd                   106496  8 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek,snd_timer,snd_pcm
sg                     36864  0
iTCO_vendor_support    16384  1 iTCO_wdt
soundcore              16384  1 snd
i2c_algo_bit           16384  1 gma500_gfx
watchdog               28672  1 iTCO_wdt
acpi_cpufreq           28672  1
button                 24576  0
tcp_bbr                20480  36
sch_cake               32768  3
nf_conntrack          172032  5 xt_conntrack,nf_nat,xt_state,xt_nat,xt_MASQUERADE
nf_defrag_ipv6         24576  1 nf_conntrack
nf_defrag_ipv4         16384  1 nf_conntrack
loop                   40960  0
drm                   585728  3 drm_kms_helper,gma500_gfx
efivarfs               16384  1
ip_tables              32768  0
x_tables               53248  12 xt_conntrack,nft_compat,xt_multiport,xt_state,xt_tcpudp,xt_nat,xt_set,ipt_REJECT,ip_tables,xt_MASQUERADE,xt_TCPMSS,xt_mark
autofs4                53248  2
ext4                  765952  1
crc16                  16384  1 ext4
mbcache                16384  1 ext4
jbd2                  135168  1 ext4
btrfs                1478656  0
xor                    24576  1 btrfs
zstd_decompress        90112  1 btrfs
zstd_compress         188416  1 btrfs
raid6_pq              122880  1 btrfs
crc32c_generic         16384  2
libcrc32c              16384  3 nf_conntrack,nf_nat,btrfs
sd_mod                 57344  3
ahci                   40960  2
libahci                45056  1 ahci
libata                290816  2 libahci,ahci
e1000e                290816  0
ehci_pci               20480  0
uhci_hcd               53248  0
scsi_mod              253952  3 sd_mod,libata,sg
ehci_hcd               98304  1 ehci_pci
i2c_i801               32768  0
usbcore               315392  3 ehci_pci,ehci_hcd,uhci_hcd
ptp                    20480  1 e1000e
lpc_ich                28672  0
mfd_core               20480  1 lpc_ich
pps_core               24576  1 ptp
usb_common             16384  3 usbcore,ehci_hcd,uhci_hcd
video                  53248  1 gma500_gfx

^ permalink raw reply [flat|nested] 10+ messages in thread
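[Editorial note: a minimal, hypothetical sketch of the policy-routing setup described in the message above. The table number 2000 is from the mail; the ipset name `proxied` and the mark value `0x2000` are placeholders, not taken from the original configuration.]

```shell
# Hypothetical reconstruction of the described setup (v4 side only).
# Set name "proxied" and mark 0x2000 are illustrative placeholders.

# Mark traffic to selected destinations (ipset match) in the mangle table.
iptables -t mangle -A PREROUTING -m set --match-set proxied dst -j MARK --set-mark 0x2000
iptables -t mangle -A OUTPUT     -m set --match-set proxied dst -j MARK --set-mark 0x2000

# Routing table 2000 has its default route via the wireguard interface,
# and an ip rule sends marked traffic through it.
ip route add default dev wg0 table 2000
ip rule  add fwmark 0x2000 lookup 2000
```

The v6 side would mirror this with ip6tables and `ip -6 rule`/`ip -6 route`.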
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04  5:26 ` Jason A. Donenfeld
  2020-05-04 12:49   ` Wang Jian
@ 2020-05-04 13:49   ` Alex Xu (Hello71)
  2020-05-04 16:51     ` Wang Jian
  1 sibling, 1 reply; 10+ messages in thread

From: Alex Xu (Hello71) @ 2020-05-04 13:49 UTC (permalink / raw)
To: Jason A. Donenfeld, Wang Jian, wireguard

Excerpts from Jason A. Donenfeld's message of May 4, 2020 1:26 am:
> Are you routing wireguard over wireguard, or something interesting like that?
>
> Is ipsec being used?
>

This "DN2800MT" looks like an Atom board from 2012; are you trying to run a
very large bandwidth through it? I think it's plausible that buffering could
cause a large enough chunk of work for a slow CPU that the kernel could
erroneously think that the code is stuck. I don't know why that would happen
on the tx thread though, or why it would result in an RCU stall, seeing as
only a single access is made under an RCU lock in this function...

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04 13:49   ` Alex Xu (Hello71)
@ 2020-05-04 16:51     ` Wang Jian
  0 siblings, 0 replies; 10+ messages in thread

From: Wang Jian @ 2020-05-04 16:51 UTC (permalink / raw)
To: Alex Xu (Hello71); +Cc: Jason A. Donenfeld, WireGuard mailing list

Alex Xu (Hello71) <alex_y_xu@yahoo.ca> wrote on Mon, May 4, 2020 at 9:49 PM:
>
> Excerpts from Jason A. Donenfeld's message of May 4, 2020 1:26 am:
> > Are you routing wireguard over wireguard, or something interesting like that?
> >
> > Is ipsec being used?
> >
>
> This "DN2800MT" looks like an Atom board from 2012; are you trying to
> run a very large bandwidth through it? I think it's plausible that
> buffering could cause a large enough chunk of work for a slow CPU that
> the kernel could erroneously think that the code is stuck. I don't know
> why that would happen on the tx thread though, or why it would result in
> an RCU stall, seeing as only a single access is made under an RCU lock
> in this function...

It's an old Atom CPU, but still capable of handling at least 100Mbps (even
with encryption).

I have a prometheus instance running on this box, so I have monitoring data
at a 15s interval. Looking at the monitoring data:

1. the event started at 1:02:00am and ended at 1:04:45am
2. peak time was 1:03:15am, for load, CPU usage, and memory usage
3. available memory dropped from 3.359GB to 2.714GB at 1:03:15am, and bounced
   back gradually from that point

It seems related to wireguard's memory use, either as a cause or a result.

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04  3:55 soft lockup - may be related to wireguard (backported) Wang Jian
  2020-05-04  5:26 ` Jason A. Donenfeld
@ 2020-05-04 10:47 ` Serge Belyshev
  2020-05-04 22:55   ` Jason A. Donenfeld
  2020-05-04 22:28 ` Jason A. Donenfeld
  2020-05-05  0:05 ` Jason A. Donenfeld
  3 siblings, 1 reply; 10+ messages in thread

From: Serge Belyshev @ 2020-05-04 10:47 UTC (permalink / raw)
To: WireGuard mailing list

[-- Attachment #1: Type: text/plain, Size: 3374 bytes --]

Hi! I can reproduce a similar RCU stall with a different kernel, under
specific conditions, on a specific box:

[   54.437636] rcu: INFO: rcu_sched self-detected stall on CPU
[   54.438838] rcu: 0-...!: (2101 ticks this GP) idle=ea6/1/0x4000000000000002 softirq=604/604 fqs=0
[   54.440052] (t=2101 jiffies g=69 q=89)
[   54.441273] rcu: rcu_sched kthread starved for 2101 jiffies! g69 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[   54.442547] rcu: RCU grace-period kthread stack dump:
[   54.443812] rcu_sched I 0 10 2 0x80004000
[   54.443814] Call Trace:
[   54.445087] ? __schedule+0x540/0xa80
[   54.446356] schedule+0x45/0xb0
[   54.447612] schedule_timeout+0x144/0x280
[   54.448859] ? __next_timer_interrupt+0xc0/0xc0
[   54.450099] rcu_gp_kthread+0x3f0/0x840
[   54.451329] kthread+0xe6/0x120
[   54.452557] ? rcu_gp_slow.part.0+0x30/0x30
[   54.453761] ? __kthread_create_on_node+0x150/0x150
[   54.454943] ret_from_fork+0x1f/0x30
[   54.456095] NMI backtrace for cpu 0
[   54.457221] CPU: 0 PID: 2910 Comm: md5sum Not tainted 5.6.0-00001-g6e142c237f00 #1309
[   54.458355] Hardware name: Gigabyte Technology Co., Ltd. GA-MA790FX-DQ6/GA-MA790FX-DQ6, BIOS F7g 07/19/2010
[   54.459484] Call Trace:
[   54.460576] <IRQ>
[   54.461672] dump_stack+0x50/0x70
[   54.462772] nmi_cpu_backtrace.cold+0x14/0x53
[   54.463871] ? lapic_can_unplug_cpu.cold+0x3e/0x3e
[   54.464955] nmi_trigger_cpumask_backtrace+0x7c/0x89
[   54.466026] rcu_dump_cpu_stacks+0x7b/0xa9
[   54.467088] rcu_sched_clock_irq.cold+0x153/0x38a
[   54.468146] update_process_times+0x1f/0x50
[   54.469204] tick_sched_timer+0x33/0x70
[   54.470262] ? tick_sched_do_timer+0x50/0x50
[   54.471321] __hrtimer_run_queues+0xe2/0x180
[   54.472378] hrtimer_interrupt+0x109/0x240
[   54.473423] smp_apic_timer_interrupt+0x48/0x80
[   54.474461] apic_timer_interrupt+0xf/0x20
[   54.475486] </IRQ>
[   54.476495] RIP: 0033:0x556cbd33bf19
[   54.477506] Code: ce 44 8b 4b 10 c1 c9 0f 01 d1 44 89 4c 24 c8 21 ce 31 c6 01 fe 41 8d bc 01 af 0f 7c f5 89 d0 44 8b 4b 3c c1 ce 0a 31 c8 01 ce <21> f0 31 d0 01 f8 41 8d bc 12 2a c6 87 47 89 ca 41 89 ea c1 c0 07
[   54.479694] RSP: 002b:00007ffc30913ce8 EFLAGS: 00000283 ORIG_RAX: ffffffffffffff13
[   54.480813] RAX: 00000000980270bd RBX: 0000556cbe81e4e0 RCX: 00000000c35c3b1a
[   54.481943] RDX: 000000005b5e4ba7 RSI: 00000000ae8ee5ae RDI: 0000000009b5de85
[   54.483075] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   54.484201] R10: 0000000000000000 R11: 00000000b16eb4f8 R12: 0000000000000000
[   54.485317] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000023604445

Details and steps to reproduce:

1. kernel from git://github.com/ckolivas/linux tag: 5.6-muqss-199
2. kernel .config in the attachment
3. boot with "threadirqs"
4. launch 100% cpu load for all threads, e.g.:
   for N in {1..6}; do md5sum /dev/zero & done
5. observe that the box stops responding to pings via the wireguard interface.
6. after some time an RCU stall may be triggered (but not always).
7. further wireguard configuration details in the attachment.

Note that this is a heisenbug: it disappears when more debugging options are
enabled. I cannot trigger it on a mainline kernel or with a different
scheduler configuration, and on a different box (a Skylake laptop) with
exactly the same kernel it is very hard to trigger.
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 25579 bytes --]

[-- Attachment #3: proc_cpuinfo.gz --]
[-- Type: application/gzip, Size: 775 bytes --]

[-- Attachment #4: wg-config-details.gz --]
[-- Type: application/gzip, Size: 1305 bytes --]

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04 10:47 ` Serge Belyshev
@ 2020-05-04 22:55   ` Jason A. Donenfeld
  0 siblings, 0 replies; 10+ messages in thread

From: Jason A. Donenfeld @ 2020-05-04 22:55 UTC (permalink / raw)
To: Serge Belyshev, WireGuard mailing list

On 5/4/20 4:47 AM, Serge Belyshev wrote:
> Hi! I can reproduce similar RCU stall with a different kernel under
> specific conditions on a specific box:
>
> [   54.437636] rcu: INFO: rcu_sched self-detected stall on CPU
> [   54.438838] rcu: 0-...!: (2101 ticks this GP) idle=ea6/1/0x4000000000000002 softirq=604/604 fqs=0
> [   54.440052] (t=2101 jiffies g=69 q=89)
> [   54.441273] rcu: rcu_sched kthread starved for 2101 jiffies! g69 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
> [   54.442547] rcu: RCU grace-period kthread stack dump:
> [   54.443812] rcu_sched I 0 10 2 0x80004000
> [   54.443814] Call Trace:
> [   54.445087] ? __schedule+0x540/0xa80
> [   54.446356] schedule+0x45/0xb0
> [   54.447612] schedule_timeout+0x144/0x280
> [   54.448859] ? __next_timer_interrupt+0xc0/0xc0
> [   54.450099] rcu_gp_kthread+0x3f0/0x840
> [   54.451329] kthread+0xe6/0x120
> [   54.452557] ? rcu_gp_slow.part.0+0x30/0x30
> [   54.453761] ? __kthread_create_on_node+0x150/0x150
> [   54.454943] ret_from_fork+0x1f/0x30
> [   54.456095] NMI backtrace for cpu 0
> [   54.457221] CPU: 0 PID: 2910 Comm: md5sum Not tainted 5.6.0-00001-g6e142c237f00 #1309
> [   54.458355] Hardware name: Gigabyte Technology Co., Ltd. GA-MA790FX-DQ6/GA-MA790FX-DQ6, BIOS F7g 07/19/2010
> [   54.459484] Call Trace:
> [   54.460576] <IRQ>
> [   54.461672] dump_stack+0x50/0x70
> [   54.462772] nmi_cpu_backtrace.cold+0x14/0x53
> [   54.463871] ? lapic_can_unplug_cpu.cold+0x3e/0x3e
> [   54.464955] nmi_trigger_cpumask_backtrace+0x7c/0x89
> [   54.466026] rcu_dump_cpu_stacks+0x7b/0xa9
> [   54.467088] rcu_sched_clock_irq.cold+0x153/0x38a
> [   54.468146] update_process_times+0x1f/0x50
> [   54.469204] tick_sched_timer+0x33/0x70
> [   54.470262] ? tick_sched_do_timer+0x50/0x50
> [   54.471321] __hrtimer_run_queues+0xe2/0x180
> [   54.472378] hrtimer_interrupt+0x109/0x240
> [   54.473423] smp_apic_timer_interrupt+0x48/0x80
> [   54.474461] apic_timer_interrupt+0xf/0x20
> [   54.475486] </IRQ>
> [   54.476495] RIP: 0033:0x556cbd33bf19
> [   54.477506] Code: ce 44 8b 4b 10 c1 c9 0f 01 d1 44 89 4c 24 c8 21 ce 31 c6 01 fe 41 8d bc 01 af 0f 7c f5 89 d0 44 8b 4b 3c c1 ce 0a 31 c8 01 ce <21> f0 31 d0 01 f8 41 8d bc 12 2a c6 87 47 89 ca 41 89 ea c1 c0 07
> [   54.479694] RSP: 002b:00007ffc30913ce8 EFLAGS: 00000283 ORIG_RAX: ffffffffffffff13
> [   54.480813] RAX: 00000000980270bd RBX: 0000556cbe81e4e0 RCX: 00000000c35c3b1a
> [   54.481943] RDX: 000000005b5e4ba7 RSI: 00000000ae8ee5ae RDI: 0000000009b5de85
> [   54.483075] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [   54.484201] R10: 0000000000000000 R11: 00000000b16eb4f8 R12: 0000000000000000
> [   54.485317] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000023604445
>

I don't see anything wireguard-related in this stacktrace. Can you try
sending one that has something wireguard-related in it? Or is more complete?

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04  3:55 soft lockup - may be related to wireguard (backported) Wang Jian
  2020-05-04  5:26 ` Jason A. Donenfeld
  2020-05-04 10:47 ` Serge Belyshev
@ 2020-05-04 22:28 ` Jason A. Donenfeld
  2020-05-05  6:54   ` Wang Jian
  2020-05-05  0:05 ` Jason A. Donenfeld
  3 siblings, 1 reply; 10+ messages in thread

From: Jason A. Donenfeld @ 2020-05-04 22:28 UTC (permalink / raw)
To: Wang Jian, WireGuard mailing list

On 5/4/20 6:49 AM, Wang Jian wrote:
> Jason A. Donenfeld <Jason@zx2c4.com> wrote on Mon, May 4, 2020 at 1:26 PM:
>>
>> Are you routing wireguard over wireguard, or something interesting like that?
>>
>> Is ipsec being used?
>
> I don't think I have any fancy use cases. But wireguard over pppoe?
>
> Other details:
> - nftables' iptables-compatible mode is used, along with ipset
> - the pppoe link is the default route, and wg-quick is configured to install
>   an additional default route into a newly created routing table (2000)
> - ipset matches are used to MARK traffic to specific destinations in the
>   mangle table, PREROUTING & OUTPUT, for both v4 and v6
> - ip rules are added to match the fwmark and look up the new routing table
>   (2000, so traffic goes out via the wireguard interface) for both forwarded
>   traffic and output traffic

Can you send the full networking configuration in enough detail that I'll be
able to reliably reproduce this problem? If I can't reproduce it, it's
unlikely I'll be able to fix it.

Meanwhile, it really really looks from your stacktrace that you have one
wireguard interface going over another wireguard interface:

[27929.506367] wg_packet_send_staged_packets+0x320/0x5d0 [wireguard]
[27929.506426] wg_xmit+0x324/0x490 [wireguard]
[27929.506469] dev_hard_start_xmit+0x8d/0x1e0
[27929.506508] __dev_queue_xmit+0x721/0x8e0
[27929.506549] ip_finish_output2+0x19b/0x590
[27929.506604] ? nf_confirm+0xcb/0xf0 [nf_conntrack]
[27929.506648] ip_output+0x76/0xf0
[27929.506681] ? __ip_finish_output+0x1c0/0x1c0
[27929.506720] iptunnel_xmit+0x174/0x210
[27929.506761] send4+0x120/0x390 [wireguard]
[27929.506806] wg_socket_send_skb_to_peer+0x98/0xb0 [wireguard]
[27929.506860] wg_packet_tx_worker+0xa9/0x210 [wireguard]

Here, a wireguard encrypted udp packet is being sent to an endpoint that is
then being routed to a wireguard interface. What in your network config would
make that possible?

Jason

^ permalink raw reply [flat|nested] 10+ messages in thread
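[Editorial note: for context on the routing loop described above, not from the original mail. wg-quick guards against exactly this failure mode by tagging the interface's own encrypted UDP packets with a firewall mark and excluding them from the policy-routing lookup. A simplified sketch (the value 51820 is illustrative; wg-quick derives the actual mark/table number at runtime):]

```shell
# Sketch of wg-quick's loop-avoidance scheme, simplified.

# Encrypted packets emitted by wg0's own UDP socket carry this fwmark...
wg set wg0 fwmark 51820

# ...and only packets WITHOUT that mark are routed through the table
# whose default route points back into wg0.
ip route add default dev wg0 table 51820
ip rule add not fwmark 51820 table 51820
ip rule add table main suppress_prefixlength 0
```

If a user-added mangle-table MARK rule overwrites that fwmark on the encrypted packets before the routing decision, they can match the tunnel-bound ip rule and be fed back into wg0, producing exactly the wg_xmit-inside-wg_packet_tx_worker stack shown above.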
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04 22:28 ` Jason A. Donenfeld
@ 2020-05-05  6:54   ` Wang Jian
  0 siblings, 0 replies; 10+ messages in thread

From: Wang Jian @ 2020-05-05 6:54 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list

Jason A. Donenfeld <Jason@zx2c4.com> wrote on Tue, May 5, 2020 at 6:28 AM:
> Can you send full networking configuration in enough detail that I'll be
> able to reliably reproduce this problem? If I can't reproduce it, it's
> unlikely I'll be able to fix it.

I will send the full networking configuration to you privately. The
configuration works, and the problem occurred only once.

>
> Meanwhile, it really really looks from your stacktrace that you have one
> wireguard interface going over another wireguard interface:
>
> [27929.506367] wg_packet_send_staged_packets+0x320/0x5d0 [wireguard]
> [27929.506426] wg_xmit+0x324/0x490 [wireguard]
> [27929.506469] dev_hard_start_xmit+0x8d/0x1e0
> [27929.506508] __dev_queue_xmit+0x721/0x8e0
> [27929.506549] ip_finish_output2+0x19b/0x590
> [27929.506604] ? nf_confirm+0xcb/0xf0 [nf_conntrack]
> [27929.506648] ip_output+0x76/0xf0
> [27929.506681] ? __ip_finish_output+0x1c0/0x1c0
> [27929.506720] iptunnel_xmit+0x174/0x210
> [27929.506761] send4+0x120/0x390 [wireguard]
> [27929.506806] wg_socket_send_skb_to_peer+0x98/0xb0 [wireguard]
> [27929.506860] wg_packet_tx_worker+0xa9/0x210 [wireguard]
>
> Here, a wireguard encrypted udp packet is being sent to an endpoint that
> then is being routed to a wireguard interface. What in your network
> config would make that possible?

Maybe some erroneous ip route/rule magic that I haven't noticed.

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: soft lockup - may be related to wireguard (backported)
  2020-05-04  3:55 soft lockup - may be related to wireguard (backported) Wang Jian
  ` (2 preceding siblings ...)
  2020-05-04 22:28 ` Jason A. Donenfeld
@ 2020-05-05  0:05 ` Jason A. Donenfeld
  3 siblings, 0 replies; 10+ messages in thread

From: Jason A. Donenfeld @ 2020-05-05 0:05 UTC (permalink / raw)
To: Wang Jian, WireGuard mailing list; +Cc: sultan

Sultan had a nice stroke of insight. Let's see if this fixes the issue:

https://git.zx2c4.com/wireguard-linux/commit/?h=stable&id=e21300c15cd8365017be856f944efa4db7b1cb95

^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads: [~2020-05-05 6:54 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-04  3:55 soft lockup - may be related to wireguard (backported) Wang Jian
2020-05-04  5:26 ` Jason A. Donenfeld
2020-05-04 12:49   ` Wang Jian
2020-05-04 13:49   ` Alex Xu (Hello71)
2020-05-04 16:51     ` Wang Jian
2020-05-04 10:47 ` Serge Belyshev
2020-05-04 22:55   ` Jason A. Donenfeld
2020-05-04 22:28 ` Jason A. Donenfeld
2020-05-05  6:54   ` Wang Jian
2020-05-05  0:05 ` Jason A. Donenfeld