Development discussion of WireGuard
* "BUG: scheduling while atomic" on 5.4 kernels with PREEMPT_RT
@ 2020-12-19 11:20 Erik Schuitema
  2020-12-19 12:15 ` Jason A. Donenfeld
  0 siblings, 1 reply; 5+ messages in thread
From: Erik Schuitema @ 2020-12-19 11:20 UTC (permalink / raw)
  To: wireguard

Hi,

I ran into an issue with WireGuard on Linux 5.4 kernels with the 
PREEMPT_RT patch, on Ubuntu 18.04. I tried kernels 5.4.47-rt28 and 
5.4.82-rt45.
Everything is fine until I send actual data to the machine through scp, 
at which point the kernel logs "BUG: scheduling while atomic" (full 
log below).
I tried both the latest Ubuntu package (with wireguard-dkms version 
1.0.20201112) and a kernel module compiled from the latest source in 
the wireguard-linux-compat repo, with the same result.

Since the call trace mentions kernel_fpu_begin, I looked at the code and 
the issue seems to occur while using SIMD for packet decryption.

When I forcibly disable SIMD with this simple bypass:

  static inline void simd_get(simd_context_t *ctx)
  {
-    *ctx = !IS_ENABLED(CONFIG_PREEMPT_RT_BASE) && may_use_simd() ? HAVE_FULL_SIMD : HAVE_NO_SIMD;
+    *ctx = HAVE_NO_SIMD;
  }

everything does indeed work fine again (ignoring the performance hit).
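
The effect of the bypass can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct build_config`, `may_use_simd_stub()` and the `*_model` functions are hypothetical stand-ins rather than kernel or WireGuard APIs, and the enum values are placeholders. It also assumes, without confirmation from the source tree, that the RT kernels in question do not set CONFIG_PREEMPT_RT_BASE, which would explain why the original gate still selected SIMD.

```c
#include <stdbool.h>

/* Placeholder values; the real simd_context_t in the compat tree may differ. */
typedef enum { HAVE_NO_SIMD = 0, HAVE_FULL_SIMD = 1 } simd_context_t;

/* Hypothetical stand-in for the kernel build configuration. */
struct build_config {
    bool preempt_rt_base; /* models IS_ENABLED(CONFIG_PREEMPT_RT_BASE) */
};

/* Hypothetical stub: pretend the calling context always permits SIMD. */
static bool may_use_simd_stub(void)
{
    return true;
}

/* Model of the original gate: SIMD is used unless PREEMPT_RT_BASE is set. */
static void simd_get_original_model(const struct build_config *cfg,
                                    simd_context_t *ctx)
{
    *ctx = (!cfg->preempt_rt_base && may_use_simd_stub())
               ? HAVE_FULL_SIMD
               : HAVE_NO_SIMD;
}

/* Model of the bypass: SIMD is never used, whatever the configuration. */
static void simd_get_bypass_model(simd_context_t *ctx)
{
    *ctx = HAVE_NO_SIMD;
}
```

With `preempt_rt_base` left false, the original model still hands out `HAVE_FULL_SIMD`, whereas the bypass model always yields `HAVE_NO_SIMD`.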

I was unable to further pinpoint the issue, unfortunately.
Any idea what might be the cause?

Best regards,
Erik Schuitema

=== kernel log ===

000: BUG: scheduling while atomic: kworker/0:1/15/0x00000002
000: Modules linked in: wireguard(E) ip6_udp_tunnel udp_tunnel 
intel_rapl_msr 8250_dw nls_iso8859_1 intel_rapl_common 
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass 
intel_cstate intel_rapl_perf joydev input_leds wmi_bmof 
intel_wmi_thunderbolt serio_raw snd_hda_codec_hdmi mei_me mei 
snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep 
intel_lpss_pci snd_pcm intel_lpss snd_timer idma64 intel_pch_thermal 
virt_dma snd soundcore mac_hid acpi_pad ip6t_REJECT nf_reject_ipv6 
nf_log_ipv6 sch_fq_codel xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 
nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype 
xt_conntrack ib_iser ip6table_filter rdma_cm ip6_tables iw_cm ib_cm 
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat ib_core 
nf_conntrack_ftp iscsi_tcp libiscsi_tcp nf_conntrack libiscsi 
nf_defrag_ipv6 scsi_transport_iscsi nf_defrag_ipv4 iptable_filter 
ip_tables x_tables autofs4 btrfs zstd_compress algif_skcipher af_alg 
dm_crypt raid10
000:  raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx 
xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_logitech_hidpp 
hid_logitech_dj hid_generic usbhid hid amdgpu i915 gpu_sched ttm 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper 
syscopyarea nvme sysfillrect sysimgblt fb_sys_fops aesni_intel igb 
e1000e crypto_simd dca cryptd glue_helper ptp psmouse pps_core nvme_core 
i2c_algo_bit drm wmi video pinctrl_sunrisepoint
000: Preemption disabled at:
000: [<ffffffff87442203>] kernel_fpu_begin+0x13/0xd0
000: CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G            E     
5.4.47-rt28 #1
000: Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS 
HNKBLi70.86A.0064.2020.1028.1438 10/28/2020
000: Workqueue: wg-crypt-wg0 wg_packet_decrypt_worker [wireguard]
000: Call Trace:
000:  dump_stack+0x6f/0x95
000:  ? kernel_fpu_begin+0x13/0xd0
000:  __schedule_bug+0x78/0xc0
000:  __schedule+0x5f3/0x8b0
000:  ? task_blocks_on_rt_mutex+0x17c/0x350
000:  schedule+0x3d/0xe0
000:  rt_spin_lock_slowlock_locked+0x103/0x2e0
000:  rt_spin_lock_slowlock+0x57/0x90
000:  rt_spin_lock+0x44/0x50
000:  ? wg_packet_decrypt_worker+0xea/0x1c0 [wireguard]
000:  wg_packet_decrypt_worker+0xff/0x1c0 [wireguard]
000:  process_one_work+0x1ee/0x4d0
000:  worker_thread+0x34/0x3f0
000:  kthread+0x121/0x140
000:  ? process_one_work+0x4d0/0x4d0
000:  ? kthread_park+0x90/0x90
000:  ret_from_fork+0x35/0x40
000: ------------[ cut here ]------------
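
Reading the trace: preemption was disabled at kernel_fpu_begin, and the worker then went through rt_spin_lock into schedule(). A minimal userspace sketch of that interaction follows, assuming that on PREEMPT_RT a contended lock sleeps instead of spinning. Every name in it (`model_task`, `model_fpu_begin`, `model_lock`, `model_decrypt_worker`) is hypothetical and none of it is the kernel's actual implementation.

```c
#include <stdbool.h>

/* Hypothetical per-task state for the model. */
struct model_task {
    int preempt_count;      /* non-zero means the task must not sleep */
    int sched_while_atomic; /* number of attempts to sleep while atomic */
};

/* Models an FPU section: preemption stays disabled until the matching end. */
static void model_fpu_begin(struct model_task *t) { t->preempt_count++; }
static void model_fpu_end(struct model_task *t) { t->preempt_count--; }

/*
 * Models taking a lock. In the non-RT model the lock spins and never sleeps.
 * In the RT model a contended lock sleeps, which is a bug if preemption is
 * currently disabled.
 */
static void model_lock(struct model_task *t, bool rt, bool contended)
{
    if (rt && contended && t->preempt_count > 0)
        t->sched_while_atomic++;
}

/* Models a worker that takes a lock, optionally inside an FPU section. */
static int model_decrypt_worker(bool rt, bool use_simd, bool contended)
{
    struct model_task t = { 0, 0 };

    if (use_simd)
        model_fpu_begin(&t);
    model_lock(&t, rt, contended);
    if (use_simd)
        model_fpu_end(&t);

    return t.sched_while_atomic;
}
```

In this model, only the combination of RT, SIMD and a contended lock reports a violation; dropping any one of the three avoids it, which is consistent with the bypass making the problem disappear.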



* Re: "BUG: scheduling while atomic" on 5.4 kernels with PREEMPT_RT
  2020-12-19 11:20 "BUG: scheduling while atomic" on 5.4 kernels with PREEMPT_RT Erik Schuitema
@ 2020-12-19 12:15 ` Jason A. Donenfeld
  2020-12-19 15:32   ` Erik Schuitema
  0 siblings, 1 reply; 5+ messages in thread
From: Jason A. Donenfeld @ 2020-12-19 12:15 UTC (permalink / raw)
  To: erik; +Cc: WireGuard mailing list

Hi Erik,

Thanks for the report. I've fixed this here:
https://git.zx2c4.com/wireguard-linux-compat/commit/?id=8dcc75dbbe0a7b82c7c9a9388a49d1e32723d8a9
This will be part of the next wireguard-linux-compat snapshot release.

Jason


* Re: "BUG: scheduling while atomic" on 5.4 kernels with PREEMPT_RT
  2020-12-19 12:15 ` Jason A. Donenfeld
@ 2020-12-19 15:32   ` Erik Schuitema
  2020-12-19 18:16     ` Jason A. Donenfeld
  0 siblings, 1 reply; 5+ messages in thread
From: Erik Schuitema @ 2020-12-19 15:32 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

Jason A. Donenfeld wrote on 2020-12-19 13:15:

> Hi Erik,
> 
> Thanks for the report. I've fixed this here:
> https://git.zx2c4.com/wireguard-linux-compat/commit/?id=8dcc75dbbe0a7b82c7c9a9388a49d1e32723d8a9
> This will be part of the next wireguard-linux-compat snapshot release.
> 
> Jason

Thanks for the quick fix!
From your patch, I see that SIMD must be completely disabled for 5.4 
PREEMPT_RT kernels.
Is this any different for kernels >=5.6?

Best regards,
Erik



* Re: "BUG: scheduling while atomic" on 5.4 kernels with PREEMPT_RT
  2020-12-19 15:32   ` Erik Schuitema
@ 2020-12-19 18:16     ` Jason A. Donenfeld
  2021-02-08 11:36       ` Erik Schuitema
  0 siblings, 1 reply; 5+ messages in thread
From: Jason A. Donenfeld @ 2020-12-19 18:16 UTC (permalink / raw)
  To: Erik Schuitema; +Cc: WireGuard mailing list

Hi Erik,

So far as I can tell, upstream is fine with this. I'd encourage you to
move to the newer LTS, 5.10. The compat stuff has always been pretty
meh. It was an important step in getting WireGuard bootstrapped, of
course, but just look at this horror:

https://git.zx2c4.com/wireguard-linux-compat/tree/src/compat/compat.h

I'll keep it working as people need, but folks should really really
move to the new LTS, now that it's out.

I've also backported upstream commit-by-commit to 5.4 (and android
4.19), for stable kernels, as used by Oracle, SUSE, Google, and so on:
https://git.zx2c4.com/wireguard-linux/log/?h=backport-5.4.y
This too is much preferable to using the compat stuff.

Jason


* Re: "BUG: scheduling while atomic" on 5.4 kernels with PREEMPT_RT
  2020-12-19 18:16     ` Jason A. Donenfeld
@ 2021-02-08 11:36       ` Erik Schuitema
  0 siblings, 0 replies; 5+ messages in thread
From: Erik Schuitema @ 2021-02-08 11:36 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

Hi Jason,

(Sorry for the delay in my reply.)

On 19/12/2020 19:16, Jason A. Donenfeld wrote:
> So far as I can tell, upstream is fine with this. I'd encourage you to
> move to the newer LTS, 5.10. The compat stuff has always been pretty
> meh. It was an important step in getting WireGuard bootstrapped, of
> course, but just look at this horror:
>
> https://git.zx2c4.com/wireguard-linux-compat/tree/src/compat/compat.h

I don't have doubts about the upstream code; I was merely wondering 
whether the performance hit from disabling SIMD is still present in 
newer kernels (it wasn't immediately obvious to me while browsing the 
5.10 source).

> I'll keep it working as people need, but folks should really really
> move to the new LTS, now that it's out.

These efforts are highly appreciated! It's not trivial for me to switch 
to a new kernel (needs extensive product testing), so I'm happy with the 
5.4 patch. But I'll be sure to skip right to 5.10 when moving to a new 
kernel.

Best regards,
Erik


