Development discussion of WireGuard
* Kernel lockup with (debian) 4.16.0-2-rt-amd64
@ 2018-06-12 20:00 Paul Hedderly
  2018-06-12 21:35 ` Jason A. Donenfeld
  2018-06-12 21:38 ` Paul Hedderly
  0 siblings, 2 replies; 16+ messages in thread
From: Paul Hedderly @ 2018-06-12 20:00 UTC (permalink / raw)
  To: WireGuard mailing list

Loving wireguard but I'm getting failures running the Debian realtime
kernel. I first noticed that the wg link was freezing for 20-30 seconds
at a time, and then the machine would freeze.

For example now, before the inevitable freeze:

http://dpaste.com/1WFGS46

from 3820.516865 seconds in

[ 3820.516865] BUG: scheduling while atomic: kworker/1:2/17295/0x00000002
[ 3820.516865] Modules linked ...
[ 3820.516926] Preemption disabled at:
[ 3820.516932] [<ffffffffbda3366f>] kernel_fpu_begin+0xf/0x20
[ 3820.516934] CPU: 1 PID: 17295 Comm: kworker/1:2 Tainted: G     U     O     4.16.0-2-rt-amd64 #1 Debian 4.16.12-1
[ 3820.516935] Hardware name: Dell Inc. PowerEdge T20/0VD5HY, BIOS A06 01/27/2015
[ 3820.516940] Workqueue: wg-crypt-wg0 packet_encrypt_worker [wireguard]
[ 3820.516940] Call Trace:
[ 3820.516946]  dump_stack+0x5c/0x85
[ 3820.516948]  ? kernel_fpu_begin+0xf/0x20
[ 3820.516950]  __schedule_bug+0x73/0xc0
[ 3820.516953]  __schedule+0x5a1/0x6e0
<etc - see paste>
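
For anyone wanting to pick reports like this out of a long saved log, here
is a minimal Python sketch. It assumes the log was saved with one entry
per line, as dmesg prints it; the function name find_atomic_sched_bugs and
the 10-line lookahead window are illustrative choices, not part of any
existing tool.

```python
import re

# Header of the report, e.g.
# "[ 3820.516865] BUG: scheduling while atomic: kworker/1:2/17295/0x00000002"
BUG_RE = re.compile(r"BUG: scheduling while atomic:\s*(\S+)")
# Symbol line, e.g. "[<ffffffffbda3366f>] kernel_fpu_begin+0xf/0x20"
SYMBOL_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+)")


def find_atomic_sched_bugs(log_text):
    """Return (task, preemption_disabled_at) tuples found in a kernel log.

    preemption_disabled_at is None when no symbol line follows the report.
    """
    reports = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = BUG_RE.search(line)
        if not match:
            continue
        disabled_at = None
        # Hypothetical window: look a few lines ahead for the marker line,
        # then read the symbol from the line right after it.
        for j in range(i + 1, min(i + 10, len(lines))):
            if "Preemption disabled at:" in lines[j]:
                if j + 1 < len(lines):
                    sym = SYMBOL_RE.search(lines[j + 1])
                    if sym:
                        disabled_at = sym.group(1)
                break
        reports.append((match.group(1), disabled_at))
    return reports
```

Feeding it the saved log text returns one tuple per report, so repeated
hits on the same symbol (here kernel_fpu_begin) are easy to count.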

Is there any more info needed? I think I'm going to drop the rt kernel
for now because I've had 4 lockups in 24hrs (since moving to the rt
kernel)

Is this a known problem? I'm guessing that wg hasn't been tested much
with the rt patchset.

With a previous freeze it was preceded by thousands of:

Jun 12 18:11:40 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1
Jun 12 18:11:45 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1
Jun 12 18:11:50 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1
Jun 12 18:11:55 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1
Jun 12 18:12:00 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1
Jun 12 18:12:05 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1
Jun 12 18:12:10 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1
Jun 12 18:12:15 brix /usr/lib/gdm3/gdm-x-session[9135]: ERROR block_reap:328: [bandwidth] bad exit code 1

then:

Jun 12 18:16:01 brix kernel: [16507.893206] CPU: 2 PID: 18331 Comm: kworker/2:2 Tainted: G     U     O     4.16.0-2-rt-amd64 #1 Debian 4.16.12-1
Jun 12 18:16:01 brix kernel: [16507.893206] Hardware name: Dell Inc. PowerEdge T20/0VD5HY, BIOS A06 01/27/2015
Jun 12 18:16:01 brix kernel: [16507.893211] Workqueue: wg-crypt-wg0 packet_encrypt_worker [wireguard]
Jun 12 18:16:01 brix kernel: [16507.893212] Call Trace:
Jun 12 18:16:01 brix kernel: [16507.893218]  dump_stack+0x5c/0x85
Jun 12 18:16:01 brix kernel: [16507.893220]  ? kernel_fpu_begin+0xf/0x20
Jun 12 18:16:01 brix kernel: [16507.893222]  __schedule_bug+0x73/0xc0
Jun 12 18:16:01 brix kernel: [16507.893224]  __schedule+0x5a1/0x6e0

And this was all interspersed with the network going up and down. A log
of the previous failure:

https://pastebin.com/eFPHXaYk

Many thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-12 20:00 Kernel lockup with (debian) 4.16.0-2-rt-amd64 Paul Hedderly
@ 2018-06-12 21:35 ` Jason A. Donenfeld
  2018-06-12 21:42   ` Paul Hedderly
  2018-06-12 21:47   ` Jason A. Donenfeld
  2018-06-12 21:38 ` Paul Hedderly
  1 sibling, 2 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-12 21:35 UTC (permalink / raw)
  To: paul; +Cc: WireGuard mailing list

Hi Paul,

Thanks for the useful bug report. I'll have a fix for it soon.

Thanks, also, for bringing Debian's rt kernel to my attention. Seems
like this might be a good testing platform.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-12 20:00 Kernel lockup with (debian) 4.16.0-2-rt-amd64 Paul Hedderly
  2018-06-12 21:35 ` Jason A. Donenfeld
@ 2018-06-12 21:38 ` Paul Hedderly
  1 sibling, 0 replies; 16+ messages in thread
From: Paul Hedderly @ 2018-06-12 21:38 UTC (permalink / raw)
  To: WireGuard mailing list

On Tue, 2018-06-12 at 21:00 +0100, Paul Hedderly wrote:
> Loving wireguard but I'm getting failures running the Debian realtime
> kernel. I first noticed that the wg link was freezing for 20-30 seconds
> at a time, and then the machine would freeze.


Sorry I meant to add this info:

prh@brix:~$ sudo modinfo wireguard
filename:       /lib/modules/4.16.0-2-rt-amd64/updates/dkms/wireguard.ko
alias:          net-pf-16-proto-16-family-wireguard
alias:          rtnl-link-wireguard
version:        0.0.20180531-1
author:         Jason A. Donenfeld <Jason@zx2c4.com>
description:    Fast, secure, and modern VPN tunnel
license:        GPL v2
srcversion:     6ED5AE02FC2B8D8E9EA3A3D
depends:        udp_tunnel,ip6_udp_tunnel
retpoline:      Y
name:           wireguard
vermagic:       4.16.0-2-rt-amd64 SMP preempt mod_unload modversions
prh@brix:~$ dpkg -l|grep wireg
ii  wireguard                                     0.0.20180531-1                             all          fast, modern, secure kernel VPN tunnel (metapackage)
ii  wireguard-dkms                                0.0.20180531-1                             all          fast, modern, secure kernel VPN tunnel (DKMS version)
ii  wireguard-tools                               0.0.20180531-1                             amd64        fast, modern, secure kernel VPN tunnel (userland utilities)

I think that is the latest release.

I can raise this with the realtime folk too if that would help - I'm
not sure where the problem would lie really.

Thanks



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-12 21:35 ` Jason A. Donenfeld
@ 2018-06-12 21:42   ` Paul Hedderly
  2018-06-12 21:47   ` Jason A. Donenfeld
  1 sibling, 0 replies; 16+ messages in thread
From: Paul Hedderly @ 2018-06-12 21:42 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

On Tue, 2018-06-12 at 23:35 +0200, Jason A. Donenfeld wrote:
> Hi Paul,
> 
> Thanks for the useful bug report. I'll have a fix for it soon.

What a star :)

It's times like this I love being involved in OS and being a Linux
sysadmin. I'm forced to use a Windows VD and O365 for work... and can't
imagine getting the frequent bugs I hit in both fixed so easily. Or
ever. Today I hit a Windows Explorer bug that's been a pain since at
least Windows XP!

> Thanks, also, for bringing Debian's rt kernel to my attention. Seems
> like this might be a good testing platform.

There are times the -rt- kernels are a bit behind; they are not always
built with every experimental/unstable/testing kernel, but for audio
work it is very nice to have them.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-12 21:35 ` Jason A. Donenfeld
  2018-06-12 21:42   ` Paul Hedderly
@ 2018-06-12 21:47   ` Jason A. Donenfeld
  2018-06-13  1:58     ` Jason A. Donenfeld
  1 sibling, 1 reply; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-12 21:47 UTC (permalink / raw)
  To: paul; +Cc: WireGuard mailing list

Hi again,

Would you try out https://א.cc/yFB763cY and tell me if that fixes the
problem for you? I'm not convinced that's a good solution for it, but
it would be useful to know if this at least works.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-12 21:47   ` Jason A. Donenfeld
@ 2018-06-13  1:58     ` Jason A. Donenfeld
  2018-06-13  7:58       ` Paul Hedderly
  0 siblings, 1 reply; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-13  1:58 UTC (permalink / raw)
  To: paul; +Cc: WireGuard mailing list

Hi Paul,

The current patch I'm now considering is here:
https://git.zx2c4.com/WireGuard/patch/?id=17fb4ff6064e10bb91bf2ccf6534bfdf767a9b90

If this works for you, I'll put out a new snapshot.

Regards,
Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13  1:58     ` Jason A. Donenfeld
@ 2018-06-13  7:58       ` Paul Hedderly
  2018-06-13 12:13         ` Jason A. Donenfeld
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Hedderly @ 2018-06-13  7:58 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

On Wed, 2018-06-13 at 03:58 +0200, Jason A. Donenfeld wrote:
> Hi Paul,
> 
> The current patch I'm now considering is here:
> https://git.zx2c4.com/WireGuard/patch/?id=17fb4ff6064e10bb91bf2ccf6534bfdf767a9b90

Ahh!

Last night I recompiled with your first patch and left the machine
running - this morning it was frozen again :(


prh@brix:~$ sudo modinfo /lib/modules/4.16.0-2-rt-amd64/updates/dkms/wireguard.ko
filename:       /lib/modules/4.16.0-2-rt-amd64/updates/dkms/wireguard.ko
alias:          net-pf-16-proto-16-family-wireguard
alias:          rtnl-link-wireguard
version:        0.0.20180531-1
author:         Jason A. Donenfeld <Jason@zx2c4.com>
description:    Fast, secure, and modern VPN tunnel
license:        GPL v2
srcversion:     6BC9480277BB8058D75035C
depends:        udp_tunnel,ip6_udp_tunnel
retpoline:      Y
name:           wireguard
vermagic:       4.16.0-2-rt-amd64 SMP preempt mod_unload modversions
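
Note that the version string is still 0.0.20180531-1, but the srcversion
differs from the earlier modinfo output (6ED5AE02FC2B8D8E9EA3A3D then,
6BC9480277BB8058D75035C now), which suggests the patched source was
rebuilt. A minimal Python sketch for comparing two saved modinfo outputs
follows; the helper names parse_modinfo and module_was_rebuilt are
hypothetical. Bear in mind that modinfo on a .ko path describes the file
on disk; the currently loaded module can usually be checked separately
via /sys/module/wireguard/srcversion.

```python
def parse_modinfo(text):
    """Parse `modinfo` output into a dict mapping each key to a list of values.

    Every value is a list because keys such as "alias" can repeat.
    Lines without a "key: value" shape (e.g. wrapped paths) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def module_was_rebuilt(before, after):
    """Return True if srcversion differs between two modinfo outputs."""
    old = parse_modinfo(before).get("srcversion")
    new = parse_modinfo(after).get("srcversion")
    return old != new
```

A differing srcversion only shows that the source checksum changed, not
which patch went in, so it is a sanity check rather than proof.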

But although the machine froze I don't see the BUGs in the kernel log.

So honestly I'm wondering if that freezing is just a coincidence with
the bug you found since I didn't see a BUG... However I could run the
non-rt kernel for 13 days without a freeze but can't run the -rt- kernel
for more than a few hours without freezing. Do you think the bug
could cause that?

I need to run the non-rt today to get some work done and I'll run the
-rt- with the new patch this evening if that's OK.

The other "weird" errors are there, but I've just rebooted to the
non-rt kernel and I'm still getting those "ERROR block_reap:328:
[bandwidth] bad exit code 1" a lot... but searching specifically on
those points to i3blocks... indeed killing i3blocks does stop them, so
something in my i3blocks config is screwy. Whoops! I was confused
because the logs show them being trapped by gdm3.

Many thanks for your help and brilliant code.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13  7:58       ` Paul Hedderly
@ 2018-06-13 12:13         ` Jason A. Donenfeld
  2018-06-13 13:52           ` Jason A. Donenfeld
  2018-06-13 14:49           ` Paul Hedderly
  0 siblings, 2 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-13 12:13 UTC (permalink / raw)
  To: paul; +Cc: WireGuard mailing list

Hi Paul,

> But although the machine froze I don't see the BUGs in the kernel log.

Alright, so this means I've determined the root cause of the BUGs you
were seeing before, and both the patch you applied and hopefully the
one I committed should take care of that. However, this lockup...

> Last night I recompiled with your first patch and left the machine
> running - this morning it was frozen again :(
> So honestly I'm wondering if that freezing is just a coincidence with
> the bug you found since I didn't see a BUG... However I could run the
> non-rt kernel for 13 days without a freeze but can't run the -rt- kernel
> for more than a few hours without freezing. Do you think the bug
> could cause that?

Does the freeze happen _without_ WireGuard? Or does it only happen
when using WireGuard?

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13 12:13         ` Jason A. Donenfeld
@ 2018-06-13 13:52           ` Jason A. Donenfeld
  2018-06-13 14:54             ` Paul Hedderly
  2018-06-14 19:49             ` Paul Hedderly
  2018-06-13 14:49           ` Paul Hedderly
  1 sibling, 2 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-13 13:52 UTC (permalink / raw)
  To: paul; +Cc: WireGuard mailing list

Hi Paul,

I got an -rt kernel up and running, enabled a bunch of nice debugging
options, and found a handful of problems, all of which were fixed by:
https://git.zx2c4.com/WireGuard/commit/?id=0f05452d043d8d047cf5d7987fc2732b97d676e6

I realize the solution in that patch is a bit of a bummer, but at the
very least it keeps things from breaking now. I'll see if I can
improve it somewhere down the line.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13 12:13         ` Jason A. Donenfeld
  2018-06-13 13:52           ` Jason A. Donenfeld
@ 2018-06-13 14:49           ` Paul Hedderly
  1 sibling, 0 replies; 16+ messages in thread
From: Paul Hedderly @ 2018-06-13 14:49 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

On Wed, 2018-06-13 at 14:13 +0200, Jason A. Donenfeld wrote:
> Hi Paul,
> 
> > But although the machine froze I don't see the BUGs in the kernel log.
> 
> Alright, so this means I've determined the root cause of the BUGs you
> were seeing before, and both the patch you applied and hopefully the
> one I committed should take care of that. However, this lockup...
> 
> > Last night I recompiled with your first patch and left the machine
> > running - this morning it was frozen again :(
> > So honestly I'm wondering if that freezing is just a coincidence with
> > the bug you found since I didn't see a BUG... However I could run the
> > non-rt kernel for 13 days without a freeze but can't run the -rt- kernel
> > for more than a few hours without freezing. Do you think the bug
> > could cause that?
> 
> Does the freeze happen _without_ WireGuard? Or does it only happen
> when using WireGuard?
> 
> Jason

I will test that also (as well as the new patch) - I haven't run
without wireguard for a while... :)

Thanks 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13 13:52           ` Jason A. Donenfeld
@ 2018-06-13 14:54             ` Paul Hedderly
  2018-06-13 15:08               ` Greg KH
  2018-06-13 21:12               ` Jason A. Donenfeld
  2018-06-14 19:49             ` Paul Hedderly
  1 sibling, 2 replies; 16+ messages in thread
From: Paul Hedderly @ 2018-06-13 14:54 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

On Wed, 2018-06-13 at 15:52 +0200, Jason A. Donenfeld wrote:
> Hi Paul,
> 
> I got an -rt kernel up and running, enabled a bunch of nice debugging
> options, and found a handful of problems, all of which were fixed by:
> https://git.zx2c4.com/WireGuard/commit/?id=0f05452d043d8d047cf5d7987fc2732b97d676e6
> 
> I realize the solution in that patch is a bit of a bummer, but at the
> very least it keeps things from breaking now. I'll see if I can
> improve it somewhere down the line.

Ouch. "So on sane kernels" I guess RT isn't sane then!

Yeah, that performance hit is a nuisance, although in reality when I
really need RT I won't be doing much over a VPN... so I may have to put
up with rebooting between kernels for a while. No big deal.

Thanks again - I will test that new patch later.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13 14:54             ` Paul Hedderly
@ 2018-06-13 15:08               ` Greg KH
  2018-06-13 16:07                 ` Paul Hedderly
  2018-06-13 21:12               ` Jason A. Donenfeld
  1 sibling, 1 reply; 16+ messages in thread
From: Greg KH @ 2018-06-13 15:08 UTC (permalink / raw)
  To: Paul Hedderly; +Cc: WireGuard mailing list

On Wed, Jun 13, 2018 at 03:54:24PM +0100, Paul Hedderly wrote:
> On Wed, 2018-06-13 at 15:52 +0200, Jason A. Donenfeld wrote:
> > Hi Paul,
> > 
> > I got an -rt kernel up and running, enabled a bunch of nice debugging
> > options, and found a handful of problems, all of which were fixed by:
> > https://git.zx2c4.com/WireGuard/commit/?id=0f05452d043d8d047cf5d7987fc2732b97d676e6
> > 
> > I realize the solution in that patch is a bit of a bummer, but at the
> > very least it keeps things from breaking now. I'll see if I can
> > improve it somewhere down the line.
> 
> Ouch. "So on sane kernels" I guess RT isn't sane then!
> 
> Yeah, that performance hit is a nuisance, although in reality when I
> really need RT I won't be doing much over a VPN... so I may have to put
> up with rebooting between kernels for a while. No big deal.

Note, the -rt kernels _always_ run slower than a non-rt kernel.  Real
time is not "faster", it is only "deterministic".  And it achieves this
goal at the expense of performance, which is the only way it can be
done.

So I recommend just trying the "normal" kernel instead, odds are it will
work just fine for you, unless you have some _very_ specific max-bound
latency issues.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13 15:08               ` Greg KH
@ 2018-06-13 16:07                 ` Paul Hedderly
  0 siblings, 0 replies; 16+ messages in thread
From: Paul Hedderly @ 2018-06-13 16:07 UTC (permalink / raw)
  To: Greg KH; +Cc: WireGuard mailing list

On Wed, 2018-06-13 at 17:08 +0200, Greg KH wrote:
> 
> Note, the -rt kernels _always_ run slower than a non-rt kernel.  Real
> time is not "faster", it is only "deterministic".  And it achieves this
> goal at the expense of performance, which is the only way it can be
> done.

Thanks for the reminder - I do forget that and get lazy and often use
the rt kernel all the time.
 
> So I recommend just trying the "normal" kernel instead, odds are it will
> work just fine for you, unless you have some _very_ specific max-bound
> latency issues.

Mostly audio/jack, i.e. duplex 32 channels at 48k with a B* X32 mixer. To
be fair I have more issues with the disc keeping up than the kernel...
But the truth is that this use takes a very small part of my time, and
normally happens on a different machine, so really I should use the rt
kernel less!

> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13 14:54             ` Paul Hedderly
  2018-06-13 15:08               ` Greg KH
@ 2018-06-13 21:12               ` Jason A. Donenfeld
  1 sibling, 0 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-13 21:12 UTC (permalink / raw)
  To: paul; +Cc: WireGuard mailing list

On Wed, Jun 13, 2018 at 4:54 PM Paul Hedderly <paul@mjr.org> wrote:
> Yeah, that performance hit is a nuisance

I find that doubtful, for the reasons Greg mentioned.

But in either case, I don't anticipate this will be the final
situation, but rather just a stopgap measure. I think we'll be ready
to revisit this after working on some other related FPU changes. And
for the time being, this at least fixes the crashes.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-13 13:52           ` Jason A. Donenfeld
  2018-06-13 14:54             ` Paul Hedderly
@ 2018-06-14 19:49             ` Paul Hedderly
  2018-06-15 17:04               ` Jason A. Donenfeld
  1 sibling, 1 reply; 16+ messages in thread
From: Paul Hedderly @ 2018-06-14 19:49 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list

On Wed, 2018-06-13 at 15:52 +0200, Jason A. Donenfeld wrote:
> Hi Paul,
> 
> I got an -rt kernel up and running, enabled a bunch of nice debugging
> options, and found a handful of problems, all of which were fixed by:
> https://git.zx2c4.com/WireGuard/commit/?id=0f05452d043d8d047cf5d7987fc2732b97d676e6

Well just some feedback - I ran overnight on the -rt- kernel with no
wireguard... and it was clean and normal in the morning. Worked for a
couple of hours with it and then fired up wireguard compiled from the
above commit - and it's been running 10 hours now with no issues
whatsoever. dmesg nice and empty!

So many thanks - I think you nailed that one!

> I realize the solution in that patch is a bit of a bummer, but at the
> very least it keeps things from breaking now. I'll see if I can
> improve it somewhere down the line.

It's great to get a stable system again - brilliant work.

I love wg - and having it on Android (user-space only because it's a new
unrooted phone :( ) and Ubiquiti and LEDE... I'm sorted!

The GSoC projects are also very exciting prospects...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Kernel lockup with (debian) 4.16.0-2-rt-amd64
  2018-06-14 19:49             ` Paul Hedderly
@ 2018-06-15 17:04               ` Jason A. Donenfeld
  0 siblings, 0 replies; 16+ messages in thread
From: Jason A. Donenfeld @ 2018-06-15 17:04 UTC (permalink / raw)
  To: Paul Hedderly; +Cc: WireGuard mailing list


Happy to hear it's working for you. I went asking for deeper answers, and
this thread might interest you:

https://marc.info/?l=linux-rt-users&m=152906766629953&w=2
https://marc.info/?l=linux-rt-users&m=152907896201090&w=2


^ permalink raw reply	[flat|nested] 16+ messages in thread
