Development discussion of WireGuard
* kp's and mem corruption?
@ 2021-05-02 21:44 Manojav Sridhar
  2021-05-03 13:07 ` Jason A. Donenfeld
  0 siblings, 1 reply; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-02 21:44 UTC (permalink / raw)
  To: wireguard

Hi Jason,

Great work on the FreeBSD kmod so far!

A couple of issues to report. I am running the wireguard-kmod-0.0.20210428
snapshot on my pfSense router. I am working with the pfSense-pkg-Wireguard
effort to build the WG package. Admittedly, I am mostly testing and
providing some UI code. However, I have come across two errors. The first is
a kernel panic (KP) that happened sometime today.

FreeBSD pfsense 12.2-STABLE FreeBSD 12.2-STABLE
1b709158e581(RELENG_2_5_0) pfSense  amd64

Here is the stack trace from the KP: https://pastebin.com/4bjdzYas

db:0:kdb.enter.default> bt
Tracing pid 0 tid 100402 td 0xfffff800c67b6740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe004d02c4b0
vpanic() at vpanic+0x197/frame 0xfffffe004d02c500
panic() at panic+0x43/frame 0xfffffe004d02c560
trap_fatal() at trap_fatal+0x391/frame 0xfffffe004d02c5c0
trap() at trap+0x67/frame 0xfffffe004d02c6d0
calltrap() at calltrap+0x8/frame 0xfffffe004d02c6d0
--- trap 0x9, rip = 0xffffffff840fd580, rsp = 0xfffffe004d02c7a0, rbp = 0xfffffe004d02c7e0 ---
noise_remote_index_insert() at noise_remote_index_insert+0xb0/frame 0xfffffe004d02c7e0
noise_consume_initiation() at noise_consume_initiation+0x6bb/frame 0xfffffe004d02ca10
wg_softc_handshake_receive() at wg_softc_handshake_receive+0x27a/frame 0xfffffe004d02cb20
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x121/frame 0xfffffe004d02cb80
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xb6/frame 0xfffffe004d02cbb0
fork_exit() at fork_exit+0x7e/frame 0xfffffe004d02cbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004d02cbf0


The second issue is that I am seeing silent memory corruption, where the pfSense
UI stops responding and serves up invalid files. A reboot fixes it. I have
NOT noticed this issue with the 0415 snapshot; it happened with both the
0424 and 0428 snapshots. While I cannot definitively say it's WireGuard-related,
that is the only bit changing on the boxes.

Thanks
Manoj


* Re: kp's and mem corruption?
  2021-05-02 21:44 kp's and mem corruption? Manojav Sridhar
@ 2021-05-03 13:07 ` Jason A. Donenfeld
  2021-05-03 13:33   ` Manojav Sridhar
  2021-05-03 15:08   ` Jason A. Donenfeld
  0 siblings, 2 replies; 15+ messages in thread
From: Jason A. Donenfeld @ 2021-05-03 13:07 UTC (permalink / raw)
  To: Manojav Sridhar; +Cc: WireGuard mailing list

Hi Manojav,

On Mon, May 3, 2021 at 3:05 PM Manojav Sridhar <manojav@manojav.com> wrote:
> --- trap 0x9, rip = 0xffffffff840fd580, rsp = 0xfffffe004d02c7a0, rbp =
> 0xfffffe004d02c7e0 ---
> noise_remote_index_insert() at noise_remote_index_insert+0xb0/frame
> 0xfffffe004d02c7e0
> noise_consume_initiation() at noise_consume_initiation+0x6bb/frame
> 0xfffffe004d02ca10
> wg_softc_handshake_receive() at wg_softc_handshake_receive+0x27a/frame
> 0xfffffe004d02cb20

Do you know how to reproduce this? Do you have the symbol file
anywhere? Otherwise, do you think you could send me (off list) your
if_wg.ko file that produced this stack trace? Then I can put it into
the disassembler.

> Second issue is that I am seeing memory silent corruption where the pfSense
> UI stops responding and serves up invalid files.

Fixed in https://lists.zx2c4.com/pipermail/wireguard/2021-May/006694.html .

Jason


* Re: kp's and mem corruption?
  2021-05-03 13:07 ` Jason A. Donenfeld
@ 2021-05-03 13:33   ` Manojav Sridhar
  2021-05-03 15:08   ` Jason A. Donenfeld
  1 sibling, 0 replies; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-03 13:33 UTC (permalink / raw)
  Cc: WireGuard mailing list

Thanks. I have responded off list on your other request. I will
continue to test on the latest snapshots!

On Mon, May 3, 2021 at 9:07 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi Manojav,
>
> On Mon, May 3, 2021 at 3:05 PM Manojav Sridhar <manojav@manojav.com> wrote:
> > --- trap 0x9, rip = 0xffffffff840fd580, rsp = 0xfffffe004d02c7a0, rbp =
> > 0xfffffe004d02c7e0 ---
> > noise_remote_index_insert() at noise_remote_index_insert+0xb0/frame
> > 0xfffffe004d02c7e0
> > noise_consume_initiation() at noise_consume_initiation+0x6bb/frame
> > 0xfffffe004d02ca10
> > wg_softc_handshake_receive() at wg_softc_handshake_receive+0x27a/frame
> > 0xfffffe004d02cb20
>
> Do you know how to reproduce this? Do you have the symbol file
> anywhere? Otherwise, do you think you could send me (off list) your
> if_wg.ko file that produced this stack trace? Then I can put it into
> the disassembler.
>
> > Second issue is that I am seeing memory silent corruption where the pfSense
> > UI stops responding and serves up invalid files.
>
> Fixed in https://lists.zx2c4.com/pipermail/wireguard/2021-May/006694.html .
>
> Jason


* Re: kp's and mem corruption?
  2021-05-03 13:07 ` Jason A. Donenfeld
  2021-05-03 13:33   ` Manojav Sridhar
@ 2021-05-03 15:08   ` Jason A. Donenfeld
  2021-05-03 15:27     ` Manojav Sridhar
  1 sibling, 1 reply; 15+ messages in thread
From: Jason A. Donenfeld @ 2021-05-03 15:08 UTC (permalink / raw)
  To: Manojav Sridhar; +Cc: WireGuard mailing list, Christian McDonald

Hey again,

Thanks for the .ko you sent me. That was helpful in tracking down the
bug, which Matt and I have now fixed here:

https://git.zx2c4.com/wireguard-freebsd/commit/?id=c69fb61b94341027ea3c539bcf96d9fe03f65fa5

The commit message includes a little bash reproducer that hit the same
crash in my tests, making me somewhat confident we squashed the right
one.

Jason


* Re: kp's and mem corruption?
  2021-05-03 15:08   ` Jason A. Donenfeld
@ 2021-05-03 15:27     ` Manojav Sridhar
  2021-05-03 15:30       ` Jason A. Donenfeld
  0 siblings, 1 reply; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-03 15:27 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list, Christian McDonald

Jason,

Thanks for the follow-up and for the bash script to help reproduce it.
I am guessing that my constant restarting of the tunnels while testing the
pfSense-based UI we are building triggered the scenario your bash
script creates.

I tried the bash script on both a bare-metal box and a VirtualBox pfSense
box. Both ran okay for a few minutes. How long does it take to happen?
I am testing with the 0428 snapshot.


Manoj

On Mon, May 3, 2021 at 11:08 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hey again,
>
> Thanks for the .ko you sent me. That was helpful in tracking down the
> bug, which Matt and I have now fixed here:
>
> https://git.zx2c4.com/wireguard-freebsd/commit/?id=c69fb61b94341027ea3c539bcf96d9fe03f65fa5
>
> The commit message includes a little bash reproducer that hit the same
> crash in my tests, making me somewhat confident we squashed the right
> one.
>
> Jason


* Re: kp's and mem corruption?
  2021-05-03 15:27     ` Manojav Sridhar
@ 2021-05-03 15:30       ` Jason A. Donenfeld
  2021-05-03 15:32         ` Manojav Sridhar
  0 siblings, 1 reply; 15+ messages in thread
From: Jason A. Donenfeld @ 2021-05-03 15:30 UTC (permalink / raw)
  To: Manojav Sridhar; +Cc: WireGuard mailing list, Christian McDonald

On Mon, May 3, 2021 at 5:27 PM Manojav Sridhar <manojav@manojav.com> wrote:
> I tried the bash script on both bare metal box and virtualbox pfsense
> box. both ran for a few minutes okay. How long does it take to happen?
> I am testing with the 0428 snapshot.

Try changing in wg_noise.c:

#define HT_INDEX_SIZE (1 << 13)

to

#define HT_INDEX_SIZE (1 << 3)

And then you'll see it hit pretty quickly.
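
As a rough illustration of why the smaller table should hit sooner: the sketch below is not code from wg_noise.c, and it assumes (the thread does not state the root cause) that the crash is reached when two indices land in the same hash bucket. The names bucket_of, xorshift32, and count_collisions are hypothetical helpers made up for this example. It counts how many pseudo-random 32-bit indices fall into an already-occupied bucket for a given table size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not from wg_noise.c: map a 32-bit index to a bucket.
 * Assumes a power-of-two table size, as (1 << 13) and (1 << 3) both are. */
static size_t bucket_of(uint32_t index, size_t table_size)
{
	assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
	return index & (table_size - 1);
}

/* Small xorshift32 PRNG so the sketch is deterministic and portable. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Count how many of n pseudo-random indices land in an occupied bucket. */
static size_t count_collisions(size_t n, size_t table_size, uint32_t seed)
{
	unsigned char *occupied = calloc(table_size, 1);
	size_t collisions = 0;
	uint32_t state = seed ? seed : 1; /* xorshift state must be nonzero */

	if (occupied == NULL)
		return 0;
	for (size_t i = 0; i < n; i++) {
		size_t b = bucket_of(xorshift32(&state), table_size);

		if (occupied[b])
			collisions++;
		else
			occupied[b] = 1;
	}
	free(occupied);
	return collisions;
}
```

With only 8 buckets (1 << 3), the ninth insert is guaranteed to collide, whereas with 8192 buckets (1 << 13) a handful of inserts will usually not collide at all. Calling count_collisions() with the same n and seed for both sizes shows the difference.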


* Re: kp's and mem corruption?
  2021-05-03 15:30       ` Jason A. Donenfeld
@ 2021-05-03 15:32         ` Manojav Sridhar
  2021-05-03 15:34           ` Manojav Sridhar
  2021-05-03 15:35           ` Jason A. Donenfeld
  0 siblings, 2 replies; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-03 15:32 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list, Christian McDonald

Ah, understood. I am not set up to build the ko for FreeBSD yet, but I can
leave it running on my test box for a bit.

On Mon, May 3, 2021 at 11:31 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Mon, May 3, 2021 at 5:27 PM Manojav Sridhar <manojav@manojav.com> wrote:
> > I tried the bash script on both bare metal box and virtualbox pfsense
> > box. both ran for a few minutes okay. How long does it take to happen?
> > I am testing with the 0428 snapshot.
>
> Try changing in wg_noise.c:
>
> #define HT_INDEX_SIZE (1 << 13)
>
> to
>
> #define HT_INDEX_SIZE (1 << 3)
>
> And then you'll see it hit pretty quickly.


* Re: kp's and mem corruption?
  2021-05-03 15:32         ` Manojav Sridhar
@ 2021-05-03 15:34           ` Manojav Sridhar
  2021-05-03 15:35             ` Jason A. Donenfeld
  2021-05-03 15:35           ` Jason A. Donenfeld
  1 sibling, 1 reply; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-03 15:34 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list, Christian McDonald

Just happened! So yeah, that was it on the trigger. Once Cmac builds
the ko for me, I will test it again!

Again, thanks so much!

On Mon, May 3, 2021 at 11:32 AM Manojav Sridhar <manojav@manojav.com> wrote:
>
> Ah. Understood. I am not set up to build for freebsd yet ko. But I can
> leave it running on my test box for a bit.
>
> On Mon, May 3, 2021 at 11:31 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > On Mon, May 3, 2021 at 5:27 PM Manojav Sridhar <manojav@manojav.com> wrote:
> > > I tried the bash script on both bare metal box and virtualbox pfsense
> > > box. both ran for a few minutes okay. How long does it take to happen?
> > > I am testing with the 0428 snapshot.
> >
> > Try changing in wg_noise.c:
> >
> > #define HT_INDEX_SIZE (1 << 13)
> >
> > to
> >
> > #define HT_INDEX_SIZE (1 << 3)
> >
> > And then you'll see it hit pretty quickly.


* Re: kp's and mem corruption?
  2021-05-03 15:32         ` Manojav Sridhar
  2021-05-03 15:34           ` Manojav Sridhar
@ 2021-05-03 15:35           ` Jason A. Donenfeld
  1 sibling, 0 replies; 15+ messages in thread
From: Jason A. Donenfeld @ 2021-05-03 15:35 UTC (permalink / raw)
  To: Manojav Sridhar; +Cc: WireGuard mailing list, Christian McDonald

On Mon, May 3, 2021 at 5:33 PM Manojav Sridhar <manojav@manojav.com> wrote:
>
> Ah. Understood. I am not set up to build for freebsd yet ko. But I can
> leave it running on my test box for a bit.

Ah, don't worry about it. The trigger was sufficient for my purposes,
but it doesn't need to be reproduced elsewhere necessarily. However,
if you do wind up seeing this same bug again, using the latest master
branch that contains the fix, please let me know, since that'd
indicate I've done something wrong.

Jason


* Re: kp's and mem corruption?
  2021-05-03 15:34           ` Manojav Sridhar
@ 2021-05-03 15:35             ` Jason A. Donenfeld
  2021-05-03 15:36               ` Manojav Sridhar
  0 siblings, 1 reply; 15+ messages in thread
From: Jason A. Donenfeld @ 2021-05-03 15:35 UTC (permalink / raw)
  To: Manojav Sridhar; +Cc: WireGuard mailing list, Christian McDonald

On Mon, May 3, 2021 at 5:35 PM Manojav Sridhar <manojav@manojav.com> wrote:
>
> Just happened! so yeah that was it on the trigger.

With 1 << 13 or 1 << 3?


* Re: kp's and mem corruption?
  2021-05-03 15:35             ` Jason A. Donenfeld
@ 2021-05-03 15:36               ` Manojav Sridhar
  2021-05-03 17:56                 ` Jason A. Donenfeld
  0 siblings, 1 reply; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-03 15:36 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list, Christian McDonald

With the same ko I sent you, so 1 << 13. I was just confirming I could trigger it.

On Mon, May 3, 2021 at 11:35 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Mon, May 3, 2021 at 5:35 PM Manojav Sridhar <manojav@manojav.com> wrote:
> >
> > Just happened! so yeah that was it on the trigger.
>
> With 1 << 13 or 1 << 3?


* Re: kp's and mem corruption?
  2021-05-03 15:36               ` Manojav Sridhar
@ 2021-05-03 17:56                 ` Jason A. Donenfeld
  2021-05-03 19:53                   ` Manojav Sridhar
  0 siblings, 1 reply; 15+ messages in thread
From: Jason A. Donenfeld @ 2021-05-03 17:56 UTC (permalink / raw)
  To: Manojav Sridhar; +Cc: WireGuard mailing list, Christian McDonald

The code in here will repro the bug much faster:

https://git.zx2c4.com/wireguard-freebsd/commit/?id=561f3a8f930cf2e44f493fa04d932ba9a2362cc5


* Re: kp's and mem corruption?
  2021-05-03 17:56                 ` Jason A. Donenfeld
@ 2021-05-03 19:53                   ` Manojav Sridhar
  2021-05-06 12:10                     ` Jason A. Donenfeld
  0 siblings, 1 reply; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-03 19:53 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list, Christian McDonald

Jason,

Thanks for the update. Yes, it still triggers on the current
snapshot, which was built prior to your fix. I will retry once you
release a new snapshot. It seems quite a long shot that this occurred
on my firewall in the first place. Glad it was reported and fixed.
Onward!


Thanks
Manoj



On Mon, May 3, 2021 at 1:56 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> The code in here will repro the bug much faster:
>
> https://git.zx2c4.com/wireguard-freebsd/commit/?id=561f3a8f930cf2e44f493fa04d932ba9a2362cc5


* Re: kp's and mem corruption?
  2021-05-03 19:53                   ` Manojav Sridhar
@ 2021-05-06 12:10                     ` Jason A. Donenfeld
  2021-05-06 15:59                       ` Manojav Sridhar
  0 siblings, 1 reply; 15+ messages in thread
From: Jason A. Donenfeld @ 2021-05-06 12:10 UTC (permalink / raw)
  To: Manojav Sridhar; +Cc: WireGuard mailing list, Christian McDonald

Hi Manojav,

0.0.20210503 is now in ports, which contains these fixes.

Jason


* Re: kp's and mem corruption?
  2021-05-06 12:10                     ` Jason A. Donenfeld
@ 2021-05-06 15:59                       ` Manojav Sridhar
  0 siblings, 0 replies; 15+ messages in thread
From: Manojav Sridhar @ 2021-05-06 15:59 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: WireGuard mailing list, Christian McDonald

Jason,

With some help I was able to get the latest if_wg.ko built. I have
been running the triggering bash script for a while now and have not
managed to trigger the panic. However, prior to installing the latest snapshot,
it happened once more on my firewall (just provided as an FYI), in the
middle of the night, as part of normal usage.

Thanks for jumping on this! Looking solid again!
Manoj

On Thu, May 6, 2021 at 8:10 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi Manojav,
>
> 0.0.20210503 is now in ports, which contains these fixes.
>
> Jason


Thread overview: 15+ messages
2021-05-02 21:44 kp's and mem corruption? Manojav Sridhar
2021-05-03 13:07 ` Jason A. Donenfeld
2021-05-03 13:33   ` Manojav Sridhar
2021-05-03 15:08   ` Jason A. Donenfeld
2021-05-03 15:27     ` Manojav Sridhar
2021-05-03 15:30       ` Jason A. Donenfeld
2021-05-03 15:32         ` Manojav Sridhar
2021-05-03 15:34           ` Manojav Sridhar
2021-05-03 15:35             ` Jason A. Donenfeld
2021-05-03 15:36               ` Manojav Sridhar
2021-05-03 17:56                 ` Jason A. Donenfeld
2021-05-03 19:53                   ` Manojav Sridhar
2021-05-06 12:10                     ` Jason A. Donenfeld
2021-05-06 15:59                       ` Manojav Sridhar
2021-05-03 15:35           ` Jason A. Donenfeld
