* Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
@ 2023-07-02 3:31 Bagas Sanjaya
2023-07-02 11:57 ` Bagas Sanjaya
0 siblings, 1 reply; 8+ messages in thread
From: Bagas Sanjaya @ 2023-07-02 3:31 UTC (permalink / raw)
To: Eric DeVolder, Borislav Petkov (AMD),
David R, Boris Ostrovsky, Miguel Luis, Paul E. McKenney,
Joel Fernandes, Boqun Feng, Jason A. Donenfeld, Jay Vosburgh,
Andy Gospodarek, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin
Cc: Linux Kernel Mailing List, Linux Regressions, Linux RCU,
Wireguard Mailing List, Linux Networking, Linux ACPI
Hi,
I notice a regression report on Bugzilla [1]. Quoting from it:
> I've spent the last week on debugging a problem with my attempt to upgrade my kernel from 6.2.8 to 6.3.8 (now also with 6.4.0 too).
>
> The lengthy and detailed bug reports with all aspects of git bisect are at
> https://bugs.gentoo.org/909066
>
> A summary:
> - if I do not configure wg0, the kernel does not hang
> - if I use a kernel older than commit fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c, it does not hang
>
> The commit refers to code that seems unrelated to the problem to my naive eye.
>
> The hardware is a Dell PowerEdge R620 running Gentoo ~amd64.
>
> I have so far excluded:
> - dracut for generating the initramfs is the same version over all kernels
> - linux-firmware has been the same
> - CPU microcode has been the same
>
> It's been a long time since I was seriously involved with software development and I have been even less involved with kernel development.
>
> Gentoo maintainers recommended me to open a bug with upstream, so here I am.
>
> I currently have no idea how to make progress, but I'm willing to try things.
See Bugzilla for the full thread.
Anyway, I'm adding it to regzbot to make sure it doesn't fall through cracks
unnoticed:
#regzbot introduced: fed8d8773b8ea6 https://bugzilla.kernel.org/show_bug.cgi?id=217620
#regzbot title: correcting acpi_is_processor_usable() check causes RCU stalls with wireguard over bonding+igb
#regzbot link: https://bugs.gentoo.org/909066
Thanks.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=217620
--
An old man doll... just what I always wanted! - Clara
^ permalink raw reply [flat|nested] 8+ messages in thread
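The `#regzbot` lines in the message above follow a simple `#regzbot <command>: <argument>` shape. As a minimal sketch, assuming only that shape as it appears in this thread (the function name and the regular expression are illustrative and are not taken from regzbot itself), such lines could be pulled out of a message body like this:

```python
import re

# Assumed line shape, based only on the examples in this thread:
#   #regzbot <command>: <argument>
# Quoted copies (lines starting with ">") are deliberately not matched.
REGZBOT_LINE = re.compile(r"^#regzbot\s+(?P<command>[\w-]+):\s*(?P<argument>.*\S)$")


def extract_regzbot_commands(body: str) -> list[tuple[str, str]]:
    """Return (command, argument) pairs in the order they appear in body."""
    commands = []
    for line in body.splitlines():
        match = REGZBOT_LINE.match(line.strip())
        if match:
            commands.append((match.group("command"), match.group("argument")))
    return commands


# Sample body copied from the report above.
message = """\
Anyway, I'm adding it to regzbot to make sure it doesn't fall through cracks
unnoticed:
#regzbot introduced: fed8d8773b8ea6 https://bugzilla.kernel.org/show_bug.cgi?id=217620
#regzbot title: correcting acpi_is_processor_usable() check causes RCU stalls with wireguard over bonding+igb
#regzbot link: https://bugs.gentoo.org/909066
"""

for command, argument in extract_regzbot_commands(message):
    print(f"{command}: {argument}")
```

Skipping quoted lines is a design choice of this sketch, so that replies which quote the original report do not yield the same commands a second time; whether regzbot itself behaves this way is not stated in the thread.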
* Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
2023-07-02 3:31 Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+ Bagas Sanjaya
@ 2023-07-02 11:57 ` Bagas Sanjaya
2023-07-02 12:37 ` Linux regression tracking (Thorsten Leemhuis)
0 siblings, 1 reply; 8+ messages in thread
From: Bagas Sanjaya @ 2023-07-02 11:57 UTC (permalink / raw)
To: Eric DeVolder, Borislav Petkov (AMD),
David R, Boris Ostrovsky, Miguel Luis, Paul E. McKenney,
Joel Fernandes, Boqun Feng, Jason A. Donenfeld, Jay Vosburgh,
Andy Gospodarek, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
Thorsten Leemhuis
Cc: Linux Kernel Mailing List, Linux Regressions, Linux RCU,
Wireguard Mailing List, Linux Networking, Linux ACPI,
Manuel 'satmd' Leiner
[also Cc: original reporter]
On 7/2/23 10:31, Bagas Sanjaya wrote:
> Hi,
>
> I notice a regression report on Bugzilla [1]. Quoting from it:
>
>> I've spent the last week on debugging a problem with my attempt to upgrade my kernel from 6.2.8 to 6.3.8 (now also with 6.4.0 too).
>>
>> The lengthy and detailed bug reports with all aspects of git bisect are at
>> https://bugs.gentoo.org/909066
>>
>> A summary:
>> - if I do not configure wg0, the kernel does not hang
>> - if I use a kernel older than commit fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c, it does not hang
>>
>> The commit refers to code that seems unrelated to the problem to my naive eye.
>>
>> The hardware is a Dell PowerEdge R620 running Gentoo ~amd64.
>>
>> I have so far excluded:
>> - dracut for generating the initramfs is the same version over all kernels
>> - linux-firmware has been the same
>> - CPU microcode has been the same
>>
>> It's been a long time since I was seriously involved with software development and I have been even less involved with kernel development.
>>
>> Gentoo maintainers recommended me to open a bug with upstream, so here I am.
>>
>> I currently have no idea how to make progress, but I'm willing to try things.
>
> See Bugzilla for the full thread.
>
> Anyway, I'm adding it to regzbot to make sure it doesn't fall through cracks
> unnoticed:
>
> #regzbot introduced: fed8d8773b8ea6 https://bugzilla.kernel.org/show_bug.cgi?id=217620
> #regzbot title: correcting acpi_is_processor_usable() check causes RCU stalls with wireguard over bonding+igb
> #regzbot link: https://bugs.gentoo.org/909066
>
satmd: Can you repeat bisection to confirm that fed8d8773b8ea6 is really the culprit?
Thorsten: It seems like the reporter concluded bisection to the (possibly) incorrect culprit. What can I do in this case besides asking to repeat bisection?
--
An old man doll... just what I always wanted! - Clara
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
2023-07-02 11:57 ` Bagas Sanjaya
@ 2023-07-02 12:37 ` Linux regression tracking (Thorsten Leemhuis)
2023-07-02 13:46 ` Jason A. Donenfeld
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-07-02 12:37 UTC (permalink / raw)
To: Bagas Sanjaya, Eric DeVolder, Borislav Petkov (AMD),
David R, Boris Ostrovsky, Miguel Luis, Paul E. McKenney,
Joel Fernandes, Boqun Feng, Jason A. Donenfeld, Jay Vosburgh,
Andy Gospodarek, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Ingo Molnar, Dave Hansen, x86, H. Peter Anvin
Cc: Linux Kernel Mailing List, Linux Regressions, Linux RCU,
Wireguard Mailing List, Linux Networking, Linux ACPI,
Manuel 'satmd' Leiner
On 02.07.23 13:57, Bagas Sanjaya wrote:
> [also Cc: original reporter]
BTW: I think you CCed too many developers here. There are situations where this can make sense, but it's rare. And if you do this too often people might start to not really look into your mails or might even ignore them completely.
Normally it's enough to write the mail to (1) the people in the signed-off-by-chain, (2) the maintainers of the subsystem that merged a commit, and (3) the lists for all affected subsystems; leave it up to developers from the first two groups to CC the maintainers of the third group.
> On 7/2/23 10:31, Bagas Sanjaya wrote:
>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>
>>> I've spent the last week on debugging a problem with my attempt to upgrade my kernel from 6.2.8 to 6.3.8 (now also with
> [...]
>> See Bugzilla for the full thread.
>>
>> Anyway, I'm adding it to regzbot to make sure it doesn't fall through cracks
>> unnoticed:
>>
>> #regzbot introduced: fed8d8773b8ea6 https://bugzilla.kernel.org/show_bug.cgi?id=217620
>> #regzbot title: correcting acpi_is_processor_usable() check causes RCU stalls with wireguard over bonding+igb
>> #regzbot link: https://bugs.gentoo.org/909066
> satmd: Can you repeat bisection to confirm that fed8d8773b8ea6 is
> really the culprit?
I'd be careful to ask people that, as that might mean a lot of work for them. Best to leave things like that to developers, unless it's pretty obvious that something went sideways.
> Thorsten: It seems like the reporter concluded bisection to the
> (possibly) incorrect culprit.
What makes you think so? I just looked at bugzilla and it (for now) seems reverting fed8d8773b8ea6 on top of 6.4 fixed things for the reporter, which is a pretty strong indicator that this change really causes the trouble somehow.
/me really wonders what he's missing
> What can I do in this case besides
> asking to repeat bisection?
Not much apart from updating regzbot state (e.g. something like "regzbot introduced v6.3..v6.4") and a reply to your initial report (ideally with a quick apology) to let everyone know it was a false alarm.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
2023-07-02 12:37 ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-07-02 13:46 ` Jason A. Donenfeld
2023-07-03 1:29 ` Jason A. Donenfeld
2023-07-03 1:34 ` Bagas Sanjaya
2023-07-02 14:03 ` Sam James
2023-07-02 14:08 ` Bagas Sanjaya
2 siblings, 2 replies; 8+ messages in thread
From: Jason A. Donenfeld @ 2023-07-02 13:46 UTC (permalink / raw)
To: Linux regressions mailing list
Cc: Bagas Sanjaya, Eric DeVolder, Borislav Petkov (AMD),
David R, Boris Ostrovsky, Miguel Luis, Paul E. McKenney,
Joel Fernandes, Boqun Feng, Jay Vosburgh, Andy Gospodarek,
Rafael J. Wysocki, Len Brown, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, H. Peter Anvin, Linux Kernel Mailing List,
Linux RCU, Wireguard Mailing List, Linux Networking, Linux ACPI,
Manuel 'satmd' Leiner
I've got an overdue patch that I still need to submit to netdev, which I suspect might actually fix this.
Can you let me know if
https://git.zx2c4.com/wireguard-linux/patch/?id=54d5e4329efe0d1dba8b4a58720d29493926bed0
solves the problem?
Jason
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
2023-07-02 13:46 ` Jason A. Donenfeld
@ 2023-07-03 1:29 ` Jason A. Donenfeld
2023-07-03 1:34 ` Bagas Sanjaya
1 sibling, 0 replies; 8+ messages in thread
From: Jason A. Donenfeld @ 2023-07-03 1:29 UTC (permalink / raw)
To: Linux regressions mailing list
Cc: Bagas Sanjaya, Eric DeVolder, Borislav Petkov (AMD),
David R, Boris Ostrovsky, Miguel Luis, Paul E. McKenney,
Joel Fernandes, Boqun Feng, Jay Vosburgh, Andy Gospodarek,
Rafael J. Wysocki, Len Brown, Thomas Gleixner, Ingo Molnar,
Dave Hansen, x86, H. Peter Anvin, Linux Kernel Mailing List,
Linux RCU, Wireguard Mailing List, Linux Networking, Linux ACPI,
Manuel 'satmd' Leiner
On Sun, Jul 02, 2023 at 03:46:38PM +0200, Jason A. Donenfeld wrote:
> I've got an overdue patch that I still need to submit to netdev, which
> I suspect might actually fix this.
>
> Can you let me know if
> https://git.zx2c4.com/wireguard-linux/patch/?id=54d5e4329efe0d1dba8b4a58720d29493926bed0
> solves the problem?
satmd, the original reporter, confirmed over on the Gentoo bug report - https://bugs.gentoo.org/909066 - that this patch fixes the issue.
This patch has been sent into netdev and will presumably hit the various trees and stable in due time.
Jason
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
2023-07-02 13:46 ` Jason A. Donenfeld
2023-07-03 1:29 ` Jason A. Donenfeld
@ 2023-07-03 1:34 ` Bagas Sanjaya
1 sibling, 0 replies; 8+ messages in thread
From: Bagas Sanjaya @ 2023-07-03 1:34 UTC (permalink / raw)
To: Jason A. Donenfeld, Linux regressions mailing list
Cc: Eric DeVolder, Borislav Petkov (AMD), David R, Boris Ostrovsky,
Miguel Luis, Paul E. McKenney, Joel Fernandes, Boqun Feng,
Jay Vosburgh, Andy Gospodarek, Rafael J. Wysocki, Len Brown,
Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
Linux Kernel Mailing List, Linux RCU, Wireguard Mailing List,
Linux Networking, Linux ACPI, Manuel 'satmd' Leiner
[-- Attachment #1: Type: text/plain, Size: 576 bytes --]
On Sun, Jul 02, 2023 at 03:46:38PM +0200, Jason A. Donenfeld wrote:
> I've got an overdue patch that I still need to submit to netdev, which
> I suspect might actually fix this.
>
> Can you let me know if
> https://git.zx2c4.com/wireguard-linux/patch/?id=54d5e4329efe0d1dba8b4a58720d29493926bed0
> solves the problem?
The reporter on Bugzilla [1] said it fixed the regression, so telling regzbot:
#regzbot fix: 54d5e4329efe0d
Thanks.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=217620#c6
--
An old man doll... just what I always wanted! - Clara
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
2023-07-02 12:37 ` Linux regression tracking (Thorsten Leemhuis)
2023-07-02 13:46 ` Jason A. Donenfeld
@ 2023-07-02 14:03 ` Sam James
2023-07-02 14:08 ` Bagas Sanjaya
2 siblings, 0 replies; 8+ messages in thread
From: Sam James @ 2023-07-02 14:03 UTC (permalink / raw)
To: regressions
Cc: Jason, andy, bagasdotme, boqun.feng, boris.ovstrosky, bp,
dave.hansen, david, eric.devolder, hpa, j.vosburgh, joel, lenb,
linux-acpi, linux-kernel, manuel.leiner, miguel.luis, mingo,
netdev, paulmck, rafael, rcu, regressions, tglx, wireguard, x86
#regzbot link: https://bugs.gentoo.org/909066
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+
2023-07-02 12:37 ` Linux regression tracking (Thorsten Leemhuis)
2023-07-02 13:46 ` Jason A. Donenfeld
2023-07-02 14:03 ` Sam James
@ 2023-07-02 14:08 ` Bagas Sanjaya
2 siblings, 0 replies; 8+ messages in thread
From: Bagas Sanjaya @ 2023-07-02 14:08 UTC (permalink / raw)
To: Linux regressions mailing list, Eric DeVolder, Borislav Petkov (AMD),
David R, Boris Ostrovsky, Miguel Luis, Paul E. McKenney,
Joel Fernandes, Boqun Feng, Jason A. Donenfeld, Jay Vosburgh,
Andy Gospodarek, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Ingo Molnar, Dave Hansen, x86, H. Peter Anvin
Cc: Linux Kernel Mailing List, Linux RCU, Wireguard Mailing List,
Linux Networking, Linux ACPI, Manuel 'satmd' Leiner
On 7/2/23 19:37, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 02.07.23 13:57, Bagas Sanjaya wrote:
>> [also Cc: original reporter]
>
> BTW: I think you CCed too many developers here. There are situations
> where this can make sense, but it's rare. And if you do this too often
> people might start to not really look into your mails or might even
> ignore them completely.
>
> Normally it's enough to write the mail to (1) the people in the
> signed-off-by-chain, (2) the maintainers of the subsystem that merged a
> commit, and (3) the lists for all affected subsystems; leave it up to
> developers from the first two groups to CC the maintainers of the third
> group.
>
Hi,
In this case I had to also Cc: wireguard, bonding, RCU, and x86 people, since this issue spans these subsystems (I naively thought).
Anyway, thanks for the detailed tip (honestly, /me wonders whether I'll forget this later, as is often the case).
>> On 7/2/23 10:31, Bagas Sanjaya wrote:
>>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>>
>>>> I've spent the last week on debugging a problem with my attempt to upgrade my kernel from 6.2.8 to 6.3.8 (now also with
>> [...]
>>> See Bugzilla for the full thread.
>>>
>>> Anyway, I'm adding it to regzbot to make sure it doesn't fall through cracks
>>> unnoticed:
>>>
>>> #regzbot introduced: fed8d8773b8ea6 https://bugzilla.kernel.org/show_bug.cgi?id=217620
>>> #regzbot title: correcting acpi_is_processor_usable() check causes RCU stalls with wireguard over bonding+igb
>>> #regzbot link: https://bugs.gentoo.org/909066
>
>> satmd: Can you repeat bisection to confirm that fed8d8773b8ea6 is
>> really the culprit?
>
> I'd be careful to ask people that, as that might mean a lot of work for
> them. Best to leave things like that to developers, unless it's pretty
> obvious that something went sideways.
>
OK.
>> Thorsten: It seems like the reporter concluded bisection to the
>> (possibly) incorrect culprit.
>
> What makes you think so? I just looked at bugzilla and it (for now)
> seems reverting fed8d8773b8ea6 on top of 6.4 fixed things for the
> reporter, which is a pretty strong indicator that this change really
> causes the trouble somehow.
>
OK too.
> /me really wonders what he's missing
>
>> What can I do in this case besides
>> asking to repeat bisection?
>
> Not much apart from updating regzbot state (e.g. something like "regzbot
> introduced v6.3..v6.4") and a reply to your initial report (ideally with
> a quick apology) to let everyone know it was a false alarm.
>
OK.
--
An old man doll... just what I always wanted! - Clara
^ permalink raw reply [flat|nested] 8+ messages in thread
Thread overview: 8+ messages
2023-07-02 3:31 Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+ Bagas Sanjaya
  2023-07-02 11:57 ` Bagas Sanjaya
    2023-07-02 12:37 ` Linux regression tracking (Thorsten Leemhuis)
      2023-07-02 13:46 ` Jason A. Donenfeld
        2023-07-03 1:29 ` Jason A. Donenfeld
        2023-07-03 1:34 ` Bagas Sanjaya
      2023-07-02 14:03 ` Sam James
      2023-07-02 14:08 ` Bagas Sanjaya