* [PR PATCH] linux5.19: Set CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to zero
@ 2022-09-01 13:47 mmnmnnmnmm
2022-09-02 15:33 ` [PR PATCH] [Closed]: " sgn
From: mmnmnnmnmm @ 2022-09-01 13:47 UTC
To: ml
[-- Attachment #1: Type: text/plain, Size: 4337 bytes --]
There is a new pull request by mmnmnnmnmm against master on the void-packages repository
https://github.com/mmnmnnmnmm/void-packages linux5.19-rcu
https://github.com/void-linux/void-packages/pull/39023
linux5.19: Set CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to zero
Linux 5.19 reports an RCU expedited stall warning during boot.
Diffing the 5.19 and 5.15 configs, I found the new option CONFIG_RCU_EXP_CPU_STALL_TIMEOUT,
which appears to be erroneously set to 20 (milliseconds) because the Void kernel config enables CONFIG_ANDROID ¹.
Setting CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to zero, which should match the behaviour of previous kernels,
fixes the problem.
¹ https://www.kernel.org/doc/html/latest/RCU/stallwarn.html
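As a rough illustration of why zero should restore the old behaviour: per the stallwarn documentation, a value of zero tells the kernel to use CONFIG_RCU_CPU_STALL_TIMEOUT (given in seconds) converted to milliseconds. The helper below is only a sketch of that rule, not kernel code; the function name is hypothetical, and the value 60 is the CONFIG_RCU_CPU_STALL_TIMEOUT setting from the Void dotconfig.

```python
def effective_exp_stall_timeout_ms(exp_timeout_ms, stall_timeout_s):
    """Return the expedited stall timeout in milliseconds.

    Assumption (from the stallwarn documentation): zero means
    "use CONFIG_RCU_CPU_STALL_TIMEOUT, converted from seconds".
    """
    if exp_timeout_ms == 0:
        return stall_timeout_s * 1000
    return exp_timeout_ms


# Values from the Void dotconfig before and after this PR.
before = effective_exp_stall_timeout_ms(exp_timeout_ms=20, stall_timeout_s=60)
after = effective_exp_stall_timeout_ms(exp_timeout_ms=0, stall_timeout_s=60)
print(before, after)  # prints: 20 60000
```

With the value at 20, an expedited grace period only has to exceed 20 ms to trigger a warning, which a slow driver initialisation can easily do.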
This has been fixed in kernel v6.0: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1045a06724f322ed61f1ffb994427c7bdbe64647
The fix and background information come from this thread:
https://lkml.org/lkml/2022/6/28/1051
https://lore.kernel.org/all/1656357116.rhe0mufk6a.none@localhost/
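The config comparison mentioned above could be reproduced along these lines. This is a minimal sketch: the two config strings are short, hypothetical excerpts rather than the real 5.15 and 5.19 dotconfigs, and the helper names are made up for illustration.

```python
def parse_dotconfig(text):
    """Map option names to values for every 'CONFIG_X=value' line."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines such as '# CONFIG_X is not set' are comments and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def added_options(old_text, new_text):
    """Return options that are set in the new config but not in the old one."""
    old = parse_dotconfig(old_text)
    new = parse_dotconfig(new_text)
    return {name: value for name, value in new.items() if name not in old}


# Hypothetical excerpts standing in for the 5.15 and 5.19 dotconfigs.
old_config = """\
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_TRACE is not set
"""
new_config = """\
CONFIG_RCU_CPU_STALL_TIMEOUT=60
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20
# CONFIG_RCU_TRACE is not set
"""
print(added_options(old_config, new_config))
# prints: {'CONFIG_RCU_EXP_CPU_STALL_TIMEOUT': '20'}
```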
dmesg when set to 20:
```
kern.err: [ 2.424048] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-... } 21 jiffies s: 29 root: 0x1/.
kern.info: [ 2.424059] fbcon: Taking over console
kern.err: [ 2.424063] rcu: blocking rcu_node structures (internal RCU debug):
kern.info: [ 2.424066] Task dump for CPU 0:
kern.info: [ 2.424068] task:kworker/0:3 state:R running task stack: 0 pid: 324 ppid: 2 flags:0x00004008
kern.info: [ 2.424073] Workqueue: events work_for_cpu_fn
kern.info: [ 2.424078] Call Trace:
kern.info: [ 2.424080] <TASK>
kern.info: [ 2.424082] ? __slab_free+0xa0/0x2d0
kern.info: [ 2.424087] ? radeon_ttm_tt_create+0x36/0xa0 [radeon]
kern.info: [ 2.424155] ? put_cpu_partial+0x6d/0xb0
kern.info: [ 2.424158] ? ttm_resource_free+0x67/0x80 [ttm]
kern.info: [ 2.424164] ? kmem_cache_alloc_lru+0x1b4/0x3b0
kern.info: [ 2.424167] ? _raw_spin_unlock_irqrestore+0x20/0x40
kern.info: [ 2.424170] ? __wake_up_common_lock+0x8a/0xc0
kern.info: [ 2.424173] ? sysvec_apic_timer_interrupt+0xaf/0xd0
kern.info: [ 2.424177] ? sysvec_apic_timer_interrupt+0xaf/0xd0
kern.info: [ 2.424179] ? asm_sysvec_apic_timer_interrupt+0x16/0x20
kern.info: [ 2.424183] ? delay_tsc+0x4a/0xc0
kern.info: [ 2.424187] ? delay_tsc+0x42/0xc0
kern.info: [ 2.424190] ? rv770_set_uvd_clocks+0x27e/0x350 [radeon]
kern.info: [ 2.424255] ? uvd_v1_0_init+0x37/0x570 [radeon]
kern.info: [ 2.424317] ? rv770_startup+0xfce/0x1740 [radeon]
kern.info: [ 2.424383] ? rv770_init+0x259/0x2c0 [radeon]
kern.info: [ 2.424448] ? radeon_device_init+0x553/0xa10 [radeon]
kern.info: [ 2.424502] ? radeon_driver_load_kms+0xc8/0x260 [radeon]
kern.info: [ 2.424556] ? drm_dev_register+0xcc/0x1c0 [drm]
kern.info: [ 2.424572] ? radeon_pci_probe+0xc4/0x110 [radeon]
kern.info: [ 2.424626] ? local_pci_probe+0x45/0x80
kern.info: [ 2.424628] ? work_for_cpu_fn+0x16/0x20
kern.info: [ 2.424631] ? process_one_work+0x1e5/0x3b0
kern.info: [ 2.424634] ? worker_thread+0x1c4/0x3a0
kern.info: [ 2.424636] ? rescuer_thread+0x390/0x390
kern.info: [ 2.424639] ? kthread+0xe7/0x110
kern.info: [ 2.424641] ? kthread_complete_and_exit+0x20/0x20
kern.info: [ 2.424644] ? ret_from_fork+0x22/0x30
kern.info: [ 2.424647] </TASK>
```
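For anyone checking whether their own boot log shows the same symptom, the first line of the report can be picked apart with a small script. This is a sketch only: the regular expression is written against the single line quoted above and may not match other kernel versions' wording, and the function name is hypothetical.

```python
import re

# Pattern modelled on the one stall line in the log above (an assumption,
# not a documented kernel log format).
STALL_RE = re.compile(
    r"rcu: INFO: (?P<flavor>\S+) detected expedited stalls on CPUs/tasks: "
    r"\{ (?P<stalled>[^}]*)\} (?P<jiffies>\d+) jiffies"
)


def parse_expedited_stall(line):
    """Return the RCU flavor, stalled CPUs/tasks and elapsed jiffies, or None."""
    match = STALL_RE.search(line)
    if match is None:
        return None
    return {
        "flavor": match.group("flavor"),
        "stalled": match.group("stalled").split(),
        "jiffies": int(match.group("jiffies")),
    }


line = (
    "kern.err: [ 2.424048] rcu: INFO: rcu_preempt detected expedited stalls "
    "on CPUs/tasks: { 0-... } 21 jiffies s: 29 root: 0x1/."
)
print(parse_expedited_stall(line))
# prints: {'flavor': 'rcu_preempt', 'stalled': ['0-...'], 'jiffies': 21}
```

The jiffies count is left as reported; converting it to wall-clock time would need the kernel's HZ value, which the log does not state.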
#### Testing the changes
- I tested the changes in this PR: **YES**
#### Local build testing
- I built this PR locally for my native architecture (x86_64-glibc)
(Tested only on x86_64, on the affected hardware.)
[ci skip]
A patch file from https://github.com/void-linux/void-packages/pull/39023.patch is attached
[-- Attachment #2: github-pr-linux5.19-rcu-39023.patch --]
[-- Type: text/x-diff, Size: 2823 bytes --]
From c42f3a5e5b2f4caa03d0c5586d087c404b35ecd7 Mon Sep 17 00:00:00 2001
From: mmnmnnmnmm <mnnnm@disroot.org>
Date: Thu, 1 Sep 2022 14:28:49 +0100
Subject: [PATCH] linux5.19: Set CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to zero
The Void kernel config defines CONFIG_ANDROID, which in 5.19
inadvertently sets CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to 20.
Setting it to zero matches previous and future behaviour.
This was fixed in kernel v6.0: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1045a06724f322ed61f1ffb994427c7bdbe64647
---
srcpkgs/linux5.19/files/arm64-dotconfig | 2 +-
srcpkgs/linux5.19/files/i386-dotconfig | 2 +-
srcpkgs/linux5.19/files/x86_64-dotconfig | 2 +-
srcpkgs/linux5.19/template | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/srcpkgs/linux5.19/files/arm64-dotconfig b/srcpkgs/linux5.19/files/arm64-dotconfig
index 2c6f82ecf62b..7189dede07df 100644
--- a/srcpkgs/linux5.19/files/arm64-dotconfig
+++ b/srcpkgs/linux5.19/files/arm64-dotconfig
@@ -12350,7 +12350,7 @@ CONFIG_TORTURE_TEST=m
CONFIG_RCU_TORTURE_TEST=m
CONFIG_RCU_REF_SCALE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=60
-CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
CONFIG_RCU_TRACE=y
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging
diff --git a/srcpkgs/linux5.19/files/i386-dotconfig b/srcpkgs/linux5.19/files/i386-dotconfig
index 82e1c846ba3a..bf3ce35396c3 100644
--- a/srcpkgs/linux5.19/files/i386-dotconfig
+++ b/srcpkgs/linux5.19/files/i386-dotconfig
@@ -10479,7 +10479,7 @@ CONFIG_TORTURE_TEST=m
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_REF_SCALE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=60
-CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging
diff --git a/srcpkgs/linux5.19/files/x86_64-dotconfig b/srcpkgs/linux5.19/files/x86_64-dotconfig
index 3c6c45056643..749aaf7144ef 100644
--- a/srcpkgs/linux5.19/files/x86_64-dotconfig
+++ b/srcpkgs/linux5.19/files/x86_64-dotconfig
@@ -10745,7 +10745,7 @@ CONFIG_TORTURE_TEST=m
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_REF_SCALE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=60
-CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging
diff --git a/srcpkgs/linux5.19/template b/srcpkgs/linux5.19/template
index 4eb2f4e2a886..42f36b6941f3 100644
--- a/srcpkgs/linux5.19/template
+++ b/srcpkgs/linux5.19/template
@@ -1,7 +1,7 @@
# Template file for 'linux5.19'
pkgname=linux5.19
version=5.19.4
-revision=1
+revision=2
wrksrc="linux-${version%.*}"
short_desc="Linux kernel and modules (${version%.*} series)"
maintainer="Đoàn Trần Công Danh <congdanhqx@gmail.com>"
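A quick way to review a patch like this one is to list every config option whose value it changes, per file. The following is a minimal sketch under stated assumptions: the function name is hypothetical, it only handles simple `-CONFIG_X=...` / `+CONFIG_X=...` line pairs, and the sample input is a trimmed excerpt of the x86_64 hunk (so its hunk line counts are not accurate).

```python
def config_changes(patch_text):
    """Return (file, option, old_value, new_value) for changed CONFIG_ lines."""
    changes = []
    current_file = None
    removed = {}
    for line in patch_text.splitlines():
        if line.startswith("+++ b/"):
            # Start of a new file section in the unified diff.
            current_file = line[len("+++ b/"):]
            removed = {}
        elif line.startswith("-CONFIG_") and "=" in line:
            name, value = line[1:].split("=", 1)
            removed[name] = value
        elif line.startswith("+CONFIG_") and "=" in line:
            name, value = line[1:].split("=", 1)
            # old_value is None when the option was newly added.
            changes.append((current_file, name, removed.get(name), value))
    return changes


# Trimmed excerpt of the x86_64 hunk from the patch above.
sample = """\
--- a/srcpkgs/linux5.19/files/x86_64-dotconfig
+++ b/srcpkgs/linux5.19/files/x86_64-dotconfig
@@ -10745,7 +10745,7 @@ CONFIG_TORTURE_TEST=m
 CONFIG_RCU_CPU_STALL_TIMEOUT=60
-CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
 # CONFIG_RCU_TRACE is not set
"""
for change in config_changes(sample):
    print(change)
```

Run against the full patch, this should report the same 20 to 0 change once for each of the three dotconfig files.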
* Re: [PR PATCH] [Closed]: linux5.19: Set CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to zero
From: sgn @ 2022-09-02 15:33 UTC
To: ml
[-- Attachment #1: Type: text/plain, Size: 4174 bytes --]
There's a closed pull request on the void-packages repository
linux5.19: Set CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to zero
https://github.com/void-linux/void-packages/pull/39023