Development discussion of WireGuard
 help / color / mirror / Atom feed
* [PATCH 0/5] fix call cpumask_next() if no further cpus set
@ 2023-03-06 16:06 Vernon Yang
  2023-03-06 16:06 ` [PATCH 1/5] random: fix try_to_generate_entropy() " Vernon Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 16:06 UTC (permalink / raw)
  To: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy
  Cc: linux-kernel, wireguard, netdev, linux-scsi, Vernon Yang

Hello,

I updated the Linux kernel to commit fe15c26ee26e ("Linux 6.3-rc1")
and found that when the system boots to systemd ranom initialization,
panic, as follows:

[    3.607299] BUG: unable to handle page fault for address: 000000000001cc43
[    3.607558] #PF: supervisor read access in kernel mode
[    3.607704] #PF: error_code(0x0000) - not-present page
[    3.607704] PGD 0 P4D 0
[    3.607704] Oops: 0000 [#1] PREEMPT SMP NOPTI
[    3.607704] CPU: 1 PID: 1 Comm: systemd Not tainted 6.3.0-rc1 #50
[    3.607704] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[    3.607704] RIP: 0010:_raw_spin_lock+0x12/0x30
[    3.607704] Code: 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 65 ff 05 dd de 1e 7e 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 c3 cc cc cc cc 89 c6 e9 9c 00 00 00 6
[    3.607704] RSP: 0018:ffffc90000013d50 EFLAGS: 00000002
[    3.607704] RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000002
[    3.607704] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 000000000001cc43
[    3.607704] RBP: ffffc90000013dc8 R08: 00000000d6fbd601 R09: 0000000065abc912
[    3.607704] R10: 00000000ba93b167 R11: 000000007bb5d0bf R12: 000000000001cc43
[    3.607704] R13: 000000000001cc43 R14: 0000000000000003 R15: 0000000000000003
[    3.607704] FS:  00007fbd4911b400(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[    3.607704] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.607704] CR2: 000000000001cc43 CR3: 0000000003b42000 CR4: 00000000000006e0
[    3.607704] Call Trace:
[    3.607704]  <TASK>
[    3.607704]  add_timer_on+0x80/0x130
[    3.607704]  try_to_generate_entropy+0x246/0x270
[    3.607704]  ? do_filp_open+0xb1/0x160
[    3.607704]  ? __pfx_entropy_timer+0x10/0x10
[    3.607704]  ? inode_security+0x1d/0x60
[    3.607704]  urandom_read_iter+0x23/0x90
[    3.607704]  vfs_read+0x203/0x2d0
[    3.607704]  ksys_read+0x5e/0xe0
[    3.607704]  do_syscall_64+0x3f/0x90
[    3.607704]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[    3.607704] RIP: 0033:0x7fbd49a25992
[    3.607704] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 4
[    3.607704] RSP: 002b:00007ffea3fe8318 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[    3.607704] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007fbd49a25992
[    3.607704] RDX: 0000000000000010 RSI: 00007ffea3fe83a0 RDI: 000000000000000c
[    3.607704] RBP: 000000000000000c R08: 3983c6a57a866072 R09: c736ebfbeb917d7e
[    3.607704] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[    3.607704] R13: 0000000000000001 R14: 00007ffea3fe83a0 R15: 00005609e5454ea8
[    3.607704]  </TASK>
[    3.607704] Modules linked in:
[    3.607704] CR2: 000000000001cc43
[    3.607704] ---[ end trace 0000000000000000 ]---
[    3.607704] RIP: 0010:_raw_spin_lock+0x12/0x30
[    3.607704] Code: 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 65 ff 05 dd de 1e 7e 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 c3 cc cc cc cc 89 c6 e9 9c 00 00 00 6
[    3.607704] RSP: 0018:ffffc90000013d50 EFLAGS: 00000002
[    3.607704] RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000002
[    3.607704] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 000000000001cc43
[    3.607704] RBP: ffffc90000013dc8 R08: 00000000d6fbd601 R09: 0000000065abc912
[    3.607704] R10: 00000000ba93b167 R11: 000000007bb5d0bf R12: 000000000001cc43
[    3.607704] R13: 000000000001cc43 R14: 0000000000000003 R15: 0000000000000003
[    3.607704] FS:  00007fbd4911b400(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[    3.607704] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.607704] CR2: 000000000001cc43 CR3: 0000000003b42000 CR4: 00000000000006e0
[    3.607704] note: systemd[1] exited with irqs disabled
[    3.618556] note: systemd[1] exited with preempt_count 2
[    3.618991] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    3.619797] Kernel Offset: disabled
[    3.619798] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---


Analysis add_timer_on() found that the parameter cpu is equal to 64, which
feels strange, because qemu only specifies two CPUs, continues to look up,
and finds that the parameter cpu is obtained by
try_to_generate_entropy() -> cpumask_next().

Then use git bisect to find the bug, and find that it was introduced by
commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask optimizations"),
carefully analyzing the cpumask_next() modification record, I found that
nr_cpumask_bits modified to small_cpumask_bits, and when NR_CPUS <= BITS_PER_LONG,
small_cpumask_bits is a macro and before nr_cpumask_bits is a variable-sized.

look for find_next_bit() If no bits are set, returns @size, I seem to
understand the cause of the problem.

I fixed this bug by make `if (cpu == nr_cpumask_bits)` to `if (cpu >= nr_cpumask_bits)`

At the same time I think about this situation, maybe there are the same errors
elsewhere, check it, sure enough, there are, quite a few.

The patch "random:xxx" has been verified, it is valid, the other three fixes
have not been verified, because I do not have an environment, but they
principle are same, so also submitted at the same time, if someone helps to
verify, thanks you very much.

If there is anything error, please tell me, thanks.

Vernon Yang (5):
  random: fix try_to_generate_entropy() if no further cpus set
  wireguard: fix wg_cpumask_choose_online() if no further cpus set
  scsi: lpfc: fix lpfc_cpu_affinity_check() if no further cpus set
  scsi: lpfc: fix lpfc_nvmet_setup_io_context() if no further cpus set
  cpumask: fix comment of cpumask_xxx

 drivers/char/random.c            |  2 +-
 drivers/net/wireguard/queueing.h |  2 +-
 drivers/scsi/lpfc/lpfc_init.c    | 14 +++++-----
 drivers/scsi/lpfc/lpfc_nvmet.c   |  2 +-
 include/linux/cpumask.h          | 46 ++++++++++++++++----------------
 5 files changed, 33 insertions(+), 33 deletions(-)

--
2.34.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/5] random: fix try_to_generate_entropy() if no further cpus set
  2023-03-06 16:06 [PATCH 0/5] fix call cpumask_next() if no further cpus set Vernon Yang
@ 2023-03-06 16:06 ` Vernon Yang
  2023-03-06 16:26   ` Yury Norov
  2023-03-06 16:06 ` [PATCH 2/5] wireguard: fix wg_cpumask_choose_online() " Vernon Yang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 16:06 UTC (permalink / raw)
  To: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy
  Cc: linux-kernel, wireguard, netdev, linux-scsi, Vernon Yang

After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
optimizations"), when NR_CPUS <= BITS_PER_LONG, small_cpumask_bits used
a macro instead of variable-sized for efficient.

If no further cpus set, the cpumask_next() returns small_cpumask_bits,
it must greater than or equal to nr_cpumask_bits, so fix it to correctly.

Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
---
 drivers/char/random.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index ce3ccd172cc8..d76f12a5f74f 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1311,7 +1311,7 @@ static void __cold try_to_generate_entropy(void)
 			/* Basic CPU round-robin, which avoids the current CPU. */
 			do {
 				cpu = cpumask_next(cpu, &timer_cpus);
-				if (cpu == nr_cpumask_bits)
+				if (cpu >= nr_cpumask_bits)
 					cpu = cpumask_first(&timer_cpus);
 			} while (cpu == smp_processor_id() && num_cpus > 1);
 
-- 
2.34.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/5] wireguard: fix wg_cpumask_choose_online() if no further cpus set
  2023-03-06 16:06 [PATCH 0/5] fix call cpumask_next() if no further cpus set Vernon Yang
  2023-03-06 16:06 ` [PATCH 1/5] random: fix try_to_generate_entropy() " Vernon Yang
@ 2023-03-06 16:06 ` Vernon Yang
  2023-03-06 16:06 ` [PATCH 3/5] scsi: lpfc: fix lpfc_cpu_affinity_check() " Vernon Yang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 16:06 UTC (permalink / raw)
  To: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy
  Cc: linux-kernel, wireguard, netdev, linux-scsi, Vernon Yang

After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
optimizations"), when NR_CPUS <= BITS_PER_LONG, small_cpumask_bits used
a macro instead of variable-sized for efficient.

If no further cpus set, the cpumask_next() returns small_cpumask_bits,
it must greater than or equal to nr_cpumask_bits, so fix it to correctly.

Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
---
 drivers/net/wireguard/queueing.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
index 583adb37ee1e..41adeac3ee0b 100644
--- a/drivers/net/wireguard/queueing.h
+++ b/drivers/net/wireguard/queueing.h
@@ -106,7 +106,7 @@ static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
 {
 	unsigned int cpu = *stored_cpu, cpu_index, i;
 
-	if (unlikely(cpu == nr_cpumask_bits ||
+	if (unlikely(cpu >= nr_cpumask_bits ||
 		     !cpumask_test_cpu(cpu, cpu_online_mask))) {
 		cpu_index = id % cpumask_weight(cpu_online_mask);
 		cpu = cpumask_first(cpu_online_mask);
-- 
2.34.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 3/5] scsi: lpfc: fix lpfc_cpu_affinity_check() if no further cpus set
  2023-03-06 16:06 [PATCH 0/5] fix call cpumask_next() if no further cpus set Vernon Yang
  2023-03-06 16:06 ` [PATCH 1/5] random: fix try_to_generate_entropy() " Vernon Yang
  2023-03-06 16:06 ` [PATCH 2/5] wireguard: fix wg_cpumask_choose_online() " Vernon Yang
@ 2023-03-06 16:06 ` Vernon Yang
  2023-03-06 18:48   ` Linus Torvalds
  2023-03-06 16:06 ` [PATCH 4/5] scsi: lpfc: fix lpfc_nvmet_setup_io_context() " Vernon Yang
  2023-03-06 16:06 ` [PATCH 5/5] cpumask: fix comment of cpumask_xxx Vernon Yang
  4 siblings, 1 reply; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 16:06 UTC (permalink / raw)
  To: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy
  Cc: linux-kernel, wireguard, netdev, linux-scsi, Vernon Yang

After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
optimizations"), when NR_CPUS <= BITS_PER_LONG, small_cpumask_bits used
a macro instead of variable-sized for efficient.

If no further cpus set, the cpumask_next() returns small_cpumask_bits,
it must greater than or equal to nr_cpumask_bits, so fix it to correctly.

Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
---
 drivers/scsi/lpfc/lpfc_init.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 61958a24a43d..01c0e2f47cf7 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -12563,7 +12563,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 					goto found_same;
 				new_cpu = cpumask_next(
 					new_cpu, cpu_present_mask);
-				if (new_cpu == nr_cpumask_bits)
+				if (new_cpu >= nr_cpumask_bits)
 					new_cpu = first_cpu;
 			}
 			/* At this point, we leave the CPU as unassigned */
@@ -12577,7 +12577,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 			 * selecting the same IRQ.
 			 */
 			start_cpu = cpumask_next(new_cpu, cpu_present_mask);
-			if (start_cpu == nr_cpumask_bits)
+			if (start_cpu >= nr_cpumask_bits)
 				start_cpu = first_cpu;
 
 			lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
@@ -12613,7 +12613,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 					goto found_any;
 				new_cpu = cpumask_next(
 					new_cpu, cpu_present_mask);
-				if (new_cpu == nr_cpumask_bits)
+				if (new_cpu >= nr_cpumask_bits)
 					new_cpu = first_cpu;
 			}
 			/* We should never leave an entry unassigned */
@@ -12631,7 +12631,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 			 * selecting the same IRQ.
 			 */
 			start_cpu = cpumask_next(new_cpu, cpu_present_mask);
-			if (start_cpu == nr_cpumask_bits)
+			if (start_cpu >= nr_cpumask_bits)
 				start_cpu = first_cpu;
 
 			lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
@@ -12704,7 +12704,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 				goto found_hdwq;
 			}
 			new_cpu = cpumask_next(new_cpu, cpu_present_mask);
-			if (new_cpu == nr_cpumask_bits)
+			if (new_cpu >= nr_cpumask_bits)
 				new_cpu = first_cpu;
 		}
 
@@ -12719,7 +12719,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 				goto found_hdwq;
 
 			new_cpu = cpumask_next(new_cpu, cpu_present_mask);
-			if (new_cpu == nr_cpumask_bits)
+			if (new_cpu >= nr_cpumask_bits)
 				new_cpu = first_cpu;
 		}
 
@@ -12730,7 +12730,7 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
  found_hdwq:
 		/* We found an available entry, copy the IRQ info */
 		start_cpu = cpumask_next(new_cpu, cpu_present_mask);
-		if (start_cpu == nr_cpumask_bits)
+		if (start_cpu >= nr_cpumask_bits)
 			start_cpu = first_cpu;
 		cpup->hdwq = new_cpup->hdwq;
  logit:
-- 
2.34.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 4/5] scsi: lpfc: fix lpfc_nvmet_setup_io_context() if no further cpus set
  2023-03-06 16:06 [PATCH 0/5] fix call cpumask_next() if no further cpus set Vernon Yang
                   ` (2 preceding siblings ...)
  2023-03-06 16:06 ` [PATCH 3/5] scsi: lpfc: fix lpfc_cpu_affinity_check() " Vernon Yang
@ 2023-03-06 16:06 ` Vernon Yang
  2023-03-06 16:06 ` [PATCH 5/5] cpumask: fix comment of cpumask_xxx Vernon Yang
  4 siblings, 0 replies; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 16:06 UTC (permalink / raw)
  To: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy
  Cc: linux-kernel, wireguard, netdev, linux-scsi, Vernon Yang

After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
optimizations"), when NR_CPUS <= BITS_PER_LONG, small_cpumask_bits used
a macro instead of variable-sized for efficient.

If no further cpus set, the cpumask_next() returns small_cpumask_bits,
it must greater than or equal to nr_cpumask_bits, so fix it to correctly.

Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
---
 drivers/scsi/lpfc/lpfc_nvmet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index 7517dd55fe91..3ae7f330f827 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -1621,7 +1621,7 @@ lpfc_nvmet_setup_io_context(struct lpfc_hba *phba)
 			continue;
 		}
 		cpu = cpumask_next(cpu, cpu_present_mask);
-		if (cpu == nr_cpu_ids)
+		if (cpu >= nr_cpu_ids)
 			cpu = cpumask_first(cpu_present_mask);
 
 	}
-- 
2.34.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 16:06 [PATCH 0/5] fix call cpumask_next() if no further cpus set Vernon Yang
                   ` (3 preceding siblings ...)
  2023-03-06 16:06 ` [PATCH 4/5] scsi: lpfc: fix lpfc_nvmet_setup_io_context() " Vernon Yang
@ 2023-03-06 16:06 ` Vernon Yang
  2023-03-06 16:39   ` Yury Norov
  2023-03-06 17:29   ` Linus Torvalds
  4 siblings, 2 replies; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 16:06 UTC (permalink / raw)
  To: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy
  Cc: linux-kernel, wireguard, netdev, linux-scsi, Vernon Yang

After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
optimizations"), the cpumask size is divided into three different case,
so fix comment of cpumask_xxx correctly.

Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
---
 include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 8fbe76607965..248bdb1c50dc 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
  * cpumask_first - get the first cpu in a cpumask
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if no cpus set.
+ * Returns >= small_cpumask_bits if no cpus set.
  */
 static inline unsigned int cpumask_first(const struct cpumask *srcp)
 {
@@ -166,7 +166,7 @@ static inline unsigned int cpumask_first(const struct cpumask *srcp)
  * cpumask_first_zero - get the first unset cpu in a cpumask
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if all cpus are set.
+ * Returns >= small_cpumask_bits if all cpus are set.
  */
 static inline unsigned int cpumask_first_zero(const struct cpumask *srcp)
 {
@@ -178,7 +178,7 @@ static inline unsigned int cpumask_first_zero(const struct cpumask *srcp)
  * @src1p: the first input
  * @src2p: the second input
  *
- * Returns >= nr_cpu_ids if no cpus set in both.  See also cpumask_next_and().
+ * Returns >= small_cpumask_bits if no cpus set in both.  See also cpumask_next_and().
  */
 static inline
 unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask *srcp2)
@@ -190,7 +190,7 @@ unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask
  * cpumask_last - get the last CPU in a cpumask
  * @srcp:	- the cpumask pointer
  *
- * Returns	>= nr_cpumask_bits if no CPUs set.
+ * Returns	>= small_cpumask_bits if no CPUs set.
  */
 static inline unsigned int cpumask_last(const struct cpumask *srcp)
 {
@@ -202,7 +202,7 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
  * @n: the cpu prior to the place to search (ie. return will be > @n)
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if no further cpus set.
+ * Returns >= small_cpumask_bits if no further cpus set.
  */
 static inline
 unsigned int cpumask_next(int n, const struct cpumask *srcp)
@@ -218,7 +218,7 @@ unsigned int cpumask_next(int n, const struct cpumask *srcp)
  * @n: the cpu prior to the place to search (ie. return will be > @n)
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if no further cpus unset.
+ * Returns >= small_cpumask_bits if no further cpus unset.
  */
 static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
 {
@@ -258,7 +258,7 @@ unsigned int cpumask_any_distribute(const struct cpumask *srcp);
  * @src1p: the first cpumask pointer
  * @src2p: the second cpumask pointer
  *
- * Returns >= nr_cpu_ids if no further cpus set in both.
+ * Returns >= small_cpumask_bits if no further cpus set in both.
  */
 static inline
 unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
@@ -276,7 +276,7 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
  * @cpu: the (optionally unsigned) integer iterator
  * @mask: the cpumask pointer
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu(cpu, mask)				\
 	for_each_set_bit(cpu, cpumask_bits(mask), small_cpumask_bits)
@@ -310,7 +310,7 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  *
  * The implementation does not assume any bit in @mask is set (including @start).
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu_wrap(cpu, mask, start)				\
 	for_each_set_bit_wrap(cpu, cpumask_bits(mask), small_cpumask_bits, start)
@@ -327,7 +327,7 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  *	for_each_cpu(cpu, &tmp)
  *		...
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu_and(cpu, mask1, mask2)				\
 	for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
@@ -345,7 +345,7 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  *	for_each_cpu(cpu, &tmp)
  *		...
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu_andnot(cpu, mask1, mask2)				\
 	for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
@@ -375,7 +375,7 @@ unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
  * @srcp: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static inline unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
 {
@@ -388,7 +388,7 @@ static inline unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *s
  * @srcp2: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static inline
 unsigned int cpumask_nth_and(unsigned int cpu, const struct cpumask *srcp1,
@@ -404,7 +404,7 @@ unsigned int cpumask_nth_and(unsigned int cpu, const struct cpumask *srcp1,
  * @srcp2: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static inline
 unsigned int cpumask_nth_andnot(unsigned int cpu, const struct cpumask *srcp1,
@@ -421,7 +421,7 @@ unsigned int cpumask_nth_andnot(unsigned int cpu, const struct cpumask *srcp1,
  * @srcp3: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static __always_inline
 unsigned int cpumask_nth_and_andnot(unsigned int cpu, const struct cpumask *srcp1,
@@ -529,7 +529,7 @@ static inline void cpumask_setall(struct cpumask *dstp)
 }
 
 /**
- * cpumask_clear - clear all cpus (< nr_cpu_ids) in a cpumask
+ * cpumask_clear - clear all cpus (< large_cpumask_bits) in a cpumask
  * @dstp: the cpumask pointer
  */
 static inline void cpumask_clear(struct cpumask *dstp)
@@ -650,7 +650,7 @@ static inline bool cpumask_subset(const struct cpumask *src1p,
 
 /**
  * cpumask_empty - *srcp == 0
- * @srcp: the cpumask to that all cpus < nr_cpu_ids are clear.
+ * @srcp: the cpumask to that all cpus < small_cpumask_bits are clear.
  */
 static inline bool cpumask_empty(const struct cpumask *srcp)
 {
@@ -659,7 +659,7 @@ static inline bool cpumask_empty(const struct cpumask *srcp)
 
 /**
  * cpumask_full - *srcp == 0xFFFFFFFF...
- * @srcp: the cpumask to that all cpus < nr_cpu_ids are set.
+ * @srcp: the cpumask to that all cpus < nr_cpumask_bits are set.
  */
 static inline bool cpumask_full(const struct cpumask *srcp)
 {
@@ -668,7 +668,7 @@ static inline bool cpumask_full(const struct cpumask *srcp)
 
 /**
  * cpumask_weight - Count of bits in *srcp
- * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @srcp: the cpumask to count bits (< small_cpumask_bits) in.
  */
 static inline unsigned int cpumask_weight(const struct cpumask *srcp)
 {
@@ -677,8 +677,8 @@ static inline unsigned int cpumask_weight(const struct cpumask *srcp)
 
 /**
  * cpumask_weight_and - Count of bits in (*srcp1 & *srcp2)
- * @srcp1: the cpumask to count bits (< nr_cpu_ids) in.
- * @srcp2: the cpumask to count bits (< nr_cpu_ids) in.
+ * @srcp1: the cpumask to count bits (< small_cpumask_bits) in.
+ * @srcp2: the cpumask to count bits (< small_cpumask_bits) in.
  */
 static inline unsigned int cpumask_weight_and(const struct cpumask *srcp1,
 						const struct cpumask *srcp2)
@@ -727,7 +727,7 @@ static inline void cpumask_copy(struct cpumask *dstp,
  * cpumask_any - pick a "random" cpu from *srcp
  * @srcp: the input cpumask
  *
- * Returns >= nr_cpu_ids if no cpus set.
+ * Returns >= small_cpumask_bits if no cpus set.
  */
 #define cpumask_any(srcp) cpumask_first(srcp)
 
@@ -736,7 +736,7 @@ static inline void cpumask_copy(struct cpumask *dstp,
  * @mask1: the first input cpumask
  * @mask2: the second input cpumask
  *
- * Returns >= nr_cpu_ids if no cpus set.
+ * Returns >= small_cpumask_bits if no cpus set.
  */
 #define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))
 
-- 
2.34.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/5] random: fix try_to_generate_entropy() if no further cpus set
  2023-03-06 16:06 ` [PATCH 1/5] random: fix try_to_generate_entropy() " Vernon Yang
@ 2023-03-06 16:26   ` Yury Norov
  0 siblings, 0 replies; 19+ messages in thread
From: Yury Norov @ 2023-03-06 16:26 UTC (permalink / raw)
  To: Vernon Yang
  Cc: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, andriy.shevchenko, linux, james.smart,
	dick.kennedy, linux-kernel, wireguard, netdev, linux-scsi

On Tue, Mar 07, 2023 at 12:06:47AM +0800, Vernon Yang wrote:
> After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> optimizations"), when NR_CPUS <= BITS_PER_LONG, small_cpumask_bits used
> a macro instead of variable-sized for efficient.
> 
> If no further cpus set, the cpumask_next() returns small_cpumask_bits,
> it must greater than or equal to nr_cpumask_bits, so fix it to correctly.
> 
> Signed-off-by: Vernon Yang <vernon2gm@gmail.com>

Hi Vernon,

In all that cases, nr_cpu_ids must be used. The difference is that
nr_cpumask_bits is an upper limit for possible CPUs, and it's derived
from compile-time NR_CPUS, unless CPUMASK_OFFSTACK is enabled.

nr_cpu_ids is an actual number of CPUS as counted on boot.

So, nr_cpu_ids is always equal or less than nr_cpumask_bits, and we'd
compare with the smaller number.

Nor sure, but maybe it's worth to introduce a macro like:
 #define valid_cpuid(cpu) (cpu) < nr_cpu_ids

Thanks,
Yury
> ---
>  drivers/char/random.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index ce3ccd172cc8..d76f12a5f74f 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1311,7 +1311,7 @@ static void __cold try_to_generate_entropy(void)
>  			/* Basic CPU round-robin, which avoids the current CPU. */
>  			do {
>  				cpu = cpumask_next(cpu, &timer_cpus);
> -				if (cpu == nr_cpumask_bits)
> +				if (cpu >= nr_cpumask_bits)
>  					cpu = cpumask_first(&timer_cpus);
>  			} while (cpu == smp_processor_id() && num_cpus > 1);
>  
> -- 
> 2.34.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 16:06 ` [PATCH 5/5] cpumask: fix comment of cpumask_xxx Vernon Yang
@ 2023-03-06 16:39   ` Yury Norov
  2023-03-06 16:44     ` Jason A. Donenfeld
  2023-03-06 17:45     ` Vernon Yang
  2023-03-06 17:29   ` Linus Torvalds
  1 sibling, 2 replies; 19+ messages in thread
From: Yury Norov @ 2023-03-06 16:39 UTC (permalink / raw)
  To: Vernon Yang
  Cc: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, andriy.shevchenko, linux, james.smart,
	dick.kennedy, linux-kernel, wireguard, netdev, linux-scsi

On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> optimizations"), the cpumask size is divided into three different case,
> so fix comment of cpumask_xxx correctly.
> 
> Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> ---
>  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
>  1 file changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 8fbe76607965..248bdb1c50dc 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
>   * cpumask_first - get the first cpu in a cpumask
>   * @srcp: the cpumask pointer
>   *
> - * Returns >= nr_cpu_ids if no cpus set.
> + * Returns >= small_cpumask_bits if no cpus set.

There's no such thing like small_cpumask_bits. Here and everywhere,
nr_cpu_ids must be used.

Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
must be like that for all users even now.

nr_cpumask_bits must be considered as internal cpumask parameter and
never referenced outside of cpumask code.

Thansk,
Yury

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 16:39   ` Yury Norov
@ 2023-03-06 16:44     ` Jason A. Donenfeld
  2023-03-06 16:54       ` Yury Norov
  2023-03-06 17:45     ` Vernon Yang
  1 sibling, 1 reply; 19+ messages in thread
From: Jason A. Donenfeld @ 2023-03-06 16:44 UTC (permalink / raw)
  To: Yury Norov
  Cc: Vernon Yang, torvalds, tytso, davem, edumazet, kuba, pabeni,
	jejb, martin.petersen, andriy.shevchenko, linux, james.smart,
	dick.kennedy, linux-kernel, wireguard, netdev, linux-scsi

On Mon, Mar 6, 2023 at 5:39 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> > After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> > optimizations"), the cpumask size is divided into three different case,
> > so fix comment of cpumask_xxx correctly.
> >
> > Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> > ---
> >  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
> >  1 file changed, 23 insertions(+), 23 deletions(-)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 8fbe76607965..248bdb1c50dc 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
> >   * cpumask_first - get the first cpu in a cpumask
> >   * @srcp: the cpumask pointer
> >   *
> > - * Returns >= nr_cpu_ids if no cpus set.
> > + * Returns >= small_cpumask_bits if no cpus set.
>
> There's no such thing like small_cpumask_bits. Here and everywhere,
> nr_cpu_ids must be used.
>
> Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
> must be like that for all users even now.
>
> nr_cpumask_bits must be considered as internal cpumask parameter and
> never referenced outside of cpumask code.

What's the right thing I should do, then, for wireguard's usage and
for random.c's usage? It sounds like you object to this patchset, but
if the problem is real, it sounds like I should at least fix the two
cases I maintain. What's the right check?

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 16:44     ` Jason A. Donenfeld
@ 2023-03-06 16:54       ` Yury Norov
  2023-03-06 17:04         ` Jason A. Donenfeld
  0 siblings, 1 reply; 19+ messages in thread
From: Yury Norov @ 2023-03-06 16:54 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Vernon Yang, torvalds, tytso, davem, edumazet, kuba, pabeni,
	jejb, martin.petersen, andriy.shevchenko, linux, james.smart,
	dick.kennedy, linux-kernel, wireguard, netdev, linux-scsi

On Mon, Mar 06, 2023 at 05:44:41PM +0100, Jason A. Donenfeld wrote:
> On Mon, Mar 6, 2023 at 5:39 PM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> > > After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> > > optimizations"), the cpumask size is divided into three different case,
> > > so fix comment of cpumask_xxx correctly.
> > >
> > > Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> > > ---
> > >  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
> > >  1 file changed, 23 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > index 8fbe76607965..248bdb1c50dc 100644
> > > --- a/include/linux/cpumask.h
> > > +++ b/include/linux/cpumask.h
> > > @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
> > >   * cpumask_first - get the first cpu in a cpumask
> > >   * @srcp: the cpumask pointer
> > >   *
> > > - * Returns >= nr_cpu_ids if no cpus set.
> > > + * Returns >= small_cpumask_bits if no cpus set.
> >
> > There's no such thing like small_cpumask_bits. Here and everywhere,
> > nr_cpu_ids must be used.
> >
> > Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
> > must be like that for all users even now.
> >
> > nr_cpumask_bits must be considered as internal cpumask parameter and
> > never referenced outside of cpumask code.
> 
> What's the right thing I should do, then, for wireguard's usage and
> for random.c's usage? It sounds like you object to this patchset, but
> if the problem is real, it sounds like I should at least fix the two
> cases I maintain. What's the right check?

Everywhere outside of cpumasks internals use (cpu < nr_cpu_ids) to
check if the cpu is in a valid range, like:

cpu = cpumask_first(cpus);
if (cpu >= nr_cpu_ids)
        pr_err("There's no cpus");
 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 16:54       ` Yury Norov
@ 2023-03-06 17:04         ` Jason A. Donenfeld
  0 siblings, 0 replies; 19+ messages in thread
From: Jason A. Donenfeld @ 2023-03-06 17:04 UTC (permalink / raw)
  To: Yury Norov
  Cc: Vernon Yang, torvalds, tytso, davem, edumazet, kuba, pabeni,
	jejb, martin.petersen, andriy.shevchenko, linux, james.smart,
	dick.kennedy, linux-kernel, wireguard, netdev, linux-scsi

On Mon, Mar 6, 2023 at 5:54 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Mon, Mar 06, 2023 at 05:44:41PM +0100, Jason A. Donenfeld wrote:
> > On Mon, Mar 6, 2023 at 5:39 PM Yury Norov <yury.norov@gmail.com> wrote:
> > >
> > > On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> > > > After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> > > > optimizations"), the cpumask size is divided into three different case,
> > > > so fix comment of cpumask_xxx correctly.
> > > >
> > > > Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> > > > ---
> > > >  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
> > > >  1 file changed, 23 insertions(+), 23 deletions(-)
> > > >
> > > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > > index 8fbe76607965..248bdb1c50dc 100644
> > > > --- a/include/linux/cpumask.h
> > > > +++ b/include/linux/cpumask.h
> > > > @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
> > > >   * cpumask_first - get the first cpu in a cpumask
> > > >   * @srcp: the cpumask pointer
> > > >   *
> > > > - * Returns >= nr_cpu_ids if no cpus set.
> > > > + * Returns >= small_cpumask_bits if no cpus set.
> > >
> > > There's no such thing like small_cpumask_bits. Here and everywhere,
> > > nr_cpu_ids must be used.
> > >
> > > Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
> > > must be like that for all users even now.
> > >
> > > nr_cpumask_bits must be considered as internal cpumask parameter and
> > > never referenced outside of cpumask code.
> >
> > What's the right thing I should do, then, for wireguard's usage and
> > for random.c's usage? It sounds like you object to this patchset, but
> > if the problem is real, it sounds like I should at least fix the two
> > cases I maintain. What's the right check?
>
> Everywhere outside of cpumasks internals use (cpu < nr_cpu_ids) to
> check if the cpu is in a valid range, like:
>
> cpu = cpumask_first(cpus);
> if (cpu >= nr_cpu_ids)
>         pr_err("There's no cpus");

Oh, okay, so the ones for wireguard and random.c in this series are
correct then? If so, could you give a Reviewed-by:, and then I'll
queue those up in my respective trees.

Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 16:06 ` [PATCH 5/5] cpumask: fix comment of cpumask_xxx Vernon Yang
  2023-03-06 16:39   ` Yury Norov
@ 2023-03-06 17:29   ` Linus Torvalds
  2023-03-06 17:47     ` Linus Torvalds
  2023-03-06 18:13     ` Vernon Yang
  1 sibling, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2023-03-06 17:29 UTC (permalink / raw)
  To: Vernon Yang
  Cc: tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy, linux-kernel, wireguard, netdev,
	linux-scsi

On Mon, Mar 6, 2023 at 8:07 AM Vernon Yang <vernon2gm@gmail.com> wrote:
>
> After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> optimizations"), the cpumask size is divided into three different case,
> so fix comment of cpumask_xxx correctly.

No no.

Those three cases are meant to be entirely internal optimizations.
They are literally just "preferred sizes".

The correct thing to do is always that

   * Returns >= nr_cpu_ids if no cpus set.

because nr_cpu_ids is always the *smallest* of the access sizes.

That's exactly why it's a ">=". The CPU mask stuff has always
historically potentially used a different size than the actual
nr_cpu_ids, in that it could do word-sized scans even when the machine
might only have a smaller set of CPUs.

So the whole "small" vs "large" should be seen entirely internal to
cpumask.h. We should not expose it outside (sadly, that already
happened with "nr_cpumask_size", which also was that kind of thing.

So no, this patch is wrong. If anything, the comments should be strengthened.

Of course, right now Guenter seems to be reporting a problem with that
optimization, so unless I figure out what is going on I'll just need
to revert it anyway.

                Linus

                Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 16:39   ` Yury Norov
  2023-03-06 16:44     ` Jason A. Donenfeld
@ 2023-03-06 17:45     ` Vernon Yang
  1 sibling, 0 replies; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 17:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: torvalds, tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, andriy.shevchenko, linux, james.smart,
	dick.kennedy, linux-kernel, wireguard, netdev, linux-scsi

On Mon, Mar 06, 2023 at 08:39:03AM -0800, Yury Norov wrote:
> On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> > After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> > optimizations"), the cpumask size is divided into three different case,
> > so fix comment of cpumask_xxx correctly.
> >
> > Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> > ---
> >  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
> >  1 file changed, 23 insertions(+), 23 deletions(-)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 8fbe76607965..248bdb1c50dc 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
> >   * cpumask_first - get the first cpu in a cpumask
> >   * @srcp: the cpumask pointer
> >   *
> > - * Returns >= nr_cpu_ids if no cpus set.
> > + * Returns >= small_cpumask_bits if no cpus set.
>
> There's no such thing like small_cpumask_bits. Here and everywhere,
> nr_cpu_ids must be used.
>
> Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
> must be like that for all users even now.
>
> nr_cpumask_bits must be considered as internal cpumask parameter and
> never referenced outside of cpumask code.

OK, I remove this path for next version.

>
> Thansk,
> Yury

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 17:29   ` Linus Torvalds
@ 2023-03-06 17:47     ` Linus Torvalds
  2023-03-06 18:02       ` Linus Torvalds
  2023-03-06 18:13     ` Vernon Yang
  1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2023-03-06 17:47 UTC (permalink / raw)
  To: Vernon Yang
  Cc: tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy, linux-kernel, wireguard, netdev,
	linux-scsi

On Mon, Mar 6, 2023 at 9:29 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> The correct thing to do is always that
>
>    * Returns >= nr_cpu_ids if no cpus set.
>
> because nr_cpu_ids is always the *smallest* of the access sizes.
>
> Of course, right now Guenter seems to be reporting a problem with that
> optimization, so unless I figure out what is going on I'll just need
> to revert it anyway.

Ahh. And the reason is exactly that people do *not* follow that
"Returns >= nr_cpu_ids" rule.

The drivers/char/random.c code is very wrong, and does

             if (cpu == nr_cpumask_bits)
                             cpu = cpumask_first(&timer_cpus);

which fails miserably exactly because it doesn't use ">=".

Oh well.

I'll have to look for more of this pattern, but basically all those
"xyz_cpumask_bits" things were supposed to always be just internal to
that header file implementation, which is *exactly* why you have to
check the result for ">= nr_cpu_ids".

       Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 17:47     ` Linus Torvalds
@ 2023-03-06 18:02       ` Linus Torvalds
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2023-03-06 18:02 UTC (permalink / raw)
  To: Vernon Yang
  Cc: tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy, linux-kernel, wireguard, netdev,
	linux-scsi

On Mon, Mar 6, 2023 at 9:47 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> The drivers/char/random.c code is very wrong, and does
>
>              if (cpu == nr_cpumask_bits)
>                              cpu = cpumask_first(&timer_cpus);
>
> which fails miserably exactly because it doesn't use ">=".

Turns out this "cpu == nr_cpumask_bits" pattern exists in a couple of
other places too.

It was always wrong, but it always just happened to work. The lpfc
SCSI driver in particular seems to *love* this pattern:

        start_cpu = cpumask_next(new_cpu, cpu_present_mask);
        if (start_cpu == nr_cpumask_bits)
                start_cpu = first_cpu;

and has repeated it multiple times, all incorrect.

We do have "cpumask_next_wrap()", and that *seems* to be what the lpcf
driver actually wants to do.

.. and then we have kernel/sched/fair.c, which is actually not buggy,
just odd. It uses nr_cpumask_bits too, but it uses it purely for its
own internal nefarious reasons - it's not actually related to the
cpumask functions at all, its just used as a "not valid CPU number".

I think that scheduler use is still very *wrong*, but it doesn't look
actively buggy.

The other cases all look very buggy indeed, but yes, they happened to
work, and now they don't. So commit 596ff4a09b89 ("cpumask:
re-introduce constant-sized cpumask optimizations") did break them.

I'd rather fix these bad users than revert, but there does seem to be
an alarming number of these things, which worries me:

     git grep '== nr_cpumask_bits'

and that's just checking for this *exact* thing.

                Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 17:29   ` Linus Torvalds
  2023-03-06 17:47     ` Linus Torvalds
@ 2023-03-06 18:13     ` Vernon Yang
  2023-03-06 18:34       ` Linus Torvalds
  1 sibling, 1 reply; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 18:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy, linux-kernel, wireguard, netdev,
	linux-scsi

On Mon, Mar 06, 2023 at 09:29:10AM -0800, Linus Torvalds wrote:
> On Mon, Mar 6, 2023 at 8:07 AM Vernon Yang <vernon2gm@gmail.com> wrote:
> >
> > After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> > optimizations"), the cpumask size is divided into three different case,
> > so fix comment of cpumask_xxx correctly.
>
> No no.
>
> Those three cases are meant to be entirely internal optimizations.
> They are literally just "preferred sizes".
>
> The correct thing to do is always that
>
>    * Returns >= nr_cpu_ids if no cpus set.
>
> because nr_cpu_ids is always the *smallest* of the access sizes.
>
> That's exactly why it's a ">=". The CPU mask stuff has always
> historically potentially used a different size than the actual
> nr_cpu_ids, in that it could do word-sized scans even when the machine
> might only have a smaller set of CPUs.
>
> So the whole "small" vs "large" should be seen entirely internal to
> cpumask.h. We should not expose it outside (sadly, that already
> happened with "nr_cpumask_size", which also was that kind of thing.

I also just see nr_cpumask_size exposed to outside, so... Sorry.

>
> So no, this patch is wrong. If anything, the comments should be strengthened.
>
> Of course, right now Guenter seems to be reporting a problem with that
> optimization, so unless I figure out what is going on I'll just need
> to revert it anyway.

Yes, cause is the cpumask_next() calls find_next_bit(..., size, ...), and
find_next_bit(..., size, ...) if no bits are set, returns @size.

@size was a nr_cpumask_bits variable before, now it is small_cpumask_bits, and
when NR_CPUS < = BITS_PER_LONG, small_cpumask_bits is a macro, which is
replaced with NR_CPUS at compile, so only the NR_CPUS is returned when it no
further cpus set.

But before nr_cpumask_bits variable, it was read while running, and it was
mutable.

The random.c try_to_generate_entropy() to get first cpu by
`if (cpu == nr_cpumask_bits)`, but cpumask_next() alway return NR_CPUS,
nr_cpumask_bits is nr_cpu_ids, so pass NR_CPUS to add_timer_on(),

>
>                 Linus
>
>                 Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx
  2023-03-06 18:13     ` Vernon Yang
@ 2023-03-06 18:34       ` Linus Torvalds
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2023-03-06 18:34 UTC (permalink / raw)
  To: Vernon Yang
  Cc: tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy, linux-kernel, wireguard, netdev,
	linux-scsi

On Mon, Mar 6, 2023 at 10:13 AM Vernon Yang <vernon2gm@gmail.com> wrote:
>
> I also just see nr_cpumask_size exposed to outside, so...

Yeah, it's not great.

nr_cpumask_bits came out of the exact same "this is an internal value
that we use for optimized cpumask accesses", and existed exactly
because it *might* be the same as 'nr_cpu_ids', but it might also be a
simpler "small constant that is big enough" case.

It just depended on the exact kernel config which one was used.

But clearly that internal value then spread outside, and that then
caused problems when the internal implementation changed.

            Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/5] scsi: lpfc: fix lpfc_cpu_affinity_check() if no further cpus set
  2023-03-06 16:06 ` [PATCH 3/5] scsi: lpfc: fix lpfc_cpu_affinity_check() " Vernon Yang
@ 2023-03-06 18:48   ` Linus Torvalds
  2023-03-06 20:09     ` Vernon Yang
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2023-03-06 18:48 UTC (permalink / raw)
  To: Vernon Yang
  Cc: tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy, linux-kernel, wireguard, netdev,
	linux-scsi

On Mon, Mar 6, 2023 at 8:07 AM Vernon Yang <vernon2gm@gmail.com> wrote:
>
> -                               if (new_cpu == nr_cpumask_bits)
> +                               if (new_cpu >= nr_cpumask_bits)

This all should use "nr_cpu_ids", not "nr_cpumask_bits".

But I really suspect that it should all be rewritten to not do that
thing over and over, but just use a helper function for it.

  int lpfc_next_present_cpu(int n, int alternate)
  {
        n = cpumask_next(n, cpu_present_mask);
        if (n >= nr_cpu_ids)
                n = alternate;
        return n;
  }

and then you could just use

        start_cpu = lpfc_next_present_cpu(new_cpu, first_cpu);

or similar.

              Linus

PS. We "kind of" already have a helper function for this:
cpumask_next_wrap(). But it's really meant for a different pattern
entirely, so let's not confuse things.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/5] scsi: lpfc: fix lpfc_cpu_affinity_check() if no further cpus set
  2023-03-06 18:48   ` Linus Torvalds
@ 2023-03-06 20:09     ` Vernon Yang
  0 siblings, 0 replies; 19+ messages in thread
From: Vernon Yang @ 2023-03-06 20:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: tytso, Jason, davem, edumazet, kuba, pabeni, jejb,
	martin.petersen, yury.norov, andriy.shevchenko, linux,
	james.smart, dick.kennedy, linux-kernel, wireguard, netdev,
	linux-scsi

On Mon, Mar 06, 2023 at 10:48:04AM -0800, Linus Torvalds wrote:
> On Mon, Mar 6, 2023 at 8:07 AM Vernon Yang <vernon2gm@gmail.com> wrote:
> >
> > -                               if (new_cpu == nr_cpumask_bits)
> > +                               if (new_cpu >= nr_cpumask_bits)
>
> This all should use "nr_cpu_ids", not "nr_cpumask_bits".
>
> But I really suspect that it should all be rewritten to not do that
> thing over and over, but just use a helper function for it.
>
>   int lpfc_next_present_cpu(int n, int alternate)
>   {
>         n = cpumask_next(n, cpu_present_mask);
>         if (n >= nr_cpu_ids)
>                 n = alternate;
>         return n;
>   }
>
> and then you could just use
>
>         start_cpu = lpfc_next_present_cpu(new_cpu, first_cpu);

OK, thanks you very much.

I'll send a second version shortly

>
> or similar.
>
>               Linus
>
> PS. We "kind of" already have a helper function for this:
> cpumask_next_wrap(). But it's really meant for a different pattern
> entirely, so let's not confuse things.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-03-06 23:35 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-06 16:06 [PATCH 0/5] fix call cpumask_next() if no further cpus set Vernon Yang
2023-03-06 16:06 ` [PATCH 1/5] random: fix try_to_generate_entropy() " Vernon Yang
2023-03-06 16:26   ` Yury Norov
2023-03-06 16:06 ` [PATCH 2/5] wireguard: fix wg_cpumask_choose_online() " Vernon Yang
2023-03-06 16:06 ` [PATCH 3/5] scsi: lpfc: fix lpfc_cpu_affinity_check() " Vernon Yang
2023-03-06 18:48   ` Linus Torvalds
2023-03-06 20:09     ` Vernon Yang
2023-03-06 16:06 ` [PATCH 4/5] scsi: lpfc: fix lpfc_nvmet_setup_io_context() " Vernon Yang
2023-03-06 16:06 ` [PATCH 5/5] cpumask: fix comment of cpumask_xxx Vernon Yang
2023-03-06 16:39   ` Yury Norov
2023-03-06 16:44     ` Jason A. Donenfeld
2023-03-06 16:54       ` Yury Norov
2023-03-06 17:04         ` Jason A. Donenfeld
2023-03-06 17:45     ` Vernon Yang
2023-03-06 17:29   ` Linus Torvalds
2023-03-06 17:47     ` Linus Torvalds
2023-03-06 18:02       ` Linus Torvalds
2023-03-06 18:13     ` Vernon Yang
2023-03-06 18:34       ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).