Currently, _SC_NPROCESSORS_CONF is always equal to _SC_NPROCESSORS_ONLN.
However, the former is expected to give the total number of CPUs
in the system, while the latter must return only the number of CPUs which
are currently online. This distinction is important for software such as
trace-cmd. Trace-cmd is a front-end for the kernel tracing tool ftrace.
When recording traces, trace-cmd needs to get the total number of CPUs
available in the system (_SC_NPROCESSORS_CONF) and not only the online ones;
otherwise, if a CPU goes offline, some data might be missing.

Hence, add a specific method to get _SC_NPROCESSORS_CONF, based on the
sysfs CPU entries /sys/devices/system/cpu/cpu[0-9]

diff --git a/src/conf/sysconf.c b/src/conf/sysconf.c
index 3baaed32..6281cfb6 100644
--- a/src/conf/sysconf.c
+++ b/src/conf/sysconf.c
@@ -1,12 +1,17 @@
+#include <ctype.h>
 #include <unistd.h>
 #include <limits.h>
 #include <errno.h>
 #include <sys/resource.h>
 #include <signal.h>
+#include <string.h>
 #include <sys/sysinfo.h>
 #include "syscall.h"
 #include "libc.h"
 
+#define _GNU_SOURCE
+#include <dirent.h>
+
 #define JT(x) (-256|(x))
 #define VER JT(1)
 #define JT_ARG_MAX JT(2)
@@ -22,6 +27,42 @@
 
 #define RLIM(x) (-32768|(RLIMIT_ ## x))
 
+static inline int get_nrprocessors_conf(void)
+{
+	DIR *d = opendir("/sys/devices/system/cpu");
+	struct dirent *de;
+	unsigned int cnt = 0;
+
+	if (!d)
+		return -1;
+
+	while ((de = readdir(d))) {
+		if (de->d_type == DT_DIR &&
+		    strlen(de->d_name) > 3 &&
+		    de->d_name[0] == 'c' &&
+		    de->d_name[1] == 'p' &&
+		    de->d_name[2] == 'u' &&
+		    isdigit(de->d_name[3]))
+			cnt++;
+	}
+
+	closedir(d);
+
+	return cnt;
+}
+
+static inline int get_nrprocessors_onln(void)
+{
+	unsigned char set[128] = {1};
+	int i, cnt;
+
+	__syscall(SYS_sched_getaffinity, 0, sizeof set, set);
+	for (i=cnt=0; i<sizeof set; i++)
+		for (; set[i]; set[i]&=set[i]-1, cnt++);
+
+	return cnt;
+}
+
 long sysconf(int name)
 {
 	static const short values[] = {
@@ -193,14 +234,13 @@ long sysconf(int name)
 		return SEM_VALUE_MAX;
 	case JT_DELAYTIMER_MAX & 255:
 		return DELAYTIMER_MAX;
-	case JT_NPROCESSORS_CONF & 255:
+	case JT_NPROCESSORS_CONF & 255: ;
+		int cnt = get_nrprocessors_conf();
+		if (cnt > 0)
+			return cnt;
+		return get_nrprocessors_onln();
 	case JT_NPROCESSORS_ONLN & 255: ;
-		unsigned char set[128] = {1};
-		int i, cnt;
-		__syscall(SYS_sched_getaffinity, 0, sizeof set, set);
-		for (i=cnt=0; i<sizeof set; i++)
-			for (; set[i]; set[i]&=set[i]-1, cnt++);
-		return cnt;
+		return get_nrprocessors_onln();
 	case JT_PHYS_PAGES & 255:
 	case JT_AVPHYS_PAGES & 255: ;
 		unsigned long long mem;
-- 
2.27.0
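
For illustration only, here is a minimal sketch of how a caller in the position of trace-cmd might depend on the configured count rather than the online count. The helper name alloc_per_cpu_counters is hypothetical and is not taken from trace-cmd or from this patch. It also assumes that CPU ids are contiguous from 0, so that a count is enough to size a per-CPU array.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate one zeroed counter per configured CPU.
 * Sizing by _SC_NPROCESSORS_CONF (not _SC_NPROCESSORS_ONLN) keeps a slot
 * for CPUs that are offline now but may come online later.
 * Assumption: CPU ids run contiguously from 0 to count - 1.
 */
static unsigned long *alloc_per_cpu_counters(long *ncpus)
{
	long n = sysconf(_SC_NPROCESSORS_CONF);

	if (n < 1)
		n = 1;	/* fall back to a single slot if the query fails */
	*ncpus = n;
	return calloc((size_t)n, sizeof(unsigned long));
}
```

If the two sysconf() values are equal by construction, as they are before this change, a buffer sized this way has no room for a CPU that was offline when the count was taken.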
On Wed, 5 May 2021, Vincent Donnefort wrote:
> Currently, _SC_NPROCESSORS_CONF is always equal to _SC_NPROCESSORS_ONLN.
> However, the former is expected to give the total number of CPUs
> in the system, while the latter must return only the number of CPUs which
> are currently online. This distinction is important for software such as
> trace-cmd. Trace-cmd is a front-end for the kernel tracing tool ftrace.
> When recording traces, trace-cmd needs to get the total number of CPUs
> available in the system (_SC_NPROCESSORS_CONF) and not only the online ones;
> otherwise, if a CPU goes offline, some data might be missing.
>
> Hence, add a specific method to get _SC_NPROCESSORS_CONF, based on the
> sysfs CPU entries /sys/devices/system/cpu/cpu[0-9]
Why do the opendir instead of reading from /sys/devices/system/cpu/possible?
The online/offline/possible CPU masks are documented in
linux/Documentation/ABI/testing/sysfs-devices-system-cpu and
linux/Documentation/cputopology.txt
Alexander
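
As a rough sketch of the alternative suggested above, the following user-space code counts the CPUs named in a kernel "cpulist" file such as /sys/devices/system/cpu/possible, whose content looks like "0-3" or "0-3,8-11". The function names count_cpulist and count_cpus_in_file are made up for this example. It uses stdio for brevity, which a libc-internal implementation would probably avoid, so it should not be read as the proposed patch.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the CPUs named by a cpulist string such as "0-3,8-11\n".
 * Returns -1 on malformed input, 0 for an empty list.
 */
static int count_cpulist(const char *s)
{
	int count = 0;

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s || lo < 0)
			return -1;	/* no valid CPU id at this position */
		if (*end == '-') {	/* a range "lo-hi" */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
		}
		count += (int)(hi - lo + 1);
		if (*end == ',')
			end++;		/* skip the separator before the next entry */
		s = end;
	}
	return count;
}

/* Read the first line of a cpulist file and count its CPUs; -1 on failure. */
static int count_cpus_in_file(const char *path)
{
	char buf[4096];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof buf, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return count_cpulist(buf);
}
```

A caller could pass "/sys/devices/system/cpu/possible" to count_cpus_in_file() and fall back to the affinity-based count when it returns -1, mirroring the fallback in the patch.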
On Wed, May 05, 2021 at 05:04:53PM +0300, Alexander Monakov wrote:
>
>
> On Wed, 5 May 2021, Vincent Donnefort wrote:
>
> > Currently, _SC_NPROCESSORS_CONF is always equal to _SC_NPROCESSORS_ONLN.
> > However, the former is expected to give the total number of CPUs
> > in the system, while the latter must return only the number of CPUs which
> > are currently online. This distinction is important for software such as
> > trace-cmd. Trace-cmd is a front-end for the kernel tracing tool ftrace.
> > When recording traces, trace-cmd needs to get the total number of CPUs
> > available in the system (_SC_NPROCESSORS_CONF) and not only the online ones;
> > otherwise, if a CPU goes offline, some data might be missing.
> >
> > Hence, add a specific method to get _SC_NPROCESSORS_CONF, based on the
> > sysfs CPU entries /sys/devices/system/cpu/cpu[0-9]
>
> Why do the opendir instead of reading from /sys/devices/system/cpu/possible?
> The online/offline/possible CPU masks are documented in
> linux/Documentation/ABI/testing/sysfs-devices-system-cpu and
> linux/Documentation/cputopology.txt
>
> Alexander
Could indeed use one of the CPU masks. "present" is probably better suited for
this usage. "possible" seems to behave differently across architectures,
e.g. it is CONFIG_HOTPLUG dependent on x86.
Will do a V2 based on the present mask.
--
Vincent
On Wed, May 5, 2021 at 10:05 AM Alexander Monakov <amonakov@ispras.ru> wrote:
> On Wed, 5 May 2021, Vincent Donnefort wrote:
> > Currently, _SC_NPROCESSORS_CONF is always equal to _SC_NPROCESSORS_ONLN.
> > However, the former is expected to give the total number of CPUs
> > in the system, while the latter must return only the number of CPUs which
> > are currently online. This distinction is important for software such as
> > trace-cmd. Trace-cmd is a front-end for the kernel tracing tool ftrace.
> > When recording traces, trace-cmd needs to get the total number of CPUs
> > available in the system (_SC_NPROCESSORS_CONF) and not only the online ones;
> > otherwise, if a CPU goes offline, some data might be missing.

BTW, it looks like what trace-cmd actually needs is the "largest cpu
id-number that could exist this boot" (as used by sched_getaffinity,
pthread_setaffinity_np, etc.) rather than the "total number of CPUs which
could exist this boot".

Now, as far as I can tell _in practice_ the kernel always allocates
"possible" cpu ids contiguously (so /sys/devices/system/cpu/possible will
always contain e.g. "0-3", rather than something like "0,1,46,47"), but the
data structures don't appear to require that. It's stored and reported as a
bitmask. If Linux intends to guarantee that the "possible" bitset is (and
will always remain) contiguous from 0, then the "largest" and "count"
numbers are effectively equivalent, but otherwise they are not.

> > Hence, add a specific method to get _SC_NPROCESSORS_CONF, based on the
> > sysfs CPU entries /sys/devices/system/cpu/cpu[0-9]
>
> Why do the opendir instead of reading from /sys/devices/system/cpu/possible?
> The online/offline/possible CPU masks are documented in
> linux/Documentation/ABI/testing/sysfs-devices-system-cpu and
> linux/Documentation/cputopology.txt

The /sys/devices/system/cpu/cpuNN directories are created for every cpu in
the "possible" bitmask, so those should be equivalent. I expect the reason
glibc uses readdir is simply because "possible" was only introduced in
Linux 2.6.26 in 2008.
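
To make the "largest id" versus "count" distinction concrete, here is a small sketch over a byte-array CPU mask. The names mask_count and mask_highest_id are invented for this example. The sketch assumes CPU n is stored in byte n/8, bit n%8, which matches a little-endian view of the kernel's unsigned long mask. The count does not depend on that assumption, but the highest id does.

```c
#include <stddef.h>

/* Number of set bits, using the same clear-lowest-set-bit loop as the patch. */
static int mask_count(const unsigned char *set, size_t len)
{
	int cnt = 0;

	for (size_t i = 0; i < len; i++)
		for (unsigned char b = set[i]; b; b &= b - 1)
			cnt++;
	return cnt;
}

/*
 * Highest CPU id present in the mask, or -1 if the mask is empty.
 * Assumes CPU n lives in byte n/8, bit n%8 (little-endian layout).
 */
static int mask_highest_id(const unsigned char *set, size_t len)
{
	for (size_t i = len; i-- > 0; ) {
		if (!set[i])
			continue;
		for (int bit = 7; bit >= 0; bit--)
			if (set[i] & (1u << bit))
				return (int)(i * 8 + bit);
	}
	return -1;
}
```

For a hypothetical mask with CPUs 0, 1, 46 and 47 set, mask_count() gives 4 while mask_highest_id() gives 47, so an array indexed by CPU id would need 48 entries rather than 4. For a contiguous mask such as 0-3 the two agree, since the highest id plus one equals the count.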