mailing list of musl libc
* __sched_cpucount returns garbage
From: Isaac Dunham @ 2014-11-29 23:36 UTC
  To: musl

Hello,
I noticed that nproc ended up on the toybox TODO list (via Tizen), and went
poking about via strace and ltrace to see where it got the cpu count from.

In the process, I discovered that __sched_cpucount is returning garbage;
on Alpine Linux on my N270-based netbook (1 physical core but 
hyperthreading makes it look like 2),
nproc
outputs a random number of CPUs ranging from 413 to 472.
ltrace indicates that this is calling __sched_cpucount() and printing 
its return value.

nproc --all
calls sysconf(_SC_NPROCESSORS_CONF) and gets the proper number of CPUs.

Thanks,
Isaac Dunham



* Re: __sched_cpucount returns garbage
From: Szabolcs Nagy @ 2014-11-30 11:38 UTC
  To: musl

* Isaac Dunham <ibid.ag@gmail.com> [2014-11-29 15:36:33 -0800]:
> I noticed that nproc ended up on the toybox TODO list (via Tizen), and went
> poking about via strace and ltrace to see where it got the cpu count from.
> 
> In the process, I discovered that __sched_cpucount is returning garbage;

works here as expected:

#define _GNU_SOURCE
#include <sched.h>
int main(void)
{
	cpu_set_t s = {0};	/* start from an all-zero set */
	CPU_SET(3, &s);
	CPU_SET(7, &s);
	CPU_SET(24, &s);
	/* the exit status is the number of bits set */
	return __sched_cpucount(sizeof s, &s);
}

returns 3

> on Alpine Linux on my N270-based netbook (1 physical core but 
> hyperthreading makes it look like 2),
> nproc
> outputs a random number of CPUs ranging from 413 to 472.

see where the cpu_set_t argument comes from
(most likely sched_getaffinity syscall)
then see why that is broken

__sched_cpucount just counts bit flags
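
To make "just counts bit flags" concrete, here is a minimal sketch of such a counter. It is not musl's source; the name count_set_bits is made up for illustration. The point is that a function like this knows nothing about how many CPUs exist, so whatever happens to be in the buffer gets counted.

```c
#include <stddef.h>

/* Hypothetical helper, not musl's implementation: count the bits set
 * in the first `size` bytes of `set`, one byte at a time. */
int count_set_bits(size_t size, const void *set)
{
	const unsigned char *p = set;
	int count = 0;

	for (size_t i = 0; i < size; i++) {
		/* b &= b - 1 clears the lowest set bit, so the inner
		 * loop runs once per set bit in this byte. */
		for (unsigned char b = p[i]; b; b &= b - 1)
			count++;
	}
	return count;
}
```

Given a 128-byte buffer with only bits 3, 7 and 24 set it reports 3; given the same buffer filled with 0xff it reports 1024. Leftover stack contents would land somewhere in between, which would fit a count in the 400s.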




* Re: __sched_cpucount returns garbage
From: Rich Felker @ 2014-12-03  0:11 UTC
  To: musl

On Sun, Nov 30, 2014 at 12:38:46PM +0100, Szabolcs Nagy wrote:
> see where the cpu_set_t argument comes from
> (most likely sched_getaffinity syscall)
> then see why that is broken
> 
> __sched_cpucount just counts bit flags

Is it possible that the macros from sched.h are using it wrong, or
that nproc is using __sched_cpucount directly rather than using the
sched.h macros and expecting different behavior from it (perhaps a
mismatch between the musl and glibc behavior, like counting bits vs
bytes vs longs)?

Rich



* Re: __sched_cpucount returns garbage
From: Isaac Dunham @ 2014-12-03  1:33 UTC
  To: musl

On Tue, Dec 02, 2014 at 07:11:15PM -0500, Rich Felker wrote:
> Is it possible that the macros from sched.h are using it wrong, or
> that nproc is using __sched_cpucount directly rather than using the
> sched.h macros and expecting different behavior from it (perhaps a
> mismatch between the musl and glibc behavior, like counting bits vs
> bytes vs longs)?
> 
> Rich

I have no idea what it's doing; after reading the source, I have *less*
of an understanding, since it's got half a dozen #ifdefs in the relevant
code (in lib/nproc.c).
But I can say that it's returning the result of __sched_cpucount without
modification (the return matches the output of nproc).

OK, rereading it:
We're probably using HAVE_SCHED_GETAFFINITY_LIKE_GLIBC, and CPU_COUNT is
defined.
So it ostensibly should be more-or-less:
  if (sched_getaffinity (0, sizeof (set), &set) == 0)
    {
      unsigned long count;
      count = CPU_COUNT(&set);
      if (count > 0)
        return count;
    }
BUT... isolating that snippet gives me the expected results...if
I initialize set to 0, which they *don't*.
So I guess it's the missing initialization.
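
A caller that does not want to depend on the library clearing the set can zero it before the call. The following is only a sketch of that workaround, not the code in lib/nproc.c; the helper name affinity_cpu_count is hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: number of CPUs in the calling thread's affinity
 * mask, or -1 if sched_getaffinity() fails. */
int affinity_cpu_count(void)
{
	cpu_set_t set;

	/* Clear every bit first, so any part of the set that is not
	 * written by the call still counts as zero. */
	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof set, &set) != 0)
		return -1;
	return CPU_COUNT(&set);
}
```

On the netbook described above this would be expected to give 2 rather than a number in the hundreds.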


Thanks,
Isaac Dunham




* Re: __sched_cpucount returns garbage
From: Rich Felker @ 2014-12-03  2:48 UTC
  To: musl

On Tue, Dec 02, 2014 at 05:33:04PM -0800, Isaac Dunham wrote:
> I have no idea what it's doing; after reading the source, I have *less*
> of an understanding, since it's got half a dozen #ifdefs in the relevant
> code (in lib/nproc.c).
> But I can say that it's returning the result of __sched_cpucount without
> modification (the return matches the output of nproc).
> 
> OK, rereading it:
> We're probably using HAVE_SCHED_GETAFFINITY_LIKE_GLIBC, and CPU_COUNT is
> defined.
> So it ostensibly should be more-or-less:
>   if (sched_getaffinity (0, sizeof (set), &set) == 0)
>     {
>       unsigned long count;
>       count = CPU_COUNT(&set);
>       if (count > 0)
>         return count;
>     }
> BUT... isolating that snippet gives me the expected results...if
> I initialize set to 0, which they *don't*.
> So I guess it's the missing initialization.

I think it's a kernel quirk. It looks like the kernel only fills the
part of the cpuset up to the actual number of cpus the kernel knows
about or supports. The syscall then returns a value (bits? bytes?)
indicating the amount filled, and userspace is responsible for
zero-filling the rest and returning zero. I'll look into the details
and fix it.
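
For illustration, the behaviour described here can be sketched in userspace roughly as follows. On Linux the raw sched_getaffinity syscall returns the number of bytes it wrote into the buffer, so a wrapper can zero the remainder itself. This is an assumption-laden sketch, not the actual musl change; the function name getaffinity_zero_filled is hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper, not musl's code: call the raw syscall, then
 * zero whatever part of the set the kernel left untouched.
 * Returns 0 on success, -1 on failure (errno is set by syscall()). */
int getaffinity_zero_filled(pid_t tid, size_t size, cpu_set_t *set)
{
	/* On success the raw syscall returns the number of bytes the
	 * kernel copied into the buffer, which may be less than size. */
	long ret = syscall(SYS_sched_getaffinity, (long)tid, size, set);

	if (ret < 0)
		return -1;
	if ((size_t)ret < size)
		memset((char *)set + ret, 0, size - (size_t)ret);
	return 0;
}
```

With a wrapper along these lines, a caller that passes an uninitialized cpu_set_t still gets a sane CPU_COUNT afterwards.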

Rich



* Re: __sched_cpucount returns garbage
From: Rich Felker @ 2014-12-03  3:19 UTC
  To: musl

On Tue, Dec 02, 2014 at 09:48:51PM -0500, Rich Felker wrote:
> > OK, rereading it:
> > We're probably using HAVE_SCHED_GETAFFINITY_LIKE_GLIBC, and CPU_COUNT is
> > defined.
> > So it ostensibly should be more-or-less:
> >   if (sched_getaffinity (0, sizeof (set), &set) == 0)
> >     {
> >       unsigned long count;
> >       count = CPU_COUNT(&set);
> >       if (count > 0)
> >         return count;
> >     }
> > BUT... isolating that snippet gives me the expected results...if
> > I initialize set to 0, which they *don't*.
> > So I guess it's the missing initialization.
> 
> I think it's a kernel quirk. It looks like the kernel only fills the
> part of the cpuset up to the actual number of cpus the kernel knows
> about or supports. The syscall then returns a value (bits? bytes?)
> indicating the amount filled, and userspace is responsible for
> zero-filling the rest and returning zero. I'll look into the details
> and fix it.

Should be fixed as of commit a56e339419c1a90f8a85f86621f3c73945e07b23.
The subsequent commit 66140b0c926ed097f2cb7474863523e4af351f5b also
fixes the return value for the pthread_*_np variants.

Rich


