From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/6659
Path: news.gmane.org!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: __sched_cpucount returns garbage
Date: Tue, 2 Dec 2014 21:48:51 -0500
Message-ID: <20141203024851.GD4574@brightrain.aerifal.cx>
References: <20141129233632.GA2146@newbook> <20141130113846.GC9258@port70.net> <20141203001115.GC4574@brightrain.aerifal.cx> <20141203013303.GA5250@newbook>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1417574952 1375 80.91.229.3 (3 Dec 2014 02:49:12 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Wed, 3 Dec 2014 02:49:12 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-6672-gllmg-musl=m.gmane.org@lists.openwall.com Wed Dec 03 03:49:05 2014
Return-path:
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1Xw001-00027n-Nh for gllmg-musl@m.gmane.org; Wed, 03 Dec 2014 03:49:05 +0100
Original-Received: (qmail 17450 invoked by uid 550); 3 Dec 2014 02:49:04 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
Original-Received: (qmail 17439 invoked from network); 3 Dec 2014 02:49:03 -0000
Content-Disposition: inline
In-Reply-To: <20141203013303.GA5250@newbook>
User-Agent: Mutt/1.5.21 (2010-09-15)
Original-Sender: Rich Felker
Xref: news.gmane.org gmane.linux.lib.musl.general:6659
Archived-At:

On Tue, Dec 02, 2014 at 05:33:04PM -0800, Isaac Dunham wrote:
> On Tue, Dec 02, 2014 at 07:11:15PM -0500, Rich Felker wrote:
> > On Sun, Nov 30, 2014 at 12:38:46PM +0100, Szabolcs Nagy wrote:
> > > * Isaac Dunham [2014-11-29 15:36:33 -0800]:
> > > > I noticed that nproc ended up on the toybox TODO list (via Tizen), and went
> > > > poking about via strace and ltrace to see where it got the cpu count from.
> > > >
> > > > In the process, I discovered that __sched_cpucount is returning garbage;
> > >
> > > works here as expected:
> > >
> > > #define _GNU_SOURCE
> > > #include <sched.h>
> > > int main()
> > > {
> > >     cpu_set_t s = {0};
> > >     CPU_SET(3, &s);
> > >     CPU_SET(7, &s);
> > >     CPU_SET(24, &s);
> > >     return __sched_cpucount(sizeof s, &s);
> > > }
> > >
> > > returns 3
> > >
> > > > on Alpine Linux on my N270-based netbook (1 physical core but
> > > > hyperthreading makes it look like 2),
> > > > nproc
> > > > outputs a random number of CPUs ranging from 413 to 472.
> > >
> > > see where the cpu_set_t argument comes from
> > > (most likely sched_getaffinity syscall)
> > > then see why that is broken
> > >
> > > __sched_cpucount just counts bit flags
> >
> > Is it possible that the macros from sched.h are using it wrong, or
> > that nproc is using __sched_cpucount directly rather than using the
> > sched.h macros and expecting different behavior from it (perhaps a
> > mismatch between the musl and glibc behavior, like counting bits vs
> > bytes vs longs)?
> >
> > Rich
>
> I have no idea what it's doing; after reading the source, I have *less*
> of an understanding, since it's got half a dozen #ifdefs in the relevant
> code (in lib/nproc.c).
> But I can say that it's returning the result of __sched_cpucount without
> modification (the return matches the output of nproc).
>
> OK, rereading it:
> We're probably using HAVE_SCHED_GETAFFINITY_LIKE_GLIBC, and CPU_COUNT is
> defined.
> So it ostensibly should be more-or-less:
>     if (sched_getaffinity (0, sizeof (set), &set) == 0)
>       {
>         unsigned long count;
>         count = CPU_COUNT(&set);
>         if (count > 0)
>           return count;
>       }
> BUT... isolating that snippet gives me the expected results...if
> I initialize set to 0, which they *don't*.
> So I guess it's the missing initialization.

I think it's a kernel quirk.
It looks like the kernel only fills the part of the cpuset up to the
actual number of cpus the kernel knows about or supports. The syscall
then returns a value (bits? bytes?) indicating the amount filled, and
userspace is responsible for zero-filling the rest and returning zero.
I'll look into the details and fix it.

Rich
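
For illustration only, here is a minimal sketch of the userspace zero-fill described above. It is not the actual musl fix. It assumes the raw Linux sched_getaffinity syscall returns the number of bytes it wrote into the mask, which is what the Linux man page documents for the raw syscall. The helper names getaffinity_zero_filled and count_affinity_cpus are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not musl's code: call the raw syscall, then clear
 * the part of the caller's buffer that the kernel did not write.
 * Assumption: on success the raw syscall returns a byte count. */
static int getaffinity_zero_filled(pid_t pid, size_t size, cpu_set_t *set)
{
    long ret = syscall(SYS_sched_getaffinity, pid, size, set);
    if (ret < 0)
        return -1;                      /* errno was set by syscall() */
    if ((size_t)ret < size)             /* kernel filled only a prefix */
        memset((char *)set + ret, 0, size - (size_t)ret);
    return 0;                           /* libc-style: zero on success */
}

/* Hypothetical helper: count the CPUs in the calling thread's affinity
 * mask without requiring the caller to pre-initialize anything. */
static int count_affinity_cpus(void)
{
    cpu_set_t set;                      /* deliberately left uninitialized */
    if (getaffinity_zero_filled(0, sizeof set, &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

With a wrapper along these lines, a caller that passes an uninitialized cpu_set_t (as the nproc snippet quoted above does) would no longer have leftover stack bits counted by CPU_COUNT.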