From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 24688 invoked from network); 10 Jul 2021 17:11:53 -0000 Received: from mother.openwall.net (195.42.179.200) by inbox.vuxu.org with ESMTPUTF8; 10 Jul 2021 17:11:53 -0000 Received: (qmail 5573 invoked by uid 550); 10 Jul 2021 17:11:51 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 5555 invoked from network); 10 Jul 2021 17:11:50 -0000 Date: Sat, 10 Jul 2021 13:11:38 -0400 From: Rich Felker To: musl@lists.openwall.com Message-ID: <20210710171137.GW13220@brightrain.aerifal.cx> References: <20210710053157.4F16D22215EC@gateway02.insomnia247.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210710053157.4F16D22215EC@gateway02.insomnia247.nl> User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [musl] [PATCH v3] sysconf: add _SC_NPROCESSORS_CONF support On Sat, Jul 10, 2021 at 07:31:57AM +0200, jason wrote: > Here's a simpler bit of code to do the same thing. The differences: > - Doesn't bother using stdio. /proc and /sys reads like this don't sleep > and so don't return short or EINTR or any of that hair. Non-use of stdio may be better but also requires care. The code as you've done it is not safe against cancellation (which it needs to block). > - Check for the terminating newline, and use it in the parsing loop > to trigger completing a range, just like ','. > > The previous code never actually ensured that the buffer was '\0' > terminated. :-( > > - Use one "current number being parsed" variable rather than two. > Just copy n to prev_n on finding a '-'. (Feel free to bikeshed the > variable names to "range_start" or something if you prefer.) > > If you do want to use a FILE *, musl should probably have an internal > API for reading into the stdio buffer and parsing straight from there, > avoiding the separate stack buffer. (Or even allocating the FILE on > the stack and avoiding the malloc.) It does. __fopen_rb_ca. This is probably the preferred way to do it. You can then just consume bytes with getc_unlocked (which expands as a macro within libc). > This is probably far from the last small config file (/proc or /sys or > perhaps something like /etc/hostname) that someone will want to access. > > (You could even get ambitious and make it reallocate the buffer as > required to get the whole file into memory That's not a good thing. > Probably something other > code would be interested in, like timezone parsing.) Time zones are mmapped and used in-place. > #include > #include > #include > > long get_nrprocessors_conf1(void) > { > int fd = open("/sys/devices/system/cpu/possible", > O_RDONLY | O_NOCTTY | O_CLOEXEC); > if (fd < 0) > goto failure; > > char buffer[128]; /* Some reasonable size. */ > ssize_t ss = read(fd, buffer, sizeof buffer); > /* > * Short reads, EINTR, etc. can't happen with /sys files. > * We could in theory preserve the read's errno in case the close > * overwrites it, but that seems excessive. > */ > if (close(fd) < 0 || ss < 0) > goto failure; > if (ss < 2 || buffer[ss-1] != '\n') > goto parse_error; > /* > * The possible CPUs are given as a comma-separated list of ranges, > * with each range being either a single decimal integer ("1") or > * a hyphen-separated range ("1-10"). > * > * Q: How careful do we want the parsing to be? For now, assume > * the kernel can be trusted not to give us "1-2-3" or ",," or > * other strangeness. > */ > char const *p = buffer; > unsigned prev_n = -1u; > unsigned n = 0; > unsigned total = 0; > for (;;) { > char c = *p++; > switch (c) { > case '0': case '1': case '2': case '3': case '4': > case '5': case '6': case '7': case '8': case '9': > if (n > (-1u - 9)/10) > goto parse_error; > n = 10 * n + c - '0'; > break; > case '-': > prev_n = n; > n = 0; > break; > case ',': case '\n': > if (n > prev_n) > total += n - prev_n; > total++; > if (c == '\n') > return (long)total; /* Success! */ > prev_n = -1u; > n = 0; > break; > default: > goto parse_error; > } > } > > parse_error: > errno = ENOTSUP; /* What's a good error to return? */ There is no need to account for data errors at all. If you can't trust the kernel not to give you something invalid, you're running in a compromised context where you can't do anything. If this is really the right interface to get this data from, I would hope there's some simpler way to do it, simplifying it down to assume the data is well-formed (don't invoke UB if it's not, but returning an utterly bogus value if it's not would be fine). A couple minor style nits: no spaces for indentation or extra indentation of case labels, and no need to cast in any place where there's an implicit conversion that does the same (unsigned to long). Rich