mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH v3] sysconf: add _SC_NPROCESSORS_CONF support
Date: Sat, 10 Jul 2021 13:11:38 -0400	[thread overview]
Message-ID: <20210710171137.GW13220@brightrain.aerifal.cx> (raw)
In-Reply-To: <20210710053157.4F16D22215EC@gateway02.insomnia247.nl>

On Sat, Jul 10, 2021 at 07:31:57AM +0200, jason wrote:
> Here's a simpler bit of code to do the same thing.  The differences:
> - Doesn't bother using stdio.  /proc and /sys reads like this don't sleep
>   and so don't return short or EINTR or any of that hair.

Non-use of stdio may be better but also requires care. The code as
you've done it is not safe against cancellation (which it needs to
block).

> - Check for the terminating newline, and use it in the parsing loop
>   to trigger completing a range, just like ','.
> 
>   The previous code never actually ensured that the buffer was '\0'
>   terminated. :-(
> 
> - Use one "current number being parsed" variable rather than two.
>   Just copy n to prev_n on finding a '-'.  (Feel free to bikeshed the
>   variable names to "range_start" or something if you prefer.)
> 
> If you do want to use a FILE *, musl should probably have an internal
> API for reading into the stdio buffer and parsing straight from there,
> avoiding the separate stack buffer.  (Or even allocating the FILE on
> the stack and avoiding the malloc.)

It does. __fopen_rb_ca. This is probably the preferred way to do it.
You can then just consume bytes with getc_unlocked (which expands as a
macro within libc).

> This is probably far from the last small config file (/proc or /sys or
> perhaps something like /etc/hostname) that someone will want to access.
> 
> (You could even get ambitious and make it reallocate the buffer as
> required to get the whole file into memory

That's not a good thing.

> Probably something other
> code would be interested in, like timezone parsing.)

Time zones are mmapped and used in-place.

> #include <unistd.h>
> #include <fcntl.h>
> #include <errno.h>
> 
> long get_nrprocessors_conf1(void)
> {
> 	int fd = open("/sys/devices/system/cpu/possible",
> 			O_RDONLY | O_NOCTTY | O_CLOEXEC);
> 	if (fd < 0)
> 		goto failure;
> 
> 	char buffer[128];	/* Some reasonable size. */
> 	ssize_t ss = read(fd, buffer, sizeof buffer);
> 	/*
> 	 * Short reads, EINTR, etc. can't happen with /sys files.
> 	 * We could in theory preserve the read's errno in case the close
> 	 * overwrites it, but that seems excessive.
> 	 */
> 	if (close(fd) < 0 || ss < 0)
> 		goto failure;
> 	if (ss < 2 || buffer[ss-1] != '\n')
> 		goto parse_error;
> 	/*
> 	 * The possible CPUs are given as a comma-separated list of ranges,
> 	 * with each range being either a single decimal integer ("1") or
> 	 * a hyphen-separated range ("1-10").
> 	 *
> 	 * Q: How careful do we want the parsing to be?  For now, assume
> 	 * the kernel can be trusted not to give us "1-2-3" or ",," or
> 	 * other strangeness.
> 	 */
> 	char const *p = buffer;
> 	unsigned prev_n = -1u;
> 	unsigned n = 0;
> 	unsigned total = 0;
> 	for (;;) {
> 		char c = *p++;
> 		switch (c) {
> 		  case '0': case '1': case '2': case '3': case '4':
> 		  case '5': case '6': case '7': case '8': case '9':
> 			if (n > (-1u - 9)/10)
> 				goto parse_error;
> 			n = 10 * n + c - '0';
> 			break;
> 		  case '-':
> 			prev_n = n;
> 			n = 0;
> 			break;
> 		  case ',': case '\n':
> 			if (n > prev_n)
> 				total += n - prev_n;
> 			total++;
> 			if (c == '\n')
> 				return (long)total;	/* Success! */
> 			prev_n = -1u;
> 			n = 0;
> 			break;
> 		  default:
> 			goto parse_error;
> 		}
> 	}
> 
> parse_error:
> 	errno = ENOTSUP;	/* What's a good error to return? */

There is no need to account for data errors at all. If you can't trust
the kernel not to give you something invalid, you're running in a
compromised context where you can't do anything.

If this is really the right interface to get this data from, I would
hope there's some simpler way to do it, simplifying it down to assume
the data is well-formed (don't invoke UB if it's not, but returning an
utterly bogus value if it's not would be fine).

A couple minor style nits: no spaces for indentation or extra
indentation of case labels, and no need to cast in any place where
there's an implicit conversion that does the same (unsigned to long).

Rich

  reply	other threads:[~2021-07-10 17:11 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-10  5:31 jason
2021-07-10 17:11 ` Rich Felker [this message]
2021-07-10 17:29 ` Samuel Holland
2021-07-10 19:44   ` Rich Felker
2021-07-12 13:08     ` Vincent Donnefort
2021-07-12 15:41       ` Rich Felker
2021-07-12 16:32         ` Vincent Donnefort
2021-07-12 17:14           ` Rich Felker
2021-07-12 18:12             ` Vincent Donnefort
2021-08-07 14:52             ` James Y Knight
2021-08-07 16:29           ` Leah Neukirchen
  -- strict thread matches above, loose matches on Subject: below --
2021-07-06  9:55 Vincent Donnefort

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210710171137.GW13220@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).