Hi, POSIX [1][2] says about the functions posix_spawn_file_actions_addopen posix_spawn_file_actions_adddup2 The function "shall fail if: [EBADF] The value specified by fildes is negative or greater than or equal to {OPEN_MAX}." However, in musl libc 1.2.2, these two test programs exit with status 2: ======================================================================== #include <spawn.h> #include <fcntl.h> int main () { posix_spawn_file_actions_t actions; if (posix_spawn_file_actions_init (&actions) != 0) return 1; if (posix_spawn_file_actions_addopen (&actions, 10000000, "foo", 0, O_RDONLY) == 0) return 2; return 0; } ======================================================================== #include <spawn.h> int main () { posix_spawn_file_actions_t actions; if (posix_spawn_file_actions_init (&actions) != 0) return 1; if (posix_spawn_file_actions_adddup2 (&actions, 10000000, 2) == 0) return 2; return 0; } ======================================================================== sysconf (_SC_OPEN_MAX) is 1024, on that system. Best regards, Bruno [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn_file_actions_addopen.html [2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn_file_actions_adddup2.html
Similarly, POSIX [1] says: "The posix_spawn_file_actions_addclose() function shall fail if: [EBADF] The value specified by fildes is negative." However, in musl libc 1.2.2, this test program exits with status 2: ======================================================================== #include <spawn.h> int main () { posix_spawn_file_actions_t actions; if (posix_spawn_file_actions_init (&actions) != 0) return 1; if (posix_spawn_file_actions_addclose (&actions, -5) == 0) return 2; return 0; } ======================================================================== Best regards, Bruno [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn_file_actions_addclose.html
On Mon, Jan 25, 2021 at 09:31:50AM +0100, Bruno Haible wrote: > Hi, > > POSIX [1][2] says about the functions > posix_spawn_file_actions_addopen > posix_spawn_file_actions_adddup2 > > The function "shall fail if: > [EBADF] > The value specified by fildes is negative or greater than or equal to {OPEN_MAX}." > > However, in musl libc 1.2.2, these two test programs exit with status 2: > ======================================================================== > #include <spawn.h> > #include <fcntl.h> > int main () > { > posix_spawn_file_actions_t actions; > if (posix_spawn_file_actions_init (&actions) != 0) > return 1; > if (posix_spawn_file_actions_addopen (&actions, 10000000, "foo", 0, O_RDONLY) > == 0) > return 2; > return 0; > } > ======================================================================== > #include <spawn.h> > int main () > { > posix_spawn_file_actions_t actions; > if (posix_spawn_file_actions_init (&actions) != 0) > return 1; > if (posix_spawn_file_actions_adddup2 (&actions, 10000000, 2) == 0) > return 2; > return 0; > } > ======================================================================== > > sysconf (_SC_OPEN_MAX) is 1024, on that system. > > Best regards, > > Bruno > > [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn_file_actions_addopen.html > [2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn_file_actions_adddup2.html Thanks. I think I was vaguely aware of this, but misremembered https://www.austingroupbugs.net/view.php?id=418 as dropping the requirement (which is rather odious, especially if a file action for changing rlimit were ever to be added) rather than just removing it for close. With that said, is there any normative text that {OPEN_MAX} in the spec indicates a requirement to honor a dynamic max sysconf(_SC_OPEN_MAX)? As written, the "shall fail" seems to apply just on systems where OPEN_MAX is defined; sysconf isn't referenced. I would very much prefer not to have to enforce such a max here since it's hostile to future extensibility and wastes a syscall in an operation that should not require one. In any case negative values should be checked. Rich
Rich Felker wrote: > I would very much prefer not to have to enforce such a max here since > ... wastes a syscall in an > operation that should not require one. sysconf (_SC_OPEN_MAX) is equivalent to the rlim_cur value from getrlimit(RLIMIT_NOFILE, ...). A mechanism for avoiding syscalls is the vdso [1][2]. How about adding getlimit to the vdso? Do you know why even simple syscalls like getpid() exist in the vdso only for ia64? Bruno [1] https://man7.org/linux/man-pages/man7/vdso.7.html [2] https://lwn.net/Articles/615809/
On Mon, Jan 25, 2021 at 05:07:36PM +0100, Bruno Haible wrote: > Rich Felker wrote: > > I would very much prefer not to have to enforce such a max here since > > ... wastes a syscall in an > > operation that should not require one. > > sysconf (_SC_OPEN_MAX) is equivalent to the rlim_cur value from > getrlimit(RLIMIT_NOFILE, ...). I'm aware of that. But I'm not convinced that the standard as written requires any comparison against sysconf(_SC_OPEN_MAX). Is there a general rule somewhere that {x_MAX} in the text implies a requirement to use the dynamic runtime value if x_MAX is undefined but there's a corresponding _SC_x_MAX? Otherwise "greater than or equal to {OPEN_MAX}" is vacuously false when OPEN_MAX is not defined. > A mechanism for avoiding syscalls is the vdso [1][2]. How about adding > getlimit to the vdso? There's an unbounded amount of stuff like this that technically could be added to the vdso but with diminishing returns. Even if there does turn out to be a requirement to check against the dynamic max here, I don't think the cost is sufficient to justify added complexity. > Do you know why even simple syscalls like getpid() > exist in the vdso only for ia64? In the case of getpid, because glibc does (or at least did? not sure if they still do) cache it themselves in the TCB, adding it to vdso offered no additional concrete benefit. Rich
Rich Felker wrote: > I'm not convinced that the standard as written > requires any comparison against sysconf(_SC_OPEN_MAX). Is there a > general rule somewhere that {x_MAX} in the text implies a requirement > to use the dynamic runtime value if x_MAX is undefined but there's a > corresponding _SC_x_MAX? As far as I understand, [1] defines the meaning of {OPEN_MAX}, and [2] says that {OPEN_MAX} is sysconf(_SC_OPEN_MAX). Also, [2] says "sysconf(_SC_OPEN_MAX) may return different values before and after a call to setrlimit() which changes the RLIMIT_NOFILE soft limit." Bruno [1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/limits.h.html [2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/sysconf.html
On 2021-01-25 19:15, Rich Felker wrote: > On Mon, Jan 25, 2021 at 05:07:36PM +0100, Bruno Haible wrote: >> Do you know why even simple syscalls like getpid() >> exist in the vdso only for ia64? > > In the case of getpid, because glibc does (or at least did? not sure > if they still do) cache it themselves in the TCB, adding it to vdso > offered no additional concrete benefit. > Glibc dropped PID caching long ago[1], and it famously regressed systemd at the time[2]. [1] https://sourceware.org/git/?p=glibc.git;a=commit;h=c579f48edba88380635ab98cb612030e3ed8691e [2] https://bugzilla.redhat.com/show_bug.cgi?id=1469670 Alexey
On Mon, Jan 25, 2021 at 11:15:12AM -0500, Rich Felker wrote:
> In the case of getpid, because glibc does (or at least did? not sure
> if they still do) cache it themselves in the TCB, adding it to vdso
> offered no additional concrete benefit.
>
> Rich
They no longer do that. The manpage says they did, but stopped because
people kept running the fork, vfork, and clone system calls directly.
What eventually killed the caching was an irreconcilable race condition
in the clone() wrapper function: If the child received a signal right
after being created but before the getpid() cache was invalidated,
getpid() (which is specified as being async-signal-safe) would return
invalid values. Now, this is fixable by blocking signals during that
time, but that would be a lot of work for little benefit.
Adding a getpid vDSO call would require adding the code itself, adding
the PID to the vvar page (isn't that shared among all processes?), and
adding support for finding the vDSO function into the libcs. All of that
for rather dubious performance benefits (or can you name a program that
was not fast enough because getpid() ran too slowly?)
Ciao,
Markus
* Markus Wichmann: > Adding a getpid vDSO call would require adding the code itself, adding > the PID to the vvar page (isn't that shared among all processes?), and > adding support for finding the vDSO function into the libcs. All of that > for rather dubious performance benefits (or can you name a program that > was not fast enough because getpid() ran too slowly?) There's some discussion about this in the context of extensible rseq support, including userspace donating thread-local data for kernel use (“KTLS”). Reliable PID/TID/UID caching in userspace and sigprocmask with a userspace fastpath are potential applications. Thanks, Florian -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill