From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_MSPIKE_H2, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 7753 invoked from network); 18 Aug 2022 12:37:13 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 18 Aug 2022 12:37:13 -0000 Received: (qmail 11289 invoked by uid 550); 18 Aug 2022 12:37:10 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 10223 invoked from network); 18 Aug 2022 12:37:09 -0000 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.com; s=mail; t=1660825920; bh=kAW2VvEz7H1aiAzqnfpeVJV8NrtV8BFJF4/GMOBqmRU=; h=In-Reply-To:Message-ID:Subject:References:To:From:Date; b=kmMe/n72rnBTYGTmNR5HtiBEaVTCkFz5A4L5dv8IavNXtUeg31APuquhWtupLYYQ+ rmhZMLmgj8hFLlyfInfDdl3DU4h9AgC/XmbfzRda8uU45PArd7ra+KeU+DkjWQT5Af 9UbZ/Y7Bi4/F5Z5U3oOQ/oILqx5I2e/VjRBJ6Sr4= Authentication-Results: vla1-62318bfe5573.qloud-c.yandex.net; dkim=pass header.i=@yandex.com Date: Thu, 18 Aug 2022 12:31:52 +0000 From: Guilherme Janczak To: musl@lists.openwall.com Message-ID: <20220818123152.xtx74dywquxl227p@yandex.com> Mail-Followup-To: musl@lists.openwall.com References: <20220810130311.dwk7zxwkz32igrdm@yandex.com> <20220818003531.GG7074@brightrain.aerifal.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20220818003531.GG7074@brightrain.aerifal.cx> Subject: Re: [musl] [PATCH] add close_range() syscall wrapper Please let me know if you need any changes, thanks. On Wed, Aug 17, 2022 at 08:35:32PM -0400, Rich Felker wrote: > On Wed, Aug 10, 2022 at 01:03:11PM +0000, Guilherme Janczak wrote: > > close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc > > 2.34 added a wrapper. > > --- > > The existence of this operation has been controversial, and it's > arguable that it should be excluded by policy not to support UB (it's > UB to close fds you don't own that might be used internally by the > implementation) though I'm not sure it really helps since folks who > want to use it will just make the syscall directly. We should probably > at least consider it for inclusion. > The Linux pull request has more information on the syscall: https://lkml.iu.edu/hypermail/linux/kernel/2008.0/02649.html The situation right now is that everyone who wants to do this has their own fallback: https://github.com/FreeRADIUS/freeradius-server/blob/a8f7670a15b0d135e82d7029ad8202403b88e3f3/src/lib/server/exec.c#L312 https://github.com/openssh/openssh-portable/blob/715c892f0a5295b391ae92c26ef4d6a86ea96e8e/openbsd-compat/bsd-closefrom.c#L123-L134 https://gitlab.freedesktop.org/dbus/dbus/-/blob/7e0c51d8000a467a153700420a35a18d1b580c51/NEWS#L49-L50 I myself have been writing my own this past week. One thing I noticed is that they all suffer from subtle bugs. For instance, usually the last fallback is calling getrlimit() or sysconf() and closing everything between the desired number and the maximum reported by either function as done by libbsd: https://cgit.freedesktop.org/libbsd/tree/src/closefrom.c#n78 However, POSIX allows having a fd numbered above that limit. Here's a test program: ``` #define _POSIX_C_SOURCE 200809L #include #include #include #include #include int main(void) { struct rlimit rl; int fd; getrlimit(RLIMIT_NOFILE, &rl); if (rl.rlim_cur == RLIM_INFINITY) rl.rlim_cur = 20; if ((fd = dup2(STDOUT_FILENO, rl.rlim_cur+1)) == -1) { if (errno != EBADF) { perror(NULL); return 1; } printf( "Failed to open file descriptor %d because it's over the\n" "default RLIM_NOFILE limit of %d.\n", (int)rl.rlim_cur+1, (int)rl.rlim_cur); if ((fd = dup2(STDOUT_FILENO, rl.rlim_cur-1)) == -1) { perror(NULL); return 1; } } rl.rlim_max = rl.rlim_cur /= 2; if (setrlimit(RLIMIT_NOFILE, &rl) == -1) { fputs("failed to set the open file descriptor limit below" " the highest open fd\n", stderr); perror(NULL); return 1; } else { printf( "I have fd %d open and I've reduced the fd limit.\n" "getrlimit() says the hard limit on open fds is %d.\n" "sysconf(_SC_OPEN_MAX) reports a limit of %ld.\n", fd, (int)rl.rlim_max, sysconf(_SC_OPEN_MAX)); } } ``` Alpine Linux (musl) and Hyperbola Linux (glibc) give the same output: Failed to open file descriptor 1025 because it's over the default RLIM_NOFILE limit of 1024. I have fd 1023 open and I've reduced the fd limit. getrlimit() says the hard limit on open fds is 512. sysconf(_SC_OPEN_MAX) reports a limit of 512. That bug can only be fixed by guaranteeing this last fallback is not reached. > > include/unistd.h | 3 +++ > > src/linux/close_range.c | 8 ++++++++ > > 2 files changed, 11 insertions(+) > > create mode 100644 src/linux/close_range.c > > > > diff --git a/include/unistd.h b/include/unistd.h > > index 80be3b26..e4b8cbb1 100644 > > --- a/include/unistd.h > > +++ b/include/unistd.h > > @@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned); > > #define L_INCR 1 > > #define L_XTND 2 > > int brk(void *); > > +#define CLOSE_RANGE_UNSHARE 0x01 > > +#define CLOSE_RANGE_CLOEXEC 0x02 > > +int close_range(unsigned int, unsigned int, int); > > void *sbrk(intptr_t); > > pid_t vfork(void); > > int vhangup(void); > > diff --git a/src/linux/close_range.c b/src/linux/close_range.c > > new file mode 100644 > > index 00000000..4c5cd68c > > --- /dev/null > > +++ b/src/linux/close_range.c > > @@ -0,0 +1,8 @@ > > +#define _GNU_SOURCE > > This should probably be _BSD_SOURCE if it's exposed under _BSD_SOURCE. glibc exposes it under _GNU_SOURCE: https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/unistd.h;h=764d914cea1d22ed1a9d0d60b00d9c97f4560a0f;hb=HEAD#l1202 It might be more correct to expose it under both because it's also a FreeBSD function, I was just thinking about doing the same thing as glibc here. > > > +#include > > +#include "syscall.h" > > + > > +int close_range(unsigned int first, unsigned int last, int flags) > > +{ > > + return syscall(SYS_close_range, first, last, (unsigned int)flags); > > +} > > -- > > 2.35.1 > > Can you elaborate on what the cast to unsigned is for? > > Rich The Linux syscall wants the flags parameter to be unsigned int, but the libc function uses int. syscall() would pun it and that's the same thing as a cast, but it seemed to make sense to show it's intentional.