* [musl] [PATCH] add close_range() syscall wrapper
@ 2023-09-01 8:02 Natanael Copa
2023-09-01 13:57 ` Rich Felker
0 siblings, 1 reply; 15+ messages in thread
From: Natanael Copa @ 2023-09-01 8:02 UTC (permalink / raw)
To: musl; +Cc: Natanael Copa
close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
2.34 added a wrapper.
Expose it under _GNU_SOURCE similar to what GNU libc does. Also expose
it under _BSD_SOURCE since it is also a FreeBSD function.
---
This is a re-take of close_range() syscall wrapper. It was previously
discussed in mailing list: https://www.openwall.com/lists/musl/2022/08/18/5
Difference from previous submission:
- Use correct values for CLOSE_RANGE_UNSHARE and CLOSE_RANGE_CLOEXEC.
- Set errno on errors.
- Drop the (unsigned int) cast for flags, as it raised questions last
time.
I think it is a good idea to add this to musl because it is difficult to
the close before exec properly without it.
Most workaounrds currently out there are either parsing /proc or try to
close everything to maxfd reported by getrlimit(), sysconf() or
getdtablesize().
opendir("/proc/self/fd") is problematic becase 1) /proc may not be
mounted and 2) some versions of musl hangs on malloc between fork and
exec.
Trying to close everything between a maxfd is also problematic. On some
systems (under docker for example) the maxfd can be 1G, which
effectively results in a hang. (See https://github.com/k0sproject/k0s/pull/3436)
I think its better to encourage the use of close_range().
See also: https://github.com/OpenRC/openrc/pull/645
include/unistd.h | 3 +++
src/linux/close_range.c | 8 ++++++++
2 files changed, 11 insertions(+)
create mode 100644 src/linux/close_range.c
diff --git a/include/unistd.h b/include/unistd.h
index 5bc7f798..d89e3d4c 100644
--- a/include/unistd.h
+++ b/include/unistd.h
@@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
#define L_INCR 1
#define L_XTND 2
int brk(void *);
+#define CLOSE_RANGE_UNSHARE (1U << 1)
+#define CLOSE_RANGE_CLOEXEC (1U << 2)
+int close_range(unsigned int, unsigned int, int);
void *sbrk(intptr_t);
pid_t vfork(void);
int vhangup(void);
diff --git a/src/linux/close_range.c b/src/linux/close_range.c
new file mode 100644
index 00000000..258ba8bd
--- /dev/null
+++ b/src/linux/close_range.c
@@ -0,0 +1,8 @@
+#define _GNU_SOURCE
+#include <unistd.h>
+#include "syscall.h"
+
+int close_range(unsigned int first, unsigned int last, int flags)
+{
+ return __syscall_ret(syscall(SYS_close_range, first, last, flags));
+}
--
2.42.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2023-09-01 8:02 [musl] [PATCH] add close_range() syscall wrapper Natanael Copa
@ 2023-09-01 13:57 ` Rich Felker
2023-09-01 14:55 ` Natanael Copa
2023-09-01 14:58 ` [musl] [PATCH v2] " Natanael Copa
0 siblings, 2 replies; 15+ messages in thread
From: Rich Felker @ 2023-09-01 13:57 UTC (permalink / raw)
To: Natanael Copa; +Cc: musl
On Fri, Sep 01, 2023 at 10:02:00AM +0200, Natanael Copa wrote:
> close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
> 2.34 added a wrapper.
>
> Expose it under _GNU_SOURCE similar to what GNU libc does. Also expose
> it under _BSD_SOURCE since it is also a FreeBSD function.
> ---
>
> This is a re-take of close_range() syscall wrapper. It was previously
> discussed in mailing list: https://www.openwall.com/lists/musl/2022/08/18/5
>
> Difference from previous submission:
>
> - Use correct values for CLOSE_RANGE_UNSHARE and CLOSE_RANGE_CLOEXEC.
> - Set errno on errors.
> - Drop the (unsigned int) cast for flags, as it raised questions last
> time.
>
> I think it is a good idea to add this to musl because it is difficult to
> the close before exec properly without it.
>
> Most workaounrds currently out there are either parsing /proc or try to
> close everything to maxfd reported by getrlimit(), sysconf() or
> getdtablesize().
>
> opendir("/proc/self/fd") is problematic becase 1) /proc may not be
> mounted and 2) some versions of musl hangs on malloc between fork and
> exec.
>
> Trying to close everything between a maxfd is also problematic. On some
> systems (under docker for example) the maxfd can be 1G, which
> effectively results in a hang. (See https://github.com/k0sproject/k0s/pull/3436)
>
> I think its better to encourage the use of close_range().
>
> See also: https://github.com/OpenRC/openrc/pull/645
>
> include/unistd.h | 3 +++
> src/linux/close_range.c | 8 ++++++++
> 2 files changed, 11 insertions(+)
> create mode 100644 src/linux/close_range.c
>
> diff --git a/include/unistd.h b/include/unistd.h
> index 5bc7f798..d89e3d4c 100644
> --- a/include/unistd.h
> +++ b/include/unistd.h
> @@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
> #define L_INCR 1
> #define L_XTND 2
> int brk(void *);
> +#define CLOSE_RANGE_UNSHARE (1U << 1)
> +#define CLOSE_RANGE_CLOEXEC (1U << 2)
> +int close_range(unsigned int, unsigned int, int);
> void *sbrk(intptr_t);
> pid_t vfork(void);
> int vhangup(void);
> diff --git a/src/linux/close_range.c b/src/linux/close_range.c
> new file mode 100644
> index 00000000..258ba8bd
> --- /dev/null
> +++ b/src/linux/close_range.c
> @@ -0,0 +1,8 @@
> +#define _GNU_SOURCE
> +#include <unistd.h>
> +#include "syscall.h"
> +
> +int close_range(unsigned int first, unsigned int last, int flags)
> +{
> + return __syscall_ret(syscall(SYS_close_range, first, last, flags));
> +}
> --
> 2.42.0
This is double-processing errno. You need either return
__syscall_ret(__syscall(...)) (note the second __) or just return
syscall(...) (the syscall macro without __ automatically does the
__syscall_ret).
Aside from that, I think there's a question whether, if we support
this as a function rather than leaving it to the application to use
the syscall, we should provide a fallback for ENOSYS. I'm not sure,
but it's something that should be considered before adding it.
Rich
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2023-09-01 13:57 ` Rich Felker
@ 2023-09-01 14:55 ` Natanael Copa
2023-09-01 15:06 ` Rich Felker
2023-09-01 14:58 ` [musl] [PATCH v2] " Natanael Copa
1 sibling, 1 reply; 15+ messages in thread
From: Natanael Copa @ 2023-09-01 14:55 UTC (permalink / raw)
To: Rich Felker; +Cc: musl
On Fri, 1 Sep 2023 09:57:34 -0400
Rich Felker <dalias@libc.org> wrote:
> > +int close_range(unsigned int first, unsigned int last, int flags)
> > +{
> > + return __syscall_ret(syscall(SYS_close_range, first, last, flags));
> > +}
> > --
> > 2.42.0
>
> This is double-processing errno. You need either return
> __syscall_ret(__syscall(...)) (note the second __) or just return
> syscall(...) (the syscall macro without __ automatically does the
> __syscall_ret).
Ah, ok, I'll send a v2 patch.
> Aside from that, I think there's a question whether, if we support
> this as a function rather than leaving it to the application to use
> the syscall, we should provide a fallback for ENOSYS. I'm not sure,
> but it's something that should be considered before adding it.
It was mentioned earlier that CPython expects close_range() to
async-safe, and that glibc does not provide fallback. I would prefer
that musl does not provide fallback.
https://www.openwall.com/lists/musl/2022/08/18/4
-nc
^ permalink raw reply [flat|nested] 15+ messages in thread
* [musl] [PATCH v2] add close_range() syscall wrapper
2023-09-01 13:57 ` Rich Felker
2023-09-01 14:55 ` Natanael Copa
@ 2023-09-01 14:58 ` Natanael Copa
2024-08-01 9:43 ` Alexey Izbyshev
1 sibling, 1 reply; 15+ messages in thread
From: Natanael Copa @ 2023-09-01 14:58 UTC (permalink / raw)
To: musl; +Cc: Natanael Copa
close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
2.34 added a wrapper.
Expose it under _GNU_SOURCE similar to what GNU libc does. Also expose
it under _BSD_SOURCE since it is also a FreeBSD function.
---
v2: use syscall without __syscall_ret
include/unistd.h | 3 +++
src/linux/close_range.c | 8 ++++++++
2 files changed, 11 insertions(+)
create mode 100644 src/linux/close_range.c
diff --git a/include/unistd.h b/include/unistd.h
index 5bc7f798..d89e3d4c 100644
--- a/include/unistd.h
+++ b/include/unistd.h
@@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
#define L_INCR 1
#define L_XTND 2
int brk(void *);
+#define CLOSE_RANGE_UNSHARE (1U << 1)
+#define CLOSE_RANGE_CLOEXEC (1U << 2)
+int close_range(unsigned int, unsigned int, int);
void *sbrk(intptr_t);
pid_t vfork(void);
int vhangup(void);
diff --git a/src/linux/close_range.c b/src/linux/close_range.c
new file mode 100644
index 00000000..3f1378a0
--- /dev/null
+++ b/src/linux/close_range.c
@@ -0,0 +1,8 @@
+#define _GNU_SOURCE
+#include <unistd.h>
+#include "syscall.h"
+
+int close_range(unsigned int first, unsigned int last, int flags)
+{
+ return syscall(SYS_close_range, first, last, flags);
+}
--
2.42.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2023-09-01 14:55 ` Natanael Copa
@ 2023-09-01 15:06 ` Rich Felker
0 siblings, 0 replies; 15+ messages in thread
From: Rich Felker @ 2023-09-01 15:06 UTC (permalink / raw)
To: Natanael Copa; +Cc: musl
On Fri, Sep 01, 2023 at 04:55:53PM +0200, Natanael Copa wrote:
> On Fri, 1 Sep 2023 09:57:34 -0400
> Rich Felker <dalias@libc.org> wrote:
>
>
> > > +int close_range(unsigned int first, unsigned int last, int flags)
> > > +{
> > > + return __syscall_ret(syscall(SYS_close_range, first, last, flags));
> > > +}
> > > --
> > > 2.42.0
> >
> > This is double-processing errno. You need either return
> > __syscall_ret(__syscall(...)) (note the second __) or just return
> > syscall(...) (the syscall macro without __ automatically does the
> > __syscall_ret).
>
> Ah, ok, I'll send a v2 patch.
>
> > Aside from that, I think there's a question whether, if we support
> > this as a function rather than leaving it to the application to use
> > the syscall, we should provide a fallback for ENOSYS. I'm not sure,
> > but it's something that should be considered before adding it.
>
> It was mentioned earlier that CPython expects close_range() to
> async-safe, and that glibc does not provide fallback. I would prefer
> that musl does not provide fallback.
>
> https://www.openwall.com/lists/musl/2022/08/18/4
If musl were to provide a fallback it would be AS-safe.
Rich
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH v2] add close_range() syscall wrapper
2023-09-01 14:58 ` [musl] [PATCH v2] " Natanael Copa
@ 2024-08-01 9:43 ` Alexey Izbyshev
2024-08-01 16:25 ` Rich Felker
0 siblings, 1 reply; 15+ messages in thread
From: Alexey Izbyshev @ 2024-08-01 9:43 UTC (permalink / raw)
To: musl; +Cc: Natanael Copa
On 2023-09-01 17:58, Natanael Copa wrote:
> close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
> 2.34 added a wrapper.
>
> Expose it under _GNU_SOURCE similar to what GNU libc does. Also expose
> it under _BSD_SOURCE since it is also a FreeBSD function.
> ---
>
> v2: use syscall without __syscall_ret
>
> include/unistd.h | 3 +++
> src/linux/close_range.c | 8 ++++++++
> 2 files changed, 11 insertions(+)
> create mode 100644 src/linux/close_range.c
>
> diff --git a/include/unistd.h b/include/unistd.h
> index 5bc7f798..d89e3d4c 100644
> --- a/include/unistd.h
> +++ b/include/unistd.h
> @@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
> #define L_INCR 1
> #define L_XTND 2
> int brk(void *);
> +#define CLOSE_RANGE_UNSHARE (1U << 1)
> +#define CLOSE_RANGE_CLOEXEC (1U << 2)
> +int close_range(unsigned int, unsigned int, int);
> void *sbrk(intptr_t);
> pid_t vfork(void);
> int vhangup(void);
> diff --git a/src/linux/close_range.c b/src/linux/close_range.c
> new file mode 100644
> index 00000000..3f1378a0
> --- /dev/null
> +++ b/src/linux/close_range.c
> @@ -0,0 +1,8 @@
> +#define _GNU_SOURCE
> +#include <unistd.h>
> +#include "syscall.h"
> +
> +int close_range(unsigned int first, unsigned int last, int flags)
> +{
> + return syscall(SYS_close_range, first, last, flags);
> +}
Regarding FreeBSD, close_range was added not in 8.0, but in 13.0 [1],
and also backported to 12.2 [2].
Otherwise, this patch looks good to me.
Rich, is it possible to consider close_range wrapper inclusion again?
Apart from FreeBSD and glibc, bionic has it too. A cursory Debian code
search shows that close_range libc wrapper can be used at least by
openssh, libvirt, network-manager, openrc, qemu, lxc, rsyslog packages
(in addition to CPython that I mentioned ealier).
As for having a fallback in case the syscall is unavailable, I'm not
aware of anybody implementing it, so I'd expect all close_range users to
implement their own fallback/error handling. For example, Debian's
openssh migrated from closefrom to close_range with their own fallback
because of too aggressive closefrom fallback in glibc[3].
Thanks,
Alexey
[1]
https://cgit.freebsd.org/src/commit/?h=releng/13.0&id=472ced39efb537374068f06b348fe5dac389c45a
[2]
https://cgit.freebsd.org/src/commit/?h=releng/12.2&id=a80adba5ab46ba6d44d5abfc9b7f3b6de8afda55
[3]
https://sources.debian.org/src/openssh/1%3A9.8p1-1/debian/changelog/#L895
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH v2] add close_range() syscall wrapper
2024-08-01 9:43 ` Alexey Izbyshev
@ 2024-08-01 16:25 ` Rich Felker
2024-08-01 20:24 ` Alexey Izbyshev
0 siblings, 1 reply; 15+ messages in thread
From: Rich Felker @ 2024-08-01 16:25 UTC (permalink / raw)
To: musl
On Thu, Aug 01, 2024 at 12:43:00PM +0300, Alexey Izbyshev wrote:
> On 2023-09-01 17:58, Natanael Copa wrote:
> >close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
> >2.34 added a wrapper.
> >
> >Expose it under _GNU_SOURCE similar to what GNU libc does. Also expose
> >it under _BSD_SOURCE since it is also a FreeBSD function.
> >---
> >
> >v2: use syscall without __syscall_ret
> >
> > include/unistd.h | 3 +++
> > src/linux/close_range.c | 8 ++++++++
> > 2 files changed, 11 insertions(+)
> > create mode 100644 src/linux/close_range.c
> >
> >diff --git a/include/unistd.h b/include/unistd.h
> >index 5bc7f798..d89e3d4c 100644
> >--- a/include/unistd.h
> >+++ b/include/unistd.h
> >@@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
> > #define L_INCR 1
> > #define L_XTND 2
> > int brk(void *);
> >+#define CLOSE_RANGE_UNSHARE (1U << 1)
> >+#define CLOSE_RANGE_CLOEXEC (1U << 2)
> >+int close_range(unsigned int, unsigned int, int);
> > void *sbrk(intptr_t);
> > pid_t vfork(void);
> > int vhangup(void);
> >diff --git a/src/linux/close_range.c b/src/linux/close_range.c
> >new file mode 100644
> >index 00000000..3f1378a0
> >--- /dev/null
> >+++ b/src/linux/close_range.c
> >@@ -0,0 +1,8 @@
> >+#define _GNU_SOURCE
> >+#include <unistd.h>
> >+#include "syscall.h"
> >+
> >+int close_range(unsigned int first, unsigned int last, int flags)
> >+{
> >+ return syscall(SYS_close_range, first, last, flags);
> >+}
>
> Regarding FreeBSD, close_range was added not in 8.0, but in 13.0
> [1], and also backported to 12.2 [2].
>
> Otherwise, this patch looks good to me.
>
> Rich, is it possible to consider close_range wrapper inclusion
> again? Apart from FreeBSD and glibc, bionic has it too. A cursory
> Debian code search shows that close_range libc wrapper can be used
> at least by openssh, libvirt, network-manager, openrc, qemu, lxc,
> rsyslog packages (in addition to CPython that I mentioned ealier).
>
> As for having a fallback in case the syscall is unavailable, I'm not
> aware of anybody implementing it, so I'd expect all close_range
> users to implement their own fallback/error handling. For example,
> Debian's openssh migrated from closefrom to close_range with their
> own fallback because of too aggressive closefrom fallback in
> glibc[3].
>
> Thanks,
> Alexey
>
> [1] https://cgit.freebsd.org/src/commit/?h=releng/13.0&id=472ced39efb537374068f06b348fe5dac389c45a
> [2] https://cgit.freebsd.org/src/commit/?h=releng/12.2&id=a80adba5ab46ba6d44d5abfc9b7f3b6de8afda55
> [3] https://sources.debian.org/src/openssh/1%3A9.8p1-1/debian/changelog/#L895
Thanks for looking into this.
Generally, I try to follow a principle that if an interface is
genuinely new functionality, managing some new kind of kernel object
or something previously not in the data model, that it's fine not to
have fallback, but that if it's just a new way to act on existing
things (like adding a missing flags argument to an existing
operation), there should be fallback at least in cases where no new
underlying functionality is needed (like in that example, if flags
value is 0).
However, close_range really isn't an improved/generalized way to do
close, but something intended for its own purposes, and if
applications which use it are prepared for it to fail with ENOSYS
(this is important! not just prepared for it to be missing at
configure-time link check) then omitting fallback and letting them do
their own fallbacks seems like it'd be okay.
Unlike closefrom, whose *only* use is invoking UB by closing fds you
don't own, close_range at least admits well-defined uses where you've
tracked ranges you do own and want to close them quickly. That's not
to say folks will use it this way, but having at least some valid use
is something going for it.
Without a fallback, I'm not sure there's a lot of value to providing a
wrapper rather than just having applications use it via
syscall(SYS_close_range, ...), but there's also not any significant
cost to having the wrapper if that's what programs expect.
Are there any other things we should weigh on this topic?
Rich
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH v2] add close_range() syscall wrapper
2024-08-01 16:25 ` Rich Felker
@ 2024-08-01 20:24 ` Alexey Izbyshev
0 siblings, 0 replies; 15+ messages in thread
From: Alexey Izbyshev @ 2024-08-01 20:24 UTC (permalink / raw)
To: musl
On 2024-08-01 19:25, Rich Felker wrote:
> On Thu, Aug 01, 2024 at 12:43:00PM +0300, Alexey Izbyshev wrote:
>> On 2023-09-01 17:58, Natanael Copa wrote:
>> >close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
>> >2.34 added a wrapper.
>> >
>> >Expose it under _GNU_SOURCE similar to what GNU libc does. Also expose
>> >it under _BSD_SOURCE since it is also a FreeBSD function.
>> >---
>> >
>> >v2: use syscall without __syscall_ret
>> >
>> > include/unistd.h | 3 +++
>> > src/linux/close_range.c | 8 ++++++++
>> > 2 files changed, 11 insertions(+)
>> > create mode 100644 src/linux/close_range.c
>> >
>> >diff --git a/include/unistd.h b/include/unistd.h
>> >index 5bc7f798..d89e3d4c 100644
>> >--- a/include/unistd.h
>> >+++ b/include/unistd.h
>> >@@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
>> > #define L_INCR 1
>> > #define L_XTND 2
>> > int brk(void *);
>> >+#define CLOSE_RANGE_UNSHARE (1U << 1)
>> >+#define CLOSE_RANGE_CLOEXEC (1U << 2)
>> >+int close_range(unsigned int, unsigned int, int);
>> > void *sbrk(intptr_t);
>> > pid_t vfork(void);
>> > int vhangup(void);
>> >diff --git a/src/linux/close_range.c b/src/linux/close_range.c
>> >new file mode 100644
>> >index 00000000..3f1378a0
>> >--- /dev/null
>> >+++ b/src/linux/close_range.c
>> >@@ -0,0 +1,8 @@
>> >+#define _GNU_SOURCE
>> >+#include <unistd.h>
>> >+#include "syscall.h"
>> >+
>> >+int close_range(unsigned int first, unsigned int last, int flags)
>> >+{
>> >+ return syscall(SYS_close_range, first, last, flags);
>> >+}
>>
>> Regarding FreeBSD, close_range was added not in 8.0, but in 13.0
>> [1], and also backported to 12.2 [2].
>>
>> Otherwise, this patch looks good to me.
>>
>> Rich, is it possible to consider close_range wrapper inclusion
>> again? Apart from FreeBSD and glibc, bionic has it too. A cursory
>> Debian code search shows that close_range libc wrapper can be used
>> at least by openssh, libvirt, network-manager, openrc, qemu, lxc,
>> rsyslog packages (in addition to CPython that I mentioned ealier).
>>
>> As for having a fallback in case the syscall is unavailable, I'm not
>> aware of anybody implementing it, so I'd expect all close_range
>> users to implement their own fallback/error handling. For example,
>> Debian's openssh migrated from closefrom to close_range with their
>> own fallback because of too aggressive closefrom fallback in
>> glibc[3].
>>
>> Thanks,
>> Alexey
>>
>> [1]
>> https://cgit.freebsd.org/src/commit/?h=releng/13.0&id=472ced39efb537374068f06b348fe5dac389c45a
>> [2]
>> https://cgit.freebsd.org/src/commit/?h=releng/12.2&id=a80adba5ab46ba6d44d5abfc9b7f3b6de8afda55
>> [3]
>> https://sources.debian.org/src/openssh/1%3A9.8p1-1/debian/changelog/#L895
>
> Thanks for looking into this.
>
> Generally, I try to follow a principle that if an interface is
> genuinely new functionality, managing some new kind of kernel object
> or something previously not in the data model, that it's fine not to
> have fallback, but that if it's just a new way to act on existing
> things (like adding a missing flags argument to an existing
> operation), there should be fallback at least in cases where no new
> underlying functionality is needed (like in that example, if flags
> value is 0).
>
> However, close_range really isn't an improved/generalized way to do
> close, but something intended for its own purposes, and if
> applications which use it are prepared for it to fail with ENOSYS
> (this is important! not just prepared for it to be missing at
> configure-time link check) then omitting fallback and letting them do
> their own fallbacks seems like it'd be okay.
I've checked 8 projects mentioned above, and all of them are prepared to
handle close_range errors. Usually any error leads to a fallback, though
there are cases where certain errors are treated differently, e.g. in
network-manager[1], lxc[2].
The glibc manual explicitly warns that the caller has to be prepared for
ENOSYS[3], although the Linux man-pages project doesn't mention that[4].
I've also discovered that half of those projects (libvirt,
network-manager, openrc, lxc) can use close_range via syscall in absence
of the wrapper, and libvirt isn't even aware of the wrapper (i.e. it
uses syscall() only).
>
> Unlike closefrom, whose *only* use is invoking UB by closing fds you
> don't own, close_range at least admits well-defined uses where you've
> tracked ranges you do own and want to close them quickly. That's not
> to say folks will use it this way, but having at least some valid use
> is something going for it.
>
> Without a fallback, I'm not sure there's a lot of value to providing a
> wrapper rather than just having applications use it via
> syscall(SYS_close_range, ...), but there's also not any significant
> cost to having the wrapper if that's what programs expect.
>
Indeed, the value of the wrapper can't be called huge, but it's still a
convenience:
* There is no danger of invoking the usual UB by passing int variadic
arguments to libc's syscall implementation that expects long (even
though I'm not aware of an arch where this matters in practice).
* FreeBSD had the wrapper since day 1, so for projects that support both
FreeBSD and Linux (and don't need close_range so desperately that they
can't wait for the wrapper in libc) it'd be nicer to just use the
wrapper in both cases.
> Are there any other things we should weigh on this topic?
>
One thing that comes to mind is that CLOSE_RANGE_UNSHARE flag is an
extra way to create an execution environment that is unexpected by musl.
However, given that musl provides the wrapper for unshare, I don't
believe that breaking compatibility with glibc and bionic by rejecting
some close_range flags (like in clone) is worth it.
I also have to add that I'm somewhat biased in the favor of the wrapper
because it was me who enabled its usage in CPython's subprocess on Linux
(based on pre-existing work for FreeBSD), but this code doesn't work on
musl yet.
Thanks,
Alexey
[1]
https://sources.debian.org/src/network-manager/1.48.6-1/src/libnm-systemd-shared/src/basic/fd-util.c/?hl=314#L314
[2]
https://sources.debian.org/src/lxc/1:6.0.1-1/src/lxc/initutils.c/?hl=612#L612
[3]
https://www.gnu.org/software/libc/manual/html_node/Opening-and-Closing-Files.html#index-close_005frange
[4] https://man7.org/linux/man-pages/man2/close_range.2.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2022-08-18 14:42 ` James Y Knight
@ 2022-08-18 14:54 ` Alexey Izbyshev
0 siblings, 0 replies; 15+ messages in thread
From: Alexey Izbyshev @ 2022-08-18 14:54 UTC (permalink / raw)
To: musl; +Cc: Guilherme Janczak
On 2022-08-18 17:42, James Y Knight wrote:
> On Thu, Aug 18, 2022 at 6:40 AM Alexey Izbyshev <izbyshev@ispras.ru>
> wrote:
>
>> Glibc doesn't implement a fallback and explicitly says it in the
>> manual.
>> Using a different implementation in musl seems undesirable.
>
> True. I would note, however, that glibc also implements another
> function "closefrom", which first calls close_range, and if it fails,
> falls back to iterating /proc/self/fd.
Yes, but glibc's closefrom() *aborts* if there is no /proc, so it's not
suitable for all use cases (and would not be suitable for CPython's
subprocess or "os.closerange()", for instance).
Alexey
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2022-08-18 10:40 ` Alexey Izbyshev
@ 2022-08-18 14:42 ` James Y Knight
2022-08-18 14:54 ` Alexey Izbyshev
0 siblings, 1 reply; 15+ messages in thread
From: James Y Knight @ 2022-08-18 14:42 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 389 bytes --]
On Thu, Aug 18, 2022 at 6:40 AM Alexey Izbyshev <izbyshev@ispras.ru> wrote:
> Glibc doesn't implement a fallback and explicitly says it in the manual.
> Using a different implementation in musl seems undesirable.
True. I would note, however, that glibc also implements another function
"closefrom", which first calls close_range, and if it fails, falls back to
iterating /proc/self/fd.
[-- Attachment #2: Type: text/html, Size: 687 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2022-08-18 0:35 ` Rich Felker
2022-08-18 10:11 ` Érico Nogueira
@ 2022-08-18 12:31 ` Guilherme Janczak
1 sibling, 0 replies; 15+ messages in thread
From: Guilherme Janczak @ 2022-08-18 12:31 UTC (permalink / raw)
To: musl
Please let me know if you need any changes, thanks.
On Wed, Aug 17, 2022 at 08:35:32PM -0400, Rich Felker wrote:
> On Wed, Aug 10, 2022 at 01:03:11PM +0000, Guilherme Janczak wrote:
> > close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
> > 2.34 added a wrapper.
> > ---
>
> The existence of this operation has been controversial, and it's
> arguable that it should be excluded by policy not to support UB (it's
> UB to close fds you don't own that might be used internally by the
> implementation) though I'm not sure it really helps since folks who
> want to use it will just make the syscall directly. We should probably
> at least consider it for inclusion.
>
The Linux pull request has more information on the syscall:
https://lkml.iu.edu/hypermail/linux/kernel/2008.0/02649.html
The situation right now is that everyone who wants to do this has their
own fallback:
https://github.com/FreeRADIUS/freeradius-server/blob/a8f7670a15b0d135e82d7029ad8202403b88e3f3/src/lib/server/exec.c#L312
https://github.com/openssh/openssh-portable/blob/715c892f0a5295b391ae92c26ef4d6a86ea96e8e/openbsd-compat/bsd-closefrom.c#L123-L134
https://gitlab.freedesktop.org/dbus/dbus/-/blob/7e0c51d8000a467a153700420a35a18d1b580c51/NEWS#L49-L50
I myself have been writing my own this past week. One thing I noticed is
that they all suffer from subtle bugs. For instance, usually the last
fallback is calling getrlimit() or sysconf() and closing everything
between the desired number and the maximum reported by either function
as done by libbsd:
https://cgit.freedesktop.org/libbsd/tree/src/closefrom.c#n78
However, POSIX allows having a fd numbered above that limit. Here's a
test program:
```
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int
main(void)
{
struct rlimit rl;
int fd;
getrlimit(RLIMIT_NOFILE, &rl);
if (rl.rlim_cur == RLIM_INFINITY)
rl.rlim_cur = 20;
if ((fd = dup2(STDOUT_FILENO, rl.rlim_cur+1)) == -1) {
if (errno != EBADF) {
perror(NULL);
return 1;
}
printf(
"Failed to open file descriptor %d because it's over the\n"
"default RLIM_NOFILE limit of %d.\n", (int)rl.rlim_cur+1,
(int)rl.rlim_cur);
if ((fd = dup2(STDOUT_FILENO, rl.rlim_cur-1)) == -1) {
perror(NULL);
return 1;
}
}
rl.rlim_max = rl.rlim_cur /= 2;
if (setrlimit(RLIMIT_NOFILE, &rl) == -1) {
fputs("failed to set the open file descriptor limit below"
" the highest open fd\n", stderr);
perror(NULL);
return 1;
} else {
printf(
"I have fd %d open and I've reduced the fd limit.\n"
"getrlimit() says the hard limit on open fds is %d.\n"
"sysconf(_SC_OPEN_MAX) reports a limit of %ld.\n", fd,
(int)rl.rlim_max, sysconf(_SC_OPEN_MAX));
}
}
```
Alpine Linux (musl) and Hyperbola Linux (glibc) give the same output:
Failed to open file descriptor 1025 because it's over the
default RLIM_NOFILE limit of 1024.
I have fd 1023 open and I've reduced the fd limit.
getrlimit() says the hard limit on open fds is 512.
sysconf(_SC_OPEN_MAX) reports a limit of 512.
That bug can only be fixed by guaranteeing this last fallback is not
reached.
> > include/unistd.h | 3 +++
> > src/linux/close_range.c | 8 ++++++++
> > 2 files changed, 11 insertions(+)
> > create mode 100644 src/linux/close_range.c
> >
> > diff --git a/include/unistd.h b/include/unistd.h
> > index 80be3b26..e4b8cbb1 100644
> > --- a/include/unistd.h
> > +++ b/include/unistd.h
> > @@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
> > #define L_INCR 1
> > #define L_XTND 2
> > int brk(void *);
> > +#define CLOSE_RANGE_UNSHARE 0x01
> > +#define CLOSE_RANGE_CLOEXEC 0x02
> > +int close_range(unsigned int, unsigned int, int);
> > void *sbrk(intptr_t);
> > pid_t vfork(void);
> > int vhangup(void);
> > diff --git a/src/linux/close_range.c b/src/linux/close_range.c
> > new file mode 100644
> > index 00000000..4c5cd68c
> > --- /dev/null
> > +++ b/src/linux/close_range.c
> > @@ -0,0 +1,8 @@
> > +#define _GNU_SOURCE
>
> This should probably be _BSD_SOURCE if it's exposed under _BSD_SOURCE.
glibc exposes it under _GNU_SOURCE:
https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/unistd.h;h=764d914cea1d22ed1a9d0d60b00d9c97f4560a0f;hb=HEAD#l1202
It might be more correct to expose it under both because it's also a
FreeBSD function, I was just thinking about doing the same thing as
glibc here.
>
> > +#include <unistd.h>
> > +#include "syscall.h"
> > +
> > +int close_range(unsigned int first, unsigned int last, int flags)
> > +{
> > + return syscall(SYS_close_range, first, last, (unsigned int)flags);
> > +}
> > --
> > 2.35.1
>
> Can you elaborate on what the cast to unsigned is for?
>
> Rich
The Linux syscall wants the flags parameter to be unsigned int, but the
libc function uses int. syscall() would pun it and that's the same
thing as a cast, but it seemed to make sense to show it's intentional.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2022-08-18 10:11 ` Érico Nogueira
@ 2022-08-18 10:40 ` Alexey Izbyshev
2022-08-18 14:42 ` James Y Knight
0 siblings, 1 reply; 15+ messages in thread
From: Alexey Izbyshev @ 2022-08-18 10:40 UTC (permalink / raw)
To: musl
On 2022-08-18 13:11, Érico Nogueira wrote:
> On Wed Aug 17, 2022 at 9:35 PM -03, Rich Felker wrote:
>> On Wed, Aug 10, 2022 at 01:03:11PM +0000, Guilherme Janczak wrote:
>> > close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
>> > 2.34 added a wrapper.
>> > ---
>>
>> The existence of this operation has been controversial, and it's
>> arguable that it should be excluded by policy not to support UB (it's
>> UB to close fds you don't own that might be used internally by the
>> implementation) though I'm not sure it really helps since folks who
>> want to use it will just make the syscall directly. We should probably
>> at least consider it for inclusion.
>
> I remember an idea to implement fallback logic, in case the syscall is
> unavailable, had been mentioned. It would then avoid whatever fallback
> code the application tried to implement, which might not be as
> relevant,
> now that opendir() can be called in a forked child. And I don't know if
> there's interest in implementing anything more complex at all.
>
Glibc doesn't implement a fallback and explicitly says it in the manual.
Using a different implementation in musl seems undesirable.
Note that CPython since 3.10 can use close_range() in fork/vfork child
for subprocess.Popen(close_fds=True) (which is the default), so it
expects close_range() to be async-signal-safe.
Alexey
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2022-08-18 0:35 ` Rich Felker
@ 2022-08-18 10:11 ` Érico Nogueira
2022-08-18 10:40 ` Alexey Izbyshev
2022-08-18 12:31 ` Guilherme Janczak
1 sibling, 1 reply; 15+ messages in thread
From: Érico Nogueira @ 2022-08-18 10:11 UTC (permalink / raw)
To: musl, Guilherme Janczak
On Wed Aug 17, 2022 at 9:35 PM -03, Rich Felker wrote:
> On Wed, Aug 10, 2022 at 01:03:11PM +0000, Guilherme Janczak wrote:
> > close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
> > 2.34 added a wrapper.
> > ---
>
> The existence of this operation has been controversial, and it's
> arguable that it should be excluded by policy not to support UB (it's
> UB to close fds you don't own that might be used internally by the
> implementation) though I'm not sure it really helps since folks who
> want to use it will just make the syscall directly. We should probably
> at least consider it for inclusion.
I remember an idea to implement fallback logic, in case the syscall is
unavailable, had been mentioned. It would then avoid whatever fallback
code the application tried to implement, which might not be as relevant,
now that opendir() can be called in a forked child. And I don't know if
there's interest in implementing anything more complex at all.
>
>
> > include/unistd.h | 3 +++
> > src/linux/close_range.c | 8 ++++++++
> > 2 files changed, 11 insertions(+)
> > create mode 100644 src/linux/close_range.c
> >
> > diff --git a/include/unistd.h b/include/unistd.h
> > index 80be3b26..e4b8cbb1 100644
> > --- a/include/unistd.h
> > +++ b/include/unistd.h
> > @@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
> > #define L_INCR 1
> > #define L_XTND 2
> > int brk(void *);
> > +#define CLOSE_RANGE_UNSHARE 0x01
> > +#define CLOSE_RANGE_CLOEXEC 0x02
> > +int close_range(unsigned int, unsigned int, int);
> > void *sbrk(intptr_t);
> > pid_t vfork(void);
> > int vhangup(void);
> > diff --git a/src/linux/close_range.c b/src/linux/close_range.c
> > new file mode 100644
> > index 00000000..4c5cd68c
> > --- /dev/null
> > +++ b/src/linux/close_range.c
> > @@ -0,0 +1,8 @@
> > +#define _GNU_SOURCE
>
> This should probably be _BSD_SOURCE if it's exposed under _BSD_SOURCE.
>
> > +#include <unistd.h>
> > +#include "syscall.h"
> > +
> > +int close_range(unsigned int first, unsigned int last, int flags)
> > +{
> > + return syscall(SYS_close_range, first, last, (unsigned int)flags);
> > +}
> > --
> > 2.35.1
>
> Can you elaborate on what the cast to unsigned is for?
>
> Rich
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [musl] [PATCH] add close_range() syscall wrapper
2022-08-10 13:03 [musl] [PATCH] " Guilherme Janczak
@ 2022-08-18 0:35 ` Rich Felker
2022-08-18 10:11 ` Érico Nogueira
2022-08-18 12:31 ` Guilherme Janczak
0 siblings, 2 replies; 15+ messages in thread
From: Rich Felker @ 2022-08-18 0:35 UTC (permalink / raw)
To: Guilherme Janczak; +Cc: musl
On Wed, Aug 10, 2022 at 01:03:11PM +0000, Guilherme Janczak wrote:
> close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
> 2.34 added a wrapper.
> ---
The existence of this operation has been controversial, and it's
arguable that it should be excluded by policy not to support UB (it's
UB to close fds you don't own that might be used internally by the
implementation) though I'm not sure it really helps since folks who
want to use it will just make the syscall directly. We should probably
at least consider it for inclusion.
> include/unistd.h | 3 +++
> src/linux/close_range.c | 8 ++++++++
> 2 files changed, 11 insertions(+)
> create mode 100644 src/linux/close_range.c
>
> diff --git a/include/unistd.h b/include/unistd.h
> index 80be3b26..e4b8cbb1 100644
> --- a/include/unistd.h
> +++ b/include/unistd.h
> @@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
> #define L_INCR 1
> #define L_XTND 2
> int brk(void *);
> +#define CLOSE_RANGE_UNSHARE 0x01
> +#define CLOSE_RANGE_CLOEXEC 0x02
> +int close_range(unsigned int, unsigned int, int);
> void *sbrk(intptr_t);
> pid_t vfork(void);
> int vhangup(void);
> diff --git a/src/linux/close_range.c b/src/linux/close_range.c
> new file mode 100644
> index 00000000..4c5cd68c
> --- /dev/null
> +++ b/src/linux/close_range.c
> @@ -0,0 +1,8 @@
> +#define _GNU_SOURCE
This should probably be _BSD_SOURCE if it's exposed under _BSD_SOURCE.
> +#include <unistd.h>
> +#include "syscall.h"
> +
> +int close_range(unsigned int first, unsigned int last, int flags)
> +{
> + return syscall(SYS_close_range, first, last, (unsigned int)flags);
> +}
> --
> 2.35.1
Can you elaborate on what the cast to unsigned is for?
Rich
^ permalink raw reply [flat|nested] 15+ messages in thread
* [musl] [PATCH] add close_range() syscall wrapper
@ 2022-08-10 13:03 Guilherme Janczak
2022-08-18 0:35 ` Rich Felker
0 siblings, 1 reply; 15+ messages in thread
From: Guilherme Janczak @ 2022-08-10 13:03 UTC (permalink / raw)
To: musl
close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
2.34 added a wrapper.
---
include/unistd.h | 3 +++
src/linux/close_range.c | 8 ++++++++
2 files changed, 11 insertions(+)
create mode 100644 src/linux/close_range.c
diff --git a/include/unistd.h b/include/unistd.h
index 80be3b26..e4b8cbb1 100644
--- a/include/unistd.h
+++ b/include/unistd.h
@@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
#define L_INCR 1
#define L_XTND 2
int brk(void *);
+#define CLOSE_RANGE_UNSHARE 0x01
+#define CLOSE_RANGE_CLOEXEC 0x02
+int close_range(unsigned int, unsigned int, int);
void *sbrk(intptr_t);
pid_t vfork(void);
int vhangup(void);
diff --git a/src/linux/close_range.c b/src/linux/close_range.c
new file mode 100644
index 00000000..4c5cd68c
--- /dev/null
+++ b/src/linux/close_range.c
@@ -0,0 +1,8 @@
+#define _GNU_SOURCE
+#include <unistd.h>
+#include "syscall.h"
+
+int close_range(unsigned int first, unsigned int last, int flags)
+{
+ return syscall(SYS_close_range, first, last, (unsigned int)flags);
+}
--
2.35.1
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2024-08-01 20:24 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-01 8:02 [musl] [PATCH] add close_range() syscall wrapper Natanael Copa
2023-09-01 13:57 ` Rich Felker
2023-09-01 14:55 ` Natanael Copa
2023-09-01 15:06 ` Rich Felker
2023-09-01 14:58 ` [musl] [PATCH v2] " Natanael Copa
2024-08-01 9:43 ` Alexey Izbyshev
2024-08-01 16:25 ` Rich Felker
2024-08-01 20:24 ` Alexey Izbyshev
-- strict thread matches above, loose matches on Subject: below --
2022-08-10 13:03 [musl] [PATCH] " Guilherme Janczak
2022-08-18 0:35 ` Rich Felker
2022-08-18 10:11 ` Érico Nogueira
2022-08-18 10:40 ` Alexey Izbyshev
2022-08-18 14:42 ` James Y Knight
2022-08-18 14:54 ` Alexey Izbyshev
2022-08-18 12:31 ` Guilherme Janczak
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).