* [musl] Why is setrlimit() considered to have per-thread effect?
@ 2020-10-15 5:01 Alexey Izbyshev
2020-10-15 8:50 ` Szabolcs Nagy
2020-10-15 15:43 ` Rich Felker
0 siblings, 2 replies; 10+ messages in thread
From: Alexey Izbyshev @ 2020-10-15 5:01 UTC (permalink / raw)
To: musl
Hello,
Commit 544ee752cd[1] claims that setrlimit() is per-thread on Linux,
similarly to setxid() calls, so it should be called via __synccall().
But this appears to be wrong: the kernel code operates on
tsk->signal[2], which is a per-thread-group structure. Glibc doesn't
call setrlimit() for each thread either. Am I missing something?
Tangentially, setgroups() is not called via __synccall(), though it does
have per-thread effect. Is this intentional?
Alexey
[1]
https://git.musl-libc.org/cgit/musl/commit/?id=544ee752cd38febfa3aa3798b4dfb6fabd13846b
[2] https://elixir.bootlin.com/linux/v5.9/source/kernel/sys.c#L1566
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 5:01 [musl] Why is setrlimit() considered to have per-thread effect? Alexey Izbyshev
@ 2020-10-15 8:50 ` Szabolcs Nagy
2020-10-15 15:49 ` Rich Felker
2020-10-15 15:50 ` Alexey Izbyshev
2020-10-15 15:43 ` Rich Felker
1 sibling, 2 replies; 10+ messages in thread
From: Szabolcs Nagy @ 2020-10-15 8:50 UTC (permalink / raw)
To: Alexey Izbyshev; +Cc: musl
* Alexey Izbyshev <izbyshev@ispras.ru> [2020-10-15 08:01:00 +0300]:
> Commit 544ee752cd[1] claims that setrlimit() is per-thread on Linux,
> similarly to setxid() calls, so it should be called via __synccall(). But
> this appears to be wrong: the kernel code operates on tsk->signal[2], which
> is a per-thread-group structure. Glibc doesn't call setrlimit() for each
> thread either. Am I missing something?
note that prlimit does not have synccall in
musl: the kernel implemented the per process
rlimit setting when prlimit was added.
(i think this is linux commit
1c1e618ddd15f69fd87ccea596769f78c8065504 )
but older kernels don't have that.
>
> Tangentially, setgroups() is not called via __synccall(), though it does
> have per-thread effect. Is this intentional?
that may be a bug, but it's not a posix api
so not a conformance issue, but a linux issue:
if other linux libcs don't do synccall then
that's the defacto interface contract.
>
> Alexey
>
> [1] https://git.musl-libc.org/cgit/musl/commit/?id=544ee752cd38febfa3aa3798b4dfb6fabd13846b
> [2] https://elixir.bootlin.com/linux/v5.9/source/kernel/sys.c#L1566
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 8:50 ` Szabolcs Nagy
@ 2020-10-15 15:49 ` Rich Felker
2020-10-15 16:13 ` Alexey Izbyshev
2020-10-15 15:50 ` Alexey Izbyshev
1 sibling, 1 reply; 10+ messages in thread
From: Rich Felker @ 2020-10-15 15:49 UTC (permalink / raw)
To: musl; +Cc: Alexey Izbyshev
On Thu, Oct 15, 2020 at 10:50:24AM +0200, Szabolcs Nagy wrote:
> * Alexey Izbyshev <izbyshev@ispras.ru> [2020-10-15 08:01:00 +0300]:
> > Commit 544ee752cd[1] claims that setrlimit() is per-thread on Linux,
> > similarly to setxid() calls, so it should be called via __synccall(). But
> > this appears to be wrong: the kernel code operates on tsk->signal[2], which
> > is a per-thread-group structure. Glibc doesn't call setrlimit() for each
> > thread either. Am I missing something?
>
> note that prlimit does not have synccall in
> musl: the kernel implemented the per process
> rlimit setting when prlimit was added.
> (i think this is linux commit
> 1c1e618ddd15f69fd87ccea596769f78c8065504 )
>
> but older kernels don't have that.
setrlimit implemented in terms of prlimit does; as far as I can tell
prlimit does not perform any process-global action itself but just
lets you target different tasks. This means we *could* "optimize"
setrlimit to skip __synccall and instead just iterate over the thread
list and SYS_prlimit each one from the calling thread context.
The prlimit function on the other hand behaves as the Linux syscall
and lets you set thread-specific limits.
> > Tangentially, setgroups() is not called via __synccall(), though it does
> > have per-thread effect. Is this intentional?
>
> that may be a bug, but it's not a posix api
> so not a conformance issue, but a linux issue:
> if other linux libcs don't do synccall then
> that's the defacto interface contract.
I'm not sure about that. We should probably examine whether there's
any intent to change just one thread in real world usage, as well as
whether other systems (e.g. BSDs) do the POSIX-like behavior and make
it process-global. If so I'm inclined to change it to be more
consistent.
Rich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 15:49 ` Rich Felker
@ 2020-10-15 16:13 ` Alexey Izbyshev
2020-10-15 17:13 ` Rich Felker
0 siblings, 1 reply; 10+ messages in thread
From: Alexey Izbyshev @ 2020-10-15 16:13 UTC (permalink / raw)
To: musl
On 2020-10-15 18:49, Rich Felker wrote:
> setrlimit implemented in terms of prlimit does; as far as I can tell
> prlimit does not perform any process-global action itself but just
> lets you target different tasks. This means we *could* "optimize"
> setrlimit to skip __synccall and instead just iterate over the thread
> list and SYS_prlimit each one from the calling thread context.
>
> The prlimit function on the other hand behaves as the Linux syscall
> and lets you set thread-specific limits.
>
But in my understanding, prlimit() sets process- (not thread-) specific
limits, and have done so since its introduction[1]. The code operates on
"signal" structure which is shared between threads of a thread group.
Further, an earlier commit[2] explicitly says that "...rlimit are
per process and not per-thread.". It's true that in pre-2.6.10 kernels
setrlimit() operated in per-thread limits (see my reply to Szabolcs),
but it's not related to prlimit() syscall, which was added much later.
To be clear, I did not propose to optimize setrlimit() in my initial
email, I was just surprised that synccall() is needed at all. But if we
want optimization, it seems that trying prlimit() first and falling back
to synccall() in case of ENOSYS would be what we want.
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c022a0acad534fd5f5d5f17280f6d4d135e74e81
[2]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7855c35da7ba16b389d17710401c4a55a3ea2102
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 16:13 ` Alexey Izbyshev
@ 2020-10-15 17:13 ` Rich Felker
2020-10-15 18:26 ` Alexey Izbyshev
0 siblings, 1 reply; 10+ messages in thread
From: Rich Felker @ 2020-10-15 17:13 UTC (permalink / raw)
To: Alexey Izbyshev; +Cc: musl
On Thu, Oct 15, 2020 at 07:13:30PM +0300, Alexey Izbyshev wrote:
> On 2020-10-15 18:49, Rich Felker wrote:
> >setrlimit implemented in terms of prlimit does; as far as I can tell
> >prlimit does not perform any process-global action itself but just
> >lets you target different tasks. This means we *could* "optimize"
> >setrlimit to skip __synccall and instead just iterate over the thread
> >list and SYS_prlimit each one from the calling thread context.
> >
> >The prlimit function on the other hand behaves as the Linux syscall
> >and lets you set thread-specific limits.
> >
> But in my understanding, prlimit() sets process- (not thread-)
> specific limits, and have done so since its introduction[1]. The
That was not my understanding, but it may be true. I would not assume
it's true just because of the word "process" in a commit message or
comment since kernel folks, especially in that era, regularly used
"process" and "thread" interchangibly/inconsistently.
> code operates on "signal" structure which is shared between threads
> of a thread group. Further, an earlier commit[2] explicitly says
> that "...rlimit are
> per process and not per-thread.". It's true that in pre-2.6.10
> kernels setrlimit() operated in per-thread limits (see my reply to
> Szabolcs), but it's not related to prlimit() syscall, which was
> added much later.
>
> To be clear, I did not propose to optimize setrlimit() in my initial
> email, I was just surprised that synccall() is needed at all. But if
> we want optimization, it seems that trying prlimit() first and
> falling back to synccall() in case of ENOSYS would be what we want.
If correct, I agree -- we can avoid the need for __synccall when
prlimit works. I'd like to find commits or source lines supporting
that in their actual (code) content though rather than just as a
mention in commit messages, since it's contrary to what my (probably
outdated) understanding of how rlimits worked was.
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c022a0acad534fd5f5d5f17280f6d4d135e74e81
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7855c35da7ba16b389d17710401c4a55a3ea2102
Somewhat off-topic, but for some reason that second link is bringing
my browser to a crawl swapping, despite the commit being tiny when I
view it locally in my kernel tree. Weird.
Rich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 17:13 ` Rich Felker
@ 2020-10-15 18:26 ` Alexey Izbyshev
2020-10-15 20:03 ` Rich Felker
0 siblings, 1 reply; 10+ messages in thread
From: Alexey Izbyshev @ 2020-10-15 18:26 UTC (permalink / raw)
To: musl
On 2020-10-15 20:13, Rich Felker wrote:
> On Thu, Oct 15, 2020 at 07:13:30PM +0300, Alexey Izbyshev wrote:
> If correct, I agree -- we can avoid the need for __synccall when
> prlimit works. I'd like to find commits or source lines supporting
> that in their actual (code) content though rather than just as a
> mention in commit messages, since it's contrary to what my (probably
> outdated) understanding of how rlimits worked was.
>
Here they are (the first two were referenced in my reply to Szabolcs).
* Change of setrlimit() to operate on signal_struct in 2.6.10:
https://elixir.bootlin.com/linux/v2.6.10/source/kernel/sys.c#L1487
(compare with
https://elixir.bootlin.com/linux/v2.6.9/source/kernel/sys.c#L1537)
* Definition of signal_struct in 2.6.10, which is per-thread-group
(apart from "rlim", it contains many other thread-group-related fields):
https://elixir.bootlin.com/linux/v2.6.10/source/include/linux/sched.h#L268
* Usage if signal_struct in 2.6.36 (the first kernel with prlimit()) in
do_prlimit(), which is a common function implementing setrlimit(),
getrlimit() and prlimit():
https://elixir.bootlin.com/linux/v2.6.36/source/kernel/sys.c#L1333
Finally, I performed a simple experiment: on 2.6.30 kernel (with glibc
2.5), created a thread and changed RLIMIT_FSIZE via setrlimit(). After
that, "/proc/pid/limits" reported the new limit, so it was applied to
the whole process. Strace confirmed that only a single setrlimit()
system call was performed.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 18:26 ` Alexey Izbyshev
@ 2020-10-15 20:03 ` Rich Felker
0 siblings, 0 replies; 10+ messages in thread
From: Rich Felker @ 2020-10-15 20:03 UTC (permalink / raw)
To: Alexey Izbyshev; +Cc: musl
On Thu, Oct 15, 2020 at 09:26:33PM +0300, Alexey Izbyshev wrote:
> On 2020-10-15 20:13, Rich Felker wrote:
> >On Thu, Oct 15, 2020 at 07:13:30PM +0300, Alexey Izbyshev wrote:
> >If correct, I agree -- we can avoid the need for __synccall when
> >prlimit works. I'd like to find commits or source lines supporting
> >that in their actual (code) content though rather than just as a
> >mention in commit messages, since it's contrary to what my (probably
> >outdated) understanding of how rlimits worked was.
> >
> Here they are (the first two were referenced in my reply to Szabolcs).
>
> * Change of setrlimit() to operate on signal_struct in 2.6.10:
> https://elixir.bootlin.com/linux/v2.6.10/source/kernel/sys.c#L1487
> (compare with
> https://elixir.bootlin.com/linux/v2.6.9/source/kernel/sys.c#L1537)
>
> * Definition of signal_struct in 2.6.10, which is per-thread-group
> (apart from "rlim", it contains many other thread-group-related
> fields): https://elixir.bootlin.com/linux/v2.6.10/source/include/linux/sched.h#L268
>
> * Usage if signal_struct in 2.6.36 (the first kernel with prlimit())
> in do_prlimit(), which is a common function implementing
> setrlimit(), getrlimit() and prlimit():
> https://elixir.bootlin.com/linux/v2.6.36/source/kernel/sys.c#L1333
>
> Finally, I performed a simple experiment: on 2.6.30 kernel (with
> glibc 2.5), created a thread and changed RLIMIT_FSIZE via
> setrlimit(). After that, "/proc/pid/limits" reported the new limit,
> so it was applied to the whole process. Strace confirmed that only a
> single setrlimit() system call was performed.
Excellent, thanks for doing this research! I'll adjust setrlimit to
use __synccall only in the fallback where SYS_prlimit fails.
Rich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 8:50 ` Szabolcs Nagy
2020-10-15 15:49 ` Rich Felker
@ 2020-10-15 15:50 ` Alexey Izbyshev
2020-10-15 20:05 ` Rich Felker
1 sibling, 1 reply; 10+ messages in thread
From: Alexey Izbyshev @ 2020-10-15 15:50 UTC (permalink / raw)
To: musl
On 2020-10-15 11:50, Szabolcs Nagy wrote:
> note that prlimit does not have synccall in
> musl: the kernel implemented the per process
> rlimit setting when prlimit was added.
> (i think this is linux commit
> 1c1e618ddd15f69fd87ccea596769f78c8065504 )
>
> but older kernels don't have that.
>
Ah, thank you for checking that, though the transition appear to have
happened much earlier than the commit you referenced (which is not
relevant), in pre-git epoch between 2.6.9 and 2.6.10[1, 2]. I was
confused because Linux man pages never mention that and explicitly say
"Resource limits are per-process attributes that are shared by all of
the threads in a process."[3], but I should have checked old sources.
>> Tangentially, setgroups() is not called via __synccall(), though it
>> does
>> have per-thread effect. Is this intentional?
>
> that may be a bug, but it's not a posix api
> so not a conformance issue, but a linux issue:
> if other linux libcs don't do synccall then
> that's the defacto interface contract.
>
FWIW, glibc does synccall since 2011:
https://sourceware.org/git/?p=glibc.git;a=commit;h=70181fddf14
[1] https://elixir.bootlin.com/linux/v2.6.9/source/kernel/sys.c#L1537
[2] https://elixir.bootlin.com/linux/v2.6.10/source/kernel/sys.c#L1487
[3] https://man7.org/linux/man-pages/man2/setrlimit.2.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 15:50 ` Alexey Izbyshev
@ 2020-10-15 20:05 ` Rich Felker
0 siblings, 0 replies; 10+ messages in thread
From: Rich Felker @ 2020-10-15 20:05 UTC (permalink / raw)
To: Alexey Izbyshev; +Cc: musl
On Thu, Oct 15, 2020 at 06:50:41PM +0300, Alexey Izbyshev wrote:
> >>Tangentially, setgroups() is not called via __synccall(), though
> >>it does
> >>have per-thread effect. Is this intentional?
> >
> >that may be a bug, but it's not a posix api
> >so not a conformance issue, but a linux issue:
> >if other linux libcs don't do synccall then
> >that's the defacto interface contract.
> >
> FWIW, glibc does synccall since 2011:
> https://sourceware.org/git/?p=glibc.git;a=commit;h=70181fddf14
Thanks, looks like we should do the same here.
Rich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [musl] Why is setrlimit() considered to have per-thread effect?
2020-10-15 5:01 [musl] Why is setrlimit() considered to have per-thread effect? Alexey Izbyshev
2020-10-15 8:50 ` Szabolcs Nagy
@ 2020-10-15 15:43 ` Rich Felker
1 sibling, 0 replies; 10+ messages in thread
From: Rich Felker @ 2020-10-15 15:43 UTC (permalink / raw)
To: musl
On Thu, Oct 15, 2020 at 08:01:00AM +0300, Alexey Izbyshev wrote:
> Hello,
>
> Commit 544ee752cd[1] claims that setrlimit() is per-thread on Linux,
> similarly to setxid() calls, so it should be called via
> __synccall(). But this appears to be wrong: the kernel code operates
> on tsk->signal[2], which is a per-thread-group structure. Glibc
> doesn't call setrlimit() for each thread either. Am I missing
> something?
POSIX specifies that it sets the limits for the process. If the kernel
doesn't do that, we have to implement in userspace.
> Tangentially, setgroups() is not called via __synccall(), though it
> does have per-thread effect. Is this intentional?
POSIX doesn't define setgroups, so it's up to the implementation.
Conceptually since POSIX has supplemental groups they probably
*should* be forced process-global, so maybe we should change this.
For an example that's more clear-cut, setfs[ug]id is explicitly not
process-global because it's not a security boundary and the whole
purpose is to be able to do local id changes then revert them for the
sake of performing access as a different user/group.
Rich
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2020-10-15 20:05 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-15 5:01 [musl] Why is setrlimit() considered to have per-thread effect? Alexey Izbyshev
2020-10-15 8:50 ` Szabolcs Nagy
2020-10-15 15:49 ` Rich Felker
2020-10-15 16:13 ` Alexey Izbyshev
2020-10-15 17:13 ` Rich Felker
2020-10-15 18:26 ` Alexey Izbyshev
2020-10-15 20:03 ` Rich Felker
2020-10-15 15:50 ` Alexey Izbyshev
2020-10-15 20:05 ` Rich Felker
2020-10-15 15:43 ` Rich Felker
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).