mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] synccall patches
@ 2023-10-31 16:21 Markus Wichmann
  2023-11-01 13:00 ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Markus Wichmann @ 2023-10-31 16:21 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1190 bytes --]

Hi all,

fittingly for Reformation Day, I chose to actually send in some patches
today, rather than provide comments. And fittingly for Halloween, they
concern a scary part of musl, namely __synccall().

First I noticed that the initial return value set in __setxid() is not a
valid return value for the functions that call it. Those are only
allowed to return 0 or -1, not 1. That is easily fixed.

Then I noticed that __synccall() has a logic bug explained in the commit
message. It wouldn't be a bug if semaphores had any kind of preference
for the thread that has waited longest or something, but musl does not
implement something like this. And the kernel explicitly documents that
FUTEX_WAKE will wake a random thread.

At first I thought the best solution would be to forego the serialized
execution of the callback; just release all threads in line 97 (and then
fix the callbacks to cope with parallel execution). But BSD semaphores
have no way to do that, and a mass-release loop like the one on line 110
would have the same issue as given here. It would only be less likely to
happen, with the likelihood rising with the number of threads. So a new
semaphore it is.

Ciao,
Markus

[-- Attachment #2: 0001-Initialize-setxid-retval-to-valid-value.patch --]
[-- Type: text/x-diff, Size: 1381 bytes --]

From 44e97fb238670e83eff31c3894cc8a024cef051c Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Tue, 31 Oct 2023 16:59:47 +0100
Subject: [PATCH 1/2] Initialize setxid retval to valid value.

If __synccall() fails to capture all threads because tkill fails for
some reason other than EAGAIN, then the callback given will never be
executed, so nothing will ever overwrite the initial value. So that is
the value that will be returned from the function. The previous setting
of 1 is not a valid value for setuid() et al. to return.

I chose -EAGAIN since I don't know the reason the synccall failed ahead
of time, but EAGAIN is a specified error code for a possibly temporary
failure in setuid().
---
 src/unistd/setxid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/unistd/setxid.c b/src/unistd/setxid.c
index 487c1a16..0a2d9d12 100644
--- a/src/unistd/setxid.c
+++ b/src/unistd/setxid.c
@@ -28,7 +28,7 @@ int __setxid(int nr, int id, int eid, int sid)
 {
 	/* ret is initially nonzero so that failure of the first thread does not
 	 * trigger the safety kill above. */
-	struct ctx c = { .nr = nr, .id = id, .eid = eid, .sid = sid, .ret = 1 };
+	struct ctx c = { .nr = nr, .id = id, .eid = eid, .sid = sid, .ret = -EAGAIN };
 	__synccall(do_setxid, &c);
 	return __syscall_ret(c.ret);
 }
--
2.39.2


[-- Attachment #3: 0002-synccall-Add-exit_sem.patch --]
[-- Type: text/x-diff, Size: 2469 bytes --]

From f0793c5565b0670fa5283edaaadf8fbdb7f197be Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Tue, 31 Oct 2023 17:03:44 +0100
Subject: [PATCH 2/2] synccall: Add exit_sem.

The code intends for the sem_post() in line 97 (now 98) to only unblock
target threads waiting on line 29. But after the first thread is
released, the next sem_post() might also unblock a thread waiting on
line 36. That would cause the thread to return to the execution of user
code before all threads are done, leading to user code being executed in
a mixed-credentials environment.

What's more, if this happens more than once, then the mass release on
line 110 (now line 111) will cause multiple threads to execute the
callback at the same time, and the callbacks are currently not written
to cope with that situation.

Adding another semaphore allows the caller to say explicitly which
threads it wants to release.
---
 src/thread/synccall.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/thread/synccall.c b/src/thread/synccall.c
index a6b177c0..38597254 100644
--- a/src/thread/synccall.c
+++ b/src/thread/synccall.c
@@ -11,7 +11,7 @@ weak_alias(dummy_0, __tl_unlock);

 static int target_tid;
 static void (*callback)(void *), *context;
-static sem_t target_sem, caller_sem;
+static sem_t target_sem, caller_sem, exit_sem;

 static void dummy(void *p)
 {
@@ -33,7 +33,7 @@ static void handler(int sig)
 	/* Inform caller we've complered the callback and wait
 	 * for the caller to release us to return. */
 	sem_post(&caller_sem);
-	sem_wait(&target_sem);
+	sem_wait(&exit_sem);

 	/* Inform caller we are returning and state is destroyable. */
 	sem_post(&caller_sem);
@@ -62,6 +62,7 @@ void __synccall(void (*func)(void *), void *ctx)

 	sem_init(&target_sem, 0, 0);
 	sem_init(&caller_sem, 0, 0);
+	sem_init(&exit_sem, 0, 0);

 	if (!libc.threads_minus_1 || __syscall(SYS_gettid) != self->tid)
 		goto single_threaded;
@@ -107,12 +108,13 @@ single_threaded:
 	/* Only release the caught threads once all threads, including the
 	 * caller, have returned from the callback function. */
 	for (i=0; i<count; i++)
-		sem_post(&target_sem);
+		sem_post(&exit_sem);
 	for (i=0; i<count; i++)
 		sem_wait(&caller_sem);

 	sem_destroy(&caller_sem);
 	sem_destroy(&target_sem);
+	sem_destroy(&exit_sem);

 	pthread_setcancelstate(cs, 0);
 	__tl_unlock();
--
2.39.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] synccall patches
  2023-10-31 16:21 [musl] synccall patches Markus Wichmann
@ 2023-11-01 13:00 ` Rich Felker
  2023-11-01 13:53   ` Markus Wichmann
  2023-11-01 17:28   ` Alexey Izbyshev
  0 siblings, 2 replies; 6+ messages in thread
From: Rich Felker @ 2023-11-01 13:00 UTC (permalink / raw)
  To: Markus Wichmann; +Cc: musl

On Tue, Oct 31, 2023 at 05:21:01PM +0100, Markus Wichmann wrote:
> Hi all,
> 
> fittingly for Reformation Day, I chose to actually send in some patches
> today, rather than provide comments. And fittingly for Halloween, they
> concern a scary part of musl, namely __synccall().
> 
> First I noticed that the initial return value set in __setxid() is not a
> valid return value for the functions that call it. Those are only
> allowed to return 0 or -1, not 1. That is easily fixed.

That change completely breaks the logic. See the first line of
do_setxid. It will always return failure without ever attempting the
operation -- this is to prevent any other threads from trying the
operation once the first one has failed.

On the other hand, the thing you're worried about, the original value
of c.ret being passed to __syscall_ret, can't happen. If it was
initially positive on entry to do_setxid, a syscall is made and the
return value is stored into c.ret.

> Then I noticed that __synccall() has a logic bug explained in the commit
> message. It wouldn't be a bug if semaphores had any kind of preference
> for the thread that has waited longest or something, but musl does not
> implement something like this. And the kernel explicitly documents that
> FUTEX_WAKE will wake a random thread.

Indeed there is no such contract, and really cannot be. Aside from the
possibility of interruption and restart, the event of starting to wait
on a semaphore is not observable. A thread that hasn't been scheduled
yet that is about to execute sem_wait is indistinguishable from (in
the abstract machine) a thread that's already started waiting.

> At first I thought the best solution would be to forego the serialized
> execution of the callback; just release all threads in line 97 (and then
> fix the callbacks to cope with parallel execution). But BSD semaphores
> have no way to do that, and a mass-release loop like the one on line 110
> would have the same issue as given here. It would only be less likely to
> happen, with the likelihood rising with the number of threads. So a new
> semaphore it is.

Again, there's fundamentally no way to do that with a semaphore for
the above reason -- you can't know that all the threads are waiting
already, even if there were such an interface to perform the +=
operation atomically on the semaphore value.

I think your assessment of the problem is correct and I think your fix
works but I need to look at it in a little more detail. Review from
others familiar with this stuff would be very welcome too.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] synccall patches
  2023-11-01 13:00 ` Rich Felker
@ 2023-11-01 13:53   ` Markus Wichmann
  2023-11-01 17:05     ` Rich Felker
  2023-11-01 17:28   ` Alexey Izbyshev
  1 sibling, 1 reply; 6+ messages in thread
From: Markus Wichmann @ 2023-11-01 13:53 UTC (permalink / raw)
  To: musl

Am Wed, Nov 01, 2023 at 09:00:33AM -0400 schrieb Rich Felker:
> On the other hand, the thing you're worried about, the original value
> of c.ret being passed to __syscall_ret, can't happen. If it was
> initially positive on entry to do_setxid, a syscall is made and the
> return value is stored into c.ret.
>

But if the tkill fails in __synccall, do_setxid() will never be called
at all (that's what line 87 in synccall.c does). So the original value
will remain.

Perhaps __synccall should return failure in that case, after doing
everything else. Then the logic could otherwise remain untouched, and
__setxid could respond appropriately to the failure itself.

Ciao,
Markus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] synccall patches
  2023-11-01 13:53   ` Markus Wichmann
@ 2023-11-01 17:05     ` Rich Felker
  2023-11-02 16:42       ` Markus Wichmann
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2023-11-01 17:05 UTC (permalink / raw)
  To: Markus Wichmann; +Cc: musl

On Wed, Nov 01, 2023 at 02:53:32PM +0100, Markus Wichmann wrote:
> Am Wed, Nov 01, 2023 at 09:00:33AM -0400 schrieb Rich Felker:
> > On the other hand, the thing you're worried about, the original value
> > of c.ret being passed to __syscall_ret, can't happen. If it was
> > initially positive on entry to do_setxid, a syscall is made and the
> > return value is stored into c.ret.
> >
> 
> But if the tkill fails in __synccall, do_setxid() will never be called
> at all (that's what line 87 in synccall.c does). So the original value
> will remain.
> 
> Perhaps __synccall should return failure in that case, after doing
> everything else. Then the logic could otherwise remain untouched, and
> __setxid could respond appropriately to the failure itself.

OK, that failure case should not actually be reachable short of bogus
seccomp filters or kernel doing something dumb. But if it is
reachable, yes that's a bug.

The simplest fix is probably just changing the setxid logic to detect
that the callback was never called and convert that to an error. One
motivation for changing the __synccall contract to return this error
might be reporting which error it was, but I don't really see a good
reason these errors that shouldn't happen (and which might clash with
the existing meaning of error codes used by set*id) should translate
directly into error codes for the caller.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] synccall patches
  2023-11-01 13:00 ` Rich Felker
  2023-11-01 13:53   ` Markus Wichmann
@ 2023-11-01 17:28   ` Alexey Izbyshev
  1 sibling, 0 replies; 6+ messages in thread
From: Alexey Izbyshev @ 2023-11-01 17:28 UTC (permalink / raw)
  To: musl

On 2023-11-01 20:00, Rich Felker wrote:
> On Tue, Oct 31, 2023 at 05:21:01PM +0100, Markus Wichmann wrote:
>> Then I noticed that __synccall() has a logic bug explained in the 
>> commit
>> message. It wouldn't be a bug if semaphores had any kind of preference
>> for the thread that has waited longest or something, but musl does not
>> implement something like this. And the kernel explicitly documents 
>> that
>> FUTEX_WAKE will wake a random thread.
>> 
>> At first I thought the best solution would be to forego the serialized
>> execution of the callback; just release all threads in line 97 (and 
>> then
>> fix the callbacks to cope with parallel execution). But BSD semaphores
>> have no way to do that, and a mass-release loop like the one on line 
>> 110
>> would have the same issue as given here. It would only be less likely 
>> to
>> happen, with the likelihood rising with the number of threads. So a 
>> new
>> semaphore it is.
> 
> I think your assessment of the problem is correct and I think your fix
> works but I need to look at it in a little more detail. Review from
> others familiar with this stuff would be very welcome too.
> 
I looked at synccall code back when it was rewritten in commit 
e4235d70672d9751d7718ddc2b52d0b426430768, but I missed this bug. Markus' 
patch looks good to me.

Alexey

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] synccall patches
  2023-11-01 17:05     ` Rich Felker
@ 2023-11-02 16:42       ` Markus Wichmann
  0 siblings, 0 replies; 6+ messages in thread
From: Markus Wichmann @ 2023-11-02 16:42 UTC (permalink / raw)
  To: musl; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 87 bytes --]

Hi all,

as discussed, I'm attaching a new patch for the __setxid issue.

Ciao,
Markus

[-- Attachment #2: 0004-Ensure-valid-setxid-return-value.patch --]
[-- Type: text/x-diff, Size: 1264 bytes --]

From 1f3b01f625dd7d33a7107efbd8417857461bb4fe Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Thu, 2 Nov 2023 17:39:33 +0100
Subject: [PATCH 4/4] Ensure valid setxid return value.

If __synccall() fails to capture all threads because tkill fails for
some reason other than EAGAIN, then the callback given will never be
executed, so nothing will ever overwrite the initial value. So that is
the value that will be returned from the function. The previous setting
of 1 is not a valid value for setuid() et al. to return.

I chose -EAGAIN since I don't know the reason the synccall failed ahead
of time, but EAGAIN is a specified error code for a possibly temporary
failure in setuid().
---
 src/unistd/setxid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/unistd/setxid.c b/src/unistd/setxid.c
index 487c1a16..1dfb2c01 100644
--- a/src/unistd/setxid.c
+++ b/src/unistd/setxid.c
@@ -30,5 +30,5 @@ int __setxid(int nr, int id, int eid, int sid)
 	 * trigger the safety kill above. */
 	struct ctx c = { .nr = nr, .id = id, .eid = eid, .sid = sid, .ret = 1 };
 	__synccall(do_setxid, &c);
-	return __syscall_ret(c.ret);
+	return __syscall_ret(c.ret > 0? -EAGAIN : c.ret);
 }
--
2.39.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-11-02 16:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-31 16:21 [musl] synccall patches Markus Wichmann
2023-11-01 13:00 ` Rich Felker
2023-11-01 13:53   ` Markus Wichmann
2023-11-01 17:05     ` Rich Felker
2023-11-02 16:42       ` Markus Wichmann
2023-11-01 17:28   ` Alexey Izbyshev

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).