mailing list of musl libc
* [musl] [PATCH] mq_notify: fix close/recv race on failure path
From: Alexey Izbyshev @ 2022-11-09 10:46 UTC
  To: musl

In case of failure, mq_notify closes the socket immediately after
sending a cancellation request to the worker thread that is about to
call, or may have already called, recv on that socket. Even if we don't
consider the kernel behavior when the only descriptor to an object that
is being used in a system call is closed, if the socket descriptor is
closed before the kernel looks at it, another thread could open a
descriptor with the same value in the meantime, resulting in recv
acting on the wrong object.

Fix the race by moving the pthread_cancel call before the barrier wait to
guarantee that the cancellation flag is set before the worker thread
enters recv.
---
Other ways to fix this:

* Remove the racing close call from mq_notify and surround recv
  with pthread_cleanup_push/pop (a sketch follows this list).

* Make the worker thread joinable initially, join it before closing
  the socket on the failure path, and detach it on the happy path.
  This would also require disabling cancellation around join/detach
  to ensure that mq_notify itself is not cancelled in an inappropriate
  state.
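
For illustration, the first alternative could look roughly like the
sketch below. This is not the actual musl source: struct args is
reduced to what the sketch needs, and cleanup_close/worker are
hypothetical names. The point is that the worker owns the socket, so
no close in mq_notify can race with its recv.

#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Reduced, hypothetical argument block; the real one also carries the
 * caller's struct sigevent and the synchronization object. */
struct args {
	pthread_barrier_t barrier;
	int sock;
};

static void cleanup_close(void *p)
{
	/* Runs both on the normal pthread_cleanup_pop(1) and if the thread
	 * is cancelled inside recv, so the socket is closed exactly once,
	 * by its only owner. */
	close(*(int *)p);
}

static void *worker(void *p)
{
	struct args *args = p;
	int s = args->sock;
	char buf[32];

	pthread_barrier_wait(&args->barrier);

	pthread_cleanup_push(cleanup_close, &s);
	recv(s, buf, sizeof buf, MSG_WAITALL);  /* cancellation point */
	pthread_cleanup_pop(1);                 /* pop and run the handler */

	/* ... dispatch the notification described by buf ... */
	return 0;
}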

Alexey
---
 src/mq/mq_notify.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/mq/mq_notify.c b/src/mq/mq_notify.c
index 221591c7..177caccd 100644
--- a/src/mq/mq_notify.c
+++ b/src/mq/mq_notify.c
@@ -34,7 +34,7 @@ int mq_notify(mqd_t mqd, const struct sigevent *sev)
 	struct args args = { .sev = sev };
 	pthread_attr_t attr;
 	pthread_t td;
-	int s;
+	int s, r;
 	struct sigevent sev2;
 	static const char zeros[32];
 
@@ -56,18 +56,19 @@ int mq_notify(mqd_t mqd, const struct sigevent *sev)
 		return -1;
 	}
 
-	pthread_barrier_wait(&args.barrier);
-	pthread_barrier_destroy(&args.barrier);
-
 	sev2.sigev_notify = SIGEV_THREAD;
 	sev2.sigev_signo = s;
 	sev2.sigev_value.sival_ptr = (void *)&zeros;
 
-	if (syscall(SYS_mq_notify, mqd, &sev2) < 0) {
+	r = syscall(SYS_mq_notify, mqd, &sev2);
+	if (r < 0)
 		pthread_cancel(td);
+
+	pthread_barrier_wait(&args.barrier);
+	pthread_barrier_destroy(&args.barrier);
+
+	if (r < 0)
 		__syscall(SYS_close, s);
-		return -1;
-	}
 
-	return 0;
+	return r;
 }
-- 
2.37.2



* Re: [musl] [PATCH] mq_notify: fix close/recv race on failure path
From: Rich Felker @ 2022-12-14  2:26 UTC
  To: musl

On Wed, Nov 09, 2022 at 01:46:13PM +0300, Alexey Izbyshev wrote:
> In case of failure, mq_notify closes the socket immediately after
> sending a cancellation request to the worker thread that is about to
> call, or may have already called, recv on that socket. Even if we don't
> consider the kernel behavior when the only descriptor to an object that
> is being used in a system call is closed, if the socket descriptor is
> closed before the kernel looks at it, another thread could open a
> descriptor with the same value in the meantime, resulting in recv
> acting on the wrong object.
> 
> Fix the race by moving the pthread_cancel call before the barrier wait to
> guarantee that the cancellation flag is set before the worker thread
> enters recv.
> ---
> Other ways to fix this:
> 
> * Remove the racing close call from mq_notify and surround recv
>   with pthread_cleanup_push/pop.
> 
> * Make the worker thread joinable initially, join it before closing
>   the socket on the failure path, and detach it on the happy path.
>   This would also require disabling cancellation around join/detach
>   to ensure that mq_notify itself is not cancelled in an inappropriate
>   state.

I'd put this aside for a while because of the pthread barrier
involvement, which I kinda didn't want to deal with. The fix you have
sounds like it works, but I think I'd rather pursue one of the other
approaches, probably the joinable-thread one.

At present, the implementation of barriers seems to be buggy (I need
to dig back up the post about that), and they're also a really
expensive synchronization tool that goes in both directions, where we
only need one direction (notifying the caller that we're done
consuming the args). I'd rather switch to a semaphore, which is the
lightest and most idiomatic (at least per present-day musl idioms) way
to do this.
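
A minimal sketch of that one-directional handoff, with a hypothetical
struct args and worker (this only illustrates the idea of replacing
the barrier with a semaphore, it is not the eventual musl change).
Unlike the barrier, only the caller ever blocks, and only until the
worker has copied what it needs out of args.

#include <pthread.h>
#include <semaphore.h>
#include <signal.h>

/* Hypothetical argument block; the caller does
 * sem_init(&args.sem, 0, 0) before pthread_create. */
struct args {
	sem_t sem;                  /* posted once args has been consumed */
	int sock;
	const struct sigevent *sev; /* caller-owned; copy before posting */
};

static void *worker(void *p)
{
	struct args *args = p;
	int s = args->sock;
	struct sigevent sev = *args->sev;  /* consume caller-owned data first */

	sem_post(&args->sem);  /* one-way signal: the caller may let args die */

	/* ... recv on s, then dispatch the notification described by sev ... */
	return 0;
}

/* Caller side, after pthread_create succeeds: */
static void wait_for_worker(struct args *args)
{
	while (sem_wait(&args->sem));  /* restart if interrupted by a signal */
	sem_destroy(&args->sem);
}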

Using a joinable thread also lets us ensure we don't leave behind
threads that are still waiting to be scheduled just so they can exit
after a failure return. Depending on the scheduling attributes, that
could well be bad.

(Also, arguably we should start the thread with unchanged scheduling
attributes and only switch to the caller-provided ones after the
SYS_mq_notify syscall has succeeded -- bleh. But that's a separate
topic.)
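
For what it's worth, deferring the scheduling attributes could look
something like the sketch below; apply_sched_attrs and its placement
are purely illustrative, and error handling is dropped for brevity.

#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: the worker is created with default scheduling,
 * and the caller-provided attributes are applied only once the
 * SYS_mq_notify syscall has succeeded. */
static void apply_sched_attrs(pthread_t td, const pthread_attr_t *attr)
{
	int policy;
	struct sched_param param;

	if (!attr) return;
	if (pthread_attr_getschedpolicy(attr, &policy)) return;
	if (pthread_attr_getschedparam(attr, &param)) return;
	pthread_setschedparam(td, policy, &param);
}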

Rich


* Re: [musl] [PATCH] mq_notify: fix close/recv race on failure path
From: Alexey Izbyshev @ 2022-12-14  6:49 UTC
  To: musl

On 2022-12-14 05:26, Rich Felker wrote:
> On Wed, Nov 09, 2022 at 01:46:13PM +0300, Alexey Izbyshev wrote:
>> In case of failure, mq_notify closes the socket immediately after
>> sending a cancellation request to the worker thread that is about to
>> call, or may have already called, recv on that socket. Even if we don't
>> consider the kernel behavior when the only descriptor to an object that
>> is being used in a system call is closed, if the socket descriptor is
>> closed before the kernel looks at it, another thread could open a
>> descriptor with the same value in the meantime, resulting in recv
>> acting on the wrong object.
>> 
>> Fix the race by moving the pthread_cancel call before the barrier wait to
>> guarantee that the cancellation flag is set before the worker thread
>> enters recv.
>> ---
>> Other ways to fix this:
>> 
>> * Remove the racing close call from mq_notify and surround recv
>>   with pthread_cleanup_push/pop.
>> 
>> * Make the worker thread joinable initially, join it before closing
>>   the socket on the failure path, and detach it on the happy path.
>>   This would also require disabling cancellation around join/detach
>>   to ensure that mq_notify itself is not cancelled in an inappropriate
>>   state.
> 
> I'd put this aside for a while because of the pthread barrier
> involvement I kinda didn't want to deal with. The fix you have sounds
> like it works, but I think I'd rather pursue one of the other
> approaches, probably the joinable thread one.
> 
> At present, the implementation of barriers seems to be buggy (I need
> to dig back up the post about that), and they're also a really
> expensive synchronization tool that goes both directions where we
> really only need one direction (notifying the caller we're done
> consuming the args). I'd rather switch to a semaphore, which is the
> lightest and most idiomatic (at least per present-day musl idioms) way
> to do this.
> 
This sounds good to me. The same approach can also be used in
timer_create (assuming it's acceptable to add a dependency on
pthread_cancel to that code).

> Using a joinable thread also lets us ensure we don't leave around
> threads that are waiting to be scheduled just to exit on failure
> return. Depending on scheduling attributes, this probably could be
> bad.
> 
I also prefer this approach, though mostly for aesthetic reasons (I
haven't thought about the scheduling behavior). I didn't use it only
because I felt it was a "logically larger" change than simply moving
the pthread_barrier_wait call. And I wasn't aware that barriers are
buggy in musl.

Thanks,
Alexey
