mailing list of musl libc
From: Alexey Izbyshev <izbyshev@ispras.ru>
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] mq_notify: fix close/recv race on failure path
Date: Sat, 11 Feb 2023 18:13:59 +0300
Message-ID: <5ca5f57982db1867b11ec9eecefc4df2@ispras.ru>
In-Reply-To: <20230211145246.GH4163@brightrain.aerifal.cx>

On 2023-02-11 17:52, Rich Felker wrote:
> On Sat, Feb 11, 2023 at 05:45:14PM +0300, Alexey Izbyshev wrote:
>> On 2023-02-10 19:29, Rich Felker wrote:
>> >On Wed, Dec 14, 2022 at 09:49:26AM +0300, Alexey Izbyshev wrote:
>> >>On 2022-12-14 05:26, Rich Felker wrote:
>> >>>On Wed, Nov 09, 2022 at 01:46:13PM +0300, Alexey Izbyshev wrote:
>> >>>>In case of failure mq_notify closes the socket immediately after
>> >>>>sending a cancellation request to the worker thread that is going to
>> >>>>call or have already called recv on that socket. Even if we don't
>> >>>>consider the kernel behavior when the only descriptor to an object that
>> >>>>is being used in a system call is closed, if the socket descriptor is
>> >>>>closed before the kernel looks at it, another thread could open a
>> >>>>descriptor with the same value in the meantime, resulting in recv
>> >>>>acting on a wrong object.
>> >>>>
>> >>>>Fix the race by moving pthread_cancel call before the barrier wait to
>> >>>>guarantee that the cancellation flag is set before the worker thread
>> >>>>enters recv.
>> >>>>---
>> >>>>Other ways to fix this:
>> >>>>
>> >>>>* Remove the racing close call from mq_notify and surround recv
>> >>>>  with pthread_cleanup_push/pop.
>> >>>>
>> >>>>* Make the worker thread joinable initially, join it before closing
>> >>>>  the socket on the failure path, and detach it on the happy path.
>> >>>>  This would also require disabling cancellation around join/detach
>> >>>>  to ensure that mq_notify itself is not cancelled in an inappropriate
>> >>>>  state.
>> >>>
>> >>>I'd put this aside for a while because of the pthread barrier
>> >>>involvement I kinda didn't want to deal with. The fix you have sounds
>> >>>like it works, but I think I'd rather pursue one of the other
>> >>>approaches, probably the joinable thread one.
>> >>>
>> >>>At present, the implementation of barriers seems to be buggy (I need
>> >>>to dig back up the post about that), and they're also a really
>> >>>expensive synchronization tool that goes both directions where we
>> >>>really only need one direction (notifying the caller we're done
>> >>>consuming the args). I'd rather switch to a semaphore, which is the
>> >>>lightest and most idiomatic (at least per present-day musl idioms) way
>> >>>to do this.
>> >>>
>> >>This sounds good to me. The same approach can also be used in
>> >>timer_create (assuming it's acceptable to add dependency on
>> >>pthread_cancel to that code).
>> >>
>> >>>Using a joinable thread also lets us ensure we don't leave around
>> >>>threads that are waiting to be scheduled just to exit on failure
>> >>>return. Depending on scheduling attributes, this probably could be
>> >>>bad.
>> >>>
>> >>I also prefer this approach, though mostly for aesthetic reasons (I
>> >>haven't thought about the scheduling behavior). I didn't use it only
>> >>because I felt it's a "logically larger" change than simply moving
>> >>the pthread_barrier_wait call. And I wasn't aware that barriers are
>> >>buggy in musl.
>> >
>> >Finally following up on this. How do the attached commits look?
>> >
>> The first and third patches add calls to sem_wait, pthread_join, and
>> pthread_detach, which are cancellation points in musl, so
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Nice catch -- this is actually a bug. pthread_detach is not permitted
> to be a cancellation point.
> 
Indeed. I'd actually tried to check that before I sent the email, but 
got confused by the following sentence from "man 7 pthreads"[1]:

"An implementation may also mark other functions not specified in the 
standard as cancellation points."

My mistake was reading this as "if the standard does not specify a
function to be a cancellation point, an implementation may still mark
it as one". Apparently it actually means "if a function is not
mentioned in the standard at all, an implementation may still mark it
as a cancellation point".

For anyone wondering, the actual text from POSIX is [2]:

"In addition, a cancellation point may occur when a thread is executing 
any function that this standard does not require to be thread-safe but 
the implementation documents as being thread-safe. If a thread is 
cancelled while executing a non-thread-safe function, the behavior is 
undefined.

An implementation shall not introduce cancellation points into any other 
functions specified in this volume of POSIX.1-2017."

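(And pthread_detach is required to be thread-safe, so the exception in
the first paragraph does not apply to it and the second paragraph
forbids making it a cancellation point.)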
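
For illustration, the joinable-thread approach discussed upthread could
be sketched roughly as below. This is not the code from the attached
patches: start_worker, struct start_args, the "fail" flag and the EIO
return value are all made up for the example, and read() on a pipe
stands in for recv() on the socket. The caller masks its own
cancellation because sem_wait and pthread_join are cancellation points.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

/* Hypothetical argument block; not musl's actual structure. */
struct start_args {
	sem_t consumed; /* posted once the worker has copied its arguments */
	int fd;         /* descriptor the worker blocks on */
};

static void *worker(void *p)
{
	struct start_args *args = p;
	int fd = args->fd; /* copy before the caller's stack frame disappears */
	char buf[32];
	int cs;

	sem_post(&args->consumed); /* one-directional "args consumed" signal */

	/* read() is a cancellation point, so a pending cancel acts here. */
	if (read(fd, buf, sizeof buf) >= 0) {
		/* A real worker would deliver its notification here. */
	}

	/* Past this point the worker owns fd. Mask cancellation so that a
	 * PTHREAD_CANCELED exit status always means "fd was not closed". */
	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
	close(fd);
	return 0;
}

/* Takes ownership of fd. Returns 0 on success or an error number.
 * A nonzero `fail` simulates a later setup step failing (assumption). */
int start_worker(int fd, int fail)
{
	struct start_args args = { .fd = fd };
	pthread_t td;
	void *res;
	int cs, r;

	if (sem_init(&args.consumed, 0, 0)) {
		r = errno;
		close(fd);
		return r;
	}

	/* The caller must not be cancelled while the worker still uses
	 * args or while the worker's fate is undecided. */
	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);

	r = pthread_create(&td, 0, worker, &args); /* joinable by default */
	if (r) {
		close(fd); /* no worker exists, so nothing can race */
	} else {
		while (sem_wait(&args.consumed) && errno == EINTR);
		if (fail) {
			pthread_cancel(td);
			pthread_join(td, &res);
			/* The worker has terminated, so no thread can still
			 * be inside read(fd). Close it only if the worker
			 * did not get as far as closing it itself. */
			if (res == PTHREAD_CANCELED) close(fd);
			r = EIO; /* stand-in for the simulated failure */
		} else {
			pthread_detach(td); /* happy path: let it run */
		}
	}

	sem_destroy(&args.consumed);
	pthread_setcancelstate(cs, 0);
	return r;
}
```

The point of the join is ordering: the descriptor is closed only after
the worker is known to be gone, so its number cannot be reused by
another thread while the worker might still pass it to the kernel.
Build with -pthread.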

[1] https://man7.org/linux/man-pages/man7/pthreads.7.html
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_05_02
