From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [musl] Re: MT fork and key_lock in pthread_key_create.c
Date: Thu, 6 Oct 2022 15:50:54 -0400 [thread overview]
Message-ID: <20221006195053.GX29905@brightrain.aerifal.cx> (raw)
In-Reply-To: <20221006192042.GW29905@brightrain.aerifal.cx>
[-- Attachment #1: Type: text/plain, Size: 2692 bytes --]
On Thu, Oct 06, 2022 at 03:20:42PM -0400, Rich Felker wrote:
> On Thu, Oct 06, 2022 at 10:02:11AM +0300, Alexey Izbyshev wrote:
> > On 2022-10-06 09:37, Alexey Izbyshev wrote:
> > >Hi,
> > >
> > >I noticed that fork() doesn't take key_lock that is used to protect
> > >the global table of thread-specific keys. I couldn't find mentions of
> > >this lock in the MT fork discussion in the mailing list archive. Was
> > >this lock overlooked?
> > >
> > >Also, I looked at how __aio_atfork() handles a similar case with
> > >maplock, and it seems wrong. It takes the read lock and then simply
> > >unlocks it both in the parent and in the child. But if there were
> > >other holders of the read lock at the time of fork(), the lock won't
> > >end up in the unlocked state in the child. It should probably be
> > >completely nulled-out in the child instead.
> > >
> > Looking at aio further, I don't understand how it's supposed to work
> > with MT fork at all. __aio_atfork() is called in _Fork() when the
> > allocator locks are already held. Meanwhile another thread could be
> > stuck in __aio_get_queue() holding maplock in exclusive mode while
> > trying to allocate, resulting in deadlock.
>
> Indeed, this is messy and I don't think it makes sense to be doing
> this at all. The child is just going to throw away the state so the
> parent shouldn't need to synchronize at all, but if we walk the
> multi-level map[] table in the child after async fork, it's possible
> that the contents seen are inconsistent, even that the pointers are
> only half-written or something.
>
> I see a few possible solutions:
>
> 1. Just set map = 0 in the child and leak the memory. This is not
> going to matter unless you're doing multiple generations of fork
> with aio anyway.
>
> 2. The same, but be a little bit smarter. pthread_rwlock_tryrdlock in
> the child, and if it succeeds, we know the map is consistent so we
> can just zero it out the same as now. Still "leaks" but only on
> contention to expand the map.
>
> 3. Getting a little smarter still: move the __aio_atfork for the
> parent side from _Fork to fork, outside of the critical section
> where malloc lock is held. Then proceed as in (2). Now, the
> tryrdlock is guaranteed to succeed in the child. Leak is only
> possible when _Fork is used (in which case the child context is an
> async signal one, and thus calling any aio_* that would allocate
> map[] again is UB -- note that in this case, the only reason we
> have to do anything at all in the child is to prevent close from
> interacting with aio).
>
> After writing them out, 3 seems like the right choice.
Proposed patch attached.
[-- Attachment #2: aio_atfork.diff --]
[-- Type: text/plain, Size: 2550 bytes --]
diff --git a/src/aio/aio.c b/src/aio/aio.c
index fa24f6b6..4c3379e1 100644
--- a/src/aio/aio.c
+++ b/src/aio/aio.c
@@ -401,11 +401,25 @@ void __aio_atfork(int who)
if (who<0) {
pthread_rwlock_rdlock(&maplock);
return;
+ } else if (!who) {
+ pthread_rwlock_unlock(&maplock);
+ return;
}
- if (who>0 && map) for (int a=0; a<(-1U/2+1)>>24; a++)
+ if (pthread_rwlock_tryrdlock(&maplock)) {
+ /* Obtaining lock may fail if _Fork was called nor via
+ * fork. In this case, no further aio is possible from
+ * child and we can just null out map so __aio_close
+ * does not attempt to do anything. */
+ map = 0;
+ return;
+ }
+ if (map) for (int a=0; a<(-1U/2+1)>>24; a++)
if (map[a]) for (int b=0; b<256; b++)
if (map[a][b]) for (int c=0; c<256; c++)
if (map[a][b][c]) for (int d=0; d<256; d++)
map[a][b][c][d] = 0;
- pthread_rwlock_unlock(&maplock);
+ /* Re-initialize the rwlock rather than unlocking since there
+ * may have been more than one reference on it in the parent.
+ * We are not a lock holder anyway; the thread in the parent was. */
+ pthread_rwlock_init(&maplock, 0);
}
diff --git a/src/process/_Fork.c b/src/process/_Fork.c
index da063868..fb0fdc2c 100644
--- a/src/process/_Fork.c
+++ b/src/process/_Fork.c
@@ -14,7 +14,6 @@ pid_t _Fork(void)
pid_t ret;
sigset_t set;
__block_all_sigs(&set);
- __aio_atfork(-1);
LOCK(__abort_lock);
#ifdef SYS_fork
ret = __syscall(SYS_fork);
@@ -32,7 +31,7 @@ pid_t _Fork(void)
if (libc.need_locks) libc.need_locks = -1;
}
UNLOCK(__abort_lock);
- __aio_atfork(!ret);
+ if (!ret) __aio_atfork(1);
__restore_sigs(&set);
return __syscall_ret(ret);
}
diff --git a/src/process/fork.c b/src/process/fork.c
index ff71845c..80e804b1 100644
--- a/src/process/fork.c
+++ b/src/process/fork.c
@@ -36,6 +36,7 @@ static volatile int *const *const atfork_locks[] = {
static void dummy(int x) { }
weak_alias(dummy, __fork_handler);
weak_alias(dummy, __malloc_atfork);
+weak_alias(dummy, __aio_atfork);
weak_alias(dummy, __ldso_atfork);
static void dummy_0(void) { }
@@ -50,6 +51,7 @@ pid_t fork(void)
int need_locks = libc.need_locks > 0;
if (need_locks) {
__ldso_atfork(-1);
+ __aio_atfork(-1);
__inhibit_ptc();
for (int i=0; i<sizeof atfork_locks/sizeof *atfork_locks; i++)
if (*atfork_locks[i]) LOCK(*atfork_locks[i]);
@@ -75,6 +77,7 @@ pid_t fork(void)
if (ret) UNLOCK(*atfork_locks[i]);
else **atfork_locks[i] = 0;
__release_ptc();
+ if (ret) __aio_atfork(0);
__ldso_atfork(!ret);
}
__restore_sigs(&set);
next prev parent reply other threads:[~2022-10-06 19:51 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-06 6:37 [musl] " Alexey Izbyshev
2022-10-06 7:02 ` [musl] " Alexey Izbyshev
2022-10-06 19:20 ` Rich Felker
2022-10-06 19:50 ` Rich Felker [this message]
2022-10-07 1:26 ` Rich Felker
2022-10-07 10:53 ` Alexey Izbyshev
2022-10-07 21:18 ` Rich Felker
2022-10-08 16:07 ` Alexey Izbyshev
2022-10-07 8:18 ` Alexey Izbyshev
2022-10-06 20:04 ` Jeffrey Walton
2022-10-06 20:09 ` Rich Felker
2022-10-06 18:21 ` [musl] " Rich Felker
2022-10-08 1:36 ` Rich Felker
2022-10-08 17:03 ` Alexey Izbyshev
2022-10-11 17:50 ` Rich Felker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221006195053.GX29905@brightrain.aerifal.cx \
--to=dalias@libc.org \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).