From: Alexey Izbyshev <izbyshev@ispras.ru>
To: musl@lists.openwall.com
Subject: Re: [musl] Illegal killlock skipping when transitioning to single-threaded state
Date: Mon, 03 Oct 2022 09:16:03 +0300 [thread overview]
Message-ID: <ae1b4d72f2de2fbf859ef1df9a2f7ff9@ispras.ru> (raw)
In-Reply-To: <20220919152937.GQ9709@brightrain.aerifal.cx>
On 2022-09-19 18:29, Rich Felker wrote:
> On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
>> Hi,
>>
>> While reading pthread_exit() implementation I noticed that it can
>> set "libc.need_locks" to -1 while still holding the killlock of the
>> exiting thread:
>>
>> if (!--libc.threads_minus_1) libc.need_locks = -1;
>>
>> If the remaining thread attempts to acquire the same killlock
>> concurrently (which is valid for joinable threads), it works fine
>> because LOCK() resets "libc.need_locks" only after a_cas():
>>
>> int need_locks = libc.need_locks;
>> if (!need_locks) return;
>> /* fast path: INT_MIN for the lock, +1 for the congestion */
>> int current = a_cas(l, 0, INT_MIN + 1);
>> if (need_locks < 0) libc.need_locks = 0;
>> if (!current) return;
>>
>> However, because "libc.need_locks" is reset when using LOCK() for
>> any other lock too, the following could happen (E is the exiting
>> thread, R is the remaining thread):
>>
>> E: libc.need_locks = -1
>> R: LOCK(unrelated_lock)
>> R: libc.need_locks = 0
>> R: UNLOCK(unrelated_lock)
>> R: LOCK(E->killlock) // skips locking, now both R and E think they
>> are holding the lock
>> R: UNLOCK(E->killlock)
>> E: UNLOCK(E->killlock)
>>
>> The lack of mutual exclusion means that the tid reuse problem that
>> killlock is supposed to prevent might occur.
>>
>> Moreover, when UNLOCK(E->killlock) is called concurrently by R and
>> E, a_fetch_add() could be done twice if timing is such that both
>> threads notice that "l[0] < 0":
>>
>> /* Check l[0] to see if we are multi-threaded. */
>> if (l[0] < 0) {
>> if (a_fetch_add(l, -(INT_MIN + 1)) != (INT_MIN + 1)) {
>> __wake(l, 1, 1);
>> }
>> }
>>
>> In that case E->killlock will be assigned to INT_MAX (i.e. "unlocked
>> with INT_MAX waiters"). This is a bad state because the next LOCK()
>> will wrap it to "unlocked" state instead of locking. Also, if more
>> than two threads attempt to use it, a deadlock will occur if two
>> supposedly-owners execute a_fetch_add() concurrently in UNLOCK()
>> after a third thread registered itself as a waiter because the lock
>> will wrap to INT_MIN.
>>
>> Reordering the "libc.need_locks = -1" assignment and
>> UNLOCK(E->killlock) and providing a store barrier between them
>> should fix the issue.
>
> I think this all sounds correct. I'm not sure what you mean by a store
> barrier between them, since all lock and unlock operations are already
> full barriers.
>
Before sending the report I tried to infer the intended ordering
semantics of LOCK/UNLOCK by looking at their implementations. For
AArch64, I didn't see why they would provide a full barrier (my
reasoning is below), so I concluded that probably acquire/release
semantics was intended in general and suggested an extra store barrier
to prevent hoisting of "libc.need_locks = -1" store spelled after
UNLOCK(E->killlock) back into the critical section.
UNLOCK is implemented via a_fetch_add(). On AArch64, it is a simple
a_ll()/a_sc() loop without extra barriers, and a_ll()/a_sc() are
implemented via load-acquire/store-release instructions. Therefore, if
we consider a LOCK/UNLOCK critical section containing only plain loads
and stores, (a) any such memory access can be reordered with the initial
ldaxr in UNLOCK, and (b) any plain load following UNLOCK can be
reordered with stlxr (assuming the processor predicts that stlxr
succeeds), and further, due to (a), with any memory access inside the
critical section. Therefore, UNLOCK is not full barrier. Is this right?
As for a store following UNLOCK, I'm not sure. A plain store following
stlxr can be reordered with it, but here that store is conditional on
stlxr success. So even if the processor predicts stlxr success, it can't
make the following store visible before it's sure that stlxr succeeded.
But I don't know whether the events "the processor knows that stlxr
succeeded" and "the result of stlxr is globally visible" are separate or
not, and if they are separate, what comes first. Depending on the
answer, UNLOCK acts as a store barrier or not.
> I'm still a little bit concerned though that there's no real model for
> synchronization of the access to modify need_locks though.
> Conceptually, the store of -1 from the exiting thread and the store of
> 0 in the surviving thread are not ordered with respect to each other
> except by an unsynchronized causality relationship.
>
I don't understand implications of the lack of (stronger) ordering of
these two stores. Could you clarify?
> I suspect what "should be" happening is that, if we observe
> need_locks==-1 (as a relaxed atomic load in our memory model), we take
> the thread list lock (as the lock controlling write access to these
> values) which ensures that the exiting thread has exited, then confirm
> that we are the only thread left (are there async-signal ways without
> UB that another thread could be created in the mean time?) before
> setting it to 0. But this is a rather heavy operation. If there's an
> assurance that no other threads can be created in the middle of LOCK
> (which I think there should be), we could just use __tl_sync(), which
> is much lighter, to conclude that we've synchronized with being the
> only remaining thread and that unsynchronized access is okay.
>
LOCK is called either with application signals blocked or from
async-signal-unsafe functions, so interrupting it with a signal handler
that calls pthread_create() is not possible without UB. Therefore, I
don't see any issues with using __tl_sync() for synchronization with the
exiting thread (although, as I said above, I don't understand what
problem is being solved).
By the way, I noticed a suspicious usage of "libc.need_locks" in fork():
int need_locks = libc.need_locks > 0;
This is different from how LOCK behaves: LOCK will still perform normal
locking one last time if it sees -1, but here locking will be skipped
entirely. Is this intentional?
Thanks,
Alexey
next prev parent reply other threads:[~2022-10-03 6:16 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-07 0:46 Alexey Izbyshev
2022-09-19 15:29 ` Rich Felker
2022-10-03 6:16 ` Alexey Izbyshev [this message]
2022-10-03 12:33 ` Rich Felker
2022-10-03 13:26 ` Szabolcs Nagy
2022-10-03 21:27 ` Szabolcs Nagy
2022-10-03 22:54 ` Rich Felker
2022-10-03 23:05 ` Rich Felker
2022-10-04 13:50 ` Alexey Izbyshev
2022-10-04 14:12 ` Rich Felker
2022-10-04 14:19 ` Rich Felker
2022-10-04 15:43 ` Alexey Izbyshev
2022-10-04 15:57 ` Rich Felker
2022-10-04 18:15 ` Alexey Izbyshev
2022-10-04 23:21 ` Rich Felker
2022-10-04 16:24 ` James Y Knight
2022-10-04 16:45 ` Rich Felker
2022-10-05 13:52 ` James Y Knight
2022-10-04 16:01 ` Alexey Izbyshev
2022-10-04 2:58 ` Rich Felker
2022-10-04 3:00 ` Rich Felker
2022-10-04 4:59 ` Rich Felker
2022-10-04 8:16 ` Szabolcs Nagy
2022-10-04 10:18 ` Alexey Izbyshev
2022-10-04 5:16 ` Alexey Izbyshev
2022-10-04 8:31 ` Szabolcs Nagy
2022-10-04 10:28 ` Alexey Izbyshev
2022-10-05 1:00 ` Rich Felker
2022-10-05 12:10 ` Alexey Izbyshev
2022-10-05 14:03 ` Rich Felker
2022-10-05 14:37 ` Rich Felker
2022-10-05 16:23 ` Alexey Izbyshev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ae1b4d72f2de2fbf859ef1df9a2f7ff9@ispras.ru \
--to=izbyshev@ispras.ru \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).