From: Alexey Izbyshev <izbyshev@ispras.ru>
To: musl@lists.openwall.com
Subject: Re: [musl] Illegal killlock skipping when transitioning to single-threaded state
Date: Tue, 04 Oct 2022 08:16:16 +0300 [thread overview]
Message-ID: <bd7c341816819554877e452643d1f698@ispras.ru> (raw)
In-Reply-To: <20221003212705.GG2158779@port70.net>
On 2022-10-04 00:27, Szabolcs Nagy wrote:
> * Szabolcs Nagy <nsz@port70.net> [2022-10-03 15:26:15 +0200]:
>
>> * Alexey Izbyshev <izbyshev@ispras.ru> [2022-10-03 09:16:03 +0300]:
>> > On 2022-09-19 18:29, Rich Felker wrote:
>> > > On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
>> ...
>> > > > Reordering the "libc.need_locks = -1" assignment and
>> > > > UNLOCK(E->killlock) and providing a store barrier between them
>> > > > should fix the issue.
>> > >
>> > > I think this all sounds correct. I'm not sure what you mean by a store
>> > > barrier between them, since all lock and unlock operations are already
>> > > full barriers.
>> > >
>> >
>> > Before sending the report I tried to infer the intended ordering semantics
>> > of LOCK/UNLOCK by looking at their implementations. For AArch64, I didn't
>> > see why they would provide a full barrier (my reasoning is below), so I
>> > concluded that probably acquire/release semantics was intended in general
>> > and suggested an extra store barrier to prevent hoisting of "libc.need_locks
>> > = -1" store spelled after UNLOCK(E->killlock) back into the critical
>> > section.
>> >
>> > UNLOCK is implemented via a_fetch_add(). On AArch64, it is a simple
>> > a_ll()/a_sc() loop without extra barriers, and a_ll()/a_sc() are implemented
>> > via load-acquire/store-release instructions. Therefore, if we consider a
>> > LOCK/UNLOCK critical section containing only plain loads and stores, (a) any
>> > such memory access can be reordered with the initial ldaxr in UNLOCK, and
>> > (b) any plain load following UNLOCK can be reordered with stlxr (assuming
>> > the processor predicts that stlxr succeeds), and further, due to (a), with
>> > any memory access inside the critical section. Therefore, UNLOCK is not full
>> > barrier. Is this right?
>>
>> i dont think this is right.
>
>
> i think i was wrong and you are right.
>
> so with your suggested swap of UNLOCK(killlock) and need_locks=-1 and
> starting with 'something == 0' the exiting E and remaining R threads:
>
> E:something=1 // protected by killlock
> E:UNLOCK(killlock)
> E:need_locks=-1
>
> R:LOCK(unrelated) // reads need_locks == -1
> R:need_locks=0
> R:UNLOCK(unrelated)
> R:LOCK(killlock) // does not lock
> R:read something // can it be 0 ?
>
> and here something can be 0 (ie. not protected by killlock) on aarch64
> because
>
> T1
> something=1
> ldaxr ... killlock
> stlxr ... killlock
> need_locks=-1
>
> T2
> x=need_locks
> ldaxr ... unrelated
> stlxr ... unrelated
> y=something
>
> can end with x==-1 and y==0.
>
Yes, overall this matches my understanding. However, your UNLOCK
expansion (in T1/T2) omits the branch instruction between stlxr and the
following store, and, as I mentioned, I'm not sufficiently knowledgeable
to understand the effects of this branch on the order of visibility of
"stlxr killlock" (and preceding stores) and "need_locks=-1".
> and to fix it, both a_fetch_add and a_cas need an a_barrier.
>
> i need to think how to support such lock usage on aarch64
> without adding too many dmb.
>
I'd also like to point out that our discussion so far has been focused
on reordering of "something" protected by the critical section and
"need_locks=-1", but my original bug report also mentioned the
possibility of lock corruption, and for the latter we're also interested
in reordering of "stlxr" and "need_locks=-1". It's possible that some
UNLOCK implementations can provide the first guarantee, but not the
second.
>
>
>>
>> memory operations in the critical section cannot visibly happen after
>> the
>> final stlxr.
>>
>> memory operations after the critical section cannot visibly happen
>> before
>> the ldaxr of UNLOCK.
>>
>> the only issue can be inside the ll/sc loop in UNLOCK if there are
>> independent
>> memory operations there, but there arent.
>>
You've already said this is wrong, but I want to spell out why it's so
for any interested parties to hopefully clear some confusion. Both
premises (about stlxr and ldaxr) are correct, but the third point
doesn't follow and is wrong. Because ldaxr/stlxr act as one-way
barriers, the following pseudo-code
r1 = X
ldaxr LOCK
stlxr LOCK
r2 = Y
can be observed, in my understanding of AArch64 memory model, as
ldaxr LOCK
r1 = X
r2 = Y
stlxr LOCK
and even as
ldaxr LOCK
r2 = Y
r1 = X
stlxr LOCK
The same is true for stores as well.
So effectively the loop in UNLOCK can absorb memory operations from both
directions, which then can be reordered with each other (with a possible
exception of stores that follow UNLOCK, as I said above).
Thanks,
Alexey
next prev parent reply other threads:[~2022-10-04 5:16 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-07 0:46 Alexey Izbyshev
2022-09-19 15:29 ` Rich Felker
2022-10-03 6:16 ` Alexey Izbyshev
2022-10-03 12:33 ` Rich Felker
2022-10-03 13:26 ` Szabolcs Nagy
2022-10-03 21:27 ` Szabolcs Nagy
2022-10-03 22:54 ` Rich Felker
2022-10-03 23:05 ` Rich Felker
2022-10-04 13:50 ` Alexey Izbyshev
2022-10-04 14:12 ` Rich Felker
2022-10-04 14:19 ` Rich Felker
2022-10-04 15:43 ` Alexey Izbyshev
2022-10-04 15:57 ` Rich Felker
2022-10-04 18:15 ` Alexey Izbyshev
2022-10-04 23:21 ` Rich Felker
2022-10-04 16:24 ` James Y Knight
2022-10-04 16:45 ` Rich Felker
2022-10-05 13:52 ` James Y Knight
2022-10-04 16:01 ` Alexey Izbyshev
2022-10-04 2:58 ` Rich Felker
2022-10-04 3:00 ` Rich Felker
2022-10-04 4:59 ` Rich Felker
2022-10-04 8:16 ` Szabolcs Nagy
2022-10-04 10:18 ` Alexey Izbyshev
2022-10-04 5:16 ` Alexey Izbyshev [this message]
2022-10-04 8:31 ` Szabolcs Nagy
2022-10-04 10:28 ` Alexey Izbyshev
2022-10-05 1:00 ` Rich Felker
2022-10-05 12:10 ` Alexey Izbyshev
2022-10-05 14:03 ` Rich Felker
2022-10-05 14:37 ` Rich Felker
2022-10-05 16:23 ` Alexey Izbyshev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bd7c341816819554877e452643d1f698@ispras.ru \
--to=izbyshev@ispras.ru \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).