Re: [musl] Illegal killlock skipping when transitioning to single-threaded state

mailing list of musl libc
 help / color / mirror / code / Atom feed

From: Alexey Izbyshev <izbyshev@ispras.ru>
To: musl@lists.openwall.com
Subject: Re: [musl] Illegal killlock skipping when transitioning to single-threaded state
Date: Tue, 04 Oct 2022 08:16:16 +0300	[thread overview]
Message-ID: <bd7c341816819554877e452643d1f698@ispras.ru> (raw)
In-Reply-To: <20221003212705.GG2158779@port70.net>

On 2022-10-04 00:27, Szabolcs Nagy wrote:
> * Szabolcs Nagy <nsz@port70.net> [2022-10-03 15:26:15 +0200]:
> 
>> * Alexey Izbyshev <izbyshev@ispras.ru> [2022-10-03 09:16:03 +0300]:
>> > On 2022-09-19 18:29, Rich Felker wrote:
>> > > On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
>> ...
>> > > > Reordering the "libc.need_locks = -1" assignment and
>> > > > UNLOCK(E->killlock) and providing a store barrier between them
>> > > > should fix the issue.
>> > >
>> > > I think this all sounds correct. I'm not sure what you mean by a store
>> > > barrier between them, since all lock and unlock operations are already
>> > > full barriers.
>> > >
>> >
>> > Before sending the report I tried to infer the intended ordering semantics
>> > of LOCK/UNLOCK by looking at their implementations. For AArch64, I didn't
>> > see why they would provide a full barrier (my reasoning is below), so I
>> > concluded that probably acquire/release semantics was intended in general
>> > and suggested an extra store barrier to prevent hoisting of "libc.need_locks
>> > = -1" store spelled after UNLOCK(E->killlock) back into the critical
>> > section.
>> >
>> > UNLOCK is implemented via a_fetch_add(). On AArch64, it is a simple
>> > a_ll()/a_sc() loop without extra barriers, and a_ll()/a_sc() are implemented
>> > via load-acquire/store-release instructions. Therefore, if we consider a
>> > LOCK/UNLOCK critical section containing only plain loads and stores, (a) any
>> > such memory access can be reordered with the initial ldaxr in UNLOCK, and
>> > (b) any plain load following UNLOCK can be reordered with stlxr (assuming
>> > the processor predicts that stlxr succeeds), and further, due to (a), with
>> > any memory access inside the critical section. Therefore, UNLOCK is not full
>> > barrier. Is this right?
>> 
>> i dont think this is right.
> 
> 
> i think i was wrong and you are right.
> 
> so with your suggested swap of UNLOCK(killlock) and need_locks=-1 and
> starting with 'something == 0' the exiting E and remaining R threads:
> 
> E:something=1      // protected by killlock
> E:UNLOCK(killlock)
> E:need_locks=-1
> 
> R:LOCK(unrelated)  // reads need_locks == -1
> R:need_locks=0
> R:UNLOCK(unrelated)
> R:LOCK(killlock)   // does not lock
> R:read something   // can it be 0 ?
> 
> and here something can be 0 (ie. not protected by killlock) on aarch64
> because
> 
> T1
> 	something=1
> 	ldaxr ... killlock
> 	stlxr ... killlock
> 	need_locks=-1
> 
> T2
> 	x=need_locks
> 	ldaxr ... unrelated
> 	stlxr ... unrelated
> 	y=something
> 
> can end with x==-1 and y==0.
> 
Yes, overall this matches my understanding. However, your UNLOCK 
expansion (in T1/T2) omits the branch instruction between stlxr and the 
following store, and, as I mentioned, I'm not sufficiently knowledgeable 
to understand the effects of this branch on the order of visibility of 
"stlxr killlock" (and preceding stores) and "need_locks=-1".

> and to fix it, both a_fetch_add and a_cas need an a_barrier.
> 
> i need to think how to support such lock usage on aarch64
> without adding too many dmb.
> 

I'd also like to point out that our discussion so far has been focused 
on reordering of "something" protected by the critical section and 
"need_locks=-1", but my original bug report also mentioned the 
possibility of lock corruption, and for the latter we're also interested 
in reordering of "stlxr" and "need_locks=-1". It's possible that some 
UNLOCK implementations can provide the first guarantee, but not the 
second.

> 
> 
>> 
>> memory operations in the critical section cannot visibly happen after 
>> the
>> final stlxr.
>> 
>> memory operations after the critical section cannot visibly happen 
>> before
>> the ldaxr of UNLOCK.
>> 
>> the only issue can be inside the ll/sc loop in UNLOCK if there are 
>> independent
>> memory operations there, but there arent.
>> 
You've already said this is wrong, but I want to spell out why it's so 
for any interested parties to hopefully clear some confusion. Both 
premises (about stlxr and ldaxr) are correct, but the third point 
doesn't follow and is wrong. Because ldaxr/stlxr act as one-way 
barriers, the following pseudo-code

r1 = X
ldaxr LOCK
stlxr LOCK
r2 = Y

can be observed, in my understanding of AArch64 memory model, as

ldaxr LOCK
r1 = X
r2 = Y
stlxr LOCK

and even as

ldaxr LOCK
r2 = Y
r1 = X
stlxr LOCK

The same is true for stores as well.

So effectively the loop in UNLOCK can absorb memory operations from both 
directions, which then can be reordered with each other (with a possible 
exception of stores that follow UNLOCK, as I said above).

Thanks,
Alexey

next prev parent reply	other threads:[~2022-10-04  5:16 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-07  0:46 Alexey Izbyshev
2022-09-19 15:29 ` Rich Felker
2022-10-03  6:16   ` Alexey Izbyshev
2022-10-03 12:33     ` Rich Felker
2022-10-03 13:26     ` Szabolcs Nagy
2022-10-03 21:27       ` Szabolcs Nagy
2022-10-03 22:54         ` Rich Felker
2022-10-03 23:05           ` Rich Felker
2022-10-04 13:50             ` Alexey Izbyshev
2022-10-04 14:12               ` Rich Felker
2022-10-04 14:19                 ` Rich Felker
2022-10-04 15:43                   ` Alexey Izbyshev
2022-10-04 15:57                     ` Rich Felker
2022-10-04 18:15                       ` Alexey Izbyshev
2022-10-04 23:21                         ` Rich Felker
2022-10-04 16:24                 ` James Y Knight
2022-10-04 16:45                   ` Rich Felker
2022-10-05 13:52                     ` James Y Knight
2022-10-04 16:01               ` Alexey Izbyshev
2022-10-04  2:58         ` Rich Felker
2022-10-04  3:00           ` Rich Felker
2022-10-04  4:59             ` Rich Felker
2022-10-04  8:16               ` Szabolcs Nagy
2022-10-04 10:18               ` Alexey Izbyshev
2022-10-04  5:16         ` Alexey Izbyshev [this message]
2022-10-04  8:31           ` Szabolcs Nagy
2022-10-04 10:28             ` Alexey Izbyshev
2022-10-05  1:00 ` Rich Felker
2022-10-05 12:10   ` Alexey Izbyshev
2022-10-05 14:03     ` Rich Felker
2022-10-05 14:37       ` Rich Felker
2022-10-05 16:23         ` Alexey Izbyshev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bd7c341816819554877e452643d1f698@ispras.ru \
    --to=izbyshev@ispras.ru \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).