From: Alexey Izbyshev <izbyshev@ispras.ru>
To: musl@lists.openwall.com
Subject: Re: [musl] Illegal killlock skipping when transitioning to single-threaded state
Date: Tue, 04 Oct 2022 19:01:04 +0300 [thread overview]
Message-ID: <11b02cf9c4a16cc512c8c38db2e39211@ispras.ru> (raw)
In-Reply-To: <2e77a700561a059e85daad5311306cfb@ispras.ru>
On 2022-10-04 16:50, Alexey Izbyshev wrote:
> On 2022-10-04 02:05, Rich Felker wrote:
>> On Mon, Oct 03, 2022 at 06:54:17PM -0400, Rich Felker wrote:
>>> On Mon, Oct 03, 2022 at 11:27:05PM +0200, Szabolcs Nagy wrote:
>>> > * Szabolcs Nagy <nsz@port70.net> [2022-10-03 15:26:15 +0200]:
>>> >
>>> > > * Alexey Izbyshev <izbyshev@ispras.ru> [2022-10-03 09:16:03 +0300]:
>>> > > > On 2022-09-19 18:29, Rich Felker wrote:
>>> > > > > On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
>>> > > ...
>>> > > > > > Reordering the "libc.need_locks = -1" assignment and
>>> > > > > > UNLOCK(E->killlock) and providing a store barrier between them
>>> > > > > > should fix the issue.
>>> > > > >
>>> > > > > I think this all sounds correct. I'm not sure what you mean by a store
>>> > > > > barrier between them, since all lock and unlock operations are already
>>> > > > > full barriers.
>>> > > > >
>>> > > >
>>> > > > Before sending the report I tried to infer the intended ordering semantics
>>> > > > of LOCK/UNLOCK by looking at their implementations. For AArch64, I didn't
>>> > > > see why they would provide a full barrier (my reasoning is below), so I
>>> > > > concluded that probably acquire/release semantics was intended in general
>>> > > > and suggested an extra store barrier to prevent hoisting of "libc.need_locks
>>> > > > = -1" store spelled after UNLOCK(E->killlock) back into the critical
>>> > > > section.
>>> > > >
>>> > > > UNLOCK is implemented via a_fetch_add(). On AArch64, it is a simple
>>> > > > a_ll()/a_sc() loop without extra barriers, and a_ll()/a_sc() are implemented
>>> > > > via load-acquire/store-release instructions. Therefore, if we consider a
>>> > > > LOCK/UNLOCK critical section containing only plain loads and stores, (a) any
>>> > > > such memory access can be reordered with the initial ldaxr in UNLOCK, and
>>> > > > (b) any plain load following UNLOCK can be reordered with stlxr (assuming
>>> > > > the processor predicts that stlxr succeeds), and further, due to (a), with
>>> > > > any memory access inside the critical section. Therefore, UNLOCK is not full
>>> > > > barrier. Is this right?
>>> > >
>>> > > i dont think this is right.
>>> >
>>> >
>>> > i think i was wrong and you are right.
>>> >
>>> > so with your suggested swap of UNLOCK(killlock) and need_locks=-1 and
>>> > starting with 'something == 0' the exiting E and remaining R threads:
>>> >
>>> > E:something=1 // protected by killlock
>>> > E:UNLOCK(killlock)
>>> > E:need_locks=-1
>>> >
>>> > R:LOCK(unrelated) // reads need_locks == -1
>>> > R:need_locks=0
>>> > R:UNLOCK(unrelated)
>>> > R:LOCK(killlock) // does not lock
>>> > R:read something // can it be 0 ?
>>> >
>>> > and here something can be 0 (ie. not protected by killlock) on aarch64
>>> > because
>>> >
>>> > T1
>>> > something=1
>>> > ldaxr ... killlock
>>> > stlxr ... killlock
>>> > need_locks=-1
>>> >
>>> > T2
>>> > x=need_locks
>>> > ldaxr ... unrelated
>>> > stlxr ... unrelated
>>> > y=something
>>> >
>>> > can end with x==-1 and y==0.
>>> >
>>> > and to fix it, both a_fetch_add and a_cas need an a_barrier.
>>> >
>>> > i need to think how to support such lock usage on aarch64
>>> > without adding too many dmb.
>>>
>>> I don't really understand this, but FWIW gcc emits
>>>
>>> ldxr
>>> ...
>>> stlxr
>>> ...
>>> dmb ish
>>>
>>> for __sync_val_compare_and_swap. So this is probably the right thing
>>> we should have. And it seems to match what the kernel folks discussed
>>> here:
>>>
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/229588.html
>>>
>>> I wondered if there are similar issues for any others archs which
>>> need
>>> review, but it looks like all the other llsc archs have explicit
>>> pre/post barriers defined.
>>
>> Actually I don't understand what's going on with cmpxchg there. The
>> patch I linked has it using ldxr/stxr (not stlxr) for cmpxchg. There's
>> some follow-up in the thread I don't understand, about the case where
>> the cas fails, but we already handle that by doing an explicit barrier
>> in that case.
>>
> I think in that follow-up[1] they mean the following case (in musl
> terms):
>
> volatile int x, flag;
>
> T1:
> x = 1;
> a_store(&flag, 1);
>
> T2:
> while (!flag);
> a_cas(&x, 0, 1); // can this fail?
>
I made a mistake here, this should be a_cas(&x, 1, 2). Everything else
stands.
> They want it to never fail. But if a_cas() is implemented as
> ldrx/stlrx/dmb, this is not guaranteed because ldxr can be reordered
> with the load of flag.
>
> Note that musl does *not* handle this now, because a_barrier() in the
> failure path is after a_ll().
>
> [1]
> https://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/229693.html
>
> Alexey
next prev parent reply other threads:[~2022-10-04 16:01 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-07 0:46 Alexey Izbyshev
2022-09-19 15:29 ` Rich Felker
2022-10-03 6:16 ` Alexey Izbyshev
2022-10-03 12:33 ` Rich Felker
2022-10-03 13:26 ` Szabolcs Nagy
2022-10-03 21:27 ` Szabolcs Nagy
2022-10-03 22:54 ` Rich Felker
2022-10-03 23:05 ` Rich Felker
2022-10-04 13:50 ` Alexey Izbyshev
2022-10-04 14:12 ` Rich Felker
2022-10-04 14:19 ` Rich Felker
2022-10-04 15:43 ` Alexey Izbyshev
2022-10-04 15:57 ` Rich Felker
2022-10-04 18:15 ` Alexey Izbyshev
2022-10-04 23:21 ` Rich Felker
2022-10-04 16:24 ` James Y Knight
2022-10-04 16:45 ` Rich Felker
2022-10-05 13:52 ` James Y Knight
2022-10-04 16:01 ` Alexey Izbyshev [this message]
2022-10-04 2:58 ` Rich Felker
2022-10-04 3:00 ` Rich Felker
2022-10-04 4:59 ` Rich Felker
2022-10-04 8:16 ` Szabolcs Nagy
2022-10-04 10:18 ` Alexey Izbyshev
2022-10-04 5:16 ` Alexey Izbyshev
2022-10-04 8:31 ` Szabolcs Nagy
2022-10-04 10:28 ` Alexey Izbyshev
2022-10-05 1:00 ` Rich Felker
2022-10-05 12:10 ` Alexey Izbyshev
2022-10-05 14:03 ` Rich Felker
2022-10-05 14:37 ` Rich Felker
2022-10-05 16:23 ` Alexey Izbyshev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=11b02cf9c4a16cc512c8c38db2e39211@ispras.ru \
--to=izbyshev@ispras.ru \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).