From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_MSPIKE_H2 autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 29719 invoked from network); 3 Oct 2022 13:26:30 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 3 Oct 2022 13:26:30 -0000 Received: (qmail 13809 invoked by uid 550); 3 Oct 2022 13:26:28 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 13773 invoked from network); 3 Oct 2022 13:26:27 -0000 Date: Mon, 3 Oct 2022 15:26:15 +0200 From: Szabolcs Nagy To: musl@lists.openwall.com Cc: Alexey Izbyshev Message-ID: <20221003132615.GF2158779@port70.net> Mail-Followup-To: musl@lists.openwall.com, Alexey Izbyshev References: <20220919152937.GQ9709@brightrain.aerifal.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [musl] Illegal killlock skipping when transitioning to single-threaded state * Alexey Izbyshev [2022-10-03 09:16:03 +0300]: > On 2022-09-19 18:29, Rich Felker wrote: > > On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote: ... > > > Reordering the "libc.need_locks = -1" assignment and > > > UNLOCK(E->killlock) and providing a store barrier between them > > > should fix the issue. > > > > I think this all sounds correct. I'm not sure what you mean by a store > > barrier between them, since all lock and unlock operations are already > > full barriers. > > > > Before sending the report I tried to infer the intended ordering semantics > of LOCK/UNLOCK by looking at their implementations. For AArch64, I didn't > see why they would provide a full barrier (my reasoning is below), so I > concluded that probably acquire/release semantics was intended in general > and suggested an extra store barrier to prevent hoisting of "libc.need_locks > = -1" store spelled after UNLOCK(E->killlock) back into the critical > section. > > UNLOCK is implemented via a_fetch_add(). On AArch64, it is a simple > a_ll()/a_sc() loop without extra barriers, and a_ll()/a_sc() are implemented > via load-acquire/store-release instructions. Therefore, if we consider a > LOCK/UNLOCK critical section containing only plain loads and stores, (a) any > such memory access can be reordered with the initial ldaxr in UNLOCK, and > (b) any plain load following UNLOCK can be reordered with stlxr (assuming > the processor predicts that stlxr succeeds), and further, due to (a), with > any memory access inside the critical section. Therefore, UNLOCK is not full > barrier. Is this right? i dont think this is right. memory operations in the critical section cannot visibly happen after the final stlxr. memory operations after the critical section cannot visibly happen before the ldaxr of UNLOCK. the only issue can be inside the ll/sc loop in UNLOCK if there are independent memory operations there, but there arent. > > As for a store following UNLOCK, I'm not sure. A plain store following stlxr > can be reordered with it, but here that store is conditional on stlxr > success. So even if the processor predicts stlxr success, it can't make the > following store visible before it's sure that stlxr succeeded. But I don't > know whether the events "the processor knows that stlxr succeeded" and "the > result of stlxr is globally visible" are separate or not, and if they are > separate, what comes first. Depending on the answer, UNLOCK acts as a store > barrier or not. > UNLOCK on aarch64 acts as a full seqcst barrier as far as i can see. but looking into the arch implementations needs a detailed understanding of the arch memory model (eg aarch64 stlxr is RCsc not RCpc like iso c release store), but there is no need for that: the musl model is simple seqcst synchronization everywhere.