Date: Thu, 18 May 2023 14:23:06 +0200
From: Szabolcs Nagy
To: 847567161 <847567161@qq.com>
Cc: musl
Message-ID: <20230518122306.GU3630668@port70.net>
Subject: Re: Re: [musl] Question：Why musl call a_barrier in __pthread_once?

* 847567161 <847567161@qq.com> [2023-05-18 10:49:44 +0800]:
> > There is an alternate algorithm for pthread_once that doesn't require
> > a barrier in the common case, which I've considered implementing. But
> > it does need efficient access to thread-local storage. At one time,
> > this was a kinda bad assumption (especially legacy mips is horribly
> > slow at TLS) but nowadays it's probably the right choice to make, and
> > we should check that out again...
>
> 1、Can we move dmb after we get the value of control？ like this：
>
> int __pthread_once(pthread_once_t *control, void (*init)(void))
> {
> 	/* Return immediately if init finished before, but ensure that
> 	 * effects of the init routine are visible to the caller. */
> 	if (*(volatile int *)control == 2) {
> 		// a_barrier();
> 		return 0;
> 	}

writes in init may not be visible when *control==2, without
the barrier. (there are many explanations on the web why
double-checked locking is wrong without an acquire barrier,
that's the same issue if you are interested in the details)

> 2、Can we use 'ldar' to instead of dmb here? I see musl
> already use 'stlxr' in a_sc. like this:
>
> static inline int load(volatile int *p)
> {
> 	int v;
> 	__asm__ __volatile__ ("ldar %w0,%1" : "=r"(v) : "Q"(*p));
> 	return v;
> }
>
> if (load((volatile int *)control) == 2) {
> 	return 0;
> }

i think acquire ordering is enough because posix does not
require pthread_once to synchronize memory, but musl does
not have an acquire barrier/load, so it uses a_barrier.

it is probably not worth optimizing the memory order since
we know there is an algorithm that does not need a barrier
in the common case.
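
for readers who want the double-checked pattern spelled out, here is
a minimal sketch in portable C11 atomics. it is not musl's code: the
names once_sketch_t, ONCE_SKETCH_INIT, once_sketch and the example_*
helpers are hypothetical, and a plain mutex stands in for musl's
slow path. the point is only that the fast-path load needs acquire
ordering, paired with a release store after init. on aarch64 a
compiler will typically turn that acquire load into ldar, which is
the ordering asked about in question 2.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical sketch type, NOT musl's pthread_once_t. */
typedef struct {
	atomic_int state;      /* 0 = init not finished, 2 = init finished */
	pthread_mutex_t lock;  /* serializes the slow path (assumption: a
	                        * mutex instead of musl's internal waiting) */
} once_sketch_t;

#define ONCE_SKETCH_INIT { 0, PTHREAD_MUTEX_INITIALIZER }

static int once_sketch(once_sketch_t *o, void (*init)(void))
{
	/* Fast path. The acquire load pairs with the release store below,
	 * so once state == 2 is observed, the writes made by init() are
	 * visible to this thread. A relaxed or plain volatile load here
	 * would give no such guarantee. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == 2)
		return 0;

	/* Slow path: re-check under the lock, then run init once. */
	pthread_mutex_lock(&o->lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != 2) {
		init();
		/* Publish: everything init() wrote is ordered before this store. */
		atomic_store_explicit(&o->state, 2, memory_order_release);
	}
	pthread_mutex_unlock(&o->lock);
	return 0;
}

/* Example use: init runs once no matter how often once_sketch is called. */
static int example_calls;
static void example_init(void) { example_calls++; }

static int example_use(void)
{
	static once_sketch_t once = ONCE_SKETCH_INIT;
	once_sketch(&once, example_init);
	once_sketch(&once, example_init);
	assert(example_calls == 1);
	return example_calls;
}
```

the second call in example_use takes the fast path and returns
without touching the lock. that is the case the barrier question is
about.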