Date: Wed, 5 Oct 2022 10:03:03 -0400
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] Illegal killlock skipping when transitioning to single-threaded state
Message-ID: <20221005140303.GS29905@brightrain.aerifal.cx>
References: <20221005010044.GR29905@brightrain.aerifal.cx>

On Wed, Oct 05, 2022 at 03:10:09PM +0300, Alexey Izbyshev wrote:
> On 2022-10-05 04:00, Rich Felker wrote:
> >On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
> >>Reordering the "libc.need_locks = -1" assignment and
> >>UNLOCK(E->killlock) and providing a store barrier between them
> >>should fix the issue.
> >
> >Back to this, because it's immediately actionable without resolving
> >the aarch64 atomics issue:
> >
> >Do you have something in mind for how this reordering is done, since
> >there are other intervening steps that are potentially ordered with
> >respect to either or both? I don't think there is actually any
> >ordering constraint at all on the unlocking of killlock (with the
> >accompanying assignment self->tid=0 kept with it) except that it be
> >past the point where we are committed to the thread terminating
> >without executing any more application code. So my leaning would be to
> >move this block from the end of pthread_exit up to right after the
> >point-of-no-return comment.
> >
> This was my conclusion as well back when I looked at it before
> sending the report.
>
> I was initially concerned about whether reordering with
> a_store(&self->detach_state, DT_EXITED) could cause an unwanted
> observable change (pthread_tryjoin_np() returning EBUSY after a
> pthread function acting on tid like pthread_getschedparam() returns
> ESRCH), but no, pthread_tryjoin_np() will block/trap if the thread
> is not DT_JOINABLE.
>
> >Unfortunately while reading this I found another bug, this time a lock
> >order one. __dl_thread_cleanup() takes a lock while the thread list
> >lock is already held, but fork takes these in the opposite order. I
> >think the lock here could be dropped and replaced with an atomic-cas
> >list head, but that's rather messy and I'm open to other ideas.
> >
> I'm not sure why using a lock-free list is messy, it seems like a
> perfect fit here to me.

Just in general I've tried to reduce the direct use of atomics and
use high-level primitives, because (as this thread is evidence of) I
find the reasoning about direct use of atomics and their correctness
to be difficult and inaccessible to a lot of people who would
otherwise be successful readers of the code. But you're right that
it's a "good match" for the problem at hand.
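
For concreteness, the shape I'd expect is the usual cas-list
push/drain, roughly as below. Untested sketch, using C11 atomics
purely for illustration -- the real thing would use our internal
a_cas_p() -- and the names here (freebuf_head, defer_free,
flush_deferred) are invented for the example:

#include <stdatomic.h>
#include <stdlib.h>

static _Atomic(void *) freebuf_head;

/* Exit-path side: defer freeing of buf. Takes no locks and never
 * allocates, so it's safe to call with the thread list lock held.
 * The buffer itself is the list node: its first pointer-sized bytes
 * hold the next link. */
static void defer_free(void *buf)
{
	void *old = atomic_load_explicit(&freebuf_head, memory_order_relaxed);
	do {
		*(void **)buf = old;   /* link new node to current head */
	} while (!atomic_compare_exchange_weak_explicit(
		&freebuf_head, &old, buf,
		memory_order_release, memory_order_relaxed));
}

/* Consumer side, called somewhere free() is permitted: detach the
 * whole list with one exchange and free every deferred buffer. */
static void flush_deferred(void)
{
	void *q = atomic_exchange_explicit(&freebuf_head, NULL,
		memory_order_acquire);
	while (q) {
		void *next = *(void **)q;
		free(q);
		q = next;
	}
}

int main(void)
{
	void *buf = malloc(64);   /* stands in for a dead thread's buffer */
	defer_free(buf);
	flush_deferred();
	return 0;
}

Since the dead thread's buffer doubles as the list node, the push
side never allocates and never blocks, so it's fine to run with the
thread list lock held; the only requirement is that the drain happens
somewhere free() is legal.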
> However, doesn't __dl_vseterr() use the libc-internal allocator
> after 34952fe5de44a833370cbe87b63fb8eec61466d7? If so, the problem
> that freebuf_queue was originally solving doesn't exist anymore. We
> still can't call the allocator after __tl_lock(), but maybe this
> whole free deferral approach can be reconsidered?

I almost made that change when the MT-fork changes were done, but
didn't because it was wrong. I'm not sure if I documented this
anywhere (it might be in mail threads related to that or IRC) but it
was probably because it would need to take malloc locks with the
thread list lock held, which isn't allowed.

It would be nice if we could get rid of the deferred freeing here,
but I don't see a good way. The reason we can't free the buffer until
after the thread list lock is taken is that it's only freeable if
this isn't the last exiting thread. If it is the last exiting thread,
the buffer contents still need to be present for the atexit handlers
to see. And whether this is the last exiting thread is only
stable/determinate as long as the thread list lock is held.

Rich
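
P.S. In case the constraint above is easier to see in code: below is
a toy model of the exit path, with every name invented for
illustration (it's not musl code, and it leans on the defer_free()
from the sketch earlier in this mail). The point is just that "am I
the last exiting thread?" is only answerable while the thread list
lock is held, and the answer decides whether the buffer has to stay
live for atexit handlers or gets handed off for deferred freeing.

#include <pthread.h>
#include <stdlib.h>

/* Deferred-free hook; stands in for the cas-list push sketched
 * earlier in this mail. */
void defer_free(void *buf);

struct toy_thread {
	void *dlerror_buf;        /* buffer a later dlerror() may return */
	struct toy_thread *next;  /* circular list of live threads */
};

/* Stand-in for the internal thread list lock. */
static pthread_mutex_t thread_list_lock = PTHREAD_MUTEX_INITIALIZER;

void toy_thread_exit(struct toy_thread *self)
{
	pthread_mutex_lock(&thread_list_lock);

	if (self->next == self) {
		/* Last thread: the process is about to exit, and atexit
		 * handlers may still call dlerror(), so the buffer must
		 * stay live. Don't free or defer it. */
		pthread_mutex_unlock(&thread_list_lock);
		exit(0);
	}

	/* Not the last thread: the buffer is dead, but calling free()
	 * here would take malloc locks with the thread list lock held,
	 * which isn't allowed. Hand it to the deferred-free list instead. */
	if (self->dlerror_buf) defer_free(self->dlerror_buf);

	/* ... unlink self and finish tearing the thread down ... */
	pthread_mutex_unlock(&thread_list_lock);
}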