From: Szabolcs Nagy
To: musl@lists.openwall.com
Subject: Re: dlopen deadlock
Date: Fri, 15 Jan 2016 01:31:49 +0100
Message-ID: <20160115003148.GH13558@port70.net>
In-Reply-To: <20160114224115.GW238@brightrain.aerifal.cx>

* Rich Felker [2016-01-14 17:41:15 -0500]:
> On Wed, Jan 13, 2016 at 12:09:37PM +0100, Szabolcs Nagy wrote:
> > This bug i reported against glibc also affects musl:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=19448
> >
> > in case of musl it's not the global load lock, but the
> > init_fini_lock that causes the problem.
>
> The deadlock happens when a ctor makes a thread that calls dlopen and
> does not return until the new thread's dlopen returns, right?

yes (not a common scenario; a minimal sketch is below)

> > the multi-threadedness detection is also problematic in
> > do_init_fini:
> >
> > 	need_locking = has_threads
> > 	if (need_locking)
> > 		lock(init_fini_lock)
> > 	for all deps
> > 		run_ctors(dep)
> > 		if (!need_locking && has_threads)
> > 			need_locking = 1
> > 			lock(init_fini_lock)
> > 	if (need_locking)
> > 		unlock(init_fini_lock)
> >
> > checking for threads after the ctors are run is too late if
> > the ctors may start new threads that can dlopen libs with
> > deps in common with the currently loaded lib.
>
> The logic seems unnecessary now that there's no lazy/optional thread
> pointer initialization (originally it was a problem because
> pthread_mutex_lock with a recursive mutex needed to access TLS for the
> owner tid, but TLS might not have been initialized when the ctors ran),
> but I don't immediately see how it's harmful. The only state the lock
> protects is p->constructed and the fini chain (fini_head,
> p->fini_next), which are all used before the ctors run. The need for
> locking is re-evaluated after the ctors run.
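
(fwiw, a minimal sketch of the scenario, with made-up library names;
it assumes the process is already multi-threaded when libfoo.so is
dlopened, so do_init_fini holds init_fini_lock while the ctor runs:

	/* libfoo.c: cc -shared -fPIC libfoo.c -o libfoo.so
	 * (add -lpthread -ldl where the libc needs them) */
	#include <pthread.h>
	#include <dlfcn.h>

	static void *thr(void *arg)
	{
		/* blocks: the dlopen that is running our ctor
		 * still holds init_fini_lock */
		return dlopen("libbar.so", RTLD_NOW);
	}

	__attribute__((constructor))
	static void ctor(void)
	{
		pthread_t t;
		void *ret;
		pthread_create(&t, 0, thr, 0);
		/* the ctor does not return until the new thread's
		 * dlopen returns, so neither ever happens */
		pthread_join(t, &ret);
	}
)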

hm ok i thought the ctors of the same lib might end up
being called twice, concurrently, but i see p->constructed
protects against that

> > one solution i can think of is to have an init_fini_lock
> > for each dso, then the deadlock only happens if a ctor
> > tries to dlopen its own lib (directly or indirectly),
> > which is nonsense (the library depends on itself being
> > loaded)
>
> The lock has to protect the fini chain linked list (used to control
> order of dtors) so I don't think having it be per-dso is a
> possibility.

i guess using lockfree atomics could solve the deadlock then
(rough sketch at the end of this mail)

> Rich
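
(to illustrate the idea, a rough sketch using C11 atomics; this is
not musl code, just the shape of it: a per-dso state flag replaces
init_fini_lock on the ctor path, so no global lock is held while
ctors run; the fini chain ordering you mention would still need
separate treatment:

	#include <stdatomic.h>
	#include <sched.h>

	struct dso {
		_Atomic int ctor_state; /* 0 = new, 1 = ctors running, 2 = done */
		/* ... */
	};

	/* stand-in for actually running p's init functions */
	static void run_ctors(struct dso *p) { (void)p; }

	static void construct(struct dso *p)
	{
		int expect = 0;
		if (atomic_compare_exchange_strong(&p->ctor_state, &expect, 1)) {
			/* we won the race: run ctors with no global lock held */
			run_ctors(p);
			atomic_store(&p->ctor_state, 2);
		} else {
			/* another thread is constructing p; wait for it
			 * (a real version would use a futex, and a ctor
			 * that dlopens its own lib would still hang here) */
			while (atomic_load(&p->ctor_state) != 2)
				sched_yield();
		}
	}
)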