From: Szabolcs Nagy
To: musl@lists.openwall.com
Subject: Re: dlopen deadlock
Date: Fri, 15 Jan 2016 01:31:49 +0100
Message-ID: <20160115003148.GH13558@port70.net>
In-Reply-To: <20160114224115.GW238@brightrain.aerifal.cx>

* Rich Felker [2016-01-14 17:41:15 -0500]:
> On Wed, Jan 13, 2016 at 12:09:37PM +0100, Szabolcs Nagy wrote:
> > This bug i reported against glibc also affects musl:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=19448
> >
> > in case of musl it's not the global load lock, but the
> > init_fini_lock that causes the problem.
>
> The deadlock happens when a ctor makes a thread that calls dlopen and
> does not return until the new thread's dlopen returns, right?

yes (not a common scenario; a minimal sketch is below)

> > the multi-threadedness detection is also problematic in
> > do_init_fini:
> >
> > 	need_locking = has_threads
> > 	if (need_locking)
> > 		lock(init_fini_lock)
> > 	for all deps
> > 		run_ctors(dep)
> > 		if (!need_locking && has_threads)
> > 			need_locking = 1
> > 			lock(init_fini_lock)
> > 	if (need_locking)
> > 		unlock(init_fini_lock)
> >
> > checking for threads after the ctors are run is too late if
> > the ctors may start new threads that can dlopen libs with
> > deps in common with the currently loaded lib.
>
> The logic seems unnecessary now that there's no lazy/optional thread
> pointer initialization (originally it was a problem because
> pthread_mutex_lock with a recursive mutex needed to access TLS for the
> owner tid, but TLS might not have been initialized when the ctors ran),
> but I don't immediately see how it's harmful. The only state the lock
> protects is p->constructed and the fini chain (fini_head,
> p->fini_next), which are all used before the ctors run. The need for
> locking is re-evaluated after the ctors run.
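
(fwiw, a minimal sketch of the scenario, with made-up library names;
it assumes the process is already multi-threaded when libfoo.so is
dlopened, so do_init_fini holds init_fini_lock while the ctor runs:

	/* libfoo.c: cc -shared -fPIC libfoo.c -o libfoo.so
	 * (add -lpthread -ldl where the libc needs them) */
	#include <pthread.h>
	#include <dlfcn.h>

	static void *thr(void *arg)
	{
		/* blocks: the dlopen that is running our ctor
		 * still holds init_fini_lock */
		return dlopen("libbar.so", RTLD_NOW);
	}

	__attribute__((constructor))
	static void ctor(void)
	{
		pthread_t t;
		void *ret;
		pthread_create(&t, 0, thr, 0);
		/* the ctor does not return until the new thread's
		 * dlopen returns, so neither ever happens */
		pthread_join(t, &ret);
	}
)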

hm ok i thought the ctors of the same lib might end up
being called twice, concurrently, but i see p->constructed
protects against that

> > one solution i can think of is to have an init_fini_lock
> > for each dso, then the deadlock only happens if a ctor
> > tries to dlopen its own lib (directly or indirectly),
> > which is nonsense (the library depends on itself being
> > loaded)
>
> The lock has to protect the fini chain linked list (used to control
> order of dtors) so I don't think having it be per-dso is a
> possibility.

i guess using lockfree atomics could solve the deadlock then
(rough sketch at the end of this mail)

> Rich
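
(to illustrate the idea, a rough sketch using C11 atomics; this is
not musl code, just the shape of it: a per-dso state flag replaces
init_fini_lock on the ctor path, so no global lock is held while
ctors run; the fini chain ordering you mention would still need
separate treatment:

	#include <stdatomic.h>
	#include <sched.h>

	struct dso {
		_Atomic int ctor_state; /* 0 = new, 1 = ctors running, 2 = done */
		/* ... */
	};

	/* stand-in for actually running p's init functions */
	static void run_ctors(struct dso *p) { (void)p; }

	static void construct(struct dso *p)
	{
		int expect = 0;
		if (atomic_compare_exchange_strong(&p->ctor_state, &expect, 1)) {
			/* we won the race: run ctors with no global lock held */
			run_ctors(p);
			atomic_store(&p->ctor_state, 2);
		} else {
			/* another thread is constructing p; wait for it
			 * (a real version would use a futex, and a ctor
			 * that dlopens its own lib would still hang here) */
			while (atomic_load(&p->ctor_state) != 2)
				sched_yield();
		}
	}
)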