From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Reviving planned ldso changes
Date: Wed, 8 Mar 2017 13:55:17 -0500
Message-ID: <20170308185517.GB1520@brightrain.aerifal.cx>
References: <20170226010429.GQ12395@port70.net>
 <20170226013926.GY1520@brightrain.aerifal.cx>
 <20170226102830.GR12395@port70.net>
 <20170226152016.GZ1520@brightrain.aerifal.cx>
 <20170226153436.GA2082@port70.net>
 <20170226213925.GB1520@brightrain.aerifal.cx>
 <20170303013026.GJ1520@brightrain.aerifal.cx>
 <20170304105817.GF2082@port70.net>
 <20170306011159.GM1520@brightrain.aerifal.cx>
 <20170307220209.GV1520@brightrain.aerifal.cx>
In-Reply-To: <20170307220209.GV1520@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
User-Agent: Mutt/1.5.21 (2010-09-15)
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Mar 07, 2017 at 05:02:09PM -0500, Rich Felker wrote:
> On Sun, Mar 05, 2017 at 08:11:59PM -0500, Rich Felker wrote:
> > On Sat, Mar 04, 2017 at 11:58:18AM +0100, Szabolcs Nagy wrote:
> > > * Rich Felker [2017-03-02 20:30:26 -0500]:
> > > > Here's a v4 of the patch that saves the "init parent" we descended
> > > > from so that it can return where it left off. There are a couple
> > > > gratuitous hunks left over adding setting of "needed_by" where it made
> > > > sense to be set, but it's not actually used anymore. They could be
> > > > dropped if desired but are probably nice to keep for the sake of
> > > > consistency of data, even though it's data we don't use.
> > > >
> > > > I believe this can be extended to allow concurrent dlopen by amending
> > > > the case in the tree-walk where a dependency isn't constructed yet but
> > > > already has an "init parent" to check whether it's
> > > > pending-construction in the calling thread (recursive dlopen from a
> > > > ctor) or another thread; in the former case (as now) treat it as
> > > > already-constructed; in the latter, wait on a condvar that gets
> > > > signaled at the end of each construction, then continue the loop
> > > > without advancing p. There are probably some subtleties I'm missing,
> > > > though.
> > > ....
> > > >  static void do_init_fini(struct dso *p)
> > > >  {
> > > >  	size_t dyn[DYN_CNT];
> > > > -	int need_locking = libc.threads_minus_1;
> > > > -	/* Allow recursive calls that arise when a library calls
> > > > -	 * dlopen from one of its constructors, but block any
> > > > -	 * other threads until all ctors have finished. */
> > > > -	if (need_locking) pthread_mutex_lock(&init_fini_lock);
> > > > -	for (; p; p=p->prev) {
> > > > -		if (p->constructed) continue;
> > > > +	pthread_mutex_lock(&init_fini_lock);
> > > > +	/* Construct in dependency order without any recursive state. */
> > > > +	while (p && !p->constructed) {
> > > > +		/* The following loop descends into the first dependency
> > > > +		 * that is neither already constructed nor pending
> > > > +		 * construction due to circular deps, stopping only
> > > > +		 * when it reaches a dso with no remaining dependencies
> > > > +		 * to descend into. */
> > > > +		while (p->deps && p->deps[p->next_dep]) {
> > > > +			if (!p->deps[p->next_dep]->constructed &&
> > > > +			    !p->deps[p->next_dep]->init_parent) {
> > > > +				p->deps[p->next_dep]->init_parent = p;
> > > > +				p = p->deps[p->next_dep++];
> > >
> > > i think the root may be visited twice because it
> > > has no init_parent, which may be problematic with
> > > the concurrent dlopen (and can cause unexpected
> > > ctor order: the root node is not constructed last
> > > if there is a cycle through it)
> >
> > Ah, the case where the root is an indirect dependency for itself? Yes,
> > I think you're right in that case. We should be able to avoid it by
> > setting the initial p->init_parent to head (the application), I think.
> >
> > > i think only checking init_parent of a dep is
> > > enough and the root node can have a dummy parent
> > > that is guaranteed to be not a dependency (ldso?)
> > > and constructed so it stops the loop.
> >
> > I think ldso would work too, but in principle it need not be a
> > dependency of anything if you have a dynamic-linked program that
> > doesn't use libc (-nostdlib), so it's better to use head, I think.
> >
> > Also I agree we don't need to check p->constructed now, but once we
> > unlock during ctor execution, the !init_parent and !constructed cases
> > need to be treated separately. If it's constructed or pending
> > construction in the same thread, we can just do p->next_dep++, but if
> > it has an init_parent but isn't constructed or pending construction in
> > same thread (recursive) we need to condvar wait and re-check instead,
> > right?
>
> Arg, deep problems I missed. Quoting from IRC:
>
> nsz, uhg, the dep-order draft so far has a big bug
> p->deps is not actually deps for p
> rather, it's all indirect deps, but only set for a lib that was explicitly dlopen'd
> so the new code doesn't actually do dep-order
> it just walks a flat list of all (breadth-first, not depth-first) direct and indirect dependencies of p
> and descends into each then immediately backs out
> because after descending, p->deps is null
> i think we should get rid of the old use of p->deps
> which is just undoing temp-globalization of libs during load for reloc purposes
>
> If I first do the work of having a separate global-namespace dso list
> (which is a pending change that will speed up relocations anyway),
> then the old use of p->deps is no longer needed and we can simply
> repurpose it to be direct-deps only.

This is incorrect. p->deps in its current form is used for dlsym
"dependency ordering" symbol resolution, where a breadth-first list of
all direct and indirect dependencies is exactly what you want. So I
don't think it can be eliminated.

I wonder if it suffices to walk the flat p->deps in reverse. I suspect
there are cases where this is wrong when a dependency appears more
than once.

Rich
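
To make the reverse-walk concern concrete, here is a minimal sketch, not
musl code. It assumes a hypothetical struct node whose deps array lists
direct dependencies only (which is not what p->deps holds today), and a
flat list that records each node once, at first discovery. The names
struct node, bfs_flat and ctor_order are invented for illustration.
bfs_flat builds the breadth-first list; ctor_order is the non-recursive
init_parent walk with a dummy, already-constructed parent for the root.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dso; only the fields used here. */
struct node {
	const char *name;
	struct node **deps;        /* NULL-terminated DIRECT deps (assumption) */
	size_t next_dep;           /* next entry of deps to examine */
	struct node *init_parent;  /* node we descended from */
	int constructed;
};

/* Flat breadth-first list of root plus all direct and indirect deps,
 * each recorded once at first discovery. Returns the entry count. */
size_t bfs_flat(struct node *root, struct node **out, size_t max)
{
	size_t head = 0, n = 0;
	assert(n < max);
	out[n++] = root;
	while (head < n) {
		struct node *p = out[head++];
		for (size_t i = 0; p->deps && p->deps[i]; i++) {
			size_t j;
			for (j = 0; j < n && out[j] != p->deps[i]; j++);
			if (j == n) {
				assert(n < max);
				out[n++] = p->deps[i];
			}
		}
	}
	return n;
}

/* Non-recursive post-order walk. Records nodes in the order their
 * ctors would run. dummy must be marked constructed and must not be
 * a dependency of anything; it stops the walk at the root and keeps
 * a cycle through the root from re-entering it. */
size_t ctor_order(struct node *root, struct node *dummy,
                  struct node **out, size_t max)
{
	size_t n = 0;
	struct node *p = root;
	root->init_parent = dummy;
	while (p != dummy) {
		/* Descend into each dep that is neither constructed
		 * nor pending (init_parent already set). */
		while (p->deps && p->deps[p->next_dep]) {
			struct node *d = p->deps[p->next_dep++];
			if (!d->constructed && !d->init_parent) {
				d->init_parent = p;
				p = d;
			}
		}
		assert(n < max);
		p->constructed = 1;   /* the ctors would run here */
		out[n++] = p;
		p = p->init_parent;   /* resume in the parent */
	}
	return n;
}
```

With A needing B and C, and C needing B, the flat list comes out as
A, B, C. Walking it in reverse gives C, B, A, which runs C's ctors
before those of its dependency B. The init_parent walk gives B, C, A.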