From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/13724 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Markus Wichmann Newsgroups: gmane.linux.lib.musl.general Subject: Re: dlsym(handle) may search in unrelated libraries Date: Thu, 7 Feb 2019 19:36:38 +0100 Message-ID: <20190207183637.GF5469@voyager> References: <20190206160248.GB5469@voyager> <20190206202518.GC5469@voyager> <96c367533236e3e203f04a994ee65c47@ispras.ru> <20190207053327.GD5469@voyager> <20190207165447.GP23599@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="249235"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Mutt/1.10.1 (2018-07-13) To: musl@lists.openwall.com Original-X-From: musl-return-13740-gllmg-musl=m.gmane.org@lists.openwall.com Thu Feb 07 19:37:43 2019 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.89) (envelope-from ) id 1groYQ-0012kA-Ta for gllmg-musl@m.gmane.org; Thu, 07 Feb 2019 19:37:42 +0100 Original-Received: (qmail 25881 invoked by uid 550); 7 Feb 2019 18:37:40 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 25857 invoked from network); 7 Feb 2019 18:37:40 -0000 Content-Disposition: inline In-Reply-To: <20190207165447.GP23599@brightrain.aerifal.cx> X-Provags-ID: V03:K1:fBIn8idVryW+C16qRFNyCBrJWmmppx7tsxGYRy+fP2GxnqcquuW bcNqNQp+UWigyUrH5cQHh76OY57IIgXTuSTnGWkaPz7+DRb7Vt/5OSRzu3dVYSbwLiNQhWS zYCIAQsJ7KYW1F2pw84rB+/CxwoSqWQWmDHaC+yLcZqSacQ2cBoIBWM54u8kTr7yQ+i4lLc FrtmUtszFanpycvP08Bbg== X-UI-Out-Filterresults: notjunk:1;V03:K0:O7wSidnbY4w=:yHfNs3hX3NLOvTDXEP04oU EtbrpV23+IjbpTiIghSdyMl05uy56142V4/aCpTEa3qEimtSBvtF0sZrJpp8UzZxoOPVP+ftw z0p6jMBw5CpZJi8CAQS5HdGhxR0LZG7eWo60d+WiP0+ZBleZMhwOmHu6nM2OaOu+zY0fqwQCP doHIYwJ0kZUEhzRxTfUXENLOZL9OGS9d4lBoKGUeP7Vo0gRCmxorDF6M2Q/cI9FPQD+BG9IHO JcgD4vDxxu+ByB12C3AL50WvWVDyqTnZS/lSewuQj6rKtcQgHqc4mbQ3D10QtD9vn9kLe7H/o W2SO4gcI7pg5vQL/osUHXiV67Yw662eFpDl2Qo7KXgj6DiXVNsoc4Vso/sobYu7lNpMpMIx9c EHflbeA2VMEhIKFY0fKuPy9C4YW0TVWB5qi+Wdte8LUyrIV4woiQSwi70+MsEa+kmSa+UfBgT tsQ1ireZ8iZ0EODC4lunEY2yZnmikU9MUvjBOc35r5026VhqWnwznlVXTQyRRfhvKG7ixHYxa OOHmAOz0x42V6MdmVF+3q3pGh8bnvtO9kBBaL7azOBvMVBGv4VCV0UHtArwb4dUbuzMkXuNUu mlnSQKy/BoPC41ehphKAXDlRrwO03Y2Ph0Ve+smGOwUPjLmaY3xFc7m9Y8YuS2ybrBYN9zJlK VGOcY3XAmHcV76uS6j87r+ZJDEnoVb5jCPGYrQOA95qju174WIMKdEdVF6TsBbjp5uGwrO7cZ scY+tdvryFYke7pFt9+od31N98/vlsGfA3yD9Re9gBFXl9ihAvCwhEcjqb7nmndSe2cmsXC7 Xref: news.gmane.org gmane.linux.lib.musl.general:13724 Archived-At: On Thu, Feb 07, 2019 at 11:54:47AM -0500, Rich Felker wrote: > Yes, it needs to contain all direct and indirect dependencies, in the > correct order. Right now I think it incorrectly contains a superset of > that. Is that your assessment of the situation too? > Yes, it is a superset, if the original argument to load_deps() is not the tail of the chain of libraries. > However, a depth-first list of dependencies is also needed to solve > the longstanding ctor-order issue discussed in several past threads > (which I can look up if search doesn't turn them up for you). This is > not a requirement of POSIX (which doesn't even have ctors), but it's a > reasonable expectation we currently get wrong and I think it might be > specified in ELF or some related sysv "standards". > Requirements first, nice-to-haves later, right? Once we have the dlsym() conformance issue squared away, we can still deal with this issue. You mean, people expect their dependencies to be initialized in their own initializers? When it is well-known that there is no order to initializers, and e.g. the C++ standard says there is no order to constructors of static objects. > I'm not sure if it makes sense to build both of these lists at the > same time. We currently try to avoid allocating dependency lists for > libraries loaded at startup in order not to allocate and setup data > that might never be used, defering until if/when dlopen is called on > them. I've wanted to do the ctor order walk without allocating a > dependency list, but I don't know a good way to do so. Note that the > code that runs ctors cannot allocate because, at the time it runs, > dlopen has already "committed" the load and can't back it out. It has > to have all resources it needs for the walk precommitted. > Depth first means stack. Means recursion or explicit stack. Explicit is going to be hard, considering there is no known limit to dependency nesting depth. We could argue that the user better make their stack size ulimit large enough for the main thread, but I hazard a guess that nobody expects dlopen() to use overly much stack in other threads. Alright, what's the algorithm here? init(lib): if (!lib.inited): foreach d in lib.deps init(d) start_init(lib) lib.inited = 1 That about right? Because that means we need a stack as high as the nesting depth of dependencies. We can get a pessimistic estimation from the number of deps loaded, but that may be way too high (see below). > Since the beginning of a list of breadth-first dependencies is just > the direct dependencies, if we recorded the number of direct > dependencies for each dso, the breadth-first lists could be used to > perform a depth-first walk to run ctors; this can be made > non-recursive, at the cost of quadratic time but with trivially-tiny > constant factor, by simply restarting at the root of the tree after > each node and finding the deepest not-yet-constructed dso. This > suggests always allocating the breadth-first dependency lists might be > a good idea. That "feels" unfortunate since in principle they can get > large, but it's only going to be large for ridiculous bloatware with > hundreds of libraries, in which case these dependency lists are very > small on a relative scale. > $ ldd =mpv | wc -l 292 And that, I will remind you, is the "suckless" alternative to mplayer. Anyway, that is breadth, not depth. Experimentally, it is hard to find a chain of libraries deeper than ten. They all eventually just lead back to libc. So what you could do, when it comes time to run the initializers, is to walk the dyn vectors again. Those are already allocated, and all the libs they reference are already loaded, so load_library() would only be a search operation. By the time the initializers run, that must be the case, both at load time and at runtime. Unless loading the library failed the first time, in which case at runtime we have already bailed. And at load time we... haven't. Are you sure continuing on load failure of a dependency during load time is a good idea? > I don't think recursive can be done safely since you don't have a > small bound on levels of recursion. It could be done with an explicit > stack in allocated storage though, since the operation has to allocate > memory anyway and has to fail if allocation fails. > > Rich Nevermind, it was a misunderstanding of the requirements. Ciao, Markus