From: Rich Felker
To: musl@lists.openwall.com
Subject: Multiple bugs in dlopen & dependency tracking
Date: Tue, 4 Jul 2017 10:53:28 -0400
Message-ID: <20170704145328.GA15186@brightrain.aerifal.cx>

Commit 4ff234f6cba96403b5de6d29d48a59fd73252040 introduced a regression whereby dlopen RTLD_GLOBAL of a library that was previously loaded RTLD_LOCAL no longer promotes the library to the global namespace as intended. This is easy to fix, but there are other related bugs I've found in the process, and I want to document them here. Some may be trickier to fix.
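
For reference, the intended promotion behavior can be checked with something like the following. This is only a sketch: the helper name promoted_to_global and whatever library path and symbol name get passed to it are hypothetical, and it assumes the symbol is defined only by that library, so that finding it through the global handle from dlopen(NULL, ...) really does mean the library was promoted.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper, not part of musl.
 * Returns 1 if `sym` is visible in the global namespace after the
 * library at `path` is opened RTLD_LOCAL and then RTLD_GLOBAL,
 * 0 if it is not visible, and -1 if the library cannot be loaded. */
int promoted_to_global(const char *path, const char *sym)
{
    /* First load: symbols stay private to this handle. */
    void *local = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!local) return -1;

    /* Second load of the same library: should promote it to global. */
    void *global = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (!global) {
        dlclose(local);
        return -1;
    }

    /* A handle for the main program searches the global namespace. */
    void *self = dlopen(NULL, RTLD_NOW);
    int visible = self && dlsym(self, sym) != NULL;

    if (self) dlclose(self);
    dlclose(global);
    dlclose(local);
    return visible;
}
```

With the regression present, a call like this would be expected to return 0 for a library that loads fine; with promotion working it should return 1.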
They relate to the ->deps list for a dso, which is supposed to contain a dependency-order list of dsos that dlsym will search when called on the dso handle. These are also connected to the stalled attempt to do dependency-order execution of constructors.

1. Rather than building the deps list by recursive descent through DT_NEEDED tables, it's built by iterating all libraries loaded after the library whose deps list is needed (i.e. p->next to end). This is correct only at the time of initial load of the library (when all new additions to the list are its dependencies) and only if none of those dependencies in turn have dependencies that happened to already have been loaded earlier.

2. The deps pointer remains null if a library has no dependencies. Since a null pointer is also used to indicate that the deps list was not yet built, if such a library is later dlopen'd again, it will falsely be assigned as dependencies any libraries that were loaded after it but before the second call to dlopen. In practice almost everything depends on libc.so, so this is unlikely to show up in the wild. But:

3. Libraries that are loaded at startup do not have deps tables; they only get them on the first dlopen. Like the above case, they will falsely be assigned as dependencies any libraries that happened to be loaded after them, either at startup or by dlopen.

The issue of overloaded meanings for a null deps pointer is easy to fix on its own, so I'm going to go ahead and fix it along with the regression that started this investigation. But I think to fix all these problems we really need to overhaul the way dependencies are represented, and we may need to build the tables for all libraries at the time they're loaded (especially if we want to do the ctor dep order thing, which I still think makes sense) rather than deferring until dlopen.

Rich
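
As a rough illustration of points 1 and 2 above, here is a toy model of building a deps list by walking DT_NEEDED entries instead of the load-order list, with a distinct sentinel for "built, but empty". None of this is musl's actual code: struct dso here is a simplified stand-in, and the names needed, no_deps, append_needed and build_deps are made up for the sketch. The breadth-first order and the choice to leave the dso itself out of its own list are also assumptions.

```c
#include <stdlib.h>

#define MAX_NEEDED 8

/* Simplified stand-in for the loader's per-library record. */
struct dso {
    const char *name;
    struct dso *needed[MAX_NEEDED + 1]; /* direct DT_NEEDED, NULL-terminated */
    struct dso **deps;                  /* NULL means "not built yet" only */
    size_t ndeps;
};

/* Sentinel for "built, and there are no dependencies". */
static struct dso *no_deps[1];

static int contains(struct dso **list, size_t n, struct dso *d)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == d) return 1;
    return 0;
}

/* Append the direct dependencies of `from` that are not yet listed.
 * Returns 0 on success, -1 on allocation failure. */
static int append_needed(struct dso *root, struct dso *from,
                         struct dso ***list, size_t *n)
{
    for (size_t i = 0; from->needed[i]; i++) {
        struct dso *d = from->needed[i];
        if (d == root || contains(*list, *n, d)) continue;
        struct dso **tmp = realloc(*list, (*n + 2) * sizeof **list);
        if (!tmp) return -1;
        tmp[(*n)++] = d;
        tmp[*n] = NULL; /* keep the list NULL-terminated */
        *list = tmp;
    }
    return 0;
}

/* Build p->deps breadth-first from DT_NEEDED, independent of load order. */
int build_deps(struct dso *p)
{
    if (p->deps) return 0; /* already built, possibly as the empty sentinel */

    struct dso **list = NULL;
    size_t n = 0;

    if (append_needed(p, p, &list, &n) < 0) goto fail;
    /* The list grows while we scan it, which gives breadth-first order. */
    for (size_t i = 0; i < n; i++)
        if (append_needed(p, list[i], &list, &n) < 0) goto fail;

    p->deps = n ? list : no_deps;
    p->ndeps = n;
    return 0;

fail:
    free(list);
    return -1;
}
```

Because the walk follows only DT_NEEDED edges, libraries that merely happen to sit later in the load list never end up in deps, and a library with no dependencies gets the sentinel rather than staying null.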