From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Bugs found while working on global thread list
Date: Fri, 15 Feb 2019 13:38:36 -0500
Message-ID: <20190215183836.GC23599@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

1. __dl_thread_cleanup, used to free the thread-local dlerror buffer,
calls free from a context where the thread is no longer in a
consistent state, which is invalid now that free may be defined by the
application. The simplest fix seems to be queuing the buffer to a
global list where it will be seen and freed by some future call to dl
functions; this precludes unbounded (memory leak) growth.

2. dlsym for RTLD_NEXT or RTLD_DEFAULT walks the symbol-defining DSOs
list without holding any lock on it, and also walks the full DSOs list
for the addr2dso lookup. This is intentional in some sense, to avoid
heavy overhead, but it seems incorrect and unsafe: a definition which
is only temporarily global (as part of an in-progress dlopen with
RTLD_LOCAL) or not yet committed (as part of an in-progress dlopen
that eventually fails) can cause dlsym to spuriously return success
for a symbol that should be seen as undefined. I think this may be
salvageable with some atomic sentinels, but it's probably better
(simpler, clearly correct without complex reasoning) to just use a
rwlock. (This issue was found looking at whether dlsym would be a
place to perform deferred free of buffers for #1 above.)
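
For #1, the deferred-free idea could look roughly like the sketch
below. This is only an illustration, not the actual patch: the names
freebuf_queue, queue_free_buf and drain_free_bufs are made up here, and
it assumes the buffer is at least sizeof(void *) bytes so its first
bytes can be reused as the list link. The exiting thread only does an
atomic push; free is called later from an ordinary context.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical global list of buffers waiting to be freed. */
static _Atomic(void *) freebuf_queue;

/* Would be called from thread-exit cleanup in place of free().
 * The dead buffer's first pointer-sized bytes become the link,
 * so no allocation or locking is needed here. */
void queue_free_buf(void *buf)
{
	if (!buf) return;
	void *head = atomic_load(&freebuf_queue);
	do {
		*(void **)buf = head;
	} while (!atomic_compare_exchange_weak(&freebuf_queue, &head, buf));
}

/* Would be called on entry to a dl function, where the calling
 * thread is in a consistent state. Detaches the whole list in one
 * step, then frees each entry. Returns the number of buffers freed. */
size_t drain_free_bufs(void)
{
	void *p = atomic_exchange(&freebuf_queue, (void *)0);
	size_t n = 0;
	while (p) {
		void *next = *(void **)p;
		free(p);
		p = next;
		n++;
	}
	return n;
}
```

Since each exiting thread contributes at most one buffer and any later
dl call drains the whole list, the queue cannot grow without bound as
long as dl functions keep being called.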