From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: tcmalloc compatibility
Date: Tue, 10 Apr 2018 11:35:50 -0400
Message-ID: <20180410153550.GG3094@brightrain.aerifal.cx>
References: <0ea267bf-ea3a-9810-be1a-50e71b6cfce1@denis.im> <20180410143359.GF3094@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Tue, Apr 10, 2018 at 10:45:03AM -0400, Bobby Powers wrote:
> On Tue, Apr 10, 2018 at 2:34 PM Rich Felker wrote:
> > This claim doesn't seem to be well-justified.
> > Myself and members of
> > our community have written a lot on why existing malloc interposition
> > hacks are broken, but there's also an interest in what would take to
> > make it work, and I particularly am interested in this from a
> > standpoint that musl's malloc is not very good, and that being able to
> > dynamically interpose it would facilitate developing and testing a
> > replacement.
>
> This sounds super interesting -- what needs to happen to make progress
> on this? I would love to help out.

For allowing interposition, it's mainly working out policy so that
it's clear what can and can't be supported and we don't get stuck
seemingly promising something impossible. The actual code changes are
fairly small. We'd need to switch from -Wl,-Bsymbolic-functions when
linking to -Wl,--dynamic-list in order to exclude the malloc functions
from being bound at link time, and some changes might be necessary in
the dynamic linker in how it deals with donating gaps to malloc and
early allocations before the interposed malloc is available. There's
also a question of whether the dynamic linker should have code to
detect and refuse to run with incorrect malloc interposition (some but
not all functions interposed).

But back to the point, the main issue is specifying the constraints on
the interposing functions.

> > Note however that if malloc interposition is supported at some point,
> > there will be a specification for constraints on the malloc
> > implementation including what functions you can call from it (e.g.
> > something like AS-safety), and bug reports for implementations that do
> > things outside this spec (and thereby inherently can't work safely or
> > reliably) will not be considered bugs.
>
> That sounds reasonable.
> Some existing software (like Hoard) goes out
> of its way to interpose on all functions that might call into malloc
> to ensure the system allocator isn't called indirectly:
>
> https://github.com/emeryberger/Heap-Layers/blob/master/wrappers/wrapper.cpp

This is really impossible to do correctly, for multiple reasons:

1. Some such functions are fundamentally not replaceable, like the
   dynamic linker functions (dlopen).

2. There is no specification for which libc functions call into
   malloc; this is an implementation detail. The only related things
   that are parts of the public contract are whether they return
   memory "as if by malloc" and whether they're AS-safe or AC-safe (in
   which case it's not formally correct, but it's fairly reasonable to
   assume they don't call malloc). For example on glibc, qsort and
   printf call malloc (but qsort has, and necessarily has to have, a
   fallback for when it fails since qsort can't fail).

3. Some functions which use malloc are sufficiently heavy that you'd
   be replacing (and possibly changing or reducing functionality in)
   whole major libc components if you wanted to replace them. For
   example, getaddrinfo (the whole resolver infrastructure), iconv,
   regex, ...

Note that, unless the malloc replacement and the system malloc somehow
step on each other's state, there's no harm in having both present and
getting called as long as the libc functions that return memory "as if
by malloc" (memory that is thus permissible to pass to realloc and
free) use the interposed malloc replacement. This would just be things
like strdup. So if you only replace those, it's a more manageable
task.

But I still think it's a wrong approach. If musl does add support for
malloc interposition, I'm strongly leaning towards using the
interposed malloc everywhere so that this kind of issue does not
matter. Otherwise there are too many opportunities for subtle errors.
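
To make the "as if by malloc" point concrete, here is a minimal sketch
in C. It is a toy model of the ownership problem, not real
interposition and not anything from musl: every toy_* name is
hypothetical, the "replacement allocator" is just a fixed arena with
bump allocation, and toy_free returns a status only so the mismatch
can be observed. A real free() has no way to report a foreign pointer
and would corrupt its state or crash instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy replacement allocator: a fixed arena with bump
 * allocation. None of the toy_* names exist in musl or elsewhere. */
#define TOY_ARENA_SIZE 4096
static _Alignas(max_align_t) unsigned char toy_arena[TOY_ARENA_SIZE];
static size_t toy_used;

/* Does p point into the replacement allocator's arena? */
static int toy_owns(const void *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)toy_arena;
    return a >= lo && a < lo + TOY_ARENA_SIZE;
}

static void *toy_malloc(size_t n)
{
    size_t align = _Alignof(max_align_t);
    if (n > TOY_ARENA_SIZE) return NULL;      /* also avoids overflow below */
    size_t need = (n + align - 1) / align * align;
    if (need == 0) need = align;              /* give size 0 a unique block */
    if (need > TOY_ARENA_SIZE - toy_used) return NULL;
    void *p = toy_arena + toy_used;
    toy_used += need;
    return p;
}

/* Returns 0 if p is NULL or came from toy_malloc, -1 if it is foreign.
 * The bump allocator never reuses memory, so there is nothing to do
 * for a valid pointer. */
static int toy_free(void *p)
{
    if (!p) return 0;
    return toy_owns(p) ? 0 : -1;
}

/* A function that returns memory "as if by malloc" has to be built on
 * the replacement, so that its result may be passed to toy_free. */
static char *toy_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *d = toy_malloc(n);
    if (d) memcpy(d, s, n);
    return d;
}

/* Returns 0 if the replacement accepts its own pointer and rejects
 * one obtained from the system malloc. */
int toy_demo(void)
{
    char *mine = toy_strdup("hello");   /* from the replacement */
    void *theirs = malloc(16);          /* from the system allocator */
    if (!mine || !theirs) { free(theirs); return -1; }
    int ok = toy_free(mine) == 0        /* matching allocator: accepted */
          && toy_free(theirs) == -1;    /* foreign pointer: rejected */
    free(theirs);                       /* hand it back to its real owner */
    return ok ? 0 : -1;
}
```

Calling toy_demo() from a test program should return 0. The only
point of the sketch is that both allocators can coexist as long as
every pointer goes back to the allocator that produced it, which is
why functions like strdup are the ones that matter.
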

The main argument for not calling the interposed malloc from libc
except when you have to is that you don't have to deal with reentrancy
& inconsistent state issues that could be prone to incorrect usage,
but getline() inherently has to return memory as if by malloc, and
thus you're already stuck with at least one function that has to call
the interposed malloc with stdio locks held (or else work in a temp
buffer managed by internal malloc, then only move to a
public-malloc-allocated buffer after finishing, but that's an awful
hack, and imposing libc implementation constraints like that around
the allowance for interposing malloc is exactly the type of nasty
situation I don't want to get into).

Rich