From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 9 Nov 2020 17:23:21 -0500
From: Rich Felker
To: musl@lists.openwall.com
Message-ID: <20201109222320.GC534@brightrain.aerifal.cx>
In-Reply-To: <20201109215901.GG1370092@port70.net>
Subject: Re: [musl] [PATCH v2] MT fork

On Mon, Nov 09, 2020 at 10:59:01PM +0100, Szabolcs Nagy wrote:
> * Rich Felker [2020-11-09 12:07:36 -0500]:
> > On Sun, Nov 08, 2020 at 05:12:15PM +0100, Szabolcs Nagy wrote:
> > > * Rich Felker [2020-11-05 22:36:17 -0500]:
> > > > On Fri, Oct 30, 2020 at 11:31:17PM -0400, Rich Felker wrote:
> > > > > One thing I know is potentially problematic is interaction with malloc
> > > > > replacement -- locking of any of the subsystems locked at fork time
> > > > > necessarily takes place after application atfork handlers, so if the
> > > > > malloc replacement registers atfork handlers (as many do), it could
> > > > > deadlock. I'm exploring whether malloc use in these systems can be
> > > > > eliminated. A few are almost surely better off just using direct mmap
> > > > > anyway, but for some it's borderline. I'll have a better idea sometime
> > > > > in the next few days.
> > > >
> > > > OK, here's a summary of the affected locks (where there's a lock-order
> > > > conflict between them and application-replaced malloc):
> > >
> > > if malloc replacements take internal locks in atfork
> > > handlers then it's not just libc internal locks that
> > > can cause problems but locks taken by other atfork
> > > handlers that were registered before the malloc one.
> >
> > No other locks taken there could impede forward progress in libc. The
>
> the problem is not in the libc: if malloc locks are
> taken and user code can run (which it can in an atfork
> handler) then that's a problem between the malloc
> lock and other user locks.
>
> i.e. it's simply not valid for a malloc implementation
> to register an atfork handler to take locks, since it
> cannot guarantee lock ordering. (the only reason it
> works in practice is that atfork handlers are rare)

The only constraint for it to work is that the malloc replacement's
atfork handler is registered first. This is reasonable for an
application that's replacing malloc to guarantee (e.g. by not using
atfork handlers otherwise, or by knowing which ones it uses and taking
care with the order in which they're registered).

> > > if malloc interposers want to do something around fork
> > > then libc may need to expose some better api than atfork.
> >
> > Most of them use the existing pthread_atfork if they're intended to be
> > usable with MT-fork. I don't think inventing a new musl-specific API
> > is a solution here.
> > Their pthread_atfork approach already fully works
> > with glibc because glibc *doesn't* make anything but malloc work in
> > the MT-forked child. All the other subsystems mentioned here will
> > deadlock or blow up in the child with glibc, but only with 0.01%
> > probability since it's rare for them to be under hammering at fork
> > time.
>
> yes the libc internal locks are a problem, but
> other atfork handlers are a problem too, one that
> cannot be fixed.
>
> we can make the atfork ordering an application
> responsibility but it feels a bit sketchy (locks
> can be library internals an application does not
> know about), so i think solving the more general
> problem of locking around fork needs some new api;
> pthread_atfork is not good for that.

pthread_atfork is highly flawed, but I don't think this is anywhere
near the biggest flaw with it. Either the application takes
responsibility for the order, or it just ensures that there is no
interaction between the components locked. (If malloc replacement is
one of the things using atfork, that implies that none of the other
atfork handlers take locks that could be held while malloc is called.)

But the issue at hand is not the interaction of multiple atfork
handlers registered by the application, which is already delicate and
something we have no control over. The issue at hand is avoiding a
regression whereby applications that are replacing malloc, and using
an atfork handler to make that "MT-fork-safe", now hang in the parent
on MT-fork, due to fork (not atfork handlers, fork itself) taking
locks on libc subsystems that use malloc. I don't think this is an
appropriate regression to introduce for the sake of making
badly-behaved programs work.

> > One solution you might actually like: getting rid of
> > application-provided-malloc use inside libc.
> > This could be achieved by
> > making malloc a thin wrapper for __libc_malloc or whatever, which
> > could be called by everything in libc that doesn't actually have a
> > contract to return "as-if-by-malloc" memory. Only a few functions like
> > getdelim would be left still calling malloc.
>
> if none of the as-if-by-malloc allocations are
> behind libc internal locks then this sounds good.

I think the only as-if-by-malloc ones are str[n]dup, getline/getdelim,
the scanf family, and open_[w]memstream, and none of these could
reasonably ever hold libc locks (they're not operating on global
state). But there are a lot of other things that could stick with
public malloc use too. glob was mentioned already, and the same
applies to anything else that's "library code" vs a libc singleton.

> (not ideal, since then interposers can't see all
> allocations, which some tools would like to see,
> but at least correct and robust. and it is annoying
> that we have to do all this extra work just because
> of mt-fork)

Yes. On the other hand, if this were done more rigorously it would fix
the valgrind breakage of malloc during ldso startup.

> > The other pros of such an approach are stuff like making it so
> > application code doesn't get called as a callback from messy contexts
> > inside libc, e.g. with the dynamic linker in an inconsistent state.
> > The major con I see is that it precludes omitting the libc malloc
> > entirely when static linking, assuming you link any part of libc that
> > uses malloc internally. However, almost all such places only call
> > malloc, not free, so you'd just get the trivial bump allocator
> > gratuitously linked, rather than full mallocng or oldmalloc, except
> > for dlerror, which shouldn't come up in static-linked programs anyway.
>
> i see.
> that sounds fine to me.

I'm still not sure it's fine, so I appreciate your input and anyone
else's who has spent some time thinking about this.

Rich