From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/12752 Path: news.gmane.org!.POSTED!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: What's wrong with musl's malloc Date: Sat, 21 Apr 2018 01:28:12 -0400 Message-ID: <20180421052812.GU3094@brightrain.aerifal.cx> References: <20180420200904.GS3094@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: blaine.gmane.org 1524288384 28822 195.159.176.226 (21 Apr 2018 05:26:24 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Sat, 21 Apr 2018 05:26:24 +0000 (UTC) User-Agent: Mutt/1.5.21 (2010-09-15) To: musl@lists.openwall.com Original-X-From: musl-return-12768-gllmg-musl=m.gmane.org@lists.openwall.com Sat Apr 21 07:26:20 2018 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from ) id 1f9l2R-0007PL-0p for gllmg-musl@m.gmane.org; Sat, 21 Apr 2018 07:26:19 +0200 Original-Received: (qmail 23939 invoked by uid 550); 21 Apr 2018 05:28:25 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 23858 invoked from network); 21 Apr 2018 05:28:24 -0000 Content-Disposition: inline In-Reply-To: <20180420200904.GS3094@brightrain.aerifal.cx> Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:12752 Archived-At: On Fri, Apr 20, 2018 at 04:09:04PM -0400, Rich Felker wrote: > I've talked on and off about musl's malloc needing to be overhauled or > replaced, and gaining the ability to experiment with that is one of > the motivations for making malloc interposable/replacable. Here's a > brief write-up of what's wrong that needs to be addressed: > > > The main benefits of musl's malloc vs the standard dlmalloc algorithms > it's based on is the fine-grained locking. As long as there are binned > free chunks of various sizes available, threads calling malloc will > only contend for a lock when they're requesting allocations of same or > similar size. This works out well under artificial random loads; I'm > not sure how much it helps on typical real-world loads. > > > In any case, the fine-grained locking also created a problem I didn't > anticipate when I designed it: when allocating memory that involves > splitting a free chunk, or when freeing memory that will get merged > with an adjacent free chunk, the operation as a whole is not atomic; > rather, a large free chunk is consumed, possibly emptying the bin it > lies in, split or merged, then returned either to the same or a > different bin. By saying this operation is non-atomic, I mean other > threads see the intermediate state where the large free chunk has been > consumed but not yet returned, and when this happens, they > unnecessarily split another free chunk or expand the heap rather than > waiting for it to be returned. This yields bad, potentially > catastrophic, fragmentation. > > The malloc side already partially mitigates this with the "pretrim" > function, which is used whenever the leftover part after splitting > would remain in the same bin it was already in. In particular his > covers splitting small allocations off a large "top of heap" zone, but > it doesn't help with splitting small chunks. And the free side is much > worse -- large free zones momentarily disappear every time something > adjacent to them is freed. > > In order to overcome this fully while maintaining the fine-grained > locking, we'd need the ability to hold multiple locks concurrently, > which would require a protocol for lock acquisition order to avoid > deadlock. But there may be cheap mitigations that could be made in > free like on the malloc side. For instance it might be possible to > make a simpler protocol whereby we could always hold the lock for bin > 63, thereby avoiding exposing a state where large free zones are > unavailable. If there's a way to make this work, moving the binmap > masking for newly-emptied bins from unbin() to unlock_bin() would > ensure that malloc doesn't "see" a "temporary empty" state. One protocol that might work: When locking multiple bins, they must be locked in decreasing order. For malloc, this is the natural thing to do -- the bin you return the excess to will be the same or smaller than the bin you allocate from. For free, take the freelock from the beginning, rather than just momentarily at the end. Then you know the bin sizes you'll be working with (up to at most 3 -- self, prev, and next) and can lock them in the protocol order, merge them, and bin the result without any looping. The MADV_DONTNEED operation (the most expensive and contention-generating part when it happens) can be deferred until after the freelock and all but one of the bin locks are released. Overall I like this because it creates an opportunity to simplify the code. My guess is that performance would be roughly the same, or at least not significantly worse (and probably significantly better for single-threaded use), and the chances at catastrophic fragmentation from contention would be gone. I still think a complete redesign is the right way forward in the long term. Rich