From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: What's wrong with musl's malloc
Date: Sun, 22 Apr 2018 22:09:33 -0400
Message-ID: <20180423020933.GY3094@brightrain.aerifal.cx>
In-Reply-To: <0EC2E771-368F-454E-B08D-7D6A5321F598@mac.com>

On Mon, Apr 23, 2018 at 09:40:08AM +1200, Michael Clark wrote:
> 
> On 21/04/2018, at 8:09 AM, Rich Felker wrote:
> > 
> > I've talked on and off about musl's malloc needing to be overhauled
> > or replaced, and gaining the ability to experiment with that is one
> > of the motivations for making malloc interposable/replaceable.
> > Here's a brief write-up of what's wrong that needs to be addressed:
> > 
> > The main benefit of musl's malloc vs the standard dlmalloc
> > algorithms it's based on is the fine-grained locking. As long as
> > there are binned free chunks of various sizes available, threads
> > calling malloc will only contend for a lock when they're requesting
> > allocations of the same or similar size. This works out well under
> > artificial random loads; I'm not sure how much it helps on typical
> > real-world loads.
> > 
> > In any case, the fine-grained locking also created a problem I
> > didn't anticipate when I designed it: when allocating memory that
> > involves splitting a free chunk, or when freeing memory that will
> > get merged with an adjacent free chunk, the operation as a whole is
> > not atomic; rather, a large free chunk is consumed, possibly
> > emptying the bin it lies in, split or merged, then returned either
> > to the same or a different bin. By saying this operation is
> > non-atomic, I mean other threads see the intermediate state where
> > the large free chunk has been consumed but not yet returned, and
> > when this happens, they unnecessarily split another free chunk or
> > expand the heap rather than waiting for it to be returned. This
> > yields bad, potentially catastrophic, fragmentation.
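
To make that window concrete: the splitting path does roughly the
equivalent of the sketch below. This is not the actual malloc.c code --
the structures, helper names, and pthread locks are all simplified
stand-ins -- but the locking sequence is the point:

#include <pthread.h>
#include <stddef.h>

/* Toy model: 64 bins of free chunks by size class, each with its own
 * lock. Assume bins[] is initialized and populated elsewhere;
 * alignment and error handling are ignored. */
struct chunk { size_t size; struct chunk *next; };

static struct bin {
	pthread_mutex_t lock;
	struct chunk *head;           /* NULL means "this bin is empty" */
} bins[64];

static int bin_index(size_t n)        /* toy size-class mapping */
{
	int i = 0;
	while (n >>= 1) i++;
	return i < 63 ? i : 63;
}

static struct chunk *pop_chunk(struct bin *b)
{
	struct chunk *c = b->head;
	if (c) b->head = c->next;
	return c;
}

static void push_chunk(struct bin *b, struct chunk *c)
{
	c->next = b->head;
	b->head = c;
}

/* Carve n bytes off the front of c; return the leftover as a chunk. */
static struct chunk *split_chunk(struct chunk *c, size_t n)
{
	struct chunk *rest = (struct chunk *)((char *)(c + 1) + n);
	rest->size = c->size - n - sizeof *rest;
	c->size = n;
	return rest;
}

static void *alloc_by_split(size_t n)
{
	int i = bin_index(n);         /* assume bin i holds a large chunk */
	pthread_mutex_lock(&bins[i].lock);
	struct chunk *c = pop_chunk(&bins[i]);  /* bin i may now look empty */
	pthread_mutex_unlock(&bins[i].lock);

	/* Window: the large chunk is invisible to every other thread.
	 * A concurrent malloc scanning the bins finds nothing here, so
	 * it splits some other large chunk or expands the heap instead
	 * of waiting for the leftover to come back. */

	struct chunk *rest = split_chunk(c, n);
	int j = bin_index(rest->size);
	pthread_mutex_lock(&bins[j].lock);
	push_chunk(&bins[j], rest);   /* leftover only reappears here */
	pthread_mutex_unlock(&bins[j].lock);

	return c + 1;
}

The "pretrim" logic described below is the existing partial mitigation
for the malloc side of exactly this window.
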
> > The malloc side already partially mitigates this with the "pretrim"
> > function, which is used whenever the leftover part after splitting
> > would remain in the same bin it was already in. In particular this
> > covers splitting small allocations off a large "top of heap" zone,
> > but it doesn't help with splitting small chunks. And the free side
> > is much worse -- large free zones momentarily disappear every time
> > something adjacent to them is freed.
> > 
> > In order to overcome this fully while maintaining the fine-grained
> > locking, we'd need the ability to hold multiple locks concurrently,
> > which would require a protocol for lock acquisition order to avoid
> > deadlock. But there may be cheap mitigations that could be made in
> > free, like the one on the malloc side. For instance, it might be
> > possible to make a simpler protocol whereby we could always hold
> > the lock for bin 63, thereby avoiding exposing a state where large
> > free zones are unavailable. If there's a way to make this work,
> > moving the binmap masking for newly-emptied bins from unbin() to
> > unlock_bin() would ensure that malloc doesn't "see" a "temporary
> > empty" state.
> > 
> > In the long term, I think the whole malloc design we're using now
> > is wrong and should be replaced, but until the time comes to
> > actually do that, it may make sense to try to improve some of the
> > worst properties, or even to reduce or eliminate the granularity of
> > the locking if it proves not to be very valuable on real-world
> > loads.
> 
> Hi Rich,
> 
> Have you considered importing jemalloc?

No, and that's not happening. It's got serious bloat problems, problems
with undermining ASLR, and it's optimized pretty much only for being as
fast as possible without caring how much memory you use. That's nothing
like the goals of musl. With the recent changes to allow interposing
malloc, projects that want to use it now can, but it definitely
shouldn't be imposed on the vast majority of programs, which would only
be hurt (grow much bigger in memory) by using it.

Aside from that, importing nontrivial code is pretty much off the table
-- every time we've done it, it's been almost entirely a mistake (TRE
being the worst), since it ends up being lots of code that nobody
understands, with bugs and UB and often vulns. Just this release cycle,
blatantly nonsensical special cases in the OpenBSD complex math code
bit us.

> It’s been battle-tested for 12 years in FreeBSD and in many other
> apps. Given that it’s bundled and used in MariaDB, which is
> undoubtedly a demanding user, and that presumably Monty approved the
> choice of allocator, it shouldn’t be a bad choice. I’d trust the
> allocator choice of a well-respected database architect.
> 
> From looking at the Lockless, Inc benchmarks jemalloc has good

These benchmarks were specifically tuned for thresholds that make
glibc's ptmalloc look bad. They're not a credible source. While
Lockless Inc has some nice documentation on multithreading topics, I'm
a lot more skeptical of their product and of their performance claims
about other products...

> performance from very small to very large allocations. From the
> Lockless Inc memory allocator comparisons: “The disadvantage of the
> jemalloc allocator is its memory usage. It uses power-of-two sized
> bins, which leads to a greatly increased memory footprint compared
> to other allocators. This can affect real-world performance due to
> excess cache and TLB misses.”

Yes, see above.

Rich
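
P.S. To make the unbin()/unlock_bin() idea above a little more
concrete, the sketch below shows the kind of change I mean. It's
written against a simplified model of the bins and binmap -- pthread
mutexes and C11 atomics standing in for the real lock and atomic
primitives, lock_bin() and the rest elided -- not against the actual
malloc.c, and it's untested:

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-ins for the real structures: bit i of binmap is
 * set when bin i is (believed to be) nonempty; each bin is a circular
 * doubly-linked list hung off a sentinel node. */
struct chunk { struct chunk *next, *prev; };

static struct {
	pthread_mutex_t lock;
	struct chunk head;            /* list sentinel */
} bins[64];

static _Atomic uint64_t binmap;

static int bin_nonempty(int i)
{
	return bins[i].head.next && bins[i].head.next != &bins[i].head;
}

/* Status quo (roughly): the binmap bit for a newly-emptied bin is
 * cleared inside unbin(), at the moment the last chunk leaves the
 * bin, even if a leftover chunk is about to be put straight back. */
static void unbin_now(struct chunk *c, int i)
{
	c->prev->next = c->next;
	c->next->prev = c->prev;
	if (!bin_nonempty(i))
		atomic_fetch_and(&binmap, ~(1ULL << i));
}

/* Proposed: unbin() leaves binmap alone... */
static void unbin_deferred(struct chunk *c, int i)
{
	c->prev->next = c->next;
	c->next->prev = c->prev;
}

/* ...and the bit is cleared only when the bin lock is released, and
 * only if the bin is still empty at that point. If pretrim or the
 * free path put a chunk back before unlocking, other threads never
 * saw the bin advertised as empty. */
static void unlock_bin(int i)
{
	if (!bin_nonempty(i))
		atomic_fetch_and(&binmap, ~(1ULL << i));
	pthread_mutex_unlock(&bins[i].lock);
}

The point is just that the "this bin is nonempty" bit is withdrawn only
once the critical section ends, so a malloc scanning binmap never acts
on the transient empty state in between.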