From: Rich Felker
To: musl@lists.openwall.com
Subject: New malloc - intro, motivation & design goals document
Date: Tue, 22 Oct 2019 13:40:51 -0400
Message-ID: <20191022174051.GA24726@brightrain.aerifal.cx>

Alongside time64 and other enhancements, over the next few months I'll
be working on a major project to replace musl's malloc, which has known
fundamental problems that can't really be fixed without significant
redesign.

The early stages of this, in progress now, involve preparing tests that
can identify pitfalls leading to pathological fragmentation, some of
which may be avoidable and others of which may be fundamental, for use
in assessing possible allocation strategies and other existing malloc
implementations.

I'm attaching a document describing motivations and goals for the
project, and a dead simple minimal test demonstrating the unbounded
heap expansion problem that's the core motivation. The latter produces
no output but can be watched via top(1) or similar.

I'll follow up soon with some thoughts/findings so far on designs that
might work for us, or that don't work but have interesting properties
suggesting they may lead to workable designs.

I'm not going to try to do development of this inside the musl tree;
getting the ability to experiment with and do this out-of-tree, then
integrate it later, was one of the motivations for the malloc
replacement support introduced in musl 1.1.20.

[attachment 1: newmalloc.txt]

musl libc next-gen malloc implementation

*** Motivation and Design Goals

Desirable qualities of existing allocator:

- Very small code size.

- Near-zero constant minimal overhead, preventing large numbers of
  small processes with only minor use of malloc from consuming excess
  resources.

- Low, well-understood internal fragmentation.

- Returning of (physical storage for) large contiguous freed memory to
  the operating system.

- Detection of heap-based overflows (with limitations) at realloc/free
  time via clobbered headers/footers.

- Detection of double-free (with inherent limitations) via chunk
  header/flags. (The sketch after this list illustrates how these last
  two checks work.)
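The following is a minimal sketch of how in-band header/footer checks
of this general kind operate. It is not musl's actual chunk layout;
the struct, field names, and flag value are hypothetical, included
here only to illustrate the mechanism.

#include <stdlib.h>

/* Hypothetical inline chunk metadata: each chunk's size is stored in
   its own header (csize, low bit doubling as an in-use flag) and
   mirrored in the next chunk's psize field, which serves as this
   chunk's footer. */
struct chunk {
	size_t psize;  /* copy of the previous chunk's csize */
	size_t csize;  /* this chunk's size; low bit set while in use */
	char data[];   /* user allocation starts here */
};

#define C_INUSE ((size_t)1)

static void check_at_free(struct chunk *c)
{
	struct chunk *next = (void *)((char *)c + (c->csize & ~C_INUSE));

	/* Double free: the in-use flag was already cleared by an
	   earlier free. */
	if (!(c->csize & C_INUSE)) abort();

	/* Heap overflow: writing past the end of data[] clobbers the
	   next chunk's mirrored size, so header and footer disagree. */
	if (next->psize != c->csize) abort();
}

Checks like these only catch blind clobbering: an attacker who can
overwrite the footer with a consistent size value passes them, which
is why the hardening goal below calls for out-of-band metadata and
random secrets.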
Known flaws in existing allocator:

- Race condition (not a data race, a distinct concept) in determining
  the availability of a free chunk of the desired size while
  splitting/merging is in progress leads to external fragmentation
  (unnecessary splitting of larger chunks) and over-expansion of the
  heap (via failure to see a large free zone in bin 63).

- Reliance on fine-grained per-bin locking to achieve good performance
  on typical multithreaded loads (not even all of them; it doesn't
  help when all threads want the same allocation size) fundamentally
  precludes fixing the above race without significant performance
  regressions.

- The overall dlmalloc design (arbitrary split/merge) inherently has
  usage patterns that can produce catastrophic external fragmentation.

- The heuristic for returning dirty pages to the kernel (via
  MADV_DONTNEED) is over-aggressive in some cases (bad performance)
  and under-aggressive in others (failure to return memory), and it's
  not clear that it can be significantly improved in a dlmalloc-like
  design.

Together, these necessitate not just enhancements to the existing
allocator, but a completely different design.

Goals for new allocator:

Primarily, the new allocator should eliminate the above known flaws
while preserving the good properties of the existing allocator to the
maximum extent that's possible/practical. In addition, it should:

- Harden protections against heap-based overflow and double-free so
  that they are not susceptible to attacks that blindly replace
  clobbered headers/footers with valid data. This likely involves
  out-of-band metadata and random secrets, but may be achieved in
  whatever way is determined to be most effective and compatible with
  the other goals.

- Provide either strong rigorous bounds on external fragmentation, or
  heuristically mitigate the chance of reasonable usage patterns
  leading to unbounded external fragmentation.

- Expand or restructure the heap only when no free chunks that satisfy
  an allocation request (without significant internal fragmentation)
  exist.

- Take advantage of realloc as an opportunity to defragment.

- Improve performance, both under reasonable contention and under
  single-threaded load (with or without the existence of other
  threads). This likely amounts to short fast-case code paths and
  designing synchronization around minimizing the number of
  atomic/lock operations.

[attachment 2: expand_race.c]

#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

/* Each worker repeatedly allocates and frees a ~100 kB block,
   exercising the allocator's split/merge paths concurrently. */
static void *worker(void *unused)
{
	for (;;) {
		void *p = malloc(100000);
		free(p);
	}
}

int main()
{
	/* Start ten workers; the thread handles are not needed. */
	for (int i=0; i<10; i++)
		pthread_create(&(pthread_t){0}, 0, worker, 0);
	for (;;) pause();
}
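A typical way to build and run the test, assuming a POSIX toolchain
(the exact invocation below is illustrative, not from the original
message):

cc -o expand_race expand_race.c -lpthread

Left running, the program prints nothing, but its memory use can be
watched growing in top(1) or similar as the allocator's availability
race drives unbounded heap expansion.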