Background:

GNU Emacs' build process depends on the ability of the build-stage binary (temacs) to "dump" itself to a new executable file containing preloaded lisp objects/state in its .data segment. This process is highly non-portable even in principle; in practice, the big issue is where malloc allocations end up. They need to all be contiguous just above the .data/.bss in the original binary so that they can become part of the .data mapping. Against musl's malloc, this has two major ways it can fail:

1. musl uses mmap for large allocations (roughly, > 128-256k) and has no mechanism for obtaining such large objects from the main brk-based heap or even requesting such (whereas glibc has mallopt and/or an environment variable to control the mmap threshold, and emacs cheats and uses that to control glibc).

2. musl reclaims the gaps around the edges of writable mappings in the main program and shared libraries and uses them for malloc. If these are in shared libraries, they won't be dumped at all, and if they're in the main program, they actually overlap with .text on disk (the same page is mapped twice; this is the cause of the gaps) and thus the .text, not the heap data, gets written out to disk by the dumper.

Emacs provides its own malloc replacement and tries to use it by default, but this has to be disabled with musl, since replacing malloc in dynamic programs doesn't work (and static binaries don't work right at all with emacs' dumper because libc state would get included in the dump -- state which is "intentionally lost" when it resides in a shared library whose state isn't dumped).
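To make the contiguity requirement concrete, here is a minimal sketch of a check for whether a pointer lies on the brk-based "main heap". The helper name on_brk_heap and its parameters are hypothetical, chosen only for illustration; nothing like it is claimed to exist in emacs or musl. It assumes a Linux-like system where sbrk(0) reports the current program break.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not from emacs or musl): returns nonzero if p lies
   in [initial_brk, current break), i.e. in memory obtained by growing the
   break since initial_brk was recorded. */
static int on_brk_heap(const void *p, const void *initial_brk)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)initial_brk;
    /* sbrk(0) only queries the break; it does not move it. */
    uintptr_t hi = (uintptr_t)sbrk(0);

    assert(hi != (uintptr_t)-1);   /* sbrk returns (void *)-1 on failure */
    return addr >= lo && addr < hi;
}
```

Recording sbrk(0) early in startup and then passing a few malloc results of different sizes through this check should show which allocations the dumper could pick up. Per the two issues above, large (mmap-backed) allocations and those placed in reclaimed gaps would be expected to fail it.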
The right solution:

As I discussed on the emacs-devel list nearly a year ago, the right solution is to get rid of the non-portable code in emacs, dumping the lisp heap and its data (rather than the whole program) to a file and either mmapping it at runtime (and possibly relocating pointers in it, if the new location it's loaded at differs) or converting it to a C source file that's compiled and linked and for which the (static or dynamic) linker can perform relocations at link/load time. This solution also solves a number of other serious issues related to the dumper, including its incompatibility with PIE binaries.

Unfortunately, the right solution requires a significant overhaul by someone with expertise in emacs internals, and it's not practical in the short term. Meanwhile, we have users wanting emacs on musl-based distros (myself included). So, here's an alternate solution.

The hack:

The basic trick is that we need to satisfy emacs' assumptions about malloc, but only at build (dumping) time, not permanently. My first thought was to build emacs in the presence of a modified musl libc.so whose malloc never uses mmap (issue 1) and never reclaims gaps at the edge of writable mappings (issue 2), but then I realized we could achieve the same thing without having to build a custom libc.so at package-build time by exploiting LD_PRELOAD.

The attached file is my current draft of the LD_PRELOAD module to be loaded when running temacs to dump. In short, what it does is:

- Throws away (wastes/leaks) the result and retries whenever it gets a result from malloc that's not between the initial value of the brk and the current (after malloc) value of the brk, i.e. anything not on the "main heap" that's contiguous with .bss/.data.
- For large allocations that would be serviced by mmap, and for which musl's malloc won't/can't allocate from the "main heap", allocates 64k at a time, many times, from the heap, and exploits knowledge of the malloc chunk header/footer structures to paste them together to make one large chunk. (If the wrapper can't get contiguous chunks for this, then malloc simply reports failure.)

The first part of the hack is simple and clean. The second part is hideously ugly, but the key point to realize is that it's only making an assumption about the library implementation used at build time, not when the emacs binary is later run. The dumped emacs does not include any code from the LD_PRELOAD hack and it does not depend on the assumptions made in the hack still being valid for the libc.so that's used at runtime. If these assumptions do become invalidated (unlikely, but possible), then all that's needed to get emacs building again is updating the hack (or just building with an outdated libc.so). With any luck, the non-portable dumping in emacs will be fixed long before this is needed, anyway.

Over the next few days I hope to be working with people doing Alpine Linux (and/or other distros) packaging to get this turned into a clean, reproducible build procedure for GNU-Emacs-on-musl. In the meantime, the source for the hack is attached in case anyone wants to start hacking on it.

Rich
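
For anyone reading without the attachment handy, the first part of the hack (discard and retry) can be sketched roughly as below. This is an illustration only and not the attached module: the function name malloc_on_main_heap, the heap_start parameter, and the max_tries bound are all hypothetical. It also makes no attempt at the second part (pasting 64k chunks together), which depends on musl's malloc internals.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical sketch, not the attached module: call malloc until the
   result lies in [heap_start, current break). heap_start is assumed to be
   the break value recorded at startup; max_tries is an illustrative bound
   so the loop always terminates. */
static void *malloc_on_main_heap(size_t n, const void *heap_start, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        void *p = malloc(n);
        if (!p)
            return NULL;            /* genuine out-of-memory */

        uintptr_t addr = (uintptr_t)p;
        /* Re-read the break after malloc, since malloc may have moved it. */
        uintptr_t end = (uintptr_t)sbrk(0);
        if (addr >= (uintptr_t)heap_start && addr < end)
            return p;

        /* Off the main heap: leak p on purpose. Freeing it would just let
           malloc hand the same unusable memory back on the next try. */
    }
    return NULL;
}
```

In an actual preload module the wrapper would have to be exported under the name malloc and reach the underlying allocator some other way; that plumbing is left out here.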