From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: GNU Emacs LD_PRELOAD build hack
Date: Thu, 5 Feb 2015 00:22:59 -0500
Message-ID: <20150205052259.GI23507@brightrain.aerifal.cx>
References: <20150203035407.GA14795@brightrain.aerifal.cx>
In-Reply-To: <20150203035407.GA14795@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="NDin8bjvE/0mNLFQ"
User-Agent: Mutt/1.5.21 (2010-09-15)

--NDin8bjvE/0mNLFQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Feb 02, 2015 at 10:54:07PM -0500, Rich Felker wrote:
> The right solution: As I discussed on the emacs-devel list nearly a
> year ago, the right solution is to get rid of the non-portable code in
> emacs, dumping the lisp heap and its data (rather than the whole
> program) to a file and either mmapping it at runtime (and possibly
> relocating pointers in it, if the new location it's loaded at differs)
> or converting it to a C source file that's compiled and linked and for
> which the (static or dynamic) linker can perform relocations at
> link/load time. This solution also solves a number of other serious
> issues related to the dumper, including its incompatibility with PIE
> binaries.

Apparently, since the discussion last year, the emacs folks went
forward with one of their proposed fixes -- not the best possible one,
but a good one nonetheless. These changes aren't in any release, and I
expect they won't be for quite a while, but using emacs git master I
was able to build successfully with the attached patch and no hacks.

> So, here's an alternate solution.
>
> The hack: The basic trick is that we need to satisfy emacs assumptions
> about malloc, but only at build (dumping) time, not permanently. My
> first thought was to build emacs in the presence of a modified musl
> libc.so whose malloc never uses mmap (issue 1) and never reclaims gaps
> at the edge of writable mappings (issue 2), but then I realized we
> could achieve the same thing without having to build a custom libc.so
> at package-build time by exploiting LD_PRELOAD.

Unfortunately, emacs was making another invalid assumption that I
missed, and it only came up when I tried to build on 64-bit: under
some conditions, it actually passes the pre-dump objects obtained from
malloc to the post-dump realloc/free functions. This results in
horrible heap-structure corruption, and I have no idea how or why it
works on glibc, since it should break there too.

Anyway, discussing this on emacs-devel led to the much better solution
using git master, but I do also have a fix based on the previously
reported LD_PRELOAD hack; it just depends on patching emacs not to
pass these pre-dump pointers to realloc/free. I'm attaching that patch
too in case anyone is interested.

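The property the LD_PRELOAD hack needs from malloc during the dump
(never use mmap, never reclaim gaps at the edge of writable mappings)
can be illustrated with a trivial allocator that only ever grows the
brk. This is only a rough sketch under stated assumptions, not the
preload library from the earlier message: the names dump_malloc and
dump_free and the 16-byte alignment are hypothetical, and a real shim
would have to export malloc, free, calloc and realloc instead.

```c
#define _DEFAULT_SOURCE   /* makes sbrk() visible on glibc and musl */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical alignment for every returned block. */
#define DUMP_ALIGN 16

/* Hypothetical build-time malloc: every block comes from sbrk(), so all
   allocations are contiguous with the data segment and nothing is ever
   placed in an mmap region. */
void *dump_malloc(size_t n)
{
    if (n == 0)
        n = 1;
    if (n > (size_t)PTRDIFF_MAX - 2 * DUMP_ALIGN)
        return NULL;

    /* Round the request up to a multiple of the alignment. */
    size_t need = (n + DUMP_ALIGN - 1) & ~(size_t)(DUMP_ALIGN - 1);

    /* Pad so the returned pointer is aligned even if the break is not. */
    uintptr_t cur = (uintptr_t)sbrk(0);
    size_t pad = (DUMP_ALIGN - cur % DUMP_ALIGN) % DUMP_ALIGN;

    void *base = sbrk((intptr_t)(pad + need));
    if (base == (void *)-1)
        return NULL;
    return (char *)base + pad;
}

/* Nothing is ever recycled, so free is a deliberate no-op. */
void dump_free(void *p)
{
    (void)p;
}
```

The sketch wastes memory on purpose: it is only meant to be active
while the dump runs, which is why preloading it at build time rather
than linking it in permanently makes sense.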
Attached files:

emacs_alloc_invalid_frees.diff is the patch to supplement the
LD_PRELOAD hack on emacs-24.x.

emacs-master-musl.diff is the (lazy) patch for emacs git master
(presently commit 4188e3cc2bc69e75d4387b369e72e89fecc46a86) to make it
build on musl. It's not acceptable for upstream at this time because
the changes made are mostly unconditional. If anyone is willing to put
this into a form where it could be submitted upstream, I would very
much appreciate it; otherwise, distros packaging emacs can just use
the patch as-is or with minor changes.

Rich

--NDin8bjvE/0mNLFQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="emacs-master-musl.diff"

--- emacs-4188e3cc2bc69e75d4387b369e72e89fecc46a86/configure.ac
+++ emacs/configure.ac
@@ -2092,7 +2092,7 @@

 system_malloc=$emacs_cv_sanitize_address

-hybrid_malloc=
+hybrid_malloc=yes

 case "$opsys" in
   ## darwin ld insists on the use of malloc routines in the System framework.
--- emacs-4188e3cc2bc69e75d4387b369e72e89fecc46a86/src/Makefile.in
+++ emacs/src/Makefile.in
@@ -373,6 +373,7 @@
 	region-cache.o sound.o atimer.o \
 	doprnt.o intervals.o textprop.o composite.o xml.o $(NOTIFY_OBJ) \
 	profiler.o decompress.o \
+	sheap.o \
 	$(MSDOS_OBJ) $(MSDOS_X_OBJ) $(NS_OBJ) $(CYGWIN_OBJ) $(FONT_OBJ) \
 	$(W32_OBJ) $(WINDOW_SYSTEM_OBJ) $(XGSELOBJ)
 obj = $(base_obj) $(NS_OBJC_OBJ)
--- emacs-4188e3cc2bc69e75d4387b369e72e89fecc46a86/src/gmalloc.c
+++ emacs/src/gmalloc.c
@@ -72,7 +72,7 @@
 #define free gfree
 #endif /* HYBRID_MALLOC */

-#ifdef CYGWIN
+//#ifdef CYGWIN
 extern void *bss_sbrk (ptrdiff_t size);
 extern int bss_sbrk_did_unexec;
 extern char bss_sbrk_buffer[];
@@ -80,7 +80,7 @@
 #define DUMPED bss_sbrk_did_unexec
 #define ALLOCATED_BEFORE_DUMPING(P) \
   ((P) < bss_sbrk_buffer_end && (P) >= (void *) bss_sbrk_buffer)
-#endif
+//#endif

 #ifdef __cplusplus
 extern "C"
@@ -1525,16 +1525,19 @@
 __default_morecore (ptrdiff_t increment)
 {
   void *result;
-#if defined (CYGWIN)
+//#if defined (CYGWIN)
   if (!DUMPED)
     {
       return bss_sbrk (increment);
     }
-#endif
+//#endif
+#if 0
   result = (void *) __sbrk (increment);
   if (result == (void *) -1)
     return NULL;
   return result;
+#endif
+  return NULL;
 }

 /* Copyright (C) 1991, 92, 93, 94, 95, 96 Free Software Foundation, Inc.
--- emacs-4188e3cc2bc69e75d4387b369e72e89fecc46a86/src/print.c
+++ emacs/src/print.c
@@ -755,7 +755,7 @@
   print_output_debug_flag = x;
 }

-#if defined (GNU_LINUX)
+#if defined (GNU_LINUX) && defined (__GLIBC__)

 /* This functionality is not vitally important in general, so we rely on
    non-portable ability to use stderr as lvalue. */
--- emacs-4188e3cc2bc69e75d4387b369e72e89fecc46a86/src/unexelf.c
+++ emacs/src/unexelf.c
@@ -632,6 +632,9 @@
   off_t new_file_size;
   void *new_break;

+  extern int bss_sbrk_did_unexec;
+  bss_sbrk_did_unexec = 1;
+
   /* Pointers to the base of the image of the two files. */
   caddr_t old_base, new_base;


--NDin8bjvE/0mNLFQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="emacs_alloc_invalid_frees.diff"

--- emacs-24.3.orig/src/alloc.c
+++ emacs-24.3/src/alloc.c
@@ -47,6 +47,13 @@

 #include

+static void *initial_brk;
+__attribute__((__constructor__))
+static void init()
+{
+  initial_brk = sbrk(0);
+}
+
 /* GC_CHECK_MARKED_OBJECTS means do sanity checks on allocated objects.
    Doable only if GC_MARK_STACK. */
 #if ! GC_MARK_STACK
@@ -699,6 +706,14 @@
 {
   void *val;

+  if (block && block < initial_brk) {
+    size_t len = (char *)initial_brk - (char *)block;
+    if (len > size) len = size;
+    void *p = xmalloc(size);
+    memcpy(p, block, len);
+    return p;
+  }
+
   MALLOC_BLOCK_INPUT;
   /* We must call malloc explicitly when BLOCK is 0, since some
      reallocs don't do this. */
@@ -720,6 +735,7 @@
 void
 xfree (void *block)
 {
+  if (block < initial_brk) return;
   if (!block)
     return;
   MALLOC_BLOCK_INPUT;
@@ -910,6 +926,7 @@
 static void
 lisp_free (void *block)
 {
+  if (block < initial_brk) return;
   MALLOC_BLOCK_INPUT;
   free (block);
 #if GC_MARK_STACK && !defined GC_MALLOC_CHECK
@@ -1117,6 +1134,8 @@
 {
   struct ablock *ablock = block;
   struct ablocks *abase = ABLOCK_ABASE (ablock);
+
+  if (block < initial_brk) return;

   MALLOC_BLOCK_INPUT;
 #if GC_MARK_STACK && !defined GC_MALLOC_CHECK

--NDin8bjvE/0mNLFQ--
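
The DUMPED and ALLOCATED_BEFORE_DUMPING macros that
emacs-master-musl.diff enables unconditionally in gmalloc.c suggest
the general shape of the hybrid scheme: allocate from a static buffer
inside the binary until the dump, switch to the system malloc
afterwards, and keep pre-dump pointers away from the system
realloc/free. The following is a self-contained sketch of that idea
under stated assumptions, not the Emacs implementation: static_heap,
mark_dumped, from_static_heap and the hybrid_* functions are all
hypothetical names, and the buffer size and alignment are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer that lives in the dumped image. */
static _Alignas(16) char static_heap[1 << 16];
static size_t static_used;   /* bytes handed out so far */
static int dumped;           /* set once the dump has happened */

void mark_dumped(void) { dumped = 1; }

/* Range check playing the role of ALLOCATED_BEFORE_DUMPING. */
int from_static_heap(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)static_heap;
    return a >= lo && a < lo + sizeof static_heap;
}

void *hybrid_malloc(size_t n)
{
    if (dumped)
        return malloc(n);
    if (n == 0)
        n = 1;
    size_t need = (n + 15) & ~(size_t)15;
    if (need < n || need > sizeof static_heap - static_used)
        return NULL;
    void *p = static_heap + static_used;
    static_used += need;
    return p;
}

void hybrid_free(void *p)
{
    /* Pre-dump blocks are never passed to the system free. */
    if (p && !from_static_heap(p))
        free(p);
}

void *hybrid_realloc(void *p, size_t n)
{
    if (!p)
        return hybrid_malloc(n);
    if (!from_static_heap(p))
        return realloc(p, n);
    /* The old size is unknown, so copy at most up to the end of the
       region handed out so far, as the xrealloc change in
       emacs_alloc_invalid_frees.diff does with initial_brk. */
    size_t avail = (size_t)((static_heap + static_used) - (char *)p);
    void *q = hybrid_malloc(n);
    if (q)
        memcpy(q, p, n < avail ? n : avail);
    return q;
}
```

Copying without knowing the old size can read past the end of the
original object, but it stays inside the static buffer, which is the
same trade-off the alloc.c patch makes.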