From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/13902 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: Asterisk 16 segmentation fault Date: Thu, 28 Feb 2019 16:15:17 -0500 Message-ID: <20190228211517.GC23599@brightrain.aerifal.cx> References: <20190228202751.GB31965@darth.lan> Reply-To: musl@lists.openwall.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="77836"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Mutt/1.5.21 (2010-09-15) Cc: musl@lists.openwall.com To: Sebastian Kemper Original-X-From: musl-return-13918-gllmg-musl=m.gmane.org@lists.openwall.com Thu Feb 28 22:15:38 2019 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.89) (envelope-from ) id 1gzT1l-000K77-PO for gllmg-musl@m.gmane.org; Thu, 28 Feb 2019 22:15:37 +0100 Original-Received: (qmail 15776 invoked by uid 550); 28 Feb 2019 21:15:35 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 15746 invoked from network); 28 Feb 2019 21:15:34 -0000 Content-Disposition: inline In-Reply-To: <20190228202751.GB31965@darth.lan> Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:13902 Archived-At: On Thu, Feb 28, 2019 at 09:27:51PM +0100, Sebastian Kemper wrote: > Hello list, > > I'm trying to update Asterisk 15 to 16 on OpenWrt. It's apparently not > easy %-). > > The first thing I ran into was that now the header astmm.h is included > by asterisk.h. In earlier releases this header was only included when > doing malloc debugging it seems. > > astmm.h redefines some memory allocation functions as far as I > understand. musl doesn't seem to like this very much and the compile > fails. I worked around this with a small patch: > > --- a/include/asterisk/astmm.h > +++ b/include/asterisk/astmm.h > @@ -113,8 +113,21 @@ int __ast_repl_vasprintf(char **strp, co > */ > > #if !defined(ASTMM_LIBC) > +#if defined(__UCLIBC__) || defined(__GLIBC__) > /* BLOCK libc allocators by default. */ > #define ASTMM_LIBC ASTMM_BLOCK > +#else > +/* musl */ > +#define ASTMM_LIBC ASTMM_IGNORE > +#endif > +#endif > + > +#if !defined(__UCLIBC__) && !defined(__GLIBC__) > +#if ASTMM_LIBC == ASTMM_REDIRECT > +/* musl */ > +#undef ASTMM_LIBC > +#define ASTMM_LIBC ASTMM_IGNORE > +#endif > #endif > > #if ASTMM_LIBC == ASTMM_IGNORE > > With this patch applied I can now compile Asterisk 16 against musl. > > But when I tried to run the binary I ran into a segmentation fault. It > starts to initialize - and everything looks good - until it attempts to > load the modules/plugins. > > I connected to the box with a remote gdb and extracted some information. > The segmentation fault seems to happen in main/loader.c > (https://github.com/asterisk/asterisk/blob/16.2/main/loader.c), > specifically in load_dlopen() I think. > > I'm not a programmer so I have problems making sense of this. So I'm > hoping that maybe one of you can shine a light. > > I set a break point in the mentioned file, line 952. Then I stepped > forward. > > (gdb) > > Thread 1 "asterisk" hit Breakpoint 1, load_dlopen (resource_in=0x77d8f52b "D\350\240e", resource_in@entry=0x5fa910 "res_pjproject.so", so_ext=0x0, > so_ext@entry=0x53b91c "", filename=0x5fa910 "res_pjproject.so", filename@entry=0x7fff7a04 "/usr/lib/asterisk/modules/res_pjproject.so", flags=796226418, > flags@entry=258, suppress_logging=suppress_logging@entry=0) at loader.c:952 > 952 if (resource_being_loaded) { > (gdb) > 951 mod->lib = dlopen(filename, flags); > (gdb) > 952 if (resource_being_loaded) { > (gdb) > 955 const char *dlerror_msg = ast_strdupa(dlerror()); > (gdb) > > Thread 1 "asterisk" received signal SIGSEGV, Segmentation fault. > strlen (s=0x0, s@entry=0x48d79d "\t\360\"\223\f\234\200\353\216#\005\032\240z\364e") at src/string/strlen.c:17 > 17 for (w = (const void *)s; !HASZERO(*w); w++); > (gdb) bt > #0 strlen (s=0x0, s@entry=0x48d79d "\t\360\"\223\f\234\200\353\216#\005\032\240z\364e") at src/string/strlen.c:17 > #1 0x0048d5db in load_dlopen (resource_in=0x77d8f52b "D\350\240e", resource_in@entry=0x5fa910 "res_pjproject.so", so_ext=0x0, > so_ext@entry=0x53b91c "", filename=0x5fa910 "res_pjproject.so", filename@entry=0x7fff7a04 "/usr/lib/asterisk/modules/res_pjproject.so", flags=796226418, > flags@entry=258, suppress_logging=suppress_logging@entry=0) at loader.c:955 > #2 0x0048d79d in load_dynamic_module (resource_in=resource_in@entry=0x5fa910 "res_pjproject.so", suppress_logging=suppress_logging@entry=1) at loader.c:1039 > #3 0x0048eea3 in load_resource (resource_name=0x5fa910 "res_pjproject.so", suppress_logging=suppress_logging@entry=1, > module_priorities=module_priorities@entry=0x7fff8c24, required=0, preload=0) at loader.c:1635 > #4 0x0048f5e1 in load_resource_list (mod_count=, load_order=0x7fff8c1c) at loader.c:1984 > #5 load_modules () at loader.c:2232 > #6 0x0042c99d in asterisk_daemon (isroot=, rungroup=, runuser=) at asterisk.c:4146 > #7 main (argc=, argv=) at asterisk.c:3918 > (gdb) > > Any help appreciated! > > Kind regards, > Seb The immediate cause of the crash seems to be passing a null pointer to ast_strdupa, where dlerror() returns a null pointer because dlopen did not fail (they either need to check the return value of dlopen or check the return value of dlerror before using it; preferably the former since dlerror could retain an outdated error from an earlier call). It looks like they expect resource_being_loaded to be 0 if the dlopen succeeds; assuming the intent is that some constructor in the loaded library sets it to 0, this is only valid for the first open. I'm guessing they've introduced some new plugin architecture that's not compatible with dlopen being permanent like it is in musl. The right way to fix this is probably to get rid of the constructors and instead have an explicit init function that their loader finds with dlsym and calls after dlopen succeeds. It looks like they also have support for building the modules builtin to the main program, rather than loading them by dlopen; this may be able to work around the problem. I think they have a lot of bugs and misunderstandings in this area. For instance, this commit is completely bogus: https://github.com/asterisk/asterisk/commit/23aa20bf209dbc02b01c757c5f5f1a787548ed96 I don't see where the behavior you're seeing was introduced, though. Rich