From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/14787 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Patrick Oppenlander Newsgroups: gmane.linux.lib.musl.general Subject: Re: armv7-m segv when throwing c++ exception Date: Thu, 3 Oct 2019 08:34:10 +1000 Message-ID: References: <20191002092233.GB7832@port70.net> <20191002114045.GI16318@brightrain.aerifal.cx> <20191002114955.GJ16318@brightrain.aerifal.cx> <20191002122543.GK16318@brightrain.aerifal.cx> <20191002124912.GL16318@brightrain.aerifal.cx> <20191002145555.GM16318@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="67126"; mail-complaints-to="usenet@blaine.gmane.org" To: musl@lists.openwall.com Original-X-From: musl-return-14803-gllmg-musl=m.gmane.org@lists.openwall.com Thu Oct 03 00:34:38 2019 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.89) (envelope-from ) id 1iFnCe-000HLJ-Af for gllmg-musl@m.gmane.org; Thu, 03 Oct 2019 00:34:36 +0200 Original-Received: (qmail 26293 invoked by uid 550); 2 Oct 2019 22:34:33 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 26275 invoked from network); 2 Oct 2019 22:34:33 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=aYuEn7a79oZhlh8syrxvAo//MxyczzESjkoXLYuBv2g=; b=YSG+9OfqWK+o06zI7JTvZE0hOD9gHKK5n1Y4sT1sFk8dFRZTBfzqkZQZH3R1W5xO8X L8Ahf6rcJ9P2THThk+Td0YeG2ctOsMvgONWanDYxtoB6QURDYYVmB/XdzOy4+g3dakHa pIkQS+7jhDZEIKmErmomrh/NSYLJZKn+dElAAs7XQl0LS8tr97a4oLw+2/1lRWvgHdJ7 flvWyKvCP54BCg05gbtm2kSGrdXBGwGZwOCz3THrn53EzRTeTBWsEtOyOAkO5t9HC/PP R+6WZWHYDpF2KcZgagh+hRRLeIv2pmvD+Af69a5qPkQHN0Y+4vor2rHHX9t/C2L+14lp 1zbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=aYuEn7a79oZhlh8syrxvAo//MxyczzESjkoXLYuBv2g=; b=cw1QIZ+SKxCbFDXlTQ9yQ0W/EcEGfQ9QpUEUZILcOBfVRzcgbhDEyMDZGRxTruInX6 CbSKDlAG2IvjCzry7bHxu9P+d2sLHSEl602bL4kuiR7F2uptHRl2GO6qKe2Kz3e6/ioZ sz1V9gLxgIMzfJb0Y7PaHBm5GfFnpVymbhNwsCcw9tuLdmbQbyc9t7leGb+CNtoqdU3z M0m2k5ZZWpscX0FRhyJPtGRLF2X6dnOfLNlnRxxNqYrUteRvNLEsG51bcn+/oq67cNuF 7V9OGSTm6he0htkfW3i5bABLyBzDhB/2rb9nJXSvOKe5KrlyeXRzRvw7IClFqXP9Doz8 ioig== X-Gm-Message-State: APjAAAV6RgmhUKlT2yq6s1IWqskqsM8FMt7vEI9PcSTY/G1mIPYx1J+5 6qJ6RjYM/31NXaTa2Pw1nEzJUQq8Y19pQZTMXsRVrw== X-Google-Smtp-Source: APXvYqwPo/f0dItpoQBCO9kfqd0o2AtUhrKL2z2AFeawR/9E8N/g+/K9Tw4Idf9NLFfx+bZNp8m9NP+t0/PDNZJ4sk8= X-Received: by 2002:a05:651c:1102:: with SMTP id d2mr4024481ljo.74.1570055661460; Wed, 02 Oct 2019 15:34:21 -0700 (PDT) In-Reply-To: <20191002145555.GM16318@brightrain.aerifal.cx> Xref: news.gmane.org gmane.linux.lib.musl.general:14787 Archived-At: Wow, that was fast! Thank you! I really appreciate the effort you & the whole musl community put in. Patrick On Thu, Oct 3, 2019 at 12:56 AM Rich Felker wrote: > > On Wed, Oct 02, 2019 at 08:49:12AM -0400, Rich Felker wrote: > > On Wed, Oct 02, 2019 at 08:25:43AM -0400, Rich Felker wrote: > > > On Wed, Oct 02, 2019 at 07:49:55AM -0400, Rich Felker wrote: > > > > On Wed, Oct 02, 2019 at 07:40:45AM -0400, Rich Felker wrote: > > > > > On Wed, Oct 02, 2019 at 11:22:33AM +0200, Szabolcs Nagy wrote: > > > > > > * Patrick Oppenlander [2019-10-02 12:00:19 +1000]: > > > > > > > Hi all, > > > > > > > > > > > > > > I'm running into an issue with a gcc-8.3.0/musl-1.1.23 toolchain built > > > > > > > using musl-cross-make for armv7-m. > > > > > > > > > > > > > > % cat test.cpp > > > > > > > int main() > > > > > > > { > > > > > > > throw 0; > > > > > > > } > > > > > > > % armv7m-linux-musleabihf-g++ -ggdb -static test.cpp > > > > > > > % qemu-arm ./a.out > > > > > > > qemu: uncaught target signal 11 (Segmentation fault) - core dumped > > > > > > > [2] 198246 segmentation fault (core dumped) qemu-arm ./a.out > > > > > > > > > > > > > > We should be seeing something like "terminate called after throwing an > > > > > > > instance of 'int'". > > > > > > > > > > > > there is at least one known issue in this area: > > > > > > libgcc uses pthread apis via weak refs so with static > > > > > > linking they don't get linked in, however i think that > > > > > > should only affect unwinding in multi-thread processes, > > > > > > so this is something else > > > > > > > > > > > > > I have an ODROID-C2 here which is a convenient place to run gdb: > > > > > > > > > > > > > > $ gdb -q ./a.out > > > > > > > Reading symbols from ./a.out... > > > > > > > (gdb) run > > > > > > > Program received signal SIGSEGV, Segmentation fault. > > > > > > > 0xf77cf9f4 in __cxxabiv1::__cxa_throw (obj=0xf77f5fa0, > > > > > > > tinfo=0xf77efb20 , dest=0x0) > > > > > > > at ../../../../src_gcc/libstdc++-v3/libsupc++/eh_throw.cc:81 > > > > > > > 81 ../../../../src_gcc/libstdc++-v3/libsupc++/eh_throw.cc: No such > > > > > > > file or directory. > > > > > > > (gdb) bt > > > > > > > #0 0xf77cf9f4 in __cxxabiv1::__cxa_throw (obj=0xf77f5fa0, > > > > > > > tinfo=0xf77efb20 , dest=0x0) > > > > > > > at ../../../../src_gcc/libstdc++-v3/libsupc++/eh_throw.cc:81 > > > > > > > #1 0xf77cf902 in main () at test.cpp:3 > > > > > > > (gdb) disas > > > > > > > Dump of assembler code for function __cxxabiv1::__cxa_throw(void*, > > > > > > > std::type_info*, void (*)(void*)): > > > > > > > 0xf77cf9e0 <+0>: push {r3, r4, r5, r6, r7, lr} > > > > > > > 0xf77cf9e2 <+2>: mov r5, r0 > > > > > > > 0xf77cf9e4 <+4>: mov r7, r1 > > > > > > > 0xf77cf9e6 <+6>: mov r6, r2 > > > > > > > 0xf77cf9e8 <+8>: bl 0xf77d01c4 <__cxxabiv1::__cxa_get_globals()> > > > > > > > 0xf77cf9ec <+12>: mov r4, r0 > > > > > > > 0xf77cf9ee <+14>: mov r1, r7 > > > > > > > 0xf77cf9f0 <+16>: mov r2, r6 > > > > > > > 0xf77cf9f2 <+18>: mov r0, r5 > > > > > > > => 0xf77cf9f4 <+20>: ldr r3, [r4, #4] > > > > > > > 0xf77cf9f6 <+22>: adds r3, #1 > > > > > > > 0xf77cf9f8 <+24>: str r3, [r4, #4] > > > > > > > 0xf77cf9fa <+26>: bl 0xf77cf984 > > > > > > > <__cxxabiv1::__cxa_init_primary_exception(void*, std::type_info*, void > > > > > > > (*)(void*))> > > > > > > > 0xf77cf9fe <+30>: movs r3, #1 > > > > > > > 0xf77cfa00 <+32>: mov r4, r0 > > > > > > > 0xf77cfa02 <+34>: str.w r3, [r4], #40 > > > > > > > 0xf77cfa06 <+38>: mov r0, r4 > > > > > > > 0xf77cfa08 <+40>: bl 0xf77d88a8 <___Unwind_RaiseException> > > > > > > > 0xf77cfa0c <+44>: mov r0, r4 > > > > > > > 0xf77cfa0e <+46>: bl 0xf77d01e0 <__cxxabiv1::__cxa_begin_catch(void*)> > > > > > > > 0xf77cfa12 <+50>: bl 0xf77cfaf0 > > > > > > > End of assembler dump. > > > > > > > (gdb) p/x $r4 > > > > > > > $1 = 0x1 > > > > > > > > > > > > > > That doesn't look good. > > > > > > > > > > > > > > (gdb) disas __cxa_get_globals > > > > > > > Dump of assembler code for function __cxxabiv1::__cxa_get_globals(): > > > > > > > 0xf77d01c4 <+0>: ldr r0, [pc, #12] ; (0xf77d01d4 > > > > > > > <__cxxabiv1::__cxa_get_globals()+16>) > > > > > > > 0xf77d01c6 <+2>: push {r3, lr} > > > > > > > 0xf77d01c8 <+4>: add r0, pc > > > > > > > 0xf77d01ca <+6>: bl 0xf77da940 <__tls_get_addr> > > > > > > > 0xf77d01ce <+10>: ldr r3, [pc, #8] ; (0xf77d01d8 > > > > > > > <__cxxabiv1::__cxa_get_globals()+20>) > > > > > > > 0xf77d01d0 <+12>: add r0, r3 > > > > > > > 0xf77d01d2 <+14>: pop {r3, pc} > > > > > > > 0xf77d01d4 <+16>: ; instruction: 0001fe40 > > > > > > > 0xf77d01d8 <+20>: ; instruction: 00000000 > > > > > > > End of assembler dump. > > > > > > > > > > > > interesting, i thought the linker would relax __tls_get_addr > > > > > > calls in static linked executables to local-exec access model, > > > > > > but maybe that's not implemented on arm. > > > > > > > > > > > > can you look at the readelf -aW a.out and see if the linker > > > > > > left any dynamic tls relocs in the exe? > > > > > > > > > > > > > > > > > > > > OK, so what did __tls_get_addr return? > > > > > > > > > > > > > > (gdb) b *(__cxa_get_globals + 10) > > > > > > > Breakpoint 3 at 0xf77d01ce: file > > > > > > > ../../../../src_gcc/libstdc++-v3/libsupc++/eh_globals.cc, line 62. > > > > > > > (gdb) run > > > > > > > Breakpoint 3, 0xf77d01ce in __cxxabiv1::__cxa_get_globals () at > > > > > > > ../../../../src_gcc/libstdc++-v3/libsupc++/eh_globals.cc:62 > > > > > > > 62 ../../../../src_gcc/libstdc++-v3/libsupc++/eh_globals.cc: No such > > > > > > > file or directory. > > > > > > > (gdb) p/x $r0 > > > > > > > $3 = 0x1 > > > > > > > > > > > > > > That's doesn't look good either. Let's look at __tls_get_addr. > > > > > > > > > > > > > > void *__tls_get_addr(tls_mod_off_t *v) > > > > > > > { > > > > > > > pthread_t self = __pthread_self(); > > > > > > > return (void *)(self->dtv[v[0]] + v[1]); > > > > > > > } > > > > > > > > > > > > > > (gdb) disas __tls_get_addr (annotations added by me) > > > > > > > Dump of assembler code for function __tls_get_addr: > > > > > > > r2 = v[0] > > > > > > > 0x0000c940 <+0>: ldr r2, [r0, #0] > > > > > > > r3 = tls > > > > > > > 0x0000c942 <+2>: ; instruction: 0xee1d3f70 > > > > > > > self = r3 - 120 > > > > > > > r3 = *(r3 - 116) = self->dtv > > > > > > > 0x0000c946 <+6>: ldr.w r3, [r3, #-116] > > > > > > > r0 = v[1] > > > > > > > 0x0000c94a <+10>: ldr r0, [r0, #4] > > > > > > > r2 = *(r3 + r2 * 4) = dtv[r2] > > > > > > > 0x0000c94c <+12>: ldr.w r2, [r3, r2, lsl #2] > > > > > > > r0 = r0 + r2 > > > > > > > 0x0000c950 <+16>: add r0, r2 > > > > > > > 0x0000c952 <+18>: bx lr > > > > > > > End of assembler dump. > > > > > > > (gdb) b __tls_get_addr > > > > > > > > > > > > > > So after pulling out the values I end up with: > > > > > > > v = 0xf77f000c > > > > > > > 0xf77f000c: 0x00000000 0x00000000 0xf77f072c 0xf77eff1c > > > > > > > v[0] = 0 > > > > > > > v[1] = 0 > > > > > > > self = 0xf77f064c > > > > > > > 0xf77f064c : 0xf77f064c 0xf77f06dc 0xf77f064c 0xf77f064c > > > > > > > self->dtv = 0xf77f06dc > > > > > > > 0xf77f06dc : 0x00000001 0xf77f06cc 0x00000000 0x00000000 > > > > > > > return = dtv[v[0]] + v[1] = 1 + 0 = 1 > > > > > > > > > > > > > > At this point I'm out of my depth. > > > > > > > > > > > > dtv[0] is special and just stores the length of the dtv > > > > > > (which is correctly 1, meaning there is 1 elf module with tls). > > > > > > > > > > > > so the issue is that v[0]==0 instead of 1 (dtv[1] would hold > > > > > > the tls address of the executable) i think v[0] should be set > > > > > > up by the static linker so it may be a binutils bug. > > > > > > (or maybe static linking is special and then dtv[0] should be > > > > > > set equal to dtv[1]?) > > > > > > > > > > MIPS had the exact same bug; I think it's fixed upstream now since we > > > > > had the patch for binutils 2.27 in mcm but dropped it in 2.32: > > > > > > > > > > https://github.com/richfelker/musl-cross-make/blob/master/patches/binutils-2.27/0004-mips-pie-tls.diff > > > > > > > > > > It affects static PIE only, so -no-pie should be able to work around > > > > > it (conditional for setting the 1 was essentially !ET_DYN instead of > > > > > is_executable). > > > > > > > > > > We could work around this in musl in one of two ways: rcrt1 could > > > > > process DTPMOD relocs and write a 1, or the static version of > > > > > __init_tls could just duplicate dtv[1] in dtv[0] (the latter is only > > > > > valid up until we add static-linked dlopen, if we ever do that). > > > > > However I was fairly strongly against doing that for MIPS, since it > > > > > was a binutils bug and only affected static linking (so it's not a > > > > > runtime ABI issue; the binary either works at link-time or it doesn't) > > > > > and I'd lean towards treating ARM the same. > > > > > > > > > > It should be easy to make and apply the binutils patch to fix this, > > > > > and shouldn't be too hard to grep for the same bug in all archs (since > > > > > binutils nicely duplicates this logic for every single arch, for no > > > > > reason whatsoever). > > > > > > > > Attached patch is untested but should fix it. > > > > > > It's missing a second instance and doesn't work. I'll send an updated > > > patch after testing. > > > > Attached works for me. > > > --- binutils-2.32/bfd/elf32-arm.c.orig 2019-10-02 07:47:36.153918869 -0400 > > +++ binutils-2.32/bfd/elf32-arm.c 2019-10-02 08:37:09.108263016 -0400 > > @@ -11624,7 +11624,7 @@ > > { > > /* If we don't know the module number, create a relocation > > for it. */ > > - if (bfd_link_pic (info)) > > + if (bfd_link_dll (info)) > > { > > Elf_Internal_Rela outrel; > > > > @@ -11728,7 +11728,7 @@ > > now, and emit any relocations. If both an IE GOT and a > > GD GOT are necessary, we emit the GD first. */ > > > > - if ((bfd_link_pic (info) || indx != 0) > > + if ((bfd_link_dll (info) || indx != 0) > > && (h == NULL > > || (ELF_ST_VISIBILITY (h->other) == STV_DEFAULT > > && !resolved_to_zero) > > And, filed bug report with patch: > > https://sourceware.org/bugzilla/show_bug.cgi?id=25056 > > Patches are now in musl-cross-make too. > > Rich