From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Executable crashes at __libc_start_main
Date: Mon, 16 Feb 2015 23:21:07 -0500
Message-ID: <20150217042107.GC23507@brightrain.aerifal.cx>
References: <54E29C2C.5080907@davidgf.es>
In-Reply-To: <54E29C2C.5080907@davidgf.es>
Reply-To: musl@lists.openwall.com
To: David Guillen Fandos
Cc: musl@lists.openwall.com
User-Agent: Mutt/1.5.21 (2010-09-15)

On Tue, Feb 17, 2015 at 01:41:00AM +0000, David Guillen Fandos wrote:
> Hello!
>
> I'm creating an app which is an ARM ELF (linux) which runs in very small
> machines (routers). Using buildroot to create my toolchain I can choose
> between uClibc and musl. Using uclibc my binary crashes at loading, so I
> switched to musl and tried. It fails too.
>
> The problem seems to be at __libc_start_main, in this part:
>
> uintptr_t a = (uintptr_t)&__init_array_start;
> for (; a<(uintptr_t)&__init_array_end; a+=sizeof(void(*)()))
>     (*(void (**)())a)();
>
> I checked a little bit (dumping the map file) and I get:
>
> .init_array     0x0000000000016230        0x4
>                 0x0000000000016230                PROVIDE (__init_array_start, .)
>  *(SORT(.init_array.*))
>  *(.init_array)
>  .init_array    0x0000000000016230        0x4 /XXX/arm-buildroot-linux-musleabi/4.8.3/crtbeginT.o
>                 0x0000000000016234                PROVIDE (__init_array_end, .)
>
> .fini_array     0x0000000000016234        0x4
>                 0x0000000000016234
>
> Which tells me there is only one function pointer there. Now dumping the
> binary:
>
> 00016230 <__frame_dummy_init_array_entry>:
>    16230:  00008210   andeq r8, r0, r0, lsl r2
>
> Disassembly of section .fini_array:
>
> Which is pointer 0x8210 which points to function:
>
> 00008210 :
>     8210:  e92d4008   push {r3, lr}
>     8214:  e59f3034   ldr r3, [pc, #52] ; 8250
>
>     8218:  e3530000   cmp r3, #0
>
> ....
>
> So far so good. The binary runs OK on an ARM machine running Debian, but
> when I run this program on this other machine it crashes. The CPU is:
>
> ARMv6-compatible processor rev 7 (v6l)
> CPU implementer : 0x41
> CPU architecture: 6TEJ
> CPU variant     : 0x0
> CPU part        : 0xb76
> CPU revision    : 7
>
> Finally I got a core dump and the program crashes here:
>
>     88c8:  e1550007   cmp r5, r7
>     88cc:  2a000003   bcs 88e0 <__libc_start_main+0x1b0>
>     88d0:  e4953004   ldr r3, [r5], #4
>     88d4:  e1a0e00f   mov lr, pc
>     88d8:  e12fff13   bx r3
>     88dc:  eafffff9   b 88c8 <__libc_start_main+0x198>
>
> In the 88d8 instruction to be more exact. Seems that R3 is holding the
> value 0xc8000082!!! Where is that 0xC8 at the beginning coming from?
> The PC reported by the core dump is 0xc8000080, which I guess is just
> the value of R3 aligned to a 4-byte boundary. R5 points to the right
> place, it's just the value loaded by the load. Could it be that
> something corrupts my ELF?
> Could it be the OS being really dumb at
> loading the ELF? It's a pretty old kernel, 2.6.21.

Are you sure r5 is right? It sounds to me like r5 is off by one and you
have a chip that's not trapping misaligned accesses. You should start
by dumping all registers and checking that they make sense.

Building musl as Thumb is not widely tested, and I suspect it might be
related to what's going on. If pc-relative addressing is being used for
__init_array_start and the linker is not properly aware of the facts
that (1) the calling code is Thumb, and (2) the init array is data (not
code), then you could end up with an off-by-one address due to the way
Thumb works.

Actually, I think you really have something going wrong here, since
0x8210 is not even a valid function address for Thumb code. The address
would be 0x8211, because a Thumb function pointer has bit 0 set.

Rich