Executable crashes at __libc_start_main

mailing list of musl libc

* Executable crashes at __libc_start_main
@ 2015-02-17  1:41 David Guillen Fandos
  2015-02-17  4:21 ` Rich Felker
  2015-02-17  6:49 ` Igmar Palsenberg
  0 siblings, 2 replies; 7+ messages in thread
From: David Guillen Fandos @ 2015-02-17  1:41 UTC
  To: musl

Hello!

I'm building an application, an ARM ELF binary for Linux, that runs on very
small machines (routers). Using buildroot to create my toolchain I can choose
between uClibc and musl. With uClibc my binary crashes at load time, so I
switched to musl and tried again. It fails too.

The problem seems to be at __libc_start_main, in this part:

        uintptr_t a = (uintptr_t)&__init_array_start;
        for (; a<(uintptr_t)&__init_array_end; a+=sizeof(void(*)()))
                (*(void (**)())a)();
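
For readers unfamiliar with this loop, here is a minimal, self-contained sketch of the same walk. It is an illustration only, not the musl source: `fake_init_array`, `init_a`, `init_b`, `run_init_array` and `demo_init_walk` are hypothetical names, and a local table stands in for the linker-provided `__init_array_start`/`__init_array_end` range so that the example runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*init_fn)(void);

static int calls;                        /* records which initializers ran */
static void init_a(void) { calls += 1; }
static void init_b(void) { calls += 10; }

/* Hypothetical stand-in for the .init_array section. */
static init_fn fake_init_array[] = { init_a, init_b };

/* Walk [start, end) one pointer at a time and call every entry.
 * Returns the number of entries that were called. */
static size_t run_init_array(init_fn *start, init_fn *end)
{
    size_t n = 0;
    uintptr_t a = (uintptr_t)start;
    for (; a < (uintptr_t)end; a += sizeof(init_fn)) {
        (*(init_fn *)a)();   /* load the pointer stored at a, then call it */
        n++;
    }
    return n;
}

/* Usage example: both entries run exactly once, in table order. */
void demo_init_walk(void)
{
    calls = 0;
    assert(run_init_array(fake_init_array, fake_init_array + 2) == 2);
    assert(calls == 11);
}
```

The point of the sketch is that the loop trusts every word in the range blindly: whatever 32-bit value sits in the table is loaded and branched to.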

I checked a little bit (dumping the map file) and I get:

.init_array     0x0000000000016230        0x4
                0x0000000000016230                PROVIDE (__init_array_start, .)
 *(SORT(.init_array.*))
 *(.init_array)
 .init_array    0x0000000000016230        0x4 /XXX/arm-buildroot-linux-musleabi/4.8.3/crtbeginT.o
                0x0000000000016234                PROVIDE (__init_array_end, .)

.fini_array     0x0000000000016234        0x4
                0x0000000000016234

Which tells me there is only one function pointer there. Now dumping the
binary:

00016230 <__frame_dummy_init_array_entry>:
   16230:       00008210        andeq   r8, r0, r0, lsl r2

Disassembly of section .fini_array:

That is the pointer 0x8210, which points to this function:

00008210 <frame_dummy>:
    8210:       e92d4008        push    {r3, lr}
    8214:       e59f3034        ldr     r3, [pc, #52]   ; 8250 <frame_dummy+0x40>
    8218:       e3530000        cmp     r3, #0

...

So far so good. The binary runs OK on an ARM machine running Debian, but
when I run it on this other machine it crashes. The CPU is:

ARMv6-compatible processor rev 7 (v6l)
CPU implementer	: 0x41
CPU architecture: 6TEJ
CPU variant	: 0x0
CPU part	: 0xb76
CPU revision	: 7

Finally I got a core dump and the program crashes here:

    88c8:       e1550007        cmp     r5, r7
    88cc:       2a000003        bcs     88e0 <__libc_start_main+0x1b0>
    88d0:       e4953004        ldr     r3, [r5], #4
    88d4:       e1a0e00f        mov     lr, pc
    88d8:       e12fff13        bx      r3
    88dc:       eafffff9        b       88c8 <__libc_start_main+0x198>

More precisely, it crashes at the instruction at 88d8. R3 holds the
value 0xc8000082! Where does the 0xC8 in the top byte come from?
The PC reported by the core dump is 0xc8000080, which I guess is just
the value of R3 aligned to a 4-byte boundary. R5 points to the right
place; it is the value fetched by the load that is wrong. Could
something be corrupting my ELF? Could the OS be mishandling the ELF
when loading it? It's a pretty old kernel, 2.6.21.
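
One way to make the "is this value sane?" question concrete is a small range-and-alignment check, sketched below. This is only an illustration: `plausible_arm_entry` and `demo_entry_check` are hypothetical names, and the text-segment bounds 0x8000 and 0x16000 are assumptions chosen to bracket the addresses in the dumps above, not values taken from the real link map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if entry could be an ARM-state function pointer located in
 * [text_start, text_end): it must be 4-byte aligned and in range. */
static bool plausible_arm_entry(uint32_t entry,
                                uint32_t text_start, uint32_t text_end)
{
    assert(text_start <= text_end);
    if (entry & 3u)
        return false;        /* ARM-state code is word aligned */
    return entry >= text_start && entry < text_end;
}

/* Usage example with assumed text bounds 0x8000..0x16000. */
void demo_entry_check(void)
{
    assert(plausible_arm_entry(0x00008210u, 0x8000u, 0x16000u));   /* value in the file */
    assert(!plausible_arm_entry(0xc8000082u, 0x8000u, 0x16000u));  /* value seen in r3 */
}
```

Under these assumptions the pointer stored in the file passes and the value observed in r3 fails on both counts, which is consistent with the entry being wrong by the time it is loaded rather than with a bad function body.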

Thanks a lot!

David


* Re: Executable crashes at __libc_start_main
  2015-02-17  1:41 Executable crashes at __libc_start_main David Guillen Fandos
@ 2015-02-17  4:21 ` Rich Felker
  2015-02-17  6:49 ` Igmar Palsenberg
  1 sibling, 0 replies; 7+ messages in thread
From: Rich Felker @ 2015-02-17  4:21 UTC
  To: David Guillen Fandos; +Cc: musl

On Tue, Feb 17, 2015 at 01:41:00AM +0000, David Guillen Fandos wrote:
> More precisely, it crashes at the instruction at 88d8. R3 holds the
> value 0xc8000082! Where does the 0xC8 in the top byte come from?
> The PC reported by the core dump is 0xc8000080, which I guess is just
> the value of R3 aligned to a 4-byte boundary. R5 points to the right
> place; it is the value fetched by the load that is wrong. Could
> something be corrupting my ELF? Could the OS be mishandling the ELF
> when loading it? It's a pretty old kernel, 2.6.21.

Are you sure r5 is right? It sounds to me like r5 is off by one and
you have a chip that's not trapping misaligned accesses. You should
start by dumping all registers and checking that they make sense.
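
The off-by-one hypothesis can be checked with simple byte arithmetic. The sketch below models a little-endian load that is allowed to start at any byte offset. It is an assumption-laden illustration: `load_le32`, `image` and `demo_unaligned_load` are hypothetical names, the image is assumed to be little-endian, and the byte 0xAB is a placeholder for whatever is stored at 0x16234, which the thread does not show. Real hardware may behave differently, for example by rotating the aligned word, depending on how unaligned accesses are configured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value starting at mem[off], byte by byte,
 * as a CPU that permits unaligned loads would see it. */
static uint32_t load_le32(const unsigned char *mem, size_t len, size_t off)
{
    assert(off + 4 <= len);
    return (uint32_t)mem[off]
         | ((uint32_t)mem[off + 1] << 8)
         | ((uint32_t)mem[off + 2] << 16)
         | ((uint32_t)mem[off + 3] << 24);
}

/* Bytes at 0x16230: the word 0x00008210 in little-endian order, followed
 * by a placeholder (0xAB) for the unknown first byte at 0x16234. */
static const unsigned char image[] = {
    0x10, 0x82, 0x00, 0x00,
    0xAB, 0x00, 0x00, 0x00
};

/* Usage example: aligned load versus a load one byte too far. */
void demo_unaligned_load(void)
{
    assert(load_le32(image, sizeof image, 0) == 0x00008210u);
    assert(load_le32(image, sizeof image, 1) == 0xAB000082u);
}
```

A load one byte past the entry yields 0x..000082, with the top byte supplied by the first byte of the next word. The low three bytes of the reported 0xc8000082 have exactly that shape, so checking r5 and the byte at 0x16234 would confirm or rule out this explanation.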

Building musl as thumb is not widely tested, and I suspect it might be
related to what's going on. If pc-relative addressing is being used
for __init_array_start and the linker is not properly aware of the
facts that (1) the calling code is thumb, and (2) the init array is
data (not code), then you could end up with an off-by-one address due
to the way thumb works.

Actually I think you really have something going on wrong here since
0x8210 is not even a valid function address for thumb code. The
address would be 0x8211.
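
The role of bit 0 in a `bx` target can be sketched as follows. This is a simplified model, not a statement about this particular core: `struct bx_target`, `decode_bx` and `demo_bx` are hypothetical names. Bit 0 of the register selects Thumb state; for ARM state the sketch simply forces word alignment, which matches the PC in the core dump, although a target with bit 1 set is not a valid ARM instruction address and the real behavior in that case is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct bx_target {
    bool     thumb;   /* true if the branch switches to Thumb state */
    uint32_t pc;      /* address execution continues at in this model */
};

/* Model of "bx rm": bit 0 selects the instruction set, the remaining
 * bits give the target address (word aligned for ARM state). */
static struct bx_target decode_bx(uint32_t rm)
{
    struct bx_target t;
    t.thumb = (rm & 1u) != 0;
    t.pc = t.thumb ? (rm & ~1u) : (rm & ~3u);
    return t;
}

/* Usage example with the addresses discussed in the thread. */
void demo_bx(void)
{
    assert(decode_bx(0x00008211u).thumb && decode_bx(0x00008211u).pc == 0x8210u);
    assert(!decode_bx(0x00008210u).thumb && decode_bx(0x00008210u).pc == 0x8210u);
    assert(!decode_bx(0xc8000082u).thumb && decode_bx(0xc8000082u).pc == 0xc8000080u);
}
```

In this model 0x8211 would enter `frame_dummy` as Thumb code and 0x8210 as ARM code, while 0xc8000082 stays in ARM state and lands on 0xc8000080, the PC that the core dump reports.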

Rich



* Re: Executable crashes at __libc_start_main
  2015-02-17  1:41 Executable crashes at __libc_start_main David Guillen Fandos
  2015-02-17  4:21 ` Rich Felker
@ 2015-02-17  6:49 ` Igmar Palsenberg
  2015-02-17  9:20   ` David Guillen
  1 sibling, 1 reply; 7+ messages in thread
From: Igmar Palsenberg @ 2015-02-17  6:49 UTC
  To: musl


> Finally I got a core dump and the program crashes here:
> 
>     88c8:       e1550007        cmp     r5, r7
>     88cc:       2a000003        bcs     88e0 <__libc_start_main+0x1b0>
>     88d0:       e4953004        ldr     r3, [r5], #4
>     88d4:       e1a0e00f        mov     lr, pc
>     88d8:       e12fff13        bx      r3
>     88dc:       eafffff9        b       88c8 <__libc_start_main+0x198>
> 
> More precisely, it crashes at the instruction at 88d8. R3 holds the
> value 0xc8000082! Where does the 0xC8 in the top byte come from?
> The PC reported by the core dump is 0xc8000080, which I guess is just
> the value of R3 aligned to a 4-byte boundary. R5 points to the right
> place; it is the value fetched by the load that is wrong. Could
> something be corrupting my ELF? Could the OS be mishandling the ELF
> when loading it? It's a pretty old kernel, 2.6.21.

Are you absolutely sure your toolchain is OK? Hard-to-track issues like
this are usually caused by a broken toolchain, and ARM has some nice quirks
when it comes to this.



	Igmar



* Re: Executable crashes at __libc_start_main
  2015-02-17  6:49 ` Igmar Palsenberg
@ 2015-02-17  9:20   ` David Guillen
  2015-02-17 15:46     ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: David Guillen @ 2015-02-17  9:20 UTC
  To: musl

Hi,

The toolchain is a "buildroot" one, so it _should_ be OK. The funny
thing, as I said, is that it works well on some ARM boxes and in qemu, so
it might be something related to ld-linux.so.

Rich: R5 is OK. It points to the following 4 bytes (due to the
post-increment), so I guess it must have been OK before the load. And BTW
I'm not using Thumb code; all instructions are 32-bit-wide ARM
instructions.

Thanks
David




* Re: Executable crashes at __libc_start_main
  2015-02-17  9:20   ` David Guillen
@ 2015-02-17 15:46     ` Rich Felker
  2015-02-17 15:50       ` David Guillen
  2015-02-17 17:31       ` David Guillen Fandos
  0 siblings, 2 replies; 7+ messages in thread
From: Rich Felker @ 2015-02-17 15:46 UTC
  To: David Guillen; +Cc: musl

On Tue, Feb 17, 2015 at 09:20:38AM +0000, David Guillen wrote:
> Hi,
> 
> The toolchain is a "buildroot" one, so it _should_ be OK. The funny
> thing, as I said, is that it works well on some ARM boxes and in qemu, so
> it might be something related to ld-linux.so.

That code is not supposed to be compiled at all in shared libc, only
static, and for static there is no "ld-linux". Also the dynamic linker
should be ld-musl-arm.so.1; if it's using ld-linux that's a foreign
dynamic linker that's not going to work.

> Rich: R5 is OK. It points to the following 4 bytes (due to the
> post-increment), so I guess it must have been OK before the load. And BTW
> I'm not using Thumb code; all instructions are 32-bit-wide ARM
> instructions.

Sorry, I misread the address column as the instruction encoding when I
saw just 4 hex digits. :-) So that's not the issue.

Can you dump the address range for __init_array_start at runtime in
gdb using the x command?

Rich


* Re: Executable crashes at __libc_start_main
  2015-02-17 15:46     ` Rich Felker
@ 2015-02-17 15:50       ` David Guillen
  2015-02-17 17:31       ` David Guillen Fandos
  1 sibling, 0 replies; 7+ messages in thread
From: David Guillen @ 2015-02-17 15:50 UTC
  To: Rich Felker; +Cc: musl

Good idea, let me try this later.

I'm compiling everything statically. As you say, that means no ld.so is
used, so I don't understand what could be causing the problem.

Thanks,
David




* Re: Executable crashes at __libc_start_main
  2015-02-17 15:46     ` Rich Felker
  2015-02-17 15:50       ` David Guillen
@ 2015-02-17 17:31       ` David Guillen Fandos
  1 sibling, 0 replies; 7+ messages in thread
From: David Guillen Fandos @ 2015-02-17 17:31 UTC
  Cc: musl

I checked the core dump.

At address 0x00016230 (init_array) the value is 0xc8000082, matching what
r3 reports. So either something corrupted it at runtime or the OS corrupted
it at load time.

The mentioned platform does not ship gdb. Any idea on how to "debug"
this? Even if it shipped gdb I don't think the error would be
reproducible, since it works great on other ARM systems.
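
Without gdb, one low-tech check is to read the init_array word straight out of the binary as stored on the router and compare it with the build host's copy, which separates "file damaged on the device" from "damaged at load or run time". The helper below is only a sketch: `read_le32_at` and `demo_read_word` are hypothetical names, the image is assumed to be little-endian, and the file offset of .init_array is not given in this thread, so it would have to be looked up in the section headers first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read the 32-bit little-endian word at byte offset `offset` of the file
 * `path`. Returns 0 on success, -1 if the file cannot be opened or is too
 * short. The offset is a FILE offset, not a virtual address. */
static int read_le32_at(const char *path, long offset, uint32_t *out)
{
    unsigned char b[4];
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    int ok = fseek(f, offset, SEEK_SET) == 0 && fread(b, 1, 4, f) == 4;
    fclose(f);
    if (!ok)
        return -1;
    *out = (uint32_t)b[0] | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}

/* Usage example: print the word found at a caller-supplied file offset. */
void demo_read_word(const char *path, long offset)
{
    uint32_t w;
    if (read_le32_at(path, offset, &w) == 0)
        printf("%s @ 0x%lx: 0x%08lx\n", path, offset, (unsigned long)w);
    else
        printf("%s: cannot read 4 bytes at offset 0x%lx\n", path, offset);
}
```

If the word read from the file on the device is 0x00008210 while the core dump shows 0xc8000082 at 0x16230, the file itself is intact and the damage happens during or after loading.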

Thanks!
David


