mailing list of musl libc
* Crash in 'system' while executing '__clone'
From: Tobias Koch @ 2017-02-22 11:44 UTC
  To: musl

Hi,

the following code snippet

    #include <stdlib.h>

    int main(void)
    {
        system("ls");
    }

segfaults while running inside a 1.1.16 musl-based chroot on a

    Linux debian 4.9.0-1-amd64 #1 SMP Debian 4.9.6-3 (2017-01-28) x86_64 GNU/Linux

host. The crash happens when __clone returns:

    Reading symbols from test...done.
    (gdb) break __clone
    Function "__clone" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) y
    Breakpoint 1 (__clone) pending.
    (gdb) run
    Starting program: /home/tobias/test

    Breakpoint 1, __clone () at src/thread/x86_64/clone.s:5
    56xor %eax,%eax
    (gdb) next
    64mov $56,%al
    (gdb) 
    77mov %rdi,%r11
    (gdb) 
    8mov %rdx,%rdi
    (gdb) 
    9899mov %r8,%rdx
    (gdb) 
    1000mov %r9,%r8
    (gdb) 
    11mov 8(%rsp),%r10
    (gdb)
    128mov %r11,%r9
    (gdb)
    13and $-16,%rsi
    (gdb)
    14sub $8,%rsi
    (gdb)
    15mov %rcx,(%rsi)
    (gdb)
    16syscall
    (gdb)
    17test %eax,%eax
    (gdb) backup    git       pkgs      repo      spool     temp.txt  test      test.c    test.txt

    18jnz 1f
    (gdb)
    __clone () at src/thread/x86_64/clone.s:27
    271:271ret
    (gdb)
    0x0000000000000000 in ?? ()

Any ideas what might be wrong or what I can do to investigate further?

Tobias



* Re: Crash in 'system' while executing '__clone'
From: Markus Wichmann @ 2017-02-22 15:27 UTC
  To: musl

On Wed, Feb 22, 2017 at 11:44:12AM +0000, Tobias Koch wrote:
>     16syscall
>     (gdb)
>     17test %eax,%eax
>     (gdb) backup    git       pkgs      repo      spool     temp.txt  test      test.c    test.txt
> 

OK, so the clone call was successful. Good. In system() we clone with
vfork() semantics, so the caller is blocked until the child exec()s.

BTW, what's with the line numbers? Why are they doubled (up in the
single digits)?

>     18jnz 1f
>     (gdb)
>     __clone () at src/thread/x86_64/clone.s:27
>     271:271ret
>     (gdb)
>     0x0000000000000000 in ?? ()
> 
> Any ideas what might be wrong or what I can do to investigate further?
> 
> Tobias

So the last few steps mean that the ret instruction loaded a zero into
RIP, which means that the return address at [rsp] has been overwritten
with zeros.

I'd probably debug this again, setting a watchpoint on the value RSP is
pointing to. Then set the debugger to follow a created child (set
follow-fork-mode child) and run this snippet again. As I said, vfork()
semantics are in use, i.e. the child process might clobber the return
address of its parent.
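
To make the clobbering risk concrete, here is a minimal sketch (not
musl's code). It assumes Linux and the clone() wrapper from <sched.h>;
shared_slot, clobber and slot_after_child are made-up names. With
CLONE_VM the child runs in the parent's address space, so a store in
the child is visible to the parent. Here the target is a global, but
the same would hold for a return address on a shared stack.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a memory location the parent cares about. */
static volatile long shared_slot = 0x1234;

/* Runs in the child: overwrite the parent's memory, then exit. */
static int clobber(void *arg)
{
    (void)arg;
    shared_slot = 0;
    _exit(0);
}

/* Returns the value of shared_slot seen by the parent after the child
 * has run, or -1 if the child could not be created. */
static long slot_after_child(void)
{
    enum { STACK_SIZE = 64 * 1024 };
    char *stack = malloc(STACK_SIZE);
    if (!stack)
        return -1;

    shared_slot = 0x1234;

    /* The stack grows down on x86_64, so pass the top of the buffer.
     * CLONE_VFORK suspends the parent until the child exits or execs. */
    pid_t pid = clone(clobber, stack + STACK_SIZE,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, NULL);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    waitpid(pid, NULL, 0);
    free(stack);
    return shared_slot;
}
```

Calling slot_after_child() should return 0 rather than 0x1234, since
the child's store lands in the parent's memory.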

Ciao,
Markus



* Re: Crash in 'system' while executing '__clone'
From: Rich Felker @ 2017-02-22 16:00 UTC
  To: musl

On Wed, Feb 22, 2017 at 04:27:39PM +0100, Markus Wichmann wrote:
> On Wed, Feb 22, 2017 at 11:44:12AM +0000, Tobias Koch wrote:
> >     16syscall
> >     (gdb)
> >     17test %eax,%eax
> >     (gdb) backup    git       pkgs      repo      spool     temp.txt  test      test.c    test.txt
> > 
> 
> OK, so the clone call was successful. Good. In system() we clone with
> vfork() semantics, so the caller is blocked until the child exec()s.

The code does not actually rely on that; CLONE_VFORK is just an
optimization hint to prevent the kernel from scheduling the parent
only to have it immediately block (it also avoids mis-emulation bugs
in qemu app-level emulation). It's safe for the parent to run here
because the child has a separate stack; deallocation of the child
stack is protected by additional synchronization with the child.
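
For readers unfamiliar with the pattern, a rough sketch of the idea
follows. This is not the actual musl implementation, which goes through
posix_spawn and does extra signal handling and synchronization. It
assumes Linux and the clone() wrapper from <sched.h>; child_fn and
run_with_clone are made-up names.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs in the child, on the child's own stack. */
static int child_fn(void *arg)
{
    const char *cmd = arg;
    execl("/bin/sh", "sh", "-c", cmd, (char *)0);
    _exit(127); /* only reached if exec failed */
}

/* Returns the command's exit status, or -1 on error. */
static int run_with_clone(const char *cmd)
{
    enum { STACK_SIZE = 64 * 1024 };
    char *stack = malloc(STACK_SIZE);
    if (!stack)
        return -1;

    /* The child gets a separate stack, so it never touches the
     * parent's return addresses. CLONE_VFORK keeps the parent from
     * being scheduled until the child has exec'd or exited. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, (void *)cmd);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    pid_t w = waitpid(pid, &status, 0);

    /* In this sketch the stack is freed only after the child has been
     * reaped, which is a cruder form of synchronization than musl's. */
    free(stack);

    if (w < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_with_clone("exit 3") should return 3, provided
/bin/sh exists in the environment (or chroot) where it runs.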

> BTW, what's with the line numbers? Why are they doubled (up in the
> single digits)?
> 
> >     18jnz 1f
> >     (gdb)
> >     __clone () at src/thread/x86_64/clone.s:27
> >     271:271ret
> >     (gdb)
> >     0x0000000000000000 in ?? ()
> > 
> > Any ideas what might be wrong or what I can do to investigate further?
> > 
> > Tobias
> 
> So the last few steps mean that the ret instruction loaded a zero into
> RIP, which means that the return address at [rsp] has been overwritten
> with zeros.
> 
> I'd probably debug this again, setting a watchpoint on the value RSP is
> pointing to. Then set the debugger to follow a created child (set
> follow-fork-mode child) and run this snippet again. As I said, vfork()
> semantics are in use, i.e. the child process might clobber the return
> address of its parent.

Yes, this sounds like a good debugging approach, even though what
seems to be happening shouldn't be possible.

Rich



Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/