Hello! I'm having an issue with musl (at git master.) It appears as the waitpid[1] implementation assumes that the syscall returns values matching the waitpid specification, but it does not! This causes some programs to hang. Runit's "runsv" program is one example. It does something like this: for (;;) { child = waitpid(-1, &wstat, WNOHANG); if (!child) break; if ((child == -1) && (errno != EINTR)) break; if (child == svd[0].pid) { // do things with child } } When I inspect a hung runsv process with strace I find it calling wait4 at full speed, stuck in this loop. wait4 comes from calling the waitpid function which in musl performs the wait4 syscall and returns the value. The waitpid spec[2] says its return value is either the PID, -1 with errno set to EINTR or 0 in WNOHANG mode. So, the expected returns values are: >0, 0, -1. However the wait4 syscall[3] in Linux 5 returns other values, specifically it returns errors as negative values. The error that trips up programs like runit's runsv is ECHILD (-10) which wait4 returns when there are no children (i.e. they have exited.) I propose that you change the waitpid implementation to handle this. Something like this: pid_t waitpid(pid_t pid, int *status, int options) { pid_t r = syscall_cp(SYS_wait4, pid, status, options, 0); if (r < 0) { errno = -r; r = -1; } return r; } -- Rasmus [1] waitpid in musl: http://git.musl-libc.org/cgit/musl/tree/src/process/waitpid.c [2] waitpid spec: https://pubs.opengroup.org/onlinepubs/9699919799/functions/wait.html [2] waitpid in glibc: https://man7.org/linux/man-pages/man2/waitpid.2.html [3] wait4 syscall implementation in Linux 5.10.1: kernel/exit.c:1638 and kernel/exit.c:1579 (online: https://elixir.bootlin.com/linux/v5.10.1/source/kernel/exit.c#L1579)
On Tue, Jan 19, 2021 at 10:18:04AM -0800, Rasmus Andersson wrote:
> However the wait4 syscall[3] in Linux 5 returns other values,
> specifically it returns errors as negative values. The error that
> trips up programs like runit's runsv is ECHILD (-10) which wait4
> returns when there are no children (i.e. they have exited.)
>
Hmm... that is very weird. That should not happen. Because the
syscall_cp macro already contains a call to __syscall_ret(), which does
exactly what you propose. So that means, there is something else going
on in your copy of the code. Did you change anything about the source
code? Or can you disassemble the function to see what it does?
Ciao,
Markus
On Tue, Jan 19, 2021 at 11:33 AM Markus Wichmann <nullplan@gmx.net> wrote: > > On Tue, Jan 19, 2021 at 10:18:04AM -0800, Rasmus Andersson wrote: > > However the wait4 syscall[3] in Linux 5 returns other values, > > specifically it returns errors as negative values. The error that > > trips up programs like runit's runsv is ECHILD (-10) which wait4 > > returns when there are no children (i.e. they have exited.) > > > > Hmm... that is very weird. That should not happen. Because the > syscall_cp macro already contains a call to __syscall_ret(), which does > exactly what you propose. So that means, there is something else going > on in your copy of the code. Did you change anything about the source > code? Or can you disassemble the function to see what it does? > Strange! You are right of course; looking at src/internal/syscall.h I indeed see that syscall_cp calls __syscall_ret(__syscall_cp(args)) The musl I'm building with comes from https://musl.cc/#binaries which is created from these scripts according to the author: https://git.zv.io/xstatic/mcm It doesn't seem to apply any patches. Anyhow, I'm currently building musl & gcc myself using musl-cross-make. I will see how that pans out and report back. Thank you for your quick response Markus!
[-- Attachment #1: Type: text/plain, Size: 1852 bytes --] In the meantime, in case you're curious, here's a program that reproduces the issue. I've attached a statically-compiled version too that can be disassembled. Running it: $ gcc -static a.c && ./a.out waitpid returned -10, wstat=0 #include <sys/wait.h> #include <stdio.h> int main(int argc, char **argv) { int wstat = 0; pid_t child = waitpid(-1, &wstat, WNOHANG); printf("waitpid returned %d, wstat=%d\n", child, wstat); return 0; } On Tue, Jan 19, 2021 at 12:17 PM Rasmus Andersson <rasmus@notion.se> wrote: > > On Tue, Jan 19, 2021 at 11:33 AM Markus Wichmann <nullplan@gmx.net> wrote: > > > > On Tue, Jan 19, 2021 at 10:18:04AM -0800, Rasmus Andersson wrote: > > > However the wait4 syscall[3] in Linux 5 returns other values, > > > specifically it returns errors as negative values. The error that > > > trips up programs like runit's runsv is ECHILD (-10) which wait4 > > > returns when there are no children (i.e. they have exited.) > > > > > > > Hmm... that is very weird. That should not happen. Because the > > syscall_cp macro already contains a call to __syscall_ret(), which does > > exactly what you propose. So that means, there is something else going > > on in your copy of the code. Did you change anything about the source > > code? Or can you disassemble the function to see what it does? > > > Strange! You are right of course; looking at src/internal/syscall.h I > indeed see that syscall_cp calls __syscall_ret(__syscall_cp(args)) > The musl I'm building with comes from https://musl.cc/#binaries which > is created from these scripts according to the author: > https://git.zv.io/xstatic/mcm It doesn't seem to apply any patches. > Anyhow, I'm currently building musl & gcc myself using > musl-cross-make. I will see how that pans out and report back. > Thank you for your quick response Markus! [-- Attachment #2: a.tar.gz --] [-- Type: application/x-gzip, Size: 9769 bytes --]
Relevant parts from disassembly: (objdump --disassemble --source
--line-numbers a.out)
0000000000401f61 <__syscall_cp_c>:
sccp():
401f61: 48 89 f8 mov %rdi,%rax
401f64: 4d 89 c2 mov %r8,%r10
401f67: 48 89 f7 mov %rsi,%rdi
401f6a: 4d 89 c8 mov %r9,%r8
401f6d: 48 89 d6 mov %rdx,%rsi
401f70: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
401f75: 48 89 ca mov %rcx,%rdx
401f78: 0f 05 syscall
401f7a: c3 retq
0000000000401f7b <__syscall_cp>:
__syscall_cp():
401f7b: e9 e1 ff ff ff jmpq 401f61 <__syscall_cp_c>
00000000004004fd <waitpid>:
waitpid():
...
40051a: e8 5c 1a 00 00 callq 401f7b <__syscall_cp>
40051f: 48 83 c4 18 add $0x18,%rsp
400523: c3 retq
main():
...
4002a2: e8 56 02 00 00 callq 4004fd <waitpid>
On Tue, Jan 19, 2021 at 12:28 PM Rasmus Andersson <rasmus@notion.se> wrote:
>
> In the meantime, in case you're curious, here's a program that
> reproduces the issue. I've attached a statically-compiled version too
> that can be disassembled. Running it:
> $ gcc -static a.c && ./a.out
> waitpid returned -10, wstat=0
>
> #include <sys/wait.h>
> #include <stdio.h>
> int main(int argc, char **argv) {
> int wstat = 0;
> pid_t child = waitpid(-1, &wstat, WNOHANG);
> printf("waitpid returned %d, wstat=%d\n", child, wstat);
> return 0;
> }
>
> On Tue, Jan 19, 2021 at 12:17 PM Rasmus Andersson <rasmus@notion.se> wrote:
> >
> > On Tue, Jan 19, 2021 at 11:33 AM Markus Wichmann <nullplan@gmx.net> wrote:
> > >
> > > On Tue, Jan 19, 2021 at 10:18:04AM -0800, Rasmus Andersson wrote:
> > > > However the wait4 syscall[3] in Linux 5 returns other values,
> > > > specifically it returns errors as negative values. The error that
> > > > trips up programs like runit's runsv is ECHILD (-10) which wait4
> > > > returns when there are no children (i.e. they have exited.)
> > > >
> > >
> > > Hmm... that is very weird. That should not happen. Because the
> > > syscall_cp macro already contains a call to __syscall_ret(), which does
> > > exactly what you propose. So that means, there is something else going
> > > on in your copy of the code. Did you change anything about the source
> > > code? Or can you disassemble the function to see what it does?
> > >
> > Strange! You are right of course; looking at src/internal/syscall.h I
> > indeed see that syscall_cp calls __syscall_ret(__syscall_cp(args))
> > The musl I'm building with comes from https://musl.cc/#binaries which
> > is created from these scripts according to the author:
> > https://git.zv.io/xstatic/mcm It doesn't seem to apply any patches.
> > Anyhow, I'm currently building musl & gcc myself using
> > musl-cross-make. I will see how that pans out and report back.
> > Thank you for your quick response Markus!
waitpid works as expected after building musl 1.2.1 and GCC 9.2.0 from source. It must be so that the binary distributions at https://musl.cc/#binaries are broken somehow. $ /build/gcc-x86_64-linux-musl/bin/x86_64-linux-musl-gcc -static a.c && ./a.out waitpid returned -1, wstat=0, errno=10 "No child process" Markus thank you for pointing me in the right direction :-) On Tue, Jan 19, 2021 at 12:35 PM Rasmus Andersson <rasmus@notion.se> wrote: > > Relevant parts from disassembly: (objdump --disassemble --source > --line-numbers a.out) > > 0000000000401f61 <__syscall_cp_c>: > sccp(): > 401f61: 48 89 f8 mov %rdi,%rax > 401f64: 4d 89 c2 mov %r8,%r10 > 401f67: 48 89 f7 mov %rsi,%rdi > 401f6a: 4d 89 c8 mov %r9,%r8 > 401f6d: 48 89 d6 mov %rdx,%rsi > 401f70: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9 > 401f75: 48 89 ca mov %rcx,%rdx > 401f78: 0f 05 syscall > 401f7a: c3 retq > > 0000000000401f7b <__syscall_cp>: > __syscall_cp(): > 401f7b: e9 e1 ff ff ff jmpq 401f61 <__syscall_cp_c> > > 00000000004004fd <waitpid>: > waitpid(): > ... > 40051a: e8 5c 1a 00 00 callq 401f7b <__syscall_cp> > 40051f: 48 83 c4 18 add $0x18,%rsp > 400523: c3 retq > > main(): > ... > 4002a2: e8 56 02 00 00 callq 4004fd <waitpid> > > On Tue, Jan 19, 2021 at 12:28 PM Rasmus Andersson <rasmus@notion.se> wrote: > > > > In the meantime, in case you're curious, here's a program that > > reproduces the issue. I've attached a statically-compiled version too > > that can be disassembled. Running it: > > $ gcc -static a.c && ./a.out > > waitpid returned -10, wstat=0 > > > > #include <sys/wait.h> > > #include <stdio.h> > > int main(int argc, char **argv) { > > int wstat = 0; > > pid_t child = waitpid(-1, &wstat, WNOHANG); > > printf("waitpid returned %d, wstat=%d\n", child, wstat); > > return 0; > > } > > > > On Tue, Jan 19, 2021 at 12:17 PM Rasmus Andersson <rasmus@notion.se> wrote: > > > > > > On Tue, Jan 19, 2021 at 11:33 AM Markus Wichmann <nullplan@gmx.net> wrote: > > > > > > > > On Tue, Jan 19, 2021 at 10:18:04AM -0800, Rasmus Andersson wrote: > > > > > However the wait4 syscall[3] in Linux 5 returns other values, > > > > > specifically it returns errors as negative values. The error that > > > > > trips up programs like runit's runsv is ECHILD (-10) which wait4 > > > > > returns when there are no children (i.e. they have exited.) > > > > > > > > > > > > > Hmm... that is very weird. That should not happen. Because the > > > > syscall_cp macro already contains a call to __syscall_ret(), which does > > > > exactly what you propose. So that means, there is something else going > > > > on in your copy of the code. Did you change anything about the source > > > > code? Or can you disassemble the function to see what it does? > > > > > > > Strange! You are right of course; looking at src/internal/syscall.h I > > > indeed see that syscall_cp calls __syscall_ret(__syscall_cp(args)) > > > The musl I'm building with comes from https://musl.cc/#binaries which > > > is created from these scripts according to the author: > > > https://git.zv.io/xstatic/mcm It doesn't seem to apply any patches. > > > Anyhow, I'm currently building musl & gcc myself using > > > musl-cross-make. I will see how that pans out and report back. > > > Thank you for your quick response Markus!
On Tue, Jan 19, 2021 at 12:35:26PM -0800, Rasmus Andersson wrote:
> Relevant parts from disassembly: (objdump --disassemble --source
> --line-numbers a.out)
>
> 0000000000401f61 <__syscall_cp_c>:
> sccp():
> 401f61: 48 89 f8 mov %rdi,%rax
> 401f64: 4d 89 c2 mov %r8,%r10
> 401f67: 48 89 f7 mov %rsi,%rdi
> 401f6a: 4d 89 c8 mov %r9,%r8
> 401f6d: 48 89 d6 mov %rdx,%rsi
> 401f70: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
> 401f75: 48 89 ca mov %rcx,%rdx
> 401f78: 0f 05 syscall
> 401f7a: c3 retq
>
> 0000000000401f7b <__syscall_cp>:
> __syscall_cp():
> 401f7b: e9 e1 ff ff ff jmpq 401f61 <__syscall_cp_c>
>
> 00000000004004fd <waitpid>:
> waitpid():
> ...
> 40051a: e8 5c 1a 00 00 callq 401f7b <__syscall_cp>
> 40051f: 48 83 c4 18 add $0x18,%rsp
> 400523: c3 retq
This disassembly shows a miscompiled (or compiled with wrong patches)
waitpid. It should be a tail call to __syscall_ret, not a retq.
Rich
That was compiled using https://more.musl.cc/10/x86_64-linux-musl/x86_64-linux-musl-native.tgz on Alpine Linux in a docker container (alpine:3.12) Full repro: $ wget https://more.musl.cc/10/x86_64-linux-musl/x86_64-linux-musl-native.tgz $ tar -xf x86_64-linux-musl-native.tgz $ cat << EOF > a.c #include <sys/wait.h> #include <stdio.h> #include <errno.h> #include <string.h> int main(int argc, char **argv) { int wstat = 0; pid_t child = waitpid(-1, &wstat, WNOHANG); printf("waitpid returned %d, wstat=%d, errno=%d (%s)\n", child, wstat, errno, strerror(errno)); return 0; } EOF $ x86_64-linux-musl-native/bin/gcc -static a.c $ ./a.out waitpid returned -10, wstat=0, errno=0 (No error information) On Tue, Jan 19, 2021 at 1:16 PM Rich Felker <dalias@libc.org> wrote: > > On Tue, Jan 19, 2021 at 12:35:26PM -0800, Rasmus Andersson wrote: > > Relevant parts from disassembly: (objdump --disassemble --source > > --line-numbers a.out) > > > > 0000000000401f61 <__syscall_cp_c>: > > sccp(): > > 401f61: 48 89 f8 mov %rdi,%rax > > 401f64: 4d 89 c2 mov %r8,%r10 > > 401f67: 48 89 f7 mov %rsi,%rdi > > 401f6a: 4d 89 c8 mov %r9,%r8 > > 401f6d: 48 89 d6 mov %rdx,%rsi > > 401f70: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9 > > 401f75: 48 89 ca mov %rcx,%rdx > > 401f78: 0f 05 syscall > > 401f7a: c3 retq > > > > 0000000000401f7b <__syscall_cp>: > > __syscall_cp(): > > 401f7b: e9 e1 ff ff ff jmpq 401f61 <__syscall_cp_c> > > > > 00000000004004fd <waitpid>: > > waitpid(): > > ... > > 40051a: e8 5c 1a 00 00 callq 401f7b <__syscall_cp> > > 40051f: 48 83 c4 18 add $0x18,%rsp > > 400523: c3 retq > > This disassembly shows a miscompiled (or compiled with wrong patches) > waitpid. It should be a tail call to __syscall_ret, not a retq. > > Rich
The program compiles correctly with https://more.musl.cc/9/x86_64-linux-musl/x86_64-linux-musl-native.tgz (The GCC 9 version) Perhaps an issue with GCC 10? On Tue, Jan 19, 2021 at 2:00 PM Rasmus Andersson <rasmus@notion.se> wrote: > > That was compiled using > https://more.musl.cc/10/x86_64-linux-musl/x86_64-linux-musl-native.tgz > on Alpine Linux in a docker container (alpine:3.12) > > Full repro: > $ wget https://more.musl.cc/10/x86_64-linux-musl/x86_64-linux-musl-native.tgz > $ tar -xf x86_64-linux-musl-native.tgz > $ cat << EOF > a.c > #include <sys/wait.h> > #include <stdio.h> > #include <errno.h> > #include <string.h> > int main(int argc, char **argv) { > int wstat = 0; > pid_t child = waitpid(-1, &wstat, WNOHANG); > printf("waitpid returned %d, wstat=%d, errno=%d (%s)\n", child, > wstat, errno, strerror(errno)); > return 0; > } > EOF > $ x86_64-linux-musl-native/bin/gcc -static a.c > $ ./a.out > waitpid returned -10, wstat=0, errno=0 (No error information) > > On Tue, Jan 19, 2021 at 1:16 PM Rich Felker <dalias@libc.org> wrote: > > > > On Tue, Jan 19, 2021 at 12:35:26PM -0800, Rasmus Andersson wrote: > > > Relevant parts from disassembly: (objdump --disassemble --source > > > --line-numbers a.out) > > > > > > 0000000000401f61 <__syscall_cp_c>: > > > sccp(): > > > 401f61: 48 89 f8 mov %rdi,%rax > > > 401f64: 4d 89 c2 mov %r8,%r10 > > > 401f67: 48 89 f7 mov %rsi,%rdi > > > 401f6a: 4d 89 c8 mov %r9,%r8 > > > 401f6d: 48 89 d6 mov %rdx,%rsi > > > 401f70: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9 > > > 401f75: 48 89 ca mov %rcx,%rdx > > > 401f78: 0f 05 syscall > > > 401f7a: c3 retq > > > > > > 0000000000401f7b <__syscall_cp>: > > > __syscall_cp(): > > > 401f7b: e9 e1 ff ff ff jmpq 401f61 <__syscall_cp_c> > > > > > > 00000000004004fd <waitpid>: > > > waitpid(): > > > ... > > > 40051a: e8 5c 1a 00 00 callq 401f7b <__syscall_cp> > > > 40051f: 48 83 c4 18 add $0x18,%rsp > > > 400523: c3 retq > > > > This disassembly shows a miscompiled (or compiled with wrong patches) > > waitpid. It should be a tail call to __syscall_ret, not a retq. > > > > Rich
On Tue, 2021-01-19 at 14:02 -0800, Rasmus Andersson wrote: > The program compiles correctly with > https://more.musl.cc/9/x86_64-linux-musl/x86_64-linux-musl-native.tgz > (The GCC 9 version) Perhaps an issue with GCC 10? Rasmus, thank you for reporting this issue. Using your reproducer I'm able to observe the following: (1) GCC 10 as-published on musl.cc : BUG waitpid returned -10, wstat=0, errno=0 (No error information) (2) GCC 10 without this patch [1] : OK waitpid returned -1, wstat=0, errno=10 (No child process) which matches your GCC 9 (20200828) observation. This patch was applied to musl.cc as part of a series to add experimental riscv32 support in September 2020, but was applied universally and not strictly riscv32 targets. The patch in question was /not/ applied to the GCC 9 binaries as they were last updated in August of 2020, nor to earlier 10 ones. I am updating the build infrastructure to avoid contamination of "supported" targets by experimental patches such as this one. New (fixed and newer GCC) toolchains will be published as soon as they finish building (~24 hours). I will ping you off-list. Additional tests (not simply spot-checks) will be implemented for future releases. Sorry for the inconvenience. [1]: https://www.openwall.com/lists/musl/2020/09/03/14
Zach I'm glad I was able to help! No trouble at all.
Thank you all for your work on musl & accompanying tools :-)
On Tue, Jan 19, 2021 at 3:01 PM Zach van Rijn <me@zv.io> wrote:
>
> On Tue, 2021-01-19 at 14:02 -0800, Rasmus Andersson wrote:
> > The program compiles correctly with
> > https://more.musl.cc/9/x86_64-linux-musl/x86_64-linux-musl-native.tgz
> > (The GCC 9 version) Perhaps an issue with GCC 10?
>
> Rasmus, thank you for reporting this issue.
>
>
> Using your reproducer I'm able to observe the following:
>
> (1) GCC 10 as-published on musl.cc : BUG
> waitpid returned -10, wstat=0, errno=0 (No error information)
>
> (2) GCC 10 without this patch [1] : OK
> waitpid returned -1, wstat=0, errno=10 (No child process)
>
> which matches your GCC 9 (20200828) observation.
>
>
> This patch was applied to musl.cc as part of a series to add
> experimental riscv32 support in September 2020, but was applied
> universally and not strictly riscv32 targets.
>
> The patch in question was /not/ applied to the GCC 9 binaries as
> they were last updated in August of 2020, nor to earlier 10 ones.
>
> I am updating the build infrastructure to avoid contamination of
> "supported" targets by experimental patches such as this one.
>
> New (fixed and newer GCC) toolchains will be published as soon as
> they finish building (~24 hours). I will ping you off-list.
>
> Additional tests (not simply spot-checks) will be implemented for
> future releases.
>
>
> Sorry for the inconvenience.
>
>
> [1]: https://www.openwall.com/lists/musl/2020/09/03/14
>