mailing list of musl libc
* more fun with non-Linux Linux ABI
@ 2017-06-11 18:01 u-uy74
  2017-06-11 21:50 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: u-uy74 @ 2017-06-11 18:01 UTC
  To: musl

FWIW: when musl-linked programs run under the Linux ABI on FreeBSD,
the child processes segfault right after a vfork(), before doing
anything else:

-----
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/wait.h>

int main( int argc, char **argv ) {
  int s;
  pid_t pid = vfork();
  if( !pid ){ _exit(0); }
  else      { wait(&s); printf("vfork() = %d\n", pid); }
  fflush(stdout);
  _exit(0);
}
-----
 50124 ktrace   RET   ktrace 0
 50124 ktrace   CALL  execve(0x7fffffffee4b,0x7fffffffec38,0x7fffffffec48)
 50124 ktrace   NAMI  "./v"
 50124 v        RET   linux_olduname 0
 50124 v        CALL  linux_set_thread_area(0xffffcc6c)
 50124 v        RET   linux_set_thread_area 0
 50124 v        CALL  linux_set_tid_address(0x804d524)
 50124 v        RET   linux_set_tid_address 50124/0xc3cc
 50124 v        CALL  linux_vfork
 50124 v        RET   linux_vfork 50125/0xc3cd
 50125 v        RET   linux_fork 0
 50125 v        PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR
 50125 v        NAMI  "v.core"
 50124 v        CALL  linux_wait4(0xffffffff,0xffffcd4c,0,0)
 50124 v        RET   linux_wait4 50125/0xc3cd
 50124 v        CALL  linux_ioctl(0x1,0x5413,0xffffcb04)
 50124 v        RET   linux_ioctl 0
 50124 v        CALL  linux_writev(0x1,0xffffcad4,0x2)
 50124 v        GIO   fd 1 wrote 16 bytes
       "vfork() = 50125
       "
 50124 v        RET   linux_writev 16/0x10
 50124 v        CALL  linux_exit_group(0)
-----

Remarkably this apparently does not affect glibc-based builds
(I have not tested right now but otherwise it would have been known).
Wonder what makes the difference.

Otherwise, a simple workaround would be an option to make vfork()
a synonym for fork() when building musl. (I do this at application
build time instead, which helps.)
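
A minimal sketch of what such an application-side workaround could look
like. The macro name VFORK_IS_FORK and the helper run_child_and_wait()
are hypothetical, chosen only for illustration; this is not an existing
musl option.

```c
#define _DEFAULT_SOURCE /* vfork() is no longer in POSIX; ask libc to declare it */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical build-time switch: compile with -DVFORK_IS_FORK on
 * platforms where vfork() misbehaves, so every vfork() call in this
 * translation unit becomes a plain fork(). */
#ifdef VFORK_IS_FORK
#define vfork fork
#endif

/* Same shape as the test program above: the child exits immediately,
 * the parent reaps it and returns the child's exit status (or -1). */
int run_child_and_wait(void) {
    int status = 0;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;          /* could not create a child */
    if (pid == 0)
        _exit(0);           /* child: only _exit() or exec*() is safe after vfork() */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Building with something like `cc -DVFORK_IS_FORK ...` selects the fork()
path; without the flag the code keeps using vfork().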

Such an option would most probably result in a pretty small performance
impact on modern (native) Linux.

Cheers,
Rune




* Re: more fun with non-Linux Linux ABI
  2017-06-11 18:01 more fun with non-Linux Linux ABI u-uy74
@ 2017-06-11 21:50 ` Rich Felker
  2017-06-12  8:30   ` u-uy74
  2017-06-12 15:29   ` Bobby Powers
  0 siblings, 2 replies; 5+ messages in thread
From: Rich Felker @ 2017-06-11 21:50 UTC
  To: musl

On Sun, Jun 11, 2017 at 08:01:58PM +0200, u-uy74@aetey.se wrote:
> FWIW: when musl-linked programs run under the Linux ABI on FreeBSD,
> the child processes segfault right after a vfork(), before doing
> anything else:

My first guess is that this is a FreeBSD bug...

> -----
> #include <sys/types.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <sys/wait.h>
> 
> int main( int argc, char **argv ) {
>   int s;
>   pid_t pid = vfork();
>   if( !pid ){ _exit(0); }
>   else      { wait(&s); printf("vfork() = %d\n", pid); }
>   fflush(stdout);
>   _exit(0);
> }
> -----
>  50124 ktrace   RET   ktrace 0
>  50124 ktrace   CALL  execve(0x7fffffffee4b,0x7fffffffec38,0x7fffffffec48)
>  50124 ktrace   NAMI  "./v"
>  50124 v        RET   linux_olduname 0
>  50124 v        CALL  linux_set_thread_area(0xffffcc6c)
>  50124 v        RET   linux_set_thread_area 0
>  50124 v        CALL  linux_set_tid_address(0x804d524)
>  50124 v        RET   linux_set_tid_address 50124/0xc3cc
>  50124 v        CALL  linux_vfork
>  50124 v        RET   linux_vfork 50125/0xc3cd
>  50125 v        RET   linux_fork 0
>  50125 v        PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR
>  50125 v        NAMI  "v.core"
>  50124 v        CALL  linux_wait4(0xffffffff,0xffffcd4c,0,0)
>  50124 v        RET   linux_wait4 50125/0xc3cd
>  50124 v        CALL  linux_ioctl(0x1,0x5413,0xffffcb04)
>  50124 v        RET   linux_ioctl 0
>  50124 v        CALL  linux_writev(0x1,0xffffcad4,0x2)
>  50124 v        GIO   fd 1 wrote 16 bytes
>        "vfork() = 50125
>        "
>  50124 v        RET   linux_writev 16/0x10
>  50124 v        CALL  linux_exit_group(0)
> -----
> 
> Remarkably this apparently does not affect glibc-based builds
> (I have not tested right now but otherwise it would have been known).
> Wonder what makes the difference.

Is it possible that FreeBSD's Linux syscall emulation uses the
userspace stack to store some state during syscalls? For example, maybe
they just translate to the FreeBSD stack-based syscall convention by
pushing registers onto the userspace stack and then jumping to the
normal FreeBSD syscall entry points. If so, there's no way they can
implement vfork, and they ought to just emulate it as fork.

> Otherwise, a simple workaround would be an option to make vfork()
> a synonym for fork() when building musl. (I do this at application
> build time instead, which helps.)
> 
> Such an option would most probably result in a pretty small performance
> impact on modern (native) Linux.

It's actually a pretty large impact; recent (4.x+ IIRC) versions of
GNU make are considerably slower because they dropped the use of vfork
and switched to fork, rather than switching to posix_spawn as they
should have.
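
For illustration, a minimal sketch of starting a command with
posix_spawnp() instead of fork or vfork followed by exec. The helper
name run_command() is an assumption made for this example and is not
taken from GNU make or musl.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Run argv[0] (searched for in PATH) with the caller's environment,
 * wait for it, and return its exit status, or -1 on failure. */
int run_command(char *const argv[]) {
    pid_t pid;
    int status = 0;

    /* No file actions and no spawn attributes: the child simply
     * inherits the parent's descriptors and signal setup. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The libc is then free to pick the cheapest process-creation primitive
internally, so the caller does not depend on vfork() semantics itself.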

Rich



* Re: more fun with non-Linux Linux ABI
  2017-06-11 21:50 ` Rich Felker
@ 2017-06-12  8:30   ` u-uy74
  2017-06-12 15:29   ` Bobby Powers
  1 sibling, 0 replies; 5+ messages in thread
From: u-uy74 @ 2017-06-12  8:30 UTC
  To: musl

On Sun, Jun 11, 2017 at 05:50:20PM -0400, Rich Felker wrote:
> On Sun, Jun 11, 2017 at 08:01:58PM +0200, u-uy74@aetey.se wrote:
> > FWIW: when musl-linked programs run under the Linux ABI on FreeBSD,
> > the child processes segfault right after a vfork(), before doing
> > anything else:
> 
> My first guess is that this is a FreeBSD bug...

I agree.

> > Remarkably this apparently does not affect glibc-based builds
> > (I have not tested right now but otherwise it would have been known).
> > Wonder what makes the difference.
> 
> Is it possible that FreeBSD's Linux syscall emulation uses the
> userspace stack to store some state during syscalls? For example maybe

I am not sufficiently familiar with the FreeBSD internals to tell.
The linux_fork() and linux_vfork() functions are about 30 lines each,
and the only difference between them is:

--- fork
+++ vfork
 ...
         struct fork_req fr;
 ...
         bzero(&fr, sizeof(fr));
-        fr.fr_flags = RFFDG | RFPROC | RFSTOPPED;
+        fr.fr_flags = RFFDG | RFPROC | RFMEM | RFPPWAIT | RFSTOPPED;
         fr.fr_procp = &p2;
         if ((error = fork1(td, &fr)) != 0)
                 return (error);
 ...

I guess the musl vs. glibc difference could arise if the latter
implements vfork() in terms of clone() (?)

linux_clone() has a much larger implementation in FreeBSD than
linux_vfork(), which could explain why one works and the other
does not.

> > Otherwise, a simple workaround would be an option to make vfork()
> > a synonym for fork() when building musl. (I do this at application
> > build time instead, which helps.)
> > 
> > Such an option would most probably result in a pretty small performance
> > impact on modern (native) Linux.
> 
> It's actually a pretty large impact; recent (4.x+ IIRC) versions of
> GNU make are considerably slower because they dropped the use of vfork
> and switched to fork, rather than switching to posix_spawn as they
> should have.

Thanks, good to know. Then I either have to live with this impact on
Linux or convince the FreeBSD team to fix linux_vfork().

Rune




* Re: more fun with non-Linux Linux ABI
  2017-06-11 21:50 ` Rich Felker
  2017-06-12  8:30   ` u-uy74
@ 2017-06-12 15:29   ` Bobby Powers
  2017-06-15  0:15     ` Rich Felker
  1 sibling, 1 reply; 5+ messages in thread
From: Bobby Powers @ 2017-06-12 15:29 UTC
  To: musl

On Sun, Jun 11, 2017 at 2:50 PM, Rich Felker <dalias@libc.org> wrote:
>> Such an option would most probably result in a pretty small performance
>> impact on modern (native) Linux.
>
> It's actually a pretty large impact; recent (4.x+ IIRC) versions of
> GNU make are considerably slower because they dropped the use of vfork
> and switched to fork, rather than switching to posix_spawn as they
> should have.

Do you know why they chose not to use posix_spawn?



* Re: more fun with non-Linux Linux ABI
  2017-06-12 15:29   ` Bobby Powers
@ 2017-06-15  0:15     ` Rich Felker
  0 siblings, 0 replies; 5+ messages in thread
From: Rich Felker @ 2017-06-15  0:15 UTC
  To: musl

On Mon, Jun 12, 2017 at 08:29:59AM -0700, Bobby Powers wrote:
> On Sun, Jun 11, 2017 at 2:50 PM, Rich Felker <dalias@libc.org> wrote:
> >> Such an option would most probably result in a pretty small performance
> >> impact on modern (native) Linux.
> >
> > It's actually a pretty large impact; recent (4.x+ IIRC) versions of
> > GNU make are considerably slower because they dropped the use of vfork
> > and switched to fork, rather than switching to posix_spawn as they
> > should have.
> 
> Do you know why they chose not to use posix_spawn?

Probably lack of knowledge, or lack of resources to test a major
change like that. Possibly posix_spawn lacks functionality they want
between fork and exec, but it seems like they could use posix_spawn
for the cases where they don't need extra functionality and fall back
to fork+exec when they do. You'd have to follow up with the GNU make
project to know for sure.
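
A rough sketch of that split, under stated assumptions: start_job(),
wait_job(), the child_hook type and example_hook() are all hypothetical
names invented for this example, not anything from GNU make. A NULL hook
means no extra work is needed between fork and exec, so posix_spawnp()
is used; otherwise the code falls back to fork+exec.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical hook: work that must run in the child before exec. */
typedef void (*child_hook)(void);

/* Example hook: move the child into its own process group. */
void example_hook(void) {
    setpgid(0, 0);
}

/* Start argv[0]; returns the child's pid, or -1 on failure. */
pid_t start_job(char *const argv[], child_hook hook) {
    pid_t pid;

    if (hook == NULL) {
        /* Common case: nothing special to do, so posix_spawn suffices. */
        if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
            return -1;
        return pid;
    }

    /* Fallback: the hook needs a real fork()ed child to run in. */
    pid = fork();
    if (pid == 0) {
        hook();
        execvp(argv[0], argv);
        _exit(127);         /* exec failed */
    }
    return pid;             /* child's pid, or -1 if fork() failed */
}

/* Reap a job started by start_job(); returns its exit status or -1. */
int wait_job(pid_t pid) {
    int status = 0;

    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Some pre-exec work can also be expressed through posix_spawn file
actions and attributes, which would shrink the set of cases that need
the fork() path.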

Rich



