mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: Markus Wichmann <nullplan@gmx.net>
Cc: musl@lists.openwall.com
Subject: Re: [musl] [bug] Ctrl-Z when process is doing posix_spawn makes the process hard to kill
Date: Sat, 18 Jan 2025 22:18:00 -0500
Message-ID: <20250119031759.GP10433@brightrain.aerifal.cx>
In-Reply-To: <Z4wMOhFmeB-bfg1-@voyager>

On Sat, Jan 18, 2025 at 09:16:58PM +0100, Markus Wichmann wrote:
> Hi all,
> 
> here is my understanding of the bug first.
> 
> 1. Foreground process calls posix_spawn without the POSIX_SPAWN_SETSID
> or POSIX_SPAWN_SETPGROUP flags (either of those prevent the bug).
> 2. User presses terminal suspend character between the parent process
> masking signals and the child process execing the target program.
> 3. Kernel sends SIGTSTP to foreground process group.
> 4. SIGTSTP is blocked in parent process, so parent process does not
> stop. Parent process is blocked in trying to read the pipe to the child,
> though.
> 5. Child process unblocks signals before calling exec(), thereby
> unblocking SIGTSTP and stopping.
> 6. User has an issue mainly because parent process never acts on SIGTSTP
> and stops (which is why the shell's wait() call never returns).
> 
> Looking at the ingredients of the problem, it seems that unblocking
> signals before reading the pipe would be the simplest way out of this
> pickle. We cannot avoid blocking signals before calling clone() to spawn
> the child with blocked signals, and they cannot be unblocked in exec(),
> because all exec() functions pass on the signal mask, but the parent
> could read the pipe with unblocked signals.

I think this is a misunderstanding of the bug. My understanding is
that, due to signals sent from a controlling terminal or to a process
group, it's possible for a process which logically does not exist yet
to enter a stopped state.

If the parent also stopped, most likely they would get resumed
together, but there is no requirement that this happen. In a worst
case, the child stop may be queued before the child changes to a new
process group; in that case, it's acted upon after the process group
has already changed (because that necessarily happens before signals
are unblocked), and sending SIGCONT to the parent process group (like
a shell would do) will not resume it.
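
The worst-case ordering above can be reproduced outside posix_spawn. What
follows is a minimal sketch, assuming Linux and plain fork() in place of
the internal clone(). The helper name stopped_in_new_pgrp is hypothetical,
kill() from the parent stands in for the SIGTSTP the terminal would
generate, and the pipe exists only to force the ordering.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo helper, not musl code.
 * Returns 1 if the child ended up stopped in a process group other than
 * the caller's, 0 if it did not, -1 on setup failure. */
int stopped_in_new_pgrp(void)
{
	struct sigaction dfl = { .sa_handler = SIG_DFL }, oldact;
	sigset_t tstp, oldmask;
	int go[2], status, result = -1;
	pid_t child;

	if (pipe(go) < 0) return -1;
	sigemptyset(&dfl.sa_mask);
	sigemptyset(&tstp);
	sigaddset(&tstp, SIGTSTP);
	/* Default disposition, and blocked before the child exists, so the
	 * child starts out with SIGTSTP blocked. */
	sigaction(SIGTSTP, &dfl, &oldact);
	sigprocmask(SIG_BLOCK, &tstp, &oldmask);

	child = fork();
	if (child == 0) {
		char c;
		close(go[1]);
		/* Wait until the parent has sent SIGTSTP; it is now pending. */
		if (read(go[0], &c, 1) != 1) _exit(1);
		/* Leave the parent's process group while the stop is pending. */
		if (setpgid(0, 0) < 0) _exit(1);
		/* Unblocking delivers the pending SIGTSTP: the child stops
		 * here, already in its new process group. */
		sigprocmask(SIG_UNBLOCK, &tstp, 0);
		for (;;) pause();
	}
	close(go[0]);
	if (child > 0) {
		int reaped = 0;
		/* kill() stands in for the SIGTSTP generated by the terminal. */
		if (kill(child, SIGTSTP) == 0 && write(go[1], "x", 1) == 1
		    && waitpid(child, &status, WUNTRACED) == child) {
			reaped = !WIFSTOPPED(status);
			result = !reaped && getpgid(child) != getpgrp();
		}
		if (!reaped) {
			/* Clean up: SIGKILL also terminates a stopped process. */
			kill(child, SIGKILL);
			waitpid(child, &status, 0);
		}
	}
	close(go[1]);
	sigprocmask(SIG_SETMASK, &oldmask, 0);
	sigaction(SIGTSTP, &oldact, 0);
	return result;
}
```

Under these assumptions the helper should return 1: the child is stopped
in its own process group, so a SIGCONT sent only to the caller's process
group would not reach it.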

This cannot happen in the case of a hard SIGSTOP though, only SIGTSTP.
So one could argue that my original fix for SIGTSTP suffices, if
you're willing to assume something sending hard SIGSTOP to a process
group will send the SIGCONT to the process group as well.
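
One way to see the difference between the two signals: SIGSTOP cannot be
blocked, so it can never sit pending across a process group change the way
SIGTSTP can. A minimal sketch of that check, where is_blockable is a
hypothetical helper name and not part of any libc:

```c
#define _XOPEN_SOURCE 700
#include <signal.h>

/* Hypothetical helper: ask to block every signal, then report whether
 * `sig` actually ended up in the calling thread's signal mask.
 * Returns 1 if blocked, 0 if not, -1 on error. */
int is_blockable(int sig)
{
	sigset_t all, old, now;
	int r;

	sigfillset(&all);
	if (sigprocmask(SIG_BLOCK, &all, &old) < 0) return -1;
	/* A null set pointer leaves the mask unchanged and just reads it. */
	if (sigprocmask(SIG_BLOCK, 0, &now) < 0) r = -1;
	else r = sigismember(&now, sig);
	/* Restore the caller's original mask. */
	sigprocmask(SIG_SETMASK, &old, 0);
	return r;
}
```

POSIX specifies that attempts to block SIGKILL or SIGSTOP are silently
ignored, so is_blockable(SIGSTOP) should report 0 while
is_blockable(SIGTSTP) reports 1.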

> The code for reading the pipe and waiting for the child process
> obviously would need to account for the possibility of EINTR, and there
> is a possibility the pipe FD would escape to fork-without-exec in a
> signal handler. That could be helped with FD_CLOFORK emulation in libc,
> though (keep track of CLOFORK FDs in an FD set and close them all in
> _Fork()), since FD_CLOFORK is not in the kernel, sadly.

This doesn't matter. It's always expected that libc-internal fds can
escape this way, and in this case it's completely harmless except for
the resource leak. If you _Fork from a signal handler you're in a
permanent async-signal (AS) context, and can't really do much except
exec or _exit. So the resource usage really doesn't matter. It does
not block forward progress of anything.

> Or else you could tell applications that weird things happen if you fork
> in a signal handler without execing (that's weird usage, anyway).

This is basically what the standard already does.

I'm not really convinced that unblocking signals in the parent is
relevant to fixing this bug, but it might be a better behavior, since
posix_spawn can block forward progress indefinitely if the child file
actions do stupid things like opening a file type that blocks in open.
While the implementation may of course block signals internally where
needed, generally this should follow the as-if rule whereby the
application can't see that they were blocked except by timing
differences. Blocking forward progress that can only occur by a signal
being handled seems like at least bad QoI if not nonconforming.
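
If the parent did wait on the pipe with signals unblocked, the read would
have to tolerate EINTR, as noted in the quoted text. A minimal sketch of
such a read, assuming the child reports an exec failure by writing a
single int error code to a close-on-exec pipe; read_status is a
hypothetical helper, not the actual musl code:

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read one int status from fd, retrying whenever a
 * signal handler interrupts the read.
 * Returns 1 if a status was read, 0 on EOF (write end closed with nothing
 * written), -1 on any other error or a short read. */
int read_status(int fd, int *out)
{
	ssize_t n;

	do n = read(fd, out, sizeof *out);
	while (n < 0 && errno == EINTR);

	if (n == (ssize_t)sizeof *out) return 1;
	return n == 0 ? 0 : -1;
}
```

A write of sizeof(int) bytes to a pipe is well under PIPE_BUF and
therefore atomic, which is why the sketch treats a short read as an error
rather than looping to collect the rest.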

Rich

