mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] [PATCH] Split fork and abort locks
@ 2024-01-20  8:12 Markus Wichmann
  2024-01-20 11:29 ` Alexey Izbyshev
  2024-01-20 21:32 ` Rich Felker
  0 siblings, 2 replies; 4+ messages in thread
From: Markus Wichmann @ 2024-01-20  8:12 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1906 bytes --]

Hi all,

a while ago I had noticed that __abort_lock was being taken in some
functions that have nothing to do with SIGABRT. Namely in the forking
functions. Investigating this a bit, I noticed that __abort_lock had
become dual purpose. But this is a code smell.

Actually, there are several locks that have expanded in scope a bit
since their introduction. At least the ptc lock (__inhibit_ptc() et al.)
deserves a closer look later on as well. Seems to me like in case of the
default stack size, that lock is used simply because an rwlock was
needed and this one was around.

The corruption in this case probably came from posix_spawn(). That
function takes the abort lock, because, as a baffling comment above the
lock statement tells us, this guards against SIGABRT disposition
changing. This is because abort() changes the disposition without
calling sigaction(), so a handler would not be noted in the handler set.

However, actually posix_spawn() does not need to care about this. The
handler set is strictly additive, so all the signals it contains /may/
have a handler. And abort() removes a handler. In the worst case, the
handler set will spuriously contain SIGABRT, which is a condition the
child needs to check for anyway.

And a concurrent sigaction() call from the application establishing a
handler is no different for SIGABRT than for any other signal. That is
handled by retrieving the handler set in the child, when everything is
fixed since the child is single-threaded. For the same reason, there
cannot be a concurrent call to abort() in the child.

Anyway, with posix_spawn() taking __abort_lock, when it came time to
guard against leaking communication pipes to concurrently created child
processes, it was only natural to use the same lock in clone() and
_Fork(). But this is a new purpose and deserves a new lock.

So I am submitting a patch to introduce that lock.

Ciao,
Markus

[-- Attachment #2: 0001-Introduce-new-fork-lock-separate-from-__abort_lock.patch --]
[-- Type: text/x-diff, Size: 4000 bytes --]

From 7931b41e4b7ff7bc0c25d3141fef1fdc113b8203 Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Sat, 20 Jan 2024 08:44:04 +0100
Subject: [PATCH] Introduce new fork lock separate from __abort_lock.

The abort lock was being taken in several functions that have nothing to
do with SIGABRT or signal dispositions. The lock had become
dual-purpose, guarding against SIGABRT disposition changes and against a
concurrent fork. But these are two different purposes, so use a new lock
for the second one.

posix_spawn() has no reason to care about the SIGABRT disposition any
more than any other one, and it only changes signal dispositions in the
child process, which cannot call abort() concurrently because it is
single-threaded.
---
 src/internal/fork_impl.h  |  1 +
 src/linux/clone.c         |  2 +-
 src/process/_Fork.c       |  4 ++--
 src/process/fork_lock.c   |  3 +++
 src/process/posix_spawn.c | 10 +++++-----
 5 files changed, 12 insertions(+), 8 deletions(-)
 create mode 100644 src/process/fork_lock.c

diff --git a/src/internal/fork_impl.h b/src/internal/fork_impl.h
index f995fce2..3889d38a 100644
--- a/src/internal/fork_impl.h
+++ b/src/internal/fork_impl.h
@@ -13,6 +13,7 @@ extern hidden volatile int *const __timezone_lockptr;
 extern hidden volatile int *const __bump_lockptr;

 extern hidden volatile int *const __vmlock_lockptr;
+extern hidden volatile int __fork_lock[1];

 hidden void __malloc_atfork(int);
 hidden void __ldso_atfork(int);
diff --git a/src/linux/clone.c b/src/linux/clone.c
index 257c1cec..1e4201ae 100644
--- a/src/linux/clone.c
+++ b/src/linux/clone.c
@@ -51,7 +51,7 @@ int clone(int (*func)(void *), void *stack, int flags, void *arg, ...)
 		__clone(func, stack, flags, arg, ptid, tls, ctid));

 	__block_all_sigs(&csa.sigmask);
-	LOCK(__abort_lock);
+	LOCK(__fork_lock);

 	/* Setup the a wrapper start function for the child process to do
 	 * mimic _Fork in producing a consistent execution state. */
diff --git a/src/process/_Fork.c b/src/process/_Fork.c
index 9c07792d..370191c1 100644
--- a/src/process/_Fork.c
+++ b/src/process/_Fork.c
@@ -22,7 +22,7 @@ void __post_Fork(int ret)
 		libc.threads_minus_1 = 0;
 		if (libc.need_locks) libc.need_locks = -1;
 	}
-	UNLOCK(__abort_lock);
+	UNLOCK(__fork_lock);
 	if (!ret) __aio_atfork(1);
 }

@@ -31,7 +31,7 @@ pid_t _Fork(void)
 	pid_t ret;
 	sigset_t set;
 	__block_all_sigs(&set);
-	LOCK(__abort_lock);
+	LOCK(__fork_lock);
 #ifdef SYS_fork
 	ret = __syscall(SYS_fork);
 #else
diff --git a/src/process/fork_lock.c b/src/process/fork_lock.c
new file mode 100644
index 00000000..204af1f4
--- /dev/null
+++ b/src/process/fork_lock.c
@@ -0,0 +1,3 @@
+#include "fork_impl.h"
+
+volatile int __fork_lock[1];
diff --git a/src/process/posix_spawn.c b/src/process/posix_spawn.c
index 728551b3..74883527 100644
--- a/src/process/posix_spawn.c
+++ b/src/process/posix_spawn.c
@@ -9,6 +9,7 @@
 #include "lock.h"
 #include "pthread_impl.h"
 #include "fdop.h"
+#include "fork_impl.h"

 struct args {
 	int p[2];
@@ -180,12 +181,11 @@ int posix_spawn(pid_t *restrict res, const char *restrict path,
 	args.envp = envp;
 	pthread_sigmask(SIG_BLOCK, SIGALL_SET, &args.oldmask);

-	/* The lock guards both against seeing a SIGABRT disposition change
-	 * by abort and against leaking the pipe fd to fork-without-exec. */
-	LOCK(__abort_lock);
+	/* The lock guards against leaking the pipe fd to fork-without-exec. */
+	LOCK(__fork_lock);

 	if (pipe2(args.p, O_CLOEXEC)) {
-		UNLOCK(__abort_lock);
+		UNLOCK(__fork_lock);
 		ec = errno;
 		goto fail;
 	}
@@ -193,7 +193,7 @@ int posix_spawn(pid_t *restrict res, const char *restrict path,
 	pid = __clone(child, stack+sizeof stack,
 		CLONE_VM|CLONE_VFORK|SIGCHLD, &args);
 	close(args.p[1]);
-	UNLOCK(__abort_lock);
+	UNLOCK(__fork_lock);

 	if (pid > 0) {
 		if (read(args.p[0], &ec, sizeof ec) != sizeof ec) ec = 0;
--
2.39.2


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [musl] [PATCH] Split fork and abort locks
  2024-01-20  8:12 [musl] [PATCH] Split fork and abort locks Markus Wichmann
@ 2024-01-20 11:29 ` Alexey Izbyshev
  2024-01-20 21:24   ` Rich Felker
  2024-01-20 21:32 ` Rich Felker
  1 sibling, 1 reply; 4+ messages in thread
From: Alexey Izbyshev @ 2024-01-20 11:29 UTC (permalink / raw)
  To: musl

On 2024-01-20 11:12, Markus Wichmann wrote:
> Hi all,
> 
> a while ago I had noticed that __abort_lock was being taken in some
> functions that have nothing to do with SIGABRT. Namely in the forking
> functions. Investigating this a bit, I noticed that __abort_lock had
> become dual purpose. But this is a code smell.
> 
> Actually, there are several locks that have expanded in scope a bit
> since their introduction. At least the ptc lock (__inhibit_ptc() et 
> al.)
> deserves a closer look later on as well. Seems to me like in case of 
> the
> default stack size, that lock is used simply because an rwlock was
> needed and this one was around.
> 
> The corruption in this case probably came from posix_spawn(). That
> function takes the abort lock, because, as a baffling comment above the
> lock statement tells us, this guards against SIGABRT disposition
> changing. This is because abort() changes the disposition without
> calling sigaction(), so a handler would not be noted in the handler 
> set.
> 
> However, actually posix_spawn() does not need to care about this. The
> handler set is strictly additive, so all the signals it contains /may/
> have a handler. And abort() removes a handler. In the worst case, the
> handler set will spuriously contain SIGABRT, which is a condition the
> child needs to check for anyway.
> 
> And a concurrent sigaction() call from the application establishing a
> handler is no different for SIGABRT than for any other signal. That is
> handled by retrieving the handler set in the child, when everything is
> fixed since the child is single-threaded. For the same reason, there
> cannot be a concurrent call to abort() in the child.
> 
The problem that __abort_lock solves is that a child process created by 
fork/_Fork/clone/posix_spawn should not observe SIGABRT disposition 
reset  if abort is called by the parent concurrently with the child 
creation. For example, if the initial SIGABRT disposition is SIG_IGN, 
and one thread of a program calls posix_spawn while another thread calls 
abort, without __abort_lock the child could inherit SIG_DFL disposition. 
This is not related to maintaining handler_set used for posix_spawn 
optimization.

A separate reason for why removing __abort_lock LOCK/UNLOCK from clone.c 
and _Fork.c (as done in your patch) is wrong is because they take part 
in creation of a consistent execution state in the child (the child 
should be able to call abort and not deadlock).

Alexey

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [musl] [PATCH] Split fork and abort locks
  2024-01-20 11:29 ` Alexey Izbyshev
@ 2024-01-20 21:24   ` Rich Felker
  0 siblings, 0 replies; 4+ messages in thread
From: Rich Felker @ 2024-01-20 21:24 UTC (permalink / raw)
  To: musl

On Sat, Jan 20, 2024 at 02:29:28PM +0300, Alexey Izbyshev wrote:
> On 2024-01-20 11:12, Markus Wichmann wrote:
> >Hi all,
> >
> >a while ago I had noticed that __abort_lock was being taken in some
> >functions that have nothing to do with SIGABRT. Namely in the forking
> >functions. Investigating this a bit, I noticed that __abort_lock had
> >become dual purpose. But this is a code smell.
> >
> >Actually, there are several locks that have expanded in scope a bit
> >since their introduction. At least the ptc lock (__inhibit_ptc()
> >et al.)
> >deserves a closer look later on as well. Seems to me like in case
> >of the
> >default stack size, that lock is used simply because an rwlock was
> >needed and this one was around.
> >
> >The corruption in this case probably came from posix_spawn(). That
> >function takes the abort lock, because, as a baffling comment above the
> >lock statement tells us, this guards against SIGABRT disposition
> >changing. This is because abort() changes the disposition without
> >calling sigaction(), so a handler would not be noted in the
> >handler set.
> >
> >However, actually posix_spawn() does not need to care about this. The
> >handler set is strictly additive, so all the signals it contains /may/
> >have a handler. And abort() removes a handler. In the worst case, the
> >handler set will spuriously contain SIGABRT, which is a condition the
> >child needs to check for anyway.
> >
> >And a concurrent sigaction() call from the application establishing a
> >handler is no different for SIGABRT than for any other signal. That is
> >handled by retrieving the handler set in the child, when everything is
> >fixed since the child is single-threaded. For the same reason, there
> >cannot be a concurrent call to abort() in the child.
> >
> The problem that __abort_lock solves is that a child process created
> by fork/_Fork/clone/posix_spawn should not observe SIGABRT
> disposition reset  if abort is called by the parent concurrently
> with the child creation. For example, if the initial SIGABRT
> disposition is SIG_IGN, and one thread of a program calls
> posix_spawn while another thread calls abort, without __abort_lock
> the child could inherit SIG_DFL disposition. This is not related to
> maintaining handler_set used for posix_spawn optimization.
> 
> A separate reason for why removing __abort_lock LOCK/UNLOCK from
> clone.c and _Fork.c (as done in your patch) is wrong is because they
> take part in creation of a consistent execution state in the child
> (the child should be able to call abort and not deadlock).

Yes. In order to separate these locks, most places that take either
lock would have to take both locks. This is more costly and
error-prone, and has no advantages I can see.

Rich

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [musl] [PATCH] Split fork and abort locks
  2024-01-20  8:12 [musl] [PATCH] Split fork and abort locks Markus Wichmann
  2024-01-20 11:29 ` Alexey Izbyshev
@ 2024-01-20 21:32 ` Rich Felker
  1 sibling, 0 replies; 4+ messages in thread
From: Rich Felker @ 2024-01-20 21:32 UTC (permalink / raw)
  To: Markus Wichmann; +Cc: musl

On Sat, Jan 20, 2024 at 09:12:28AM +0100, Markus Wichmann wrote:
> Hi all,
> 
> a while ago I had noticed that __abort_lock was being taken in some
> functions that have nothing to do with SIGABRT. Namely in the forking
> functions. Investigating this a bit, I noticed that __abort_lock had
> become dual purpose. But this is a code smell.
> 
> Actually, there are several locks that have expanded in scope a bit
> since their introduction. At least the ptc lock (__inhibit_ptc() et al.)
> deserves a closer look later on as well. Seems to me like in case of the
> default stack size, that lock is used simply because an rwlock was
> needed and this one was around.

The scope of this lock has always been modification to state that new
thread creation depends on, which is largely "how much storage does a
new thread need?"

Originally, the relevant state was just the amount of TLS memory
needed. With the addition of default stack size control, that also
became part of the relevant state. In theory these could be protected
by different locks, but I don't see any good reason for splitting
them.

The reason fork takes the __inhibit_ptc lock is that it has to take
all the internal locks; this has nothing to do with the scope of
what's protected by the particular lock.

Rich

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-01-20 21:32 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-20  8:12 [musl] [PATCH] Split fork and abort locks Markus Wichmann
2024-01-20 11:29 ` Alexey Izbyshev
2024-01-20 21:24   ` Rich Felker
2024-01-20 21:32 ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).