mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] Pending patches for MT-fork stuff
@ 2020-09-28 23:26 Rich Felker
  2020-09-28 23:37 ` Rich Felker
  0 siblings, 1 reply; 2+ messages in thread
From: Rich Felker @ 2020-09-28 23:26 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 476 bytes --]

In investigating the MT-fork deadlock stuff and working on a patch
that makes it attempt to trap rather than deadlocking, I found a
problem with interaction of fork and aio. A fix for that, along with
the patch to trap, and a minimal testcase for the aio bug, are
attached.

There's also a problematic interaction of abort with fork that was
just found in glibc that also exists in musl, that I still need to
fix. I'll follow up later with a proposed solution for that.

Rich

[-- Attachment #2: 0001-fix-fork-of-processes-with-active-async-io-contexts.patch --]
[-- Type: text/plain, Size: 2876 bytes --]

From 34904d830a9fd1f6fc47218f38c111698303d2fe Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Mon, 28 Sep 2020 18:38:27 -0400
Subject: [PATCH 1/3] fix fork of processes with active async io contexts

previously, if a file descriptor had aio operations pending in the
parent before fork, attempting to close it in the child would attempt
to cancel a thread belonging to the parent. this could deadlock, fail,
or crash the whole process of the cancellation signal handler was not
yet installed in the parent. in addition, further use of aio from the
child could malfunction or deadlock.

POSIX specifies that async io operations are not inherited by the
child on fork, so clear the entire aio fd map in the child, and take
the aio map lock (with signals blocked) across the fork so that the
lock is kept in a consistent state.
---
 src/aio/aio.c               | 14 ++++++++++++++
 src/internal/pthread_impl.h |  2 ++
 src/process/fork.c          |  3 +++
 3 files changed, 19 insertions(+)

diff --git a/src/aio/aio.c b/src/aio/aio.c
index 6d34fa86..f59679c3 100644
--- a/src/aio/aio.c
+++ b/src/aio/aio.c
@@ -392,6 +392,20 @@ int __aio_close(int fd)
 	return fd;
 }
 
+void __aio_atfork(int who)
+{
+	if (who<0) {
+		pthread_rwlock_rdlock(&maplock);
+		return;
+	}
+	if (who>0 && map) for (int a=0; a<(-1U/2+1)>>24; a++)
+		if (map[a]) for (int b=0; b<256; b++)
+			if (map[a][b]) for (int c=0; c<256; c++)
+				if (map[a][b][c]) for (int d=0; d<256; d++)
+					map[a][b][c][d] = 0;
+	pthread_rwlock_unlock(&maplock);
+}
+
 weak_alias(aio_cancel, aio_cancel64);
 weak_alias(aio_error, aio_error64);
 weak_alias(aio_fsync, aio_fsync64);
diff --git a/src/internal/pthread_impl.h b/src/internal/pthread_impl.h
index 4d709bbc..358ad1ce 100644
--- a/src/internal/pthread_impl.h
+++ b/src/internal/pthread_impl.h
@@ -162,6 +162,8 @@ extern hidden void *__pthread_tsd_main[];
 extern hidden volatile int __aio_fut;
 extern hidden volatile int __eintr_valid_flag;
 
+extern hidden void __aio_atfork(int);
+
 hidden int __clone(int (*)(void *), void *, int, void *, ...);
 hidden int __set_thread_area(void *);
 hidden int __libc_sigaction(int, const struct sigaction *, struct sigaction *);
diff --git a/src/process/fork.c b/src/process/fork.c
index 7e984ff8..dbaa9402 100644
--- a/src/process/fork.c
+++ b/src/process/fork.c
@@ -10,6 +10,7 @@ static void dummy(int x)
 }
 
 weak_alias(dummy, __fork_handler);
+weak_alias(dummy, __aio_atfork);
 
 pid_t fork(void)
 {
@@ -17,6 +18,7 @@ pid_t fork(void)
 	sigset_t set;
 	__fork_handler(-1);
 	__block_all_sigs(&set);
+	__aio_atfork(-1);
 #ifdef SYS_fork
 	ret = __syscall(SYS_fork);
 #else
@@ -32,6 +34,7 @@ pid_t fork(void)
 		libc.threads_minus_1 = 0;
 		if (libc.need_locks) libc.need_locks = -1;
 	}
+	__aio_atfork(!ret);
 	__restore_sigs(&set);
 	__fork_handler(!ret);
 	return __syscall_ret(ret);
-- 
2.21.0


[-- Attachment #3: 0002-add-hidden-declaration-for-internal-__aio_close-func.patch --]
[-- Type: text/plain, Size: 774 bytes --]

From 65c93ff68f93f83b471e52edd989c30d1c0f437f Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Mon, 28 Sep 2020 18:47:13 -0400
Subject: [PATCH 2/3] add hidden declaration for internal __aio_close function

---
 src/internal/pthread_impl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/internal/pthread_impl.h b/src/internal/pthread_impl.h
index 358ad1ce..23ee566c 100644
--- a/src/internal/pthread_impl.h
+++ b/src/internal/pthread_impl.h
@@ -162,6 +162,7 @@ extern hidden void *__pthread_tsd_main[];
 extern hidden volatile int __aio_fut;
 extern hidden volatile int __eintr_valid_flag;
 
+extern hidden int __aio_close(int);
 extern hidden void __aio_atfork(int);
 
 hidden int __clone(int (*)(void *), void *, int, void *, ...);
-- 
2.21.0


[-- Attachment #4: 0003-trap-don-t-deadlock-on-AS-unsafe-operations-after-mu.patch --]
[-- Type: text/plain, Size: 3119 bytes --]

From 9179864fa24ecdabd2d2aadddf4d937cfb824090 Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Mon, 28 Sep 2020 17:30:04 -0400
Subject: [PATCH 3/3] trap, don't deadlock, on AS-unsafe operations after
 multithreaded fork

POSIX specifies the forked child of a multithreaded process to run in
an async signal context where it cannot call AS-unsafe functions. the
underlying motivation for this restriction is that internal resources
in the parent may have been owned by other threads, which do not exist
in the child, at the moment of fork, and may not be possible to access
safely.

prior to commit e01b5939b38aea5ecbe41670643199825874b26c, the child
simply skipped locking due to its being single-threaded, resulting in
silent use of inconsistent state, possibly leading to crashes,
deadlocks, or other kinds of misbehavior. since that change, any
attempt by the child to take a lock that was held in another thread of
the parent reliably deadlocks. this prevents unsafe forward progress,
but is an unexpected failure mode and difficult to log and handle in
automated build and testing workflows.

instead, track in the child that the fork was by a multithreaded
parent, and if so, trap before waiting on a lock. this only works for
internal locks serviced by LOCK/__lock, so it's possible that some
things using other mechanisms will still deadlock, but this should
catch most of the common problems and assist in finding and fixing
erroneous application code.

one subtlety that almost makes this change wrong is that we use
threads internally for POSIX AIO and for SIGEV_THREAD timers. the
existence of these internal threads cannot impose on what the
application can do in the child. these subsystems don't use the
affected locks themselves, but it's possible that a process that's
only "multithreaded as an implementation detail" in the parent could
fork, then legitimately create multiple threads in the child, leading
to lock contention in the child that is not a result of UB by the
application. in this case, however, pthread_create will have reset
need_locks to a positive value, disarming the trap.
---
 src/process/fork.c  | 2 +-
 src/thread/__lock.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/process/fork.c b/src/process/fork.c
index dbaa9402..538a0bcb 100644
--- a/src/process/fork.c
+++ b/src/process/fork.c
@@ -32,7 +32,7 @@ pid_t fork(void)
 		self->next = self->prev = self;
 		__thread_list_lock = 0;
 		libc.threads_minus_1 = 0;
-		if (libc.need_locks) libc.need_locks = -1;
+		if (libc.need_locks > 0) libc.need_locks = -2;
 	}
 	__aio_atfork(!ret);
 	__restore_sigs(&set);
diff --git a/src/thread/__lock.c b/src/thread/__lock.c
index 60eece49..75412cc8 100644
--- a/src/thread/__lock.c
+++ b/src/thread/__lock.c
@@ -24,6 +24,7 @@ void __lock(volatile int *l)
 	int current = a_cas(l, 0, INT_MIN + 1);
 	if (need_locks < 0) libc.need_locks = 0;
 	if (!current) return;
+	if (need_locks < -1) a_crash();
 	/* A first spin loop, for medium congestion. */
 	for (unsigned i = 0; i < 10; ++i) {
 		if (current < 0) current -= INT_MIN + 1;
-- 
2.21.0


[-- Attachment #5: aio_fork.c --]
[-- Type: text/plain, Size: 292 bytes --]

#include <aio.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
	char buf[256];
	int fd = dup(0);
	struct aiocb cb = { .aio_fildes = fd, .aio_buf = buf, .aio_nbytes = sizeof buf };
	aio_read(&cb);

	pid_t pid = fork();
	if (!pid) {
		close(fd);
		_exit(0);
	}
	waitpid(pid, 0, 0);
}

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [musl] Pending patches for MT-fork stuff
  2020-09-28 23:26 [musl] Pending patches for MT-fork stuff Rich Felker
@ 2020-09-28 23:37 ` Rich Felker
  0 siblings, 0 replies; 2+ messages in thread
From: Rich Felker @ 2020-09-28 23:37 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 585 bytes --]

On Mon, Sep 28, 2020 at 07:26:15PM -0400, Rich Felker wrote:
> In investigating the MT-fork deadlock stuff and working on a patch
> that makes it attempt to trap rather than deadlocking, I found a
> problem with interaction of fork and aio. A fix for that, along with
> the patch to trap, and a minimal testcase for the aio bug, are
> attached.
> 
> There's also a problematic interaction of abort with fork that was
> just found in glibc that also exists in musl, that I still need to
> fix. I'll follow up later with a proposed solution for that.

And here's the proposed abort fix.

[-- Attachment #2: 0004-move-__abort_lock-to-its-own-file-and-drop-pointless.patch --]
[-- Type: text/plain, Size: 2081 bytes --]

From 000b86dffecf68c1f479e532dcf4e9a403062521 Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Mon, 28 Sep 2020 19:30:19 -0400
Subject: [PATCH 4/5] move __abort_lock to its own file and drop pointless
 weak_alias trick

the dummy definition of __abort_lock in sigaction.c was performing
exactly the same role that putting the lock in its own source file
could and should have been used to achieve.

while we're moving it, give it a proper declaration.
---
 src/exit/abort.c            | 2 --
 src/exit/abort_lock.c       | 3 +++
 src/internal/pthread_impl.h | 2 ++
 src/signal/sigaction.c      | 6 ------
 4 files changed, 5 insertions(+), 8 deletions(-)
 create mode 100644 src/exit/abort_lock.c

diff --git a/src/exit/abort.c b/src/exit/abort.c
index e1980f10..f21f458e 100644
--- a/src/exit/abort.c
+++ b/src/exit/abort.c
@@ -6,8 +6,6 @@
 #include "lock.h"
 #include "ksigaction.h"
 
-hidden volatile int __abort_lock[1];
-
 _Noreturn void abort(void)
 {
 	raise(SIGABRT);
diff --git a/src/exit/abort_lock.c b/src/exit/abort_lock.c
new file mode 100644
index 00000000..3af72c7b
--- /dev/null
+++ b/src/exit/abort_lock.c
@@ -0,0 +1,3 @@
+#include "pthread_impl.h"
+
+volatile int __abort_lock[1];
diff --git a/src/internal/pthread_impl.h b/src/internal/pthread_impl.h
index 23ee566c..21d51be6 100644
--- a/src/internal/pthread_impl.h
+++ b/src/internal/pthread_impl.h
@@ -197,6 +197,8 @@ hidden void __tl_sync(pthread_t);
 
 extern hidden volatile int __thread_list_lock;
 
+extern hidden volatile int __abort_lock[1];
+
 extern hidden unsigned __default_stacksize;
 extern hidden unsigned __default_guardsize;
 
diff --git a/src/signal/sigaction.c b/src/signal/sigaction.c
index c109bea0..a4737404 100644
--- a/src/signal/sigaction.c
+++ b/src/signal/sigaction.c
@@ -7,12 +7,6 @@
 #include "lock.h"
 #include "ksigaction.h"
 
-static volatile int dummy_lock[1] = { 0 };
-
-extern hidden volatile int __abort_lock[1];
-
-weak_alias(dummy_lock, __abort_lock);
-
 static int unmask_done;
 static unsigned long handler_set[_NSIG/(8*sizeof(long))];
 
-- 
2.21.0


[-- Attachment #3: 0005-fix-race-deadlock-with-abort-in-forked-child-of-mult.patch --]
[-- Type: text/plain, Size: 1178 bytes --]

From 7e661e4ff9f70b625e496b54729943bf3f7ecfac Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Mon, 28 Sep 2020 19:32:34 -0400
Subject: [PATCH 5/5] fix race deadlock with abort in forked child of
 multithreaded parent

if the multithreaded parent forked while another thread was calling
sigaction for SIGABRT or calling abort, the child could inherit a lock
state in which future calls to abort will deadlock. this is
nonconforming since abort is AS-safe and permitted to be called in the
MT-forked child.

since there is no userspace data state protected by the lock (it's
only used to exclude sigaction for SIGABRT while abort is in
progress), just reset the lock word to 0 in the child rather than
holding the lock across the fork.
---
 src/process/fork.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/process/fork.c b/src/process/fork.c
index 538a0bcb..b5a5fa11 100644
--- a/src/process/fork.c
+++ b/src/process/fork.c
@@ -33,6 +33,7 @@ pid_t fork(void)
 		__thread_list_lock = 0;
 		libc.threads_minus_1 = 0;
 		if (libc.need_locks > 0) libc.need_locks = -2;
+		*__abort_lock = 0;
 	}
 	__aio_atfork(!ret);
 	__restore_sigs(&set);
-- 
2.21.0


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-09-28 23:38 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-28 23:26 [musl] Pending patches for MT-fork stuff Rich Felker
2020-09-28 23:37 ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).