From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_LOW, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 22113 invoked from network); 1 Jun 2023 10:13:13 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 1 Jun 2023 10:13:13 -0000 Received: (qmail 32680 invoked by uid 550); 1 Jun 2023 10:13:10 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 32646 invoked from network); 1 Jun 2023 10:13:09 -0000 DKIM-Filter: OpenDKIM Filter v2.11.0 mail.ispras.ru A478744C1017 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ispras.ru; s=default; t=1685614377; bh=pMDzZG7RC4HDOQ13qdJUXHuiiPhOWl1nzZHlmUKg8lg=; h=Date:From:To:Subject:Reply-To:In-Reply-To:References:From; b=QoArK1+WhREezdqQ3p1g+IZF3Hgee30N5n/jVJGsIMvrqgR1PATGr1gHkWNLjTmVL WDODADNIqn9vrKK5dGzaxXcCsSLIcA5MyucBxiU8Fl4s4q0Ko48vUSrIwQvC5pLz5J kw1YFal9vyA+GjNQY2W/9mssTINO2x3gpEebjLi4= MIME-Version: 1.0 Date: Thu, 01 Jun 2023 13:12:57 +0300 From: Alexey Izbyshev To: musl@lists.openwall.com Mail-Followup-To: musl@lists.openwall.com In-Reply-To: <20230530233529.GZ4163@brightrain.aerifal.cx> References: <20230530233529.GZ4163@brightrain.aerifal.cx> User-Agent: Roundcube Webmail/1.4.4 Message-ID: X-Sender: izbyshev@ispras.ru Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [musl] [PATCH] [RFC] make clone() usable On 2023-05-31 02:35, Rich Felker wrote: > As discussed before (see the 2021 thread "Incorrect thread TID > caching") clone() has been effectively unusable because it produces a > child process in invalid state, with wrong tid in its thread > structure, among other problems. > > The attached proposed patch attempts to make clone() usable by having > it share the _Fork logic for establishing a consistent process state > after forking, and also blocks use of flags which produce invalid > state. > Tangentially, while thinking about this, I realized that because the kernel clears the address of the exit futex on clone and _Fork doesn't re-establish it, the following blocks forever: void *thr(void *arg) { // Blocks in __tl_sync because __thread_list_lock is never unlocked pthread_join(*(pthread_t *)arg, NULL)); return NULL; } int main() { if (!fork()) { static pthread_t pt; pt = pthread_self(); pthread_create(&(pthread_t){0}, NULL, thr, &pt); pthread_exit(NULL); } wait(NULL); } This could be fixed by the following change in _Fork.c: - self->tid = __syscall(SYS_gettid); + self->tid = __syscall(SYS_set_tid_address, &__thread_list_lock); > With CLONE_VM, the old behavior is kept, with a caveat that the child > context is extremely restrictive, ala vfork. > > It was raised previously (pardon the pun) that raise() should perhaps > be modified to make a SYS_gettid syscall rather than using the thread > structure tid, so that it's possible to call raise() in a vfork > context without signaling the wrong process. It might make sense to do > that now too. > Yes, it would be nice to have raise and abort (which also reads the cached tid separately) working in a vfork child. > diff --git a/src/linux/clone.c b/src/linux/clone.c > index 8c1af7d3..e8f76bbc 100644 > --- a/src/linux/clone.c > +++ b/src/linux/clone.c > @@ -4,18 +4,55 @@ > #include > #include "pthread_impl.h" > #include "syscall.h" > +#include "lock.h" > +#include "fork_impl.h" > + > +struct clone_start_args { > + int (*func)(void *); > + void *arg; > + sigset_t sigmask; > +}; > + > +static int clone_start(void *arg) > +{ > + struct clone_start_args *csa = arg; > + __post_Fork(0); > + __restore_sigs(&csa->sigmask); > + return csa->func(csa->arg); > +} > > int clone(int (*func)(void *), void *stack, int flags, void *arg, ...) > { > + struct clone_start_args csa; > va_list ap; > - pid_t *ptid, *ctid; > - void *tls; > + pid_t *ptid = 0, *ctid = 0; > + void *tls = 0; > + > + /* Flags that produce an invalid thread/TLS state are disallowed. */ > + int badflags = CLONE_THREAD | CLONE_SETTLS | CLONE_CHILD_CLEARTID; > + if (flags & badflags) > + return __syscall_ret(-EINVAL); > Maybe also check for NULL stack, as proposed in https://www.openwall.com/lists/musl/2022/08/02/1 ? > va_start(ap, arg); > - ptid = va_arg(ap, pid_t *); > - tls = va_arg(ap, void *); > - ctid = va_arg(ap, pid_t *); > + if (flags & (CLONE_PIDFD | CLONE_PARENT_SETTID | CLONE_CHILD_SETTID)) > + ptid = va_arg(ap, pid_t *); > + if (flags & CLONE_CHILD_SETTID) { > + tls = va_arg(ap, void *); > + ctid = va_arg(ap, pid_t *); > + } > va_end(ap); > > - return __syscall_ret(__clone(func, stack, flags, arg, ptid, tls, > ctid)); > + if (flags & CLONE_VM) > + return __syscall_ret(__clone(func, stack, flags, arg, ptid, tls, > ctid)); > + > + __block_all_sigs(&csa.sigmask); > + LOCK(__abort_lock); > + > + csa.func = func; > + csa.arg = arg; > + int ret = __clone(clone_start, stack, flags, &csa, ptid, tls, ctid); > + > + __post_Fork(ret); > + __restore_sigs(&csa.sigmask); > + return __syscall_ret(ret); > } Another thing that might be somewhat relevant if we expect to see wider usage of clone is that syscall (the public function) currently blindly extracts six arguments via va_arg, which on some architectures requires sufficient stack space not to crash. So on i386 the following silly function will crash if passed to clone, provided that "stack" points to the beginning of unmapped space past the stack page end. int f(void *arg) { return syscall(SYS_gettid); } And it will crash even on x86-64 if tail call optimization triggers, which doesn't happen with current GCC due to long/int return type mismatch. (This test crashes with glibc 2.37 as well.) pthread_create masks this issue because it always consumes some stack space with struct start_args. Thanks, Alexey