From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Reply-To: musl@lists.openwall.com
MIME-Version: 1.0
Date: Sat, 11 Feb 2023 23:14:33 +0300
From: Alexey Izbyshev
To: musl@lists.openwall.com
Mail-Followup-To: musl@lists.openwall.com
In-Reply-To: <20230211194950.GN4163@brightrain.aerifal.cx>
References: <1a0289c15879bef6d538c0066f58545c@ispras.ru>
 <20230210162957.GB4163@brightrain.aerifal.cx>
 <63c0897d647936c946268f5a967a5e4d@ispras.ru>
 <20230211150603.GI4163@brightrain.aerifal.cx>
 <20230211171338.GD1903@voyager>
 <2da3840a9345c0a810e9d93ab4f6bca7@ispras.ru>
 <20230211175948.GK4163@brightrain.aerifal.cx>
 <20230211183505.GL4163@brightrain.aerifal.cx>
 <20230211194950.GN4163@brightrain.aerifal.cx>
User-Agent: Roundcube Webmail/1.4.4
Message-ID: <4408deeb62fe668bf720d3c6c8bedda2@ispras.ru>
X-Sender: izbyshev@ispras.ru
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [musl] [PATCH] mq_notify: fix close/recv race on failure path

On 2023-02-11 22:49, Rich Felker wrote:
> On Sat, Feb 11, 2023 at 10:28:20PM +0300, Alexey Izbyshev wrote:
>> On 2023-02-11 21:35, Rich Felker wrote:
>> >On Sat, Feb 11, 2023 at 09:08:53PM +0300, Alexey Izbyshev wrote:
>> >>On 2023-02-11 20:59, Rich Felker wrote:
>> >>>On Sat, Feb 11, 2023 at 08:50:15PM +0300, Alexey Izbyshev wrote:
>> >>>>On 2023-02-11 20:13, Markus Wichmann wrote:
>> >>>>>On Sat, Feb 11, 2023 at 10:06:03AM -0500, Rich Felker wrote:
>> >>>>>>--- a/src/thread/pthread_detach.c
>> >>>>>>+++ b/src/thread/pthread_detach.c
>> >>>>>>@@ -5,8 +5,12 @@ static int __pthread_detach(pthread_t t)
>> >>>>>> {
>> >>>>>> /* If the cas fails, detach state is either already-detached
>> >>>>>> * or exiting/exited, and pthread_join will trap or cleanup. */
>> >>>>>>- if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
>> >>>>>>DT_JOINABLE)
>> >>>>>>+ if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
>> >>>>>>DT_JOINABLE) {
>> >>>>>>+ int cs;
>> >>>>>>+ __pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
>> >>>>>> return __pthread_join(t, 0);
>> >>>>> ^^^^^^ I think you forgot to rework this.
>> >>>>>>+ __pthread_setcancelstate(cs, 0);
>> >>>>>>+ }
>> >>>>>> return 0;
>> >>>>>> }
>> >>>>>>
>> >>>>>
>> >>>>>I see no other obvious missteps, though.
>> >>>>>
>> >>>>Same here, apart from this and misspelled "pthred_detach" in the
>> >>>>commit message, the patches look good to me.
>> >>>>
>> >>>>Regarding the POSIX requirement to run sigev_notify_function in the
>> >>>>context of a detached thread, while it's possible to observe the
>> >>>>wrong detachstate for a short while via pthread_getattr_np after
>> >>>>these patches, I'm not sure there is a standard way to do that. Even
>> >>>>if it exists, this minor issue may be not worth caring about.
>> >>>
>> >>>Would this just be if the notification callback executes before
>> >>>mq_notify returns in the parent?
>> >>
>> >>Yes, it seems so.
>> >>
>> >>>I suppose we could have the newly
>> >>>created thread do the work of making the syscall, handling the error
>> >>>case, detaching itself on success and reporting back to the
>> >>>mq_notify function whether it succeeded or failed via the
>> >>>semaphore/args structure. Thoughts on that?
>> >>>
>> >>Could we just move pthread_detach call to the worker thread to the
>> >>point after pthread_cleanup_pop?
>> >
>> >I thought that sounded dubious, in that it might lead to an attempt to
>> >join a detached thread, but maybe it's safe to assume recv will never
>> >return if the mq_notify syscall failed...?
>> >
>> Actually, because app signals are not blocked when the worker thread
>> is created, recv can indeed return early with EINTR. But this looks
>> like just a bug.
>
> Yes. While it's not a conformance bug to run with signals unblocked
> ("The signal mask of this thread is implementation-defined.") it's a
> functional bug to ever introduce threads that don't block all
> application signals, since these interfere with sigwait & other
> application control of where signals are delivered. This is an
> oversight. I'll make it mask all signals.
>
>> Otherwise, mq_notify already assumes that recv can't return before
>> SYS_mq_notify (if it did, the syscall would try to register a closed
>> fd). I haven't tried to prove it (e.g. maybe recv may need to
>> allocate something before blocking and hence can fail with ENOMEM?),
>> but if it's true, I don't see how a failed SYS_mq_notify could cause
>> recv to return, so joining a detached thread should be impossible if
>> we make pthread_detach follow recv.
>
> I'm thinking for now maybe we should just drop the joining on error,
> and leave it starting out detached.
> While recv should not fail, it's obviously possible to make it fail in
> a seccomp sandbox, and you don't want that to turn into UB inside the
> implementation. If it does fail, the thread should still exit, but we
> have no way to synchronize with the mq_notify parent to decide whether
> it's being joined or not in this case without extra sync machinery...
>
By dropping pthread_join we'd avoid introducing a new UB case if recv
fails unexpectedly, but the existing case that I mentioned
(SYS_mq_notify trying to register a closed fd) would remain. It seems to
me that moving SYS_mq_notify into the worker thread, as you suggested
earlier, is the cleanest option if we're worried about recv returning
unexpectedly.

Alexey