Date: Tue, 9 Jan 2024 20:55:50 -0500
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] Protect pthreads' mutexes against use-after-destroy
Message-ID: <20240110015550.GP4163@brightrain.aerifal.cx>
In-Reply-To: <20240109190726.GO4163@brightrain.aerifal.cx>
References: <20240109190726.GO4163@brightrain.aerifal.cx>

On Tue, Jan 09, 2024 at 02:07:26PM -0500, Rich Felker wrote:
> On Tue, Jan 09, 2024 at 03:37:17PM +0100, jvoisin wrote:
> > Ohai,
> >
> > as discussed on irc, Android's bionic has a check to prevent
> > use-after-destroy on pthread mutexes
> > (https://github.com/LineageOS/android_bionic/blob/e0aac7df6f58138dae903b5d456c947a3f8092ea/libc/bionic/pthread_mutex.cpp#L803),
> > and musl doesn't.
> >
> > While odds are that this is a super-duper common bug, it would still be
> > nice to have this kind of protection, since it's cheap, and would
> > prevent/make it easy to diagnose weird states.
> >
> > Is this something that should/could be implemented?
> >
> > o/
>
> I think you meant that the odds are it's not common. There's already
> enough complexity in the code paths for supporting all the different
> mutex types that my leaning would be, if we do any hardening for
> use-after-destroy, that it should probably just take the form of
> putting the object in a state that will naturally deadlock or error
> rather than adding extra checks to every path where it's used.
>
> If OTOH we do want it to actually trap in all cases where it's used
> after destroy, the simplest way to achieve that is probably to set it
> up as a non-robust non-PI recursive or errorchecking mutex with
> invalid prev/next pointers and owner of 0x3fffffff. Then the only
> place that would actually have to have an explicit trap is trylock in
> the code path:
>
>     if (own == 0x3fffffff) return ENOTRECOVERABLE;
>
> where it could trap if type isn't robust. The unlock code path would
> trap on accessing invalid prev/next pointers.

Unfortunately, while researching this I discovered a problem we need
to deal with: at some point Linux quietly changed the futex ABI, so
that bit 29 is no longer reserved but potentially a tid bit. This was
documented in 9c40365a65d62d7c06a95fb331b3442cb02d2fd9 but apparently
actually happened at the source level a long time before that. So, we
cannot assume 0x3fffffff is not a valid tid, and thereby cannot assume
0x7fffffff is not equal to ownerdead|valid_tid.

This probably means we need to find a way to encode "not recoverable"
as 0x40000000, as 0 is now the _only_ value in the low 30 bits that
can't potentially be a valid tid.

I'll look at this more over the next day or two. It's probably fixable
but requires fiddling with delicate logic.

Note that the only in-the-wild breakage possible is on systems where
the pid/tid limit has been set extremely high, where attempts to lock
a recursive or errorchecking mutex owned by a thread with tid
0x3fffffff could malfunction.

Rich
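
The collision described above can be checked with a small C sketch.
The macro and function names below are hypothetical and are not musl's
or the kernel's. The only things taken from the message are the bit
layout (bit 30 as the owner-died flag, the low 30 bits as the tid) and
the two sentinel values 0x7fffffff and 0x40000000.

```c
#include <assert.h>
#include <stdint.h>

/* Lock-word layout assumed from the message (names are illustrative). */
#define WORD_OWNER_DIED 0x40000000u  /* bit 30: previous owner died */
#define WORD_TID_MASK   0x3fffffffu  /* bits 0-29: owner tid */

/* The two candidate "not recoverable" encodings discussed above. */
#define NOTRECOV_OLD 0x7fffffffu     /* owner-died bit + tid 0x3fffffff */
#define NOTRECOV_NEW 0x40000000u     /* owner-died bit + tid 0 */

/* The old sentinel is bit-for-bit "owner died" OR the largest tid. */
static_assert(NOTRECOV_OLD == (WORD_OWNER_DIED | WORD_TID_MASK),
              "old sentinel equals ownerdead|0x3fffffff");

/* Extract the owner tid from a lock word. */
uint32_t word_tid(uint32_t word)
{
	return word & WORD_TID_MASK;
}

/* Build a lock word that combines the owner-died bit with a tid. */
uint32_t owner_died_with_tid(uint32_t tid)
{
	return WORD_OWNER_DIED | (tid & WORD_TID_MASK);
}

/* Nonzero if a real (nonzero) tid is indistinguishable from the old sentinel. */
int old_sentinel_collides(uint32_t tid)
{
	return tid != 0 && owner_died_with_tid(tid) == NOTRECOV_OLD;
}

/* Nonzero if a real (nonzero) tid is indistinguishable from the new sentinel. */
int new_sentinel_collides(uint32_t tid)
{
	return tid != 0 && owner_died_with_tid(tid) == NOTRECOV_NEW;
}
```

Under these assumptions, old_sentinel_collides(0x3fffffff) is nonzero,
while new_sentinel_collides() is zero for every nonzero tid, because
0x40000000 carries a tid field of 0 and 0 is never a valid tid.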