From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: Re: pthread cancel cleanup and pthread_mutex_lock
Date: Tue, 29 May 2018 20:50:09 -0400
Message-ID: <20180530005009.GM1392@brightrain.aerifal.cx>

On Wed, May 30, 2018 at 10:06:17AM +1000, Patrick Oppenlander wrote:
> I accidentally hit send before I finished typing..
>
> > I've recently been running some of the open posix testsuite tests from
> > the linux test project.
> >
> > One particular test has been giving me headaches:
> > https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/conformance/interfaces/pthread_mutex_init/1-2.c
> >
> > There are a couple of different tests in there but the most
> > interesting one is the deadlock test which does the following:
> >
> > Thread A:                 Thread B:
> > pthread_create
> >                           pthread_cleanup_push(...)
> >                           pthread_mutex_lock(M)
> >                           pthread_setcanceltype(ASYNC)
> >                           pthread_setcancelstate(ENABLE)
>                             pthread_mutex_lock(M) <-- blocks here
> pthread_cancel(B)
> pthread_join(B)
>
> The test then expects the cleanup handler to run and unlock mutex M
> allowing thread B to run to completion and the join to succeed.

This test is invalid. pthread_mutex_lock is not async-cancel-safe and
cannot legally be called while cancel type is async.

FYI something like 50% of the "Open POSIX Test Suite" tests are
invalid; in the majority of cases they're testing some property after
undefined behavior has been invoked like here.

> I've run this test with musl, glibc and on some different platforms
> with varying results:
>
> x86_64 linux 4.16.11, glibc: test runs to completion
> x86_64 linux 4.16.11, musl: deadlock (cleanup handler doesn't run)
> arm linux 4.16.5, musl: test runs to completion

The test is invalid in other ways too, involving races. It attempts to
use sched_yield to ensure that the test thread enters
pthread_mutex_lock a second time, but there's no reason to expect that
to do anything, especially if there are sufficiently many cores (as
many or more than running threads). I suspect the different behaviors
come down to just different scheduling properties due to performance
differences, or something like that. Naively, I would expect the test
to "work" despite being invalid.

> I'm not even sure that this test is valid -- I can't find any
> documentation which says that pthread_mutex_lock is a cancellation
> point, or that you're allowed to call pthread_mutex_unlock from an
> async cancel handler.

You can call anything you want from an async cancel handler, but you
can't call any libc functions except the ones controlling cancel state
while cancel type is async. Basically, all you can do in async cancel
state is pure computation.

> However, it's still concerning to see different results on different platforms.
>
> What's the expected behaviour here?

Nothing meaningful.

Rich
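
For illustration only (this is not from the original message): a
minimal C sketch of a pattern that stays within the rule above. The
thread takes the mutex while its cancel type is still deferred, and
does nothing but pure computation once the type is async. The cleanup
handler may unlock the mutex because handlers are allowed to call
anything. The names worker, unlock_m and async_cancel_demo are
hypothetical, and the sketch assumes atomic_int is lock-free on the
target, so that the store compiles to plain instructions rather than a
library call.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static atomic_int running;              /* set once async cancel is on */
static volatile unsigned long counter;  /* keeps the loop observable */

/* Cleanup handler: runs in the cancelled thread, which owns the mutex. */
static void unlock_m(void *arg)
{
    pthread_mutex_unlock(arg);
}

static void *worker(void *arg)
{
    int old;
    (void)arg;

    /* All other libc calls happen while the cancel type is deferred. */
    pthread_mutex_lock(&m);
    pthread_cleanup_push(unlock_m, &m);

    /* Functions controlling cancel state are allowed in async mode. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old);
    atomic_store(&running, 1);   /* assumed lock-free, not a libc call */
    for (;;)
        counter++;               /* pure computation only */

    pthread_cleanup_pop(1);      /* unreachable; closes the push macro */
    return NULL;
}

/* Returns 1 if the worker was cancelled and its handler released m. */
int async_cancel_demo(void)
{
    pthread_t t;
    void *res;

    atomic_store(&running, 0);
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return 0;
    while (!atomic_load(&running))
        sched_yield();           /* the caller is not in async mode */
    pthread_cancel(t);
    pthread_join(t, &res);

    if (res != PTHREAD_CANCELED)
        return 0;
    if (pthread_mutex_trylock(&m) != 0)
        return 0;                /* handler did not release the mutex */
    pthread_mutex_unlock(&m);
    return 1;
}
```

Calling async_cancel_demo() from main (built with -pthread) should
return 1 on a conforming implementation. Unlike the test discussed
above, the worker never calls pthread_mutex_lock while its cancel type
is async.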