* pthread cancel cleanup and pthread_mutex_lock
From: Patrick Oppenlander @ 2018-05-29 23:54 UTC
To: musl

I've recently been running some of the open posix testsuite tests from
the linux test project.

One particular test has been giving me headaches:
https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/conformance/interfaces/pthread_mutex_init/1-2.c

There are a couple of different tests in there but the most
interesting one is the deadlock test which does the following:

Thread A:                        Thread B:
pthread_create
                                 pthread_cleanup_push(...)
                                 pthread_mutex_lock(M)
                                 pthread_setcanceltype(ASYNC)
                                 pthread_setcancelstate(ENABLE)
* Re: pthread cancel cleanup and pthread_mutex_lock
From: Patrick Oppenlander @ 2018-05-30 0:06 UTC
To: musl

I accidentally hit send before I finished typing..

> I've recently been running some of the open posix testsuite tests from
> the linux test project.
>
> One particular test has been giving me headaches:
> https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/conformance/interfaces/pthread_mutex_init/1-2.c
>
> There are a couple of different tests in there but the most
> interesting one is the deadlock test which does the following:
>
> Thread A:                        Thread B:
> pthread_create
>                                  pthread_cleanup_push(...)
>                                  pthread_mutex_lock(M)
>                                  pthread_setcanceltype(ASYNC)
>                                  pthread_setcancelstate(ENABLE)
                                   pthread_mutex_lock(M) <-- blocks here
pthread_cancel(B)
pthread_join(B)

The test then expects the cleanup handler to run and unlock mutex M
allowing thread B to run to completion and the join to succeed.

I've run this test with musl, glibc and on some different platforms
with varying results:

x86_64 linux 4.16.11, glibc: test runs to completion
x86_64 linux 4.16.11, musl: deadlock (cleanup handler doesn't run)
arm linux 4.16.5, musl: test runs to completion

I'm not even sure that this test is valid -- I can't find any
documentation which says that pthread_mutex_lock is a cancellation
point, or that you're allowed to call pthread_mutex_unlock from an
async cancel handler.

However, it's still concerning to see different results on different
platforms.

What's the expected behaviour here?

Patrick
* Re: Re: pthread cancel cleanup and pthread_mutex_lock
From: Rich Felker @ 2018-05-30 0:50 UTC
To: musl

On Wed, May 30, 2018 at 10:06:17AM +1000, Patrick Oppenlander wrote:
> I accidentally hit send before I finished typing..
>
> > I've recently been running some of the open posix testsuite tests from
> > the linux test project.
> >
> > One particular test has been giving me headaches:
> > https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/conformance/interfaces/pthread_mutex_init/1-2.c
> >
> > There are a couple of different tests in there but the most
> > interesting one is the deadlock test which does the following:
> >
> > Thread A:                        Thread B:
> > pthread_create
> >                                  pthread_cleanup_push(...)
> >                                  pthread_mutex_lock(M)
> >                                  pthread_setcanceltype(ASYNC)
> >                                  pthread_setcancelstate(ENABLE)
>                                    pthread_mutex_lock(M) <-- blocks here
> pthread_cancel(B)
> pthread_join(B)
>
> The test then expects the cleanup handler to run and unlock mutex M
> allowing thread B to run to completion and the join to succeed.

This test is invalid. pthread_mutex_lock is not async-cancel-safe and
cannot legally be called while cancel type is async.

FYI something like 50% of the "Open POSIX Test Suite" tests are
invalid; in the majority of cases they're testing some property after
undefined behavior has been invoked like here.

> I've run this test with musl, glibc and on some different platforms
> with varying results:
>
> x86_64 linux 4.16.11, glibc: test runs to completion
> x86_64 linux 4.16.11, musl: deadlock (cleanup handler doesn't run)
> arm linux 4.16.5, musl: test runs to completion

The test is invalid in other ways too, involving races. It attempts to
use sched_yield to ensure that the test thread enters
pthread_mutex_lock a second time, but there's no reason to expect that
to do anything, especially if there are sufficiently many cores (as
many or more than running threads). I suspect the different behaviors
come down to just different scheduling properties due to performance
differences, or something like that. Naively, I would expect the test
to "work" despite being invalid.

> I'm not even sure that this test is valid -- I can't find any
> documentation which says that pthread_mutex_lock is a cancellation
> point, or that you're allowed to call pthread_mutex_unlock from an
> async cancel handler.

You can call anything you want from an async cancel handler, but you
can't call any libc functions except the ones controlling cancel state
while cancel type is async. Basically, all you can do in async cancel
state is pure computation.

> However, it's still concerning to see different results on different
> platforms.
>
> What's the expected behaviour here?

Nothing meaningful.

Rich
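For contrast with the invalid test discussed above, here is a minimal sketch of a variant whose behaviour is well defined. It is not taken from LTP or from musl. It assumes the cancel type is left at its default (deferred) and that the worker blocks in pthread_cond_wait, which POSIX lists as a cancellation point and which re-acquires the mutex before cleanup handlers run. The names run_deferred_cancel_demo, worker, unlock_mutex and never_true are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static bool never_true = false; /* never set: the wait ends only by cancellation */
static int cleanup_ran = 0;

/* Cleanup handler. It runs with the mutex held, because a cancelled
 * pthread_cond_wait re-acquires the mutex before handlers are called. */
static void unlock_mutex(void *arg)
{
    cleanup_ran = 1;
    pthread_mutex_unlock(arg);
}

static void *worker(void *unused)
{
    (void)unused;
    /* Cancel type stays at the default, PTHREAD_CANCEL_DEFERRED, so
     * calling pthread_mutex_lock here is permitted. */
    pthread_mutex_lock(&m);
    pthread_cleanup_push(unlock_mutex, &m);
    while (!never_true)
        pthread_cond_wait(&cv, &m); /* cancellation point */
    pthread_cleanup_pop(1);
    return NULL;
}

/* Hypothetical driver: returns 0 if the worker was cancelled, its
 * cleanup handler ran, and the mutex was left unlocked. */
int run_deferred_cancel_demo(void)
{
    pthread_t t;
    void *res;

    cleanup_ran = 0;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    /* No ordering trick is needed: a deferred request stays pending
     * until the worker reaches a cancellation point. */
    if (pthread_cancel(t) != 0)
        return -1;
    if (pthread_join(t, &res) != 0)
        return -1;
    if (res != PTHREAD_CANCELED || !cleanup_ran)
        return -1;
    /* The handler released the mutex, so this must succeed. */
    if (pthread_mutex_trylock(&m) != 0)
        return -1;
    pthread_mutex_unlock(&m);
    return 0;
}
```

Calling run_deferred_cancel_demo() from main (built with -pthread) is expected to return 0. The sketch does not depend on when the cancel request arrives relative to the worker's progress, which is the property the sched_yield-based test lacks.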
* Re: Re: pthread cancel cleanup and pthread_mutex_lock
From: Patrick Oppenlander @ 2018-05-30 1:36 UTC
To: musl

On Wed, May 30, 2018 at 10:50 AM, Rich Felker <dalias@libc.org> wrote:
> This test is invalid. pthread_mutex_lock is not async-cancel-safe and
> cannot legally be called while cancel type is async.

I suspected as much.

> FYI something like 50% of the "Open POSIX Test Suite" tests are
> invalid; in the majority of cases they're testing some property after
> undefined behavior has been invoked like here.

Thanks. Do you know of any better tests?

>> I've run this test with musl, glibc and on some different platforms
>> with varying results:
>>
>> x86_64 linux 4.16.11, glibc: test runs to completion
>> x86_64 linux 4.16.11, musl: deadlock (cleanup handler doesn't run)
>> arm linux 4.16.5, musl: test runs to completion
>
> The test is invalid in other ways too, involving races. It attempts to
> use sched_yield to ensure that the test thread enters
> pthread_mutex_lock a second time, but there's no reason to expect that
> to do anything, especially if there are sufficiently many cores (as
> many or more than running threads). I suspect the different behaviors
> come down to just different scheduling properties due to performance
> differences, or something like that. Naively, I would expect the test
> to "work" despite being invalid.

The reason it doesn't "work" (besides being stupid) is because the
cleanup handler isn't invoked while the thread is blocked in the
pthread_mutex_lock call. Should it be in the async cancellation case?

>> What's the expected behaviour here?
>
> Nothing meaningful.

Thanks.

Patrick
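The sched_yield-based ordering quoted above can be replaced by an explicit handshake. The following is a minimal sketch, assuming a plain mutex and condition variable are acceptable in the test harness; it is not code from LTP. The names run_handshake_demo, hs_worker, hs_lock, hs_cv and worker_ready are hypothetical. It only guarantees that the worker has reached the signalling point, not that it is already blocked inside some later call, which no portable mechanism can establish.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t hs_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t hs_cv = PTHREAD_COND_INITIALIZER;
static int worker_ready = 0; /* protected by hs_lock */

static void *hs_worker(void *unused)
{
    (void)unused;
    /* ... per-thread setup would go here ... */
    pthread_mutex_lock(&hs_lock);
    worker_ready = 1;            /* publish progress under the lock */
    pthread_cond_signal(&hs_cv);
    pthread_mutex_unlock(&hs_lock);
    return NULL;
}

/* Hypothetical driver: returns 0 once the worker has provably passed
 * its setup, independent of core count or scheduling. */
int run_handshake_demo(void)
{
    pthread_t t;

    pthread_mutex_lock(&hs_lock);
    worker_ready = 0;
    pthread_mutex_unlock(&hs_lock);

    if (pthread_create(&t, NULL, hs_worker, NULL) != 0)
        return -1;

    /* Wait on a predicate instead of yielding and hoping. The loop
     * also handles spurious wakeups. */
    pthread_mutex_lock(&hs_lock);
    while (!worker_ready)
        pthread_cond_wait(&hs_cv, &hs_lock);
    pthread_mutex_unlock(&hs_lock);

    if (pthread_join(t, NULL) != 0)
        return -1;
    return 0;
}
```

Calling run_handshake_demo() from main (built with -pthread) is expected to return 0 regardless of how many cores are available, because the main thread blocks until the predicate is set rather than relying on scheduler behaviour.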