mailing list of musl libc
Subject: Inherent race condition in linux robust_list system
From: Rich Felker @ 2015-04-10  3:31 UTC (permalink / raw)
  To: musl; +Cc: libc-alpha

While working on some of the code handling robust_list for robust (and
other owner-tracked) mutexes in musl, I've come across a race
condition that's inherent in the kernel's design for robust_list.
There is no way to eliminate it with the current API, and I see no way
to eliminate it without requiring a syscall to unlock robust mutexes.
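
For reference, the per-thread structure the kernel consults is the one
declared in the futex UAPI header; reproduced here from memory (see
linux/futex.h for the authoritative definition), it looks roughly like
this:

struct robust_list {
	struct robust_list *next;	/* list of held robust mutexes */
};

struct robust_list_head {
	struct robust_list list;	/* head of the list of held mutexes */
	long futex_offset;		/* offset from a list entry to its futex word */
	struct robust_list *list_op_pending;	/* the "pending" slot discussed below */
};

Each thread registers its head with the set_robust_list() syscall, and
the kernel walks the list (and checks the pending slot) when the
thread dies.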

The procedure for unlocking a robust_list-tracked mutex looks like
this (a rough C sketch follows the list):

1. Store the address of the mutex to be unlocked in the robust_list
   "pending" slot.

2. Remove the mutex from the robust_list linked list.

3. Unlock the mutex.

4. Clear the "pending" slot in the robust_list.
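
A minimal sketch of those four steps, assuming a hypothetical mutex
type with a list node and a futex word holding the owner's tid. The
names here (my_robust_mutex, remove_from_robust_list, a_swap) are
illustrative, not musl's actual internals, and barriers, waking of
waiters, and error handling are omitted:

extern __thread struct robust_list_head robust_head;

struct my_robust_mutex {
	struct robust_list node;	/* linked into robust_head.list */
	int futex;			/* low bits hold the owner's tid; found
					   via robust_head.futex_offset */
};

static void robust_unlock(struct my_robust_mutex *m)
{
	/* 1. Announce the mutex we are about to unlock. */
	robust_head.list_op_pending = &m->node;

	/* 2. Remove it from the linked list (list manipulation elided). */
	remove_from_robust_list(&m->node);

	/* 3. Atomically release the lock; from this instant another
	   process may acquire, unlock, destroy, and reuse this memory. */
	a_swap(&m->futex, 0);

	/* 4. Retire the pending slot. */
	robust_head.list_op_pending = NULL;
}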

The purpose of the pending slot is to let the kernel handle the case
where the process dies asynchronously after the mutex has been removed
from the linked list but before it has been unlocked; in that case the
kernel treats the mutex as if it were still in the list. But the kernel
has no way of knowing whether such an asynchronous death occurred
before or after step 3; it only knows it occurred somewhere between
steps 2 and 4. This is very bad.

As soon as step 3 takes place, another process can take ownership of
the mutex, and if it knows it's the last user, it can unlock and
destroy the mutex and then reuse the same memory for a new purpose
(imagine a shared-memory heap managed by a malloc-like allocator,
which would be a good application for robust mutexes). Now, if the new
use happens to store, at the offset where the mutex owner's tid would
be, a value matching the tid of the thread whose process is dying, the
kernel misinterprets the new data stored there as a mutex belonging to
the dying process, and happily proceeds to corrupt it!
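
To see why a stray tid-looking value is enough, here is a simplified,
from-memory paraphrase of the check the kernel applies to each list
entry (and to the pending slot) when a task exits; this is only the
shape of the logic, not the actual kernel code, and futex_wake() here
stands in for the kernel's internal wakeup:

#include <stdint.h>

#define FUTEX_WAITERS		0x80000000
#define FUTEX_OWNER_DIED	0x40000000
#define FUTEX_TID_MASK		0x3fffffff

/* 'addr' is whatever userspace memory the list entry or pending slot
   points at; the kernel cannot tell whether it is still a mutex. */
static void handle_entry_at_exit(uint32_t *addr, uint32_t dead_tid)
{
	uint32_t val = *addr;

	/* If the low bits happen to equal the dead thread's tid, the
	   word is assumed to be a robust mutex owned by that thread... */
	if ((val & FUTEX_TID_MASK) == dead_tid) {
		/* ...so it is rewritten to mark the owner dead and a
		   waiter is woken, corrupting the data if the memory
		   has been reused for something else. */
		*addr = (val & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
		if (val & FUTEX_WAITERS)
			futex_wake(addr, 1);
	}
}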

Fixing this does not look easy. The obvious way is to make clearing
the pending slot of the robust_list effectively atomic with unlocking
the mutex by doing them together in a (futex) syscall, but that would
require a syscall every time a robust mutex is unlocked. An alternative
approach would be enlarging the robust_list head to carry a PC range
within which the pending slot is considered valid. This would avoid a
syscall but would require the atomic unlock to be performed in asm (to
provide labels for the PC range). I do not see any way to fix it
without kernel changes.
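
The second idea would amount to something like the following extension
of the header; this is purely hypothetical (no such fields exist in
the kernel), with the asm unlock sequence sitting between the two
labels whose addresses are stored here:

struct robust_list_head_v2 {
	struct robust_list list;
	long futex_offset;
	struct robust_list *list_op_pending;
	unsigned long pending_pc_start;	/* first insn of the asm unlock sequence */
	unsigned long pending_pc_end;	/* insn just past the unlock store */
};

The kernel would then honor list_op_pending only if the dying thread's
PC lay inside [pending_pc_start, pending_pc_end), i.e. only if death
occurred before the unlock store had actually executed.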

Please note that this issue is distinct from glibc bug #14485, which
is easily fixable and does not affect musl. The issue I'm describing
here is much harder to fix because it involves legal reuse of memory
within the same shared mapping in which the robust mutex existed,
rather than reuse of the same virtual address range by a new mapping.

Rich

