From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/289
Path: news.gmane.org!not-for-mail
From: Rich Felker <dalias@aerifal.cx>
Newsgroups: gmane.linux.lib.musl.general
Subject: New pthread cancellation.
Date: Sun, 17 Apr 2011 20:09:48 -0400
Message-ID: <20110418000948.GC277@brightrain.aerifal.cx>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: dough.gmane.org 1312595706 11463 80.91.229.12 (6 Aug 2011 01:55:06 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Sat, 6 Aug 2011 01:55:06 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: envelope-from@hidden Mon Apr 18 04:14:48 2011
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Xref: news.gmane.org gmane.linux.lib.musl.general:289
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/289>

Today I committed to musl git the new version of POSIX thread
cancellation. This is the second in a series of designs to remedy two
critical flaws in the classic way cancellation is implemented by glibc
and other libraries:

1. Cancellation can act after the syscall has returned successfully
   from kernelspace, but before userspace saves the return value. This
   results in a resource leak if the syscall allocated a resource, and
   there is no way to patch over it with cancellation handlers.

2. If a signal is handled while the thread is blocked at a cancellable
   syscall, the entire signal handler runs with asynchronous
   cancellation enabled. This could be extremely dangerous, since the
   signal handler may call functions which are async-signal-safe but
   not async-cancel-safe.

While I've heard mixed opinions on whether these flaws are violations
of the POSIX requirements on cancellation, either way they make it
virtually impossible to use cancellation for the intended purpose.
Both flaws stem from a cancellation-point idiom of:

1. Enable asynchronous cancellation.
2. Perform the operation (usually a syscall).
3. Disable asynchronous cancellation (actually restore the old state).
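
As a rough illustration (this is not code taken from glibc or any other
library, and classic_cancellable_open is a made-up name), the idiom
looks something like this in C, with the two problem windows marked:

```c
#include <fcntl.h>
#include <pthread.h>

/* Hypothetical wrapper, for illustration only; no libc is written
 * exactly like this. */
int classic_cancellable_open(const char *path)
{
    int old_type, ignored, fd;

    /* 1. Enable asynchronous cancellation. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);

    /* 2. Perform the operation. Any signal handler that interrupts this
     *    call runs with asynchronous cancellation still enabled (flaw 2). */
    fd = open(path, O_RDONLY);

    /* Window for flaw 1: the kernel has already allocated the
     * descriptor, but a cancellation request acted upon here unwinds
     * the thread before the caller ever sees fd, so nothing can
     * close it. */

    /* 3. Restore the previous cancellation type. */
    pthread_setcanceltype(old_type, &ignored);
    return fd;
}
```

When no cancellation request arrives, the wrapper behaves like a plain
open() and leaves the caller's cancellation type as it found it.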

My first idea to remedy the situation appeared in musl 0.7.5, but
turned out to have its own set of flaws, so I went about designing a
new approach, which works like this:

A specialized version of the syscall wrapper assembly code is used for
cancellation points, and records its stack address and a pointer to
the syscall instruction. The cancellation signal handler can then
compare the stack and instruction pointers of the interrupted context
to determine at which point the cancellation request came:

- in the code leading up to, or while blocked at, the syscall,
- after completion of the syscall, OR
- while executing a signal handler which interrupted the syscall.
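
The three cases above can be told apart with two comparisons. The
following is only a sketch under simplifying assumptions, not the code
in musl git: the struct, the enum, and the function name are all
invented here, and it assumes the stack pointer does not change between
the point where the wrapper records it and the syscall instruction.

```c
#include <stdint.h>

/* Hypothetical names throughout; a sketch of the comparison only. */
enum cp_state {
    CP_BEFORE_OR_BLOCKED,  /* leading up to, or blocked at, the syscall */
    CP_SYSCALL_COMPLETED,  /* the syscall has already returned */
    CP_IN_SIGNAL_HANDLER   /* a signal handler interrupted the syscall */
};

/* What the specialized syscall wrapper is assumed to have recorded. */
struct cp_record {
    uintptr_t sp;          /* stack address at the cancellation point */
    uintptr_t syscall_ip;  /* address of the syscall instruction */
};

/* sp and ip are the saved stack and instruction pointers of the
 * context that the cancellation signal interrupted. */
enum cp_state classify_cancel_request(const struct cp_record *rec,
                                      uintptr_t sp, uintptr_t ip)
{
    /* A different stack pointer means some other frame, such as a
     * signal handler, was running on top of the wrapper. */
    if (sp != rec->sp)
        return CP_IN_SIGNAL_HANDLER;

    /* At or before the syscall instruction: the syscall has not
     * completed. A restarted syscall has its saved instruction
     * pointer moved back onto the syscall instruction, so a thread
     * blocked in the kernel lands here too. */
    if (ip <= rec->syscall_ip)
        return CP_BEFORE_OR_BLOCKED;

    /* Past the syscall instruction with the wrapper's own stack. */
    return CP_SYSCALL_COMPLETED;
}
```

A real handler would also have to recognize a thread that is not inside
any cancellation point at all; that check is left out here.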

In the first case, cancellation is immediately acted upon. In either
of the latter two cases, the cancellation signal handler re-raises the
cancellation signal, but leaves the signal blocked when it returns.
The cancel signal can then only be unblocked in the third case, when a
previously-executing signal handler returns and restores its saved
signal mask. This will immediately trigger the cancellation signal
again, and it can inspect the context again. If there are multiple
layers of signal handlers between the original cancellation point and
the cancellation signal handler, each one will be peeled off in this
way as they return, and the cancellation request will propagate all
the way back.
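
The "re-raise but leave blocked" step can be tried in isolation. The
sketch below is not musl's handler; it uses SIGUSR1 as a stand-in for
the cancellation signal, every function and variable name is made up,
and the explicit unblock stands in for an outer signal handler
returning and restoring its saved mask. It assumes a system (Linux, for
instance) where the signal mask stored in the handler's ucontext is the
one restored on return.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_runs;
static volatile sig_atomic_t should_defer;

static void deferring_handler(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;
    (void)si;

    handler_runs++;
    if (should_defer) {
        should_defer = 0;
        /* Keep sig blocked after this handler returns ... */
        sigaddset(&uc->uc_sigmask, sig);
        /* ... and queue it again. sig is blocked while its own
         * handler runs, so this only marks it pending. */
        raise(sig);
    }
}

/* Returns 0 on success and reports what was observed. */
int demo_reraise_blocked(int *runs_while_blocked, int *was_pending,
                         int *runs_after_unblock)
{
    struct sigaction sa, old_sa;
    sigset_t set, pending;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = deferring_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0)
        return -1;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* A threaded program would use pthread_sigmask instead. */
    sigprocmask(SIG_UNBLOCK, &set, NULL);

    handler_runs = 0;
    should_defer = 1;
    raise(SIGUSR1);                 /* handler runs and defers */
    *runs_while_blocked = handler_runs;

    sigpending(&pending);
    *was_pending = sigismember(&pending, SIGUSR1);

    /* Stand-in for an outer handler restoring its saved mask. */
    sigprocmask(SIG_UNBLOCK, &set, NULL);
    *runs_after_unblock = handler_runs;

    sigaction(SIGUSR1, &old_sa, NULL);
    return 0;
}
```

The handler should run once before the unblock, leave the signal
pending, and run a second time as soon as the mask is restored.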

Surprisingly, this entire cancellation system has very few machine
dependencies, beyond the need for machine-specific syscall code which
was already a requirement. Everything else is written in plain POSIX
C, and makes only the following assumptions:

- The saved context received by signal handlers contains the saved
  value of the call stack register and current instruction address
  from the interrupted code (the offsets for these are defined in an
  arch-specific file).

- Restartable syscalls work by the kernel adjusting the saved
  instruction pointer to point back to the syscall instruction rather
  than the following instruction.

- The instruction pointer moves in the positive direction with forward
  code flow.
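
To make the first assumption concrete, here is a small sketch of
reading the saved stack and instruction pointers from a signal
handler's context. It is not the arch-specific file from musl: the
CTX_SP and CTX_IP macros and the function names are invented for this
example, only Linux on x86_64 and aarch64 is covered, and SIGUSR2 is
used as a stand-in signal.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* make sure uc_mcontext's register fields are visible */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-arch accessors; a real port would keep these in an
 * arch-specific header. */
#if defined(__linux__) && defined(__x86_64__)
#define CTX_SP(uc) ((uintptr_t)(uc)->uc_mcontext.gregs[15]) /* RSP */
#define CTX_IP(uc) ((uintptr_t)(uc)->uc_mcontext.gregs[16]) /* RIP */
#elif defined(__linux__) && defined(__aarch64__)
#define CTX_SP(uc) ((uintptr_t)(uc)->uc_mcontext.sp)
#define CTX_IP(uc) ((uintptr_t)(uc)->uc_mcontext.pc)
#else
#error "add the saved-register accessors for this architecture"
#endif

/* Plain volatile globals are good enough for this single-threaded
 * demonstration. */
static volatile uintptr_t seen_sp, seen_ip;

static void capture_handler(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;
    (void)sig;
    (void)si;
    /* Registers of the code that was interrupted, not of the handler. */
    seen_sp = CTX_SP(uc);
    seen_ip = CTX_IP(uc);
}

/* Raises SIGUSR2 at ourselves and reports the interrupted context.
 * Returns 0 on success. */
int capture_interrupted_context(uintptr_t *sp, uintptr_t *ip)
{
    struct sigaction sa, old_sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = capture_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, &old_sa) != 0)
        return -1;

    raise(SIGUSR2);
    sigaction(SIGUSR2, &old_sa, NULL);

    *sp = seen_sp;
    *ip = seen_ip;
    return 0;
}
```

These two values are what a cancellation handler would compare against
the stack address and syscall instruction address recorded by the
wrapper.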

For comparison, my first try at this depended on an arch-specific
macro to read code from the saved instruction pointer and inspect for
the syscall opcode.

One limitation of this whole design, on plain x86 (not x86_64), is
that it is incompatible with the "sysenter" method of making syscalls.
Fortunately, relatively few syscalls are cancellable, and there is no
reason the non-cancellable majority of syscalls could not use the
"sysenter" syscall method. At present musl does not support sysenter
or the vdso syscall system whatsoever, but the issue may be relevant
to other libraries wanting to adopt the general approach. If sysenter
support is critical to anyone, I believe it's possible to make it
work, but it requires some ugly hacks I don't care to put in musl.
I'll be happy to explain the idea to anyone interested.

Aside from the correctness benefits, the new cancellation
implementation has been factored to avoid pulling cancellation-related
code into static-linked programs that don't use cancellation, even if
they use other pthread features. This should allow for even smaller
threaded programs.

The one cancellation-related task that remains is ensuring that
interfaces which are not supposed to be cancellation points do not
trigger cancellation. The recent changes have also made this task
easier.

Unless there are unforeseen problems, a new release of musl with the
new cancellation system should be out in the next few days. In the
meantime, it's available via git.

--
Rich