mailing list of musl libc
* Masked cancellation mode draft
From: Rich Felker @ 2015-02-22  3:24 UTC (permalink / raw)
  To: musl

Masked cancellation mode -- PTHREAD_CANCEL_MASKED

Background

POSIX thread cancellation provides an exception-like model for letting
threads process cancellation requests and clean up their state in
preparation to exit. Unfortunately, this model is completely foreign
to the C language and requires anti-idiomatic techniques like stuffing
all local state into a structure to make it available to cleanup
routines. It also makes it impossible to construct cancellable
primitives that, upon receiving the cancellation request, need to back
out the operation in progress without actually acting on cancellation,
because the caller needs to see their return. An example is
pthread_cond_wait when it discovers after receiving the cancellation
request that it already consumed a signal; in this case it must return
and leave the cancellation request pending. While, as part of the
implementation, pthread_cond_wait can use various hacks to satisfy
this requirement, the current standard for cancellation leaves
applications with no way to construct custom primitives with the same
property.

In addition, POSIX thread cancellation in its current state is
incompatible with third-party library code which was not specifically
written to be cancelable. If a thread acts on a cancellation request
during a call to library code which was not written to be
cancellation-aware, any data the library code was operating on may be
left in an inconsistent state. Locks may be left locked, and
resources, including file descriptors and allocated memory, may leak
or may have dangling references left behind after they are freed.
Thus, a thread calling such library code must either ensure that it is
never the target of cancellation requests or that it blocks
cancellation during library calls. This of course defeats one of the
most important use cases for cancellation: stopping an asynchronous
query operation (network connection, database query, etc.) whose
results are no longer needed and which is stuck in a blocking
operation.


Adapting Cancellation to Idiomatic C

Well-written C functions check the return value of any function call
which can fail and properly back out partially completed work and
return their failure status to their caller. The new MASKED mode
allows this existing idiomatic error handling pattern to process
cancellation requests.

When the cancellation state is set to MASKED, the first cancellation
point (other than close, which is special) called with cancellation
pending, or which has a cancellation request arrive while it's
blocking, returns with an error of ECANCELED, and sets the
cancellation state to DISABLE.

Even code which was not specifically written to be cancellation-aware
is compatible with this behavior. As long as it is responding to
errors, it will see the error, but will have the full repertoire of
standard functions available to use while cleaning up and returning
after the error. If the error is ignored, cancellation will be
delayed, but the behavior is no worse than what could already happen
from ignoring errors.
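
As a rough illustration of the pattern described above, here is a
minimal sketch of an ordinary error-checking routine running under
MASKED state. The name read_block and its EIO-on-EOF convention are
hypothetical, chosen only for this example. The #ifndef fallback is the
one given later in this draft, so the code also builds where MASKED is
unavailable. Note that nothing in the error path is specific to
cancellation: ECANCELED is handled like any other failure.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Fallback from the draft: without MASKED, just disable cancellation. */
#ifndef PTHREAD_CANCEL_MASKED
#define PTHREAD_CANCEL_MASKED PTHREAD_CANCEL_DISABLE
#endif

/* Hypothetical helper: read exactly len bytes from fd into a newly
 * allocated buffer. Returns the buffer, or NULL with errno set. Under
 * MASKED state a pending cancellation shows up as errno == ECANCELED. */
void *read_block(int fd, size_t len)
{
    int old_cs, err = 0;
    size_t got = 0;
    char *buf = malloc(len);

    if (!buf) return NULL;
    pthread_setcancelstate(PTHREAD_CANCEL_MASKED, &old_cs);

    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted by a signal: retry */
            err = errno;                  /* includes ECANCELED */
            break;
        }
        if (n == 0) {                     /* unexpected end of file */
            err = EIO;
            break;
        }
        got += (size_t)n;
    }

    /* Ordinary cleanup; any standard function is usable here. */
    pthread_setcancelstate(old_cs, 0);
    if (err) {
        free(buf);
        errno = err;
        return NULL;
    }
    return buf;
}
```

A caller checks for NULL as usual and may inspect errno if it wants to
tell cancellation apart from other failures.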


Design Choices

One-off or sticky failure: One obvious question when returning an
error to report cancellation is whether only the first cancellation
point, or all calls to cancellation points, should fail with errors.
The one-off approach was chosen mainly because it's the most
compatible with existing library code, which may need to call other
functions which are cancellation points in its error paths. 

Exempting close: While close is a cancellation point, it's rare for
applications to check for errors from close, and when they do check
they often mishandle it. But more importantly, POSIX (with pending
Austin Group interpretations applied) requires that the fd be released
when close fails with an error other than EINTR, and also requires
that close not release the fd when acting on cancellation. These
requirements are mutually contradictory if close is to return an error
of ECANCELED, and are best resolved by simply suppressing close's
status as a cancellation point in MASKED cancellation mode.

Choice of error code: ECANCELED was chosen because it semantically
matches cancellation and because it was not otherwise used as a
standard error code for any interfaces which are cancellation points.
EINTR was also a good candidate since side effects on cancellation are
specified to match side effects on EINTR, but using EINTR would
prevent applications from differentiating interruption by a signal
from cancellation and would thereby violate the POSIX requirement that
implementation-defined error conditions not alias standardized errors.

Consuming cancellation request vs disabling: There are two potential
ways to achieve one-off failure. One is clearing the pending
cancellation request when reporting the error. The other is setting
the state to DISABLE. While the ability to clear pending cancellation
requests would be highly desirable in itself, it potentially increases
the implementation burden (including the complexity of synchronizing
such consumption/clearing with threads sending cancellation requests)
and yields worse default behavior: code wanting to leave the
cancellation request pending when restoring the default cancellation
state would have to re-raise it via pthread_cancel(pthread_self()).

State vs type: PTHREAD_CANCEL_MASKED is defined as a new cancellation
state rather than a type. This is for two main reasons:

1. The existing types represent times at which the implementation is
   permitted to act on cancellation, while the existing states
   represent whether acting on cancellation is permitted at all. In
   the new MASKED mode, cancellation is never acted upon. Its pending
   status or arrival is merely made available to the application via
   new error conditions in functions which are cancellation points.

2. The intended usage is simpler with a state than with a type. Since
   the first cancellation point to report failure switches the state
   to DISABLE, the caller would need to save and restore both state
   and type if MASKED were a type. By being a state, the cost of
   saving and restoring the mode is minimized.

Graceful fallback: By defining a new state macro rather than
completely new interfaces, applications can gracefully fall back to
disabling cancellation on implementations which lack MASKED
cancellation state with the following:

    #ifndef PTHREAD_CANCEL_MASKED
    #define PTHREAD_CANCEL_MASKED PTHREAD_CANCEL_DISABLE
    #endif

No other changes are needed. Any error-checking code that treats
ECANCELED as special will simply be a dead code path since it will not
be seen on such implementations.
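
To make the fallback concrete, the following is a small sketch of a
wrapper that runs a callback with cancellation masked and then puts the
caller's state back. The names call_masked and example_job are
hypothetical and exist only for this illustration. On an implementation
without MASKED the macro degrades to DISABLE, the callback never sees
ECANCELED, and the rest of the code is unchanged.

```c
#include <errno.h>
#include <pthread.h>

#ifndef PTHREAD_CANCEL_MASKED
#define PTHREAD_CANCEL_MASKED PTHREAD_CANCEL_DISABLE
#endif

/* Hypothetical wrapper: run fn(arg) with cancellation masked, then
 * restore the caller's cancellation state. fn is assumed to follow the
 * usual convention of returning -1 with errno set on failure. */
int call_masked(int (*fn)(void *), void *arg)
{
    int old_cs, ret, saved_errno;

    pthread_setcancelstate(PTHREAD_CANCEL_MASKED, &old_cs);
    ret = fn(arg);
    saved_errno = errno;               /* keep fn's errno, e.g. ECANCELED */
    pthread_setcancelstate(old_cs, 0); /* a single int holds the whole mode */
    errno = saved_errno;
    return ret;
}

/* Trivial stand-in for third-party code, used only for illustration. */
int example_job(void *arg)
{
    *(int *)arg += 1;
    return 0;
}
```

Only one integer needs to be saved and restored, which is the practical
benefit of MASKED being a state rather than a type.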


Implementation

In signal-based implementations of cancellation, the desired behavior
is easily achieved by having the signal handler replace the saved
program counter in its ucontext_t, which necessarily contains an
address in the critical range between the pre-syscall cancellation
check and the syscall instruction, with the address of code that
returns an ECANCELED error and resets the cancellation state to
DISABLE.



Stability and Status

Presently all of the above is an experimental interface in musl libc
that should not be used in production code (outside of libc itself).
Details of the behavior and/or public interfaces may change based on
feedback and experience gained from use in musl and experimental use
by users.



* Re: Masked cancellation mode draft
From: Szabolcs Nagy @ 2015-02-22 17:51 UTC (permalink / raw)
  To: musl

* Rich Felker <dalias@libc.org> [2015-02-21 22:24:53 -0500]:
> When the cancellation state is set to MASKED, the first cancellation
> point (other than close, which is special) called with cancellation
> pending, or which has a cancellation request arrive while it's
> blocking, returns with an error of ECANCELED, and sets the
> cancellation state to DISABLE.
> 
> Even code which was not specifically written to be cancellation-aware
> is compatible with this behavior. As long as it is responding to
> errors, it will see the error, but will have the full repertoire of
> standard functions available to use while cleaning up and returning
> after the error. If the error is ignored, cancellation will be
> delayed, but the behavior is no worse than what could already happen
> from ignoring errors.
> 

so it works like a special signal that only acts at blocking calls

since the thread is not forcefully killed, only notified about the
cancellation, the cleanup mechanism is under the control of the
programmer

this seems like a relevant approach to c11 and c++11 which currently
lack any way to safely cancel blocking threads

the only difficulty i see is that posix has a lot of cancellation
points (some of which are optional) so code that wants to be
'masked cancellation safe' should properly do the error handling
at a lot of places (eg some stdio functions like printf may be
cancellation points and usually not checked for errors directly
only in aggregate through ferror)

if i understood correctly code that does not want to immediately
act upon masked cancellation (only at specific calls) should reset
the cancellation state with

  pthread_setcancelstate(PTHREAD_CANCEL_MASKED, 0)

and then cancellation is deferred until the next cancellation point.

another issue is that pthread_testcancel() has no return value so it
cannot be used for non-blocking testing of masked cancel.



* Re: Masked cancellation mode draft
From: Rich Felker @ 2015-02-22 19:10 UTC (permalink / raw)
  To: musl

On Sun, Feb 22, 2015 at 06:51:47PM +0100, Szabolcs Nagy wrote:
> * Rich Felker <dalias@libc.org> [2015-02-21 22:24:53 -0500]:
> > When the cancellation state is set to MASKED, the first cancellation
> > point (other than close, which is special) called with cancellation
> > pending, or which has a cancellation request arrive while it's
> > blocking, returns with an error of ECANCELED, and sets the
> > cancellation state to DISABLE.
> > 
> > Even code which was not specifically written to be cancellation-aware
> > is compatible with this behavior. As long as it is responding to
> > errors, it will see the error, but will have the full repertoire of
> > standard functions available to use while cleaning up and returning
> > after the error. If the error is ignored, cancellation will be
> > delayed, but the behavior is no worse than what could already happen
> > from ignoring errors.
> 
> so it works like a special signal that only acts at blocking calls

And unlike portable signal-based approaches, it lacks the race
conditions and global state.

> since the thread is not forcefully killed, only notified about the
> cancellation, the cleanup mechanism is under the control of the
> programmer
> 
> this seems like a relevant approach to c11 and c++11 which currently
> lack any way to safely cancel blocking threads

Yes.

> the only difficulty i see is that posix has a lot of cancellation
> points (some of which are optional) so code that wants to be
> 'masked cancellation safe' should properly do the error handling
> at a lot of places (eg some stdio functions like printf may be
> cancellation points and usually not checked for errors directly
> only in aggregate through ferror)
> 
> if i understood correctly code that does not want to immediately
> act upon masked cancellation (only at specific calls) should reset
> the cancellation state with
> 
>   pthread_setcancelstate(PTHREAD_CANCEL_MASKED, 0)

Library code would do this but with &old_cs rather than 0, then
restore the state before exiting. If it wants to actually behave like
a POSIX cancellation point, it would do something like:

    pthread_setcancelstate(old_cs, 0);
    if (was_canceled) {
        pthread_testcancel();
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, 0);
    }

Then, if old_cs was PTHREAD_CANCEL_ENABLE, the pthread_testcancel will
cause cancellation to be acted upon and the caller's cleanup handlers
to run. If old_cs was PTHREAD_CANCEL_MASKED, then pthread_testcancel
will do nothing and the subsequent call to set PTHREAD_CANCEL_DISABLE
will prevent further ECANCELED from happening (since your caller is
going to get its ECANCELED error from the function that's about to
return).
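
Putting the pieces together, a complete library routine following this
pattern might look roughly like the sketch below. The function name
lib_write_all is hypothetical, and was_canceled is derived here simply
from the saved errno value; real code may track it differently. The
#ifndef fallback is the one from the draft.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

#ifndef PTHREAD_CANCEL_MASKED
#define PTHREAD_CANCEL_MASKED PTHREAD_CANCEL_DISABLE
#endif

/* Hypothetical library routine: write all len bytes of buf to fd.
 * Returns 0 on success, -1 with errno set on failure. */
int lib_write_all(int fd, const char *buf, size_t len)
{
    int old_cs, err = 0, was_canceled;

    pthread_setcancelstate(PTHREAD_CANCEL_MASKED, &old_cs);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR) continue; /* signal, not cancellation */
            err = errno;
            break;
        }
        buf += n;
        len -= (size_t)n;
    }
    was_canceled = (err == ECANCELED);

    /* Internal cleanup would go here, before the state is restored. */

    pthread_setcancelstate(old_cs, 0);
    if (was_canceled) {
        /* Caller had ENABLE: cancellation is acted upon here. */
        pthread_testcancel();
        /* Caller had MASKED: it gets ECANCELED from this function, so
         * keep later cancellation points from reporting it again. */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, 0);
    }
    if (err) {
        errno = err;
        return -1;
    }
    return 0;
}
```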

> and then cancellation is deferred until the next cancellation point.
> 
> another issue is that pthread_testcancel() has no return value so it
> cannot be used for non-blocking testing of masked cancel.

Indeed, I noticed that. There are stupid ways to test but they're not
terribly efficient: lots of cancellation points have timeouts that can
be zero, but most of them are likely to result in a syscall if
cancellation is not already pending.
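
For what it's worth, one such probe could be sketched as below, using
poll with no descriptors and a zero timeout. The function name
masked_cancel_probe is hypothetical. This assumes the calling thread is
already in MASKED state, and per the draft a positive result also
switches the state to DISABLE, so this is a consuming test rather than
a pure query. When nothing is pending it will typically cost a syscall.

```c
#include <errno.h>
#include <poll.h>
#include <pthread.h>

#ifndef PTHREAD_CANCEL_MASKED
#define PTHREAD_CANCEL_MASKED PTHREAD_CANCEL_DISABLE
#endif

/* Hypothetical probe: returns 1 if a cancellation request was pending
 * (and has now been reported), 0 otherwise. Intended to be called with
 * the cancellation state set to MASKED. With the DISABLE fallback it
 * always returns 0. */
int masked_cancel_probe(void)
{
    /* poll is a cancellation point; with no fds and a zero timeout it
     * returns immediately instead of blocking. */
    return poll(0, 0, 0) < 0 && errno == ECANCELED;
}
```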

Rich

