From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/5979
Path: news.gmane.org!not-for-mail
From: Alexander Monakov <amonakov@ispras.ru>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: sem_getvalue conformance considerations
Date: Sat, 30 Aug 2014 02:51:35 +0400 (MSK)
Message-ID: <alpine.LNX.2.00.1408292251540.5292@monopod.intra.ispras.ru>
References: <20140827023338.GA21076@brightrain.aerifal.cx> <1409123141.4476.18.camel@eris.loria.fr> <20140827074310.GK12888@brightrain.aerifal.cx> <alpine.LNX.2.00.1408271435160.6544@monopod.intra.ispras.ru> <alpine.LNX.2.00.1408271614380.6544@monopod.intra.ispras.ru>
 <alpine.LNX.2.00.1408272249230.6544@monopod.intra.ispras.ru> <alpine.LNX.2.00.1408280026590.6544@monopod.intra.ispras.ru> <alpine.LNX.2.00.1408290034120.5292@monopod.intra.ispras.ru>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Trace: ger.gmane.org 1409352827 7817 80.91.229.3 (29 Aug 2014 22:53:47 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Fri, 29 Aug 2014 22:53:47 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-5986-gllmg-musl=m.gmane.org@lists.openwall.com Sat Aug 30 00:53:40 2014
Return-path: <musl-return-5986-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@plane.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by plane.gmane.org with smtp (Exim 4.69)
	(envelope-from <musl-return-5986-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1XNV34-0002vT-N2
	for gllmg-musl@plane.gmane.org; Sat, 30 Aug 2014 00:53:38 +0200
Original-Received: (qmail 9372 invoked by uid 550); 29 Aug 2014 22:53:37 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
Original-Received: (qmail 9364 invoked from network); 29 Aug 2014 22:53:37 -0000
In-Reply-To: <alpine.LNX.2.00.1408290034120.5292@monopod.intra.ispras.ru>
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
Xref: news.gmane.org gmane.linux.lib.musl.general:5979
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/5979>

Hi,

To begin, I have a few questions regarding cancellation (please excuse
deviating from topic for a while):

1.  It looks non-conforming not to act on pending cancellation in uncontended
sem_wait (or any cancellation point that does not invoke a syscall, for that
matter; nsz pointed out that many instances should already be handled, but
pthread_join for instance seems to miss explicit timedwait).  A bug?

2.  I had trouble understanding cancel_handler.  What is re-raising
SIGCANCEL at the end for?

3.  My understanding is that for deferred cancellation, ideally you'd want
SIGCANCEL to be delivered only as interruptions to syscalls --- that would
eliminate some scaffolding around syscall_cp including IP range check in
cancel_handler, correct?

4.  If the above could be provided, it would be possible to guarantee that
pthread cancellation cleanup is executing in a signal handler context only for
asynchronous cancellation.  I'd like that for my sem unwait.  Too bad?

======

Trying to approach a formal description of my proposed sem ops
implementation, here's what I can do now:

1.  Sem ops begin execution by performing an atomic read or read-modify-write
on val[0].  The sequence of atomic accesses establishes an order between
threads doing the ops (with respect to a given semaphore).

2.  Returning from futex wait with timeout or interrupt also accesses
val[0] and is included in that order

3.  The following invariant is maintained: val[0] is equal to the semaphore
value minus the number of current waiters (as given by the linear sequence of
events established above)

4. sem_post begins by performing an atomic increment on val[0]; if the
previous value was negative, there was at least one waiter, which is now
discounted from val[0]; we must arrange that one waiter eventually wakes up
and proceeds

4.1. Note: when sem_post increments negative val[0], we need that one waiter
returns successfully from sem_wait; if they were to exit exceptionally (eintr,
timeout, cancel), they would need to increment val[0] yet again;
but this will allow other threads to observe incrementing sem_getvalue despite
no new posts, as Rich noted in his email

4.2. We arrange that exactly one waiter proceeds by atomically incrementing
val[1] and futex-waking it; since we are doing that non-atomically with val[0]
update, we must guarantee that no waiters can proceed to destroy the semaphore
between val[0] and val[1] updates in sem_post

5.  sem_wait begins by atomically decrementing val[0]; if the previous value
was positive, we have simply successfully decremented the semaphore;
otherwise, we have become a waiter and must suspend

5.1.  sem_wait futex-waits on val[1]; on normal wakeups, it tries to
atomically-decrement-if-positive val[1] and leaves if successful

FIXME: we are reading from val[1] even though val[0] was increased long ago and
semaphore may be now positive; is there a justification for that, why can't
other threads assume the semaphore is free to be destroyed?

5.2.  On eintr/timeout/cancel, a waiter must decide if it may just discount
itself as a waiter, or must deal with a racing post; in case there is a racing
post, the waiter must either consume it and return normally, or discount
itself as a waiter nevertheless and ensure that rather than causing a wake the
post increases semaphore value beyond zero

It seems there are two ways of doing it.  We either prefer to consume a
pending post if available (a), or to propagate eintr/cancel (b).

5.2.a. Begin by incrementing-if-negative val[0]. If successful, we have
discounted ourselves as a waiter, so we don't need to care about concurrent
posts (there are fewer pending posts than waiters) and may propagate
eintr/timeout/cancel.  If unsuccessful, there is a pending post.   Consume it
and cause normal return from sem_wait (as if interruption did not happen).
Can't easily cause normal return from sem_wait if acting upon cancellation.

5.2.b.  Begin by incrementing val[0] unconditionally.  If it was negative, we
have discounted ourselves as a waiter; otherwise, there is a pending post, and
our increment is making it look as if the post is increasing an uncontended
semaphore (out of sequence, so other threads may observe oddly increasing
value); however there is a pending increase of val[1] and we must undo it.
Wait for val[1] to become positive by futex-waiting and then decrement it.
This looks more hairy, causes non-conforming behavior, but easier to execute
from a signal handler context for cancellation.

Does that help?  Thoughts?

Alexander