From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: Resuming work on new semaphore
Date: Sun, 5 Apr 2015 16:23:14 -0400
Message-ID: <20150405202314.GG6817@brightrain.aerifal.cx>
References: <20150402013006.GA1108@brightrain.aerifal.cx> <20150402152642.GW6817@brightrain.aerifal.cx> <20150402231457.GC6817@brightrain.aerifal.cx> <20150405190214.GF6817@brightrain.aerifal.cx>

On Sun, Apr 05, 2015 at 11:03:34PM +0300, Alexander Monakov wrote:
> On Sun, 5 Apr 2015, Rich Felker wrote:
> > 1. Thread A enters sem_wait.
> > 2. Thread B observes thread A in sem_wait via failed sem_trywait.
>
> Hm, I don't see how that can be achieved.  As a result I'm afraid I didn't
> fully understand your example.

Indeed I was wrong about that, so I agree the whole scenario may fall
apart. Only sem_getvalue could show this, and only if it returns -1
rather than 0. So returning negative values from sem_getvalue seems
like a very bad idea -- it puts difficult- or impossible-to-satisfy
additional constraints on the implementation.

> > > Well we can make sem_getvalue return val[0]+val[1] instead... ;)
> >
> > That just makes the new implementation look like the old one, no? :-)
>
> Can't be bad if it behaves the same but works a bit faster.
> Apropos, like I've said on IRC, looks like there's "semaphore uncertainty
> principle": that formal semaphore value is between val[0] and (val[0] +/-
> val[1]) (clamped to 0 as needed).  It seems you can either do your hack and
> pretend that there are never any waiters, or try to faithfully count waiters
> in sem_getvalue, but then also reveal that sometimes the implementation works
> by stealing a post.  I believe you could argue that the latter is explicitly
> disallowed by the spec.

Yes, I think I agree.

> By the way, I think there's an interesting interplay with cancellation.
> Consider the following.  Thread B does "return sem_wait(sem);".  Thread A does:
>
>     pthread_cancel(thread_B);
>     sem_post(sem);
>     sem_getvalue(sem);
>
> If it observes semaphore value as 1 it follows that thread B has not become a
> waiter yet, and since it must have cancellation already pending, it may not
> consume the post.  And yet if thread B is already futex-waiting in sem_wait,
> consuming the post takes priority over acting on cancellation.  So if then
> thread A does
>
>     pthread_join(thread_B);
>     sem_getvalue(sem);
>
> and gets value of 0, it sees a contradiction.  And return value from
> pthread_join will indicate that thread_B exited normally rather than was
> cancelled.

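For anyone who wants to poke at the quoted scenario, here is a minimal sketch of it as compilable C. The names `struct outcome`, `thread_b`, `run_scenario` and the `settle_ms` delay are made up for illustration and are not from the thread; the delay only makes it more likely (not certain) that thread B is already blocked when the cancel and the post arrive. Which of the two outcomes you observe is timing- and implementation-dependent.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical record of what thread A observes. */
struct outcome {
    int ok;          /* 1 if every call in thread A succeeded */
    int mid_value;   /* sem_getvalue right after sem_post */
    int final_value; /* sem_getvalue after pthread_join */
    int cancelled;   /* 1 if pthread_join reported PTHREAD_CANCELED */
};

/* Thread B: the moral equivalent of "return sem_wait(sem);". */
static void *thread_b(void *arg)
{
    sem_wait((sem_t *)arg);
    return NULL; /* normal exit, distinct from PTHREAD_CANCELED */
}

/* Thread A's side of the scenario. settle_ms is an illustrative knob. */
struct outcome run_scenario(long settle_ms)
{
    struct outcome o = {0, -1, -1, 0};
    sem_t sem;
    pthread_t b;
    void *ret = NULL;

    if (sem_init(&sem, 0, 0) != 0)
        return o;
    if (pthread_create(&b, NULL, thread_b, &sem) != 0) {
        sem_destroy(&sem);
        return o;
    }

    /* Give B a chance to reach the wait; this is a hint, not a guarantee. */
    struct timespec ts = { settle_ms / 1000, (settle_ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);

    int rc = pthread_cancel(b);
    rc |= sem_post(&sem);
    rc |= sem_getvalue(&sem, &o.mid_value);
    rc |= pthread_join(b, &ret);
    rc |= sem_getvalue(&sem, &o.final_value);
    sem_destroy(&sem);

    o.cancelled = (ret == PTHREAD_CANCELED);
    o.ok = (rc == 0);
    return o;
}
```

Calling `run_scenario(20)` in a loop and printing the fields shows which branch a given libc takes. The two outcomes discussed here are "cancelled, final value 1" and "exited normally, final value 0".
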
So the contradiction you claim exists is that cancellation happened
before the post, and thus thread B can't act on the post when it
didn't act on cancellation? I don't think that follows from the rules
of cancellation. The relevant text is:

    "Whenever a thread has cancelability enabled and a cancellation
    request has been made with that thread as the target, and the
    thread then calls any function that is a cancellation point (such
    as pthread_testcancel() or read()), the cancellation request shall
    be acted upon before the function returns."

So if cancellation was pending _before_ the call to sem_wait, then
sem_wait has to honor it. But there is no requirement that entry to
the sem_wait function be "atomic" with becoming a waiter on the
semaphore, and of course this is impossible to satisfy or even
specify. So it's totally legal to have the sequence:

1. Thread B enters sem_wait.
2. Thread B observes that cancellation was not already pending.
3. Thread A sends cancellation request.
4. Thread A sends post.
5. Thread B receives both, and chooses to act on the post per this
   text:

    "It is unspecified whether the cancellation request is acted upon
    or whether the cancellation request remains pending and the thread
    resumes normal execution if:

    - The thread is suspended at a cancellation point and the event
      for which it is waiting occurs

    - A specified timeout expired before the cancellation request is
      acted upon."

Here, the event for which it was waiting (the post) clearly occurs.

> And on the contrary, if you make acting on cancellation/timeout take priority,
> you can observe semaphore value increasing when waiters leave the wait on
> error path without consuming the post.

Yes, obviously that is not possible.

Rich
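
The one case the quoted text does pin down, a request that is already pending before sem_wait is called, can be forced deterministically. The following is only a sketch: `struct pending_ctx`, `waiter` and `cancel_pending_before_wait` are hypothetical names, and the extra `gate` semaphore exists purely to order the cancel before the call. The waiter keeps cancellation disabled until the request has been sent, then re-enables it and calls sem_wait on a semaphore that is never posted.

```c
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical helper state: 'gate' orders the cancel before the wait. */
struct pending_ctx {
    sem_t gate; /* posted by the main thread after pthread_cancel */
    sem_t sem;  /* the semaphore under test; never posted */
};

static void *waiter(void *arg)
{
    struct pending_ctx *c = arg;
    int old;

    /* With cancellation disabled, the wait on 'gate' cannot act on a request. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
    sem_wait(&c->gate);

    /* The request is pending by now; re-enabling is not a cancellation point. */
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);

    /* Cancellation was pending before this call, so it has to be honored. */
    sem_wait(&c->sem);
    return NULL; /* only reached if the request was not acted upon */
}

/* Returns 1 if the waiter was cancelled, 0 if it exited normally,
 * -1 if setup failed. */
int cancel_pending_before_wait(void)
{
    struct pending_ctx c;
    pthread_t t;
    void *ret = NULL;

    if (sem_init(&c.gate, 0, 0) != 0)
        return -1;
    if (sem_init(&c.sem, 0, 0) != 0) {
        sem_destroy(&c.gate);
        return -1;
    }
    if (pthread_create(&t, NULL, waiter, &c) != 0) {
        sem_destroy(&c.sem);
        sem_destroy(&c.gate);
        return -1;
    }

    pthread_cancel(t);  /* request is made first ... */
    sem_post(&c.gate);  /* ... and only then is the waiter released */
    pthread_join(t, &ret);

    sem_destroy(&c.sem);
    sem_destroy(&c.gate);
    return ret == PTHREAD_CANCELED;
}
```

Because `sem` is never posted, there is no competing event here, so this does not exercise the unspecified case from step 5; it only covers the "pending before the call" rule.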