From: Rich Felker
To: musl@lists.openwall.com
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Resuming work on new semaphore
Date: Fri, 24 Apr 2015 11:03:41 -0400
Message-ID: <20150424150341.GP17573@brightrain.aerifal.cx>

On Fri, Apr 24, 2015 at 01:23:27PM +0300, Alexander Monakov wrote:
> On Thu, 23 Apr 2015, Rich Felker wrote:
> > Perhaps this can be patched up by saturating sem_getvalue's result? In
> > the case where the overflow happens it's transient, right?
> > I think that means discounting the overflow would be valid. But I'll
> > need to think about it more...
>
> Hm, can't agree here. This whole line of discussion stems from attempt to
> align timedwait/trywait/getvalue behavior in light of dead waiters, which are
> indistinguishable from preempted waiters.

I don't think dead waiters are a solvable problem with this design,
but they're a minor problem until you hit overflow.

> If "it's transient" claim can be
> made, it also can be used as a reason not to modify getvalue to look at val[1].

No, because you can interrupt a waiter with a signal handler and the
"transient" state becomes something you can synchronize with and
observe and thus no longer transient. That was the motivation for
needing to count the pending wakes.

> > With that said, my inclination right now is that we should hold off on
> > trying to commit the new semaphore for this release cycle. I've been
> > aiming for this month and just about everything else is in order for
> > release, but the semaphore rabbit-hole keeps going deeper and I think
> > we need to work through this properly. I hope that's not too much of a
> > disappointment.
>
> Ack; thankfully I don't feel disappointment in this case, this discussion has
> been quite entertaining. When I proposed my modification I felt it was very
> intuitive. What I did not grasp back then is that the definition of a waiter
> is not solid.
>
> How do you interpret the following?
>
> 1. Semaphore initialized to 0. There's only one thread.
> 2. alarm(1)
> 3. sem_wait()
> .... (in SIGALRM handler)
> 4. sem_post()
> 5. sem_getvalue()
>
> May getvalue be 0 here? At step 4, can the thread possibly "be a waiter"
> on the semaphore?

Here steps 4 and 5 are UB (calling AS-unsafe functions from AS
context). But you can achieve the same with another thread observing
entry to the signal handler in a valid way (e.g. via posting of a
second sem from the signal handler).
With that problem solved, I think it's valid at this point to observe
a value of 0 or 1. But if 0 is observed, sem_trywait would have to
fail, and sem_wait or sem_timedwait could return only in the case of
an error. This is why returning 0 does not seem to be practical -- I
don't know a way to let the existing suspended waiter take the wake
without allowing new waiters to steal it (and thus expose
inconsistency).

Rich