From: Alexander Monakov
Subject: Re: Resuming work on new semaphore
Date: Thu, 23 Apr 2015 21:24:36 +0300 (MSK)
To: musl@lists.openwall.com
In-Reply-To: <20150423160624.GF17573@brightrain.aerifal.cx>

> The latter saves the result of a_cas to prevent an extra load, but I
> don't think it makes any significant difference and it might be seen
> as uglier.

I think we should use the result of a_cas here: it's part of the
sem_post "fast path", and doing it is not too difficult.  I'm using a
slightly different version below.

> However neither of those address the overflow issue, which I've tried
> to address here:
>
> #define VAL0_MAX ((SEM_VALUE_MAX+1)/2)

Signed integer overflow here -- using a corrected version below.

> Does this all sound correct?

I'm afraid not.  We must always do a futex-wake when incrementing
val[1]; otherwise a wake can be lost:

1. Semaphore is initialized to VAL0_MAX.
2. Thread A enters sem_post and observes saturated val[0].
3. Thread B downs val[0] to 0 by calling sem_wait VAL0_MAX times.
4. Thread B calls sem_wait again and enters futex-wait.
5. Thread A ups val[1].

At this point thread A must futex-wake val[1].

My version:

#define VAL0_MAX (SEM_VALUE_MAX/2+1)
#define VAL1_MAX (SEM_VALUE_MAX/2)

int sem_post(sem_t *sem)
{
	int old, val = sem->__val[0];
	val -= val == VAL0_MAX;
	while (old = val, (val = a_cas(sem->__val, val, val+1)) != old)
		if (val == VAL0_MAX) goto wake;
	if (val < 0) {
wake:;
		int priv = sem->__val[2];
		do if ((val = sem->__val[1]) == VAL1_MAX) {
			errno = EOVERFLOW;
			return -1;
		} while (val != a_cas(sem->__val+1, val, val+1));
		__wake(sem->__val+1, 1, priv);
	}
	return 0;
}
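As a side note on the two macros, here is a standalone sanity check
(not part of the patch; it assumes SEM_VALUE_MAX == 0x7fffffff, i.e.
INT_MAX, as in musl's limits.h).  The old ((SEM_VALUE_MAX+1)/2)
evaluates SEM_VALUE_MAX+1 in signed int, which overflows; the
corrected pair stays in range and still covers the whole value space,
so a semaphore initialized to SEM_VALUE_MAX starts out with
val[0] == VAL0_MAX and val[1] == VAL1_MAX:

#include <limits.h>	/* SEM_VALUE_MAX (0x7fffffff on musl) */

#define VAL0_MAX (SEM_VALUE_MAX/2+1)	/* 0x40000000 */
#define VAL1_MAX (SEM_VALUE_MAX/2)	/* 0x3fffffff */

/* C11 static assertion: the split sums back to SEM_VALUE_MAX exactly,
   and neither half nor their sum overflows int. */
_Static_assert(VAL0_MAX + VAL1_MAX == SEM_VALUE_MAX,
	"val[0] and val[1] together cover 0..SEM_VALUE_MAX");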
After sufficiently many waiters have been killed, val[1] can reach
VAL1_MAX without val[0] also reaching VAL0_MAX, in which case sem_post
can report EOVERFLOW prematurely.  From previous emails it seems this
is not a big concern.

It is also possible for EOVERFLOW to be reported prematurely in a race
window where a waiter that returned from futex-wait with EWOULDBLOCK
has not yet decremented val[1] of a recently saturated semaphore.
Example:

1. Semaphore is initialized to SEM_VALUE_MAX.
2. Thread A downs val[0] to 0 by calling sem_wait VAL0_MAX times;
   val[1] remains at VAL1_MAX.
3. Thread B calls sem_wait and enters futex-wait.
4. Thread A calls sem_post and observes val[0] < 0 && val[1] == VAL1_MAX.

It's possible to make the window smaller by reordering the futex-wait
loop, but it will remain.  At the moment I don't have a good way out.

Thanks.
Alexander