Date: Wed, 14 Dec 2022 13:29:54 +0300
From: Alexey Izbyshev <izbyshev@ispras.ru>
To: musl@lists.openwall.com
Subject: Re: [musl] sem_post() can miss waiters

On 2022-12-14 04:48, Rich Felker wrote:
> On Tue, Nov 22, 2022 at 08:41:44AM +0300, Alexey Izbyshev wrote:
>> On 2022-11-22 03:09, Rich Felker wrote:
>> >If I understand correctly, the cost of the first option is
>> >generally an extra "last" broadcast wake that shouldn't be needed.
>> >Is that right?
>> >
>> >For example, if sem starts with count 0 and thread A calls wait,
>> >then thread B calls post twice, both posts make a syscall even
>> >though only the first one should have.
>> >
>> Yes, exactly.
>>
>> >What if you instead perform the broadcast wake when the observed
>> >waiters count is <=1 rather than ==0? This should have no cost in
>> >the common case, but in the race case, I think it forces any hiding
>> >(just-arrived) extra waiters to wake, fail, and re-publish their
>> >waiting status to the waiters bit.
>> >
>> Indeed, I think this solves the overhead issue quite nicely, thanks.
>> So the sem_post wake-up logic would basically boil down to this:
>>
>> * WAITERS_BIT is set and waiters > 1: don't reset the WAITERS_BIT,
>>   since it's likely that some waiters will remain (and it's always
>>   fine to err on the side of preserving the WAITERS_BIT); wake a
>>   single waiter.
>>
>> * WAITERS_BIT is set and waiters <= 1: reset the WAITERS_BIT and
>>   wake all waiters. In non-racy cases only a single waiter will be
>>   woken.
>>
>> * WAITERS_BIT is unset: don't wake anybody. Even if there are some
>>   waiters, another sem_post (the one that reset the WAITERS_BIT) is
>>   responsible for waking them.
>>
>> In code:
>>
>> int val, new, waiters, priv = sem->__val[2];
>> do {
>>     val = sem->__val[0];
>>     waiters = sem->__val[1];
>>     if ((val & SEM_VALUE_MAX) == SEM_VALUE_MAX) {
>>         errno = EOVERFLOW;
>>         return -1;
>>     }
>>     new = val + 1;
>>     if (waiters <= 1)
>>         new &= ~0x80000000;
>> } while (a_cas(sem->__val, val, new) != val);
>> if (val < 0) __wake(sem->__val, waiters <= 1 ? -1 : 1, priv);
>
> Yes, this looks good to me. How is the attached patch?
>

The patch looks good to me.

>> >I think there's another really stupid solution too that I would
>> >even consider, partly motivated by the fact that, with long-lived
>> >process-shared semaphores, the waiters count can become stale if
>> >waiters are killed. (Note: semaphores aren't required to be robust
>> >against this, but it's ugly that they're not.) This solution is
>> >just to keep the current code, but drop the waiters count entirely,
>> >and use broadcast wakes for all wakes. Then, any still-live waiters
>> >are *always* responsible for re-asserting their waiting status when
>> >all but one fail to acquire the semaphore after a wake. Of course
>> >this is a thundering herd, which is arguably something we should
>> >not want, but we accept it for correctness in several other places
>> >like condvars...
>> >
>> I'm not sure that the kind of "partial robustness" that could be
>> achieved by always waking all waiters is worth punishing normal
>> cases. Also, I don't think the waiters count being stale is an issue
>> per se, because it can be wrong only in one way (greater than the
>> real count of waiters), assuming you don't mean overflow (and
>> overflow could be handled by simply pinning the count at INT_MAX
>> permanently). But many other issues are possible if we allow killing
>> processes at arbitrary points, including sem_post not sending a wake
>> notification after updating the semaphore value, or sem_wait
>> consuming a notification but not updating the value. Granted, some
>> of these issues may "self-heal" on a future sem_wait/sem_post (in
>> the sense that the semaphore will leave a forbidden state), but they
>> might still interfere with the program's shutdown logic (e.g. if the
>> program expects to claim the semaphore N times at the end without
>> being aware that some consumers are dead), and, of course, with any
>> logic outside of semaphore manipulation. So it seems to me that
>> either the program already watches for the death of processes it
>> cares about, or it's not even clear that allowing further progress
>> (via a broadcast) is always more desirable than blocking.
>
> Indeed, this seems like it's just a bad direction to go for the
> reasons you described. One thing I'd like to note here though, but
> not act on at this time:
>
> When we were designing the semaphore long ago, there was a vague
> proposal to make the post operation responsible for removing waiter
> counts, rather than the waiter. It was thrown out because it didn't
> seem to work. But if we ever did have a reason to want to do this, I
> think it's possible now, since it's essentially just a "countdown to
> broadcast wake" with no constraint that it be an accurate count, and
> the new waiters bit is what really controls whether wakes happen.
>

Indeed, the waiters count is now just a hint used for resetting the
waiters bit. However, its increments and decrements must still be
balanced for it to remain useful, and I don't see an easy way to
achieve that if sem_post is responsible for decrementing it.
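For illustration, here is a rough standalone sketch of the waiter side
showing where that balance comes from. This is not musl's actual
sem_timedwait: futex_wait, try_dec, sem_wait_sketch, VALUE_MASK, and
struct sem_sketch are made-up names, and the val[] layout is just the
one assumed by the sem_post snippet above.

/* Hypothetical waiter-side sketch; names and layout are illustrative. */
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

#define VALUE_MASK  0x7fffffff  /* low 31 bits: semaphore value */
#define WAITERS_BIT INT_MIN     /* high bit: "a wake may be needed" */

struct sem_sketch { atomic_int val[3]; }; /* [0]=value+bit, [1]=waiters, [2]=priv */

static void futex_wait(atomic_int *addr, int expect)
{
    /* Sleep until *addr changes from 'expect'; spurious returns are fine,
       the caller retries. */
    syscall(SYS_futex, addr, FUTEX_WAIT, expect, NULL, NULL, 0);
}

static int try_dec(struct sem_sketch *s)
{
    int v = atomic_load(&s->val[0]);
    while (v & VALUE_MASK)  /* value is nonzero: try to take one unit */
        if (atomic_compare_exchange_weak(&s->val[0], &v, v - 1))
            return 0;
    return -1;
}

int sem_wait_sketch(struct sem_sketch *s)
{
    while (try_dec(s)) {
        /* Publish our presence before sleeping: bump the waiters count
           and (re-)assert the waiters bit, so a concurrent sem_post
           knows a wake is needed. */
        atomic_fetch_add(&s->val[1], 1);
        int v = atomic_load(&s->val[0]);
        while (!(v & WAITERS_BIT)
               && !atomic_compare_exchange_weak(&s->val[0], &v, v | WAITERS_BIT))
            ;
        if ((atomic_load(&s->val[0]) & VALUE_MASK) == 0)
            futex_wait(&s->val[0], WAITERS_BIT);
        /* The matching decrement: only the waiter knows when its own
           wait has really ended, which is why delegating this step to
           sem_post would unbalance the count. */
        atomic_fetch_sub(&s->val[1], 1);
    }
    return 0;
}

The key point is the last step of the loop: the waiter is the only
party that knows when its own wait has ended (successfully, spuriously,
or after losing a race to another waiter), so it is the only party that
can reliably pair a decrement with the increment it made.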
One further note: after this bug report, I was pointed to a semaphore
redesign proposal from 2015 (it starts at
https://www.openwall.com/lists/musl/2015/02/27/1, after some off-list
discussion, and ends in April). In that design, the waiters count is
stored in the semaphore word itself, and sem_post is indeed
responsible for adjusting it. However, another counter (a "wake
count") is still needed to make the semaphore work.

Thanks,
Alexey