From: Rich Felker
To: musl@lists.openwall.com
Subject: New private cond var design
Date: Fri, 15 Aug 2014 15:35:36 -0400

The current cv bug reported by Jens occurs when a cv is reused with a new mutex before all the former waiters from the previous mutex have woken up and decremented themselves from the waiter count. In this case, they can't know whether to decrement the in-cv waiter count or the in-mutex waiter count, and thereby end up corrupting these counts.

Jens' proposed solution tracked "instances" via dynamically allocated, reference-counted objects. I finally think I have a solution which avoids dynamic allocation: representing the "instance" as a doubly-linked list of automatic objects on the stack of each waiter.

The cv object itself needs a single pointer to the head of the current instance. This pointer is set by the first waiter on an instance. Subsequent waiters which arrive while it's already set can check that the mutex argument is the same; if not, this is an error. The pointer is cleared when the last (formal) waiter is removed by the signal or broadcast operation.

Storing this list eliminates the need to keep a waiter count: the length of the list itself is the number of waiters which need to be moved to the mutex on broadcast. This requires an O(n) walk of the list at broadcast time, but that's really a non-issue, since the kernel is already doing a much more expensive O(n) walk of the futex waiter list anyway.

The list also allows us to eliminate the sequence number wrapping issue (sadly, only for private, non-process-shared cvs, since process-shared cvs can't use process-local memory like this) in one of two ways:

Option 1: If the list elements store the sequence number their waiter is waiting on, the signal/broadcast operations can choose a new sequence number distinct from those of all current waiters.

Option 2: Each waiter can wait on a separate futex on its own stack, so that sequence numbers are not needed at all. This eliminates all spurious wakes; signal can precisely control which waiter wakes (e.g. choosing the oldest), thereby waking exactly one waiter.
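As a rough sketch of how Option 2 could look (all names here are hypothetical, not actual musl code; the internal lock that would serialize list access, plus timeouts and cancellation handling, are omitted):

#include <errno.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* One node lives in each waiter's stack frame; the cv itself only
 * holds pointers into the current instance's list. */
struct waiter {
	struct waiter *prev, *next;   /* prev = newer, next = older */
	volatile int wake;            /* per-waiter private futex word */
};

struct cv {
	struct waiter *head, *tail;   /* head = newest, tail = oldest */
	pthread_mutex_t *mutex;       /* mutex bound to this instance */
};

static int cv_wait(struct cv *c, pthread_mutex_t *m)
{
	struct waiter w = { .next = c->head };
	if (c->head) {
		if (c->mutex != m) return EINVAL;  /* reused with a new mutex */
		c->head->prev = &w;
	} else {
		c->mutex = m;                      /* first waiter opens the instance */
		c->tail = &w;
	}
	c->head = &w;
	pthread_mutex_unlock(m);
	while (!w.wake)    /* no sequence number, so no wrapping issue */
		syscall(SYS_futex, &w.wake, FUTEX_WAIT_PRIVATE, 0, 0, 0, 0);
	pthread_mutex_lock(m);
	return 0;
}

static void cv_signal(struct cv *c)
{
	struct waiter *w = c->tail;   /* wake precisely the oldest waiter */
	if (!w) return;
	if (!(c->tail = w->prev)) {
		c->head = 0;          /* last waiter removed: clear the instance */
		c->mutex = 0;
	} else {
		c->tail->next = 0;
	}
	w->wake = 1;
	/* If the waiter already left, this wakes nobody; that's harmless. */
	syscall(SYS_futex, &w->wake, FUTEX_WAKE_PRIVATE, 1, 0, 0, 0);
}

Since each waiter sleeps on a word in its own stack frame, signal touches exactly one futex, and no other waiter can be woken spuriously.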
Broadcast then becomes much more expensive: since each waiter sleeps on its own futex, the broadcasting thread has to make one requeue syscall per waiter (a rough sketch follows at the end of this mail). But this still might be a good design.

Unless anyone sees problems with this design, I'll probably start working on it soon. I think I'll try to commit the private-futex stuff first, though, to avoid having to rebase it; fixing the cv issue in 1.0.x will not be a direct cherry-pick anyway, so there's no point in putting off 1.0.x-incompatible changes pending the fix.
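For concreteness, the per-waiter requeue could look roughly like this, reusing the hypothetical types from the sketch above; mutex_word stands in for wherever the mutex's futex word actually lives, and real code would presumably want the FUTEX_CMP_REQUEUE form with proper value checks rather than plain FUTEX_REQUEUE:

static void cv_broadcast(struct cv *c, volatile int *mutex_word)
{
	struct waiter *w, *newer;
	for (w = c->tail; w; w = newer) {
		newer = w->prev;
		w->wake = 1;  /* so the waiter won't re-sleep on its own word */
		/* nr_wake = 0, nr_requeue = 1: move this sleeper onto the
		 * mutex's futex without waking it -- one syscall per waiter */
		syscall(SYS_futex, &w->wake, FUTEX_REQUEUE_PRIVATE,
		        0, 1, mutex_word, 0);
	}
	c->head = c->tail = 0;  /* instance over */
	c->mutex = 0;
}

Rich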