From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: bug in pthread_cond_broadcast
Date: Tue, 12 Aug 2014 13:19:41 -0400
Message-ID: <20140812171941.GA12888@brightrain.aerifal.cx>
References: <1407801532.15134.96.camel@eris.loria.fr> <20140812165033.GM22308@port70.net>
In-Reply-To: <20140812165033.GM22308@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Tue, Aug 12, 2014 at 06:50:34PM +0200, Szabolcs Nagy wrote:
> >         trace("thread %u is last, signalling main, %s\n", *number, errorstring(ret));
> >     }
> >     while (i == phase) {
> >         tell("thread %u in phase %u (%u), waiting\n", *number, i, phase);
> >         int ret = condition_wait(&cond_client, &mut[i]);
> >         trace("thread %u in phase %u (%u), finished, %s\n", *number, i, phase, errorstring(ret));
>
> the last client thread will wait here unlocking mut[0] so
> the main thread can continue
>
> the main thread broadcast wakes all clients while holding both
> mut[0] and mut[1] then unlocks mut[0] and starts waiting on
> cond_main using mut[1]
>
> the awaken clients will go into the next phase locking mut[1]
> and waiting on cond_client using mut[1]
>
> however there might be still clients waiting on cond_client
> using mut[0] (eg. the broadcast is not yet finished)

A waiter cannot assume broadcast was finished (or that it was even
performed) just because it's returned from the wait. Waits are always
subject to spurious wakes, and a spurious wake is indistinguishable
from a non-spurious one without additional synchronization and
checking of the predicate. So, while I still haven't read the test
case in detail, I'm suspicious that it might actually be invalid...

> i see logs where one thread is already in phase 1 (using mut[1])
> while another is not yet out of condition_wait (using mut[0]):
>
> pthread_cond_smasher.c:120: thread 3 in phase 1 (1), waiting
> pthread_cond_smasher.c:122: thread 6 in phase 0 (1), finished, No error information
>
> "When a thread waits on a condition variable, having specified a particular
> mutex to either the pthread_cond_timedwait() or the pthread_cond_wait()
> operation, a dynamic binding is formed between that mutex and condition
> variable that remains in effect as long as at least one thread is blocked
> on the condition variable. During this time, the effect of an attempt by
> any thread to wait on that condition variable using a different mutex is
> undefined. "
>
> so are all clients considered unblocked after a broadcast?

Once broadcast returns (as observed by the thread which called
broadcast, or any thread that synchronizes with this thread after
broadcast returns), there are no waiters and it's valid to use a new
mutex with the cond var (or destroy it if it won't be used again).

Rich
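
For illustration only, here is a minimal sketch of the two points made in
the reply above. It is not taken from pthread_cond_smasher.c; the names
waiter, run_demo, phase, woken and NWAITERS are all hypothetical. Each
waiter re-checks its predicate in a loop, so it makes no assumption about
why the wait returned, and the broadcasting thread destroys the condition
variable as soon as pthread_cond_broadcast() has returned, which assumes an
implementation that honours the POSIX rule described in the reply.

```c
#include <assert.h>
#include <pthread.h>

#define NWAITERS 4

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond;
static int phase;	/* the predicate; protected by mtx */
static int woken;	/* waiters that saw the phase change; protected by mtx */

static void *waiter(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&mtx);
	/* A return from the wait proves nothing by itself (it may be
	 * spurious), so the predicate is re-checked every time. */
	while (phase == 0)
		pthread_cond_wait(&cond, &mtx);
	/* The predicate holds here regardless of why the wait returned. */
	assert(phase != 0);
	woken++;
	pthread_mutex_unlock(&mtx);
	return NULL;
}

/* Returns how many waiters observed the phase change, or -1 on error. */
static int run_demo(void)
{
	pthread_t t[NWAITERS];
	int i, n = 0, result;

	phase = 0;
	woken = 0;
	if (pthread_cond_init(&cond, NULL) != 0)
		return -1;
	for (i = 0; i < NWAITERS; i++) {
		if (pthread_create(&t[n], NULL, waiter, NULL) != 0)
			break;
		n++;
	}

	/* Change the predicate under the mutex, then wake everyone. */
	pthread_mutex_lock(&mtx);
	phase = 1;
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&mtx);

	/* Broadcast has returned in this thread, so no thread is blocked
	 * on cond any more. Late starters see phase == 1 and never wait. */
	pthread_cond_destroy(&cond);

	for (i = 0; i < n; i++)
		pthread_join(t[i], NULL);

	pthread_mutex_lock(&mtx);
	result = woken;
	pthread_mutex_unlock(&mtx);
	return n == NWAITERS ? result : -1;
}
```

Calling run_demo() from a program's main() should report NWAITERS woken
threads. The mutex, unlike the condition variable, stays alive until every
waiter has been joined, because woken threads still need to reacquire it.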