From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [PATCH 1/2] let them spin
Date: Sat, 29 Aug 2015 13:16:12 -0400
Message-ID: <20150829171612.GD7833@brightrain.aerifal.cx>
In-Reply-To: <1440838179.693.4.camel@dysnomia.u-strasbg.fr>

On Sat, Aug 29, 2015 at 10:50:44AM +0200, Jens Gustedt wrote:
> Remove a test in __wait that checked whether other threads had
> already attempted to go to sleep in futex_wait.
>
> This has no impact on the fast path. But contrary to what one might
> think at first glance, this test slows things down when there is
> congestion.
>
> Applying this patch shows no difference in behavior in a mono-core
> setting, so it seems that this shortcut is just superfluous.

The purpose of this code is twofold: improving fairness of the lock
and avoiding burning cpu time that's _known_ to be a waste.

If you spin on a lock that already has waiters, the thread that spins
has a much better chance of getting the lock than any of the existing
waiters sleeping in futex_wait. Assuming there are sufficiently many
cores that the threads which are not sleeping don't get preempted, the
spinning thread is essentially guaranteed to get the lock unless it
spins so long that it gives up and calls futex_wait itself. This is
simply because returning from futex_wait (which all the other waiters
have to do) takes a lot more time than one spin. I suspect there are
common loads under which many of the waiters will NEVER get the lock.

The other issue is simply wasted cpu time. Unlike the no-waiters case,
where spinning _can_ save a lot of cpu time (no futex_wait or
futex_wake syscall), spinning in the with-waiters case always wastes
cpu time. If the spinning thread gets the lock when the old owner
unlocks it, the old owner has already determined that it has to send a
futex_wake. So now you have (1) a wasted futex_wake syscall in the old
owner thread and (2) a "spurious" wake in one of the waiters, which
wakes up only to find that the lock has already been stolen, and
therefore has to call futex_wait again and start the whole process
over.

Rich
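
For readers who don't have the source in front of them, here is a
rough sketch of what a __wait-style helper with the check in question
looks like. This is a paraphrase for illustration, not musl's actual
code: the name wait_on, the spin budget of 100, and the use of the raw
futex syscall and GCC __atomic builtins are my own choices. The
"!*waiters" test in the spin loop is the shortcut the patch removes;
with it, a thread only spins while nobody is already asleep in
futex_wait.

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch of a __wait-style helper (illustrative, not musl's source):
 * block until *addr no longer holds val, but spin briefly first --
 * and only while no other thread is already asleep in futex_wait.
 * The "!*waiters" test below is the check the patch removes. */
void wait_on(volatile int *addr, volatile int *waiters, int val)
{
	int spins = 100;                  /* arbitrary spin budget */

	while (spins-- > 0 && !*waiters) {
		if (*addr != val) return; /* value changed; no need to sleep */
	}

	/* Slow path: register as a waiter and block in the kernel. */
	__atomic_fetch_add(waiters, 1, __ATOMIC_SEQ_CST);
	while (*addr == val)
		syscall(SYS_futex, addr, FUTEX_WAIT, val, 0, 0, 0);
	__atomic_fetch_sub(waiters, 1, __ATOMIC_SEQ_CST);
}

With the patch applied, the loop condition would drop the "!*waiters"
term, so threads would keep spinning even when others are already
queued in the kernel -- exactly the situation the two paragraphs above
are about.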
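
To make the wasted-work argument concrete, here is a toy three-state
futex lock (0 = unlocked, 1 = locked, 2 = locked with possible
waiters). Again this is purely illustrative and not musl's mutex;
toy_lock, toy_unlock and the cmpxchg helper are hypothetical names of
my own. The comment in toy_unlock marks the window in which a spinning
thread can steal the lock and turn both the futex_wake syscall and the
waiter's wakeup into wasted work.

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Returns the old value of *p; swaps in desired only if the old
 * value was expected. */
static int cmpxchg(volatile int *p, int expected, int desired)
{
	__atomic_compare_exchange_n(p, &expected, desired, 0,
	                            __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return expected;
}

void toy_lock(volatile int *l)
{
	int c = cmpxchg(l, 0, 1);
	if (c == 0) return;               /* uncontended fast path */
	do {
		/* Mark the lock contended, then sleep while it stays so. */
		if (c == 2 || cmpxchg(l, 1, 2) != 0)
			syscall(SYS_futex, l, FUTEX_WAIT, 2, 0, 0, 0);
	} while ((c = cmpxchg(l, 0, 2)) != 0);
}

void toy_unlock(volatile int *l)
{
	/* If the old value says "contended", the owner has already
	 * committed to a futex_wake.  A spinner that grabs the lock
	 * between this release and the woken waiter's return from
	 * futex_wait makes that syscall -- and the wakeup itself --
	 * pure waste: the waiter finds the lock taken and has to
	 * call futex_wait all over again. */
	if (__atomic_exchange_n(l, 0, __ATOMIC_SEQ_CST) == 2)
		syscall(SYS_futex, l, FUTEX_WAKE, 1, 0, 0, 0);
}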