mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: New optimized normal-type mutex?
Date: Thu, 30 Jul 2015 09:45:19 -0400
Message-ID: <20150730134519.GB16376@brightrain.aerifal.cx>
In-Reply-To: <1438243654.10742.9.camel@inria.fr>

On Thu, Jul 30, 2015 at 10:07:34AM +0200, Jens Gustedt wrote:
> > On Wednesday, 2015-07-29 at 20:10 -0400, Rich Felker wrote:
> > On Thu, Jul 30, 2015 at 01:49:20AM +0200, Jens Gustedt wrote:
> > > Hm, could you be more specific about where this hurts?
> > > 
> > > In the code I have there is
> > > 
> > >         for (;val & lockbit;) {
> > >           __syscall(SYS_futex, loc, FUTEX_WAIT, val, 0);
> > >           val = atomic_load_explicit(loc, memory_order_consume);
> > >         }
> > > 
> > > so this should be robust against spurious wakeups, no?
> > 
> > The problem is that futex_wait returns immediately with EAGAIN if
> > *loc!=val, which happens very often if *loc is incremented or
> > otherwise changed on each arriving waiter.
> 
> Yes, sure, it may change. Whether or not this is often may depend, I
> don't think we can easily make a quantitative statement, here.

The same thing happened with the obvious implementation of cond vars,
where each waiter stores its tid and then calls futex_wait on *val with
its own tid as the expected value. The ratio of spurious wakes to
intended wakes was something like 10:1, so this is not a purely
theoretical problem.
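
Roughly, the pattern looks like this (an illustrative sketch with
made-up names, not the actual cond var code; it assumes a separate
"signaled" flag set by the waking side):

#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdatomic.h>

/* Each waiter publishes its own tid in *val, then futex-waits expecting
 * to still see that tid.  Any waiter arriving in between overwrites
 * *val, so the kernel's value check fails and FUTEX_WAIT returns
 * immediately with EAGAIN -- a spurious wake. */
static void naive_cv_wait(_Atomic int *val, _Atomic int *signaled, int tid)
{
	atomic_store(val, tid);
	while (!atomic_load(signaled))
		syscall(SYS_futex, val, FUTEX_WAIT, tid, 0);
}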

> In the case of atomics the critical section is extremely short, and
> the count, if it varies so much, should have a bit stabilized during
> the spinlock phase before coming to the futex part. That futex part is
> really only a last resort for the rare case that the thread that is
> holding the lock has been descheduled in the middle.

Spinning is near-useless whenever you have more contenders than cores.
It's an optimization for a special case (fewer contenders than cores)
but you don't want to rely on it.
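
For concreteness, the usual spin-then-wait shape looks something like
this (a rough sketch, not musl's actual lock; the spin bound and names
are made up, and the matching unlock would need a FUTEX_WAKE):

#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdatomic.h>

#define SPINS 100 /* arbitrary bound for the sketch */

static void lock(_Atomic int *lk)
{
	for (;;) {
		/* Optimistic phase: only useful if the holder is running
		 * on another core and about to release. */
		for (int i = 0; i < SPINS; i++) {
			int expect = 0;
			if (atomic_compare_exchange_weak(lk, &expect, 1))
				return;
		}
		/* Holder descheduled or cores oversubscribed: stop
		 * burning CPU and sleep until the value changes. */
		int cur = atomic_load(lk);
		if (cur)
			syscall(SYS_futex, lk, FUTEX_WAIT, cur, 0);
	}
}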

> My current test case is having X threads hammer on one single
> location, X being up to some hundred. On my 2x2 hyperthreaded CPU for
> a reasonable number of threads (X = 16 or 32) I have an overall
> performance improvement of 30%, say, when using my version of the lock
> instead of the original musl one. The point of inversion where the
> original musl lock is better is at about 200 threads.

Interesting. I suspect this would change quickly if you increased the
amount of work done with the lock held (e.g. to more closely mimic
malloc). The nasty thing is that, as soon as you _do_ cause futex_wait
to start happening rather than just spinning, performance blows up:
each thread that needs to wait ends up calling futex_wait several times
before the call succeeds, taking up CPU time that the thread holding
the lock could otherwise be using.
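
You can see this directly if you count how the wait calls return
(illustrative instrumentation only, not code from either version of
the lock):

#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdatomic.h>

static _Atomic long eagain_returns, real_waits;

static void counted_wait(int *loc, int expect)
{
	/* EAGAIN means *loc no longer equalled expect when the kernel
	 * checked it, i.e. the call was wasted and the caller loops. */
	if (syscall(SYS_futex, loc, FUTEX_WAIT, expect, 0) == -1
	    && errno == EAGAIN)
		atomic_fetch_add(&eagain_returns, 1);
	else
		atomic_fetch_add(&real_waits, 1);
}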

Rich



Thread overview: 20+ messages
2015-05-21 23:44 Rich Felker
2015-05-22  7:30 ` Jens Gustedt
2015-05-22  7:51   ` Rich Felker
2015-07-29 12:09 ` Joakim Sindholt
2015-07-29 22:11   ` Jens Gustedt
2015-07-29 23:30     ` Rich Felker
2015-07-29 23:49       ` Jens Gustedt
2015-07-30  0:10         ` Rich Felker
2015-07-30  8:07           ` Jens Gustedt
2015-07-30  9:10             ` Jens Gustedt
2015-07-30  9:36               ` Alexander Monakov
2015-07-30 10:00                 ` Jens Gustedt
2015-07-30 11:37                   ` Alexander Monakov
2015-07-30 13:46                     ` Rich Felker
2015-07-30 16:07                       ` Jens Gustedt
2015-08-03 16:36                         ` Alexander Monakov
2015-08-03 19:43                           ` Jens Gustedt
2015-08-03 20:05                             ` Isaac Dunham
2015-08-04  5:49                               ` Jens Gustedt
2015-07-30 13:45             ` Rich Felker [this message]
