9fans - fans of the OS Plan 9 from Bell Labs
From: erik quanstrom <quanstro@quanstro.net>
To: paurea@gmail.com, 9fans@9fans.net
Subject: Re: [9fans] 9front pegs CPU on VMware
Date: Thu, 19 Dec 2013 11:15:27 -0500
Message-ID: <16972aa37b8d64c254e60a30e78c851a@brasstown.quanstro.net>
In-Reply-To: <CACm3i_h=eTBNsUarY3MMupKveP=goX4JKu3okXi5EVAn7CACLQ@mail.gmail.com>

> The original discussion started about the runq spin lock, but I
> think the scope of the problem is more general and the solution can be
> applied in both user and kernel space.  While in user space you would
> do sleep(0), in the kernel you would sched(), or if you are in the
> scheduler you would loop doing mwait (see my last email).

sched can't be called from sched!
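the user-space flavor of the idea quoted above (spin a bit, then sleep(0)
to get off the processor) can be sketched with C11 atomics.  the names and
the spin budget here are illustrative guesses, not code from any Plan 9 or
9front source; sched_yield() stands in for Plan 9's sleep(0):

```c
/* sketch of a spin-then-yield lock: spin briefly on the lock word,
 * then yield the processor instead of burning cycles.  illustrative
 * names; SPINS is a tuning guess. */
#include <sched.h>
#include <stdatomic.h>

typedef struct {
	atomic_int held;	/* 0 = free, 1 = held */
} yieldlock;

enum { SPINS = 1000 };	/* spin budget before yielding */

void
yl_lock(yieldlock *l)
{
	int i, expect;

	for(;;){
		for(i = 0; i < SPINS; i++){
			/* read first so we only write the shared
			 * cacheline when the lock looks free */
			expect = 0;
			if(atomic_load(&l->held) == 0 &&
			   atomic_compare_exchange_weak(&l->held, &expect, 1))
				return;
		}
		sched_yield();	/* the sleep(0) of the quoted text */
	}
}

void
yl_unlock(yieldlock *l)
{
	atomic_store(&l->held, 0);
}
```

in the kernel the yield would be a sched() call instead, which is exactly
where the objection below comes in.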

> "The MWAIT instruction can be executed at any privilege level.  The
> MONITOR CPUID feature flag (ECX[bit 3] when CPUID is executed with EAX
> = 1) indicates the availability of the MONITOR and MWAIT instruction
> in a processor.  When set, the unconditional execution of MWAIT is
> supported at privilege level 0 and conditional execution is supported
> at privilege levels 1 through 3 (software should test for the
> appropriate support of these instructions before unconditional use)."
>
> There are also other extensions, which I have not tried.  I think the
> ideas can be used in the kernel or in user space, though I have only
> tried it in the kernel and the implementation is only in the kernel right
> now.

thanks, i didn't see that.
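the feature test the quoted manual text calls for is CPUID.01H:ECX[bit 3].
a minimal sketch using gcc/clang's <cpuid.h> (the helper name is mine, and
the x86 guard is there because __get_cpuid only exists on x86 targets):

```c
/* sketch: report whether MONITOR/MWAIT is available, per the quoted
 * manual text (CPUID with EAX = 1, ECX bit 3). */
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

int
havemonitor(void)
{
#if defined(__i386__) || defined(__x86_64__)
	unsigned a, b, c, d;

	if(!__get_cpuid(1, &a, &b, &c, &d))
		return 0;
	return (c >> 3) & 1;	/* ECX bit 3: MONITOR/MWAIT */
#else
	return 0;
#endif
}
```

note this only says the instructions exist; as the manual says, user-level
(ring 3) execution is conditional, so a user-space mwait still has to be
prepared to fault.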

> > i assume you mean that there is contention on the cacheline holding
> > the runq lock?  i don't think there's classical congestion, as i
> > believe cachelines not involved in the mwait would experience no
> > hold-up.
> >
>
> I mean congestion in the classical network sense.  There are switches
> and links to exchange messages for the coherency protocol, and some of
> them get congested.  What I was seeing is the counter of messages
> growing very fast and the performance degrading, which I interpret
> as something getting congested.  I think when possession of the lock
> is pingponged around (not necessarily contended, but many changes in
> who is holding the lock, or maybe contention) many messages are
> generated and then the problem occurs.  I certainly saw the HW
> counters for messages go up orders of magnitude when I was not using
> mwait.

any memory access makes the MESI protocol do work.  i'm still not
convinced that pounding one cache line can create enough memory traffic
to sink uninvolved processors.  (but i'm not not convinced either.)

> I think the latency of mwait (this is what I remember for the opterons
> I was measuring, probably different in intel and in other amd models)

opterons have traditionally had terrible memory latency.  especially
when crossing packages.

> is actually worse (bigger) than with spinning, but if you have enough
> processors doing the spinning (not necessarily on the same locks, but

are you sure that you're getting fair interleaving with the spin locks?  if
in fact you're interleaving on big scales (say the same processor gets the
lock 100 times in a row), that's cheating a bit, isn't it?

also, in user space, there is a two-order-of-magnitude difference between sleep(0)
and a wakeup, and that's one of the main reasons that the semaphore-based
lock measures quite poorly.  since the latency is not in sleep and wakeup, it
appears that it's in context switching.

> Let us know what your conclusions are after you implement and
> measure them :-).
>

ticket locks (and some other stuff) are the difference between 200k iops and
1m iops.

- erik



Thread overview: 18+ messages
2013-12-16  1:49 Blake McBride
2013-12-16  1:52 ` erik quanstrom
2013-12-16  2:35   ` Blake McBride
2013-12-16  2:58     ` andrey mirtchovski
2013-12-16  3:05       ` Blake McBride
2013-12-16 10:17     ` cinap_lenrek
2013-12-16 14:57       ` Blake McBride
2013-12-16 15:34         ` erik quanstrom
2013-12-16 16:25           ` Matthew Veety
2013-12-16 16:59             ` erik quanstrom
2013-12-17 11:00           ` cinap_lenrek
2013-12-17 13:38             ` erik quanstrom
2013-12-17 14:14               ` erik quanstrom
2013-12-19  9:01             ` Gorka Guardiola Muzquiz
2013-12-19 14:16               ` Gorka Guardiola
2013-12-19 15:19               ` erik quanstrom
2013-12-19 15:57                 ` Gorka Guardiola
2013-12-19 16:15                   ` erik quanstrom [this message]
