9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] 9front pegs CPU on VMware
@ 2013-12-16  1:49 Blake McBride
  2013-12-16  1:52 ` erik quanstrom
  0 siblings, 1 reply; 18+ messages in thread
From: Blake McBride @ 2013-12-16  1:49 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Greetings,

I am running 9front on VMware Fusion successfully; however, the CPU is
pegged.  I've seen this before with DOS: the guest OS has its own idle
loop, so VMware sees it as always using CPU.  There is a patch that fixes
this issue for a DOS guest.  Any ideas for 9front?

Thanks.

Blake McBride


* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16  1:49 [9fans] 9front pegs CPU on VMware Blake McBride
@ 2013-12-16  1:52 ` erik quanstrom
  2013-12-16  2:35   ` Blake McBride
  0 siblings, 1 reply; 18+ messages in thread
From: erik quanstrom @ 2013-12-16  1:52 UTC (permalink / raw)
  To: 9fans

> I am running 9plan on VMware Fusion successfully, however, the CPU is
> pegged.  I've seen this before with DOS.  Basically the OS has its own idle
> loop so VMware sees it as always using CPU.  There is a patch to fix this
> issue with a DOS guest.  Any ideas with 9front?

change idlehands in /sys/src/9/pc to call halt unconditionally instead of
whatever it's doing now.
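
for reference, a minimal sketch of what that change looks like (assuming
the usual declarations in /sys/src/9/pc/main.c; halt() is the existing
kernel routine that idles the cpu until the next interrupt):

/* sketch only: idlehands() reduced to an unconditional halt */
void
idlehands(void)
{
        halt();         /* an interrupt gets us going again */
}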

- erik




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16  1:52 ` erik quanstrom
@ 2013-12-16  2:35   ` Blake McBride
  2013-12-16  2:58     ` andrey mirtchovski
  2013-12-16 10:17     ` cinap_lenrek
  0 siblings, 2 replies; 18+ messages in thread
From: Blake McBride @ 2013-12-16  2:35 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Thanks.  That fixed the problem.  For the faint of heart, here is what I
did:

    cd /sys/src/9/pc
    acme main.c

At the bottom of the file there is a function named idlehands(void).
Change that function to do nothing but call halt().

Then, from that same directory, build the kernel with:

    mk 'CONF=pcf'

Then install the kernel with:

    9fs 9fat
    cp 9pcf /n/9fat

Halt the system with:

    fshalt

Reboot the machine.

On Sun, Dec 15, 2013 at 7:52 PM, erik quanstrom <quanstro@labs.coraid.com>wrote:

> > I am running 9plan on VMware Fusion successfully, however, the CPU is
> > pegged.  I've seen this before with DOS.  Basically the OS has its own
> idle
> > loop so VMware sees it as always using CPU.  There is a patch to fix this
> > issue with a DOS guest.  Any ideas with 9front?
>
> change idlehands in /sys/src/9/pc to call halt unconditionally instead of
> whatever it's doing now.
>
> - erik
>
>


* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16  2:35   ` Blake McBride
@ 2013-12-16  2:58     ` andrey mirtchovski
  2013-12-16  3:05       ` Blake McBride
  2013-12-16 10:17     ` cinap_lenrek
  1 sibling, 1 reply; 18+ messages in thread
From: andrey mirtchovski @ 2013-12-16  2:58 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

welcome to the club. now do the same thing with linux, and try to
regale us with your experience in fewer than 4 blog posts :)




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16  2:58     ` andrey mirtchovski
@ 2013-12-16  3:05       ` Blake McBride
  0 siblings, 0 replies; 18+ messages in thread
From: Blake McBride @ 2013-12-16  3:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Good point.  I can't.


On Sun, Dec 15, 2013 at 8:58 PM, andrey mirtchovski
<mirtchovski@gmail.com>wrote:

> welcome to the club. now do the same thing with linux. and try to
> regale your experience in less than 4 blogposts :)
>
>


* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16  2:35   ` Blake McBride
  2013-12-16  2:58     ` andrey mirtchovski
@ 2013-12-16 10:17     ` cinap_lenrek
  2013-12-16 14:57       ` Blake McBride
  1 sibling, 1 reply; 18+ messages in thread
From: cinap_lenrek @ 2013-12-16 10:17 UTC (permalink / raw)
  To: 9fans

the idlehands() on 9front is as follows:

/*
 *  put the processor in the halt state if we've no processes to run.
 *  an interrupt will get us going again.
 */
void
idlehands(void)
{
	extern int nrdy;

	if(conf.nmach == 1)
		halt();
	else if(m->cpuidcx & Monitor)
		mwait(&nrdy);
}

the reason for not just unconditionally calling halt() on a *multiprocessor*
is that this would keep the processor sleeping even when processes become
ready to be executed. there is currently no way for the first woken processor
to wake up another one other than through the monitor/mwait mechanism, which
for some reason seems not to be emulated in that vmware fusion setup. one can
run aux/cpuid to see what processor features are supported.

yes, there's the HZ tick that should wake up the sleeping processor eventually,
but by then it might be too late.

note, this only applies to *multiprocessor* systems. so if you set up your
vm with just a single cpu, it will not waste cycles spinning.
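
for illustration, a conceptual sketch of the monitor/mwait wait (not the
9front source; kmonitor() and kmwait() are hypothetical stand-ins for the
kernel's MONITOR/MWAIT wrappers):

/* wait until somebody writes *nrdy; re-check after arming the
 * monitor to close the race between the test and the wait. */
static void
mwaitnrdy(int *nrdy)
{
        while(*nrdy == 0){
                kmonitor(nrdy);         /* arm the monitor on nrdy's cache line */
                if(*nrdy != 0)
                        break;
                kmwait();               /* doze until that line is written */
        }
}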

--
cinap




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16 10:17     ` cinap_lenrek
@ 2013-12-16 14:57       ` Blake McBride
  2013-12-16 15:34         ` erik quanstrom
  0 siblings, 1 reply; 18+ messages in thread
From: Blake McBride @ 2013-12-16 14:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I am running a dual-core setup.  CPU info is:

vendor GenuineIntel
procmodel 00020655 / 00010800
features fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
features pse36 clflush dts mmx fxsr sse sse2 ss
features pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes
hypervisor
extmodel 00000000 / 00000000
extfeatures nx tscp lm
extfeatures ahf64
procname Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz
physbits 40
virtbits 48

I have the latest version of VMware Fusion.  I don't have this problem with
Linux, OpenIndiana, NetBSD, etc.  I did have the problem with MS-DOS but
downloaded a fix.

Thanks.

Blake



On Mon, Dec 16, 2013 at 4:17 AM, <cinap_lenrek@felloff.net> wrote:

> the idlehands() on 9front is as follows:
>
> /*
>  *  put the processor in the halt state if we've no processes to run.
>  *  an interrupt will get us going again.
>  */
> void
> idlehands(void)
> {
>         extern int nrdy;
>
>         if(conf.nmach == 1)
>                 halt();
>         else if(m->cpuidcx & Monitor)
>                 mwait(&nrdy);
> }
>
> the reason for not just unconditionally calling halt() on a
> *multiprocessor*
> is that this would keep the processor sleeping even when processes become
> ready to be executed. there is currently no way for the first woken
> processor
> to wakup another one other than the monitor/mwait mechanism; which for some
> reason seems not to be emulated in that vmware fusion setup. one can run
> aux/cpuid to see what processor features are supported.
>
> yes, theres the HZ tick that should wake up the sleeping processor
> eventually,
> but then it might be too late.
>
> note, this only applies to *multiprocessor* systems. so when you setup your
> vm with just a single cpu, it will not waste cycles spinning.
>
> --
> cinap
>
>


* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16 14:57       ` Blake McBride
@ 2013-12-16 15:34         ` erik quanstrom
  2013-12-16 16:25           ` Matthew Veety
  2013-12-17 11:00           ` cinap_lenrek
  0 siblings, 2 replies; 18+ messages in thread
From: erik quanstrom @ 2013-12-16 15:34 UTC (permalink / raw)
  To: blake, 9fans

> /*
>  *  put the processor in the halt state if we've no processes to run.
>  *  an interrupt will get us going again.
>  */
> void
> idlehands(void)
> {
>         extern int nrdy;
>
>         if(conf.nmach == 1)
>                 halt();
>         else if(m->cpuidcx & Monitor)
>                 mwait(&nrdy);
> }
>
> the reason for not just unconditionally calling halt() on a
> *multiprocessor*
> is that this would keep the processor sleeping even when processes become
> ready to be executed. there is currently no way for the first woken
> processor
> to wakup another one other than the monitor/mwait mechanism; which for some
> reason seems not to be emulated in that vmware fusion setup. one can run
> aux/cpuid to see what processor features are supported.
>
> yes, theres the HZ tick that should wake up the sleeping processor
> eventually,
> but then it might be too late.

it won't be "too late" in the sense of causing failures.  i've tried testing this and
generally found that reduced contention on the dog pile lock means
unconditionally halting gives a performance boost.

- erik




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16 15:34         ` erik quanstrom
@ 2013-12-16 16:25           ` Matthew Veety
  2013-12-16 16:59             ` erik quanstrom
  2013-12-17 11:00           ` cinap_lenrek
  1 sibling, 1 reply; 18+ messages in thread
From: Matthew Veety @ 2013-12-16 16:25 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Dec 16, 2013, at 10:34, erik quanstrom <quanstro@labs.coraid.com> wrote:

>> /*
>> *  put the processor in the halt state if we've no processes to run.
>> *  an interrupt will get us going again.
>> */
>> void
>> idlehands(void)
>> {
>>        extern int nrdy;
>> 
>>        if(conf.nmach == 1)
>>                halt();
>>        else if(m->cpuidcx & Monitor)
>>                mwait(&nrdy);
>> }
>> 
>> the reason for not just unconditionally calling halt() on a
>> *multiprocessor*
>> is that this would keep the processor sleeping even when processes become
>> ready to be executed. there is currently no way for the first woken
>> processor
>> to wakup another one other than the monitor/mwait mechanism; which for some
>> reason seems not to be emulated in that vmware fusion setup. one can run
>> aux/cpuid to see what processor features are supported.
>> 
>> yes, theres the HZ tick that should wake up the sleeping processor
>> eventually,
>> but then it might be too late.
> 
> it won't be "too late"—as causing failures.  i've tried testing this and
> generally found that reduced contention on the dog pile lock means
> unconditionally halting gives a performance boost.
> 
> - erik
> 

What are the changes you made to 9atom to facilitate this? Just replacing the if/else with a halt?





* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16 16:25           ` Matthew Veety
@ 2013-12-16 16:59             ` erik quanstrom
  0 siblings, 0 replies; 18+ messages in thread
From: erik quanstrom @ 2013-12-16 16:59 UTC (permalink / raw)
  To: 9fans

> > it won't be "too late"—as causing failures.  i've tried testing this and
> > generally found that reduced contention on the dog pile lock means
> > unconditionally halting gives a performance boost.
> > 
> > - erik
> > 
> 
> What are the changes you made to 9atom to facilitate this? Just replacing the if/else with a halt?

no other changes needed.  if there are, then one would suspect there is
code depending on timing that is accidental, and not guaranteed.

the current 9atom version is somewhat compromised:

	if(m->machno != 0)
		halt();

i don't recall any reason for this other than some long-forgotten experiments.

- erik




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-16 15:34         ` erik quanstrom
  2013-12-16 16:25           ` Matthew Veety
@ 2013-12-17 11:00           ` cinap_lenrek
  2013-12-17 13:38             ` erik quanstrom
  2013-12-19  9:01             ` Gorka Guardiola Muzquiz
  1 sibling, 2 replies; 18+ messages in thread
From: cinap_lenrek @ 2013-12-17 11:00 UTC (permalink / raw)
  To: 9fans

that's a surprising result. by dog pile lock you mean the runq spinlock, no?

--
cinap




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-17 11:00           ` cinap_lenrek
@ 2013-12-17 13:38             ` erik quanstrom
  2013-12-17 14:14               ` erik quanstrom
  2013-12-19  9:01             ` Gorka Guardiola Muzquiz
  1 sibling, 1 reply; 18+ messages in thread
From: erik quanstrom @ 2013-12-17 13:38 UTC (permalink / raw)
  To: 9fans

> thats a surprising result. by dog pile lock you mean the runq spinlock no?

yes.

- erik




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-17 13:38             ` erik quanstrom
@ 2013-12-17 14:14               ` erik quanstrom
  0 siblings, 0 replies; 18+ messages in thread
From: erik quanstrom @ 2013-12-17 14:14 UTC (permalink / raw)
  To: quanstro, 9fans

On Tue Dec 17 08:40:23 EST 2013, quanstro@quanstro.net wrote:
> > thats a surprising result. by dog pile lock you mean the runq spinlock no?
>
> yes.

my guess is it is made worse by the probes outside the lock.

- erik




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-17 11:00           ` cinap_lenrek
  2013-12-17 13:38             ` erik quanstrom
@ 2013-12-19  9:01             ` Gorka Guardiola Muzquiz
  2013-12-19 14:16               ` Gorka Guardiola
  2013-12-19 15:19               ` erik quanstrom
  1 sibling, 2 replies; 18+ messages in thread
From: Gorka Guardiola Muzquiz @ 2013-12-19  9:01 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs



> On 17 Dec 2013, at 12:00, cinap_lenrek@felloff.net wrote:
> 
> thats a surprising result. by dog pile lock you mean the runq spinlock no?
> 

I guess it depends on the HW, but I don't find that so surprising.  You are
looping, sending messages to the coherency fabric, which gets congested as a
result.  I have seen that happen.

You should back off, but sleeping for a fixed time is not a good solution
either.  Mwait is a perfect solution in this case: there is some latency, but
you are in a bad place anyway, and with mwait performance does not degrade too
much.

Even in user space, where spinlocks back off by sleeping, if you get to that
point your latency goes through the roof.  Latency is worse than with mwait
because you are sleeping unconditionally.  Mwait does not prevent you from
getting the interrupt to schedule.  In most cases mwait is the better way to
back off in spinlocks in general.  It is also good for power, which may
prevent thermal slowdowns of the clock too.
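
For concreteness, here is a rough sketch of the kind of user-space
test-and-set lock with sleeping backoff I mean (hypothetical names: tas()
stands for the usual atomic test-and-set, and sleep(0) yields the processor,
as in Plan 9):

/* spin a while, then back off by yielding; simple, but once you
 * start sleeping the latency is much worse than with mwait. */
void
backofflock(int *key)
{
        int i, spins;

        spins = 1;
        while(tas(key) != 0){
                for(i = 0; i < spins; i++)
                        ;                       /* busy-wait before yielding */
                if(spins < 1<<16)
                        spins <<= 1;            /* exponential backoff */
                sleep(0);                       /* give up the processor */
        }
}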

G.





* Re: [9fans] 9front pegs CPU on VMware
  2013-12-19  9:01             ` Gorka Guardiola Muzquiz
@ 2013-12-19 14:16               ` Gorka Guardiola
  2013-12-19 15:19               ` erik quanstrom
  1 sibling, 0 replies; 18+ messages in thread
From: Gorka Guardiola @ 2013-12-19 14:16 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

>
>
> Latency is worse than using mwait because you are sleeping unconditionally.
> Mwait does not prevent you from getting the interrupt to schedule.
>
>
By this I mean that mwait unblocks on an interrupt.  You could do something
like the following (doing exponential backoff by calling sleep, or
sleep/wakeup in the kernel), one out of N iterations, where N goes from big
to 1 as the count increases:

while(1){
        mwait(&l->mwaitvar);
        if(test_the_var(l))
                break;
        if(++i % N == 0)        /* one out of N iterations */
                sleep(0);
}

This makes the process consume a small part of the quantum (until the next
tick) waiting, and for most of that time the processor is turned off or at
least consuming less.

G.


* Re: [9fans] 9front pegs CPU on VMware
  2013-12-19  9:01             ` Gorka Guardiola Muzquiz
  2013-12-19 14:16               ` Gorka Guardiola
@ 2013-12-19 15:19               ` erik quanstrom
  2013-12-19 15:57                 ` Gorka Guardiola
  1 sibling, 1 reply; 18+ messages in thread
From: erik quanstrom @ 2013-12-19 15:19 UTC (permalink / raw)
  To: 9fans

for those without much mwait experience, mwait is a kernel-only primitive
(as per the instructions) that pauses the processor until a change has been
made in some range of memory.  the size is determined by probing the h/w,
but think cacheline.  so the discussion of locking is kernel specific as well.

> > On 17 Dec 2013, at 12:00, cinap_lenrek@felloff.net wrote:
> > 
> > thats a surprising result. by dog pile lock you mean the runq spinlock no?
> > 
> 
> I guess it depends on the HW, but I don´t find that so surprising. You are looping
> sending messages to the coherency fabric, which gets congested as a result.
> I have seen that happen.

i assume you mean that there is contention on the cacheline holding the runq lock?
i don't think there's classical congestion, as i believe cachelines not involved in the
mwait would experience no holdup.

> You should back off, but sleeping for a fixed time is not a good solution either.
> Mwait is a perfect solution in this case, there is some latency, but you are in a bad
> place anyway and with mwait, performance does not degrade too much.

mwait() does improve things and one would expect the latency to always be better
than spinning*.  but as it turns out the current scheduler is pretty hopeless in its locking
anyway.  simply grabbing the lock with lock rather than canlock makes more sense to me.

also, using ticket locks (see 9atom nix kernel) will provide automatic backoff within the lock.
ticket locks are a poor solution as they're not really scalable but they will scale to 24 cpus
much better than tas locks.
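
for the curious, a minimal ticket-lock sketch in plain c11 atomics
(illustration only; the 9atom nix kernel's actual code differs):

#include <stdatomic.h>

/* tickets are handed out in order; each cpu spins only until its
 * number comes up, which gives fifo fairness and bounded backoff. */
typedef struct {
        atomic_uint next;       /* next ticket to hand out */
        atomic_uint owner;      /* ticket currently being served */
} Ticketlock;

void
tlock(Ticketlock *l)
{
        unsigned t = atomic_fetch_add(&l->next, 1);

        while(atomic_load(&l->owner) != t)
                ;               /* spin until it's our turn */
}

void
tunlock(Ticketlock *l)
{
        atomic_fetch_add(&l->owner, 1);
}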

mcs locks or some other queueing-style lock is clearly the long-term solution.  but as
charles points out one would really perfer to figure out a way to fit them to the lock
api.  i have some test code, but testing queueing locks in user space is ... interesting.
i need a new approach.

- erik

* have you done tests on this?




* Re: [9fans] 9front pegs CPU on VMware
  2013-12-19 15:19               ` erik quanstrom
@ 2013-12-19 15:57                 ` Gorka Guardiola
  2013-12-19 16:15                   ` erik quanstrom
  0 siblings, 1 reply; 18+ messages in thread
From: Gorka Guardiola @ 2013-12-19 15:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Dec 19, 2013 at 4:19 PM, erik quanstrom <quanstro@quanstro.net>wrote:

> for those without much mwait experience, mwait is a kernel-only primitive
> (as per the instructions) that pauses the processor until a change has been
> made in some range of memory.  the size is determined by probing the h/w,
> but think cacheline.  so the discussion of locking is kernel specific as
> well.
>
>
The original discussion started about the runq spinlock, but I think the
scope of the problem is more general, and the solution can be applied in both
user and kernel space.  In user space you would do sleep(0); in the kernel you
would call sched(), or, if you are in the scheduler itself, loop doing mwait
(see my last email).

The manual I have available says:

"The MWAIT instruction can be executed at any privilege level. The MONITOR
CPUID feature flag (ECX[bit 3] when CPUID is executed with EAX = 1)
indicates the availability of the MONITOR and MWAIT instruction in a
processor. When set, the unconditional execution of MWAIT is supported at
privilege level 0 and conditional execution is supported at privilege
levels 1 through 3 (software should test for the appropriate support of
these instructions before unconditional use)."

There are also other extensions, which I have not tried.  I think the ideas
can be used in the kernel or in user space, though I have only tried them in
the kernel, and the implementation is only in the kernel right now.


> > > On 17 Dec 2013, at 12:00, cinap_lenrek@felloff.net wrote:
> > >
> > > thats a surprising result. by dog pile lock you mean the runq spinlock
> no?
> > >
> >
> > I guess it depends on the HW, but I don´t find that so surprising. You
> are looping
> > sending messages to the coherency fabric, which gets congested as a
> result.
> > I have seen that happen.
>
> i assume you mean that there is contention on the cacheline holding the
> runq lock?
> i don't think there's classical congestion.  as i believe cachelines not
> involved in the
> mwait would experience no hold up.
>

I mean congestion in the classical network sense.  There are switches and
links that exchange messages for the coherency protocol, and some of them get
congested.  What I was seeing was the counter of messages growing very fast
and performance degrading, which I interpret as something getting congested.
I think that when possession of the lock is ping-ponged around (not
necessarily contended, but with many changes in who holds the lock, or maybe
with contention) many messages are generated, and then the problem occurs.  I
certainly saw the HW counters for messages go up orders of magnitude when I
was not using mwait.

>
> mwait() does improve things and one would expect the latency to always be
> better
> than spining*.  but as it turns out the current scheduler is pretty
> hopeless in its locking
> anyway.  simply grabbing the lock with lock rather than canlock makes more
> sense to me.
>

These kinds of things are subtle.  I spent a lot of time measuring, it is
difficult to always know for sure what is happening, and some of the results
are counterintuitive (at least to me) and depend on the concrete
hardware/benchmark/test.  So take my conclusions with a pinch of salt :-).

I think the latency of mwait (this is what I remember for the opterons I was
measuring; it is probably different on intel and on other amd models) is
actually worse (bigger) than with spinning, but if you have enough processors
doing the spinning (not necessarily on the same locks, but generating
traffic), then at some point it reverses because of the traffic in the
coherency fabric (or thermal effects; I do remember that without mwait all
the fans would spin up, and with mwait they would turn off and the machine
would be noticeably cooler).  Measure it on your hardware anyway:
a) it will probably be different, with a better monitor/mwait, and
b) you can be sure it works for the loads you are interested in.


> also, using ticket locks (see 9atom nix kernel) will provide automatic
> backoff within the lock.
> ticket locks are a poor solution as they're not really scalable but they
> will scale to 24 cpus
> much better than tas locks.
>
> mcs locks or some other queueing-style lock is clearly the long-term
> solution.  but as
> charles points out one would really perfer to figure out a way to fit them
> to the lock
> api.  i have some test code, but testing queueing locks in user space is
> ... interesting.
> i need a new approach.
>
>
Let us know what your conclusions are after you implement and measure them
:-).

G.


* Re: [9fans] 9front pegs CPU on VMware
  2013-12-19 15:57                 ` Gorka Guardiola
@ 2013-12-19 16:15                   ` erik quanstrom
  0 siblings, 0 replies; 18+ messages in thread
From: erik quanstrom @ 2013-12-19 16:15 UTC (permalink / raw)
  To: paurea, 9fans

> The original discussion started about the runq spin lock, but I
> think the scope of the problem is more general and the solution can be
> applied in user and kernel space both.  While in user space you would
> do sleep(0) in the kernel you would sched() or if you are in the
> scheduler you would loop doing mwait (see my last email).

sched can't be called from sched!

> "The MWAIT instruction can be executed at any privilege level.  The
> MONITOR CPUID feature flag (ECX[bit 3] when CPUID is executed with EAX
> = 1) indicates the availability of the MONITOR and MWAIT instruction
> in a processor.  When set, the unconditional execution of MWAIT is
> supported at privilege level 0 and conditional execution is supported
> at privilege levels 1 through 3 (software should test for the
> appropriate support of these instructions before unconditional use)."
>
> There are also other extensions, which I have not tried.  I think the
> ideas can be used in the kernel or in user space, though I have only
> tried it the kernel and the implementation is only in the kernel right
> now.

thanks, i didn't see that.

> > i assume you mean that there is contention on the cacheline holding
> > the runq lock?  i don't think there's classical congestion.  as i
> > believe cachelines not involved in the mwait would experience no
> > hold up.
> >
>
> I mean congestion in the classical network sense.  There are switches
> and links to exchange messages for the coherency protocol and some
> them get congested.  What I was seeing is the counter of messages
> growing very very fast and the performance degrading which I interpret
> as something getting congested.  I think when the lock possession is
> pingponged around (not necessarily contented, but many changes in who
> is holding the lock or maybe contention) many messages are generated
> and then the problem occurs.  I certainly saw the HW counters for
> messages go up orders of magnitude when I was not using mwait.

any memory access makes the MESI protocol do work.  i'm still not
convinced that pounding one cache line can create enough memory traffic
to sink uninvolved processors.  (but i'm not not convinced either.)

> I think the latency of mwait (this is what I remember for the opterons
> I was measuring, probably different in intel and in other amd models)

opterons have traditionally had terrible memory latency.  especially
when crossing packages.

> is actually worse (bigger) than with spinning, but if you have enough
> processors doing the spinning (not necessarily on the same locks, but

are you sure that you're getting fair interleaving with the spin locks?  if
in fact you're interleaving on big scales (say the same processor gets the
lock 100 times in a row), that's cheating a bit, isn't it?

also, in user space, there is a two-order-of-magnitude difference between sleep(0)
and a wakeup.  and that's one of the main reasons that the semaphore-based
lock measures quite poorly.  since the latency is not in sleep and wakeup, it
appears that it's in context switching.

> Let us know what your conclusions are after you implement and
> measure them :-).
>

ticket locks (and some other stuff) are the difference between 200k iops and
1m iops.

- erik




end of thread

Thread overview: 18+ messages
2013-12-16  1:49 [9fans] 9front pegs CPU on VMware Blake McBride
2013-12-16  1:52 ` erik quanstrom
2013-12-16  2:35   ` Blake McBride
2013-12-16  2:58     ` andrey mirtchovski
2013-12-16  3:05       ` Blake McBride
2013-12-16 10:17     ` cinap_lenrek
2013-12-16 14:57       ` Blake McBride
2013-12-16 15:34         ` erik quanstrom
2013-12-16 16:25           ` Matthew Veety
2013-12-16 16:59             ` erik quanstrom
2013-12-17 11:00           ` cinap_lenrek
2013-12-17 13:38             ` erik quanstrom
2013-12-17 14:14               ` erik quanstrom
2013-12-19  9:01             ` Gorka Guardiola Muzquiz
2013-12-19 14:16               ` Gorka Guardiola
2013-12-19 15:19               ` erik quanstrom
2013-12-19 15:57                 ` Gorka Guardiola
2013-12-19 16:15                   ` erik quanstrom
