* [9fans] A little ado about taslock
@ 2010-06-21 7:25 Venkatesh Srinivas
2010-06-21 14:21 ` erik quanstrom
0 siblings, 1 reply; 4+ messages in thread
From: Venkatesh Srinivas @ 2010-06-21 7:25 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Hi,
Erik's thread about a 16-processor x86 machine convinced me to try something
related to spinlocks.
The current 9 spinlocks are portable code, calling an arch-provided tas() in
a loop to do their thing. On i386, Intel recommends 'PAUSE' in the core of a
spin-lock loop; I modified tas to PAUSE (0xF3 0x90 if you prefer) if the
lock-acquire attempt failed.
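The idea can be sketched in portable C. This is not the actual pc/l.s assembly, just an illustration using GCC/Clang atomic builtins; the Lock/tas/lock/unlock names mirror the kernel's but the bodies here are assumptions:

```c
/* sketch of a tas-based spinlock with PAUSE on a failed acquire;
 * gcc builtins stand in for Plan 9's l.s assembly */
typedef struct { volatile int key; } Lock;

static int
tas(volatile int *key)
{
	/* returns the previous value: nonzero means the lock was held */
	int v = __atomic_exchange_n(key, 1, __ATOMIC_ACQUIRE);
	if(v != 0){
#if defined(__i386__) || defined(__x86_64__)
		__builtin_ia32_pause();	/* 0xF3 0x90; REP NOP on older cpus */
#endif
	}
	return v;
}

static void
lock(Lock *l)
{
	while(tas(&l->key) != 0)
		;
}

static void
unlock(Lock *l)
{
	__atomic_store_n(&l->key, 0, __ATOMIC_RELEASE);
}
```

PAUSE is a no-op encoding (REP NOP) on pre-P4 processors, so it is safe to emit unconditionally on i386.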
In a crude test on a 1.5GHz p4 willamette with a local fossil/venti and
256mb of ram, 'time mk 'CONF=pcf' > /dev/null' in /sys/src/9/pc, on a
fully-built source tree, adding the PAUSE reduced times from an average of
18.97s to 18.84s (across ten runs).
I tinkered a bit further. Removing the increments of glare, inglare and
lockstat.locks, coupled with the PAUSE addition, reduced the average real
time to 18.16s, again across 10 runs.
If taslock.c were arch-specific, we could almost certainly do better - i386
doesn't need the coherence() call in unlock, and we could safely test-and-tas
rather than raw tas().
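A test-and-tas (test-and-test-and-set) loop might look like the following sketch; again the names mirror the kernel's but the bodies are illustrative, using GCC builtins rather than assembly:

```c
/* test-and-test-and-set: spin on a plain read until the lock looks
 * free, and only then attempt the locked exchange */
typedef struct { volatile int key; } Lock;

static void
lock(Lock *l)
{
	for(;;){
		/* read-only spin keeps the cache line in shared state
		 * instead of hammering the bus with locked exchanges */
		while(__atomic_load_n(&l->key, __ATOMIC_RELAXED) != 0)
			;
		if(__atomic_exchange_n(&l->key, 1, __ATOMIC_ACQUIRE) == 0)
			return;
	}
}

static void
unlock(Lock *l)
{
	/* under x86's store ordering a release store is a plain store,
	 * which is why unlock needs no coherence() there */
	__atomic_store_n(&l->key, 0, __ATOMIC_RELEASE);
}
```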
There are also other places to look at wrt application of
arch-specific bits; see:
http://code.google.com/p/inferno-npe/source/detail?r=b83540e1e77e62a19cbd21d2eb54d43d338716a5
for what XADD can do for incref/decref. Similarly, pc/l.s:_xdec could be
much shorter, again using XADD.
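The XADD version of incref/decref amounts to a single locked fetch-and-add. A sketch, assuming (as in the kernel) that both return the new count; the builtin stands in for the XADD instruction:

```c
/* XADD returns the old value, so add the delta back to get the new count */
typedef struct { long ref; } Ref;

static long
incref(Ref *r)
{
	return __atomic_fetch_add(&r->ref, 1, __ATOMIC_SEQ_CST) + 1;
}

static long
decref(Ref *r)
{
	return __atomic_fetch_add(&r->ref, -1, __ATOMIC_SEQ_CST) - 1;
}
```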
None of these are a huge deal; just thought they might be interesting.
Take care,
-- vs
* Re: [9fans] A little ado about taslock
2010-06-21 7:25 [9fans] A little ado about taslock Venkatesh Srinivas
@ 2010-06-21 14:21 ` erik quanstrom
2010-06-21 16:28 ` Lyndon Nerenberg
0 siblings, 1 reply; 4+ messages in thread
From: erik quanstrom @ 2010-06-21 14:21 UTC (permalink / raw)
To: 9fans
> In a crude test on a 1.5GHz p4 willamette with a local fossil/venti and
> 256mb of ram, 'time mk 'CONF=pcf' > /dev/null' in /sys/src/9/pc, on a
> fully-built source tree, adding the PAUSE reduced times from an average of
> 18.97s to 18.84s (across ten runs).
we tried this at coraid years ago. it's a win — but only on the p4 and
netburst-based xeons with old-and-crappy hyperthreading enabled. it
seems to otherwise be a small loss.
i don't see an actual performance problem on the 16-cpu machine.
i see an apparent performance problem. the 4- and 16- processor
machines have a single-threaded speed ratio of ~ 1:1.7, so since
kprof does sampling on the clock interrupt, it seems reasonable
that processors could get in a timing-predictable loop and get
sampled at different places each time. no way rebalance is using
40% of the cpu, right? the anomaly in time(1) is not yet explained.
but it's clearly not much of a performance problem; there was only
a 10% slowdown between 1 core busy and 16 cores busy. that's
likely due to the fact that plan 9 knows nothing of the numa nature
of that board.
richard miller does point out a real problem. idlehands just returns
if conf.nproc>1. this is done so we don't have to wait for the next
clock tick should work become available. this is a power management
problem, not a performance problem. your interesting locking solution
posted previously doesn't help with this. it's not even a locking problem.
a potential solution to this would be to have a new bit array, e.g.
active.schedwait which is set when a proc has no work. the mach
could then call halt. a mach could then check for an idle mach
to wake after readying a proc. an apic ipi would be a suitable wakeup
mechanism with r.t. latencies < 500ns. (www.barrelfish.org/barrelfish_mmcs08.pdf)
one assumes that 500ns/2 + wakeup time ≈ wakeup time.
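the bookkeeping half of that could be a per-mach bit array; a sketch
with hypothetical names (schedwait, setidle, pickidle are all made up
here, and the halt/ipi calls are elided):

```c
#include <stdint.h>

/* hypothetical active.schedwait bit array: bit m is set while
 * mach m is halted with nothing to run (32 machs max here) */
static uint32_t schedwait;

static void
setidle(int machno)
{
	__atomic_fetch_or(&schedwait, (uint32_t)1 << machno, __ATOMIC_SEQ_CST);
	/* the real kernel would halt() here and clear the bit on wakeup */
}

static void
clearidle(int machno)
{
	__atomic_fetch_and(&schedwait, ~((uint32_t)1 << machno), __ATOMIC_SEQ_CST);
}

/* after readying a proc, pick one idle mach to send the wakeup ipi to */
static int
pickidle(void)
{
	uint32_t w = __atomic_load_n(&schedwait, __ATOMIC_SEQ_CST);

	if(w == 0)
		return -1;		/* nobody is halted */
	return __builtin_ctz(w);	/* lowest-numbered idle mach */
}
```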
two unfinished thoughts:
1. it sure wouldn't surprise me if this has been done in plan 9 before.
i'd be interested to know what ken's sequent kernel did.
2. if today 16 machs are possible (and 128 on an intel xeon mp 7500—
8 sockets * 8 core * 2t = 128), what do we expect in 5 years? 128?
- erik
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
2010-06-21 7:25 [9fans] A little ado about taslock Venkatesh Srinivas
2010-06-21 14:21 ` erik quanstrom
2010-06-21 16:28 ` Lyndon Nerenberg
2010-06-21 16:38 ` David Leimbach