From: Paul Lalonde
Date: Thu, 17 Jul 2008 20:31:44 -0700
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] Plan 9 and multicores/parallelism/concurrency?

On Jul 17, 2008, at 12:29 PM, Bakul Shah wrote:

> My reasoning was that more and more cores can be (and will be) put on
> a die, but a corresponding increase in off-chip memory bandwidth will
> not be possible, so at some point the memory bottleneck will prevent
> 100% use of the cores even if you assume ideal placement of threads
> and no thread movement to a different core.

As the number of cores increases, you have to hugely increase the
amount of cache: you need enough cache to hold a working set large
enough to keep a core busy during the long wait for its next slice of
bandwidth (a figurative slice - the multiplexing should clearly be
finer grained). Latency hiding on those fetches is critically
important.

> I was certainly not suggesting moving threads around. I was
> speculating that, as the number of cores goes up, perhaps the kernel
> is not the right place to do affinity scheduling, or much of any
> sophisticated scheduling.

Largely agreed. The real tension is in virtualizing the resources,
which beats against affinity. Affinity is clearly an early loser in
oversubscribed situations, but it would be a major win to have a
scheduler (in or out of the kernel) that could degrade intelligently
in the face of oversubscription, instead of the hard wall you get when
you throw away affinity.
> Some friends of mine are able to squeeze a lot of parallelism out of
> supposedly hard-to-parallelize code. But this is in a purely
> cooperative world where they assume threads don't move and where
> machines are dedicated to specific tasks.

Envy.

The other part not to forget about is data-parallelism. At least in
graphics we get to recast most of our heavy loads as data-parallel,
which has huge benefits. If you can manage data-parallel work with a
nice task DAG and decent load balancing, you can do wonders at keeping
data on-chip while pushing lots of flops.

Paul, patiently awaiting hardware announcements so he can talk freely.