* [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
@ 2009-04-19  7:58 John Barham
  2009-04-19 12:16 ` erik quanstrom
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: John Barham @ 2009-04-19  7:58 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I certainly can't think ahead 20 years but I think it's safe to say
that the next 5 (at least doing HPC and large-scale web type stuff)
will increasingly look like this:
http://www.technologyreview.com/computing/22504/?a=f, which talks
about building a cluster from AMD Geode (!) nodes w/ compact flash
storage.  Sure it's not super-fast, but it's very efficient per watt.
If you had more cash you might substitute HE Opterons and SSD's but
the principle is the same.

The general trend is that capital expenditures for computing are going
down but operating expenditures are going up.  Indeed if you sign up
for something like Amazon's EC2 service, your initial capital outlay
is exactly $0.  (I vividly recall paying over $3000 for a low-end
server and $300/month in colo fees back in early 2003 when I had a
hosting business.)

Apparently they use the above cluster to implement some type of
distributed memcached style cache.  Here is the page listing the many
clients for memcached:
http://code.google.com/p/memcached/wiki/Clients.  However, if w/ Plan
9 you implement the interface to the cache as a 9p service, it is
automatically available to any language that can do file I/O (heck,
even Haskell, if you can slog through the advanced type theory).  So
your software development costs go down.
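
A minimal sketch of what "any language that can do file I/O" could look
like from the client side.  Everything here is an assumption for
illustration: the one-file-per-key layout, the helper names cache_set and
cache_get, and the mount point.  A temporary directory stands in for a
mounted 9P tree (on Plan 9 that might be a path such as /n/cache, which is
hypothetical), so the code runs anywhere.

```python
import os
import tempfile

def cache_set(root, key, value):
    """Store a value by writing the file named after the key."""
    with open(os.path.join(root, key), "wb") as f:
        f.write(value)

def cache_get(root, key):
    """Return the cached bytes, or None if no file exists for the key."""
    try:
        with open(os.path.join(root, key), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None

# Stand-in for the mounted cache service (hypothetical layout: one file per key).
root = tempfile.mkdtemp()

cache_set(root, "user42", b"hello")
print(cache_get(root, "user42"))   # b'hello'
print(cache_get(root, "missing"))  # None
```

The client needs no protocol library at all, only open, read and write.
Keys containing "/" would need escaping under this layout.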

Another change that levels the playing field in Plan 9's favor is the
clock-speed wall and the move to multi-core chips.  Soon everyone is
going to have to re-write their software to make it concurrent if they
want to make it run faster.  And concurrency is hard, especially when
the predominant model is preemptive threads.  Here again Plan 9's
technical advantages, its lightweight kernel and CSP threading model,
confer an economic advantage.

I think the key to successfully being able to use Plan 9 commercially
is to use its unique technical advantages to exploit disruptive
economic changes.  Economics beats technology every time (e.g.,
x86/amd64 vs. MIPS/Itanium, Ethernet vs. Infiniband, SATA vs. SCSI) so
don't try to fight it.

  John


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-19  7:58 [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years) John Barham
@ 2009-04-19 12:16 ` erik quanstrom
  2009-04-19 15:43   ` John Barham
  2009-04-19 14:27 ` Eric Van Hensbergen
  2009-04-20 15:48 ` ron minnich
  2 siblings, 1 reply; 10+ messages in thread
From: erik quanstrom @ 2009-04-19 12:16 UTC (permalink / raw)
  To: 9fans

> I think the key to successfully being able to use Plan 9 commercially
> is to use its unique technical advantages to exploit disruptive
> economic changes.

works for coraid.

> Economics beats technology every time (e.g., x86/amd64 vs.
> MIPS/Itanium, Ethernet vs. Infiniband, SATA vs. SCSI) so
> don't try to fight it.

if those examples prove your point, i'm not sure i agree.

having just completed a combined-mode sata/sas driver,
scsi vs ata is fresh on my mind.  i'll use it as an example.

sata and scsi can't be directly compared because sata is a
specific physical/data layer that supports the ata 7+ command set*,
while scsi is a set of command sets and a set of physical
standards.

if you mean that the ata command set is not as good as the
scsi command set, i don't think i agree with this.  the ata
command set is simpler, and still gets the job done.  both
suffer from bad command formatting, but scsi is worse.  ata
has 28 bit and 48 bit (sic) commands.  scsi has 6, 10, 12, 16
and 32-byte commands.  one can find problems with both
command sets.
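
for a sense of what the two ata widths mean in practice, here is a rough
sketch of the capacity each can address.  it assumes the 28 and 48 bits
are lba (sector address) widths and that sectors are 512 bytes; the
sector size and the function name are assumptions, not something stated
above.

```python
SECTOR_BYTES = 512  # assumed sector size

def addressable_bytes(lba_bits, sector_bytes=SECTOR_BYTES):
    """Bytes reachable with an LBA field of the given width."""
    return (1 << lba_bits) * sector_bytes

lba28 = addressable_bytes(28)
lba48 = addressable_bytes(48)

print(lba28 // 2**30, "GiB")  # 128 GiB
print(lba48 // 2**50, "PiB")  # 128 PiB
```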

if you mean that sata is worse than sas, i think that's a hard
sell, too.  sata and sas use the same physical connections at
the same speeds.  there are some data-layer differences, but
they seem to me to be differences without distinction.  (as
evidenced by the aforementioned combined-mode hba.)

the main difference between sas and sata is that sas supports
dual-porting of drives so that if your hba fails, your drive can
keep working.  i don't see that as a real killer feature.  hard
drives fail so much more often than hbas.

- erik


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-19  7:58 [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years) John Barham
  2009-04-19 12:16 ` erik quanstrom
@ 2009-04-19 14:27 ` Eric Van Hensbergen
  2009-04-19 20:11   ` tlaronde
  2009-04-20 15:48 ` ron minnich
  2 siblings, 1 reply; 10+ messages in thread
From: Eric Van Hensbergen @ 2009-04-19 14:27 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sun, Apr 19, 2009 at 2:58 AM, John Barham <jbarham@gmail.com> wrote:
> I certainly can't think ahead 20 years but I think it's safe to say
> that the next 5 (at least doing HPC and large-scale web type stuff)
> will increasingly look like this:
> http://www.technologyreview.com/computing/22504/?a=f, which talks
> about building a cluster from AMD Geode (!) nodes w/ compact flash
> storage.  Sure it's not super-fast, but it's very efficient per watt.
> If you had more cash you might substitute HE Opterons and SSD's but
> the principle is the same.
>

We thought this was the future several years ago
(http://bit.ly/16ZWjc), but couldn't convince the company that such an
approach would win out over big iron.  Of course, if you look at Blue
Gene, it's really just a massive realization of this model with
several really tightly coupled interconnects.

>
> Apparently they use the above cluster to implement some type of
> distributed memcached style cache.
>

I'm not convinced that such ad-hoc DSM models are the way to go as a
general principle.  Full blown DSM didn't fare very well in the past.
Plan 9 distributed applications take a different approach and instead
of sharing memory they share services in much more of a message
passing model.  This isn't to say that all caches are bad -- I just
don't believe in making them the foundation of your programming model
as it will surely lead to trouble.

    -eric


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-19 12:16 ` erik quanstrom
@ 2009-04-19 15:43   ` John Barham
  2009-04-19 16:52     ` erik quanstrom
  0 siblings, 1 reply; 10+ messages in thread
From: John Barham @ 2009-04-19 15:43 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

>> Economics beats technology every time (e.g., x86/amd64 vs.
>> MIPS/Itanium, Ethernet vs. Infiniband, SATA vs. SCSI) so
>> don't try to fight it.
>
> if those examples prove your point, i'm not sure i agree.
>
> having just completed a combined-mode sata/sas driver,
> scsi vs ata is fresh on my mind.  i'll use it as an example.

To clarify, I meant that given X vs. Y, the cost benefits of X
eventually overwhelm the initial technical benefits of Y.

With SATA vs. SCSI in particular, I wasn't so much thinking of command
sets or physical connections but of providing cluster scale storage
(i.e., 10's or 100's of TB) where it's fast enough and reliable enough
but much cheaper to use commodity 7200 rpm SATA drives in RAID 5 than
"server grade" 10k or 15k rpm SCSI or SAS drives.

  John


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-19 15:43   ` John Barham
@ 2009-04-19 16:52     ` erik quanstrom
  2009-04-20 15:11       ` John Barham
  0 siblings, 1 reply; 10+ messages in thread
From: erik quanstrom @ 2009-04-19 16:52 UTC (permalink / raw)
  To: 9fans

> To clarify, I meant that given X vs. Y, the cost benefits of X
> eventually overwhelm the initial technical benefits of Y.
>
> With SATA vs. SCSI in particular, I wasn't so much thinking of command
> sets or physical connections but of providing cluster scale storage
> (i.e., 10's or 100's of TB) where it's fast enough and reliable enough
> but much cheaper to use commodity 7200 rpm SATA drives in RAID 5 than
> "server grade" 10k or 15k rpm SCSI or SAS drives.

this dated prejudice that scsi is for servers and ata is
for your dad's computer has just got to die.

could you explain how raid 5 relates to sata vs sas?
i can't see how it's anything but a non sequitur.

you do realize that enterprise sata drives are available?
you do realize that many of said drives are built with the
same drive mechanism as sas hard drives?

as an example, the seagate es.2 drives are available with
a sas interface or a sata interface.

(by the way, enterprise drives are well worth it, as i discovered
on monday. :-(.)

while it's true there aren't any 15k sata drives currently
on the market, on the other hand if you want real performance,
you can beat sas by getting an intel ssd drive.  these are
not currently available with a sas interface.

- erik


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-19 14:27 ` Eric Van Hensbergen
@ 2009-04-19 20:11   ` tlaronde
  0 siblings, 0 replies; 10+ messages in thread
From: tlaronde @ 2009-04-19 20:11 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sun, Apr 19, 2009 at 09:27:43AM -0500, Eric Van Hensbergen wrote:
>
> I'm not convinced that such ad-hoc DSM models are the way to go as a
> general principle.  Full blown DSM didn't fare very well in the past.
> Plan 9 distributed applications take a different approach and instead
> of sharing memory they share services in much more of a message
> passing model.  This isn't to say that all caches are bad -- I just
> don't believe in making them the foundation of your programming model
> as it will surely lead to trouble.
>

FWIW, the most satisfying definition for me of a "computing unit" (the
"atom" an OS is based on) is memory based: all the processing units that
have direct hardware access to the same memory space.

There seem to be 2 kinds of "NUMA" out there:

1) Cathedral model NUMA: a hierarchical association of memories, tightly
coupled but with different speeds (a lot of uniprocessors are NUMA these
days with cache1, cache2 and main memory). All directly known by the
cores.

2) Bazaar model NUMA, or software NUMA, or GPLNUMA: treating an
unorganized collection of storage as addressable memory, since one can
always give a way to locate the resource, including by URL; associating
high speed tightly connected memories with remote storage accessible via
IP packets sent by surface mail, if the hardware-driven whistle is heard
by the human writing the letters.

Curiously enough, I believe in 1. I don't believe in 2.
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-19 16:52     ` erik quanstrom
@ 2009-04-20 15:11       ` John Barham
  2009-04-20 16:48         ` erik quanstrom
  0 siblings, 1 reply; 10+ messages in thread
From: John Barham @ 2009-04-20 15:11 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> could you explain how raid 5 relates to sata vs sas?
> i can't see how it's anything but a non sequitur.

Here is the motivating real-world business case: You are in the movie
post-production business and need > 50 TB of online storage at as low
a price as possible with good performance and reliability.  7200 rpm
SATA (currently ~15¢/GB on Newegg) plus RAID narrows the performance
and reliability benefits of 15k rpm SAS (currently ~$1/GB) at a
much lower cost.
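
A back-of-the-envelope sketch of the gap, using only the per-GB prices
quoted above.  It assumes exactly 50 TB of raw capacity (the stated lower
bound) with 1 TB = 1000 GB, and ignores RAID parity overhead, controllers,
enclosures and power; the function name is made up for illustration.

```python
CAPACITY_GB = 50 * 1000   # 50 TB, decimal units (assumption)
SATA_CENTS_PER_GB = 15    # ~15 cents/GB, 7200 rpm SATA
SAS_CENTS_PER_GB = 100    # ~$1/GB, 15k rpm SAS

def raw_cost_dollars(capacity_gb, cents_per_gb):
    """Raw drive cost in dollars, before any RAID overhead."""
    return capacity_gb * cents_per_gb / 100

sata_cost = raw_cost_dollars(CAPACITY_GB, SATA_CENTS_PER_GB)
sas_cost = raw_cost_dollars(CAPACITY_GB, SAS_CENTS_PER_GB)

print(sata_cost)  # 7500.0
print(sas_cost)   # 50000.0
```

On these numbers the raw drives differ by a bit under a factor of seven,
which leaves a lot of room to spend on redundancy.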


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-19  7:58 [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years) John Barham
  2009-04-19 12:16 ` erik quanstrom
  2009-04-19 14:27 ` Eric Van Hensbergen
@ 2009-04-20 15:48 ` ron minnich
  2009-04-20 17:15   ` Wes Kussmaul
  2 siblings, 1 reply; 10+ messages in thread
From: ron minnich @ 2009-04-20 15:48 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sun, Apr 19, 2009 at 12:58 AM, John Barham <jbarham@gmail.com> wrote:
> I certainly can't think ahead 20 years but I think it's safe to say
> that the next 5 (at least doing HPC and large-scale web type stuff)
> will increasingly look like this:
> http://www.technologyreview.com/computing/22504/?a=f, which talks
> about building a cluster from AMD Geode (!) nodes w/ compact flash
> storage.  Sure it's not super-fast, but it's very efficient per watt.
> If you had more cash you might substitute HE Opterons and SSD's but
> the principle is the same.

It's nice. We did that one a few years ago. Here is the 7 year old
version: http://eri.ca.sandia.gov/eri/howto.html

We've been doing these with the Geode stuff since about 2006. We are
certainly not the first. The RLX was doing what FAWN did about 8 years
ago; orion, about 3-4 years ago (both transmeta). RLX and Orion
multisystems showed there is not much of a market for lots of wimpy
nodes -- yet or never, is the real question. Either way, they did not
have enough buyers to stay in business. And RLX had to drop its wimpy
transmetas for P4s, and they could not keep up with the cheap
mainboards. It's a tough business.

ron


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-20 15:11       ` John Barham
@ 2009-04-20 16:48         ` erik quanstrom
  0 siblings, 0 replies; 10+ messages in thread
From: erik quanstrom @ 2009-04-20 16:48 UTC (permalink / raw)
  To: 9fans

On Mon Apr 20 11:13:01 EDT 2009, jbarham@gmail.com wrote:
> > could you explain how raid 5 relates to sata vs sas?
> > i can't see how it's anything but a non sequitur.
>
> Here is the motivating real-world business case: You are in the movie
> post-production business and need > 50 TB of online storage at as low
> a price as possible with good performance and reliability.  7200 rpm
> SATA (currently ~15¢/GB on Newegg)

this example has nothing to do with raid.  if the object is to find the
lowest cost per gigabyte, enterprise sata drives are the cheaper option.
(it would make more sense to compare 7.2k sas and sata drives.  there
is also a premium on spindle speed.)

the original argument was that scsi is better than ata or sas is better
than sata (i'm not sure which); in my opinion, there are no facts
to justify either assertion.

> plus RAID narrows the performance
> and reliability benefits of 15k rpm SAS (currently ~$1/GB) at a
> much lower cost.

without raid, such a configuration might be impossible to deal with.

most 15k drives are 73gb.  this means you would need 685 for 50tb.
the afr is probably something like 0.15% - 0.25%.  this would mean you
will lose 1-2 drives/year.  (if you believe those rosy afr numbers.)
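
the arithmetic, as a rough sketch.  it assumes 1 tb = 1000 gb, no raid
overhead and independent failures; the afr figures are the guesses above,
not measured values, and the function name is made up for illustration.

```python
import math

TARGET_GB = 50 * 1000  # 50 tb, decimal units (assumption)
DRIVE_GB = 73          # typical 15k drive size cited above

# Number of whole drives needed to reach the target capacity.
drives = math.ceil(TARGET_GB / DRIVE_GB)

def expected_failures_per_year(n_drives, afr):
    """Expected drive failures per year for a given annualized failure rate."""
    return n_drives * afr

low = expected_failures_per_year(drives, 0.0015)   # 0.15% afr
high = expected_failures_per_year(drives, 0.0025)  # 0.25% afr

print(drives)     # 685
print(low, high)  # roughly 1.0 and 1.7 failures/year
```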

- erik


* Re: [9fans] "FAWN: Fast array of wimpy nodes" (was: Plan 9 - the next 20 years)
  2009-04-20 15:48 ` ron minnich
@ 2009-04-20 17:15   ` Wes Kussmaul
  0 siblings, 0 replies; 10+ messages in thread
From: Wes Kussmaul @ 2009-04-20 17:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

ron minnich wrote:
> RLX and Orion
> multisystems showed there is not much of a market for lots of wimpy
> nodes -- yet or never, is the real question. Either way, they did not
> have enough buyers to stay in business. And RLX had to drop its wimpy
> transmetas for P4s, and they could not keep up with the cheap
> mainboards. It's a tough business.

All RLX showed was that they didn't know how to market benefits rather
than nifty technology. Once again, the company with beautifully
engineered products failed to understand that the decision makers who
would buy the products were not engineers who loved them but people who
needed education and handholding in order to understand them and
overcome FUD from RLX competitors. A well-worn path.

Wes

