9fans - fans of the OS Plan 9 from Bell Labs
From: ron minnich <rminnich@lanl.gov>
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Cc: geoff@collyer.net
Subject: Re: [9fans] datakit
Date: Thu, 19 Aug 2004 21:30:49 -0600
Message-ID: <Pine.LNX.4.44.0408192114450.13910-100000@maxroach.lanl.gov>
In-Reply-To: <20040820115259.12e1c14c@garlic.apnic.net>

I don't know, Geoff, having seen all the failed attempts at putting
'reliable transport' into the network itself (including ATM, HIPPI,
HIPPI-800, GSN, Quadrics, Myrinet, SCI, Infiniband, ...) I've become a big
fan of dumb networks like Ethernet. All that fancy stuff works great in
the small, fails in the large, and boy oh boy ... do you really want
someone to come to you 3 months from now and say "what's this huge block
of zeros in my data file?". I don't.

We had a network here (HIPPI-800) that was super-reliable ... on 2
machines. With X thousand interfaces all going at once, you got a bad
packet once every 15 minutes. Oops. Took three months to find out that was
happening. Software now covers for that problem.

Every new network does this:
- we're reliable! count on it! Just push the bits and we'll take care of
  it!
- what errors? We're not seeing them
  (oh, wait, we're not LOOKING for them, oops -- yes, this really
   happens!)
- well, ok, you're using the network wrong
- well, ok, it has bugs, but you're not seeing them -- it's your
  application
- oops, you're seeing bugs? we never simulated this scenario. Gosh, maybe
  there is a problem.
- there's a problem. Fixed in next hardware release
- there's a problem in the new hardware release
- (final phase) Our latest code release detects and corrects any
   errors in the network!

See: NFS, from '86 to '91 (everyone remember patching SunOS kernels to
turn on udpcksum?)
See: ATM, any time

If I assume the network is not 100% reliable, I will write software that
thinks that way, and I won't get bitten when my "reliable" network with a
1e-14 BER wrecks some data.
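
As a minimal sketch of software that "thinks that way": the sender
appends its own digest to every payload and the receiver verifies it
before trusting the bytes, whatever the network claims. The frame
layout, the names wrap/unwrap/CorruptFrame, and the choice of SHA-256
are all assumptions made for this illustration; none of them come from
the message above.

```python
import hashlib
import struct

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256
HEADER = struct.Struct("!I")               # 4-byte big-endian payload length


class CorruptFrame(Exception):
    """Raised when a received frame fails its end-to-end check."""


def wrap(payload: bytes) -> bytes:
    """Build a frame: length header, payload, then a digest over both."""
    header = HEADER.pack(len(payload))
    digest = hashlib.sha256(header + payload).digest()
    return header + payload + digest


def unwrap(frame: bytes) -> bytes:
    """Return the payload, or raise CorruptFrame if anything changed."""
    if len(frame) < HEADER.size + DIGEST_LEN:
        raise CorruptFrame("frame shorter than header plus digest")
    (length,) = HEADER.unpack_from(frame)
    body_end = HEADER.size + length
    if len(frame) != body_end + DIGEST_LEN:
        raise CorruptFrame("length field does not match frame size")
    if hashlib.sha256(frame[:body_end]).digest() != frame[body_end:]:
        raise CorruptFrame("digest mismatch: payload altered in transit")
    return frame[HEADER.size:body_end]


if __name__ == "__main__":
    frame = wrap(b"results: 42\n" * 100)
    # Simulate the "huge block of zeros" case: 64 payload bytes zeroed.
    damaged = frame[:4] + bytes(64) + frame[68:]
    try:
        unwrap(damaged)
    except CorruptFrame as err:
        print("caught:", err)
```

The check runs at the endpoints, so it catches damage introduced
anywhere along the path, including in hardware that reports no errors.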

The number you need? The Sandia guys like their ASCI Red network with its
1e-21 BER. What did datakit do? I know that nothing I've ever used can do
1e-21. The Red Storm network might, however.
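
To put the two BER figures side by side, here is a rough
back-of-the-envelope calculation. The link count (1000) and the
per-link rate (800 Mbit/s, the nominal HIPPI-800 rate) are assumptions
picked for illustration, as is the premise that every link is
saturated all the time; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def seconds_between_bit_errors(ber: float, links: int,
                               bits_per_second: float) -> float:
    """Mean time between bit errors across all links, in seconds.

    Expected errors per second is BER times the aggregate bit rate;
    the mean interval between errors is the reciprocal of that.
    """
    errors_per_second = ber * links * bits_per_second
    return 1.0 / errors_per_second


# Assumed machine: 1000 links, each saturated at 800 Mbit/s.
LINKS = 1000
LINK_RATE = 800e6

for ber in (1e-14, 1e-21):
    s = seconds_between_bit_errors(ber, LINKS, LINK_RATE)
    print(f"BER {ber:g}: one bit error every {s:.3g} s "
          f"({s / SECONDS_PER_YEAR:.3g} years)")
```

Under those assumed figures, 1e-14 works out to roughly one flipped bit
every couple of minutes across the whole machine, while 1e-21 works out
to decades between errors.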

ron

p.s. the Quadrics and Infiniband guys, who are all Very Smart People, will
beg to disagree with me about listing their networks above, but I will in
turn continue to disagree with them. But maybe the Infiniband guys are
right -- I'll believe it when I see it. So it goes. The Myrinet and
HIPPI-800 and SCI and ATM (and, actually, Ethernet) guys used to believe
they could solve all the problems in the network, but last time I looked
they don't believe that any more. Software continues to guarantee hardware
reliability. TCP r00lz.


