9fans - fans of the OS Plan 9 from Bell Labs
From: erik quanstrom <quanstro@quanstro.net>
To: 9fans@9fans.net
Subject: Re: [9fans] Petabytes on a budget: JBODs + Linux + JFS
Date: Sun, 20 Sep 2009 23:37:02 -0400
Message-ID: <4d29649e5c597cd8ebd627a2d65f2c9e@quanstro.net>
In-Reply-To: <20090920201310.35C2C5B37@mail.bitblocks.com>

> > drive mfgrs don't report write error rates.  i would consider any
> > drive with write errors to be dead as fried chicken.  a more
> > interesting question is what is the chance you can read the
> > written data back correctly.  in that case with desktop drives,
> > you have a
> > 	8 bits/byte * 1e12 bytes / 1e14 bits/ure = 8%
>
> Isn't that the probability of getting a bad sector when you
> read a terabyte? In other words, this is not related to the
> disk size but how much you read from the given disk. Granted
> that when you "resilver" you have no choice but to read the
> entire disk and that is why just one redundant disk is not
> good enough for TB size disks (if you lose a disk there is 8%
> chance you copied a bad block in resilvering a mirror).

see below.  i think you're confusing the 8% chance of failure for a
single disk with the 1e-7% chance of failure for a 3 disk array of
tb drives.

i would think this is acceptable.  at these low levels, something
else is going to get you, like drives failing in a correlated way,
say because of power problems.
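
for reference, the single-disk 8% figure quoted above can be checked with a few lines of python.  this is only a sketch: it assumes the 1e14 bits/ure spec-sheet rate, one full read of a 1e12 byte disk, and independent bit errors.  the names are made up for illustration.

```python
import math

URE_RATE = 1e-14      # unrecoverable read errors per bit read (spec-sheet figure)
BYTES_READ = 1e12     # one full pass over a 1 tb disk
bits_read = 8 * BYTES_READ

# expected number of ures in one full read; this is the 8% figure
expected_ures = bits_read * URE_RATE

# probability of at least one ure, assuming independent bit errors:
# 1 - (1 - rate)^bits, computed via log1p/expm1 to avoid rounding loss
p_at_least_one = -math.expm1(bits_read * math.log1p(-URE_RATE))

print(f"expected ures per full read: {expected_ures:.4f}")
print(f"p(at least one ure):         {p_at_least_one:.4f}")
```

the two numbers differ slightly because the 8% is an expected count, not a probability; at these rates the difference doesn't change the argument.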

> > i'm a little too lazy to calculate what the probability is that
> > another sector in the row is also bad.  (this depends on
> > stripe size, the number of disks in the raid, etc.)  but it's
> > safe to say that it's pretty small.  for a 3 disk raid 5 with
> > 64k stripes it would be something like
> > 	8 bits/byte * 64k *3 / 1e14 = 1e-8
>
> The read error prob. for a 64K byte stripe is 3*2^19/10^14 ~=
> 3*0.5E-8, since three 64k byte blocks have to be read.  The
> unrecoverable case is two of them being bad at the same time.
> The prob. of this is 3*0.25E-16 (not sure I did this right --

thanks for noticing that.  i think i didn't explain myself well.
i was calculating the rough probability of a ure in reading the
*whole array*, not just one stripe.

to do this more methodically using your method, we need
to count up all the possible ways of getting a double fail
with 3 disks and multiply by the probability of getting that
sort of failure and then add 'em up.  if 0 is ok and 1 is fail,
then i think there are these cases:

0 0 0
1 0 0
0 1 0
0 0 1
1 1 0
1 0 1
0 1 1
1 1 1
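
the same table can be generated mechanically.  a small python sketch, assuming (as above) that a stripe is lost when two or more of its three chunks are unreadable; the variable names are only illustrative.

```python
from itertools import product

# every ok(0)/fail(1) pattern across the 3 disks of one stripe
patterns = list(product((0, 1), repeat=3))

# raid 5 can rebuild one bad chunk per stripe; two or more is unrecoverable
fatal = [p for p in patterns if sum(p) >= 2]
doubles = [p for p in fatal if sum(p) == 2]
triples = [p for p in fatal if sum(p) == 3]

for p in patterns:
    print(*p, "lost" if p in fatal else "ok")
print(len(fatal), "fatal patterns:", len(doubles), "double,", len(triples), "triple")
```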

so there are 4 ways to fail.  with 64k chunks, each disk contributes
2^19 bits per stripe.  the 3 double fails together have a probability of
	3*(2^19 bits * 1e-14 /bit)^2
and the triple fail has a probability of
	(2^19 bits * 1e-14 /bit)^3
so we have
	3*(2^19 bits * 1e-14 /bit)^2 + (2^19 bits * 1e-14 /bit)^3
	~= 3*(2^19 bits * 1e-14 /bit)^2
	= 8.24633720832e-17
that's per stripe.  if we multiply by 1e12/(64*1024) stripes/array,
we have
	= 1.2582912e-09
which is remarkably close to my lousy first guess.  so we went
from 8e-2 to 1e-9 for an improvement of 7 orders of magnitude.
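
the arithmetic above can be reproduced with a short python sketch.  it assumes a 64k byte chunk per disk (2^19 bits), the 1e14 bits/ure rate, independent errors, and 1e12/(64*1024) stripes per array as in the text.  the names are made up for illustration, and the "exact" line just adds back the (1-p) factor that the approximation drops.

```python
import math

URE_RATE = 1e-14              # ures per bit read
CHUNK_BITS = 8 * 64 * 1024    # one 64k byte chunk = 2^19 bits
DISK_BYTES = 1e12

# chance that reading a single chunk hits a ure
p = CHUNK_BITS * URE_RATE

# three double-fail patterns, triple fail ignored as negligible
per_stripe_approx = 3 * p**2

# exactly two chunks bad (three ways) plus all three bad
per_stripe_exact = 3 * p**2 * (1 - p) + p**3

# scale up to the whole array by the number of stripes
stripes = DISK_BYTES / (64 * 1024)
per_array = per_stripe_approx * stripes

print(f"per stripe (approx): {per_stripe_approx:.6e}")
print(f"per stripe (exact):  {per_stripe_exact:.6e}")
print(f"per array:           {per_array:.6e}")
```

multiplying by the stripe count treats the per-stripe chances as small enough to simply add up, which is reasonable at these magnitudes.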

> we have to consider the exact same sector # going bad in two
> of the three disks and there are three such pairs).

the exact sector doesn't matter.  i don't know of any
implementations that try to do partial stripe recovery.

- erik



