From mboxrd@z Thu Jan  1 00:00:00 1970
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-reply-to: Your message of "Mon, 14 Sep 2009 12:43:42 EDT." <6bd256e5d8267deae22ba7e994440bfd@quanstro.net>
References: <6bd256e5d8267deae22ba7e994440bfd@quanstro.net>
From: Bakul Shah
Date: Sun, 20 Sep 2009 13:13:10 -0700
Message-Id: <20090920201310.35C2C5B37@mail.bitblocks.com>
Subject: Re: [9fans] Petabytes on a budget: JBODs + Linux + JFS
Topicbox-Message-UUID: 728a7848-ead5-11e9-9d60-3106f5b1d025

On Mon, 14 Sep 2009 12:43:42 EDT erik quanstrom wrote:
> > I am going to try my hands at beating a dead horse:)
> > So when you create a Venti volume, it basically writes '0's to all the
> > blocks of the underlying device right?  If I put a venti volume on an AoE
> > device which is a linux raid5, using normal desktop sata drives, what
> > are my chances of a successful completion of the venti formatting (let's
> > say 1TB raw size)?
>
> drive mfgrs don't report write error rates.  i would consider any
> drive with write errors to be dead as fried chicken.  a more
> interesting question is what is the chance you can read the
> written data back correctly.  in that case with desktop drives,
> you have a
>     8 bits/byte * 1e12 bytes / 1e14 bits/ure = 8%

Isn't that the probability of getting a bad sector when you
read a terabyte?  In other words, this is related not to the
disk size but to how much you read from the given disk.  Granted,
when you "resilver" you have no choice but to read the entire
disk, and that is why just one redundant disk is not good enough
for TB size disks (if you lose a disk there is an 8% chance you
copied a bad block in resilvering a mirror).

> i'm a little too lazy to calculate what the probability is that
> another sector in the row is also bad.  (this depends on
> stripe size, the number of disks in the raid, etc.)  but it's
> safe to say that it's pretty small.
> for a 3 disk raid 5 with
> 64k stripes it would be something like
>     8 bits/byte * 64k * 3 / 1e14 = 1e-8

The read error prob. for a 64K byte stripe is 3*2^19/10^14 ~= 3*0.5E-8,
since three 64k byte blocks (2^19 bits each) have to be read.  The
unrecoverable case is two of them being bad at the same time.  The
prob. of this is 3*(0.5E-8)^2 = 3*0.25E-16 (not sure I did this right --
we have to consider the exact same sector # going bad in two of the
three disks, and there are three such pairs).
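
For anyone who wants to check the arithmetic above, here is a rough
Python sketch.  It assumes the 1e-14 per-bit URE rate from the quoted
message and that bit errors are independent, which real drives may not
honor.  The function names are mine, made up for illustration.  It
works at the granularity of a 64k block, not an individual sector, so
it does not settle the "exact same sector #" question.

```python
import math

# Assumption taken from the quoted message: one unrecoverable read
# error (URE) per 1e14 bits read, typical of desktop drive datasheets.
URE_PER_BIT = 1e-14


def expected_ures(bytes_read, ure_per_bit=URE_PER_BIT):
    """Expected number of UREs when reading bytes_read bytes."""
    return 8 * bytes_read * ure_per_bit


def prob_any_ure(bytes_read, ure_per_bit=URE_PER_BIT):
    """Probability of at least one URE, assuming independent bit errors.

    Computes 1 - (1 - p)^n via log1p/expm1 to stay accurate for tiny p.
    """
    bits = 8 * bytes_read
    return -math.expm1(bits * math.log1p(-ure_per_bit))


def stripe_probs(block_bytes=64 * 1024, disks=3, ure_per_bit=URE_PER_BIT):
    """Return (p_block, p_any_block_bad, p_two_blocks_bad) for one stripe.

    p_block          : one 64k block hits a URE
    p_any_block_bad  : roughly disks * p_block (first-order estimate)
    p_two_blocks_bad : some pair of blocks in the stripe is bad at once,
                       roughly C(disks, 2) * p_block^2
    """
    p_block = prob_any_ure(block_bytes, ure_per_bit)
    pairs = math.comb(disks, 2)
    return p_block, disks * p_block, pairs * p_block ** 2


if __name__ == "__main__":
    tb = 1e12
    print("expected UREs reading 1TB:", expected_ures(tb))
    print("P(at least one URE in 1TB):", prob_any_ure(tb))
    p_block, p_any, p_two = stripe_probs()
    print("P(64k block bad):", p_block)
    print("P(any of 3 blocks bad):", p_any)
    print("P(two of 3 blocks bad):", p_two)
```

Note that the 8% figure is strictly the expected number of errors per
terabyte read; under the independence assumption the chance of at
least one error comes out slightly lower.  The two-block figure lands
a bit above 3*0.25E-16 only because 2^19/10^14 is closer to 0.52E-8
than 0.5E-8.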