From: erik quanstrom
Date: Sun, 20 Sep 2009 23:37:02 -0400
To: 9fans@9fans.net
Subject: Re: [9fans] Petabytes on a budget: JBODs + Linux + JFS

> > drive mfgrs don't report write error rates. i would consider any
> > drive with write errors to be dead as fried chicken. a more
> > interesting question is what is the chance you can read the
> > written data back correctly. in that case with desktop drives,
> > you have a
> > 8 bits/byte * 1e12 bytes / 1e14 bits/ure = 8%
>
> Isn't that the probability of getting a bad sector when you
> read a terabyte? In other words, this is not related to the
> disk size but how much you read from the given disk. Granted
> that when you "resilver" you have no choice but to read the
> entire disk and that is why just one redundant disk is not
> good enough for TB size disks (if you lose a disk there is 8%
> chance you copied a bad block in resilvering a mirror).

see below. i think you're confusing a single disk's 8% chance of
failure with a 3 disk 1 tb array's chance of failure, which is about
1e-7%. i would think this is acceptable. at these low levels,
something else is going to get you, like drives failing
non-independently, say because of power problems.

> > i'm a little too lazy to calculate what the probability is that
> > another sector in the row is also bad. (this depends on
> > stripe size, the number of disks in the raid, etc.) but it's
> > safe to say that it's pretty small. for a 3 disk raid 5 with
> > 64k stripes it would be something like
> > 8 bits/byte * 64k * 3 / 1e14 = 1e-8
>
> The read error prob. for a 64K byte stripe is 3*2^19/10^14 ~=
> 3*0.5E-8, since three 64k byte blocks have to be read. The
> unrecoverable case is two of them being bad at the same time.
> The prob. of this is 3*0.25E-16 (not sure I did this right --

thanks for noticing that. i think i didn't explain myself well.
i was calculating the rough probability of a ure in reading the
*whole array*, not just one stripe.

to do this more methodically using your method, we need to count up
all the possible ways of getting a double fail with 3 disks, multiply
by the probability of getting that sort of failure, and then add 'em
up. if 0 is ok and 1 is fail, then i think there are these cases:

	0 0 0
	1 0 0
	0 1 0
	0 0 1
	1 1 0
	1 0 1
	0 1 1
	1 1 1

so there are 4 ways to fail. each of the 3 double fails has a
probability of (2^19 bits * 1e-14 1/bit)^2 and the triple fail has a
probability of (2^19 bits * 1e-14 1/bit)^3, so we have

	3*(2^19 bits * 1e-14 1/bit)^2 + (2^19 bits * 1e-14 1/bit)^3
	~= 3*(2^19 bits * 1e-14 1/bit)^2
	= 8.24633720832e-17

that's per stripe. if we multiply by 1e12/(64*1024) stripes/array,
we have

	= 1.2582912e-09

which is remarkably close to my lousy first guess. so we went from
8e-2 to about 1.3e-9, an improvement of nearly 8 orders of magnitude.

> we have to consider the exact same sector # going bad in two
> of the three disks and there are three such pairs).

the exact sector doesn't matter. i don't know of any implementations
that try to do partial stripe recovery.

- erik
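
a minimal c sketch of the arithmetic above, assuming the same figures
(1e-14 ure/bit desktop drives, 1 tb disks, 64k stripes, a 3-disk raid
5); the variable names are just for illustration:

#include <stdio.h>

int
main(void)
{
	double ure = 1e-14;                /* unrecoverable read errors per bit read */
	double diskbits = 8 * 1e12;        /* bits in a 1 tb disk */
	double stripebits = 8 * 64 * 1024; /* bits per disk in a 64k stripe */
	int ndisk = 3;

	double p = stripebits * ure;       /* p(ure in one disk's chunk of a stripe) */
	double pdisk = diskbits * ure;     /* p(ure reading a whole disk): ~8% */

	/* raid 5 loses data only when 2 or 3 chunks of the same stripe fail */
	double pstripe = ndisk*p*p + p*p*p;   /* ~8.25e-17 per stripe */
	double nstripe = 1e12 / (64 * 1024);  /* stripes in a 1 tb array */
	double parray = pstripe * nstripe;    /* ~1.26e-9 for the whole array */

	printf("p(ure, whole disk):      %g\n", pdisk);
	printf("p(2+ fail, one stripe):  %g\n", pstripe);
	printf("p(2+ fail, whole array): %g\n", parray);
	return 0;
}

run, it prints roughly 0.08, 8.2e-17 and 1.3e-9, matching the numbers
above.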