From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4AB7E8D7.10906@0x6a.com>
Date: Mon, 21 Sep 2009 15:57:59 -0500
From: Jack Norton
User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
References: <75bd45f10fe4970a189c6824bbadc841@quanstro.net>
In-Reply-To: <75bd45f10fe4970a189c6824bbadc841@quanstro.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] Petabytes on a budget: JBODs + Linux + JFS
Topicbox-Message-UUID: 73f25c78-ead5-11e9-9d60-3106f5b1d025

erik quanstrom wrote:
>>> i think the lesson here is don't by cheep drives; if you
>>> have enterprise drives at 1e-15 error rate, the fail rate
>>> will be 0.8%. of course if you don't have a raid, the fail
>>> rate is 100%.
>>>
>>> if that's not acceptable, then use raid 6.
>>>
>> Hopefully Raid 6 or zfs's raidz2 works well enough with cheap
>> drives!
>>
>
> don't hope. do the calculations. or simulate it.
>
> this is a pain in the neck as it's a function of ber,
> mtbf, rebuild window and number of drives.
>
> i found that not having a hot spare can increase
> your chances of a double failure by an order of
> magnitude. the birthday paradox never ceases to
> amaze.
>
> - erik
>
>
While we are on the topic: how many RAID cards have we had fail lately?
I ask because I am about to hit a fork in the road with my work-alike
of your diskless fs. I was originally going to use Linux soft RAID and
vblade, but I am considering using some RAID cards that just so happen
to be included in the piece of hardware I will be getting soon...

At work, we recently had a massive failure of our RAID array. After
much brown-nosing, and after many hard drives were shipped to our IT
guy while he scratched his head, I came to find that it was in fact the
RAID card itself that had failed (which takes out the whole array, and
apparently can also take out any new drives you throw at it).
So I ask you all this (especially those in the 'biz): with all this
redundancy on the drive side, why is there no redundancy of controller
cards (or should I say, of the driver infrastructure needed)? It is
appealing to me to try to get some Plan 9 supported RAID card and have
Plan 9 throughout (like the Coraid setup, as far as I can tell), but
this little issue bothers me.

Speaking of birthdays, I mentioned to our IT dept (all two people...)
that they should try to spread the drives they use across different
mfg dates and batches. It shocked me to learn that this was news to
them...

-Jack
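
For the "do the calculations" part above, here is a rough sketch in
Python of the two simplest pieces. It is not erik's method. It assumes
independent bit errors and exponentially distributed drive lifetimes,
and every number in it (1 TB read, 1e6 hour MTBF, 7 surviving drives,
10 and 100 hour windows) is an illustrative assumption rather than a
figure from this thread. Under those assumptions, reading 1 TB at a
1e-15 error rate gives roughly 0.8%, and a 10x longer rebuild window
(say, waiting for a replacement instead of using a hot spare) gives
roughly 10x the chance of a second failure.

```python
import math

def p_read_error(bits_read, ber):
    """Chance of at least one unrecoverable read error while reading
    bits_read bits, assuming independent errors at rate ber per bit."""
    return -math.expm1(bits_read * math.log1p(-ber))

def p_second_failure(drives_left, mtbf_hours, window_hours):
    """Chance that at least one of drives_left drives fails within
    window_hours, assuming exponential lifetimes with the given MTBF."""
    return -math.expm1(-drives_left * window_hours / mtbf_hours)

# Assumed example: read 1 TB (8e12 bits) during a rebuild.
enterprise = p_read_error(8e12, 1e-15)
cheap = p_read_error(8e12, 1e-14)

# Assumed example: 7 surviving drives, 1e6 hour MTBF,
# 10 hour window (hot spare) vs 100 hour window (no spare).
with_spare = p_second_failure(7, 1e6, 10)
without_spare = p_second_failure(7, 1e6, 100)

print(f"read error, 1e-15: {enterprise:.4%}")
print(f"read error, 1e-14: {cheap:.4%}")
print(f"second failure ratio: {without_spare / with_spare:.2f}")
```

This ignores correlated failures (same batch, same controller), which
is exactly the case the two paragraphs above worry about, so treat the
results as a lower bound at best.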