From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <4b97c7ed935984e918f3f2f8e084cae9@quanstro.net>
From: erik quanstrom
Date: Mon, 21 Sep 2009 19:38:57 -0400
To: 9fans@9fans.net
In-Reply-To: <4AB7E8D7.10906@0x6a.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] Petabytes on a budget: JBODs + Linux + JFS
Topicbox-Message-UUID: 74003e60-ead5-11e9-9d60-3106f5b1d025

> At work, we recently had a massive failure of our RAID array. After
> much brown noseing, I come to find that after many harddrives being
> shipped to our IT guy and him scratching his head, it was in fact the
> RAID card itself that had failed (which takes out the whole array, plus
> can take out any new drives you throw at it apparently).

i have never seen any controller fail in such a way that drives
were actually damaged.  and i would suspect serious design issues
if that is what happened.  that's like a bad ethernet or usb
controller frying your switch.

controller failure is not common for the types of controllers i use.
for machines that are in service, controller failure is no more
common than cpu or motherboard failure.

> So I ask you all this (especially those in the 'biz): all this
> redundancy on the drive side, why no redundancy of controller cards (or
> should I say, the driver infrastructure needed)?

the high-end sas "solution" is to buy expensive dual-ported
drives and cross-connect controllers and drives.  this is very
complicated and requires twice the number of ports or sas
expanders.  it also requires quite a bit of driver-level code.

it is possible, if the failure rates are low enough (and especially
if cable failure is more probable than port failure), that the
extra bits and pieces in this dual-ported setup are *less* reliable
than a standard setup.  and it's all for naught if the cpu, mb or
memory blow up.

i keep a cold spare controller, just in case.

(coraid sells a spares kit for the truly paranoid, like me.
and a mirroring appliance for those who are even more paranoid.
of course the mirroring appliance can be mirrored, which is
great until the switch blows up.  but naturally you can use
multiple switches.  alas, no protection from meteors.)

> It is appealing to me to try and get some plan 9 supported raid card and
> have plan 9 throughout (like the coraid setup as far as I can tell), but
> this little issue bothers me.

plan 9 doesn't support any raid cards per se.  (well, maybe
the wonderful but now ancient parallel scsi drivers might.)

theoretically, intel matrix raid supports raid and is drivable
with the ahci driver.  that would limit you to the on-board ports.
i've never tried it.  as far as i can tell, matrix raid uses smm
mode + microcode on the southbridge to operate.  (anyone know
better?)  and i want as little code sneaking around behind my
back as possible.

the annoying problem with "hardware" raid is that it takes
real contortions to make an array span controllers.  and
you can't recruit a hot spare from another controller.

- erik
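
The point above, that a dual-ported setup can end up *less* reliable than a standard one, can be illustrated with simple series/parallel reliability arithmetic. The sketch below is only an illustration under stated assumptions: failures are independent, each path is one port plus one cable, and the extra hardware and driver code of the dual-ported setup is lumped into a single hypothetical term `p_extra`. None of the probabilities come from the post; they are made-up inputs.

```python
def single_path_failure(p_port, p_cable):
    """Probability that a single port+cable path to a drive is down.

    The path works only if both the port and the cable work
    (a series system), assuming independent failures.
    """
    return 1.0 - (1.0 - p_port) * (1.0 - p_cable)


def dual_path_failure(p_port, p_cable, p_extra):
    """Probability that a dual-ported drive is unreachable.

    Two identical paths in parallel: access is lost only if both
    paths fail.  p_extra is a hypothetical lumped failure probability
    for the added pieces (expanders, failover code in the driver),
    modelled in series with the pair of paths.
    """
    p_path = single_path_failure(p_port, p_cable)
    both_paths_fail = p_path * p_path
    return 1.0 - (1.0 - both_paths_fail) * (1.0 - p_extra)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    p_port, p_cable = 0.01, 0.02
    print("single path:", single_path_failure(p_port, p_cable))
    for p_extra in (0.001, 0.05):
        print("dual path, p_extra =", p_extra, ":",
              dual_path_failure(p_port, p_cable, p_extra))
```

Under this toy model the dual-ported setup only wins while `p_extra` stays well below the failure probability of a single path; once the extra pieces fail more often than the path they protect, the redundancy is a net loss.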