From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Paul O'Donnell"
To:
Cc: <9fans@cse.psu.edu>
Subject: Re: [9fans] ide file server with mirroring
In-Reply-To: <20010905145434.B67E019A25@mail.cse.psu.edu>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Date: Wed, 5 Sep 2001 15:55:41 -0400
Topicbox-Message-UUID: ea39452e-eac9-11e9-9e20-41e7f4b1d025

computers with dual power supplies are pretty common these days, so any
kind of power problem should be easy to deal with for a server you care
about.

in general, hardware failures are a solved problem in the storage world.
storage is a commodity, with a few relatively well defined standards for
access. you care about three variables (price, performance and
reliability) and you optimise according to your needs. note that a fast,
reliable disk array can cost a couple of orders of magnitude more per
byte than a bare ide drive.

given a reliable array, you care about multiple paths to it (in case you
lose a controller, or someone trips over the cable), you care about
reliable code (array microcode, host device drivers and file systems),
and you want simple tools on the host.

in the commercial (unix based) world which i inhabit, we run a lot of
these systems. we have remarkably few problems which result in a loss of
access to data or, worse, loss of data. we do run into occasional
microcode bugs in the arrays, but the most challenging problem is the
complexity of the tools on the host. with multiple levels of indirection
(multipathing, volume managers, file systems), even detecting a simple
partition overlap can be a challenging task. this is where better (read:
simpler) os design can help.

On Wed, 5 Sep 2001 jmk@plan9.bell-labs.com wrote:

> Following on from what Nemo and Geoff have said, the most common failure
> now seems to be the power supply in the computer case. During the past year
> we've had a couple dozen fail, a lot of them while on a UPS, so it's not
> just the off/on/power-dip stress. That, coupled with the apparent fragility
> of ATA drives, makes designing a reliable hardware+filesystem more of a
> challenge.
>
> In our computer room there are 2 or 3 IDE-raid boxes (not running a Plan 9
> filesystem) and I believe in the 6 months or so they've been there at least
> one drive has failed. However, the boxes have redundant power supplies and
> the drives can be hot-swapped (at some performance cost during the update),
> so getting the hardware reliability is possible at a reasonable monetary and
> performance cost.
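p.s. the partition-overlap check mentioned above is, taken in isolation,
only a few lines of code — the difficulty is the indirection around it,
not the check itself. a minimal sketch (hypothetical sector layout and
names; half-open ranges; not the format of any real tool):

```python
def overlaps(parts):
    """Return (a, b) name pairs of partitions whose sector ranges intersect.

    Each partition is a (name, start, end) tuple with a half-open
    range [start, end) of sectors.
    """
    parts = sorted(parts, key=lambda p: p[1])  # sort by start sector
    found = []
    # track the partition extending furthest so far, so an overlap with
    # a non-adjacent earlier partition is still caught
    hi_name, hi_end = parts[0][0], parts[0][2]
    for name, start, end in parts[1:]:
        if start < hi_end:
            found.append((hi_name, name))
        if end > hi_end:
            hi_name, hi_end = name, end
    return found

# hypothetical layout: swap overlaps fs by 1000 sectors
disk = [
    ("9fat", 0, 10000),
    ("fs", 10000, 500000),
    ("swap", 499000, 600000),
]
print(overlaps(disk))  # [('fs', 'swap')]
```

once volume managers and multipathing stack logical devices on logical
devices, though, finding the true sector ranges to feed a check like
this is the hard part.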