From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3B953ECD.473D3718@princeton.edu>
From: Martin Harriss
MIME-Version: 1.0
To: 9fans@cse.psu.edu
Subject: Re: [9fans] (no subject)
References: <20010904130843.027A419A38@mail.cse.psu.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Tue, 4 Sep 2001 16:51:25 -0400
Topicbox-Message-UUID: e9a72ffe-eac9-11e9-9e20-41e7f4b1d025

"Fco.J.Ballesteros" wrote:
>
> That's cool. BTW, I'm just wondering...

Indeed it is.

> Would a system crash while doing a (mirrored) dump
> corrupt both disks at the same time?

From reading Geoff's code it looks as if a crash during a block write
could leave the secondaries in a good state, but the master would be
incomplete.

> While I was trying to use a couple of ide disks to
> survive disk crashes, I thought it would be better
> not to use a real mirror because a crash at a bad
> moment could perhaps leave the cached worm unusable.
> (I had a crash while doing a dump on a cached worm and
> the cached worm became unusable).

In the case of the worm file system, what happens if a sequence of
writes is interrupted? Is the file system guaranteed to be consistent
after each write, or can an interrupted sequence destroy the file
system? The experience above would suggest the latter.

> Am I missing something? A recover procedure I don't
> know of? Or perhaps some code in the mirror device tries
> to deal with that?

I don't see any code in there to do that.

I've also been playing with disk mirroring, and I added a 'resilver'
command to copy a 'good' disk to a 'bad' disk. The hard part is knowing
which is the good disk and which is the bad. If one of them goes bad
while the system is running it's easy: when you get the error you just
mark that disk as bad. But when the power goes off, it's impossible to
know just where you were in the sequence of writes.
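
The easy case above (mark a disk bad when a write fails, then copy the
good disk over it) can be sketched roughly as follows. This is only an
in-memory model in Python, not the file server's C and not taken from
Geoff's code or from the 'resilver' command mentioned above; Disk,
mirror_write, resilver and the "failing" test hook are all hypothetical
names made up for illustration.

```python
# Hypothetical in-memory model of a mirror; not the file server's code.
BLOCK = 4096  # block size in bytes, matching the 4K blocks in the thread


class Disk:
    def __init__(self, nblocks):
        self.blocks = [bytes(BLOCK) for _ in range(nblocks)]
        self.bad = False      # set when a write to this disk fails
        self.failing = False  # test hook: simulate a hardware error

    def write(self, n, data):
        if self.failing:
            raise IOError("simulated write error")
        self.blocks[n] = data


def mirror_write(disks, n, data):
    """Write block n to every disk not marked bad; mark a disk bad on error."""
    for d in disks:
        if d.bad:
            continue
        try:
            d.write(n, data)
        except IOError:
            d.bad = True  # the running system saw the error, so it knows


def resilver(good, bad):
    """Copy every block from the good disk to the bad one, then clear its flag."""
    for n, data in enumerate(good.blocks):
        bad.blocks[n] = data
    bad.bad = False
```

The sketch only covers a failure seen by a running system. It does
nothing for the power-off case, where neither disk carries a flag
saying which copy is current.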
For something like a fake worm, where speed is not a big issue, it may
be worthwhile recording disk status in some dedicated place on the disk
for each block write. One logical volume manager that I am familiar
with just "guesses" which side of the mirror is good and lets fsck
clean up the mess, but this is probably impractical for a "worm" file
system.

> Another question I have is can you rebuild your mirror
> device just by raw copying of one disk into another?

It looks that way from the code. In fact, I've pretty much decided that
I'm going to build a file server with a cast-off 300-or-so megabyte
disk (in addition to the real file server disks) that would contain the
bootstrap for the file server, and an 'emergency' stand-alone cpu
server that could be used to repair/copy/etc the actual file server
disks. I can also see writing some tools to fix corrupted fake worms.

Martin

> ------------------------------------------------------------------------
>
> Subject: [9fans] (no subject)
> Date: Tue, 4 Sep 2001 05:08:33 -0700
> From: geoff@collyer.net
> Reply-To: 9fans@cse.psu.edu
> To: 9fans@collyer.net
>
> I've fixed some bugs in the IDE file server (some latent, some new in
> the IDE code) and added a mirroring device. I've tested it, it works
> and later today it will be my main file server. The mirroring device
> is really very little code; the file server's elegant design is
> largely responsible for this. Doing a dump of 457121 4K blocks from a
> cache device on h0 to a fake worm also on h0, mirrored on h1.0.0
> (a.k.a. h2) took 73 minutes, so I got 25,648,871 bytes per minute
> throughput. I verified that the copy on h1.0.0 was correct.
>
> Here's my configuration:
>
> config h0
> service fs
> [ uninteresting ip configuration omitted ]
> filsys main cp(h0)0.25f{p(h0)25.75p(h1.0.0)25.75}
> filsys dump o
> filsys other p(h1.0.0)0.25
> ream other
> ream main
> end
>
> {} is the mirror device, analogous to () and [].
> The first device inside {} is the master, any others are mirrors. The
> code can be found at www.collyer.net/~geoff/9/. I'll add some
> commentary on the changes later today. They fall into several
> categories:
>
> - fixes to latent bugs.
> - addition of some missing switch cases for Devfworm, Devnone and Devide.
>   the file server could really use a device switch (rather than a lot of
>   switch statements scattered throughout the code).
>   in particular, device configuration strings are now printed better.
> - additional paranoia in the IDE code; specifying a non-existent drive
>   no longer causes a kernel page fault.
> - converted nemo's style back to the original style, and some tidying up.
> - probably vestigial paranoia traceable to hunting down the fpinit bug.
> - local configuration (e.g., timezone). you'll want to crank MAXMEG up.
> - addition of the mirror device.
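
The suggestion in the reply above, recording disk status in a dedicated
place for each block write, might look something like the sketch below.
It is a Python model under stated assumptions, not anything in the
mirror device: StatusArea, logged_write, recover and the crash_after
parameter are hypothetical, the "disks" are plain lists of blocks, and
the status record is assumed to survive the crash and to be updated
atomically.

```python
# Hypothetical sketch of a per-write status record for a mirror.
BLOCK = 4096  # block size in bytes


class StatusArea:
    """Stand-in for a dedicated on-disk record of the write in progress."""
    def __init__(self):
        self.inflight = None  # block number being written, None when idle
        self.done = set()     # indices of disks whose copy has completed


def logged_write(disks, status, n, data, crash_after=None):
    """Write block n to each disk, updating the status record as we go.

    crash_after simulates power loss after that many copies have completed.
    """
    status.inflight, status.done = n, set()
    for i, disk in enumerate(disks):
        if crash_after is not None and i == crash_after:
            return False  # simulated crash: the record is left as it stands
        disk[n] = data
        status.done.add(i)
    status.inflight, status.done = None, set()
    return True


def recover(disks, status):
    """After a restart, repair the one block the record says was in flight."""
    n = status.inflight
    if n is None:
        return None  # nothing was in progress
    # Prefer a copy known to be complete; if none finished, the last disk
    # was never started and still holds the old, consistent block.
    src = min(status.done) if status.done else len(disks) - 1
    for i, disk in enumerate(disks):
        if i != src:
            disk[n] = disks[src][n]
    status.inflight, status.done = None, set()
    return src
```

The price is an extra status update around every block write, which is
why it seems more plausible for a fake worm than for a busy disk. The
point is that recovery touches one block and no longer has to guess
which side of the mirror is good.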