From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 11 Jan 2010 11:54:41 -0500
From: Venkatesh Srinivas
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Message-ID: <20100111165440.GA21617@endeavour>
References: <20100111082753.BA91311803D@smtp.hushmail.com> <20100111140745.GA21294@endeavour> <126f64914bf395ab797aa58afc6344e6@coraid.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline
In-Reply-To: <126f64914bf395ab797aa58afc6344e6@coraid.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
Subject: Re: [9fans] Hardware for Plan9
Topicbox-Message-UUID: bd1d1140-ead5-11e9-9d60-3106f5b1d025

On Mon, Jan 11, 2010 at 10:22:49AM -0500, erik quanstrom wrote:
>> An alternate configuration, which takes more memory, but might offer
>> a bit more in the way of survivability, would be to not use fs for venti. Instead,
>> run one venti daemon per disk, with independent arenas/indexes. Insert a little
>> Venti proxy between fossil and your daemons; it should try reads on each
>> venti until one returns a block; it should issue writes to all of them.
>
>why would that be more survivable?

When an fs mirror is out of sync, which side of the mirror holds the right data? Fs has no way of knowing. Venti at least has the block hashes.

Imagine cutting power after the first disk in a mirror has been written but the later ones have not. With fs, disaster. (Well, sorta. devfs always reads from the first device in a mirror first, and writes to the devices in order as well. You might get lucky, but you wouldn't know about errors until you had to hit the second device, at which point it's too late.)

Venti deals with incompletely written blocks; the arenas and index structures are still workable. The situation is even recoverable: a proxy could notice that one of the backends failed to return a read, and rewrite the data from another copy (which it can verify by its hash) to the failed one.
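
To make the proxy idea concrete, here is a rough sketch in Python. It is not real venti code and does not speak the venti protocol; `MemBackend` and `Proxy` are hypothetical names, and each "daemon" is just an in-memory dict standing in for one venti per disk. The only thing taken from venti is that a block is addressed by the SHA-1 of its contents, which is what lets the proxy tell a good copy from a bad one.

```python
import hashlib


def score(data: bytes) -> bytes:
    """A block's address is the SHA-1 hash of its contents."""
    return hashlib.sha1(data).digest()


class MemBackend:
    """Hypothetical stand-in for one venti daemon (one per disk)."""

    def __init__(self):
        self.blocks = {}   # score -> block data
        self.up = True     # flip to False to simulate a stopped daemon

    def read(self, s):
        if not self.up:
            raise IOError("backend down")
        return self.blocks.get(s)  # None if the block is missing

    def write(self, data):
        if not self.up:
            raise IOError("backend down")
        self.blocks[score(data)] = data


class Proxy:
    """Hypothetical proxy sitting between the client and the backends."""

    def __init__(self, backends):
        self.backends = backends

    def write(self, data):
        """Issue the write to every backend; fail only if none accepts it."""
        accepted = 0
        for b in self.backends:
            try:
                b.write(data)
                accepted += 1
            except IOError:
                pass  # a down backend simply misses this write
        if accepted == 0:
            raise IOError("no backend accepted the write")
        return score(data)

    def read(self, s):
        """Try each backend in turn until one returns a block that matches
        its score, then rewrite that block to the backends that answered
        with nothing or with bad data."""
        failed = []
        for b in self.backends:
            try:
                data = b.read(s)
            except IOError:
                continue  # backend is down; nothing to repair right now
            if data is not None and score(data) == s:
                for bad in failed:
                    bad.write(data)  # repair from the verified copy
                return data
            failed.append(b)
        raise KeyError("block not found on any backend")
```

With two backends, a torn or missing block on the first one is served from the second and repaired on the way through. A real proxy would also have to resync a backend that was down during writes, which this sketch does not attempt.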

Also, the isolation you get from writing data to two venti daemons is nicer than scribbling blocks to both disks; you can bring down either back-end venti while the system is running. You can even move one of the pair to a remote system. If disks are removable in your configuration, you can even grow the available space live.

-- vs