From: erik quanstrom
To: 9fans@9fans.net
Date: Sun, 25 May 2008 16:24:13 -0400
Subject: Re: [9fans] Fossil+Venti on Linux

> You could adapt Plan B's bns to fail over between different FSs. But...
> We learned that although you can let the FS fail over nicely, many other
> things stand in the way, making it unnecessary to fail over. For example,
> on Plan 9, cs and dns have problems after a failover, your IP address
> may change, etc. All that is to say that when you achieve tolerance
> to FS failures, you still face other things that do not fail over.
>
> To tolerate failures, what we do is to run venti on a raid. If fossil
> gets corrupted somehow, we'd just format the partition using the last
> vac. To survive crashes of the machine with the venti, we copy its
> arenas to another machine, also keeping a raid.

forgive a bit of off-topicness. this is about ken's filesystem, not
venti or fossil.

the coraid fs maintains its cache on a local AoE-based raid10 and it
automatically mirrors its worm on two AoE-based raid5 targets. the
secondary worm target is in a separate building with a backup fs.
since reads always start with the first target, the slow offsite link
is not noticed. (we frequently exceed the bandwidth of the backup
link -- now 100Mbps -- to the cache, so replicating the cache would
be impractical.)

we can sustain the loss of a disk drive with only a small and
temporary performance hit. the storage targets may be rebooted with a
small pause in service. more severe machine failures can be recovered
with varying degrees of pain. only if both raid targets were lost
simultaneously would more than 24 hours of data be lost.

we don't do any failover. we try to keep the fs up instead. we have
had two unplanned fs outages in two years. one was due to a corrupt
sector leading to a bad tag. the other was a network problem due to
an electrical storm that could have been avoided if i'd been on the
ball.

the "diskless fileserver" paper from iwp9 has the gory details.

- erik
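
p.s. a note on "format the partition using the last vac" above: with
fossil that is roughly the following. the device path, venti name, and
score variable are only illustrative -- see fossil(4) for the real
options.

	# reinitialize the fossil partition from the score of the
	# last archival snapshot (taken from the fossil console log)
	venti=yourventi fossil/flfmt -v $score /dev/sdC0/fossil

flfmt -v makes a fresh fossil whose active file system points at that
vac score, so the old tree comes back from venti on demand.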
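
and, for anyone unfamiliar with ken's fs configuration, "mirrors its
worm" looks roughly like this in the fsconfig(8) style, where w0 is
the cache device and the worm is a mirror of w1 and w2, read from w1
first. the device names are invented, not our actual config:

	service fs
	filsys main cw0{w1w2}
	filsys dump o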