9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] Survey: Current Fossil+venti Filesystem
@ 2011-06-22 13:50 Jack Norton
  2011-06-22 14:02 ` erik quanstrom
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Jack Norton @ 2011-06-22 13:50 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Fellow 9fans,

I am about to build a large-ish fileserver to go along with my cpu/auth
machines (native and qemu).

I'd like to know some recent real world experiences with fossil+venti.
This stems from rumors that for some people, fossil has a history of
data loss.  I don't like rumors (or data loss), and I'm not on IRC long
enough to digest the gossip, so I want a survey:

who here has lost data with a fossil+venti setup and what were the
circumstances therein?  Also what failed?  Did fossil get hosed and you
had to recall the last snap -- was that successful?  Did your venti
index get mangled?  Did a bunch of porn suddenly show up in your usr
directory (I think we know whose fault that is)?  Your dog tripping over
the power cord with no battery backup doesn't count...

I know this probably gets discussed on IRC all the time, but 9fans
serves as my "stream of thought" manual for plan 9 with permanent
records of damn good information.  Plus I am tired of this damn mousing
debacle -- I'm about to filter out that thread.

-Jack




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-22 13:50 [9fans] Survey: Current Fossil+venti Filesystem Jack Norton
@ 2011-06-22 14:02 ` erik quanstrom
  2011-06-22 15:13 ` Charles Forsyth
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: erik quanstrom @ 2011-06-22 14:02 UTC (permalink / raw)
  To: 9fans

> directory (I think we know whose fault that is)?  Your dog tripping over
> the power cord with no battery backup doesn't count...

at least where i live, that sort of thing does count.  battery backup
is not infinite.  but then again, i'm biased heavily against data loss.  :-)

- erik




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-22 13:50 [9fans] Survey: Current Fossil+venti Filesystem Jack Norton
  2011-06-22 14:02 ` erik quanstrom
@ 2011-06-22 15:13 ` Charles Forsyth
  2011-07-03  9:07   ` Steve Simon
  2011-06-23  8:46 ` David du Colombier
  2011-06-23 16:20 ` smiley
  3 siblings, 1 reply; 23+ messages in thread
From: Charles Forsyth @ 2011-06-22 15:13 UTC (permalink / raw)
  To: 9fans

I can't remember ever having lost data with fossil+venti,
on two complete networked systems that have been running
on fossil/venti since 2004 and 2005, and they both store
quite a bit:

[2004]
index=main
total arenas=11 active=7
total space=51,385,286,656 used=28,842,582,973
clumps=10,363,594 compressed clumps=8,225,943 data=53,987,673,480 compressed data=28,189,676,551

that's recently been copied to a new drive (soon a set of drives).
originally it was to have gone to an SSD, but that failed completely within
15 hours, and i reverted to the clone i'd made on a 320 GB disk.

[2005]
index=main
total arenas=61 active=57
total space=32748126208 used=30256097096
clumps=8383698 compressed clumps=5978935 data=55987976415 compressed data=29727924122

i see that the 2005 system is probably due for copying to a new arena, and running
a newer instance of fossil/venti. in fact i see that kernel+fossil+venti
themselves date from 2005.
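
As a rough cross-check of the two summaries above, the fill level and the
compression ratio can be worked out directly from the quoted numbers. This is
only a sketch: the dictionary layout and the `summarize` helper are made up for
illustration and are not part of venti. It suggests the 2004 arenas are a bit
over half full while the 2005 arenas are above 90% full, which fits the remark
that the 2005 system is due for copying.

```python
# Figures copied from the two venti summaries quoted above.
SYSTEMS = {
    "2004": {"total": 51_385_286_656, "used": 28_842_582_973,
             "data": 53_987_673_480, "compressed": 28_189_676_551},
    "2005": {"total": 32_748_126_208, "used": 30_256_097_096,
             "data": 55_987_976_415, "compressed": 29_727_924_122},
}


def summarize(stats):
    """Return (fraction of arena space used, compressed size / original size)."""
    fill = stats["used"] / stats["total"]
    ratio = stats["compressed"] / stats["data"]
    return fill, ratio


for name, stats in SYSTEMS.items():
    fill, ratio = summarize(stats)
    print(f"{name}: arenas {fill:.1%} full, "
          f"data stored at {ratio:.1%} of its original size")
```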

i also ran fossil+venti systems on several laptops and makes of laptops
during that time, and never had a problem.

i can't remember ever having to reinit the fossils from an archive.




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-22 13:50 [9fans] Survey: Current Fossil+venti Filesystem Jack Norton
  2011-06-22 14:02 ` erik quanstrom
  2011-06-22 15:13 ` Charles Forsyth
@ 2011-06-23  8:46 ` David du Colombier
  2011-06-23 10:03   ` Richard Miller
                     ` (2 more replies)
  2011-06-23 16:20 ` smiley
  3 siblings, 3 replies; 23+ messages in thread
From: David du Colombier @ 2011-06-23  8:46 UTC (permalink / raw)
  To: 9fans

I have run dozens of file servers with only Fossil or both Fossil and Venti
since 2004, and I have never lost a single byte of data.

I have experienced only a few problems with Fossil. Most of them caused Fossil
to freeze or, more rarely, to crash.

In most cases, I simply rebooted the file server, and it worked fine
again.

What I can say is:

 - these problems tend to appear far more often when running on virtual
   machines like Qemu,
 - running Fossil without Venti prevents some problems, like freezes
   during snapshot to Venti.

I am currently running something like twenty file servers (including
standalone machines), from 8GB to 1.5TB of storage. Most file servers
are running Fossil with Venti, but some are only running Fossil.
I mirror Venti through the network and regularly back up arenas to DVDs
(but I have recently been moving to BD).

So far, I have only succeeded in reproducing data-loss problems on Fossil when
running heavy stress testing.

I can understand Fossil has problems and they might cause data
corruption, but when data is archived to Venti, you can be sure it will
stay forever.

These problems with Fossil just have to be fixed. I am currently working
on a fossil derivative which uses libventi and libthread instead of
liboventi and will eventually fix these problems. I am already running it
on some test file servers, but it's too early to talk about experiences
with it.

--
David du Colombier




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23  8:46 ` David du Colombier
@ 2011-06-23 10:03   ` Richard Miller
  2011-06-23 10:35   ` Adrian Tritschler
  2011-06-23 14:51   ` Jack Norton
  2 siblings, 0 replies; 23+ messages in thread
From: Richard Miller @ 2011-06-23 10:03 UTC (permalink / raw)
  To: 9fans

David du Colombier <0intro@gmail.com>:
> I can understand Fossil has problems and they might cause data
> corruption, but when data is archived to Venti, you can be sure it will
> stay forever.

My experience is nearly the opposite of this.  I've never lost any
active fossil data.  And I agree that once data is archived to venti,
it is safe.  But the daily archiving to venti can fail - sometimes
with "archWalk:" error messages and no dump made, but more often
silently leaving a dump with corrupted metadata in one or two
directories preventing access to anything below them in the tree.
More details in http://9fans.net/archive/2009/04/135 - I still haven't
tracked down the cause but I suspect a data race somewhere.  It's an
irritation rather than a disaster because I've always been able to
patch up the bad metadata by hand.  I quite enjoy a bit of disk
surgery now and then to keep my hand in.





* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23  8:46 ` David du Colombier
  2011-06-23 10:03   ` Richard Miller
@ 2011-06-23 10:35   ` Adrian Tritschler
  2011-06-23 14:51   ` Jack Norton
  2 siblings, 0 replies; 23+ messages in thread
From: Adrian Tritschler @ 2011-06-23 10:35 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I'm putting it down to operator error, but I've hosed any number of
plan9 QEMU images that talk to a p9p venti running on the linux host
that hosts the QEMU images.  Never entirely sure why, but what
typically happens is that I play with one or two plan9 images for a
couple of days, then something else distracts me for a while, then
when I go to retry them a few weeks later one or more of them no
longer boots and I just get endless streams of errors from fossil
complaining that such-and-such a block cannot be found. As far as I
know I always shut the QEMU images down "nicely" and the p9p venti is
always running.

--
Adrian Tritschler
Melbourne, Australia

Screw the environment. Print this email immediately. Then burn it
without reading it.




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23  8:46 ` David du Colombier
  2011-06-23 10:03   ` Richard Miller
  2011-06-23 10:35   ` Adrian Tritschler
@ 2011-06-23 14:51   ` Jack Norton
  2011-06-23 15:25     ` dexen deVries
                       ` (2 more replies)
  2 siblings, 3 replies; 23+ messages in thread
From: Jack Norton @ 2011-06-23 14:51 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

All of these are great reports and are what I expected.  Thanks.

David du Colombier wrote:
> These problems with Fossil just have to be fixed. I am currently working
> on a fossil derivative which uses libventi and libthread instead of
> liboventi and will eventually fix these problems. I am already running it
> on some test file servers, but it's too early to talk about experiences
> with it.
>
I had heard of someone working on a fossil derivative with libventi, I
didn't know it was you.  Is the code public yet?  I'd give it a stress
testing of my own eventually if given the chance.

I don't think I will ever stress things as much as some of you all
though -- I just use this for personal use  (I'll just say that I've got
2 soekris boards and they are plenty for what I do -- I guess I'm just
patient).

Charles, I'm glad you pointed out your woes with the SSD, I had
considered purchasing an SSD for my cpu/fs but I didn't want to gamble
with such a costly device.  Now I have every reason to avoid it (I've
seen similar testimonies).  I figured flash memory was perfect for venti
-- but not without some stability.  Plus, there are 1TB 2.5" consumer
drives out for just north of $100!  What a world we live in.  By the
time I fill that, there will be 2TB 2.5" drives...

Cheers,
Jack






* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 14:51   ` Jack Norton
@ 2011-06-23 15:25     ` dexen deVries
  2011-06-23 15:27       ` erik quanstrom
  2011-06-23 15:27     ` Charles Forsyth
  2011-06-23 15:57     ` David du Colombier
  2 siblings, 1 reply; 23+ messages in thread
From: dexen deVries @ 2011-06-23 15:25 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thursday 23 June 2011 16:51:21 Jack Norton wrote:
> Charles, I'm glad you pointed out your woes with the SSD, I had
> considered purchasing an SSD for my cpu/fs but I didn't want to gamble
> with such a costly device.  Now I have every reason to avoid it (I've
> seen similar testimonies).  I figured flash memory was perfect for venti
> -- but not without some stability.  Plus, there are 1TB 2.5" consumer
> drives out for just north of $100!  What a world we live in.  By the
> time I fill that, there will be 2TB 2.5" drives...

The word out there is that it's the newfangled SSDs with lots of smarts
onboard (like built-in deduplication) that are particularly prone to sudden
failure. Unfortunately the older ones don't come in sensible capacities.

I have a very old 16GB SSD extracted from an Asus EeePC; it's kind of slow, but
should be very simple and thus reliable. Serves well with NILFS2 FS (currently
Linux and *BSD specific) on it.

On the other hand, if you can burn $$$, there are enterprisey SSDs based on
SLC Flash, built in form of PCIE cards, should be quite reliable.

http://news.ycombinator.com/item?id=2667398

--
dexen deVries

``One can't proceed from the informal to the formal by formal means.''




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 14:51   ` Jack Norton
  2011-06-23 15:25     ` dexen deVries
@ 2011-06-23 15:27     ` Charles Forsyth
  2011-06-23 15:57     ` David du Colombier
  2 siblings, 0 replies; 23+ messages in thread
From: Charles Forsyth @ 2011-06-23 15:27 UTC (permalink / raw)
  To: 9fans

>Charles, I'm glad you pointed out your woes with the SSD, I had
>considered purchasing an SSD for my cpu/fs but I didn't want to gamble
>with such a costly device.

it was just one device. i've got others in laptops and netbooks that
have been just fine, and have splendid performance.
i'd certainly run it with replication onto something else (such as a hard disk).
indeed i'd been planning to do that anyway, but the ssd i had
didn't last long enough to set that up using fs(3).




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 15:25     ` dexen deVries
@ 2011-06-23 15:27       ` erik quanstrom
  2011-06-23 15:57         ` ron minnich
  2011-06-23 17:02         ` Jack Norton
  0 siblings, 2 replies; 23+ messages in thread
From: erik quanstrom @ 2011-06-23 15:27 UTC (permalink / raw)
  To: 9fans

> On the other hand, if you can burn $$$, there are enterprisey SSDs based on
> SLC Flash, built in form of PCIE cards, should be quite reliable.
>

i think the pcie form factor for a hard drive is a trap.
pcie is not easily hot swappable, and more expensive
than a number of smaller devices that can be mirrored,
thus not leading to an expensive single point of failure.

the ahci driver fully supports hot swap (or "surprise removal"
as intel calls it).  one can easily mirror two ssds, or a ssd
and hard drive that way.

- erik




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 15:27       ` erik quanstrom
@ 2011-06-23 15:57         ` ron minnich
  2011-06-23 16:19           ` erik quanstrom
                             ` (2 more replies)
  2011-06-23 17:02         ` Jack Norton
  1 sibling, 3 replies; 23+ messages in thread
From: ron minnich @ 2011-06-23 15:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

you need to be careful about mirroring SSDs.

See: Differential RAID: Rethinking RAID for SSD Reliability
Mahesh Balakrishnan (Microsoft Research Silicon Valley), Asim Kadav
(University of Wisconsin), Vijayan Prabhakaran (Microsoft Research
Silicon Valley), Dahlia Malkhi (Microsoft Research Silicon Valley)

ron




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 14:51   ` Jack Norton
  2011-06-23 15:25     ` dexen deVries
  2011-06-23 15:27     ` Charles Forsyth
@ 2011-06-23 15:57     ` David du Colombier
  2 siblings, 0 replies; 23+ messages in thread
From: David du Colombier @ 2011-06-23 15:57 UTC (permalink / raw)
  To: 9fans

> Is the code public yet?  I'd give it a stress testing of my own
> eventually if given the chance.

Not yet. I would like to fix some issues before distributing anything.

--
David du Colombier




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 15:57         ` ron minnich
@ 2011-06-23 16:19           ` erik quanstrom
  2011-06-23 16:28             ` ron minnich
  2011-06-23 16:56           ` Charles Forsyth
  2011-06-23 16:58           ` Charles Forsyth
  2 siblings, 1 reply; 23+ messages in thread
From: erik quanstrom @ 2011-06-23 16:19 UTC (permalink / raw)
  To: 9fans

On Thu Jun 23 11:59:02 EDT 2011, rminnich@gmail.com wrote:
> you need to be careful about mirroring SSDs.
>
> See: Differential RAID: Rethinking RAID for SSD Reliability
> Mahesh Balakrishnan (Microsoft Research Silicon Valley), Asim Kadav
> (University of Wisconsin), Vijayan Prabhakaran (Microsoft Research
> Silicon Valley), Dahlia Malkhi (Microsoft Research Silicon Valley)

for those who tl;dr'd ...

i don't think this paper applies to write-once systems like
venti (or ken fs for that matter).

also, good ssds are rated in terms of a minimum amount of data
that can be written to them.  and very good ssds have a minimum
write number that means they can be written to at maximum speed
for the drive's full rated lifetime.
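
to put a number on that, here is a back-of-the-envelope sketch.  the write
speed, lifetime and endurance figures in the usage note are hypothetical,
picked only to show the arithmetic, and the function names are made up for
illustration; they are not taken from any particular drive's data sheet.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def bytes_written_at_full_speed(write_mb_per_s, lifetime_years):
    """Total bytes written if the drive is written flat out for its whole lifetime."""
    return write_mb_per_s * 1_000_000 * lifetime_years * SECONDS_PER_YEAR


def outlasts_lifetime(rated_endurance_bytes, write_mb_per_s, lifetime_years):
    """True if the rated write endurance covers continuous full-speed writing."""
    needed = bytes_written_at_full_speed(write_mb_per_s, lifetime_years)
    return rated_endurance_bytes >= needed


if __name__ == "__main__":
    # hypothetical drive: 200 MB/s sustained writes, 5 year rated lifetime
    needed = bytes_written_at_full_speed(200, 5)
    print(f"full-speed writing for 5 years: {needed / 1e15:.1f} PB")
    # hypothetical endurance rating of 1 PB written
    print(outlasts_lifetime(10**15, 200, 5))
```

with these made-up numbers the drive would need an endurance rating of about
31.5 PB to qualify, so a 1 PB rating would fall well short.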

you're not supposed to keep any drive past its rated lifetime.

- erik




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-22 13:50 [9fans] Survey: Current Fossil+venti Filesystem Jack Norton
                   ` (2 preceding siblings ...)
  2011-06-23  8:46 ` David du Colombier
@ 2011-06-23 16:20 ` smiley
  3 siblings, 0 replies; 23+ messages in thread
From: smiley @ 2011-06-23 16:20 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Jack Norton <jack@0x6a.com> writes:

> I'd like to know some recent real world experiences with
> fossil+venti. This stems from rumors that for some people, fossil has
> a history of data loss.  I don't like rumors (or data loss), and I'm

Maybe newbies tend to bang on the edges of fossil more than the old
hats, but I've encountered a couple of problems with fossil+venti.

(1) A fresh fossil+venti install on a ThinkPad T23 resulted in a file
system with read errors.  (Discussed in a previous thread on this list.)
On the other hand, a fossil-only install on the same machine worked
fine.  This was resolved by using a 9atom kernel and enabling DMA in
plan9.ini.

(2) I once experienced something really bizarre, whereby doing something
like a cp/dircp over a file resulted in corruption of the backed up
copy.  If I recall correctly, the snap/archive (don't remember which)
copy of the file thereafter contained zero bytes, even though both the
previous and new versions of the file were non-zero in length.  This is
probably a problem with how fossil+venti handle file truncation.  I
learned to tiptoe around that by using rm before cp/dircp.

Other than those, I've never lost data with fossil+venti... at least in
the whole three months or so this noob's been using them.  :) YMMV.

--
+---------------------------------------------------------------+
|E-Mail: smiley@zenzebra.mv.com             PGP key ID: BC549F8B|
|Fingerprint: 9329 DB4A 30F5 6EDA D2BA  3489 DAB7 555A BC54 9F8B|
+---------------------------------------------------------------+




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 16:19           ` erik quanstrom
@ 2011-06-23 16:28             ` ron minnich
  2011-06-23 16:37               ` erik quanstrom
  2011-06-23 18:47               ` Bakul Shah
  0 siblings, 2 replies; 23+ messages in thread
From: ron minnich @ 2011-06-23 16:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Jun 23, 2011 at 9:19 AM, erik quanstrom <quanstro@quanstro.net> wrote:

> i don't think this paper applies to write-once systems like
> venti (or ken fs for that matter).

but it might apply to fossil.

> also, good ssds are rated in terms of a minimum amount of data
> that can be written to them.  and very good ssds have a minimum
> write number that means they can be written to at maximum speed
> for the drive's full rated lifetime.


The main point I took from the talk they gave was that failure was
most strongly related to the number of writes in FLASH. If your
striping strategy is to duplicate writes to each drive, you faced the
happy prospect of doing a write and having both drives fail at the
same time. Hard drives have a different way of failing. We've seen
weirdness like this here, with drives in a bunch of nodes that all
seem to fail simultaneously, well within rated lifetime. Not cheap
drives either. Of course that was a little while ago and things seem
to have gotten better, but it's worth a warning.

Anyway, it's important to keep in mind that SSDs are a bit different.

ron




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 16:28             ` ron minnich
@ 2011-06-23 16:37               ` erik quanstrom
  2011-06-23 18:47               ` Bakul Shah
  1 sibling, 0 replies; 23+ messages in thread
From: erik quanstrom @ 2011-06-23 16:37 UTC (permalink / raw)
  To: 9fans

> The main point I took from the talk they gave was that failure was
> most strongly related to the number of writes in FLASH. If your
> striping strategy is to duplicate writes to each drive, you faced the
> happy prospect of doing a write and having both drives fail at the
> same time. Hard drives have a different way of failing. We've seen
> weirdness like this here, with drives in a bunch of nodes that all
> seem to fail simultaneously, well within rated lifetime. Not cheap
> drives either. Of course that was a little while ago and things seem
> to have gotten better, but it's worth a warning.

that's very interesting.  i haven't seen that at all.  the drives
that i've seen fail in bunches have been regular old hard drives.

- erik




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 15:57         ` ron minnich
  2011-06-23 16:19           ` erik quanstrom
@ 2011-06-23 16:56           ` Charles Forsyth
  2011-06-23 16:58           ` Charles Forsyth
  2 siblings, 0 replies; 23+ messages in thread
From: Charles Forsyth @ 2011-06-23 16:56 UTC (permalink / raw)
  To: 9fans

>you need to be careful about mirroring SSDs.

i was intending and preparing to mirror the ssd on a hard drive (not an ssd),
since i've had experience of having those work in my environment
for about 14 years between replacements. admittedly, i was a little
surprised when i read the date on the drives as i replaced them.
unfortunately, my particular ssd didn't last long enough to set up
the mirror properly. fortunately, i'd just cloned it to the hard
drive, so i simply switched to that.




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 15:57         ` ron minnich
  2011-06-23 16:19           ` erik quanstrom
  2011-06-23 16:56           ` Charles Forsyth
@ 2011-06-23 16:58           ` Charles Forsyth
  2 siblings, 0 replies; 23+ messages in thread
From: Charles Forsyth @ 2011-06-23 16:58 UTC (permalink / raw)
  To: 9fans

there is an interesting article in the current ;login: about
drives and SSDs:

System Impacts of Storage Trends: Hard Errors and Testability by Steven R. Hetzler




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 15:27       ` erik quanstrom
  2011-06-23 15:57         ` ron minnich
@ 2011-06-23 17:02         ` Jack Norton
  2011-06-23 17:50           ` erik quanstrom
  1 sibling, 1 reply; 23+ messages in thread
From: Jack Norton @ 2011-06-23 17:02 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

erik quanstrom wrote:
>> On the other hand, if you can burn $$$, there are enterprisey SSDs based on
>> SLC Flash, built in form of PCIE cards, should be quite reliable.
>>
>
> i think the pcie form factor for a hard drive is a trap.
> pcie is not easily hot swappable, and more expensive
> than a number of smaller devices that can be mirrored,
> thus not leading to an expensive single point of failure.
> - erik
>

Now that you mention pcie drives, has anyone used those little mini-pcie
ssd's that fit on some atom motherboards?  Might be a convenient
location for fossil (what are they like 16GB?).  That is if they are
supported.  I've never even been near one.  Does it get attached to the
disk controller via sata (by way of magic)? Or does it do something
completely different that I cannot fathom?

Those I wouldn't consider 'a trap' as they have a bright future on
laptop motherboards and hot swap isn't even a useful feature in that case.

-jack




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 17:02         ` Jack Norton
@ 2011-06-23 17:50           ` erik quanstrom
  0 siblings, 0 replies; 23+ messages in thread
From: erik quanstrom @ 2011-06-23 17:50 UTC (permalink / raw)
  To: 9fans

> Now that you mention pcie drives, has anyone used those little mini-pcie
> ssd's that fit on some atom motherboards?  Might be a convenient
> location for fossil (what are they like 16GB?).  That is if they are
> supported.  I've never even been near one.  Does it get attached to the
> disk controller via sata (by way of magic)? Or does it do something
> completely different that I cannot fathom?

the pcie card would have to be its own controller.  how the controller is
attached to the on-board drive doesn't have to be visible to anyone using
the part.  i've never used one of these things.  one hopes that a custom
driver isn't required for each pcie-attached drive.

> Those I wouldn't consider 'a trap' as they have a bright future on
> laptop motherboards and hot swap isn't even a useful feature in that case.

i wasn't thinking of laptops, but now that you mention it,
i'd certainly want hot-swap on a laptop.  especially if i can
have more than one drive.

- erik




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 16:28             ` ron minnich
  2011-06-23 16:37               ` erik quanstrom
@ 2011-06-23 18:47               ` Bakul Shah
  2011-06-24 22:44                 ` Akshat Kumar
  1 sibling, 1 reply; 23+ messages in thread
From: Bakul Shah @ 2011-06-23 18:47 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, 23 Jun 2011 09:28:49 PDT ron minnich <rminnich@gmail.com>  wrote:
>
> The main point I took from the talk they gave was that failure was
> most strongly related to the number of writes in FLASH. If your
> striping strategy is to duplicate writes to each drive, you faced the
> happy prospect of doing a write and having both drives fail at the
> same time. Hard drives have a different way of failing. We've seen
> weirdness like this here, with drives in a bunch of nodes that all
> seem to fail simultaneously, well within rated lifetime. Not cheap
> drives either. Of course that was a little while ago and things seem
> to have gotten better, but it's worth a warning.

All they are saying is to age SSDs at different rate to avoid
correlated failures.
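
A toy model of that idea, as a sketch only: it assumes a RAID-5-style
array in which every small logical write updates one data block and
the stripe's parity block, so a device that holds a larger share of
the parity sees more writes.  The function names, the parity shares
and the endurance numbers are all made up for illustration and are
not taken from the paper.

```python
def writes_per_device(parity_share):
    """Expected device writes per logical write, for each device.

    parity_share[i] is the fraction of stripes whose parity lives on device i.
    A device takes the parity write for its own stripes, and a 1/(n-1) chance
    of the data write for every stripe whose parity lives elsewhere.
    """
    n = len(parity_share)
    if n < 2 or abs(sum(parity_share) - 1.0) > 1e-9:
        raise ValueError("need at least two devices and shares summing to 1")
    return [p + (1.0 - p) / (n - 1) for p in parity_share]


def days_until_worn(parity_share, endurance_writes, logical_writes_per_day):
    """Days until each device reaches its (assumed) rated number of writes."""
    return [endurance_writes / (logical_writes_per_day * w)
            for w in writes_per_device(parity_share)]


if __name__ == "__main__":
    even = [0.25, 0.25, 0.25, 0.25]   # conventional rotated parity
    uneven = [0.4, 0.3, 0.2, 0.1]     # deliberately skewed parity
    for label, shares in (("even", even), ("uneven", uneven)):
        days = days_until_worn(shares, 1_000_000, 1_000)
        print(label, [round(d) for d in days])
```

With even parity every device reaches its limit on the same day;
with the skewed shares the wear-out dates are spread apart, which
is the effect being described.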

Disk drives have a similar problem in that disks from the same
batch seem to die at a similar age. One issue is that N years
later it is not cost effective to get a replacement disk of
the same size.

Now I think this (dying at the same age) is actually a good
thing! The key is to not wait to replace until they die; just
replace them all when you decide to replace *any*!

zfs helps since it will automatically grow the space (So for
instance, on my home system originally I used a mirror of 2
250GB used IDE disks and another mirror of 2 300GB sata disks,
striped together.  I first replaced both IDE disks with bigger
ATA disks.  Later I replaced the 300GB sata disks with 1TB
disks and now I have a lot more space to play with).

@work I used ZFS raidz2 on 2TBx6 drives and a 2x80GB SSD
mirror for root + the write intent log (this is a server for
backing up N machines, so write performance is more critical).
Due to a mixup we are using MLC SSDs instead of SLC SSDs (to
be replaced at some point).  Not ideal but works well enough.




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-23 18:47               ` Bakul Shah
@ 2011-06-24 22:44                 ` Akshat Kumar
  0 siblings, 0 replies; 23+ messages in thread
From: Akshat Kumar @ 2011-06-24 22:44 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

My last attempt at a Fossil+Venti system
setup resulted in this:

http://9fans.net/archive/2009/01/150

which, to my knowledge, was never resolved.
The result was that I lost all venti data and
then the fossil system was unresponsive.

I've been using Ken FS systems to maintain
my data, now, as a NAS. Along with cinap's
cifsd or Newsham's filesystem for windows,
or your friendly modern unix options, this
works well in a heterogeneous network
setup as well.


Best,
Akshat




* Re: [9fans] Survey: Current Fossil+venti Filesystem
  2011-06-22 15:13 ` Charles Forsyth
@ 2011-07-03  9:07   ` Steve Simon
  0 siblings, 0 replies; 23+ messages in thread
From: Steve Simon @ 2011-07-03  9:07 UTC (permalink / raw)
  To: 9fans

On Wed Jun 22 16:10:43 BST 2011, forsyth@terzarima.net wrote:
> I can't remember ever having lost data with fossil+venti,
> on two complete networked systems that have been running
> on fossil/venti since 2004 and 2005,

I am in a very similar situation, two servers since 2004 and have
not lost anything.

There is a bug in the ephemeral snapshot code in fossil which used to
cause my server to freeze about once a month, however I never lost
any data; I have since turned off ephemeral snapshots and have had no
problems since then - it's a pity though.

I have had problems with disks dying and always run mirrored pairs of
disks from different manufacturers (in the hope that they might
fail on different days :-).

-Steve



