From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 19 Jan 2009 22:20:09 -0800
From: Roman Shaposhnik <rvs@sun.com>
In-reply-to: <9384d5b4e8e8e41b43bb7a8714b83dc2@quanstro.net>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Message-id: <EB1AF722-2FB8-4B19-B2FF-D9E71BC2AC33@sun.com>
MIME-version: 1.0
Content-type: text/plain; delsp=yes; format=flowed; charset=US-ASCII
Content-transfer-encoding: 7BIT
References: <9384d5b4e8e8e41b43bb7a8714b83dc2@quanstro.net>
Subject: Re: [9fans] Changelogs & Patches?
Topicbox-Message-UUID: 82da2168-ead4-11e9-9d60-3106f5b1d025

I think I'm now ready to pick up this old thread (if anybody's still
interested...)

On Jan 7, 2009, at 5:11 PM, erik quanstrom wrote:
>> Lets see. May be its my misinterpretation of what venti does. But so
>> far I understand that it boils down to: I give venti a block of any
>> length, it gives me a score back. Now internally, venti might decide
>
> just a clarification.  this is done by the client.  from venti(6):
>       Files and Directories
>          Venti accepts blocks up to 56 kilobytes in size. By conven-
>          tion, Venti clients use hash trees of blocks to represent
>          arbitrary-size data files. [...]

Right. This, by the way, suggests that the onus is on the clients
to help venti reuse as many blocks as possible. Have there been
any established practices for finding the best "cut-here" points?
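
To show the kind of thing I mean by "cut-here" points, here is a toy
sketch of content-defined chunking: cut wherever a rolling sum over the
last few bytes hits a chosen pattern, so that inserting a byte only
disturbs the blocks near the insertion. None of this is taken from
venti(6); the names (chunks, score), the window size and the mask are my
own assumptions, and a plain byte sum is a much weaker hash than anything
one would use for real.

```python
import hashlib

WINDOW = 16             # bytes covered by the rolling sum (arbitrary choice)
MASK = (1 << 6) - 1     # cut when the low 6 bits are zero (toy-sized blocks)
MAX_BLOCK = 56 * 1024   # block size limit quoted from venti(6) above

def chunks(data):
    """Yield consecutive blocks of data, cut at content-defined points."""
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # keep the sum over the last WINDOW bytes
        length = i + 1 - start
        # Cut on the pattern (but never below WINDOW bytes) or at the size limit.
        if (length >= WINDOW and (rolling & MASK) == 0) or length >= MAX_BLOCK:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]                # whatever is left after the last cut

def score(block):
    """Stand-in for a venti score: the SHA-1 of the block contents."""
    return hashlib.sha1(block).hexdigest()
```

Because the cut decision depends only on the bytes inside the window,
two versions of a file that differ near the beginning should fall back
onto the same cut points soon after the difference, and the blocks from
there on hash to the same scores.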

>> But even in the former case I don't see how the corruption could be
>> possible. Please elaborate.
>
> i didn't say there would be corruption.  i assumed corruption
> and outlined how one could recover the maximal set of data
> and have a consistent fs (assuming the damage doesn't cut a
> full strip across all backups) by simply picking a good
> block at each lba from the available damaged and/or incomplete
> backups, which may originate at different times.  (russ was the
> first that i know of to put this into practice.)
>
> in the case of zfs, my claim is that since zfs can reuse blocks, two
> vdev backups, each with corruption or missing data in different places
> are pretty well useless.


Got it. However, I'm still not fully convinced there's a definite edge
one way or the other. Don't get me wrong: I'm not trying to defend
ZFS (I don't think it needs defending, anyway) but rather I'm trying
to test my mental model of how both work.

We assume a damaged set of arenas for venti and a damaged set
of vdevs for ZFS. Everything is off-line at that point and we are
running strictly in forensics mode. The show, basically, consists of
three acts:
     1. salvaging as many good data blocks as possible
     2. building higher-order structures out of primary data blocks
     3. trying to rebuild as much of a consistent FS as possible
          using all the available blocks
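
To make act 1 concrete, here is a minimal sketch of the per-LBA merge
erik describes above. The names (merge_backups, expected) are made up
for illustration. I'm assuming we have some way to tell a good block
from a bad one (here a table of expected SHA-1 digests per LBA, which is
purely hypothetical) and that a block at a given LBA never changes once
written, which is what would make backups from different times
interchangeable.

```python
import hashlib

def merge_backups(backups, expected):
    """Build one image by taking, at each LBA, the first copy that verifies.

    backups:  list of images; each image is a list of blocks (bytes or None)
    expected: hypothetical list of SHA-1 hex digests, one per LBA
    """
    merged = []
    for lba, digest in enumerate(expected):
        good = None
        for image in backups:
            # Older backups may be shorter than newer ones.
            block = image[lba] if lba < len(image) else None
            if block is not None and hashlib.sha1(block).hexdigest() == digest:
                good = block
                break
        merged.append(good)   # None means every copy of this LBA was damaged
    return merged
```

An LBA only stays lost if the damage hits it in every backup, which is
the "full strip across all backups" case.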

It seems to me that #1 and #2 are 100% the same in terms of
the probability of success. In fact, one might claim that ZFS has
a slight edge because of:
      a. "volume management" being part of the FS
      b. the "ditto blocks", i.e. every block pointer having up to
          3 alternative locations for the block it points to
The net result is that you might end up with more good blocks
to choose from in the ZFS world than in venti's case. Which brings
us to #3.
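
Point (b) in code form, as I understand it: a reader tries each location
recorded in the block pointer and accepts the first copy whose checksum
matches. Everything below is a toy model. The devices dictionary, the
(device, offset) tuples and the use of SHA-256 are stand-ins of my own,
not the ZFS on-disk format or any ZFS API.

```python
import hashlib

def read_block(devices, locations, checksum):
    """Return the first copy whose SHA-256 matches checksum, else None.

    devices:   hypothetical {device_name: {offset: bytes}} model of the vdevs
    locations: up to 3 (device_name, offset) pairs from the block pointer
    checksum:  hex digest stored alongside the locations
    """
    for dev, offset in locations[:3]:
        data = devices.get(dev, {}).get(offset)   # missing device or offset -> None
        if data is not None and hashlib.sha256(data).hexdigest() == checksum:
            return data
    return None
```

With a single location this degenerates to an ordinary checksummed read;
the extra copies only help when the damage misses at least one of them.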

Once again, we might have more blocks to choose from than
we want (including "free" blocks) but the generation number
should be enough of a clue to filter unwanted things out.
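
A minimal sketch of that filtering step, under my own assumptions: each
salvaged block carries the logical slot it belongs to and the generation
in which it was written, and for a chosen target generation we keep, per
slot, the newest block that is not newer than the target. The tuple
layout and the name pick_by_generation are hypothetical.

```python
def pick_by_generation(candidates, max_gen):
    """Choose one block per slot: the newest with generation <= max_gen.

    candidates: iterable of (slot, generation, data) tuples
    """
    best = {}
    for slot, gen, data in candidates:
        if gen > max_gen:
            continue                      # written after the state we want
        if slot not in best or gen > best[slot][0]:
            best[slot] = (gen, data)
    return {slot: data for slot, (gen, data) in best.items()}
```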

Thanks,
Roman.

P.S. Oh, and in the case of ZFS a damaged vdev will be detected (and
possibly resilvered) under normal working conditions, while
fossil might not even notice the corruption.