9fans - fans of the OS Plan 9 from Bell Labs
From: FODEMESI Gergely <fgergo@eik.bme.hu>
To: plan9 mailing list <9fans@cse.psu.edu>
Subject: [9fans] fossil/venti/manber
Date: Mon, 28 Apr 2003 22:17:21 +0200
Message-ID: <Pine.GSO.4.50L0.0304282143170.9486-100000@goliat>

Hi,

 in the venti paper there is a reference to Manber's algorithm as a
possible future development for venti/fossil. (Udi Manber: Finding similar
files in a large file system)

Did anyone consider giving this possible development a second thought?

I'd like to elaborate on the possibility of using this algorithm with
venti.
Could somebody correct me if the following comments are wrong?

1. Anchors would be needed to synchronize to block boundaries.

2. In order to somehow detect possible similar bit-streams, venti must
know more about the meta information on these similar bit-streams
(files/directories). By this I mean the venti format would have to be
extended with meta information on files.

3. Venti would have to implement a method of generating candidate anchors
that relate "new" (i.e. freshly stored) bit-streams to possibly similar
existing bit-streams. This should probably be done in parallel with
storing new blocks. "Lazy anchoring?"

4. Except for databases with dynamically changing sizes (are there any?),
what kind of bit-streams could such a method be used for?

5. Depending on the answers to 4, could anybody imagine changing the venti
format in order to provide such a seemingly marginally useful feature? See
Russ's comment on possibly never changing the venti format.
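
To make point 1 concrete, here is a rough sketch of what I mean by anchors.
It is not Manber's exact method and not anything venti does today. It only
assumes that a position is an anchor when a hash of the preceding few bytes
has its low bits zero, and that blocks are cut at anchors and addressed by
their SHA-1 score. WINDOW, MASK and all function names are made up for
illustration, and a real implementation would use a cheap rolling
fingerprint instead of rehashing every window.

```python
import hashlib
import random

WINDOW = 16           # bytes examined to decide on an anchor (assumed value)
MASK = (1 << 8) - 1   # anchor when the low 8 hash bits are zero (assumed value)


def anchored_blocks(data):
    """Cut data after every position whose trailing WINDOW bytes hash to an anchor."""
    blocks, start = [], 0
    for i in range(WINDOW, len(data) + 1):
        digest = hashlib.sha1(data[i - WINDOW:i]).digest()
        if int.from_bytes(digest[:4], "big") & MASK == 0:
            blocks.append(data[start:i])
            start = i
    if start < len(data):
        blocks.append(data[start:])  # trailing bytes after the last anchor
    return blocks


def fixed_blocks(data, size=256):
    """Cut data into fixed-size blocks, for comparison."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def scores(blocks):
    """Set of SHA-1 scores, one per distinct block."""
    return {hashlib.sha1(b).hexdigest() for b in blocks}


# An "old" bit-stream and a "new" one that differs by one inserted byte.
rng = random.Random(1)
old = bytes(rng.getrandbits(8) for _ in range(20000))
new = old[:5000] + b"X" + old[5000:]

shared_fixed = len(scores(fixed_blocks(old)) & scores(fixed_blocks(new)))
shared_anchored = len(scores(anchored_blocks(old)) & scores(anchored_blocks(new)))
print("blocks shared with fixed boundaries:   ", shared_fixed)
print("blocks shared with anchored boundaries:", shared_anchored)
```

With fixed boundaries every block after the inserted byte is shifted and
gets a new score. With anchors the boundaries resynchronize at the next
anchor after the change, so the later blocks keep their old scores.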

 thanks for listening: gergo


Thread overview: 3+ messages
2003-04-28 20:17 FODEMESI Gergely [this message]
2003-04-28 21:55 ` Russ Cox
2003-04-28 22:13   ` William Josephson
