9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] thoughs about venti+fossil
@ 2015-04-21 18:30 hruodr
  2015-04-21 19:46 ` Russ Cox
  0 siblings, 1 reply; 67+ messages in thread
From: hruodr @ 2015-04-21 18:30 UTC (permalink / raw)
  To: 9fans


Dear Sirs!

I have a question about this old discussion:

http://www.mail-archive.com/9fans@cse.psu.edu/msg17960.html

And about the following answer:

> if you change lump.c to say
>
>       int verifywrites = 1;
>
> then venti will check every block as it is written to make
> sure there is no hash collision.  this is not the default (anymore).

Does this mean that Plan 9 by default assumes that A=B if hash(A)=hash(B)?

That this was not the default before, but is now?

That I still have the possibility of a "full check" of A=B (rather than assuming
it after checking hash(A)=hash(B)) by setting "int verifywrites = 1;" in
"lump.c"?

I do not want to revive the discussion, because I have the feeling
that discussion of this topic easily becomes very ideological.
I had one just some time ago; here is my last word:

http://thread.gmane.org/gmane.os.openbsd.misc/206951/focus=207340

In fact I feel bad declaring A=B when hash(A)=hash(B), but I know that
the probability of an error, at least by experience, is very small,
negligible (although I have scruples saying it). Just a bad feeling,
perhaps more against this kind of "probabilistic programming" than
against the probability of error (that I do not exactly know).

My question is hence not about the underlying theory, nor about justifying
the test with hash functions. I just want to know why the
default was changed, why A=B was originally checked even if one was sure
that the probability of error from checking only hash(A)=hash(B) is
negligible, and why the possibility of changing this default exists.

Rodrigo.




* Re: [9fans] thoughs about venti+fossil
  2015-04-21 18:30 [9fans] thoughs about venti+fossil hruodr
@ 2015-04-21 19:46 ` Russ Cox
  0 siblings, 0 replies; 67+ messages in thread
From: Russ Cox @ 2015-04-21 19:46 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Tue, Apr 21, 2015 at 2:30 PM, <hruodr@gmail.com> wrote:

> Does this mean, that Plan9 by default supposes that A=B if hash(A)=hash(B)?
>

Yes.


> That this was not the default before, but it is now?
>

Yes.


> That I still have the possibility of a "full check" of A=B (and not
> supposing
> it after checking hash(A)=hash(B)) by changing "int verifywrites = 1;" in
> "lump.c"?
>

Yes.


> I just want to know why the
> default was changed,


I believe I changed it, because I was working on performance, and reading
data from disk to run a memcmp whose answer is already known is an obvious
operation to cut to make a disk-bound server faster. Since you (quite
reasonably) don't want to reopen the debate over whether that's a
reasonable optimization, I won't justify it further.

My paper with Sean Rhea and Alex Pesterev documents the performance effect
of double-checking the equality in some detail.
https://www.usenix.org/legacy/event/usenix08/tech/full_papers/rhea/rhea.pdf.
(Caveat: in the usual academic tradition, the paper uses "Venti" to mean
the system described in the original paper, not the system in Plan 9 today.
The current Plan 9 implementation is much closer to what the paper calls
"Foundation: Compare by Hash".)


> why originally A=B was checked even if one was sure
> that the probability of error by only checking hash(A)=hash(B) is
> negligible, why the possibility of changing this default exists.
>

I didn't write the original code, so I can't answer that definitively. That
said, it seems to me quite reasonable to check A=B in the initial version
of a server, since you might have bugs in your implementation such that
(for example) hash(x) = 0 for all x. Once the system is more stable it also
seems to me reasonable to remove those checks if they incur significant
cost, much as one turns off costly asserts or other debugging code.

Hope this helps.
Russ
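
A minimal sketch of the trade-off described in this message, assuming a toy in-memory store rather than venti's actual code. `ToyStore`, `put`, `get` and `broken_hash` are illustrative names; the `verify_writes` flag is only modeled on the `verifywrites` variable mentioned in the thread. The hash function is injectable so that the "hash(x) = 0 for all x" bug can be simulated.

```python
import hashlib


class ToyStore:
    """Toy content-addressed store; not venti's real data structures."""

    def __init__(self, verify_writes=False, hash_fn=None):
        self.verify_writes = verify_writes
        # hash_fn is injectable so that a broken hash can be simulated.
        self.hash_fn = hash_fn or (lambda data: hashlib.sha1(data).digest())
        self.blocks = {}

    def put(self, data):
        score = self.hash_fn(data)
        if score in self.blocks:
            # Compare-by-hash stops here and assumes the data is identical.
            if self.verify_writes and self.blocks[score] != data:
                raise ValueError("score collision: same hash, different data")
            return score
        self.blocks[score] = data
        return score

    def get(self, score):
        return self.blocks[score]


def broken_hash(data):
    # Simulates the implementation bug from the message: hash(x) = 0 for all x.
    return bytes(20)


if __name__ == "__main__":
    unchecked = ToyStore(verify_writes=False, hash_fn=broken_hash)
    s = unchecked.put(b"first block")
    unchecked.put(b"second block")   # silently treated as a duplicate
    print(unchecked.get(s))          # still the first block

    checked = ToyStore(verify_writes=True, hash_fn=broken_hash)
    checked.put(b"first block")
    try:
        checked.put(b"second block")
    except ValueError as err:
        print("detected:", err)
```

With the check off, the second write is dropped without any signal; with it on, the same bug surfaces on the first conflicting write, at the cost of reading the stored block back for every duplicate score.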


* Re: [9fans] thoughs about venti+fossil
@ 2015-04-23  7:21 hruodr
  0 siblings, 0 replies; 67+ messages in thread
From: hruodr @ 2015-04-23  7:21 UTC (permalink / raw)
  To: 9fans


On Tue, 21 Apr 2015, Russ Cox wrote:

> My paper with Sean Rhea and Alex Pesterev documents the performance 
> effect of double-checking the equality in some detail.
> https://www.usenix.org/legacy/event/usenix08/tech/full_papers/rhea/rhea.pdf

Very nice paper, especially from chapter 3 on.

Reading it, my original question arises again:

****
  Foundation’s CAS layer is modeled on the Venti [34]
  content-addressed storage server, but we have adapted the
  Venti algorithms for use in a single-disk system and also
  optionally eliminated the assumption that SHA-1 is free
  of collisions, producing two operating modes for Foundation:
  compare-by-hash and compare-by-value. (Page 4)
****


And the question is answered in the paper:

****
While we originally investigated this mode [Compare-by-value] 
due to (in our opinion, unfounded) concerns about cryptographic
hash collisions (see [5, 16] for a lively debate), we
were surprised to find that its overall write performance
was close to that of compare-by-hash mode, despite the
added comparisons. Moreover, compare-by-value is always
faster for reads, as naming blocks by their log offsets
completely eliminates index lookups during reads. (page 7)
****


> (Caveat: in the usual academic tradition, the paper uses "Venti" to mean
> the system described in the original paper, not the system in Plan 9 today.
> The current Plan 9 implementation is much closer to what the paper calls 
> "Foundation: Compare by Hash".)

New question: and if I compile it with "int verifywrites = 1", is it
closer to "Compare-by-Value"?

I mean offset as handle.


> Hope this helps.

I wanted to know whether the option of compiling with the full check exists
out of consideration for people who have concerns about the (in)correctness of
compare-by-hash. One can disagree about the risk of using compare-by-hash,
but one cannot disagree about the fact that one disagrees. :)

I think everyone should decide for himself whether and where to use
"compare-by-hash". In some applications I would even take much more
risk than compare-by-hash. And I find experiments with hash functions,
including compare-by-hash, interesting.

I appreciate that the option of "compare-by-value" is there. Documentation
about where "compare-by-hash" is used is important, in order that people
may decide for themselves.

The possibility of easily changing the hash function would also be
interesting. As you note in the paper, this is important in "compare-by-value"
for increasing performance.

The problem I have with "compare-by-hash" is not only the probability of
hash collisions, but that it seems to rely on empirical knowledge about
the hash function used. People used to analytical arguments may find
empirical arguments and empirical programming gruesome. If the empirical
knowledge changes, if one discovers that the hash function used does not
distribute its domain uniformly enough over its range, then one will
want (especially in the case of compare-by-hash) to replace the hash
function with a better one. Trial and error is the empirical method
of solving (and making) problems.

Rodrigo.





* Re: [9fans] thoughs about venti+fossil
  2008-03-06  5:55           ` Bruce Ellis
@ 2008-03-11 18:34             ` Uriel
  0 siblings, 0 replies; 67+ messages in thread
From: Uriel @ 2008-03-11 18:34 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Mar 6, 2008 at 6:55 AM, Bruce Ellis <bruce.ellis@gmail.com> wrote:
> Uriel, is this guy just a clueless dick?
Both..?

> Or am I missing a milligram of sense in a mountain of bullshit?

If there is even one atom of sense I certainly have not seen it, the
guy is a hopeless case.

Sorry for the late reply, busy doing 'holidays' in a Swedish hospital
dealing with doctors about as obtuse as Enrico.

Hope all is well in oz, nice to see DoC getting some attention!

uriel


>  brucee
>
>
>
>  On Thu, Mar 6, 2008 at 4:40 PM, Uriel <uriel99@gmail.com> wrote:
>  > >  So we've seen again: statistics are *never* reliable. It only helps
>  > >  for vague decisions on very large masses, never for a single case.
>  >
>  > There is a possibility that a meteorite will crush your head any
>  > moment, there are some statistics about how probable this is, but as you
>  > say, they are not reliable, so best go live in a very deep cave, just
>  > make sure there is no Internet access, the world will be grateful for
>  > it.
>  >
>  > uriel
>  >
>



* Re: [9fans] thoughs about venti+fossil
  2008-03-11  2:10                         ` erik quanstrom
@ 2008-03-11  6:03                           ` Bruce Ellis
  0 siblings, 0 replies; 67+ messages in thread
From: Bruce Ellis @ 2008-03-11  6:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

It's good to see that nobody is passionate enough to put up $10.

brucee

On Tue, Mar 11, 2008 at 1:10 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>
> > erik quanstrom wrote:
> >
> > > that's cause it's 2^80 not 2^60.  did i mistype?
> >
> > Thank goodness! Now I can sleep...
> >
> > --
> > Wes Kussmaul
>
> maybe not.  it's just 1024 times more likely that you will.
>
> - erik
>
>



* Re: [9fans] thoughs about venti+fossil
  2008-03-11  2:04                       ` Wes Kussmaul
@ 2008-03-11  2:10                         ` erik quanstrom
  2008-03-11  6:03                           ` Bruce Ellis
  0 siblings, 1 reply; 67+ messages in thread
From: erik quanstrom @ 2008-03-11  2:10 UTC (permalink / raw)
  To: 9fans

> erik quanstrom wrote:
>
> > that's cause it's 2^80 not 2^60.  did i mistype?
>
> Thank goodness! Now I can sleep...
>
> --
> Wes Kussmaul

maybe not.  it's just 1024 times more likely that you will.

- erik



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 19:27                     ` erik quanstrom
  2008-03-10 20:55                       ` Bakul Shah
@ 2008-03-11  2:04                       ` Wes Kussmaul
  2008-03-11  2:10                         ` erik quanstrom
  1 sibling, 1 reply; 67+ messages in thread
From: Wes Kussmaul @ 2008-03-11  2:04 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

erik quanstrom wrote:

> that's cause it's 2^80 not 2^60.  did i mistype?

Thank goodness! Now I can sleep...

--
Wes Kussmaul





* Re: [9fans] thoughs about venti+fossil
  2008-03-10 18:06                   ` Bruce Ellis
  2008-03-10 18:31                     ` Eric Van Hensbergen
  2008-03-10 18:46                     ` Geoffrey Avila
@ 2008-03-10 21:35                     ` Charles Forsyth
  2 siblings, 0 replies; 67+ messages in thread
From: Charles Forsyth @ 2008-03-10 21:35 UTC (permalink / raw)
  To: 9fans

> And if one day you do get a collision,
> venti will tell you and then you can

tell your grandchildren



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 19:27                     ` erik quanstrom
@ 2008-03-10 20:55                       ` Bakul Shah
  2008-03-11  2:04                       ` Wes Kussmaul
  1 sibling, 0 replies; 67+ messages in thread
From: Bakul Shah @ 2008-03-10 20:55 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, 10 Mar 2008 15:27:47 EDT erik quanstrom <quanstro@coraid.com>  wrote:
> >
> > > brushing off the factor of 2^60 is like brushing off the difference
> > > in weight between a bunch of bananas and the moon.
> >
> > bunch of bananas = ~1kg
> > mass of moon (weight of moon @ earth's sea level) = 7.3477 x 10^22 kg
> > 2^60      =    1,152,921,504,606,846,976 = 1.15etc x 10^18
> >
> > Oh oh, you're off by 4 orders of magnitude, I think I hear collisions...
>
> that's cause it's 2^80 not 2^60.  did i mistype?

The nonrecoverable read error rate for disks is typically 1 in
10^14, while it takes about 2^80 blocks before the probability
that two of them hash to the same SHA1 value exceeds 0.5.  That
is roughly a factor of 10 billion.  So we are talking about a
bunch of bananas and less than 1000 Chiquita cargo ships.
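
The arithmetic in this message can be checked directly. A short sketch, taking both figures as given in the message (1 in 10^14 for unrecoverable reads, 2^80 for SHA1); the variable names are illustrative only.

```python
# Figures quoted in the message, taken as given.
disk_error_odds = 10 ** 14      # one unrecoverable read error per 10^14
sha1_birthday_bound = 2 ** 80   # threshold quoted for a likely SHA1 collision

ratio = sha1_birthday_bound / disk_error_odds
print(f"2^80  = {sha1_birthday_bound:.3e}")
print(f"ratio = {ratio:.3e}")   # on the order of 10^10, about 10 billion
```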



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 18:46                     ` Geoffrey Avila
@ 2008-03-10 20:28                       ` Charles Forsyth
  0 siblings, 0 replies; 67+ messages in thread
From: Charles Forsyth @ 2008-03-10 20:28 UTC (permalink / raw)
  To: 9fans

> The statistician replies that while the odds of being on a plane with
> _one_ bomb are too high for him, the odds of being on a plane with
> _multiple_ bombs are infinitesimal.

probably that explains why Homeland Security always seemed to be on Orange Alert.
at first i thought it had something to do with trading of commodity futures in Florida.




* Re: [9fans] thoughs about venti+fossil
  2008-03-10 19:00                   ` Wes Kussmaul
@ 2008-03-10 19:27                     ` erik quanstrom
  2008-03-10 20:55                       ` Bakul Shah
  2008-03-11  2:04                       ` Wes Kussmaul
  0 siblings, 2 replies; 67+ messages in thread
From: erik quanstrom @ 2008-03-10 19:27 UTC (permalink / raw)
  To: 9fans

>
> > brushing off the factor of 2^60 is like brushing off the difference
> > in weight between a bunch of bananas and the moon.
>
> bunch of bananas = ~1kg
> mass of moon (weight of moon @ earth's sea level) = 7.3477 x 10^22 kg
> 2^60      =    1,152,921,504,606,846,976 = 1.15etc x 10^18
>
> Oh oh, you're off by 4 orders of magnitude, I think I hear collisions...

that's cause it's 2^80 not 2^60.  did i mistype?

☺ erik



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 13:20                 ` erik quanstrom
@ 2008-03-10 19:00                   ` Wes Kussmaul
  2008-03-10 19:27                     ` erik quanstrom
  0 siblings, 1 reply; 67+ messages in thread
From: Wes Kussmaul @ 2008-03-10 19:00 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

erik quanstrom wrote:

> brushing off the factor of 2^60 is like brushing off the difference
> in weight between a bunch of bananas and the moon.

bunch of bananas = ~1kg
mass of moon (weight of moon @ earth's sea level) = 7.3477 x 10^22 kg
2^60      =    1,152,921,504,606,846,976 = 1.15etc x 10^18

Oh oh, you're off by 4 orders of magnitude, I think I hear collisions...

-- 
Wes Kussmaul


The information contained in this electronic message and any attachments to this message are intended for the exclusive 
use of the addressee(s) and may contain confidential or privileged information. If you are not the intended recipient, 
please notify attorney Mort Hapless at Vulner, Exposed & Wideopen LLP immediately at either (781) 647-7178, or at 
ohoh@vulex.com, and destroy all copies of this message and any attachments. No, really. Really. Listen, we mean it! Hey, 
if you don’t stop reading that confidential stuff about our client you’re in big trouble. OK, we’re the ones in trouble 
but we’ll find a way to go after you, or at least we think we may be able to. Look, we’re begging you. Just click the 
delete button and move on to a message that concerns you, OK? Please?? We'll buy you lunch...




* Re: [9fans] thoughs about venti+fossil
  2008-03-10 18:06                   ` Bruce Ellis
  2008-03-10 18:31                     ` Eric Van Hensbergen
@ 2008-03-10 18:46                     ` Geoffrey Avila
  2008-03-10 20:28                       ` Charles Forsyth
  2008-03-10 21:35                     ` Charles Forsyth
  2 siblings, 1 reply; 67+ messages in thread
From: Geoffrey Avila @ 2008-03-10 18:46 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


This reminds me of a joke:

A man was asked by his neighbor to watch the neighbor's house while he
went away to a conference. The conference was only for one day, but the
neighbor was gone a week. He explained to the man that the extra time was
due to driving there and back. The neighbor, a statistician, explained
that he was unwilling to fly, having calculated that the odds of a bomb
being on the plane were too high for his taste.

Not long after, the man needs to fly someplace, and is surprised to see
his neighbor the statistician in the row next to him.

"I thought you didn't fly?"
asks the man.

The statistician replies that while the odds of being on a plane with
_one_ bomb are too high for him, the odds of being on a plane with
_multiple_ bombs are infinitesimal.

"I just pack a bomb in my carryon, to be safe."
says the statistician.

-GBA


> I concur and offer the great venti challenge.
>
> I am willing to personally give $10,000 to the person who experiences
> the first venti collision. (I know it's not a million but call me
> cheap.)
>
> The rules are simple. To enroll in the contest send $10 to charity and
> send me a copy of the receipt. First in best dressed. Additional rule
> - anyone who continues to whine should feel morally obliged to send
> $10 to charity for each transgression. No funny business or you'll be
> struck by lightning.
>
> brucee
>
> On Tue, Mar 11, 2008 at 3:18 AM, Russ Cox <rsc@swtch.com> wrote:
>> This topic just isn't worth all the anguish it is causing.
>>
>> If you are worried about SHA-1 collisions,
>> turn on the flag that looks for them.
>> Then you can sleep at night knowing that
>> all your writes have been collision-free.
>>
>> And if one day you do get a collision,
>> venti will tell you and then you can evaluate
>> your choices.
>>
>> Russ
>>
>>
>>
>



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 18:31                     ` Eric Van Hensbergen
@ 2008-03-10 18:40                       ` Bruce Ellis
  0 siblings, 0 replies; 67+ messages in thread
From: Bruce Ellis @ 2008-03-10 18:40 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Fine, as long as each node donates $10 :-)

Well actually that isn't needed, as you would need several universes
of disk space (depending on whose estimate you like). brucee writes a
cheque for $10 for polluting 9fans.

Oh the Free Beer Foundation - I drank all the donations.

brucee

On Tue, Mar 11, 2008 at 5:31 AM, Eric Van Hensbergen <ericvh@gmail.com> wrote:
> On Mon, Mar 10, 2008 at 1:06 PM, Bruce Ellis <bruce.ellis@gmail.com> wrote:
> > I concur and offer the great venti challenge.
> >
> >  I am willing to personally give $10,000 to the person who experiences
> >  the first venti collision. (I know it's not a million but call me
> >  cheap.)
> >
> >  The rules are simple. To enroll in the contest send $10 to charity and
> >  send me a copy of the receipt. First in best dressed. Additional rule
> >  - anyone who continues to whine should feel morally obliged to send
> >  $10 to charity for each transgression. No funny business or you'll be
> >  struck by lightning.
> >
>
> finally, a killer app for the blue gene 64 thousand node scalability runs.
> I went to donate my $10 to the FBF via beer pal, but the website is no
> longer there.
>
>            -eric
>
>



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 18:06                   ` Bruce Ellis
@ 2008-03-10 18:31                     ` Eric Van Hensbergen
  2008-03-10 18:40                       ` Bruce Ellis
  2008-03-10 18:46                     ` Geoffrey Avila
  2008-03-10 21:35                     ` Charles Forsyth
  2 siblings, 1 reply; 67+ messages in thread
From: Eric Van Hensbergen @ 2008-03-10 18:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, Mar 10, 2008 at 1:06 PM, Bruce Ellis <bruce.ellis@gmail.com> wrote:
> I concur and offer the great venti challenge.
>
>  I am willing to personally give $10,000 to the person who experiences
>  the first venti collision. (I know it's not a million but call me
>  cheap.)
>
>  The rules are simple. To enroll in the contest send $10 to charity and
>  send me a copy of the receipt. First in best dressed. Additional rule
>  - anyone who continues to whine should feel morally obliged to send
>  $10 to charity for each transgression. No funny business or you'll be
>  struck by lightning.
>

finally, a killer app for the blue gene 64 thousand node scalability runs.
I went to donate my $10 to the FBF via beer pal, but the website is no
longer there.

            -eric



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 16:18                 ` Russ Cox
@ 2008-03-10 18:06                   ` Bruce Ellis
  2008-03-10 18:31                     ` Eric Van Hensbergen
                                       ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Bruce Ellis @ 2008-03-10 18:06 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I concur and offer the great venti challenge.

I am willing to personally give $10,000 to the person who experiences
the first venti collision. (I know it's not a million but call me
cheap.)

The rules are simple. To enroll in the contest send $10 to charity and
send me a copy of the receipt. First in best dressed. Additional rule
- anyone who continues to whine should feel morally obliged to send
$10 to charity for each transgression. No funny business or you'll be
struck by lightning.

brucee

On Tue, Mar 11, 2008 at 3:18 AM, Russ Cox <rsc@swtch.com> wrote:
> This topic just isn't worth all the anguish it is causing.
>
> If you are worried about SHA-1 collisions,
> turn on the flag that looks for them.
> Then you can sleep at night knowing that
> all your writes have been collision-free.
>
> And if one day you do get a collision,
> venti will tell you and then you can evaluate
> your choices.
>
> Russ
>
>
>



* Re: [9fans] thoughs about venti+fossil
  2008-03-10 10:19               ` sqweek
  2008-03-10 12:29                 ` Gorka Guardiola
  2008-03-10 13:20                 ` erik quanstrom
@ 2008-03-10 16:18                 ` Russ Cox
  2008-03-10 18:06                   ` Bruce Ellis
  2 siblings, 1 reply; 67+ messages in thread
From: Russ Cox @ 2008-03-10 16:18 UTC (permalink / raw)
  To: 9fans

This topic just isn't worth all the anguish it is causing.

If you are worried about SHA-1 collisions,
turn on the flag that looks for them.
Then you can sleep at night knowing that
all your writes have been collision-free.

And if one day you do get a collision,
venti will tell you and then you can evaluate
your choices.

Russ




* Re: [9fans] thoughs about venti+fossil
  2008-03-10 10:19               ` sqweek
  2008-03-10 12:29                 ` Gorka Guardiola
@ 2008-03-10 13:20                 ` erik quanstrom
  2008-03-10 19:00                   ` Wes Kussmaul
  2008-03-10 16:18                 ` Russ Cox
  2 siblings, 1 reply; 67+ messages in thread
From: erik quanstrom @ 2008-03-10 13:20 UTC (permalink / raw)
  To: 9fans

[redirected to 9fans@9fans.net]

>  The difference between this and venti (aside from the factor of 2^60
> or whatever it was) is that network/memory/disk errors are either

brushing off the factor of 2^60 is like brushing off the difference
in weight between a bunch of bananas and the moon.

> transient or manageable.
>  Silent network error? Going to be difficult to notice, but once you
> do a retransmit will fix it (or if things are really bad, a
> replacement network card).
>  RAM Problems? If transient, it is fixed next reboot, otherwise
> replace the module.
>  Silent disk corruption? Rewrite the data or replace the disk.
>  Venti hash collision? Um... well, it doesn't matter how many times we
> try to rewrite the block in question, it is always going to collide

what's the difference?  you are assuming that you can recreate the destroyed
data.  you're also assuming that a corrupt hash was stored with the corrupt
data.  if a proper hash is stored with corrupt data, you will never be able to
store that same block correctly without venti surgery.

> Replacing venti seems less than satisfactory - what else provides the
> same functionality? Our best option is to replace the hash and hope we
> don't get a different collision. But, this leaves us with a whole
> bunch of data addressed by the old hashing scheme which we presumably
> have to write new code to convert[1]. New code means new bugs, and I'd
> be lying if I claimed the prospect of writing such a utility to run on
> several years of a venti archive didn't scare me.

this is a common fallacy.  being scared of "what if" doesn't change probabilities.
oddly people are not wired to be afraid of the things they should.  how
many people do you know who get white-knuckled at the thought of getting
into a car?  you're >1000x more likely to die in a car than an airplane.
in fact you're 10^22 times more likely to die in a car crash than to have
a venti collision.  (at least in the us.)

by the way, i'm not sure what you mean by "replace venti".  there isn't
anything that does content-addressed storage for plan 9.  but if you
mean, is there anything that does the same job as venti+fossil, there is.
i've been using ken's file server with aoe storage.  the data is protected
by raid on the storage appliances.  this allows us to get very good fs
throughput although our working set is generally >4GB.

>  Well, I came up with one perhaps more interesting question while
> thinking about what happens with different block sizes (in particular
> blocks of one byte and blocks of the same size as the hash)... As I
> understand it, venti uses the hash of the data to determine where on
> disk to store the block. So, what happens when the hash resolves to an
> address which is off the end of the disk?

not really.  a direct addressing scheme would require 1.46e48 bytes of
storage.  venti uses a hash of the data like a normal disk would use
an lba.  this hash is called a fingerprint.  the fingerprints are indexed.
the index provides a mapping between fingerprint and arena:offset.

- erik
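
A sketch of the indirection described in this message, assuming a single in-memory "arena" and a dictionary as the index; this is a simplification for illustration, not venti's on-disk format, and `ToyArenaStore` is a hypothetical name. The first lines reproduce the size of a directly addressed 160-bit space.

```python
import hashlib

# Direct addressing: one byte-address per possible 160-bit fingerprint.
direct_address_space = 2 ** 160
print(f"{direct_address_space:.2e} addresses")   # roughly 1.46e48


class ToyArenaStore:
    """Append-only log plus an index; a simplification, not venti's format."""

    def __init__(self):
        self.arena = bytearray()   # stands in for a single arena on disk
        self.index = {}            # fingerprint -> (offset, length)

    def write(self, data):
        fp = hashlib.sha1(data).digest()
        if fp not in self.index:
            # New block: append to the log and remember where it landed.
            self.index[fp] = (len(self.arena), len(data))
            self.arena += data
        return fp

    def read(self, fp):
        offset, length = self.index[fp]
        return bytes(self.arena[offset:offset + length])
```

Blocks land wherever the log currently ends, so the fingerprint never has to resolve to a physical address; only the index needs to know the offset.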




* Re: [9fans] thoughs about venti+fossil
  2008-03-10 10:19               ` sqweek
@ 2008-03-10 12:29                 ` Gorka Guardiola
  2008-03-10 13:20                 ` erik quanstrom
  2008-03-10 16:18                 ` Russ Cox
  2 siblings, 0 replies; 67+ messages in thread
From: Gorka Guardiola @ 2008-03-10 12:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, Mar 10, 2008 at 11:19 AM, sqweek <sqweek@gmail.com> wrote:
>   Silent network error? Going to be difficult to notice, but once you
>  do a retransmit will fix it (or if things are really bad, a

You notice network errors in ways that are themselves probabilistic.
Bits can change in ways the CRC doesn't notice. The same with
disks. And cosmic rays hitting your processor...
Some of these probabilities are higher than (or comparable to) the
probability of sha-1 colliding.
--
- curiosity sKilled the cat
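
A small illustration of an undetected change, using the 16-bit ones'-complement Internet checksum in the style of RFC 1071 (a simple checksum, not a CRC, chosen here because its blind spot is easy to show). The function name and the sample bytes are illustrative only.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78"
reordered = b"\x56\x78\x12\x34"              # same 16-bit words, swapped

print(original != reordered)
print(internet_checksum(original) == internet_checksum(reordered))
```

Because the sum is commutative, any reordering of 16-bit words passes the check, so detection at this layer is itself only probabilistic.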



* Re: [9fans] thoughs about venti+fossil
  2008-03-06 15:09             ` Charles Forsyth
  2008-03-06 17:09               ` Robert Raschke
@ 2008-03-10 10:19               ` sqweek
  2008-03-10 12:29                 ` Gorka Guardiola
                                   ` (2 more replies)
  1 sibling, 3 replies; 67+ messages in thread
From: sqweek @ 2008-03-10 10:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Mar 7, 2008 at 12:09 AM, Charles Forsyth <forsyth@terzarima.net> wrote:
> > But for HA applications, we still need some additional redundancy
>  > or at least some error diagnostics at application level. Well,
>  > we'll most likely need this anyway, e.g. to detect human fault
>  > or code bugs.
>
>  i hadn't realised the code i'd quoted only dealt with blocks in memory
>  (i didn't look hard enough once i'd found it), but russ then pointed out
>  that another option will do something like the check i'd intended.
>
>  given that, you have at least a check and a diagnostic that the
>  unlikely event occurred.  it isn't the case i'd worry about first.  after all, the applications
>  pull the stuff into memory across interfaces that might have at most a parity
>  check, after transmission using protocols that use a fairly simple 16-bit
>  check sum, a compromise between speed of calculation and effectiveness.
>  one might sometimes add an end-to-end check, or digesting ... perhaps using SHA1!

 The difference between this and venti (aside from the factor of 2^60
or whatever it was) is that network/memory/disk errors are either
transient or manageable.
 Silent network error? Going to be difficult to notice, but once you
do, a retransmit will fix it (or if things are really bad, a
replacement network card).
 RAM problems? If transient, it is fixed on the next reboot; otherwise
replace the module.
 Silent disk corruption? Rewrite the data or replace the disk.
 Venti hash collision? Um... well, it doesn't matter how many times we
try to rewrite the block in question, it is always going to collide.
Replacing venti seems less than satisfactory - what else provides the
same functionality? Our best option is to replace the hash and hope we
don't get a different collision. But, this leaves us with a whole
bunch of data addressed by the old hashing scheme which we presumably
have to write new code to convert[1]. New code means new bugs, and I'd
be lying if I claimed the prospect of writing such a utility to run on
several years of a venti archive didn't scare me.

[1] Unless you could do this with vac and co... my venti-fu is weak.
I'm setting my file server up soon, I promise!

 But if I normalise my worries based on the likelihood of the problem
occurring, then the real thing leaving a bad taste in my mouth is that
eventually something happens to force maintenance:
1) you get a hash collision
2) something displaces venti
3) venti changes
 OTOH, eventually you're going to run out of disk space, so venti is
unlikely to be the weak link here either.

 Well, I came up with one perhaps more interesting question while
thinking about what happens with different block sizes (in particular
blocks of one byte and blocks of the same size as the hash)... As I
understand it, venti uses the hash of the data to determine where on
disk to store the block. So, what happens when the hash resolves to an
address which is off the end of the disk?

-sqweek



* Re: [9fans] thoughs about venti+fossil
  2008-03-08  9:37             ` Enrico Weigelt
  2008-03-08  9:57               ` Bruce Ellis
  2008-03-08 10:46               ` Charles Forsyth
@ 2008-03-08 15:37               ` erik quanstrom
  2 siblings, 0 replies; 67+ messages in thread
From: erik quanstrom @ 2008-03-08 15:37 UTC (permalink / raw)
  To: weigelt, 9fans

>> After this fact the colliding block is itself very interesting,
>> and it is also very likely that this block will be stored and
>> archived just for this reason.
> 
> Which will increase the chance of a failure ;-O

by how much?  the fact that something *could* happen is often
meaningless.  what is
	lim{x->∞} 1+1/x?
the εδ argument made to prove the result always says that
if i control the input of a function this much i can control the
output that much.  in the real world there are limits (ha!)
to how small or large something can get before it is practically
infinite or zero.  this is because, e.g., there is no such thing as
1e24 bytes of storage.

theoretically, i don't think a collision by itself would be all that
interesting.  the number of possible bit patterns in, say, 8k blocks
would be
	2^(8*8192).
while the number of possible bit patterns in a sha1 hash is
	2^(8*20).
assuming an even distribution, there would be
	ceil(2^(8*8192)/2^(8*20) - 1) =
		2^(8*8172) - 1
collisions on average per hash value.  (that's ~1.37e19680, btw.)
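
The counting argument above can be checked numerically. This is only a sketch: the 8192-byte block and 20-byte digest come from the text, while the function and variable names are made up for illustration. Since 2^(8*8172) is far too large for a float, the size is read off through its base-10 logarithm, which puts the exponent at about 19680.

```python
import math

BLOCK_BYTES = 8192   # 8k block, as in the text
HASH_BYTES = 20      # a sha1 digest is 160 bits

def surplus_bits(block_bytes: int, hash_bytes: int) -> int:
    """log2 of (number of possible blocks) / (number of hash values)."""
    return 8 * block_bytes - 8 * hash_bytes

bits = surplus_bits(BLOCK_BYTES, HASH_BYTES)

# Other blocks sharing a given hash value, on average (exact integer).
collisions = 2**bits - 1

# math.log10 accepts arbitrarily large ints, so nothing overflows here.
log10_collisions = math.log10(collisions)
exponent = math.floor(log10_collisions)
mantissa = 10 ** (log10_collisions - exponent)

print(f"2^{bits} - 1 ~= {mantissa:.2f}e{exponent}")
```

The point of the exercise is that collisions must exist in enormous numbers; what matters is only how unlikely it is to stumble on one.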

the only way a collision would be interesting is if it exposed
a weakness in sha1.

- erik


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-08  9:37             ` Enrico Weigelt
  2008-03-08  9:57               ` Bruce Ellis
@ 2008-03-08 10:46               ` Charles Forsyth
  2008-03-08 15:37               ` erik quanstrom
  2 siblings, 0 replies; 67+ messages in thread
From: Charles Forsyth @ 2008-03-08 10:46 UTC (permalink / raw)
  To: 9fans

as a purely intellectual problem — how could we recover from or perhaps avoid
this remarkably unlikely event if our data stream just happened to trigger it —
this is all very well.  nevertheless, it still glosses over the hard fact that it really
is the least of your worries if you are building a real system using it:
data corruption is far more likely to occur in the networks, for instance,
so if you really are building a system, you are wasting your time on this aspect
(except to the extent you come to see the point).


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-08  9:37             ` Enrico Weigelt
@ 2008-03-08  9:57               ` Bruce Ellis
  2008-03-08 10:46               ` Charles Forsyth
  2008-03-08 15:37               ` erik quanstrom
  2 siblings, 0 replies; 67+ messages in thread
From: Bruce Ellis @ 2008-03-08  9:57 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

perhaps a more sophisticated response would be to accept the fact that
you don't know what you are doing and don't really want advice. we
only mock you not because you have an idea that didn't pan out ... but
because you won't shut up about "i don't know what i'm doing but i'll
type another page of crap". we don't do that here. well the teenagers
do.

brucee

On Sat, Mar 8, 2008 at 8:37 PM, Enrico Weigelt <weigelt@metux.de> wrote:
> * Wilhelm B. Kloke <wb@arb-phys.uni-dortmund.de> wrote:
>
> > After this fact the colliding block is itself very interesting,
> > and it is also very likely that this block will be stored and
> > archived just for this reason.
>
> Which will increase the chance of a failure ;-O
>
> > In practice though, a filesystem relying on venti could just
> > change the block boundaries for this case or choose some other
> > escape from needing to store these special blocks.
>
> hmm, let's assume, the scientists will come up with (maybe
> artificially constructed) collisions long before they'll
> happen in practice, we could handle them specially. All we
> need to do is to leave enough room in the keyspace, so we
> can assign them to these special blocks, once they've been
> discovered.
>
> A simple way could simply be an additional bit, which tells
> us that the key isn't the data's hash, but an explicitly
> assigned dictionary entry.
>
> Maybe a more sophisticated approach: add a keytype prefix,
> which allows dropping in several types of keys. Default might
> be sha-1, but for special cases another keytype (eg. "dict")
> can be used.
>
>
>
> cu
> --
> ---------------------------------------------------------------------
>  Enrico Weigelt    ==   metux IT service - http://www.metux.de/
> ---------------------------------------------------------------------
>  Please visit the OpenSource QM Taskforce:
>        http://wiki.metux.de/public/OpenSource_QM_Taskforce
>  Patches / Fixes for a lot dozens of packages in dozens of versions:
>        http://patches.metux.de/
> ---------------------------------------------------------------------
>


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  9:54           ` Wilhelm B. Kloke
@ 2008-03-08  9:37             ` Enrico Weigelt
  2008-03-08  9:57               ` Bruce Ellis
                                 ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-08  9:37 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* Wilhelm B. Kloke <wb@arb-phys.uni-dortmund.de> wrote:

> After this fact the colliding block is itself very interesting,
> and it is also very likely that this block will be stored and
> archived just for this reason.

Which will increase the chance of a failure ;-O

> In practice though, a filesystem relying on venti could just
> change the block boundaries for this case or choose some other
> escape from needing to store these special blocks.

hmm, let's assume, the scientists will come up with (maybe
artificially constructed) collisions long before they'll
happen in practice, we could handle them specially. All we
need to do is to leave enough room in the keyspace, so we
can assign them to these special blocks, once they've been
discovered.

A simple way could simply be an additional bit, which tells
us that the key isn't the data's hash, but an explicitly
assigned dictionary entry.

Maybe a more sophisticated approach: add a keytype prefix,
which allows dropping in several types of keys. Default might
be sha-1, but for special cases another keytype (eg. "dict")
can be used.


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 21:39                   ` Paul Lalonde
@ 2008-03-08  9:06                     ` Enrico Weigelt
  0 siblings, 0 replies; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-08  9:06 UTC (permalink / raw)
  To: 9fans

* Paul Lalonde <plalonde@telus.net> wrote:
> Bruce Ellis wrote:
> >If you stopped to pick up the penny you'd get hit by lightning and
> >fail to cash in your lottery ticket while getting bitten by a moose!
> >
>
> But thank god, there would be no collision.

God doesn't exist. He vanished in a puff of logic ;-P

> I hate getting hit by cars.

Yeah, logic proved that too much logic can be harmful ;-O


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 20:18                 ` Bruce Ellis
  2008-03-06 21:39                   ` Paul Lalonde
@ 2008-03-06 22:10                   ` Martin Harriss
  1 sibling, 0 replies; 67+ messages in thread
From: Martin Harriss @ 2008-03-06 22:10 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Bruce Ellis wrote:
> If you stopped to pick up the penny you'd get hit by lightning and
> fail to cash in your lottery ticket while getting bitten by a moose!

Mynd you, møøse bites Kan be pretty nasti...


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 20:18                 ` Bruce Ellis
@ 2008-03-06 21:39                   ` Paul Lalonde
  2008-03-08  9:06                     ` Enrico Weigelt
  2008-03-06 22:10                   ` Martin Harriss
  1 sibling, 1 reply; 67+ messages in thread
From: Paul Lalonde @ 2008-03-06 21:39 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Bruce Ellis wrote:
> If you stopped to pick up the penny you'd get hit by lightning and
> fail to cash in your lottery ticket while getting bitten by a moose!
>

But thank god, there would be no collision.  I hate getting hit by cars.

Paul
> brucee
>
> On Fri, Mar 7, 2008 at 6:45 AM, Paul Lalonde <plalonde@telus.net> wrote:
>
>> You don't have to care about the chance of a collision.  Work out the
>> expected value of the collision by estimating the maximum that might
>> cost you.  I'll go nuts, and claim that a collision will cost *one
>> BILLION* dollars.
>> Not checking for a collision, assuming a 2**-90 collision rate (which is
>> a huge over-estimate, but that I've seen bandied about here), you wind
>> up with an expected dollar cost of 2**-60 dollars on that collision.
>>
>> I don't pick up pennies in the street.  I certainly won't dedicate any
>> more brainpower to this silliness.
>>
>> Paul
>>
>> Enrico Weigelt wrote:
>>
>>> * Bruce Ellis <bruce.ellis@gmail.com> wrote:
>>>
>>>
>>>> it's even sillier, if everyone bought 1,000,000 times as many tickets
>>>> guess how that would change the probabilities. not at all!
>>>>
>>>>
>>> The main problem is: statistics is not reliable. You just can
>>> guess how many times approx. something will happen if you take
>>> a really large number of tries. You can never be sure for a
>>> single case.
>>>
>>>
>>> cu
>>>
>>>
>>


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 19:45               ` Paul Lalonde
@ 2008-03-06 20:18                 ` Bruce Ellis
  2008-03-06 21:39                   ` Paul Lalonde
  2008-03-06 22:10                   ` Martin Harriss
  0 siblings, 2 replies; 67+ messages in thread
From: Bruce Ellis @ 2008-03-06 20:18 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

If you stopped to pick up the penny you'd get hit by lightning and
fail to cash in your lottery ticket while getting bitten by a moose!

brucee

On Fri, Mar 7, 2008 at 6:45 AM, Paul Lalonde <plalonde@telus.net> wrote:
> You don't have to care about the chance of a collision.  Work out the
> expected value of the collision by estimating the maximum that might
> cost you.  I'll go nuts, and claim that a collision will cost *one
> BILLION* dollars.
> Not checking for a collision, assuming a 2**-90 collision rate (which is
> a huge over-estimate, but that I've seen bandied about here), you wind
> up with an expected dollar cost of 2**-60 dollars on that collision.
>
> I don't pick up pennies in the street.  I certainly won't dedicate any
> more brainpower to this silliness.
>
> Paul
>
> Enrico Weigelt wrote:
> > * Bruce Ellis <bruce.ellis@gmail.com> wrote:
> >
> >> it's even sillier, if everyone bought 1,000,000 times as many tickets
> >> guess how that would change the probabilities. not at all!
> >>
> >
> > The main problem is: statistics is not reliable. You just can
> > guess how many times approx. something will happen if you take
> > a really large number of tries. You can never be sure for a
> > single case.
> >
> >
> > cu
> >
>
>


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 19:09 Brian L. Stuart
@ 2008-03-06 19:50 ` Charles Forsyth
  0 siblings, 0 replies; 67+ messages in thread
From: Charles Forsyth @ 2008-03-06 19:50 UTC (permalink / raw)
  To: 9fans

> is a very well-designed system, and is as reliable
> as any other form of archive.

it has been more reliable than my old QIC tapes
(i discovered recently).  fortunately, i found i'd already copied
those to something else, and now to venti.


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  6:16             ` Enrico Weigelt
  2008-03-06 18:50               ` ron minnich
@ 2008-03-06 19:45               ` Paul Lalonde
  2008-03-06 20:18                 ` Bruce Ellis
  1 sibling, 1 reply; 67+ messages in thread
From: Paul Lalonde @ 2008-03-06 19:45 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

You don't have to care about the chance of a collision.  Work out the
expected value of the collision by estimating the maximum that might
cost you.  I'll go nuts, and claim that a collision will cost *one
BILLION* dollars.
Not checking for a collision, assuming a 2**-90 collision rate (which is
a huge over-estimate, but that I've seen bandied about here), you wind
up with an expected dollar cost of 2**-60 dollars on that collision.
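
As a quick check of that arithmetic, here is a minimal sketch using exact fractions. The 2**-90 rate and the billion-dollar cost are the figures assumed in the text; the variable names are illustrative only.

```python
from fractions import Fraction

collision_probability = Fraction(1, 2**90)  # rate assumed in the text
cost_dollars = 10**9                        # "one BILLION dollars"

# Expected cost = probability of the event times what it would cost.
expected_cost = collision_probability * cost_dollars

# 10**9 is a little under 2**30, so the result is a little under 2**-60.
print(float(expected_cost))
print(expected_cost < Fraction(1, 2**60))
```

Exact fractions are used only so that the comparison against 2**-60 is not blurred by floating-point rounding.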

I don't pick up pennies in the street.  I certainly won't dedicate any
more brainpower to this silliness.

Paul

Enrico Weigelt wrote:
> * Bruce Ellis <bruce.ellis@gmail.com> wrote:
>
>> it's even sillier, if everyone bought 1,000,000 times as many tickets
>> guess how that would change the probabilities. not at all!
>>
>
> The main problem is: statistics is not reliable. You just can
> guess how many times approx. something will happen if you take
> a really large number of tries. You can never be sure for a
> single case.
>
>
> cu
>


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 18:50               ` ron minnich
@ 2008-03-06 19:43                 ` Charles Forsyth
  0 siblings, 0 replies; 67+ messages in thread
From: Charles Forsyth @ 2008-03-06 19:43 UTC (permalink / raw)
  To: 9fans

> Think about this. You can warm up some CPUs to a point at which they will:
> - transparently corrupt floating point computations
> - not be warm enough to trigger the "I'm too hot" fault

years ago we had a computer that used a Fairchild (Intergraph) Clipper chip.
after it arrived it was worryingly unreliable: lots of transient faults and lockups.
it turned out that the clever cooling system the computer maker used was
just far too efficient, and made the box too cold.
the chip didn't work correctly when it was too cold.
(no one thought to include an ``it's an ice age'' alert.)
they restrained the cooling system and it was then ever so reliable.
probably.


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
@ 2008-03-06 19:09 Brian L. Stuart
  2008-03-06 19:50 ` Charles Forsyth
  0 siblings, 1 reply; 67+ messages in thread
From: Brian L. Stuart @ 2008-03-06 19:09 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

From: "Russ Cox" <rsc@swtch.com>
> sure.  use sha-256 and your probability of collision goes
> down even further.  but *you* (probably) still won't be *sure*.

I should probably not put my 2 cents worth in here,
but my resistance is weak...

It is true that you cannot be sure that there won't be a
collision in venti, regardless of what hashing function
you use.  It is probabilistic, and doesn't prevent it
from happening tomorrow, or from not happening until
the sun burns out.  But it seems to me that there's
a bigger picture.  The reason we would not want a collision
is that it would, in effect, be a form of data corruption.
But it's only one possible source.  It's possible that
network communication could be corrupted but still
pass the CRC checks (if they're even present).  It's
possible that the disk could be corrupted in such a
way that a block is in error, but still passes the
ECC check.  It's possible that a bit in the main
memory might flip (or two bits if we have parity
memory).  In the end, we have to rely on the fact
that these are all very unlikely to happen; their
probabilities are quite low.  A higher probability
of damage comes from a potential fire in the machine
room.  We often add some form of off-site backup
to handle this.  But it can't make us sure that
an earthquake won't hit the off-site backup location
at the same time we have a fire locally.  Rather,
the probability of both is low enough we accept it.
The amount of effort we put into mitigating an error
is proportional to the probability of that error
occurring and the amount of harm the error would
cause.

What does all this mean for venti?  If we want to
reduce the overall probability of data corruption,
we want to put our efforts into addressing the one
with the highest probability.  Making the others
better won't appreciably help the overall probability.
And a venti collision is not the one with the
highest probability among those I've listed.  In
fact, I'd suspect it's the one with the lowest
probability.  So putting attention on making it
less likely is really misplaced effort from a
practical standpoint.
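
A small sketch makes the point concrete. The probabilities below are made up purely for illustration (they are not measurements of any disk, network, or of venti), and the sources are assumed to be independent; only the relative sizes matter.

```python
import math

# Made-up yearly probabilities, for illustration only.
sources = {
    "disk error past ECC": 1e-6,
    "memory bit flip": 1e-7,
    "network error past checksum": 1e-8,
    "hash collision": 1e-30,
}

def p_any(probs):
    """Probability that at least one of several independent events occurs.

    Computed as 1 - prod(1 - p), via log1p/expm1 to keep tiny values accurate.
    """
    return -math.expm1(sum(math.log1p(-p) for p in probs))

baseline = p_any(sources.values())

# Eliminate the collision risk entirely: the total does not move.
no_collision = p_any(p for name, p in sources.items() if name != "hash collision")

# Halve the largest risk instead: the total drops by almost half.
halved = dict(sources, **{"disk error past ECC": 0.5e-6})
halved_disk = p_any(halved.values())

print(baseline, no_collision, halved_disk)
```

With numbers of this shape, removing the smallest term is invisible in the total, while improving the largest one dominates it.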

The truth is that the first time I read the venti
papers, I was bothered the same way.  Yes, there
can be problems, but generally we design systems
where the design itself doesn't contain any known
sources of failure.  In venti, we have.  And it
bugged me for quite a while.  But when I finally
realized objectively that the probability of a
hardware failure is orders of magnitude greater
than a collision, I started to accept that venti
is a very well-designed system, and is as reliable
as any other form of archive.

BLS


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  6:16             ` Enrico Weigelt
@ 2008-03-06 18:50               ` ron minnich
  2008-03-06 19:43                 ` Charles Forsyth
  2008-03-06 19:45               ` Paul Lalonde
  1 sibling, 1 reply; 67+ messages in thread
From: ron minnich @ 2008-03-06 18:50 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 5, 2008 at 10:16 PM, Enrico Weigelt <weigelt@metux.de> wrote:
> * Bruce Ellis <bruce.ellis@gmail.com> wrote:
>  > it's even sillier, if everyone bought 1,000,000 times as many tickets
>  > guess how that would change the probabilities. not at all!
>
>  The main problem is: statistics is not reliable.

baloney. Statistics are quite reliable. It's why your computers work
at all. They are statistical beasts, with quantifiable uncertainty.
You don't get certainty, you get dialed-in uncertainty.

It's just that the engineering is so good you've fooled yourself into
thinking it's certain.

Think about this. You can warm up some CPUs to a point at which they will:
- transparently corrupt floating point computations
- not be warm enough to trigger the "I'm too hot" fault

Think about this: the probability of a perfectly bad packet getting
through a network layer with no detected error is non-zero.

Think about the bit error rate of disks -- it's non zero. But you
trust them for some reason, and you don't trust venti?

OK, why is that? If you understand that you might start to understand
why Venti is better than you know.

In any event, if you are going to try to make statistical arguments,
move away from anecdotal conjectures such as "my cousin bought a
lottery ticket on the same day I got hit by lightning" and move into
the math. That's what the tools are for, so use them :-)

ron


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 16:58     ` Russ Cox
@ 2008-03-06 18:16       ` andrey mirtchovski
  0 siblings, 0 replies; 67+ messages in thread
From: andrey mirtchovski @ 2008-03-06 18:16 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> it gets even sillier

oh, indeed it does: i've just had my coffee. here's the result:

got a collision? store all colliding blocks and return a random one
the next time someone wants to access the colliding hash. compared to
the 1/10⁹⁰ probability of collision, the 1/2 (most likely) probability
that you'll read the correct colliding block certainly looks like a 1
:)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 15:09             ` Charles Forsyth
@ 2008-03-06 17:09               ` Robert Raschke
  2008-03-10 10:19               ` sqweek
  1 sibling, 0 replies; 67+ messages in thread
From: Robert Raschke @ 2008-03-06 17:09 UTC (permalink / raw)
  To: 9fans

Hi,

as far as I understand, there was recently a finding that SHA1 (or
MD5, can't remember off the top of my head) is potentially unsafe to
be used as a SIGNATURE of a document.  This is because somebody
managed to CONSTRUCT a text that ended up getting the same hash as
another (this is apparently not the easiest thing to do either).  And
that leads to potential falsification of data while still having a
supposedly valid signature.

This is completely different to what venti uses hashes for, where the
hash is computed on REAL (not constructed) data blocks for indexing
purposes.  If you manage to go out of your way and construct a block
that ends up clashing with an existing hash index, it doesn't matter,
because you won't break the existing data with it!

I get the impression that the former clouds the understanding of the
latter.

Robby


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06 12:39   ` Enrico Weigelt
@ 2008-03-06 16:58     ` Russ Cox
  2008-03-06 18:16       ` andrey mirtchovski
  0 siblings, 1 reply; 67+ messages in thread
From: Russ Cox @ 2008-03-06 16:58 UTC (permalink / raw)
  To: weigelt, 9fans

> (we couldn't use hashing for traffic reductions, safely).

yes you can.  you can use hashes to build a hash table
with a collision policy.  there is some company
(whose name escapes me; maybe someone else will
remember) that makes exactly this product, so that
once network A has sent a particular chunk of data to
network B once, future transmissions are replaced
transparently with a shorter name.  kind of like
lempel-ziv on steroids.  apparently it makes
cross-country ms exchange servers and file servers
much more bearable.
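
A minimal sketch of that idea, assuming a sender and a receiver that each keep a table keyed by SHA-1. This is not the product mentioned above and not venti's code; the class and method names are hypothetical. The collision policy here is simply: compare the bytes before trusting the name, and send the data in full if they differ.

```python
import hashlib

class Sender:
    def __init__(self):
        self.sent = {}  # sha1 digest -> first chunk sent under that digest

    def encode(self, chunk: bytes):
        key = hashlib.sha1(chunk).digest()
        known = self.sent.get(key)
        if known is None:
            self.sent[key] = chunk
            return ("data", chunk)
        if known == chunk:
            return ("ref", key)   # 20-byte name replaces the chunk
        return ("data", chunk)    # collision: never trust the hash alone

class Receiver:
    def __init__(self):
        self.got = {}  # mirrors Sender.sent

    def decode(self, kind: str, payload: bytes) -> bytes:
        if kind == "ref":
            return self.got[payload]
        # Keep the first chunk seen per digest, exactly as the sender does.
        self.got.setdefault(hashlib.sha1(payload).digest(), payload)
        return payload

tx, rx = Sender(), Receiver()
chunk = b"x" * 8192
first = tx.encode(chunk)    # sent in full the first time
second = tx.encode(chunk)   # sent as a short reference afterwards
assert rx.decode(*first) == chunk
assert rx.decode(*second) == chunk
```

The byte comparison costs a lookup of the stored chunk on the sender side only, so the traffic saving is kept while a collision can no longer substitute the wrong data.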

> it would be an interesting feature. Of course the fs on top then
> MUST refresh from time to time, but this can be done while the
> system is idle (good for situations with high load peaks and enough
> idle time on the other hand).

sorry, but this is just a fantastically terrible idea.
you're taking a reliable system and making it unreliable.

if you were really concerned, it would be better
to implement a garbage collector that you could
hand a root set.  even that would worry me (a simple
bug would wipe out your entire archive), but it
wouldn't be as bad as relying on timeouts.

> For this I need to be *sure* that there will be
> *no* collisions, even if the system runs for a long time and
> grows really big (maybe several PB on thousands of nodes).
>
> Another interesting question: can the risk of collisions be
> reduced by combining several different hash functions in
> parallel ?

sure.  use sha-256 and your probability of collision goes
down even further.  but *you* (probably) still won't be *sure*.

russ


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  6:40           ` Enrico Weigelt
  2008-03-06 14:35             ` erik quanstrom
  2008-03-06 14:58             ` Tom Lieber
@ 2008-03-06 15:09             ` Charles Forsyth
  2008-03-06 17:09               ` Robert Raschke
  2008-03-10 10:19               ` sqweek
  2 siblings, 2 replies; 67+ messages in thread
From: Charles Forsyth @ 2008-03-06 15:09 UTC (permalink / raw)
  To: weigelt, 9fans

> But for HA applications, we still need some additional redundancy
> or at least some error diagnostics at application level. Well,
> we'll most likely need this anyway, e.g. to detect human error
> or code bugs.

i hadn't realised the code i'd quoted only dealt with blocks in memory
(i didn't look hard enough once i'd found it), but russ then pointed out
that another option will do something like the check i'd intended.

given that, you have at least a check and a diagnostic that the
unlikely event occurred.  it isn't the case i'd worry about first.  after all, the applications
pull the stuff into memory across interfaces that might have at most a parity
check, after transmission using protocols that use a fairly simple 16-bit
check sum, a compromise between speed of calculation and effectiveness.
one might sometimes add an end-to-end check, or digesting ... perhaps using SHA1!


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  6:40           ` Enrico Weigelt
  2008-03-06 14:35             ` erik quanstrom
@ 2008-03-06 14:58             ` Tom Lieber
  2008-03-06 15:09             ` Charles Forsyth
  2 siblings, 0 replies; 67+ messages in thread
From: Tom Lieber @ 2008-03-06 14:58 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Mar 6, 2008 at 1:40 AM, Enrico Weigelt <weigelt@metux.de> wrote:
>  My current idea is to use two separate hash functions in parallel
>  (as many sw distros already do). But I've got no idea if this
>  really helps or collisions in SHA-1 will often go parallel with
>  collisions in the second hash (eg. MD5).

Two lightning strikes at the same time as two lotteries being won. You
cannot eliminate the chances of collision this way. What are you
trying to achieve with this thread?

--
Tom Lieber
http://AllTom.com/


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  6:40           ` Enrico Weigelt
@ 2008-03-06 14:35             ` erik quanstrom
  2008-03-06 14:58             ` Tom Lieber
  2008-03-06 15:09             ` Charles Forsyth
  2 siblings, 0 replies; 67+ messages in thread
From: erik quanstrom @ 2008-03-06 14:35 UTC (permalink / raw)
  To: weigelt, 9fans

> But for HA applications, we still need some additional redundancy
> or at least some error diagnostics at application level. Well,
> we'll most likely need this anyway, e.g. to detect human error
> or code bugs.
>
> My current idea is to use two separate hash functions in parallel
> (as many sw distros already do). But I've got no idea if this
> really helps or collisions in SHA-1 will often go parallel with
> collisions in the second hash (eg. MD5).

adding a second hash will likely increase your failure rate as
the failure rate of storage is >> collision rate of sha1.  and
adding a second hash will increase your storage, thus increasing
your exposure to storage failure.

- erik


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-05 14:33 ` Russ Cox
@ 2008-03-06 12:39   ` Enrico Weigelt
  2008-03-06 16:58     ` Russ Cox
  0 siblings, 1 reply; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-06 12:39 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* Russ Cox <rsc@swtch.com> wrote:
> > 1. how stable is the keying ? sha-1 has only 160 bits, while
> >    data blocks may be up to 56k long. so, the mapping is only
> >    unique into one direction (not one-to-one). how can we be
> >    *really sure*, that - even on very large storages (TB or
> >    even PB) - data to each key is always (one-to-one) unique ?
>
> if you change lump.c to say
>
> 	int verifywrites = 1;
>
> then venti will check every block as it is written to make
> sure there is no hash collision.  this is not the default (anymore).
> the snippet that forsyth quoted only applies if the block is
> present in the in-memory cache.

Okay, for the centralised approach of venti, this will catch
collisions (BTW: what does it do if it detects one ?), but for
my idea of a distributed superstorage, this won't help much
(we couldn't use hashing for traffic reductions, safely).

> > 2. what approx. compression level could be assumed on large
> >    storages by putting together equal data blocks (on several
> >    kind of data, eg. typical office documents vs. media) ?
>
> i don't know of any studies that break it up by data type.
> the largest benefit is due to coalescing of exact files,
> and that simply depends on the duplicate rate.
> it's very workload dependent, and i don't know of any
> good studies to point you at.

hmm, already suspected that :(
Of course, for the fossil case, the reuse rate will be very high,
but that's not the only use of this approach I currently think of.

Much more interesting, IMHO, is the idea of using it to compress a
large data storage and also build an easily extendible and redundant
storage system out of it (maybe even replacing RAID and reducing disk
traffic by much more effective caching).

> > 3. what happens to the space consumption if venti is used as
> >    storage for heavily rw filesystems for a longer time
> >    (not as permanent archive) - how much space will be wasted ?
> >    should we add some method for block expiry (eg. timeouts
> >    of reference counters) ?
>
> no.  venti is for archiving.  if you don't want to use it for
> archiving, fine.  but timeouts and reference counts would
> break it for the rest of us (neither is guaranteed correct --
> what if a reference is written on a whiteboard or in a
> rarely-accessed safe-deposit box?).

Okay, let's assume some venti-2, which has timeouts, then it would
require some kind of refresh from time to time. For an eternal
archive, this of course would be more trouble than it's worth.
(would also require some way for reclaiming unused space).

But if it should be used as backend for some "normal" filesystem,
it would be an interesting feature. Of course the fs on top then
MUST refresh from time to time, but this can be done while the
system is idle (good for situations with high load peaks and enough
idle time on the other hand). Instead an explicit reference counting
must act immediately (okay, removing large trees could be moved to
background somehow).

> you can do snapshots in fossil instead of archives and
> those *can* be timed out.  but they don't use venti
> and don't coalesce storage as much as venti does,
> which is why timeouts are possible.

hmm, that doesn't satisfy me, I'd really like to have both
benefits together ;-P

As I'm already planning a distributed filesystem and going to
use a venti-like approach for it, I'll do some experiments with
some "venti-2", which allows reclaiming expired blocks.

> > 4. assuming #1 can be answered 100% yes - would it suit for
> >    an very large (several PB) heavily distributed storage
> >    (eg. for some kind of distributed, redundant filesystem) ?
>
> there are many systems that have been built using
> content-addressed storage, just not on top of venti.
> Here are a few.
>
> http://pdos.csail.mit.edu/pastwatch/
> http://pdos.csail.mit.edu/ivy/
> http://en.wikipedia.org/wiki/Distributed_hash_table

Thanks for the links. The text about DHT (and linked stuff) is
very interesting. My ideas were already near to the PHT and Trie :)

I'll have to do a bit more research to get redundancy/fault
tolerance and dynamic space allocation together. In an ideal
storage network, individual storage arenas (even within some
server node) can be added and removed easily on the fly.


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  5:40         ` Uriel
  2008-03-06  5:55           ` Bruce Ellis
@ 2008-03-06 12:26           ` erik quanstrom
  1 sibling, 0 replies; 67+ messages in thread
From: erik quanstrom @ 2008-03-06 12:26 UTC (permalink / raw)
  To: 9fans

> There is a possibility that a meteorite will crush your head any
> moment, there are some statistics about how probable this, but as you
> say, they are not reliable, so best go live in a very deep cave, just
> make sure there is no Internet access, the world will be grateful for
> it.

actually, the world runs on probability.  einstein was wrong.
god does play dice.  quantum mechanical equations are probability
equations.  the interactions between basic force-carrying particles are
probability equations.  and the best understanding is that
time itself runs in one direction due to probability.

pretty cool, eh?

- erik


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:15         ` andrey mirtchovski
  2008-03-06  4:31           ` Bruce Ellis
  2008-03-06  6:40           ` Enrico Weigelt
@ 2008-03-06  9:54           ` Wilhelm B. Kloke
  2008-03-08  9:37             ` Enrico Weigelt
  2 siblings, 1 reply; 67+ messages in thread
From: Wilhelm B. Kloke @ 2008-03-06  9:54 UTC (permalink / raw)
  To: 9fans

andrey mirtchovski <mirtchovski@gmail.com> schrieb:
>>  Well, cracking the lottery jackpot happens quite often (if people
>>  bought as many lottery tickets as we have distinct data
>>  blocks in larger data stores or in network traffic
>>  over several years, it would happen very regularly).
>
> i think what you fail to take into consideration is the fact, that
> even if the chance of a collision may be relatively high by your
> standards, the chance that the colliding blocks have data of any
> significance is very, very low. i.e., the algorithm for figuring out
> whether a hash collision will be important to you personally belongs
> to EXPSPACE, which, we all know, is filled with pr0n anyways.

This analysis is not entirely correct. The fact that SHA1 is a very
interesting target for a thorough search for possible collisions
makes it a lot more likely that such a collision will be found.
After this fact the colliding block is itself very interesting,
and it is also very likely that this block will be stored and archived
just for this reason.

In practice though, a filesystem relying on venti could just
change the block boundaries for this case or choose some other
escape from needing to store these special blocks.
--
Dipl.-Math. Wilhelm Bernhard Kloke
Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257
PGP: http://vestein.arb-phys.uni-dortmund.de/~wb/mypublic.key


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  5:38   ` Enrico Weigelt
@ 2008-03-06  9:44     ` Joel C. Salomon
  0 siblings, 0 replies; 67+ messages in thread
From: Joel C. Salomon @ 2008-03-06  9:44 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

On Thu, Mar 6, 2008 at 12:38 AM, Enrico Weigelt <weigelt@metux.de> wrote:
>  Okay, venti detects collisions. But what happens then ?

Check the code: it actually computes the bitwise XOR of the two blocks
and displays them as a PNG image.  This is the industry-standard way
of generating the schematic diagrams for the Infinite Improbability
Drive.

--Joel


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:15         ` andrey mirtchovski
  2008-03-06  4:31           ` Bruce Ellis
@ 2008-03-06  6:40           ` Enrico Weigelt
  2008-03-06 14:35             ` erik quanstrom
                               ` (2 more replies)
  2008-03-06  9:54           ` Wilhelm B. Kloke
  2 siblings, 3 replies; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-06  6:40 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* andrey mirtchovski <mirtchovski@gmail.com> wrote:

> i think what you fail to take into consideration is the fact, that
> even if the chance of a collision may be relatively high by your
> standards, the chance that the colliding blocks have data of any
> significance is very, very low.

Okay, valid point. For my personal things (eg. large media
collections) this would be perfectly fine, since the data isn't
that important and a few broken data blocks aren't that harmful
(losing a frame in a movie is even hard to notice).

But for HA applications, we still need some additional redundancy
or at least some error diagnostics at application level. Well,
we'll most likely need this anyway, eg. to detect human error
or code bugs.

My current idea is to use two separate hash functions in parallel
(as many sw distros already do). But I've got no idea if this
really helps, or whether collisions in SHA-1 will often coincide with
collisions in the second hash (eg. MD5).
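
A minimal sketch of the mechanics of such a two-hash key, assuming two
toy stand-in hashes (a byte sum and 32-bit FNV-1a) rather than SHA-1
and MD5. The names Key, sumhash, fnv1a, mkkey and keyeq are
hypothetical. The sketch does not answer whether collisions in two
real cryptographic hashes are correlated; it only shows that two
blocks are treated as equal when both components match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Key {
	uint32_t a;   /* first hash  */
	uint32_t b;   /* second hash */
} Key;

/* Weak stand-in hash: byte sum. Order-insensitive, so collisions are trivial. */
static uint32_t
sumhash(const unsigned char *p, size_t n)
{
	uint32_t h = 0;

	while(n-- > 0)
		h += *p++;
	return h;
}

/* 32-bit FNV-1a, a second non-cryptographic stand-in. */
static uint32_t
fnv1a(const unsigned char *p, size_t n)
{
	uint32_t h = 2166136261u;

	while(n-- > 0){
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/* Build the composite key from both hashes of the same data. */
static Key
mkkey(const void *data, size_t n)
{
	Key k;

	assert(data != NULL || n == 0);
	k.a = sumhash(data, n);
	k.b = fnv1a(data, n);
	return k;
}

/* Two keys match only if both components match. */
static int
keyeq(Key x, Key y)
{
	return x.a == y.a && x.b == y.b;
}
```

With these stand-ins, "ab" and "ba" collide under the byte sum but not
under FNV-1a, so the composite keys differ.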


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:31           ` Bruce Ellis
@ 2008-03-06  6:16             ` Enrico Weigelt
  2008-03-06 18:50               ` ron minnich
  2008-03-06 19:45               ` Paul Lalonde
  0 siblings, 2 replies; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-06  6:16 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* Bruce Ellis <bruce.ellis@gmail.com> wrote:
> it's even sillier, if everyone bought 1,000,000 times as many tickets
> guess how that would change the probabilities. not at all!

The main problem is: statistics are not reliable. You can only
estimate roughly how many times something will happen if you take
a really large number of tries. You can never be sure for a
single case.


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  5:40         ` Uriel
@ 2008-03-06  5:55           ` Bruce Ellis
  2008-03-11 18:34             ` Uriel
  2008-03-06 12:26           ` erik quanstrom
  1 sibling, 1 reply; 67+ messages in thread
From: Bruce Ellis @ 2008-03-06  5:55 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Uriel, is this guy just a clueless dick?  Or am I missing a milligram
of sense in a mountain of bullshit?

brucee

On Thu, Mar 6, 2008 at 4:40 PM, Uriel <uriel99@gmail.com> wrote:
> >  So we've seen again: statistics are *never* reliable. It only helps
> >  for vague decisions on very large masses, never for a single case.
>
> There is a possibility that a meteorite will crush your head any
> moment, there are some statistics about how probable this, but as you
> say, they are not reliable, so best go live in a very deep cave, just
> make sure there is no Internet access, the world will be grateful for
> it.
>
> uriel
>


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:04       ` Enrico Weigelt
                           ` (2 preceding siblings ...)
  2008-03-06  4:40         ` cummij
@ 2008-03-06  5:40         ` Uriel
  2008-03-06  5:55           ` Bruce Ellis
  2008-03-06 12:26           ` erik quanstrom
  3 siblings, 2 replies; 67+ messages in thread
From: Uriel @ 2008-03-06  5:40 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

>  So we've seen again: statistics are *never* reliable. It only helps
>  for vague decisions on very large masses, never for a single case.

There is a possibility that a meteorite will crush your head any
moment, there are some statistics about how probable this, but as you
say, they are not reliable, so best go live in a very deep cave, just
make sure there is no Internet access, the world will be grateful for
it.

uriel


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
       [not found] ` <a553f487750f88281db1cce3378577c7@terzarima.net>
@ 2008-03-06  5:38   ` Enrico Weigelt
  2008-03-06  9:44     ` Joel C. Salomon
  0 siblings, 1 reply; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-06  5:38 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* Charles Forsyth <forsyth@terzarima.net> wrote:
> >1. how stable is the keying ? sha-1 has only 160 bits, while
> >   data blocks may be up to 56k long. so, the mapping is only
> >   unique into one direction (not one-to-one). how can we be
> >   *really sure*, that - even on very large storages (TB or
> >   even PB) - data to each key is always (one-to-one) unique ?
>
> on a write, the computer will tell you if you ought to have bought
> that lottery ticket and stayed out of the rain:
<snip>

Okay, venti detects collisions. But what happens then ?
Does it simply refuse the write or is there a way for managing
hash-colliding data blocks ?

Of course, this can be worked around if we define that the key is
always server-generated and does not *always* reflect the hash
(from the client side: an arbitrary number), by
adding another bit which tells whether the key is the hash or
some (server-allocated) ID (eg. table entry).

For a strictly server-based model, this is perfectly fine.

BUT: I've got another idea in mind: a heavily distributed block
storage, which uses hash keys for block identification and also
pools together equal blocks (like venti does). Ideally each
block should be transmitted only once (as long as it is in the
local cache). For this I need to be *sure* that there will be
*no* collisions, even if the system runs for a long time and
grows really big (maybe several PB on thousands of nodes).

Another interesting question: can the risk of collisions be
reduced by combining several different hash functions in
parallel ?


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:40         ` cummij
@ 2008-03-06  5:15           ` Bruce Ellis
  0 siblings, 0 replies; 67+ messages in thread
From: Bruce Ellis @ 2008-03-06  5:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

my sister was once bitten by a moose.

On Thu, Mar 6, 2008 at 3:40 PM,  <cummij@rpi.edu> wrote:
> > About one or two years ago, it happened that someone in my city
> > cracked the jackpot.
>
> ahh, but was he hit by lightning *at the same time*?  and then did that
> pair of events happen to the *same person* 2^90 more times?
>
> john cummings


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:04       ` Enrico Weigelt
  2008-03-06  4:13         ` Bruce Ellis
  2008-03-06  4:15         ` andrey mirtchovski
@ 2008-03-06  4:40         ` cummij
  2008-03-06  5:15           ` Bruce Ellis
  2008-03-06  5:40         ` Uriel
  3 siblings, 1 reply; 67+ messages in thread
From: cummij @ 2008-03-06  4:40 UTC (permalink / raw)
  To: weigelt, 9fans

> About one or two years ago, it happened that someone in my city
> cracked the jackpot.

ahh, but was he hit by lightning *at the same time*?  and then did that
pair of events happen to the *same person* 2^90 more times?

john cummings


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:15         ` andrey mirtchovski
@ 2008-03-06  4:31           ` Bruce Ellis
  2008-03-06  6:16             ` Enrico Weigelt
  2008-03-06  6:40           ` Enrico Weigelt
  2008-03-06  9:54           ` Wilhelm B. Kloke
  2 siblings, 1 reply; 67+ messages in thread
From: Bruce Ellis @ 2008-03-06  4:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

it's even sillier, if everyone bought 1,000,000 times as many tickets
guess how that would change the probabilities. not at all!

brucee

On Thu, Mar 6, 2008 at 3:15 PM, andrey mirtchovski
<mirtchovski@gmail.com> wrote:
> >  Well, cracking the lottery jackpot happens quite often (if people
> >  bought as many lottery tickets as we have distinct data
> >  blocks in larger data stores or in network traffic
> >  over several years, it would happen very regularly).
>
> i think what you fail to take into consideration is the fact, that
> even if the chance of a collision may be relatively high by your
> standards, the chance that the colliding blocks have data of any
> significance is very, very low. i.e., the algorithm for figuring out
> whether a hash collision will be important to you personally belongs
> to EXPSPACE, which, we all know, is filled with pr0n anyways.
>
> cheerio!
>


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:04       ` Enrico Weigelt
  2008-03-06  4:13         ` Bruce Ellis
@ 2008-03-06  4:15         ` andrey mirtchovski
  2008-03-06  4:31           ` Bruce Ellis
                             ` (2 more replies)
  2008-03-06  4:40         ` cummij
  2008-03-06  5:40         ` Uriel
  3 siblings, 3 replies; 67+ messages in thread
From: andrey mirtchovski @ 2008-03-06  4:15 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

>  Well, cracking the lottery jackpot happens quite often (if people
>  bought as many lottery tickets as we have distinct data
>  blocks in larger data stores or in network traffic
>  over several years, it would happen very regularly).

i think what you fail to take into consideration is the fact, that
even if the chance of a collision may be relatively high by your
standards, the chance that the colliding blocks have data of any
significance is very, very low. i.e., the algorithm for figuring out
whether a hash collision will be important to you personally belongs
to EXPSPACE, which, we all know, is filled with pr0n anyways.

cheerio!


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-06  4:04       ` Enrico Weigelt
@ 2008-03-06  4:13         ` Bruce Ellis
  2008-03-06  4:15         ` andrey mirtchovski
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 67+ messages in thread
From: Bruce Ellis @ 2008-03-06  4:13 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

did you miss the 2^90?  rather a lot really compared to lottery (~2^24).

brucee

On Thu, Mar 6, 2008 at 3:04 PM, Enrico Weigelt <weigelt@metux.de> wrote:
> * geoff@plan9.bell-labs.com <geoff@plan9.bell-labs.com> wrote:
>
> Hi,
>
> > From the fortune file:
> > You are roughly 2^90 times more likely to win a U.S. state lottery
> > *and* be struck by lightning simultaneously than you are to encounter
> > [an accidental SHA1 collision] in your file system.  - J. Black
>
> Well, cracking the lottery jackpot happens quite often (if people
> bought as many lottery tickets as we have distinct data
> blocks in larger data stores or in network traffic
> over several years, it would happen very regularly).
> Even the number of lottery players in a smaller city with quite
> low incomes (so people can't afford to play regularly) is quite
> small (compared to the rest of the country). The chance of being
> resident in one specific of these small cities is also quite low.
> About one or two years ago, it happened that someone in my city
> cracked the jackpot.
> Now let's imagine how many of the people who play the lottery
> (in my family, there's exactly 1 - people who play lottery most
> likely have to believe there's a chance to win or simply don't
> know how to spend their money, also a quite small percentage of
> the population) don't want to have the prize (for themselves) ?
> Exactly this happened here.
> And now take those people (winning, but not wanting the prize)
> and let's see how many of them don't even want to donate their win
> to certain projects (neither funding, science or social projects),
> especially in a region where social projects are *very* needed but
> are dramatically underfunded (eg. very bad financial situation of
> medical or social care facilities) and many people are even too
> poor to give their children appropriate food and clothes.
> Exactly this happened here: the winner really *refused* the win
> and so gave it away to the lottery company.
>
> I really can't say how low the probability of such events is,
> but I suspect it's *extremely* low. Although I know really a lot
> of people, I cannot imagine a single one who might even
> think about such a decision.
>
> IMHO, such an event (winning && in my local city && refusing the win
> && the regional public and personal poverty) is nearly impossible.
> BUT: it really happened !
>
> So we've seen again: statistics are *never* reliable. It only helps
> for vague decisions on very large masses, never for a single case.
>
>
> cu
> --
> ---------------------------------------------------------------------
>  Enrico Weigelt    ==   metux IT service - http://www.metux.de/
> ---------------------------------------------------------------------
>  Please visit the OpenSource QM Taskforce:
>        http://wiki.metux.de/public/OpenSource_QM_Taskforce
>  Patches / Fixes for a lot dozens of packages in dozens of versions:
>        http://patches.metux.de/
> ---------------------------------------------------------------------
>


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
       [not found]     ` <7f575fa27b41329b9ae24f40e6e5a3cd@plan9.bell-labs.com>
@ 2008-03-06  4:04       ` Enrico Weigelt
  2008-03-06  4:13         ` Bruce Ellis
                           ` (3 more replies)
  0 siblings, 4 replies; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-06  4:04 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* geoff@plan9.bell-labs.com <geoff@plan9.bell-labs.com> wrote:

Hi,

> > From the fortune file:
> You are roughly 2^90 times more likely to win a U.S. state lottery
> *and* be struck by lightning simultaneously than you are to encounter
> [an accidental SHA1 collision] in your file system.  - J. Black

Well, cracking the lottery jackpot happens quite often (if people
bought as many lottery tickets as we have distinct data
blocks in larger data stores or in network traffic
over several years, it would happen very regularly).
Even the number of lottery players in a smaller city with quite
low incomes (so people can't afford to play regularly) is quite
small (compared to the rest of the country). The chance of being
resident in one specific of these small cities is also quite low.
About one or two years ago, it happened that someone in my city
cracked the jackpot.
Now let's imagine how many of the people who play the lottery
(in my family, there's exactly 1 - people who play lottery most
likely have to believe there's a chance to win or simply don't
know how to spend their money, also a quite small percentage of
the population) don't want to have the prize (for themselves) ?
Exactly this happened here.
And now take those people (winning, but not wanting the prize)
and let's see how many of them don't even want to donate their win
to certain projects (neither funding, science or social projects),
especially in a region where social projects are *very* needed but
are dramatically underfunded (eg. very bad financial situation of
medical or social care facilities) and many people are even too
poor to give their children appropriate food and clothes.
Exactly this happened here: the winner really *refused* the win
and so gave it away to the lottery company.

I really can't say how low the probability of such events is,
but I suspect it's *extremely* low. Although I know really a lot
of people, I cannot imagine a single one who might even
think about such a decision.

IMHO, such an event (winning && in my local city && refusing the win
&& the regional public and personal poverty) is nearly impossible.
BUT: it really happened !

So we've seen again: statistics are *never* reliable. It only helps
for vague decisions on very large masses, never for a single case.


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-05 14:03 erik quanstrom
@ 2008-03-05 16:00 ` Russ Cox
  0 siblings, 0 replies; 67+ messages in thread
From: Russ Cox @ 2008-03-05 16:00 UTC (permalink / raw)
  To: 9fans

> "cryptographically strong".  the invented function may have
> the same probability of collision as sha-1, but it is not
> cryptographically strong.

let's take this discussion elsewhere, since cryptographically
strong just means "we don't know how to attack it yet"
(witness md5).  surely there is a crypto list deserving of
this non-plan 9 traffic.

> i think that venti has a different problem.  indexing by
> sha-1 hash trades time and index lookups for space.
> but disk space is cheap relative to our needs, and the table
> lookups and fragmentation that venti implies result
> in a lot of random i/o.  modern disks are at least 25x
> faster doing sequential i/o.

this isn't fair to venti.  yes, there is a performance cost.
but venti is interesting because it creates new functionality.
the main one is that you can build very naive systems that
don't worry about wasting disk space, because underneath
they're not.  for example, i back up 1.4TB of FFS file systems
every night by copying them to a venti server.  i don't care
how cheap disk is: if you're creating 1.4TB of disk per night
you're going to run out of disk pretty soon.  using venti,
i have about 3 years of backups stored in 1.7TB of space.
there are other interesting properties too, like the log
structure making it easy to sync venti servers against each
other even across long-distance links.

venti is for archival storage.  if you need a super fast live copy
then you want to put something in front.  (and if you want
a super fast live copy with no archival, just don't use venti.)

russ


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-05  4:00 Enrico Weigelt
                   ` (2 preceding siblings ...)
  2008-03-05  8:43 ` Charles Forsyth
@ 2008-03-05 14:33 ` Russ Cox
  2008-03-06 12:39   ` Enrico Weigelt
       [not found] ` <a553f487750f88281db1cce3378577c7@terzarima.net>
  4 siblings, 1 reply; 67+ messages in thread
From: Russ Cox @ 2008-03-05 14:33 UTC (permalink / raw)
  To: weigelt, 9fans

> 1. how stable is the keying ? sha-1 has only 160 bits, while
>    data blocks may be up to 56k long. so, the mapping is only
>    unique into one direction (not one-to-one). how can we be
>    *really sure*, that - even on very large storages (TB or
>    even PB) - data to each key is always (one-to-one) unique ?

if you change lump.c to say

	int verifywrites = 1;

then venti will check every block as it is written to make
sure there is no hash collision.  this is not the default (anymore).
the snippet that forsyth quoted only applies if the block is
present in the in-memory cache.
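
A minimal sketch of what a verify-on-write check amounts to, assuming
a tiny in-memory store and a deliberately weak toy score (a byte sum,
not SHA-1) so that a collision is easy to provoke. Only the flag name
verifywrites comes from the text above; Store, Block, toyscore and
storewrite are hypothetical and do not reflect venti's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAXBLOCKS = 16, MAXDATA = 32 };
enum { WriteNew, WriteDup, WriteCollision, WriteFull };

typedef struct Block {
	uint32_t score;
	size_t len;
	unsigned char data[MAXDATA];
} Block;

typedef struct Store {
	Block blocks[MAXBLOCKS];
	int nblocks;
	int verifywrites;   /* flag name from the mail; semantics assumed */
} Store;

/* Toy score (byte sum), NOT SHA-1; weak on purpose. */
static uint32_t
toyscore(const unsigned char *p, size_t n)
{
	uint32_t h = 0;

	while(n-- > 0)
		h += *p++;
	return h;
}

static int
storewrite(Store *s, const void *data, size_t len)
{
	uint32_t score;
	int i;

	assert(s != NULL && data != NULL && len <= MAXDATA);
	score = toyscore(data, len);
	for(i = 0; i < s->nblocks; i++){
		Block *b = &s->blocks[i];
		if(b->score != score)
			continue;
		/* Same score: without verification we assume same data. */
		if(s->verifywrites && (b->len != len || memcmp(b->data, data, len) != 0))
			return WriteCollision;
		return WriteDup;
	}
	if(s->nblocks == MAXBLOCKS)
		return WriteFull;
	s->blocks[s->nblocks].score = score;
	s->blocks[s->nblocks].len = len;
	memcpy(s->blocks[s->nblocks].data, data, len);
	s->nblocks++;
	return WriteNew;
}
```

With the flag set, writing "ab" and then "ba" reports a collision;
with it cleared, the second write is silently treated as a duplicate
of the first. The cost of the check is one read and compare of the
stored block for every write whose score is already present.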

> 2. what approx. compression level could be assumed on large
>    storages by putting together equal data blocks (on several
>    kind of data, eg. typical office documents vs. media) ?

i don't know of any studies that break it up by data type.
the largest benefit is due to coalescing of exact files,
and that simply depends on the duplicate rate.
it's very workload dependent, and i don't know of any
good studies to point you at.

> 3. what happens to space consumption if venti is used as
>    storage for heavily rw filesystems for a longer time
>    (not as permanent archive) - how much space will be wasted ?
>    should we add some method for block expiry (eg. timeouts
>    of reference counters) ?

no.  venti is for archiving.  if you don't want to use it for
archiving, fine.  but timeouts and reference counts would
break it for the rest of us (neither is guaranteed correct --
what if a reference is written on a whiteboard or in a
rarely-accessed safe-deposit box?).

you can do snapshots in fossil instead of archives and
those *can* be timed out.  but they don't use venti
and don't coalesce storage as much as venti does,
which is why timeouts are possible.

> 4. assuming #1 can be answered 100% yes - would it suit for
>    an very large (several PB) heavily distributed storage
>    (eg. for some kind of distributed, redundant filesystem) ?

there are many systems that have been built using
content-addressed storage, just not on top of venti.
Here are a few.

http://pdos.csail.mit.edu/pastwatch/
http://pdos.csail.mit.edu/ivy/
http://en.wikipedia.org/wiki/Distributed_hash_table

russ


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [9fans] thoughs about venti+fossil
@ 2008-03-05 14:03 erik quanstrom
  2008-03-05 16:00 ` Russ Cox
  0 siblings, 1 reply; 67+ messages in thread
From: erik quanstrom @ 2008-03-05 14:03 UTC (permalink / raw)
  To: 9fans

> >> http://www.nmt.edu/~val/review/hash/index.html
> >>
> >> Not that this analysis is without flaws, though.
> >
> > have you invented the 9fans.net effect?
> 
> Meaning? I guess the reference went over my head.

the link was inaccessible when i tried to access it.
i figured the combined traffic of 9fans brought it down. ;-).

> > this link may or may not be similar.  but it is on point:
> > http://www.valhenson.org/review/hash.pdf
> 
> I believe it to be exactly the same paper.
> 
> > do you care to elaborate on the flaws of this analysis?
> 
> I tend to agree with counter arguments published here:
>     http://monotone.ca/docs/Hash-Integrity.html
> I'm not an expert in this field (although I dabbled
> in cryptography somewhat given my math background) and
> thus I would love if somebody can show that the
> counter arguments don't stand.
> 

the analysis in §4.1 is just wrong.  pedantically, i can't get
past the fact that the paper talks about "sha-1(1)" and
"sha-1(x), x>0".  i'm not sure what that means since sha-1
operates on blocks not integers.  but the real problem is
that the author doesn't appear to understand
"cryptographically strong".  the invented function may have
the same probability of collision as sha-1, but it is not
cryptographically strong.

also, i think the author doesn't fully appreciate the
power of really big numbers.  you'd need 10^(12+3.2)/2
tb hard drives *full of data* to have a reasonable chance
of a hash collision with 8k blocks.  at $250 each, this
would cost 2.48e17 dollars.

i'm pretty sure that there are other limits in venti that
kick in before 9000 yottabytes.  that's not in standard
si form because yotta is the biggest si prefix i can find.
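
A minimal sketch of the arithmetic behind this kind of estimate,
assuming scores behave like uniformly random 160-bit values and using
the union bound (number of block pairs divided by number of possible
scores) as an upper limit on the accidental-collision probability.
The function names pow2, collisionbound and blocksin are hypothetical.

```c
#include <assert.h>

/* 2^bits as a double, by repeated doubling (avoids needing libm). */
static double
pow2(int bits)
{
	double x = 1.0;

	assert(bits >= 0 && bits < 1000);
	while(bits-- > 0)
		x *= 2.0;
	return x;
}

/*
 * Upper bound on the probability that any two of nblocks distinct
 * blocks share a hashbits-bit score, assuming uniformly random scores:
 * (number of pairs) / (number of possible scores).
 */
static double
collisionbound(double nblocks, int hashbits)
{
	double pairs = nblocks * (nblocks - 1.0) / 2.0;

	return pairs / pow2(hashbits);
}

/* Number of blocks needed to fill nbytes at blocksize bytes per block. */
static double
blocksin(double nbytes, double blocksize)
{
	assert(blocksize > 0);
	return nbytes / blocksize;
}
```

For example, collisionbound(blocksin(1e15, 8192), 160) bounds the risk
for a petabyte of distinct 8k blocks and comes out below 1e-26. The
bound only reaches one half at about 2^80 blocks, which at 8k per
block is 2^93 bytes, on the order of the yottabyte figure above.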

i think that venti has a different problem.  indexing by
sha-1 hash trades time and index lookups for space.
but disk space is cheap relative to our needs, and the table
lookups and fragmentation that venti implies result
in a lot of random i/o.  modern disks are at least 25x
faster doing sequential i/o.

- erik


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-05  8:43 ` Charles Forsyth
@ 2008-03-05  9:05   ` Gorka Guardiola
  0 siblings, 0 replies; 67+ messages in thread
From: Gorka Guardiola @ 2008-03-05  9:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Mar 5, 2008 at 9:43 AM, Charles Forsyth <forsyth@terzarima.net> wrote:
>
>  on a write, the computer will tell you if you ought to have bought


Has anyone ever seen a collision? I have played a lot at this lottery
without luck (or is it with? :-)).
--
- curiosity sKilled the cat


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-05  4:00 Enrico Weigelt
  2008-03-05  4:11 ` Roman Shaposhnik
  2008-03-05  5:04 ` geoff
@ 2008-03-05  8:43 ` Charles Forsyth
  2008-03-05  9:05   ` Gorka Guardiola
  2008-03-05 14:33 ` Russ Cox
       [not found] ` <a553f487750f88281db1cce3378577c7@terzarima.net>
  4 siblings, 1 reply; 67+ messages in thread
From: Charles Forsyth @ 2008-03-05  8:43 UTC (permalink / raw)
  To: weigelt, 9fans

>1. how stable is the keying ? sha-1 has only 160 bits, while
>   data blocks may be up to 56k long. so, the mapping is only
>   unique into one direction (not one-to-one). how can we be
>   *really sure*, that - even on very large storages (TB or
>   even PB) - data to each key is always (one-to-one) unique ?

on a write, the computer will tell you if you ought to have bought
that lottery ticket and stayed out of the rain:

	u = lookuplump(score, type);
	if(u->data != nil){
		...
		if(packetcmp(p, u->data) != 0){
			...
			if(scorecmp(u->score, score) != 0)
				seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);
			else if(scorecmp(u->score, nscore) != 0)
				seterr(EStrange, "lookuplump returned bad data %V not %V", nscore, u->score);
			else
				seterr(EStrange, "score collision %V", score);


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-05  5:52   ` Enrico Weigelt
  2008-03-05  6:24     ` geoff
@ 2008-03-05  6:35     ` Taj Khattra
       [not found]     ` <7f575fa27b41329b9ae24f40e6e5a3cd@plan9.bell-labs.com>
  2 siblings, 0 replies; 67+ messages in thread
From: Taj Khattra @ 2008-03-05  6:35 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

>  Tells me what I already suspected: compare-by-hash can be a
>  dangerous game (even if very uncertain).

does this tell you anything you already suspected?

http://www.usenix.org/event/usenix06/tech/full_papers/black/black_html/index.html


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [9fans] thoughs about venti+fossil
  2008-03-05  5:52   ` Enrico Weigelt
@ 2008-03-05  6:24     ` geoff
  2008-03-05  6:35     ` Taj Khattra
       [not found]     ` <7f575fa27b41329b9ae24f40e6e5a3cd@plan9.bell-labs.com>
  2 siblings, 0 replies; 67+ messages in thread
From: geoff @ 2008-03-05  6:24 UTC (permalink / raw)
  To: weigelt, 9fans

From the fortune file:
You are roughly 2^90 times more likely to win a U.S. state lottery *and* be struck by lightning simultaneously than you are to encounter [an accidental SHA1 collision] in your file system.  - J. Black
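
For a sense of scale, the usual birthday bound says that among n distinct blocks hashed to b bits, the chance that any two share a score is at most n(n-1)/2 / 2^b, which is less than n^2 / 2^(b+1). The sketch below computes an exponent e such that this probability is at most 2^e. It assumes the hash behaves like a uniformly random function, so it covers accidental collisions only and says nothing about deliberate attacks. The names `ceillog2` and `collisionexp` are made up for this example.

```c
#include <stdint.h>

/* smallest k with 2^k >= n (n >= 1) */
static int
ceillog2(uint64_t n)
{
	int k = 0;

	while(k < 64 && ((uint64_t)1 << k) < n)
		k++;
	return k;
}

/*
 * exponent e such that P(some accidental collision) <= 2^e
 * for nblocks distinct blocks and a bits-wide score,
 * using n(n-1)/2 / 2^bits < n^2 / 2^(bits+1).
 */
static int
collisionexp(uint64_t nblocks, int bits)
{
	return 2*ceillog2(nblocks) - 1 - bits;
}
```

As an example under an assumed 8 KB block size, 1 PB is 2^50 / 2^13 = 2^37 blocks, and `collisionexp((uint64_t)1 << 37, 160)` returns -87, i.e. a probability of at most 2^-87 for a 160-bit score.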



* Re: [9fans] thoughs about venti+fossil
  2008-03-05  4:11 ` Roman Shaposhnik
  2008-03-05  4:43   ` erik quanstrom
@ 2008-03-05  5:52   ` Enrico Weigelt
  2008-03-05  6:24     ` geoff
                       ` (2 more replies)
  1 sibling, 3 replies; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-05  5:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* Roman Shaposhnik <rvs@sun.com> wrote:
> On Mar 4, 2008, at 8:00 PM, Enrico Weigelt wrote:
> >some thoughts about venti that go around in my mind:
> >
> >1. how stable is the keying ? sha-1 has only 160 bits, while
> >   data blocks may be up to 56k long. so, the mapping is only
> >   unique in one direction (not one-to-one). how can we be
> >   *really sure*, that - even on very large storages (TB or
> >   even PB) - data to each key is always (one-to-one) unique ?
>
> http://www.nmt.edu/~val/review/hash/index.html

That tells me what I already suspected: compare-by-hash can be a
dangerous game (even if a collision is very unlikely). So I wouldn't
count current venti as secure and stable if a store is large and
used for a long time.

cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------



* Re: [9fans] thoughs about venti+fossil
  2008-03-05  4:43   ` erik quanstrom
@ 2008-03-05  5:09     ` Roman Shaposhnik
  0 siblings, 0 replies; 67+ messages in thread
From: Roman Shaposhnik @ 2008-03-05  5:09 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mar 4, 2008, at 8:43 PM, erik quanstrom wrote:

>> On Mar 4, 2008, at 8:00 PM, Enrico Weigelt wrote:
>>> some thoughts about venti that go around in my mind:
>>>
>>> 1. how stable is the keying ? sha-1 has only 160 bits, while
>>>    data blocks may be up to 56k long. so, the mapping is only
>>>    unique in one direction (not one-to-one). how can we be
>>>    *really sure*, that - even on very large storages (TB or
>>>    even PB) - data to each key is always (one-to-one) unique ?
>>
>> http://www.nmt.edu/~val/review/hash/index.html
>>
>> Not that this analysis is without flaws, though.
>
> have you invented the 9fans.net effect?

Meaning? I guess the reference went over my head.

> this link may or may not be similar.  but it is on point:
> http://www.valhenson.org/review/hash.pdf

I believe it to be exactly the same paper.

> do you care to elaborate on the flaws of this analysis?

I tend to agree with the counter arguments published here:
    http://monotone.ca/docs/Hash-Integrity.html
I'm not an expert in this field (although I dabbled
in cryptography somewhat given my math background) and
thus I would love it if somebody could show that the
counter arguments don't stand.

Thanks,
Roman.



* Re: [9fans] thoughs about venti+fossil
  2008-03-05  4:00 Enrico Weigelt
  2008-03-05  4:11 ` Roman Shaposhnik
@ 2008-03-05  5:04 ` geoff
  2008-03-05  8:43 ` Charles Forsyth
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 67+ messages in thread
From: geoff @ 2008-03-05  5:04 UTC (permalink / raw)
  To: weigelt, 9fans

I think that /sys/doc/venti/venti.pdf answers your questions.



* Re: [9fans] thoughs about venti+fossil
  2008-03-05  4:11 ` Roman Shaposhnik
@ 2008-03-05  4:43   ` erik quanstrom
  2008-03-05  5:09     ` Roman Shaposhnik
  2008-03-05  5:52   ` Enrico Weigelt
  1 sibling, 1 reply; 67+ messages in thread
From: erik quanstrom @ 2008-03-05  4:43 UTC (permalink / raw)
  To: 9fans

> On Mar 4, 2008, at 8:00 PM, Enrico Weigelt wrote:
> > some thoughts about venti that go around in my mind:
> >
> > 1. how stable is the keying ? sha-1 has only 160 bits, while
> >    data blocks may be up to 56k long. so, the mapping is only
> >    unique in one direction (not one-to-one). how can we be
> >    *really sure*, that - even on very large storages (TB or
> >    even PB) - data to each key is always (one-to-one) unique ?
>
> http://www.nmt.edu/~val/review/hash/index.html
>
> Not that this analysis is without flaws, though.

have you invented the 9fans.net effect?

this link may or may not be similar.  but it is on point:
http://www.valhenson.org/review/hash.pdf

do you care to elaborate on the flaws of this analysis?

- erik



* Re: [9fans] thoughs about venti+fossil
  2008-03-05  4:00 Enrico Weigelt
@ 2008-03-05  4:11 ` Roman Shaposhnik
  2008-03-05  4:43   ` erik quanstrom
  2008-03-05  5:52   ` Enrico Weigelt
  2008-03-05  5:04 ` geoff
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 67+ messages in thread
From: Roman Shaposhnik @ 2008-03-05  4:11 UTC (permalink / raw)
  To: weigelt, Fans of the OS Plan 9 from Bell Labs

On Mar 4, 2008, at 8:00 PM, Enrico Weigelt wrote:
> some thoughts about venti that go around in my mind:
>
> 1. how stable is the keying ? sha-1 has only 160 bits, while
>    data blocks may be up to 56k long. so, the mapping is only
>    unique in one direction (not one-to-one). how can we be
>    *really sure*, that - even on very large storages (TB or
>    even PB) - data to each key is always (one-to-one) unique ?

http://www.nmt.edu/~val/review/hash/index.html

Not that this analysis is without flaws, though.

Thanks,
Roman.



* [9fans] thoughs about venti+fossil
@ 2008-03-05  4:00 Enrico Weigelt
  2008-03-05  4:11 ` Roman Shaposhnik
                   ` (4 more replies)
  0 siblings, 5 replies; 67+ messages in thread
From: Enrico Weigelt @ 2008-03-05  4:00 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


Hi folks,


some thoughts about venti that go around in my mind:

1. how stable is the keying ? sha-1 has only 160 bits, while
   data blocks may be up to 56k long. so, the mapping is only
   unique in one direction (not one-to-one). how can we be
   *really sure*, that - even on very large storages (TB or
   even PB) - data to each key is always (one-to-one) unique ?

2. what approx. compression level could be assumed on large
   storages by putting together equal data blocks (on several
   kinds of data, eg. typical office documents vs. media) ?

3. what happens to the space consumption if venti is used as
   storage for heavily rw filesystems for a longer time
   (not as permanent archive) - how much space will be wasted ?
   should we add some method for block expiry (eg. timeouts
   of reference counters) ?

4. assuming #1 can be answered 100% yes - would it be suitable for
   a very large (several PB) heavily distributed storage
   (eg. for some kind of distributed, redundant filesystem) ?


cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------



