9fans - fans of the OS Plan 9 from Bell Labs
From: Charles Forsyth <charles.forsyth@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] SHA-1 collision and venti
Date: Mon, 27 Feb 2017 22:20:56 +0000
Message-ID: <CAOw7k5jZ9Kvf-gu7k=sOwON66G9fqX5PJN7A7_amdK2biN-R4w@mail.gmail.com>
In-Reply-To: <CAGMcHPr-fKV+7VuVpzxFgKawdauzgezYT=P=LuiQPLZgTP2X_A@mail.gmail.com>


I think venti could deal with it: Rwrite returns a score, Tread provides a
score, and the caller typically uses it as an opaque value. A caller that
does not could no longer rely on sha1(block)=score in either case, whether
venti returns a different sha1 or a new algorithm is used.
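
A minimal sketch of that idea, not venti's actual code: a toy in-memory
store whose write returns a score the caller treats as opaque, and which
hands back a modified score when two different blocks hash alike. The names
ToyStore, sha1_score and length_hash are hypothetical, and length_hash is a
deliberately weak stand-in for SHA-1 so that a collision is easy to provoke.

```python
import hashlib


def sha1_score(block):
    return hashlib.sha1(block).digest()


def length_hash(block):
    # Deliberately weak stand-in for SHA-1, used only to force collisions.
    return bytes([len(block) % 256])


class ToyStore:
    """Hypothetical in-memory block store; an assumption, not venti itself."""

    def __init__(self, hashfn=sha1_score):
        self.hashfn = hashfn
        self.blocks = {}  # score -> block

    def write(self, block):
        base = self.hashfn(block)
        score = base
        n = 0
        # Walk until we find a free score or the score already holding
        # this exact block; a different block there means a collision.
        while score in self.blocks and self.blocks[score] != block:
            n += 1
            score = self.hashfn(base + n.to_bytes(4, "big"))
        self.blocks[score] = block
        return score

    def read(self, score):
        return self.blocks[score]


store = ToyStore(length_hash)
first = store.write(b"aa")
second = store.write(b"bb")  # same length, so the weak hash collides
```

The caller only ever passes back what write returned, so it never notices
that the second score is not hashfn(block). The modified score depends on
the order of writes, which is why this sketch does not help once two
separately built stores are merged.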

In any case, fossil needs a fix to cope with venti returning "score
collision", to prevent it failing to archive once it hits the shattered
files, or rather the first venti-sized block of them.
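
One way a client might cope, sketched under assumptions and along the lines
of the suggestion quoted below: the store refuses a colliding write with an
error, and the client retries with a reversible change to the block (here,
appended zero bytes), recording the pad length next to the score so the
change can be undone on read. StrictStore, ScoreCollision, archive and
restore are hypothetical names; fossil's real data structures and venti's
block size limit are not modelled.

```python
import hashlib


class ScoreCollision(Exception):
    """Raised when a different block already holds the computed score."""


def sha1_score(block):
    return hashlib.sha1(block).digest()


def length_hash(block):
    # Deliberately weak stand-in for SHA-1, used only to force collisions.
    return bytes([len(block) % 256])


class StrictStore:
    """Hypothetical store that reports collisions instead of hiding them."""

    def __init__(self, hashfn=sha1_score):
        self.hashfn = hashfn
        self.blocks = {}  # score -> block

    def write(self, block):
        score = self.hashfn(block)
        old = self.blocks.get(score)
        if old is not None and old != block:
            raise ScoreCollision(score.hex())
        self.blocks[score] = block
        return score

    def read(self, score):
        return self.blocks[score]


def archive(store, block, max_pad=8):
    """Write block, padding it after a collision; return (score, pad)."""
    for pad in range(max_pad + 1):
        try:
            return store.write(block + bytes(pad)), pad
        except ScoreCollision:
            continue
    raise ScoreCollision("no free score within max_pad")


def restore(store, score, pad):
    """Read a block back and strip the pad bytes added by archive."""
    data = store.read(score)
    return data[:len(data) - pad]


store = StrictStore(length_hash)
a = archive(store, b"aa")
b = archive(store, b"bb")  # collides under the weak hash, so it gets padded
```

The cost is that every pointer to a block must carry the pad count as well
as the score, which is the "collision fixed" flag in the quoted suggestion.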

On Mon, 27 Feb 2017, 21:37 Riddler, <riddler876@gmail.com> wrote:

> I think much in the same vein as git, venti doesn't need to worry too
> much about collisions given the behavior when collisions occur is
> well-defined and sensible in both systems.
> It's second preimages that are more of a concern (and still not
> possible with SHA1). The lack of preimage attacks on SHA1 prevents
> people from maliciously creating a file with the same hash as one you
> created. They can only duplicate ones they created which should limit
> the scope of any maliciousness to stuff they have control over.
> At the point preimages are practical, I'd want to be long gone from
> SHA1 but IIRC even MD5 still has no practical second-preimage attacks
> so we're probably a long way off from there.
>
> Technically, anything relying on venti should handle the collision
> detected response gracefully, as it's always a possibility no matter
> the algorithm.
> If fossil doesn't handle it very well perhaps it's not venti that
> needs changing (given it detects & reports) but fossil.
> A top-of-the-head suggestion would be for fossil to respond to the
> collision notice by doing something to the block that can be undone
> later (as others above have hinted at) such as appending something,
> XOR, etc., marking it as such in its own data structures then passing
> it back to venti. It could then reverse the operation when retrieving
> the files with the 'collision fixed' flag set.
> I don't know how feasible that idea is (been a while since I looked at
> fossil) but worth looking into maybe? It would seem, at a cursory
> glance, to fix the problem for fossil+venti indefinitely at the cost of a
> minor computational overhead for retrieving collided files.
>
> As Charles pointed out, you could also just do that in venti, I guess
> it depends if the write API call contract in venti is "returns SHA1 of
> file" or "returns arbitrary file id".
> If the behavior was put into venti you couldn't assume the ID returned
> = sha1(block) anymore - but I don't know if anything relies on that
> behavior.
> As for venti, I wouldn't say 'no point' to an algorithm update, but
> I'd rather have fossil updated to manage to deal with collisions
> better first.
>
>
> On Mon, Feb 27, 2017 at 8:14 PM, Bakul Shah <bakul@bitblocks.com> wrote:
> > On Mon, 27 Feb 2017 19:02:29 GMT Charles Forsyth <charles.forsyth@gmail.com> wrote:
> >> On 27 February 2017 at 18:30, Charles Forsyth <charles.forsyth@gmail.com> wrote:
> >>
> >> > that's a separate argument that venti would never work for you, regardless
> >> > of the hash algorithm used.
> >
> >> since venti returns the resulting score from each write, and it knows
> >> whether there's been a collision,
> >> it appears it could return a modified score (having ensured that is now
> >> unique, "and the next judge said that's a very shaggy dog")
> >
> > Consider what can happen when you want to consolidate two venti
> > archives into another one. Each source venti has a different
> > file with the same hash. When you discover in the destination
> > venti that they collide, it is too late to return a modified
> > score -- you have to find and fix all pointer blocks that
> > refer to this block as well.
> >
> > In theory the chance of a random collision with SHA1 may be
> > 1 in 2^80 but we have existing files that collide (unlike the
> > hypothetical argument of someone wanting to store 10^21 byte
> > size files -- but if they can produce it, we can store it!).
> > Your argument is that since venti is readonly, existing data
> > in it is not vulnerable but not everyone stores their archives
> > on readonly medium.  Another argument would be that almost
> > always venti is privately used and unlikely to be accessible
> > to the badguys.  Yet another argument is that hardly anyone
> > uses venti so why even bother. These are behavior patterns
> > that are true today but why limit its usefulness?
> >
> > Just as we move archived data we care about to more modern
> > media (as we no longer have easy access to floppies, 9track
> > tapes, 1/4" streamer tape etc.), and update our crypto keys,
> > since they too have limited shelf-life, we can replace the use
> > of SHA1.  This is a fixable problem.  [It is much much worse
> > for git given the amount of s/w that relies on it. I think
> > it is a matter of time before someone comes up with a
> > collision between two different types of git objects (such as
> > a blob and a tree) but we'll let Linus worry about it :-)]
> >
> > The solution is to convert from sha1 to blake2b or something
> > strong and be prepared to move the data again in 10-20 years.
> >
>
>
