* [9fans] SHA-1 collision and venti
@ 2017-02-26 17:25 Bakul Shah
  2017-02-26 17:30 ` Jules Merit
  2017-02-26 18:16 ` Charles Forsyth
  0 siblings, 2 replies; 27+ messages in thread
From: Bakul Shah @ 2017-02-26 17:25 UTC
To: 9fans

https://arstechnica.com/security/2017/02/watershed-sha1-collision-just-broke-the-webkit-repository-others-may-follow/
https://shattered.io/static/shattered.pdf

Venti is similarly corruptible, right? Since the checksum is over just the content. If you downloaded https://shattered.io/static/shattered-1.pdf and https://shattered.io/static/shattered-2.pdf, venti would lose the contents of one.

* Re: [9fans] SHA-1 collision and venti
  2017-02-26 17:25 [9fans] SHA-1 collision and venti Bakul Shah
@ 2017-02-26 17:30 ` Jules Merit
  2017-02-26 18:29 ` Charles Forsyth
  2017-02-26 18:16 ` Charles Forsyth
  1 sibling, 1 reply; 27+ messages in thread
From: Jules Merit @ 2017-02-26 17:30 UTC
To: Fans of the OS Plan 9 from Bell Labs

there is a backdoor when a score of 4, what data produces it i have no idea.

* Re: [9fans] SHA-1 collision and venti
  2017-02-26 17:30 ` Jules Merit
@ 2017-02-26 18:29 ` Charles Forsyth
  0 siblings, 0 replies; 27+ messages in thread
From: Charles Forsyth @ 2017-02-26 18:29 UTC
To: Fans of the OS Plan 9 from Bell Labs

On 26 February 2017 at 17:30, Jules Merit <jules.merit.eurocorp.us@gmail.com> wrote:
> there is a backdoor when a score of 4, what data produces it i have no idea.

where is that? I had a quick look but couldn't find it.

* Re: [9fans] SHA-1 collision and venti
  2017-02-26 17:25 [9fans] SHA-1 collision and venti Bakul Shah
  2017-02-26 17:30 ` Jules Merit
@ 2017-02-26 18:16 ` Charles Forsyth
  2017-02-26 18:25 ` Charles Forsyth
  2017-02-26 18:48 ` Bakul Shah
  1 sibling, 2 replies; 27+ messages in thread
From: Charles Forsyth @ 2017-02-26 18:16 UTC
To: Fans of the OS Plan 9 from Bell Labs

On 26 February 2017 at 17:25, Bakul Shah <bakul@bitblocks.com> wrote:
> Venti is similarly corruptible, right? Since the checksum is over just the content. If you downloaded https://shattered.io/static/shattered-1.pdf <https://shattered.io/static/shattered-2.pdf> and https://shattered.io/static/shattered-2.pdf, venti would lose the contents of one.

Luckily, (a) they are both bigger than the block size usually configured, over which the hash is calculated, and (b) in case someone tries it, you've actually linked to the same file (-2.pdf) but under different names, so there won't be a collision by following your links. Hurrah!

Venti detects a collision on the attempt to write the second copy if that differs from the earlier one stored (error "score collision"). The earlier copy is untouched (venti anyway is write-once per score).
Fossil doesn't handle it well, because it turns up during archiving and ends up marking the archive attempt as failed, but it will try again.
Meanwhile, you've got time to change fossil to check the venti error return for "score collision" and announce it, loudly, discarding the second one.
Obviously if you care about something, make sure your version is in venti first! Chances are that collisions arise from naughty people tricking you later. Probably.

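The behaviour described here (first writer wins, a differing second write is rejected, the stored copy is untouched) can be sketched with a toy content-addressed store. This is a minimal illustration in Python, not venti's code: the names `BlockStore`, `ScoreCollision` and `find_collision` are hypothetical, and the score is deliberately truncated to one byte of SHA-1 so that a collision can be found by brute force.

```python
import hashlib


class ScoreCollision(Exception):
    """Raised when different data arrives for a score that is already stored."""


class BlockStore:
    """Toy write-once, content-addressed store (not venti's real implementation)."""

    def __init__(self, score_len=20):
        # score_len < 20 truncates SHA-1 so collisions are easy to demonstrate.
        self.score_len = score_len
        self.blocks = {}

    def score(self, data):
        return hashlib.sha1(data).digest()[: self.score_len]

    def write(self, data):
        s = self.score(data)
        old = self.blocks.get(s)
        if old is None:
            self.blocks[s] = data          # first writer wins
        elif old != data:
            raise ScoreCollision(s.hex())  # earlier copy stays untouched
        return s

    def read(self, s):
        return self.blocks[s]


def find_collision(store):
    """Brute-force two different blocks that share a (truncated) score."""
    seen = {}
    i = 0
    while True:
        data = b"block-%d" % i
        s = store.score(data)
        if s in seen:
            return seen[s], data
        seen[s] = data
        i += 1
```

With `BlockStore(score_len=1)`, writing the first block returned by `find_collision` succeeds, writing it again returns the same score, and writing the second block raises `ScoreCollision` while a read still returns the first block.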
* Re: [9fans] SHA-1 collision and venti
  2017-02-26 18:16 ` Charles Forsyth
@ 2017-02-26 18:25 ` Charles Forsyth
  2017-02-26 19:46 ` Bakul Shah
  2017-02-26 18:48 ` Bakul Shah
  1 sibling, 1 reply; 27+ messages in thread
From: Charles Forsyth @ 2017-02-26 18:25 UTC
To: Fans of the OS Plan 9 from Bell Labs

It's curious that svn "corrupts" the repository, if that's really what they mean, when two leaf files collide.
An index or directory colliding with a file would be more understandable.

* Re: [9fans] SHA-1 collision and venti
  2017-02-26 18:25 ` Charles Forsyth
@ 2017-02-26 19:46 ` Bakul Shah
  2017-02-26 21:02 ` Kim Shrier
  0 siblings, 1 reply; 27+ messages in thread
From: Bakul Shah @ 2017-02-26 19:46 UTC
To: Fans of the OS Plan 9 from Bell Labs

On Sun, 26 Feb 2017 18:25:34 GMT Charles Forsyth <charles.forsyth@gmail.com> wrote:
> It's curious that svn "corrupts" the repository, if that's really what they mean, when two leaf files collide.
> An index or directory colliding with a file would be more understandable.

The only known collision is for files. I suspect this was seen as a "can't happen" event so maybe dealing with the error was not done right. You can read the report: https://bugs.webkit.org/show_bug.cgi?id=168774

> > Venti detects a collision on the attempt to write the second copy if that differs from the earlier one stored (error "score collision"). The earlier copy is untouched (venti anyway is write-once per score).

Good to know at least it /detects/ score collisions. The concern would be that one of two colliding files *can't* be archived and it will be lost. We only have one example so it is not a big deal right now.

> > Fossil doesn't handle it well, because it turns up during archiving and ends up marking the archive attempt as failed, but it will try again.
> > Meanwhile, you've got time to change fossil to check the venti error return for "score collision" and announce it, loudly, discarding the second one.

Hopefully the two versions can co-exist on fossil?

> > Obviously if you care about something, make sure your version is in venti first! Chances are that collisions arise from naughty people tricking you later. Probably.

Or maybe you are doing research on collisions?!

* Re: [9fans] SHA-1 collision and venti
  2017-02-26 19:46 ` Bakul Shah
@ 2017-02-26 21:02 ` Kim Shrier
  2017-02-27 15:46 ` Dave MacFarlane
  0 siblings, 1 reply; 27+ messages in thread
From: Kim Shrier @ 2017-02-26 21:02 UTC
To: Fans of the OS Plan 9 from Bell Labs

I have had a personal project on my list of "things to do when I have time": to redo venti using sha256. Does anybody see any problems with doing that?

Kim

* Re: [9fans] SHA-1 collision and venti
  2017-02-26 21:02 ` Kim Shrier
@ 2017-02-27 15:46 ` Dave MacFarlane
  2017-02-27 16:47 ` Charles Forsyth
  0 siblings, 1 reply; 27+ messages in thread
From: Dave MacFarlane @ 2017-02-27 15:46 UTC
To: Fans of the OS Plan 9 from Bell Labs

Why not skip sha-256 and go directly to Sha3?

--
- Dave

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 15:46 ` Dave MacFarlane
@ 2017-02-27 16:47 ` Charles Forsyth
  2017-02-27 17:07 ` Charles Forsyth
  0 siblings, 1 reply; 27+ messages in thread
From: Charles Forsyth @ 2017-02-27 16:47 UTC
To: Fans of the OS Plan 9 from Bell Labs

On 27 February 2017 at 15:46, Dave MacFarlane <driusan@gmail.com> wrote:
> Why not skip sha-256 and go directly to Sha3?

blake2 has also been suggested

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 16:47 ` Charles Forsyth
@ 2017-02-27 17:07 ` Charles Forsyth
  2017-02-27 17:28 ` Bakul Shah
  0 siblings, 1 reply; 27+ messages in thread
From: Charles Forsyth @ 2017-02-27 17:07 UTC
To: Fans of the OS Plan 9 from Bell Labs

On 27 February 2017 at 16:47, Charles Forsyth <charles.forsyth@gmail.com> wrote:
> On 27 February 2017 at 15:46, Dave MacFarlane <driusan@gmail.com> wrote:
>> Why not skip sha-256 and go directly to Sha3?
>
> blake2 has also been suggested

also, it's not clear it's urgent for venti. the scam is to make a new value that produces the same hash as an earlier important value where the hash plays a part in certifying the value,
or where software uses the shorthand of comparing hashes to compare values and acts on that without comparing the values.
with venti, the hash is produced as a side-effect of storing a value, and it also records the value itself.
when the hash is presented, the stored block is returned. the hash itself is a compact address and doesn't certify the value (ie, nothing that uses venti assumes that it also certifies the value).
any attempt to store a different value with the same hash will be detected. using any hash function has a chance of collision (newer, longer hashes reduce that, but it's rare as it is).
because venti is write-once, no-one can change your venti contents subtly without access to the storage device, but if they've got access to the storage they don't need to be subtle.
with the collision-maker and access to the storage device, they can make a previously certain vac: mean something different, but it still needs raw access to the device, it can't be done through the venti protocol.

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 17:07 ` Charles Forsyth
@ 2017-02-27 17:28 ` Bakul Shah
  2017-02-27 18:14 ` hiro
  ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Bakul Shah @ 2017-02-27 17:28 UTC
To: Fans of the OS Plan 9 from Bell Labs

My argument is that an archival system that can't store some files, no matter how they were generated, is not good enough. A hash collision researcher may have a legitimate reason to store such files.

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 17:28 ` Bakul Shah
@ 2017-02-27 18:14 ` hiro
  2017-02-27 18:20 ` Bakul Shah
  2017-02-27 18:30 ` Charles Forsyth
  2017-02-27 19:34 ` Skip Tavakkolian
  2 siblings, 1 reply; 27+ messages in thread
From: hiro @ 2017-02-27 18:14 UTC
To: Fans of the OS Plan 9 from Bell Labs

Bakul: I want to store a 1000000000 Petabyte file, can your archival system support that? I want to research big files.

There's always a limit, but when does it matter?

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 18:14 ` hiro
@ 2017-02-27 18:20 ` Bakul Shah
  0 siblings, 0 replies; 27+ messages in thread
From: Bakul Shah @ 2017-02-27 18:20 UTC
To: Fans of the OS Plan 9 from Bell Labs

The two are not comparable.

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 17:28 ` Bakul Shah
  2017-02-27 18:14 ` hiro
@ 2017-02-27 18:30 ` Charles Forsyth
  2017-02-27 19:02 ` Charles Forsyth
  2017-02-27 19:34 ` Skip Tavakkolian
  2 siblings, 1 reply; 27+ messages in thread
From: Charles Forsyth @ 2017-02-27 18:30 UTC
To: Fans of the OS Plan 9 from Bell Labs

On 27 February 2017 at 17:28, Bakul Shah <bakul@bitblocks.com> wrote:
> My argument is that an archival system that can't store some files, no matter how they were generated, is not good enough. A hash collision researcher may have a legitimate reason to store such files.

that's a separate argument that venti would never work for you, regardless of the hash algorithm used.

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 18:30 ` Charles Forsyth
@ 2017-02-27 19:02 ` Charles Forsyth
  2017-02-27 20:05 ` cinap_lenrek
  2017-02-27 20:14 ` Bakul Shah
  0 siblings, 2 replies; 27+ messages in thread
From: Charles Forsyth @ 2017-02-27 19:02 UTC
To: Fans of the OS Plan 9 from Bell Labs

On 27 February 2017 at 18:30, Charles Forsyth <charles.forsyth@gmail.com> wrote:
> that's a separate argument that venti would never work for you, regardless of the hash algorithm used.

since venti returns the resulting score from each write, and it knows whether there's been a collision, it appears it could return a modified score (having ensured that is now unique, "and the next judge said that's a very shaggy dog")

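The "modified score" idea can be sketched as a store that probes for an alternative score when the natural one is taken by different data. This is a minimal Python sketch under loud assumptions: `OpaqueScoreStore`, `toy_score` and the salt-prefix probing scheme are hypothetical and are not anything venti does; the score is one byte of SHA-1 so a collision is easy to produce. It only works if callers treat the returned score as opaque.

```python
import hashlib


def toy_score(data, salt=0):
    """1-byte truncated SHA-1; salt > 0 yields an alternative score."""
    prefix = b"" if salt == 0 else b"%d:" % salt
    return hashlib.sha1(prefix + data).digest()[:1]


class OpaqueScoreStore:
    """Toy write-once store that never refuses a block (hypothetical design)."""

    def __init__(self):
        self.blocks = {}

    def write(self, data):
        salt = 0
        score = toy_score(data)
        # Probe until we reach a slot that is free or already holds this data.
        while self.blocks.get(score, data) != data:
            salt += 1
            score = toy_score(data, salt)
        self.blocks[score] = data
        return score

    def read(self, score):
        return self.blocks[score]


def find_collision():
    """Brute-force two different blocks with the same unsalted toy score."""
    seen = {}
    i = 0
    while True:
        data = b"block-%d" % i
        s = toy_score(data)
        if s in seen:
            return seen[s], data
        seen[s] = data
        i += 1
```

Because blocks are never removed, writing the same colliding block again follows the same probe sequence and returns the same modified score, so deduplication still works in this toy model.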
* Re: [9fans] SHA-1 collision and venti
  2017-02-27 19:02 ` Charles Forsyth
@ 2017-02-27 20:05 ` cinap_lenrek
  2017-02-27 20:14 ` Bakul Shah
  1 sibling, 0 replies; 27+ messages in thread
From: cinap_lenrek @ 2017-02-27 20:05 UTC
To: 9fans

couldn't you apply encryption before hashing? so to mount a collision attack you'd also need to know the encryption key used by the underlying storage system (fossil, vac).

so you don't just keep the network address of your venti server but also the encryption key. just make it part of the dial string or something...

--
cinap

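The effect of making the score depend on a secret can be illustrated as follows. Note the substitution: Python's standard library has no block cipher, so this sketch uses a keyed hash (HMAC-SHA1) in place of "encrypt, then hash". That is a different construction from the one proposed, chosen only to show that a precomputed SHA-1 collision pair no longer maps to a known score without the key. The function names are hypothetical.

```python
import hashlib
import hmac


def plain_score(data):
    """Score as plain SHA-1 over the block contents."""
    return hashlib.sha1(data).digest()


def keyed_score(key, data):
    """Score that also depends on a secret key (HMAC-SHA1 stand-in)."""
    return hmac.new(key, data, hashlib.sha1).digest()
```

The same block yields different scores under different keys, so someone preparing colliding blocks offline would have to know the key kept alongside the venti address.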
* Re: [9fans] SHA-1 collision and venti
  2017-02-27 19:02 ` Charles Forsyth
  2017-02-27 20:05 ` cinap_lenrek
@ 2017-02-27 20:14 ` Bakul Shah
  2017-02-27 21:12 ` Riddler
  1 sibling, 1 reply; 27+ messages in thread
From: Bakul Shah @ 2017-02-27 20:14 UTC
To: Fans of the OS Plan 9 from Bell Labs

On Mon, 27 Feb 2017 19:02:29 GMT Charles Forsyth <charles.forsyth@gmail.com> wrote:
> On 27 February 2017 at 18:30, Charles Forsyth <charles.forsyth@gmail.com> wrote:
>
> > that's a separate argument that venti would never work for you, regardless of the hash algorithm used.
>
> since venti returns the resulting score from each write, and it knows whether there's been a collision, it appears it could return a modified score (having ensured that is now unique, "and the next judge said that's a very shaggy dog")

Consider what can happen when you want to consolidate two venti archives into another one. Each source venti has a different file with the same hash. When you discover in the destination venti that they collide, it is too late to return a modified score -- you have to find and fix all pointer blocks that refer to this block as well.

In theory the chance of a random collision with SHA1 may be 1 in 2^80 but we have existing files that collide (unlike the hypothetical argument of someone wanting to store 10^21 byte size files -- but if they can produce it, we can store it!). Your argument is that since venti is readonly, existing data in it is not vulnerable but not everyone stores their archives on readonly medium. Another argument would be that almost always venti is privately used and unlikely to be accessible to the bad guys. Yet another argument is that hardly anyone uses venti so why even bother. These are behavior patterns that are true today but why limit its usefulness?

Just as we move archived data we care about to more modern media (as we no longer have easy access to floppies, 9track tapes, 1/4" streamer tape etc.), and update our crypto keys, since they too have limited shelf-life, we can replace the use of SHA1. This is a fixable problem. [It is much much worse for git given the amount of s/w that relies on it. I think it is a matter of time before someone comes up with a collision between two different types of git objects (such as a blob and a tree) but we'll let Linus worry about it :-)]

The solution is to convert from sha1 to blake2b or something strong and be prepared to move the data again in 10-20 years.

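For a sense of scale on the 2^80 figure, the usual birthday approximation gives the chance of at least one accidental collision among k blocks as roughly 1 - exp(-k(k-1)/2^(n+1)) for an n-bit hash. The sketch below just evaluates that formula in Python; it assumes blocks behave like uniformly random inputs, which says nothing about deliberately constructed collisions such as the shattered PDFs. The function name is hypothetical.

```python
import math


def collision_probability(blocks, bits=160):
    """Birthday approximation for at least one collision among `blocks` values."""
    x = blocks * (blocks - 1) / 2 ** (bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-x)
```

With the SHA-1 default of 160 bits, `collision_probability(2**80)` is about 0.39, while a store of 2**40 blocks comes out around 4e-25.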
* Re: [9fans] SHA-1 collision and venti
  2017-02-27 20:14 ` Bakul Shah
@ 2017-02-27 21:12 ` Riddler
  2017-02-27 22:20 ` Charles Forsyth
  0 siblings, 1 reply; 27+ messages in thread
From: Riddler @ 2017-02-27 21:12 UTC
To: Fans of the OS Plan 9 from Bell Labs

I think much in the same vein as git, venti doesn't need to worry too much about collisions given the behavior when collisions occur is well-defined and sensible in both systems.
It's second preimages that are more of a concern (and still not possible with SHA1). The lack of preimage attacks on SHA1 prevents people from maliciously creating a file with the same hash as one you created. They can only duplicate ones they created which should limit the scope of any maliciousness to stuff they have control over.
At the point preimages are practical, I'd want to be long gone from SHA1 but IIRC even MD5 still has no practical second-preimage attacks so we're probably a long way off from there.

Technically, anything relying on venti should handle the collision detected response gracefully, as it's always a possibility no matter the algorithm.
If fossil doesn't handle it very well perhaps it's not venti that needs changed (given it detects & reports) but fossil.
A top-of-the-head suggestion would be for fossil to respond to the collision notice by doing something to the block that can be undone later (as others above have hinted at) such as appending something, XOR, etc., marking it as such in its own data structures then passing it back to venti. It could then reverse the operation when retrieving the files with the 'collision fixed' flag set.
I don't know how feasible that idea is (been a while since I looked at fossil) but worth looking into maybe? It would seem, at a cursory glance, to fix the problem for fossil+venti indefinitely at the cost of a minor computational overhead for retrieving collided files.

As Charles pointed out, you could also just do that in venti, I guess it depends if the write API call contract in venti is "returns SHA1 of file" or "returns arbitrary file id".
If the behavior was put into venti you couldn't assume the ID returned = sha1(block) anymore - but I don't know if anything relies on that behavior.
As for venti, I wouldn't say 'no point' to an algorithm update, but I'd rather have fossil updated to manage to deal with collisions better first.

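The client-side workaround suggested in this message (change the block reversibly, remember that you did, undo it on read) might look like the sketch below. Everything here is hypothetical and in Python for illustration only: `ToyVenti` is a stand-in store with a one-byte truncated SHA-1 score, the reversible change is zero-byte padding, and the returned pad count plays the role of the 'collision fixed' flag. Real blocks have a maximum size, which this toy ignores.

```python
import hashlib


class ScoreCollision(Exception):
    """Raised by the toy store when a score is taken by different data."""


class ToyVenti:
    """Write-once store keyed by a 1-byte truncated SHA-1 (toy only)."""

    def __init__(self):
        self.blocks = {}

    def write(self, data):
        s = hashlib.sha1(data).digest()[:1]
        if self.blocks.setdefault(s, data) != data:
            raise ScoreCollision(s.hex())
        return s

    def read(self, s):
        return self.blocks[s]


def client_write(venti, data):
    """Return (score, pad); pad > 0 marks a block that was altered to fit."""
    pad = 0
    while True:
        try:
            return venti.write(data + b"\0" * pad), pad
        except ScoreCollision:
            pad += 1  # reversible change: append one more zero byte


def client_read(venti, score, pad):
    block = venti.read(score)
    return block[: len(block) - pad]  # undo the padding


def find_collision():
    """Brute-force two different blocks with the same toy score."""
    seen = {}
    i = 0
    while True:
        data = b"blk%d" % i
        s = hashlib.sha1(data).digest()[:1]
        if s in seen:
            return seen[s], data
        seen[s] = data
        i += 1
```

The client has to keep the pad count next to the score in its own pointer structures, which is the "marking it as such in its own data structures" step.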
* Re: [9fans] SHA-1 collision and venti
  2017-02-27 21:12 ` Riddler
@ 2017-02-27 22:20 ` Charles Forsyth
  2017-03-01 12:21 ` erik quanstrom
  0 siblings, 1 reply; 27+ messages in thread
From: Charles Forsyth @ 2017-02-27 22:20 UTC
To: Fans of the OS Plan 9 from Bell Labs

I think venti could deal with it: Rwrite returns a score, Tread provides a score, and the caller typically uses it as an opaque value. If not, whether a different sha1 is returned or a new algorithm is used, the caller could still not rely on sha1(block)=score.

In any case, fossil needs a fix to cope with venti returning "score collision", to prevent it failing to archive once it hits a shattered file, or rather the first venti-sized block of them.

* Re: [9fans] SHA-1 collision and venti
  2017-02-27 22:20 ` Charles Forsyth
@ 2017-03-01 12:21 ` erik quanstrom
  2017-03-01 12:35 ` David du Colombier
  0 siblings, 1 reply; 27+ messages in thread
From: erik quanstrom @ 2017-03-01 12:21 UTC
To: 9fans

On Mon Feb 27 14:17:49 PST 2017, charles.forsyth@gmail.com wrote:
> I think venti could deal with it: Rwrite returns a score, Tread provides a score, and the caller typically uses it as an opaque value. If not, whether a different sha1 is returned or a new algorithm is used, the caller could still not rely on sha1(block)=score.
>
> In any case, fossil needs a fix to cope with venti returning "score collision", to prevent it failing to archive once it hits a shattered file, or rather the first venti-sized block of them.

i believe that rsc worked this out in some work he did based on venti. sadly i don't remember the name of the project.

a modest proposal. since p(collision) is calculated for a random collision, why not just store encrypted blocks.

- erik

* Re: [9fans] SHA-1 collision and venti
  2017-03-01 12:21 ` erik quanstrom
@ 2017-03-01 12:35 ` David du Colombier
  0 siblings, 0 replies; 27+ messages in thread
From: David du Colombier @ 2017-03-01 12:35 UTC
To: Fans of the OS Plan 9 from Bell Labs

> i believe that rsc worked this out in some work he did based on venti.
> sadly i don't remember the name of the project.

I believe you're referring to Foundation.

https://swtch.com/~rsc/papers/fndn-usenix2008.pdf

--
David du Colombier

* Re: [9fans] SHA-1 collision and venti 2017-02-27 17:28 ` Bakul Shah 2017-02-27 18:14 ` hiro 2017-02-27 18:30 ` Charles Forsyth @ 2017-02-27 19:34 ` Skip Tavakkolian 2 siblings, 0 replies; 27+ messages in thread From: Skip Tavakkolian @ 2017-02-27 19:34 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 2513 bytes --] I wondered if one could make a logical argument that says, one could use a combination of hashes that have different collision resistances (e.g. SHA1⊕MD5) for each file, extending to any number of hashes to satisfy that the combination is unique for all files... So I did a little research, and the short answer is NO! It turns out that the combined hash would be no better than the best of the hash functions in the combo: http://crypto.stackexchange.com/questions/270/guarding-against-cryptanalytic-breakthroughs-combining-multiple-hash-functions The Internet is a wonderful thing. On Mon, Feb 27, 2017 at 9:29 AM Bakul Shah <bakul@bitblocks.com> wrote: > My argument is that an archival system that can't store some files, no > matter how they were generated, is not good enough. A hash collision > researcher may have a legitimate reason to store such files. > > On Feb 27, 2017, at 9:07 AM, Charles Forsyth <charles.forsyth@gmail.com> > wrote: > > > On 27 February 2017 at 16:47, Charles Forsyth <charles.forsyth@gmail.com> > wrote: > > On 27 February 2017 at 15:46, Dave MacFarlane <driusan@gmail.com> wrote: > > Why not skip sha-256 and go directly to Sha3? > > > blake2 has also been suggested > > > also, it's not clear it's urgent for venti. the scam is to make a new > value that produces the same hash as an earlier important value where the > hash plays a part in certifying the value, > or where software uses the shorthand of comparing hashes to compare values > and acts on that without comparing the values. 
> with venti, the hash is produced as a side-effect of storing a value, and > it also records the value itself. > when the hash is presented, the stored block is returned. the hash itself > is a compact address and doesn't certify the value (ie, nothing that uses > venti assumes that it also certifies the value). > any attempt to store a different value with the same hash will be > detected. using any hash function has a chance of collision (newer, longer > hashes reduce that, but it's rare as it is). > because venti is write-once, no-one can change your venti contents subtly > without access to the storage device, but if they've got access to the > storage they don't need to be subtle. > with the collision-maker and access to the storage device, they can make a > previously certain vac: mean something different, but it still needs raw > access to the device, it can't be done through > the venti protocol. > > [-- Attachment #2: Type: text/html, Size: 5078 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
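The behaviour described in the quoted text (the hash is a compact address, the store is write-once, and an attempt to store a different value under an existing hash is detected) can be sketched as a toy in-memory store. This is a minimal sketch under stated assumptions, not the venti implementation: ToyVenti, ScoreCollision and the score_fn parameter are hypothetical names, and the injectable score function exists only so that a collision can be provoked on purpose with a deliberately weak hash.

```python
import hashlib


class ScoreCollision(Exception):
    """Raised when a different block already occupies the same score."""


class ToyVenti:
    """Write-once, content-addressed block store (illustration only)."""

    def __init__(self, score_fn=lambda block: hashlib.sha1(block).digest()):
        self._score_fn = score_fn
        self._blocks = {}  # score -> block contents

    def write(self, block: bytes) -> bytes:
        score = self._score_fn(block)
        existing = self._blocks.get(score)
        if existing is None:
            self._blocks[score] = bytes(block)
        elif existing != block:
            # Compare the stored bytes, not just the scores: the earlier
            # block stays untouched and the second write is refused.
            raise ScoreCollision(score.hex())
        return score

    def read(self, score: bytes) -> bytes:
        return self._blocks[score]
```

Writing the same block twice simply returns the same score; only a second, different block with an equal score raises. Changing stored contents is possible only by going around write(), which mirrors the remark that tampering needs raw access to the storage device.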
* Re: [9fans] SHA-1 collision and venti
2017-02-26 18:16 ` Charles Forsyth
2017-02-26 18:25 ` Charles Forsyth
@ 2017-02-26 18:48 ` Bakul Shah
2017-02-26 19:57 ` Charles Forsyth
1 sibling, 1 reply; 27+ messages in thread
From: Bakul Shah @ 2017-02-26 18:48 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 1563 bytes --]
The links are to different files. The pdfs look identical except for color background. The diff bytes are 193..320. The rest is the same so your first 8k byte checksum would be the same.
> On Feb 26, 2017, at 10:16 AM, Charles Forsyth <charles.forsyth@gmail.com> wrote:
>
>
>> On 26 February 2017 at 17:25, Bakul Shah <bakul@bitblocks.com> wrote:
>> Venti is similarly corruptible, right? Since the checksum is over just the content. If you downloaded https://shattered.io/static/shattered-1.pdf and https://shattered.io/static/shattered-2.pdf, venti would lose the contents of one.
>
> Luckily, (a) they are both bigger than the block size usually configured, over which the hash is calculated, and (b) in case someone tries it, you've actually linked to the same file (-2.pdf) but under different names, so there won't be a collision by following your links. Hurrah!
>
> Venti detects a collision on the attempt to write the second copy if that differs from the earlier one stored (error "store collision"). The earlier copy is untouched (venti anyway is write-once per score).
> Fossil doesn't handle it well, because it turns up during archiving and ends up marking the archive attempt as failed, but it will try again.
> Meanwhile, you've got time to change fossil to check the venti error return for "score collision" and announce it, loudly, discarding the second one.
> Obviously if you care about something, make sure your version is in venti first! Chances are that collisions arise from naughty people tricking you later. Probably.
[-- Attachment #2: Type: text/html, Size: 2423 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
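The point about block-sized hashing can be made concrete with a small sketch. It assumes an 8 KiB block size (the "8k" mentioned above; real configurations may differ) and uses synthetic data, not the actual shattered PDFs; block_scores and differing_blocks are hypothetical helper names. For ordinary data, a change confined to bytes 193..320 alters only the score of the first block. According to the message above, the two shattered files would instead yield the same score for that first block, which is the problematic case.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # assumed block size for this sketch


def block_scores(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and return the SHA-1 of each."""
    return [
        hashlib.sha1(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def differing_blocks(a: bytes, b: bytes, block_size: int = BLOCK_SIZE):
    """Return the indices of blocks whose scores differ between a and b."""
    sa, sb = block_scores(a, block_size), block_scores(b, block_size)
    longest = max(len(sa), len(sb))
    return [
        i for i in range(longest)
        if i >= len(sa) or i >= len(sb) or sa[i] != sb[i]
    ]
```

All blocks after the first have identical contents in both inputs, so they share scores and would be stored once; only block 0 distinguishes the two files.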
* Re: [9fans] SHA-1 collision and venti
2017-02-26 18:48 ` Bakul Shah
@ 2017-02-26 19:57 ` Charles Forsyth
2017-02-26 20:06 ` Jadon Bennett
2017-02-26 20:16 ` Bakul Shah
0 siblings, 2 replies; 27+ messages in thread
From: Charles Forsyth @ 2017-02-26 19:57 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 274 bytes --]
On Sun, 26 Feb 2017, 18:49 Bakul Shah, <bakul@bitblocks.com> wrote:
> The links are to different files.
>
Not on Gmail at least; look to see where each link points. Both are to -2
in the message I see on Gmail. Unless it cleverly optimised the "identical"
content!
[-- Attachment #2: Type: text/html, Size: 653 bytes --]
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [9fans] SHA-1 collision and venti
2017-02-26 19:57 ` Charles Forsyth
@ 2017-02-26 20:06 ` Jadon Bennett
2017-02-26 20:16 ` Bakul Shah
1 sibling, 0 replies; 27+ messages in thread
From: Jadon Bennett @ 2017-02-26 20:06 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Sun, Feb 26, 2017 at 07:57:53PM +0000, Charles Forsyth wrote:
> On Sun, 26 Feb 2017, 18:49 Bakul Shah, <bakul@bitblocks.com> wrote:
>
> > The links are to different files.
> >
>
> Not on Gmail at least look to see where each link points. Both are to -2
> in the message I see on Gmail. Unless it cleverly optimised the"identical"
> content!
It's a multipart email; the HTML version had two of the same link, but
readers viewing the plaintext version saw the different addresses.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [9fans] SHA-1 collision and venti
2017-02-26 19:57 ` Charles Forsyth
2017-02-26 20:06 ` Jadon Bennett
@ 2017-02-26 20:16 ` Bakul Shah
1 sibling, 0 replies; 27+ messages in thread
From: Bakul Shah @ 2017-02-26 20:16 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Sun, 26 Feb 2017 19:57:53 GMT Charles Forsyth <charles.forsyth@gmail.com> wrote:
>
> > The links are to different files.
> >
>
> Not on Gmail at least look to see where each link points. Both are to -2
> in the message I see on Gmail. Unless it cleverly optimised the"identical"
> content!
I took a look at the raw email file and you are right. Thanks
iPad + apple Mail. I can paste a URL and it gets turned into a
real link in the HTML part. But if I then edit the URL, the
link doesn't change.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [9fans] SHA-1 collision and venti
@ 2017-02-28 15:47 Darren Wise
0 siblings, 0 replies; 27+ messages in thread
From: Darren Wise @ 2017-02-28 15:47 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 912 bytes --]
Regardless of the file size the main limit to anything would be for me something material..
Like actual drive space to put that lone file into or spread across a set of drives, NAS, RAID and all that..
Software itself is always a little way ahead of hardware..
> Kind regards,
> Darren Wise Esq,
> B.Sc, HND, GNVQ, City & Guilds.
>
> Managing Director (MD)
> Art Director (AD)
> Chief Architect/Analyst (CA/A)
> Chief Technical Officer (CTO)
>
> www.wisecorp.co.uk
> www.wisecorp.co.uk/babywise
> www.darrenwise.co.uk
-------- Original message --------
From: hiro <23hiro@gmail.com>
Date: 27/02/2017 18:14 (GMT+00:00)
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] SHA-1 collision and venti
Bakul: I want to store a 1000000000 Petabyte file, can your archival
system support that? I want to research big files.
There's always a limit, but when does it matter?
[-- Attachment #2: Type: text/html, Size: 1510 bytes --]
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2017-03-01 12:35 UTC | newest]
Thread overview: 27+ messages
2017-02-26 17:25 [9fans] SHA-1 collision and venti Bakul Shah
2017-02-26 17:30 ` Jules Merit
2017-02-26 18:29 ` Charles Forsyth
2017-02-26 18:16 ` Charles Forsyth
2017-02-26 18:25 ` Charles Forsyth
2017-02-26 19:46 ` Bakul Shah
2017-02-26 21:02 ` Kim Shrier
2017-02-27 15:46 ` Dave MacFarlane
2017-02-27 16:47 ` Charles Forsyth
2017-02-27 17:07 ` Charles Forsyth
2017-02-27 17:28 ` Bakul Shah
2017-02-27 18:14 ` hiro
2017-02-27 18:20 ` Bakul Shah
2017-02-27 18:30 ` Charles Forsyth
2017-02-27 19:02 ` Charles Forsyth
2017-02-27 20:05 ` cinap_lenrek
2017-02-27 20:14 ` Bakul Shah
2017-02-27 21:12 ` Riddler
2017-02-27 22:20 ` Charles Forsyth
2017-03-01 12:21 ` erik quanstrom
2017-03-01 12:35 ` David du Colombier
2017-02-27 19:34 ` Skip Tavakkolian
2017-02-26 18:48 ` Bakul Shah
2017-02-26 19:57 ` Charles Forsyth
2017-02-26 20:06 ` Jadon Bennett
2017-02-26 20:16 ` Bakul Shah
2017-02-28 15:47 Darren Wise