From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Brian L. Stuart"
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] thoughs about venti+fossil
Date: Thu, 6 Mar 2008 19:09:55 +0000
Message-Id: <030620081909.4185.47D041830009D5D50000105922228869349B0A02D2089B9A019C04040A0DBF9B9D0E9A9B9C040D@att.net>
Topicbox-Message-UUID: 716ff3fe-ead3-11e9-9d60-3106f5b1d025

From: "Russ Cox"
> sure. use sha-256 and your probability of collision goes
> down even further. but *you* (probably) still won't be *sure*.

I should probably not put my 2 cents worth in here, but my resistance is weak...

It is true that you cannot be sure that there won't be a collision in venti, regardless of what hashing function you use. It is probabilistic, and doesn't prevent it from happening tomorrow, or from not happening until the sun burns out.

But it seems to me that there's a bigger picture. The reason we would not want a collision is that it would, in effect, be a form of data corruption. But it's only one possible source. It's possible that network communication could be corrupted but still pass the CRC checks (if they're even present). It's possible that the disk could be corrupted in such a way that a block is in error, but still passes the ECC check. It's possible that a bit in the main memory might flip (or two bits if we have parity memory). In the end, we have to rely on the fact that these are all very unlikely to happen; their probabilities are quite low.

A higher probability of damage comes from a potential fire in the machine room. We often add some form of off-site backup to handle this. But it can't make us sure that an earthquake won't hit the off-site backup location at the same time we have a fire locally. Rather, the probability of both is low enough we accept it. The amount of effort we put into mitigating an error is proportional to the probability of that error occurring and the amount of harm the error would cause.

What does all this mean for venti?
If we want to reduce the overall probability of data corruption, we want to put our efforts into addressing the source with the highest probability. Making the others better won't appreciably help the overall probability. And a venti collision is not the one with the highest probability among those I've listed. In fact, I'd suspect it's the one with the lowest probability. So putting attention on making it less likely is really misplaced effort from a practical standpoint.

The truth is that the first time I read the venti papers, I was bothered the same way. Yes, there can be problems, but generally we design systems where the design itself doesn't contain any known sources of failure. In venti, we have. And it bugged me for quite a while. But when I finally realized objectively that the probability of a hardware failure is orders of magnitude greater than a collision, I started to accept that venti is a very well-designed system, and is as reliable as any other form of archive.

BLS
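
To put a rough number on the collision side of that comparison, here is a minimal sketch using the standard birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)), for n blocks and a b-bit hash. The store size (one exabyte), the block size (8 KB), and the function name are assumptions chosen purely for illustration; the 160-bit width is that of SHA-1, which venti uses, and 256 bits corresponds to the sha-256 suggestion quoted above.

```python
import math


def collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one collision among
    num_blocks distinct blocks hashed to hash_bits bits.

    Uses p ~= 1 - exp(-n(n-1) / 2^(b+1)); expm1 keeps precision
    when the probability is far too small for 1 - exp(x) to resolve.
    """
    exponent = -num_blocks * (num_blocks - 1) / 2.0 ** (hash_bits + 1)
    return -math.expm1(exponent)


# Assumed, illustrative workload: one exabyte stored as 8 KB blocks.
STORE_BYTES = 10 ** 18
BLOCK_BYTES = 8 * 1024
num_blocks = STORE_BYTES // BLOCK_BYTES

p_sha1 = collision_probability(num_blocks, 160)    # SHA-1 score width
p_sha256 = collision_probability(num_blocks, 256)  # SHA-256 for comparison

print(f"blocks stored:        {num_blocks:.3e}")
print(f"P(collision) SHA-1:   {p_sha1:.3e}")
print(f"P(collision) SHA-256: {p_sha256:.3e}")
```

Under these assumptions the SHA-1 estimate comes out below 1e-20, and the SHA-256 estimate is smaller still. Whatever failure rate one takes for disks, memory, or the network, that is the figure to set it against.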