From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 6 Mar 2008 06:38:06 +0100
From: Enrico Weigelt
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] thoughs about venti+fossil
Message-ID: <20080306053806.GB18329@nibiru.local>
References: <20080305040019.GA13663@nibiru.local>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4.1i
Topicbox-Message-UUID: 701ff4d6-ead3-11e9-9d60-3106f5b1d025

* Charles Forsyth wrote:

> >1. how stable is the keying ? sha-1 has only 160 bits, while
> >   data blocks may be up to 56k long. so, the mapping is only
> >   unique into one direction (not one-to-one). how can we be
> >   *really sure*, that - even on very large storages (TB or
> >   even PB) - data to each key is alway (one-to-one) unique ?
>
> on a write, the computer will tell you if you ought to have bought
> that lottery ticket and stayed out of the rain:

Okay, venti detects collisions. But what happens then? Does it
simply refuse the write, or is there a way to manage hash-colliding
data blocks?

Of course, this can be worked around if we define that the key is
always server-generated and does not *always* reflect the hash (from
the client side it is just an arbitrary number): add another bit
that tells whether the key is the hash or some server-allocated ID
(e.g. a table entry). For a strictly server-based model, this is
perfectly fine.

BUT: I have another idea in mind: a heavily distributed block
storage that uses hash keys for block identification and also pools
identical blocks together (like venti does). Ideally each block
should be transmitted only once (as long as it is in the local
cache). For this I need to be *sure* that there will be *no*
collisions, even if the system runs for a long time and grows really
big (maybe several PB on thousands of nodes).

Another interesting question: can the risk of collisions be reduced
by combining several different hash functions in parallel?
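
To get a feeling for the scale behind these questions, here is a rough
back-of-the-envelope sketch using the standard birthday-bound
approximation. It is only an estimate under stated assumptions: the hash
is treated as a uniformly random function (so it covers accidental
collisions only, not deliberately constructed ones), every block is
assumed to be a full 56k (smaller blocks mean more blocks and a higher
figure), and the names collision_probability and blocks_in are made up
for this illustration, not part of venti.

```python
import math


def collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Birthday-bound estimate: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes the hash output is uniformly random over 2^hash_bits values.
    """
    exponent = num_blocks * (num_blocks - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


def blocks_in(total_bytes: int, block_bytes: int) -> int:
    """Number of full blocks of block_bytes that fit in total_bytes."""
    return total_bytes // block_bytes


PB = 10 ** 15          # one petabyte, decimal (assumption)
BLOCK = 56 * 1024      # 56k blocks, as in the question above

if __name__ == "__main__":
    for petabytes in (1, 10, 1000):
        n = blocks_in(petabytes * PB, BLOCK)
        p160 = collision_probability(n, 160)        # SHA-1 alone
        # Idealised: two independent hashes (160 + 128 bits) behave
        # like a single 288-bit hash for accidental collisions.
        p288 = collision_probability(n, 160 + 128)
        print(f"{petabytes:5d} PB  blocks={n:.3e}  "
              f"p(160 bit)={p160:.3e}  p(288 bit)={p288:.3e}")
```

Under these assumptions, 1 PB of 56k blocks is about 1.7e10 blocks and
the estimate for a 160-bit hash comes out around 1e-28. The 288-bit line
only illustrates how the estimate shrinks with more output bits if the
combined functions really were independent; it says nothing about how
real hash functions behave when combined.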

cu
-- 
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------