From mboxrd@z Thu Jan 1 00:00:00 1970 To: weigelt@metux.de, 9fans@9fans.net Subject: Re: [9fans] Streaming on venti From: "Russ Cox" Date: Thu, 5 Jun 2008 10:01:41 -0400 In-Reply-To: <20080605112423.GC24609@nibiru.local> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Message-Id: <20080605140141.A17931E8C1F@holo.morphisms.net> Topicbox-Message-UUID: b3b81bb0-ead3-11e9-9d60-3106f5b1d025 > VAC eg. is good for archiving, but it's tree-based structure > is probably not optimal for streaming (on large files, a lot > of blocks IMHO have to be loaded before getting the first > payload block can be reached). A typical venti tree has a branching factor of 409 (8192/20). For a 1GB file, that means you have to load two extra blocks to find the first one, and 322 interior blocks to find all 131,072 data blocks. Is improving that 0.2% really your justification for a less capable data structure? > Some kind of linked list, eg. like in CMD-1541 filesystem > (each block as an pointer to the next one) or an linked list > of index blocks would fit better. For a 1GB file, you'd need 20*131,072 = 2,621,440 bytes of storage to hold the pointers. No matter where you put them, your entire file is now (1GB+2,621,440)/8192 = 131,392 blocks. The tree was using 131,394 blocks. So at best, the linked list has reduced the number of block loads by 0.002%, and you've given up random access, including streaming starting halfway through a file. Doesn't sound better to me. Venti's performance is dominated much more by fragmentation in where the blocks are laid out in the arena logs (that causes seeks) than anything in higher level data structures. There is a paper about this in the upcoming Usenix. See http://swtch.com/~rsc/papers/ for a link to PDF and HTML. (Because the paper is targeted at a non-Plan 9 audience, "Venti" in that paper refers to venti as described in the original paper. The current venti sources implement all the improvements described as "Foundation" in the paper.) Russ