9fans - fans of the OS Plan 9 from Bell Labs
From: ori@eigenstate.org
To: 9fans@9fans.net
Subject: Re: [9fans] yet another try to fixup venti
Date: Thu, 13 Jun 2024 00:08:03 -0400
Message-ID: <9C832F275135C5830776F4656A698D43@eigenstate.org>
In-Reply-To: <17181391500.35F5.93227@composer.9fans.topicbox.com>

Sounds fairly interesting, though I'm curious how it compares;
my guess was that because of the lack of locality due to using
hashes for the score, a trie wouldn't be that different from a
hash table.

Quoth wb.kloke@gmail.com:
> After studying Steve Stallion's SSD venti disaster, I decided to make my own attempt at fixing venti's issues.
> 
> Despite my reservations about the lasting wisdom of some of the design choices, I am trying to keep the traditional arena disk layout.
> Only the on-disk index is replaced, with a trie-based in-memory structure.
> 
> Each trie node is either a leaf holding the score and IAddr data, or an interior node holding 16 indices, one per value of the next nibble of the score, for searching further. There is no need for a Bloom filter, as the trie search is no slower for negative results. The actual trie node size is 64 bytes now, but it can probably be shortened to 48 bytes.
> 
> So far, I have managed to convert buildindex into buildtrie. If the -v option is used, the contents of the trie are printed in lexical order of the score.
> 
> The data from my experiments are:
> 
> I used my 4 arena files, each 20GB, containing about 10 million clumps in standard 500MB arenas. The data from the arena directories are read in about one and a half minutes. (There is one error in one of the arenas.) IMHO this is acceptable as startup time for a venti server.
> 
> The trie has about 14m nodes, which are stored in a contiguous array. The trie currently uses 32-bit indices; for the current amount of data these could be reduced to 24-bit indices.
> 
> For larger storage, there is a design choice: either use 24-bit indices and 48-byte trie nodes in 256 trie arrays, or use 32-bit indices and 64-byte trie nodes in a single array.
> 
> After I manage to push my data to a plan9port fork on github, you will hear more.
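
For readers following along, the structure described in the quoted message
can be sketched roughly as below. This is not wb.kloke's code, just one
reading that is consistent with the numbers given: 16 child indices of
32 bits each make a 64-byte interior node, and 16 indices of 24 bits each
would make 48 bytes. Everything named here (Node, Trie, trieinsert,
trielookup, the LEAF tag bit, and a plain 64-bit addr standing in for the
IAddr data) is a hypothetical choice made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { ScoreSize = 20, Fanout = 16 };
#define LEAF 0x80000000u	/* assumption: top index bit marks a leaf */

typedef union Node Node;
union Node {
	uint32_t child[Fanout];	/* interior: 16 * 4 = 64 bytes; 0 = empty */
	struct {
		uint8_t score[ScoreSize];
		uint64_t addr;	/* hypothetical stand-in for IAddr */
	} leaf;
};

typedef struct Trie Trie;
struct Trie {
	Node *n;	/* contiguous node array; n[0] is the root */
	uint32_t nn, cap;
};

static uint32_t
newnode(Trie *t)
{
	if(t->nn == t->cap){
		t->cap = t->cap ? 2*t->cap : 1024;
		t->n = realloc(t->n, t->cap * sizeof(Node));
		assert(t->n != NULL);
	}
	memset(&t->n[t->nn], 0, sizeof(Node));
	return t->nn++;
}

/* the depth'th 4-bit digit of the score, high nibble first */
static int
nibble(const uint8_t *score, int depth)
{
	uint8_t b = score[depth/2];
	return depth%2 ? b & 0xf : b >> 4;
}

void
trieinit(Trie *t)
{
	memset(t, 0, sizeof *t);
	newnode(t);	/* the root is never a child, so index 0 can mean "empty" */
}

/* returns 1 and sets *addr if score is present, else 0 */
int
trielookup(Trie *t, const uint8_t *score, uint64_t *addr)
{
	uint32_t i = 0;
	int d;

	for(d = 0; d < 2*ScoreSize; d++){
		i = t->n[i].child[nibble(score, d)];
		if(i == 0)
			return 0;	/* a miss stops at the first empty slot */
		if(i & LEAF){
			Node *l = &t->n[i & ~LEAF];
			if(memcmp(l->leaf.score, score, ScoreSize) != 0)
				return 0;
			*addr = l->leaf.addr;
			return 1;
		}
	}
	return 0;
}

void
trieinsert(Trie *t, const uint8_t *score, uint64_t addr)
{
	uint32_t i = 0, c, l, in;
	int d, k;

	for(d = 0; ; d++){
		k = nibble(score, d);
		c = t->n[i].child[k];
		if(c == 0){
			l = newnode(t);
			memcpy(t->n[l].leaf.score, score, ScoreSize);
			t->n[l].leaf.addr = addr;
			t->n[i].child[k] = l | LEAF;
			return;
		}
		if(c & LEAF){
			if(memcmp(t->n[c & ~LEAF].leaf.score, score, ScoreSize) == 0)
				return;	/* already present */
			/* push the existing leaf one level down and retry there */
			in = newnode(t);
			t->n[in].child[nibble(t->n[c & ~LEAF].leaf.score, d+1)] = c;
			t->n[i].child[k] = in;
			c = in;
		}
		i = c;
	}
}
```

In this sketch a lookup for an absent score ends as soon as it reaches an
empty slot or a leaf with a different score, which is one way to see why
a Bloom filter in front of it would buy little. With 64-byte nodes,
14 million nodes come to roughly 900 MB, and 14 million is below
2^24 = 16,777,216, so 24-bit indices would still reach every node.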

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T21878aa53884911b-M7cf5960cda854dba36823793
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
