From: ori@eigenstate.org
To: 9fans@9fans.net
Subject: Re: [9fans] yet another try to fixup venti
Date: Thu, 13 Jun 2024 00:08:03 -0400
Message-ID: <9C832F275135C5830776F4656A698D43@eigenstate.org>
In-Reply-To: <17181391500.35F5.93227@composer.9fans.topicbox.com>
Sounds fairly interesting, though I'm curious how it compares;
my guess was that because of the lack of locality due to using
hashes for the score, a trie wouldn't be that different from a
hash table.
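
To illustrate the locality point, here is a small sketch. Venti scores are SHA-1 hashes, so blocks written one after another get scores with unrelated prefixes. The block contents below (8-byte counters) and the helper names are made up for the example; this only shows how the leading nibble spreads, not how any real venti index behaves.

```python
import hashlib

def score(block):
    # A venti score is the SHA-1 hash of the block contents.
    return hashlib.sha1(block).digest()

def leading_nibbles(n_blocks):
    # ASSUMPTION: 8-byte counters stand in for consecutively written blocks.
    return [score(i.to_bytes(8, "big"))[0] >> 4 for i in range(n_blocks)]

nibbles = leading_nibbles(4096)
# How many of the 4096 consecutive blocks land under each top-level nibble.
counts = [nibbles.count(n) for n in range(16)]
print(counts)
```

Consecutive blocks scatter across all 16 top-level branches, so neither a trie nor a hash table gets to reuse the part of the index it touched for the previous block.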
Quoth wb.kloke@gmail.com:
> After studying Steve Stallion's SSD venti disaster, I decided to make my own attempt at fixing the issues of venti.
>
> Despite my reservations about the lasting wisdom of some of the design choices, I am keeping the traditional arena disk layout.
> Only the on-disk index is replaced with a trie-based in-memory structure.
>
> The trie nodes are either leaves holding the score and IAddr data, or internal nodes holding 16 indices, one for each value of the next nibble of the score, to search further. There is no need for a Bloom filter, as the trie search is no slower for negative results. The actual trie node size is 64 bytes now, but can probably be shortened to 48 bytes.
>
> So far, I have managed to convert buildindex into buildtrie. If the -v option is used, the contents of the trie are printed in lexical order of the score.
>
> The data from my experiments are:
>
> I used my 4 arena files, each 20GB, containing about 10 million clumps in standard 500MB arenas. Data from the arena directories are read in about one and a half minutes. (There is one error in one of the arenas.) IMHO this is acceptable as startup time for a venti server.
>
> The trie has about 14m nodes, which are stored in a contiguous array. The trie currently uses 32-bit indices; since 14 million is below 2^24 (about 16.7 million), 24-bit indices would suffice for the current amount of data.
>
> For larger storage, there is a design choice: either use 24-bit indices and 48-byte trie nodes with 256 trie arrays, or use 32-bit indices and 64-byte trie nodes in a single array.
>
> After I manage to push my data to a plan9port fork on GitHub, you will hear more.
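
For readers following along, the trie described in the quoted message can be sketched roughly as below. This is not the buildtrie code, which has not been published yet. The class and method names, the use of index 0 as the "empty" marker, and the plain integer standing in for the IAddr are all assumptions made for illustration.

```python
import hashlib

FANOUT = 16  # one child slot per 4-bit nibble of the score

def nibble(score, depth):
    # Nibble number `depth` of the score, high nibble of each byte first.
    b = score[depth // 2]
    return b >> 4 if depth % 2 == 0 else b & 0x0F

class Trie:
    def __init__(self):
        # All nodes live in one contiguous array. nodes[0] is the root.
        # ASSUMPTION: child index 0 means "empty", since the root is never a child.
        # Internal node: list of 16 child indices. Leaf: (score, addr) tuple.
        self.nodes = [[0] * FANOUT]

    def insert(self, score, addr):
        idx, depth = 0, 0
        while True:
            node = self.nodes[idx]
            n = nibble(score, depth)
            child = node[n]
            if child == 0:
                # Free slot: store the new leaf here.
                self.nodes.append((score, addr))
                node[n] = len(self.nodes) - 1
                return
            entry = self.nodes[child]
            if isinstance(entry, tuple):
                if entry[0] == score:
                    return  # score already indexed
                # Slot holds a different leaf: push it one level down
                # under a new internal node, then retry from there.
                self.nodes.append([0] * FANOUT)
                new_idx = len(self.nodes) - 1
                self.nodes[new_idx][nibble(entry[0], depth + 1)] = child
                node[n] = new_idx
                idx, depth = new_idx, depth + 1
            else:
                idx, depth = child, depth + 1

    def lookup(self, score):
        idx, depth = 0, 0
        while True:
            child = self.nodes[idx][nibble(score, depth)]
            if child == 0:
                return None  # negative result: the path ends in an empty slot
            entry = self.nodes[child]
            if isinstance(entry, tuple):
                return entry[1] if entry[0] == score else None
            idx, depth = child, depth + 1

    def walk(self, idx=0):
        # Depth-first in nibble order yields leaves in lexical score order.
        for child in self.nodes[idx]:
            if child == 0:
                continue
            entry = self.nodes[child]
            if isinstance(entry, tuple):
                yield entry
            else:
                yield from self.walk(child)
```

A negative lookup stops at the first empty slot or mismatching leaf, so it never walks further than a successful one, which would explain why no Bloom filter is needed. Walking the children in nibble order gives the lexical listing that the -v option is described as printing.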
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T21878aa53884911b-M7cf5960cda854dba36823793
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription