From: wb.kloke@gmail.com
To: 9fans <9fans@9fans.net>
Subject: Re: [9fans] yet another try to fixup venti
Date: Thu, 13 Jun 2024 11:52:35 -0400 [thread overview]
Message-ID: <17182939550.C8ecf.211654@composer.9fans.topicbox.com> (raw)
In-Reply-To: <9C832F275135C5830776F4656A698D43@eigenstate.org>
[-- Attachment #1: Type: text/plain, Size: 1554 bytes --]
On Thursday, 13 June 2024, at 6:08 AM, ori wrote:
> Sounds fairly interesting, though I'm curious how it compares;
my guess was that because of the lack of locality due to using
hashes for the score, a trie wouldn't be that different from a
hash table.
You are right. Lack of locality is a main issue.
But, as Noam Preil has shown in his workshop presentation, it is nearly impossible to work on the current implementation. Example: index.c has 19202 bytes, whereas trie.c has 3232, doing roughly the same work.
The main drawback of the current venti is the disk based index. A fair comparison will have to put this on a ramdisk. The index section of my data set now is 5GB, but 2GB would be surely sufficient. When I have my implemention ready to work (this is a Herculian task, but I intend just to disable index disk writing without changing the rest), I shall publish the performance data.
One idea to improve locality is avoiding the storage of the full score in the trie nodes. Only the nodes referring to the first 6 nibbles tend to be densely populated. The leaves need not really to contain the score, as the index addr stored there also points to a place where the score is found in the the arena partition. This should not really be detrimental for performance as the block is needed anyway.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T21878aa53884911b-Mf21f7a13249d0b37cde75c53
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
[-- Attachment #2: Type: text/html, Size: 2269 bytes --]
next prev parent reply other threads:[~2024-06-13 15:52 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-11 20:52 wb.kloke
2024-06-12 20:12 ` [9fans] " wb.kloke
2024-06-13 4:08 ` [9fans] " ori
2024-06-13 15:52 ` wb.kloke [this message]
2024-06-13 19:41 ` wb.kloke
2024-06-16 9:19 ` wb.kloke
2024-06-20 15:32 ` wb.kloke
2024-08-16 17:27 ` wb.kloke
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=17182939550.C8ecf.211654@composer.9fans.topicbox.com \
--to=wb.kloke@gmail.com \
--cc=9fans@9fans.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).