On Thursday, 13 June 2024, at 6:08 AM, ori wrote: > Sounds fairly interesting, though I'm curious how it compares; my guess was that because of the lack of locality due to using hashes for the score, a trie wouldn't be that different from a hash table. You are right. Lack of locality is a main issue.  But, as Noam Preil has shown in his workshop presentation, it is nearly impossible to work on the current implementation. Example: index.c has 19202 bytes, whereas trie.c has 3232, doing roughly the same work. The main drawback of the current venti is the disk based index. A fair comparison will have to put this on a ramdisk. The index section of my data set now is 5GB, but 2GB would be surely sufficient. When I have my implemention ready to work (this is a Herculian task, but I intend just to disable index disk writing without changing the rest), I shall publish the performance data. One idea to improve locality is avoiding  the storage of the full score in the trie nodes. Only the nodes referring to the first 6 nibbles tend to be densely populated. The leaves  need not really to contain the score, as the index addr stored there also points to a place where the score is found in the the arena partition. This should not really be detrimental for performance as the block is needed anyway. ------------------------------------------ 9fans: 9fans Permalink: https://9fans.topicbox.com/groups/9fans/T21878aa53884911b-Mf21f7a13249d0b37cde75c53 Delivery options: https://9fans.topicbox.com/groups/9fans/subscription