From: wb.kloke@gmail.com
To: 9fans <9fans@9fans.net>
Date: Tue, 11 Jun 2024 16:52:30 -0400
Subject: [9fans] yet another try to fixup venti
Message-Id: <17181391500.35F5.93227@composer.9fans.topicbox.com>
Archived-At: <https://9fans.topicbox.com/groups/9fans/T21878aa53884911b-Mb074534433ed9a094542eef4>

After studying Steve Stallion's SSD venti disaster, I decided to make my own attempt at fixing the issues of venti.

Despite my reservations about the lasting wisdom of some of the design choices, I am trying to keep the traditional arena disk layout.
Only the on-disk index is replaced, with a trie-based in-memory structure.

The trie nodes represent either a leaf holding the score and IAddr data, or 16 indices, one for each value of the next nibble of the score, to search further. There is no need for a Bloom filter, as the trie search performs no worse for negative results. The current trie node size is 64 bytes, but it can probably be shortened to 48 bytes.
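
The node layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual buildtrie code: the names (TrieNode, Leaf, trieinsert, trielookup), the use of 0 as the "empty" reference, the high bit as a leaf marker, and the plain uint64_t standing in for venti's IAddr are all hypothetical. It does show why a miss is cheap: the search stops at the first empty slot.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { ScoreSize = 20, Fanout = 16 };
#define LEAF_BIT 0x80000000u    /* assumption: high bit of a reference marks a leaf */

typedef struct {
    uint8_t  score[ScoreSize];  /* SHA-1 score of the clump */
    uint64_t addr;              /* hypothetical stand-in for venti's IAddr */
} Leaf;

typedef union {
    uint32_t child[Fanout];     /* interior node: one reference per nibble value */
    Leaf     leaf;              /* leaf node */
} TrieNode;                     /* 16 * 4 = 64 bytes */

typedef struct {
    TrieNode *node;             /* contiguous array; node[0] is the root */
    uint32_t  n, cap;
} Trie;

/* Nibble number depth of the score, high nibble of each byte first. */
static unsigned nibble(const uint8_t *score, unsigned depth)
{
    uint8_t b = score[depth / 2];
    return (depth & 1) ? (b & 0x0Fu) : (unsigned)(b >> 4);
}

/* Append a zeroed node; the array may move, so callers keep indices only. */
static uint32_t newnode(Trie *t)
{
    if (t->n == t->cap) {
        t->cap = t->cap ? 2 * t->cap : 64;
        t->node = realloc(t->node, t->cap * sizeof *t->node);
        assert(t->node != NULL);
    }
    memset(&t->node[t->n], 0, sizeof *t->node);
    return t->n++;
}

static void trieinit(Trie *t)
{
    t->node = NULL;
    t->n = t->cap = 0;
    newnode(t);                 /* root gets index 0, so 0 can mean "empty" */
}

static void trieinsert(Trie *t, const uint8_t *score, uint64_t addr)
{
    uint32_t cur = 0;
    unsigned depth = 0;

    for (;;) {
        unsigned nib = nibble(score, depth);
        uint32_t ref = t->node[cur].child[nib];

        if (ref == 0) {         /* free slot: store a new leaf here */
            uint32_t l = newnode(t);
            memcpy(t->node[l].leaf.score, score, ScoreSize);
            t->node[l].leaf.addr = addr;
            t->node[cur].child[nib] = l | LEAF_BIT;
            return;
        }
        if (ref & LEAF_BIT) {
            uint32_t old = ref & ~LEAF_BIT;
            if (memcmp(t->node[old].leaf.score, score, ScoreSize) == 0)
                return;         /* duplicate score: keep the first address */
            /* Collision: push the old leaf one level down and retry there. */
            uint32_t in = newnode(t);
            t->node[in].child[nibble(t->node[old].leaf.score, depth + 1)] = ref;
            t->node[cur].child[nib] = in;
            ref = in;
        }
        cur = ref;
        depth++;
    }
}

/* Returns 1 and sets *addr on a hit, 0 on a miss. */
static int trielookup(const Trie *t, const uint8_t *score, uint64_t *addr)
{
    uint32_t ref = 0;

    for (unsigned depth = 0; depth < 2 * ScoreSize; depth++) {
        ref = t->node[ref].child[nibble(score, depth)];
        if (ref == 0)
            return 0;           /* a miss ends at the first empty slot */
        if (ref & LEAF_BIT) {
            const Leaf *l = &t->node[ref & ~LEAF_BIT].leaf;
            if (memcmp(l->score, score, ScoreSize) != 0)
                return 0;
            *addr = l->addr;
            return 1;
        }
    }
    return 0;
}
```

With this layout, after trieinit, inserting two scores that share their first three nibbles creates three interior nodes below the root before the leaves diverge, and trielookup on a score that was never inserted returns 0 without any further check.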

So far, I have managed to convert buildindex into buildtrie. If the -v option is used, the contents of the trie are printed in lexical order of the scores.
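
Printing in lexical order falls out of the trie shape: a depth-first walk that visits the 16 children of each node from 0 to 15 reaches the leaves in ascending score order. The sketch below illustrates this with a hypothetical node layout (0 as the empty reference, high bit as a leaf marker); the names collect and printscore are made up here and are not taken from buildtrie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { ScoreSize = 20, Fanout = 16 };
#define LEAF_BIT 0x80000000u    /* assumption: high bit of a reference marks a leaf */

typedef union {
    uint32_t child[Fanout];     /* interior node: one reference per nibble value */
    struct {
        uint8_t  score[ScoreSize];
        uint64_t addr;          /* hypothetical stand-in for venti's IAddr */
    } leaf;
} TrieNode;

/*
 * Walk the subtree rooted at node[ref] depth-first, children 0..15 in order,
 * and append a pointer to each leaf score to out[]. Returns the new count.
 * The caller must provide room for every leaf in the trie.
 */
static size_t collect(const TrieNode *node, uint32_t ref,
                      const uint8_t **out, size_t n)
{
    for (unsigned i = 0; i < Fanout; i++) {
        uint32_t c = node[ref].child[i];
        if (c == 0)
            continue;           /* empty slot */
        if (c & LEAF_BIT)
            out[n++] = node[c & ~LEAF_BIT].leaf.score;
        else
            n = collect(node, c, out, n);
    }
    return n;
}

/* Print one score as 40 hex digits followed by a newline. */
static void printscore(const uint8_t *score)
{
    for (unsigned i = 0; i < ScoreSize; i++)
        printf("%02x", score[i]);
    printf("\n");
}
```

Calling collect(node, 0, out, 0) on the root and then printscore on each entry of out yields the listing in sorted order without any separate sort step.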

The data from my experiments are:

I used my 4 arena files, each 20GB, containing about 10 million clumps in standard 500MB arenas. The data from the arena directories is read in about one and a half minutes. (There is one error in one of the arenas.) IMHO this is acceptable as startup time for a venti server.

The trie has about 14m nodes, which are stored in a contiguous array. The trie, which currently uses 32-bit indices, could thus be reduced to 24-bit indices for the current amount of data.
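
The numbers can be checked with a little arithmetic. The sketch below assumes that "14m" means 14 million nodes and that the node size is simply 16 child indices times the index width (16 x 4 = 64 bytes, 16 x 3 = 48 bytes), which matches the sizes quoted in this message but is not confirmed by it. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { Fanout = 16 };   /* one child index per nibble value */

/* Smallest number of bits able to index n distinct nodes (n >= 1). */
static unsigned indexbits(uint64_t n)
{
    unsigned bits = 0;

    while (bits < 64 && ((uint64_t)1 << bits) < n)
        bits++;
    return bits;
}

/* Node size if a node is exactly Fanout indices of indexbytes each. */
static uint64_t nodebytes(unsigned indexbytes)
{
    return (uint64_t)Fanout * indexbytes;
}

/* Memory taken by nnodes such nodes in one contiguous array. */
static uint64_t triebytes(uint64_t nnodes, unsigned indexbytes)
{
    return nnodes * nodebytes(indexbytes);
}
```

For 14,000,000 nodes, indexbits returns 24 (2^23 is too small, 2^24 = 16,777,216 is enough), and triebytes gives 896,000,000 bytes with 4-byte indices against 672,000,000 bytes with 3-byte indices.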

For larger storage, there is a design choice: either use 24-bit indices and 48-byte trie nodes in 256 trie arrays, or use 32-bit indices and 64-byte trie nodes in a single array.
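
One way the first variant could look is sketched below. Two things here are assumptions rather than anything stated in this message: that the 256 arrays are selected by the first byte of the score, and that each 24-bit index is stored as three little-endian bytes. The names Node24, getchild, setchild, shard and capacity are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { Fanout = 16, IndexBytes = 3 };

/* Interior node with 16 packed 24-bit child indices: 16 * 3 = 48 bytes. */
typedef struct {
    uint8_t child[Fanout][IndexBytes];
} Node24;

/* Read the 24-bit child index for nibble nib (stored little-endian). */
static uint32_t getchild(const Node24 *n, unsigned nib)
{
    const uint8_t *p = n->child[nib];

    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Store a 24-bit child index for nibble nib. */
static void setchild(Node24 *n, unsigned nib, uint32_t idx)
{
    uint8_t *p = n->child[nib];

    assert(idx < (1u << 24));   /* must fit in three bytes */
    p[0] = (uint8_t)(idx & 0xFF);
    p[1] = (uint8_t)((idx >> 8) & 0xFF);
    p[2] = (uint8_t)((idx >> 16) & 0xFF);
}

/* Assumption: the first score byte picks one of the 256 arrays. */
static unsigned shard(const uint8_t *score)
{
    return score[0];
}

/* Number of nodes addressable with narrays arrays of indexbits-bit indices. */
static uint64_t capacity(unsigned narrays, unsigned indexbits)
{
    return (uint64_t)narrays << indexbits;
}
```

Under these assumptions both variants address the same number of nodes, since 256 x 2^24 = 2^32, so the choice is between 25% smaller nodes and the simplicity of a single array with native 32-bit indices.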

Once I manage to push my data to a plan9port fork on GitHub, you will hear more.