From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/8974 Path: news.gmane.org!not-for-mail From: Szabolcs Nagy Newsgroups: gmane.linux.lib.musl.general Subject: Re: Re: AVL tree: storing balances instead of heights Date: Thu, 10 Dec 2015 13:14:50 +0100 Message-ID: <20151210121449.GC23362@port70.net> References: <20151207130344.GZ23362@port70.net> <20151207144621.GA23362@port70.net> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Trace: ger.gmane.org 1449749718 23034 80.91.229.3 (10 Dec 2015 12:15:18 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 10 Dec 2015 12:15:18 +0000 (UTC) Cc: Ed Schouten To: musl@lists.openwall.com Original-X-From: musl-return-8987-gllmg-musl=m.gmane.org@lists.openwall.com Thu Dec 10 13:15:10 2015 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1a707p-0002F8-8h for gllmg-musl@m.gmane.org; Thu, 10 Dec 2015 13:15:09 +0100 Original-Received: (qmail 1492 invoked by uid 550); 10 Dec 2015 12:15:02 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 1466 invoked from network); 10 Dec 2015 12:15:01 -0000 Mail-Followup-To: musl@lists.openwall.com, Ed Schouten Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Xref: news.gmane.org gmane.linux.lib.musl.general:8974 Archived-At: * Ed Schouten [2015-12-10 10:21:45 +0100]: >=20 > I did some more research and it turns out that AVL trees also allow > you to apply top-down rebalancing instead of bottom-up. The reason for > this is that at each level rebalancing never affects the child along > the path, only its sibling. This approach isn't covered in literature > extensively, but this blog post describes the process quite well: >=20 > http://neil.brown.name/blog/20041124101820 > http://neil.brown.name/blog/20041124141849 >=20 > The downside of such an approach is that the tree needs to be > traversed forward twice, meaning that a na=EFve implementation would > need to perform a larger number of comparisons. This can be mitigated > by storing the path during the first traversal. As the number of nodes > of the tree can never exceed the number of bytes in the address space, > we know that two uintptr_t's provide enough bits to store any path. >=20 > I've done some measurements and such an implementation only seems to > be 7% slower for insertions and 4% slower for deletions when compared > to my previous recursive implementation. The advantage is that > recursion is eliminated. Memory use is constant, regardless of the > depth of the tree. >=20 eliminating recursion is not that important. you can turn depth*stackframe into depth bits + 1 stackframe stack usage, but avl tree is very balanced so depth is small (e.g. < 60 with 47bit address space) there are plenty libc apis that use more stack than that, so i think the optimizations only make sense if the code size remains small (or if you provide a non-standard api that's reasonable and use it somewhere). > I also benchmarked an iterative bottom-up implementation. This version > was about 25% faster for deletions, but had the downside of requiring > an array of parent pointers on the stack (~750 bytes). >=20 > As usual, here are links to my source files: >=20 > https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tsearch.c > https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tdelete.c > https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/search_im= pl.h >=20 > And a link to a spreadsheet with some benchmark results, where I > measure the amount of time and number of comparisons of 10 million > insertions and deletions of keys in ascending order: >=20 > https://docs.google.com/spreadsheets/d/1Uy1Kgz9SGf_szH3K4FkyxxioHmBlRu9Ls= QjDqkBw8v8/edit#gid=3D0 >=20 can you add code size column as well? (sum of size of x86_64 pic .text+.data+.bss) i assume freebsd does not balance the tree so it timeouts (which also shows why this api is not useful in practice) glibc uses red-black trees which do less balancing operations so insert/delete may be faster, but lookup is slightly slower. (but it seems it does more comparisions and that can be slow) i assume the musl code there already has the tdelete balancing patch applied and it is compiled with -Os. performance also depends on the allocator since insert/delete has to malloc/free (yet another issue with the api) musl's allocator is i think still better for realtime systems than the jemalloc used in cloudlibc, but it will have worse average performance in benchmarks like this. thanks.