From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/8974
Path: news.gmane.org!not-for-mail
From: Szabolcs Nagy <nsz@port70.net>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Re: AVL tree: storing balances instead of heights
Date: Thu, 10 Dec 2015 13:14:50 +0100
Message-ID: <20151210121449.GC23362@port70.net>
References: <CABh_MKm1Ow5_dc4ENJM-fAqWc+uPPZgftRdSDxPJiPXZAfZ+FQ@mail.gmail.com>
 <20151207130344.GZ23362@port70.net>
 <CABh_MKnkw_Yp9ifwybXBOMb0FuPvP05T183utZ1kW6Ohqq3gvQ@mail.gmail.com>
 <20151207144621.GA23362@port70.net>
 <CABh_MKmvtxKsMjdzNrqrXQ2VNZ8PMsN=iNd1NRhhrMC6PrYsug@mail.gmail.com>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: ger.gmane.org 1449749718 23034 80.91.229.3 (10 Dec 2015 12:15:18 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Thu, 10 Dec 2015 12:15:18 +0000 (UTC)
Cc: Ed Schouten <ed@nuxi.nl>
To: musl@lists.openwall.com
Original-X-From: musl-return-8987-gllmg-musl=m.gmane.org@lists.openwall.com Thu Dec 10 13:15:10 2015
Return-path: <musl-return-8987-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by plane.gmane.org with smtp (Exim 4.69)
	(envelope-from <musl-return-8987-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1a707p-0002F8-8h
	for gllmg-musl@m.gmane.org; Thu, 10 Dec 2015 13:15:09 +0100
Original-Received: (qmail 1492 invoked by uid 550); 10 Dec 2015 12:15:02 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
Original-Received: (qmail 1466 invoked from network); 10 Dec 2015 12:15:01 -0000
Mail-Followup-To: musl@lists.openwall.com, Ed Schouten <ed@nuxi.nl>
Content-Disposition: inline
In-Reply-To: <CABh_MKmvtxKsMjdzNrqrXQ2VNZ8PMsN=iNd1NRhhrMC6PrYsug@mail.gmail.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Xref: news.gmane.org gmane.linux.lib.musl.general:8974
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/8974>

* Ed Schouten <ed@nuxi.nl> [2015-12-10 10:21:45 +0100]:
>=20
> I did some more research and it turns out that AVL trees also allow
> you to apply top-down rebalancing instead of bottom-up. The reason for
> this is that at each level rebalancing never affects the child along
> the path, only its sibling. This approach isn't covered in literature
> extensively, but this blog post describes the process quite well:
>=20
> http://neil.brown.name/blog/20041124101820
> http://neil.brown.name/blog/20041124141849
>=20
> The downside of such an approach is that the tree needs to be
> traversed forward twice, meaning that a na=EFve implementation would
> need to perform a larger number of comparisons. This can be mitigated
> by storing the path during the first traversal. As the number of nodes
> of the tree can never exceed the number of bytes in the address space,
> we know that two uintptr_t's provide enough bits to store any path.
>=20
> I've done some measurements and such an implementation only seems to
> be 7% slower for insertions and 4% slower for deletions when compared
> to my previous recursive implementation. The advantage is that
> recursion is eliminated. Memory use is constant, regardless of the
> depth of the tree.
>=20

eliminating recursion is not that important.

you can turn depth*stackframe into depth bits + 1 stackframe
stack usage, but avl tree is very balanced so depth is small
(e.g. < 60 with 47bit address space)

there are plenty libc apis that use more stack than that, so
i think the optimizations only make sense if the code size
remains small (or if you provide a non-standard api that's
reasonable and use it somewhere).

> I also benchmarked an iterative bottom-up implementation. This version
> was about 25% faster for deletions, but had the downside of requiring
> an array of parent pointers on the stack (~750 bytes).
>=20
> As usual, here are links to my source files:
>=20
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tsearch.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tdelete.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/search_im=
pl.h
>=20
> And a link to a spreadsheet with some benchmark results, where I
> measure the amount of time and number of comparisons of 10 million
> insertions and deletions of keys in ascending order:
>=20
> https://docs.google.com/spreadsheets/d/1Uy1Kgz9SGf_szH3K4FkyxxioHmBlRu9Ls=
QjDqkBw8v8/edit#gid=3D0
>=20

can you add code size column as well?
(sum of size of x86_64 pic .text+.data+.bss)

i assume freebsd does not balance the tree so it timeouts
(which also shows why this api is not useful in practice)

glibc uses red-black trees which do less balancing operations
so insert/delete may be faster, but lookup is slightly slower.
(but it seems it does more comparisions and that can be slow)

i assume the musl code there already has the tdelete balancing
patch applied and it is compiled with -Os.

performance also depends on the allocator since insert/delete
has to malloc/free (yet another issue with the api) musl's
allocator is i think still better for realtime systems than
the jemalloc used in cloudlibc, but it will have worse average
performance in benchmarks like this.

thanks.