mailing list of musl libc
 help / color / mirror / code / Atom feed
* Re: AVL tree: storing balances instead of heights
       [not found] <CABh_MKm1Ow5_dc4ENJM-fAqWc+uPPZgftRdSDxPJiPXZAfZ+FQ@mail.gmail.com>
@ 2015-12-07 13:03 ` Szabolcs Nagy
  2015-12-07 13:19   ` Ed Schouten
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2015-12-07 13:03 UTC (permalink / raw)
  To: Ed Schouten; +Cc: musl

* Ed Schouten <ed@nuxi.nl> [2015-12-07 09:46:54 +0100]:
> Hi Szabolcs,
> 
> Thanks again for your quick response to the bug in tdelete() that I
> reported! I ran into this issue because I was comparing
> implementations of tsearch() and tdelete() across different operating
> systems. I was doing this in preparation for adding these functions to
> CloudABI's C library.
> 
> I noticed that musl's implementation explicitly stores the height of
> the elements to compute the balance factor. What I read the other day
> is that this is not strictly necessary. It turns out that just storing
> the balances is sufficient. Though this required me to spend some time
> making notes to distinguish all of the states individually, it looks
> like the implementation itself becomes a lot simpler. This approach
> also makes it possible to more easily stop rebalancing as soon as a
> subtree has become balanced again.
> 
> Below are links to my implementation of these functions:
> 
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tsearch.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tdelete.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/search_impl.h
> 
> If you like them, feel free to include them in musl as well. They are
> currently 2-clause BSD licensed, but I don't mind making them
> available under the MIT license as well if this is more practical for
> you folks.
> 

adding musl@ as it should be discussed there.

yes it is enough to only store the height difference.

note that this code was supposed to be size optimized in musl:
i think the tsearch api is not well designed and thus rarely
useful.

your balancing approach is better (i wouldn't call it a lot
simpler, but it is more efficient).

musl should probably move tfind and twalk into separate tu
(but tsearch and tdelete need the same balancing logic so
i'd keep those together).

i think the macro definitions for inlining twalk and tfind
are not justified.

you got the rootp==0 case wrong too (posix requires that to
return 0).

and the (T**) cast is invalid in

	void *tdelete(const void *restrict key, void **restrict rootp,
	              int (*compar)(const void *, const void *)) {
	  void *result = (void *)1;
	  tdelete_recurse(key, (struct __tnode **)rootp, compar, &result);
	  return result;
	}

posix specifies to return a pointer to a node, not to an element
pointer, i think that's a bug in posix (otherwise the api would
be useless).


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: AVL tree: storing balances instead of heights
  2015-12-07 13:03 ` AVL tree: storing balances instead of heights Szabolcs Nagy
@ 2015-12-07 13:19   ` Ed Schouten
  2015-12-07 13:22     ` Ed Schouten
  2015-12-07 14:46     ` Szabolcs Nagy
  0 siblings, 2 replies; 10+ messages in thread
From: Ed Schouten @ 2015-12-07 13:19 UTC (permalink / raw)
  To: Ed Schouten, musl

Hi Szabolcs,

2015-12-07 14:03 GMT+01:00 Szabolcs Nagy <nsz@port70.net>:
> i think the macro definitions for inlining twalk and tfind
> are not justified.

I personally think it is justified in this case. In trivial cases it
may even mean that the compare function is inlined into the body of
twalk()/tfind().

I haven't measured this for twalk() and tfind() explicitly, but my
observation for bsearch() was that a separate compare function and an
invocation of bsearch() is about the same size as inlining the
bsearch() routine.

> you got the rootp==0 case wrong too (posix requires that to
> return 0).

That already happens, right? tdelete() sets it to (void *)1, but
tdelete_recurse() then sets it to NULL immediately.

> and the (T**) cast is invalid in
>
>         void *tdelete(const void *restrict key, void **restrict rootp,
>                       int (*compar)(const void *, const void *)) {
>           void *result = (void *)1;
>           tdelete_recurse(key, (struct __tnode **)rootp, compar, &result);
>           return result;
>         }

Could you please elaborate this a bit further?

> posix specifies to return a pointer to a node, not to an element
> pointer, i think that's a bug in posix (otherwise the api would
> be useless).

Yes. On lines 40 and 44 the result pointer is set to the parent
object. As __key is the first member of the structure, I could have
written *n instead of &(*n)->__key, but this implementation does not
strongly enforce it. I could put __key anywhere in the node structure
and it would always return a pointer to a pointer to a key.

-- 
Ed Schouten <ed@nuxi.nl>
Nuxi, 's-Hertogenbosch, the Netherlands
KvK-nr.: 62051717


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: AVL tree: storing balances instead of heights
  2015-12-07 13:19   ` Ed Schouten
@ 2015-12-07 13:22     ` Ed Schouten
  2015-12-07 14:46     ` Szabolcs Nagy
  1 sibling, 0 replies; 10+ messages in thread
From: Ed Schouten @ 2015-12-07 13:22 UTC (permalink / raw)
  To: Ed Schouten, musl

2015-12-07 14:19 GMT+01:00 Ed Schouten <ed@nuxi.nl>:
>> you got the rootp==0 case wrong too (posix requires that to
>> return 0).
>
> That already happens, right? tdelete() sets it to (void *)1, but
> tdelete_recurse() then sets it to NULL immediately.

Ah, sorry. If rootp itself is NULL, it also needs to return NULL.
Thanks for spotting this.

-- 
Ed Schouten <ed@nuxi.nl>
Nuxi, 's-Hertogenbosch, the Netherlands
KvK-nr.: 62051717


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: AVL tree: storing balances instead of heights
  2015-12-07 13:19   ` Ed Schouten
  2015-12-07 13:22     ` Ed Schouten
@ 2015-12-07 14:46     ` Szabolcs Nagy
  2015-12-10  9:21       ` Ed Schouten
  1 sibling, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2015-12-07 14:46 UTC (permalink / raw)
  To: musl; +Cc: Ed Schouten

* Ed Schouten <ed@nuxi.nl> [2015-12-07 14:19:58 +0100]:
> 2015-12-07 14:03 GMT+01:00 Szabolcs Nagy <nsz@port70.net>:
> > and the (T**) cast is invalid in
> >
> >         void *tdelete(const void *restrict key, void **restrict rootp,
> >                       int (*compar)(const void *, const void *)) {
> >           void *result = (void *)1;
> >           tdelete_recurse(key, (struct __tnode **)rootp, compar, &result);
> >           return result;
> >         }
> 
> Could you please elaborate this a bit further?
> 

	void *a = 0;
	void **p = &a;
	T **q = (T**)p;

may be undefined behaviour (if T* and void* have
different alignment requirement) otherwise q has
some arbitrary value that is only meaningful if
it's converted back to (void**).

	T *b = *(T**)p;

is undefined behaviour.

in general it is not guaranteed that void* or void**
is represented the same way as other pointers, the
guarantee is that all object pointers can be converted
to/from void* and may be possible to convert to/from
other object pointers, but in either case you have to
convert the pointer back to the original type to get
a meaningful value that can be dereferenced.
http://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7

the usage of void** is one of the problems with this
interface.

> > posix specifies to return a pointer to a node, not to an element
> > pointer, i think that's a bug in posix (otherwise the api would
> > be useless).
> 
> Yes. On lines 40 and 44 the result pointer is set to the parent
> object. As __key is the first member of the structure, I could have
> written *n instead of &(*n)->__key, but this implementation does not
> strongly enforce it. I could put __key anywhere in the node structure
> and it would always return a pointer to a pointer to a key.

returning *n instead of &(*n)->key would be wrong, so the
musl implementation should be fixed, and the spec should
be fixed too because with the current wording the caller
cannot do anything with the return value.

same problem as above:

	struct node { void *key; } *n = ...;
	void **pkey = (void*)n;
	void *key = *pkey;

is ub, pkey may not be meaningful.

(i'd audit cloud libc for any use of pointer casts to
make sure there is no such bugs, this is one of the
reasons why musl very rarely uses casts.. in this case
the type-unsafe void* return value bit us though.)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: AVL tree: storing balances instead of heights
  2015-12-07 14:46     ` Szabolcs Nagy
@ 2015-12-10  9:21       ` Ed Schouten
  2015-12-10 12:14         ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Ed Schouten @ 2015-12-10  9:21 UTC (permalink / raw)
  To: musl, Ed Schouten

Hi there,

2015-12-07 15:46 GMT+01:00 Szabolcs Nagy <nsz@port70.net>:
> in general it is not guaranteed that void* or void**
> is represented the same way as other pointers, the
> guarantee is that all object pointers can be converted
> to/from void* and may be possible to convert to/from
> other object pointers, but in either case you have to
> convert the pointer back to the original type to get
> a meaningful value that can be dereferenced.
> http://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7

Ah, thanks for clarifying. I didn't know that.

I did some more research and it turns out that AVL trees also allow
you to apply top-down rebalancing instead of bottom-up. The reason for
this is that at each level rebalancing never affects the child along
the path, only its sibling. This approach isn't covered in literature
extensively, but this blog post describes the process quite well:

http://neil.brown.name/blog/20041124101820
http://neil.brown.name/blog/20041124141849

The downside of such an approach is that the tree needs to be
traversed forward twice, meaning that a naïve implementation would
need to perform a larger number of comparisons. This can be mitigated
by storing the path during the first traversal. As the number of nodes
of the tree can never exceed the number of bytes in the address space,
we know that two uintptr_t's provide enough bits to store any path.

I've done some measurements and such an implementation only seems to
be 7% slower for insertions and 4% slower for deletions when compared
to my previous recursive implementation. The advantage is that
recursion is eliminated. Memory use is constant, regardless of the
depth of the tree.

I also benchmarked an iterative bottom-up implementation. This version
was about 25% faster for deletions, but had the downside of requiring
an array of parent pointers on the stack (~750 bytes).

As usual, here are links to my source files:

https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tsearch.c
https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tdelete.c
https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/search_impl.h

And a link to a spreadsheet with some benchmark results, where I
measure the amount of time and number of comparisons of 10 million
insertions and deletions of keys in ascending order:

https://docs.google.com/spreadsheets/d/1Uy1Kgz9SGf_szH3K4FkyxxioHmBlRu9LsQjDqkBw8v8/edit#gid=0

Best regards,
-- 
Ed Schouten <ed@nuxi.nl>
Nuxi, 's-Hertogenbosch, the Netherlands
KvK-nr.: 62051717


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: AVL tree: storing balances instead of heights
  2015-12-10  9:21       ` Ed Schouten
@ 2015-12-10 12:14         ` Szabolcs Nagy
  2015-12-10 13:12           ` Ed Schouten
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2015-12-10 12:14 UTC (permalink / raw)
  To: musl; +Cc: Ed Schouten

* Ed Schouten <ed@nuxi.nl> [2015-12-10 10:21:45 +0100]:
> 
> I did some more research and it turns out that AVL trees also allow
> you to apply top-down rebalancing instead of bottom-up. The reason for
> this is that at each level rebalancing never affects the child along
> the path, only its sibling. This approach isn't covered in literature
> extensively, but this blog post describes the process quite well:
> 
> http://neil.brown.name/blog/20041124101820
> http://neil.brown.name/blog/20041124141849
> 
> The downside of such an approach is that the tree needs to be
> traversed forward twice, meaning that a naïve implementation would
> need to perform a larger number of comparisons. This can be mitigated
> by storing the path during the first traversal. As the number of nodes
> of the tree can never exceed the number of bytes in the address space,
> we know that two uintptr_t's provide enough bits to store any path.
> 
> I've done some measurements and such an implementation only seems to
> be 7% slower for insertions and 4% slower for deletions when compared
> to my previous recursive implementation. The advantage is that
> recursion is eliminated. Memory use is constant, regardless of the
> depth of the tree.
> 

eliminating recursion is not that important.

you can turn depth*stackframe into depth bits + 1 stackframe
stack usage, but avl tree is very balanced so depth is small
(e.g. < 60 with 47bit address space)

there are plenty libc apis that use more stack than that, so
i think the optimizations only make sense if the code size
remains small (or if you provide a non-standard api that's
reasonable and use it somewhere).

> I also benchmarked an iterative bottom-up implementation. This version
> was about 25% faster for deletions, but had the downside of requiring
> an array of parent pointers on the stack (~750 bytes).
> 
> As usual, here are links to my source files:
> 
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tsearch.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tdelete.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/search_impl.h
> 
> And a link to a spreadsheet with some benchmark results, where I
> measure the amount of time and number of comparisons of 10 million
> insertions and deletions of keys in ascending order:
> 
> https://docs.google.com/spreadsheets/d/1Uy1Kgz9SGf_szH3K4FkyxxioHmBlRu9LsQjDqkBw8v8/edit#gid=0
> 

can you add code size column as well?
(sum of size of x86_64 pic .text+.data+.bss)

i assume freebsd does not balance the tree so it timeouts
(which also shows why this api is not useful in practice)

glibc uses red-black trees which do less balancing operations
so insert/delete may be faster, but lookup is slightly slower.
(but it seems it does more comparisions and that can be slow)

i assume the musl code there already has the tdelete balancing
patch applied and it is compiled with -Os.

performance also depends on the allocator since insert/delete
has to malloc/free (yet another issue with the api) musl's
allocator is i think still better for realtime systems than
the jemalloc used in cloudlibc, but it will have worse average
performance in benchmarks like this.

thanks.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: AVL tree: storing balances instead of heights
  2015-12-10 12:14         ` Szabolcs Nagy
@ 2015-12-10 13:12           ` Ed Schouten
  2015-12-10 13:43             ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Ed Schouten @ 2015-12-10 13:12 UTC (permalink / raw)
  To: musl, Ed Schouten

Hi Szabolcs,

2015-12-10 13:14 GMT+01:00 Szabolcs Nagy <nsz@port70.net>:
> can you add code size column as well?
> (sum of size of x86_64 pic .text+.data+.bss)

I've added text sizes for all of the functions, building them using
these compilers:

GCC: (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010
Clang: 3.7.0-2ubuntu1 (tags/RELEASE_370/final) (based on LLVM 3.7.0)

None of the functions use any data/bss, so I've only added the text
sizes. As some implementations have some code reuse between tsearch()
and tdelete(), I've done both separate and combined builds of these
two functions.

The sizes of the glibc implementation is an estimate, as I didn't
build that one manually.

> i assume freebsd does not balance the tree so it timeouts
> (which also shows why this api is not useful in practice)

Exactly.

> glibc uses red-black trees which do less balancing operations
> so insert/delete may be faster, but lookup is slightly slower.
> (but it seems it does more comparisions and that can be slow)

Yes. Comparisons are rather expensive (one callback per comparison),
so we'd better optimize for a tree that is as flat as possible.

> i assume the musl code there already has the tdelete balancing
> patch applied and it is compiled with -Os.

It was the latest version in Git, using -O2.

> performance also depends on the allocator since insert/delete
> has to malloc/free (yet another issue with the api) musl's
> allocator is i think still better for realtime systems than
> the jemalloc used in cloudlibc, but it will have worse average
> performance in benchmarks like this.

Yes. All of the tests were run on Linux, using glibc. Only the
tsearch()/tdelete() implementations are different between tests. They
all use glibc's standard malloc().

-- 
Ed Schouten <ed@nuxi.nl>
Nuxi, 's-Hertogenbosch, the Netherlands
KvK-nr.: 62051717


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: AVL tree: storing balances instead of heights
  2015-12-10 13:12           ` Ed Schouten
@ 2015-12-10 13:43             ` Szabolcs Nagy
  2015-12-20 21:43               ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2015-12-10 13:43 UTC (permalink / raw)
  To: musl; +Cc: Ed Schouten

* Ed Schouten <ed@nuxi.nl> [2015-12-10 14:12:08 +0100]:
> 2015-12-10 13:14 GMT+01:00 Szabolcs Nagy <nsz@port70.net>:
> > performance also depends on the allocator since insert/delete
> > has to malloc/free (yet another issue with the api) musl's
> > allocator is i think still better for realtime systems than
> > the jemalloc used in cloudlibc, but it will have worse average
> > performance in benchmarks like this.
> 
> Yes. All of the tests were run on Linux, using glibc. Only the
> tsearch()/tdelete() implementations are different between tests. They
> all use glibc's standard malloc().
> 

ah ok.

based on the updated stats the iterative bottom up
approach seems to be a good tradeoff, i wouldn't
try to reduce stack usage further by increasing the
code size.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: AVL tree: storing balances instead of heights
  2015-12-10 13:43             ` Szabolcs Nagy
@ 2015-12-20 21:43               ` Szabolcs Nagy
  2015-12-21  1:28                 ` Szabolcs Nagy
  0 siblings, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2015-12-20 21:43 UTC (permalink / raw)
  To: musl, Ed Schouten

[-- Attachment #1: Type: text/plain, Size: 1457 bytes --]

* Szabolcs Nagy <nsz@port70.net> [2015-12-10 14:43:50 +0100]:
> 
> based on the updated stats the iterative bottom up
> approach seems to be a good tradeoff, i wouldn't
> try to reduce stack usage further by increasing the
> code size.

i tried the 'more clever' balancing approach but
i could not make it small compared to the basic
approach and i could not win by using height diff
instead of height in the nodes.

complete tsearch api implementation code size on
x86_64 (pic and non-pic as well):

tsearch_avl.o   958
tsearch_fast.o  934
tsearch_small.o 804

_avl is the current code, _small uses various tricks
to make the code smaller (it is also a bit faster
because of simpler balancing code) and _fast avoids
the unnecessary balancing steps like cloud libc does.
(_fast is non-recursive, bottom-up approach with fixed
stack size and optimized for code size, the stack array
could be a vla because the tree height is known, but
the upper bound is not huge either).

code size tricks:
- left/right children are accessed with array indexing
so symmetry can be exploited (one rotation function)
- balancing/rotation use less unnecessary operations.
- void* child pointers so the void** arguments can be
used directly without making a copy on the stack. (this
means a bit less type checking)
- recursive functions use argument order such that minimal
register shuffling is needed on most archs.

i think for musl the small one is reasonable improvement.

[-- Attachment #2: tsearch_small.c --]
[-- Type: text/x-csrc, Size: 3414 bytes --]

#include <stdlib.h>
#include <search.h>

struct node {
	const void *key;
	void *a[2];
	int h;
};

static int height(struct node *n) { return n ? n->h : 0; }

static struct node *rot(struct node *x, int dir)
{
	struct node *y = x->a[!dir];
	struct node *z = y->a[dir];
	int hz = height(z);

	if (hz > height(y->a[!dir])) {
		//      x
		// dir / \ !dir         z
		//    A   y            / \
		//       / \   -->    x   y
		//      z   D        /|   |\
		//     / \          A B   C D
		//    B   C
		x->a[!dir] = z->a[dir];
		y->a[dir] = z->a[!dir];
		z->a[dir] = x;
		z->a[!dir] = y;
		x->h = hz;
		y->h = hz;
		z->h = hz+1;
	} else {
		//      x               y
		//     / \             / \
		//    A   y    -->    x   D
		//       / \         / \
		//      z   D       A   z
		x->a[!dir] = z;
		y->a[dir] = x;
		x->h = hz+1;
		y->h = hz+2;
		z = y;
	}
	return z;
}

static struct node *balance(struct node *n)
{
	int h0 = height(n->a[0]);
	int h1 = height(n->a[1]);
	if (h0-h1+1u < 3) {
		n->h = h0>h1 ? h0+1 : h1+1;
		return n;
	}
	return rot(n, h0>h1);
}

static struct node *remove_rightmost(struct node *n, const void **pkey)
{
	if (!n->a[1]) {
		struct node *left = n->a[0];
		*pkey = n->key;
		free(n);
		return left;
	}
	n->a[1] = remove_rightmost(n->a[1], pkey);
	return balance(n);
}

static struct node *remove(const void *key, void **p,
	int (*cmp)(const void *, const void *), struct node *parent)
{
	struct node *n = *p;
	if (!n)
		return 0;
	int c = cmp(key, n->key);
	if (!c) {
		if (n->a[0]) {
			n->a[0] = remove_rightmost(n->a[0], &n->key);
			*p = balance(n);
		} else {
			*p = n->a[1];
			free(n);
		}
		return parent;
	}
	parent = remove(key, &n->a[c>0], cmp, n);
	if (parent)
		*p = balance(n);
	return parent;
}

void *tdelete(const void *restrict key, void **restrict rootp,
	int(*cmp)(const void *, const void *))
{
	if (!rootp)
		return 0;
	/* last argument is arbitrary non-null pointer
	   which is returned when the root node is deleted */
	return remove(key, rootp, cmp, *rootp);
}

static struct node *insert(const void *key, void **p,
	int (*cmp)(const void *, const void *), struct node **found)
{
	struct node *n = *p;
	struct node *r;
	if (!n) {
		n = malloc(sizeof *n);
		if (n) {
			n->key = key;
			n->a[0] = n->a[1] = 0;
			n->h = 1;
		}
		*p = n;
		*found = n;
		return n;
	}
	int c = cmp(key, n->key);
	if (!c) {
		*found = n;
		return 0;
	}
	r = insert(key, &n->a[c>0], cmp, found);
	if (r)
		*p = balance(n);
	return r;
}

void *tsearch(const void *key, void **rootp,
	int (*cmp)(const void *, const void *))
{
	if (!rootp)
		return 0;
	struct node *found;
	insert(key, rootp, cmp, &found);
	return found;
}

static struct node *find(const void *key, struct node *n,
	int (*cmp)(const void *, const void *))
{
	if (!n)
		return 0;
	int c = cmp(key, n->key);
	if (!c)
		return n;
	return find(key, n->a[c>0], cmp);
}

void *tfind(const void *key, void *const *rootp,
	int(*cmp)(const void *, const void *))
{
	if (!rootp)
		return 0;
	return find(key, *rootp, cmp);
}

static void walk(const struct node *r, void (*action)(const void *, VISIT, int), int d)
{
	if (!r)
		return;
	if (r->h == 1)
		action(r, leaf, d);
	else {
		action(r, preorder, d);
		walk(r->a[0], action, d+1);
		action(r, postorder, d);
		walk(r->a[1], action, d+1);
		action(r, endorder, d);
	}
}

void twalk(const void *root, void (*action)(const void *, VISIT, int))
{
	walk(root, action, 0);
}

[-- Attachment #3: tsearch_fast.c --]
[-- Type: text/x-csrc, Size: 3166 bytes --]

#include <search.h>
#include <stdlib.h>

#define MAXH (sizeof(void*)*8*3/2)

struct node {
	const void *key;
	void *a[2];
	int h;
};

static int height(struct node *n) { return n ? n->h : 0; }

static struct node *rot(struct node *x, int dir)
{
	struct node *y = x->a[!dir];
	struct node *z = y->a[dir];
	int hz = height(z);

	if (hz > height(y->a[!dir])) {
		//      x
		// dir / \ !dir         z
		//    A   y            / \
		//       / \   -->    x   y
		//      z   D        /|   |\
		//     / \          A B   C D
		//    B   C
		x->a[!dir] = z->a[dir];
		y->a[dir] = z->a[!dir];
		z->a[dir] = x;
		z->a[!dir] = y;
		x->h = hz;
		y->h = hz;
		z->h = hz+1;
	} else {
		//      x               y
		//     / \             / \
		//    A   y    -->    x   D
		//       / \         / \
		//      z   D       A   z
		x->a[!dir] = z;
		y->a[dir] = x;
		x->h = hz+1;
		y->h = hz+2;
		z = y;
	}
	return z;
}

void *tsearch(const void *key, void **rootp,
	int (*compar)(const void *, const void *))
{
	if (!rootp)
		return 0;

	void **a[MAXH];
	struct node *n = *rootp;
	struct node *r;
	int h, i=0;

	a[i++] = rootp;
	for (;;) {
		if (!n) break;
		int c = compar(key, n->key);
		if (!c) return n;
		a[i++] = &n->a[c>0];
		n = n->a[c>0];
	}
	r = malloc(sizeof *r);
	if (!r)
		return 0;
	r->key = key;
	r->a[0] = r->a[1] = 0;
	r->h = h = 1;
	*a[--i] = r;
	while (i) {
		n = *a[--i];
		if (n->h > h)
			break;
		n->h = ++h;
		for (int d=0; d<2; d++)
			if (h-2 > height(n->a[d])) {
				*a[i] = rot(n,d);
				return r;
			}
	}
	return r;
}

void *tdelete(const void *restrict key, void **restrict rootp,
	int(*compar)(const void *, const void *))
{
	if (!rootp)
		return 0;

	void **a[MAXH];
	struct node *n = *rootp;
	struct node *parent;
	struct node *child;
	int h, i=0;
	a[i++] = rootp;
	a[i++] = rootp;
	for (;;) {
		if (!n) return 0;
		int c = compar(key, n->key);
		if (!c) break;
		a[i++] = &n->a[c>0];
		n = n->a[c>0];
	}
	parent = *a[i-2];
	if (n->a[0]) {
		struct node *found = n;
		for (a[i++]=&n->a[0], n=n->a[0]; n->a[1]; a[i++]=&n->a[1], n=n->a[1]);
		found->key = n->key;
		child = n->a[0];
	} else {
		child = n->a[1];
	}
	free(n);
	*a[--i] = child;
	h = height(child);
	while (--i) {
		int d, hd;

		h++;
		n = *a[i];
		hd = height(n->a[d=0]);
		if (h > hd)
			hd = height(n->a[d=1]);
		if (h < hd) {
			*a[i] = n = rot(n,!d);
			h = hd;
			if (h < n->h)
				break;
		} else if (h == hd) {
			break;
		} else {
			n->h = h;
		}
	}
	return parent;
}

void *tfind(const void *key, void *const *rootp,
	int(*compar)(const void *, const void *))
{
	if (!rootp)
		return 0;

	struct node *n = *rootp;
	for (;;) {
		if (!n) break;
		int c = compar(key, n->key);
		if (c < 0)
			n = n->a[0];
		else if (c > 0)
			n = n->a[1];
		else
			break;
	}
	return n;
}

static void walk(const struct node *r, void (*action)(const void *, VISIT, int), int d)
{
	if (r == 0)
		return;
	if (r->h == 1)
		action(r, leaf, d);
	else {
		action(r, preorder, d);
		walk(r->a[0], action, d+1);
		action(r, postorder, d);
		walk(r->a[1], action, d+1);
		action(r, endorder, d);
	}
}

void twalk(const void *root, void (*action)(const void *, VISIT, int))
{
	walk(root, action, 0);
}

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: AVL tree: storing balances instead of heights
  2015-12-20 21:43               ` Szabolcs Nagy
@ 2015-12-21  1:28                 ` Szabolcs Nagy
  0 siblings, 0 replies; 10+ messages in thread
From: Szabolcs Nagy @ 2015-12-21  1:28 UTC (permalink / raw)
  To: musl, Ed Schouten

* Szabolcs Nagy <nsz@port70.net> [2015-12-20 22:43:19 +0100]:
> complete tsearch api implementation code size on
> x86_64 (pic and non-pic as well):
> 
> tsearch_avl.o   958
> tsearch_fast.o  934
> tsearch_small.o 804
> 

more .text size data on various arches (different versions of
gcc were used i had around, -Os -fomit-frame-pointer -std=c99):

       x86_64  i386   arm  mips powerpc aarch64  sh
_avl.o    958   879  1080  1632   1352    1144  840
_fast.o   934   876  1072  1520   1248    1224  800
_small.o  804   815   896  1312   1124     968  728


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-12-21  1:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CABh_MKm1Ow5_dc4ENJM-fAqWc+uPPZgftRdSDxPJiPXZAfZ+FQ@mail.gmail.com>
2015-12-07 13:03 ` AVL tree: storing balances instead of heights Szabolcs Nagy
2015-12-07 13:19   ` Ed Schouten
2015-12-07 13:22     ` Ed Schouten
2015-12-07 14:46     ` Szabolcs Nagy
2015-12-10  9:21       ` Ed Schouten
2015-12-10 12:14         ` Szabolcs Nagy
2015-12-10 13:12           ` Ed Schouten
2015-12-10 13:43             ` Szabolcs Nagy
2015-12-20 21:43               ` Szabolcs Nagy
2015-12-21  1:28                 ` Szabolcs Nagy

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).