From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <David.Monniaux@ens.fr>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id 39DB1BC69
	for <caml-list@yquem.inria.fr>; Sun, 22 Oct 2006 02:03:27 +0200 (CEST)
Received: from smtp3-g19.free.fr (smtp3-g19.free.fr [212.27.42.29])
	by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k9M03QLR021987
	for <caml-list@yquem.inria.fr>; Sun, 22 Oct 2006 02:03:27 +0200
Received: from picsou.chatons.dyndns.org (mna75-4-82-225-77-14.fbx.proxad.net [82.225.77.14])
	by smtp3-g19.free.fr (Postfix) with ESMTP id B84275908
	for <caml-list@yquem.inria.fr>; Sun, 22 Oct 2006 02:03:26 +0200 (CEST)
Received: from ens.fr (localhost.localdomain [127.0.0.1])
	by picsou.chatons.dyndns.org (Postfix) with ESMTP id C66CD17983
	for <caml-list@yquem.inria.fr>; Sun, 22 Oct 2006 02:03:24 +0200 (CEST)
Message-ID: <453AB54C.50201@ens.fr>
Date: Sun, 22 Oct 2006 02:03:24 +0200
From: David Monniaux <David.Monniaux@ens.fr>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114
X-Accept-Language: fr-fr, fr, en-us, en-gb, en, es-es, es
MIME-Version: 1.0
To: caml-list <caml-list@yquem.inria.fr>
Subject: tree balancing in Set and Map
X-Enigmail-Version: 0.76.8.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Miltered: at concorde with ID 453AB54E.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Spam: no; 0.00; monniaux:01 monniaux:01 subtrees:01 caml's:01 nodes:01 height:98 height:98 functions:01 data:02 tree:02 tree:02 ens:02 maximal:03 maximal:03 generally:03 

I wonder about something: the first balanced tree data structure 
described in algorithmics courses is generally the AVL tree, where the 
heights of the subtrees differ by at most 1. In Caml's implementation, 
they are allowed to differ by at most 2.

A quick analysis of the functions h1(n) and h2(n) governing the maximal 
height of a tree with n nodes with respectively maximal height 
differences of 1 and 2 shows that asymptotically h2(n) is about 1.25 
times h1(n).

Why this choice then? Was it easier to implement? Is there some hidden 
effect like rotations not being needed as often? How about some 
bibliographic references? :-)

(Yes, you may have guessed, I'm not an algorithmician.)