From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 39DB1BC69 for ; Sun, 22 Oct 2006 02:03:27 +0200 (CEST) Received: from smtp3-g19.free.fr (smtp3-g19.free.fr [212.27.42.29]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k9M03QLR021987 for ; Sun, 22 Oct 2006 02:03:27 +0200 Received: from picsou.chatons.dyndns.org (mna75-4-82-225-77-14.fbx.proxad.net [82.225.77.14]) by smtp3-g19.free.fr (Postfix) with ESMTP id B84275908 for ; Sun, 22 Oct 2006 02:03:26 +0200 (CEST) Received: from ens.fr (localhost.localdomain [127.0.0.1]) by picsou.chatons.dyndns.org (Postfix) with ESMTP id C66CD17983 for ; Sun, 22 Oct 2006 02:03:24 +0200 (CEST) Message-ID: <453AB54C.50201@ens.fr> Date: Sun, 22 Oct 2006 02:03:24 +0200 From: David Monniaux User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114 X-Accept-Language: fr-fr, fr, en-us, en-gb, en, es-es, es MIME-Version: 1.0 To: caml-list Subject: tree balancing in Set and Map X-Enigmail-Version: 0.76.8.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 453AB54E.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; monniaux:01 monniaux:01 subtrees:01 caml's:01 nodes:01 height:98 height:98 functions:01 data:02 tree:02 tree:02 ens:02 maximal:03 maximal:03 generally:03 I wonder about something: the first balanced tree data structure described in algorithmics courses is generally the AVL tree, where the heights of the subtrees differ by at most 1. In Caml's implementation, they are allowed to differ by at most 2. A quick analysis of the functions h1(n) and h2(n) governing the maximal height of a tree with n nodes with respectively maximal height differences of 1 and 2 shows that asymptotically h2(n) is about 1.25 times h1(n). Why this choice then? Was it easier to implement? Is there some hidden effect like rotations not being needed as often? How about some bibliographic references? :-) (Yes, you may have guessed, I'm not an algorithmician.)