From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id 8D489BC57 for ; Fri, 14 May 2010 17:13:37 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ao0EAFsF7UvZSMDjdWdsb2JhbACdfRUBFyAFH70dhRAE X-IronPort-AV: E=Sophos;i="4.53,229,1272837600"; d="scan'208";a="59361829" Received: from fmmailgate02.web.de ([217.72.192.227]) by mail1-smtp-roc.national.inria.fr with ESMTP; 14 May 2010 17:13:37 +0200 Received: from smtp05.web.de ( [172.20.4.166]) by fmmailgate02.web.de (Postfix) with ESMTP id C3C5A160ABE34; Fri, 14 May 2010 17:13:36 +0200 (CEST) Received: from [78.43.204.177] (helo=frosties.localdomain) by smtp05.web.de with asmtp (TLSv1:AES256-SHA:256) (WEB.DE 4.110 #4) id 1OCwa0-00037m-00; Fri, 14 May 2010 17:13:36 +0200 Received: from mrvn by frosties.localdomain with local (Exim 4.71) (envelope-from ) id 1OCwa5-0003C0-11; Fri, 14 May 2010 17:13:41 +0200 From: Goswin von Brederlow To: Yoriyuki Yamagata Cc: Caml List Subject: Re: [Caml-list] Balancing algorithm of Set/Map implementation References: Date: Fri, 14 May 2010 17:13:40 +0200 In-Reply-To: (Yoriyuki Yamagata's message of "Fri, 14 May 2010 15:17:23 +0900") Message-ID: <87k4r6h5y3.fsf@frosties.localdomain> User-Agent: Gnus/5.110009 (No Gnus v0.9) XEmacs/21.4.22 (linux, no MULE) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: goswin-v-b@web.de X-Sender: goswin-v-b@web.de X-Provags-ID: V01U2FsdGVkX1+ZgrJb7+pR+tZfZMS0T/DRhiluqEy8uZrBM13J UVS7TmYROgOCAP5SNjPZK5VZnc2sPfCa9IEcBNxMBCPUmQCwPF 0ShTSQKng= X-Spam: no; 0.00; yoriyuki:01 yamagata:01 yoriyuki:01 node:01 node:01 unbalanced:01 ocaml:01 unbalanced:01 height:98 height:98 mfg:98 avoids:01 caml-list:01 writes:01 algorithm:01 Yoriyuki Yamagata writes: > Hi, list. > > When I read the balancing function of stdlib's Set/Map several years ago, I > thought I have understand how it works. But now, I read it again and I'm less > confident now. Could someone answer my questions? Here is the snippet of the > code. > > let bal l v r = > let hl = match l with Empty -> 0 | Node(_,_,_,h) -> h in > let hr = match r with Empty -> 0 | Node(_,_,_,h) -> h in > if hl > hr + 2 then begin > match l with > > Empty -> invalid_arg "Set.bal" > | Node(ll, lv, lr, _) -> > if height ll >= height lr then > create ll lv (create lr v r) > else begin > match lr with > > Empty -> invalid_arg "Set.bal" > | Node(lrl, lrv, lrr, _)-> > create (create ll lv lrl) lrv (create lrr v r) > end > end else if hr > hl + 2 then begin > > match r with > Empty -> invalid_arg "Set.bal" > | Node(rl, rv, rr, _) -> > if height rr >= height rl then > create (create l v rl) rv rr > else begin > > match rl with > Empty -> invalid_arg "Set.bal" > | Node(rll, rlv, rlr, _) -> > create (create l v rll) rlv (create rlr rv rr) > end > > end else > Node(l, v, r, (if hl >= hr then hl + 1 else hr + 1)) > > I have two question. > > | Node(ll, lv, lr, _) -> > if height ll >= height lr then > create ll lv (create lr v r) > > else begin > > Is this code right? If r is Empty and lr and ll are huge trees, > doesn't it create a massively unbalanced tree? If r is empty then lr and ll can not be huge. Otherwise the tree was massively unbalance beforehand. The balancing prevents this from hapening. > Another question is that why OCaml implementation allows > a balancing factor up to *2*, which is usually allowed only up to 1? Probably avoids having to do 2 balancings in a single operation. Or weighs the number of balancing done on average against a slightly unbalanced tree, i.e. turns out to be faster to be more unbalanced in practice. > Maybe my question is naive one, but I would appreciate if your could comment it. > > Regards, MfG Goswin