caml-list - the Caml user's mailing list
* Balancing algorithm of Set/Map implementation
@ 2010-05-14  6:17 Yoriyuki Yamagata
       [not found] ` <AANLkTikNr91FmeqOAXT-MOS0yDgH652WMvsg0a-WopKN@mail.gmail.com>
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Yoriyuki Yamagata @ 2010-05-14  6:17 UTC (permalink / raw)
  To: Caml List

Hi, list.

When I read the balancing function of the stdlib's Set/Map several years ago, I
thought I understood how it works.  Reading it again now, I am less
confident.  Could someone answer my questions?  Here is the relevant snippet
of the code.

 let bal l v r =
      let hl = match l with Empty -> 0 | Node(_,_,_,h) -> h in
      let hr = match r with Empty -> 0 | Node(_,_,_,h) -> h in
      if hl > hr + 2 then begin
        match l with
          Empty -> invalid_arg "Set.bal"
        | Node(ll, lv, lr, _) ->
            if height ll >= height lr then
              create ll lv (create lr v r)
            else begin
              match lr with
                Empty -> invalid_arg "Set.bal"
              | Node(lrl, lrv, lrr, _)->
                  create (create ll lv lrl) lrv (create lrr v r)
            end
      end else if hr > hl + 2 then begin
        match r with
          Empty -> invalid_arg "Set.bal"
        | Node(rl, rv, rr, _) ->
            if height rr >= height rl then
              create (create l v rl) rv rr
            else begin
              match rl with
                Empty -> invalid_arg "Set.bal"
              | Node(rll, rlv, rlr, _) ->
                  create (create l v rll) rlv (create rlr rv rr)
            end
      end else
        Node(l, v, r, (if hl >= hr then hl + 1 else hr + 1))

I have two questions.

        | Node(ll, lv, lr, _) ->
            if height ll >= height lr then
              create ll lv (create lr v r)
            else begin

Is this code right?  If r is Empty and lr and ll are huge trees,
doesn't it create a massively unbalanced tree?

My other question: why does the OCaml implementation allow
a balance factor of up to *2*, when usually only up to 1 is allowed?

Maybe my questions are naive, but I would appreciate it if you could comment on them.

Regards,

-- 
Yoriyuki Yamagata
yoriyuki.y@gmail.com

* Fwd: [Caml-list] Balancing algorithm of Set/Map implementation
       [not found] ` <AANLkTikNr91FmeqOAXT-MOS0yDgH652WMvsg0a-WopKN@mail.gmail.com>
@ 2010-05-14  8:09   ` Julien Signoles
  0 siblings, 0 replies; 5+ messages in thread
From: Julien Signoles @ 2010-05-14  8:09 UTC (permalink / raw)
  To: caml-list

Sorry, I forgot to Cc the caml list.

---------- Forwarded message ----------
From: Julien Signoles <julien.signoles@gmail.com>
Date: 2010/5/14
Subject: Re: [Caml-list] Balancing algorithm of Set/Map implementation
To: Yoriyuki Yamagata <yoriyuki.y@gmail.com>


Hello,

2010/5/14 Yoriyuki Yamagata <yoriyuki.y@gmail.com>

> When I read the balancing function of stdlib's Set/Map several years ago, I
> thought I have understand how it works.  But now, I read it again and I'm
> less confident now.  Could someone answer my questions?
>
> Is this code right?  If r is Empty and lr and ll are huge trees,
> doesn't it create a massively unbalanced tree?
>
> Another question is that why OCaml implementation allows
> a balancing factor up to *2*, which is usually allowed only up to 1?
>
> Maybe my question is naive one, but I would appreciate if your could comment it.
>
Some years ago, Jean-Christophe Filliâtre and Pierre Letouzey formally
proved within the Coq proof assistant (http://coq.inria.fr) that this
algorithm is correct. The explanations in their paper [1] should answer
both of your questions.

[1] Jean-Christophe Filliâtre and Pierre Letouzey. Functors for Proofs and
Programs. In *Proceedings of The European Symposium on Programming*, volume
2986 of *Lecture Notes in Computer Science*, pages 370-384, Barcelona,
Spain, April 2004.

Hope this helps,
Julien Signoles

* Re: [Caml-list] Balancing algorithm of Set/Map implementation
  2010-05-14  6:17 Balancing algorithm of Set/Map implementation Yoriyuki Yamagata
       [not found] ` <AANLkTikNr91FmeqOAXT-MOS0yDgH652WMvsg0a-WopKN@mail.gmail.com>
@ 2010-05-14 13:02 ` Daniel Bünzli
  2010-05-14 15:13 ` Goswin von Brederlow
  2010-05-14 18:48 ` "Stanisław T. Findeisen"
  3 siblings, 0 replies; 5+ messages in thread
From: Daniel Bünzli @ 2010-05-14 13:02 UTC (permalink / raw)
  To: Yoriyuki Yamagata; +Cc: Caml List

You may also be interested in this paper:

http://groups.csail.mit.edu/mac/users/adams/BB/

Best,

Daniel


* Re: [Caml-list] Balancing algorithm of Set/Map implementation
  2010-05-14  6:17 Balancing algorithm of Set/Map implementation Yoriyuki Yamagata
       [not found] ` <AANLkTikNr91FmeqOAXT-MOS0yDgH652WMvsg0a-WopKN@mail.gmail.com>
  2010-05-14 13:02 ` Daniel Bünzli
@ 2010-05-14 15:13 ` Goswin von Brederlow
  2010-05-14 18:48 ` "Stanisław T. Findeisen"
  3 siblings, 0 replies; 5+ messages in thread
From: Goswin von Brederlow @ 2010-05-14 15:13 UTC (permalink / raw)
  To: Yoriyuki Yamagata; +Cc: Caml List

Yoriyuki Yamagata <yoriyuki.y@gmail.com> writes:

> Hi, list.
>
> When I read the balancing function of stdlib's Set/Map several years ago, I
> thought I have understand how it works.  But now, I read it again and I'm less
> confident now.  Could someone answer my questions?  Here is the snippet of the
> code.
>
>  let bal l v r =
>       let hl = match l with Empty -> 0 | Node(_,_,_,h) -> h in
>       let hr = match r with Empty -> 0 | Node(_,_,_,h) -> h in
>       if hl > hr + 2 then begin
>         match l with
>           Empty -> invalid_arg "Set.bal"
>         | Node(ll, lv, lr, _) ->
>             if height ll >= height lr then
>               create ll lv (create lr v r)
>             else begin
>               match lr with
>                 Empty -> invalid_arg "Set.bal"
>               | Node(lrl, lrv, lrr, _)->
>                   create (create ll lv lrl) lrv (create lrr v r)
>             end
>       end else if hr > hl + 2 then begin
>         match r with
>           Empty -> invalid_arg "Set.bal"
>         | Node(rl, rv, rr, _) ->
>             if height rr >= height rl then
>               create (create l v rl) rv rr
>             else begin
>               match rl with
>                 Empty -> invalid_arg "Set.bal"
>               | Node(rll, rlv, rlr, _) ->
>                   create (create l v rll) rlv (create rlr rv rr)
>             end
>       end else
>         Node(l, v, r, (if hl >= hr then hl + 1 else hr + 1))
>
> I have two question.
>
>         | Node(ll, lv, lr, _) ->
>             if height ll >= height lr then
>               create ll lv (create lr v r)
>             else begin
>
> Is this code right?  If r is Empty and lr and ll are huge trees,
> doesn't it create a massively unbalanced tree?

If r is Empty then lr and ll cannot be huge; otherwise the tree would
already have been massively unbalanced before the call. The balancing
prevents this from happening.
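
The point above can be checked on a small example. The following is only a
sketch: it copies the tree type and bal from the quoted snippet, specialised
to int elements, and adds two hypothetical helpers (balanced and full) that
are not part of the stdlib. As far as I recall, the comment on bal in set.ml
states the precondition explicitly: l and r are balanced and
| height l - height r | <= 3.

```ocaml
(* Standalone copy of the quoted code, specialised to int elements. *)
type t = Empty | Node of t * int * t * int

let height = function Empty -> 0 | Node (_, _, _, h) -> h

let create l v r =
  let hl = height l and hr = height r in
  Node (l, v, r, (if hl >= hr then hl + 1 else hr + 1))

let bal l v r =
  let hl = height l and hr = height r in
  if hl > hr + 2 then begin
    match l with
    | Empty -> invalid_arg "Set.bal"
    | Node (ll, lv, lr, _) ->
        if height ll >= height lr then create ll lv (create lr v r)
        else begin
          match lr with
          | Empty -> invalid_arg "Set.bal"
          | Node (lrl, lrv, lrr, _) ->
              create (create ll lv lrl) lrv (create lrr v r)
        end
  end else if hr > hl + 2 then begin
    match r with
    | Empty -> invalid_arg "Set.bal"
    | Node (rl, rv, rr, _) ->
        if height rr >= height rl then create (create l v rl) rv rr
        else begin
          match rl with
          | Empty -> invalid_arg "Set.bal"
          | Node (rll, rlv, rlr, _) ->
              create (create l v rll) rlv (create rlr rv rr)
        end
  end else create l v r (* same as the Node(...) in the original *)

(* Hypothetical helper: every node's subtrees differ in height by <= 2. *)
let rec balanced = function
  | Empty -> true
  | Node (l, _, r, _) ->
      abs (height l - height r) <= 2 && balanced l && balanced r

(* Hypothetical helper: perfectly balanced tree holding lo..hi. *)
let rec full lo hi =
  if lo > hi then Empty
  else
    let m = (lo + hi) / 2 in
    create (full lo (m - 1)) m (full (m + 1) hi)

(* Legal input: l has height 3, r is Empty, so the difference is 3. *)
let l3 = create (create (create Empty 1 Empty) 2 Empty) 3 Empty
let ok = bal l3 4 Empty

(* Illegal input: l has height 4 and r is Empty (difference 4).
   This is the "huge ll and lr" situation from the question. *)
let bad = bal (full 1 15) 16 Empty

let () =
  Printf.printf "ok balanced: %b, bad balanced: %b\n"
    (balanced ok) (balanced bad)
```

With a height difference of 3 the single rotation yields a balanced tree. The
unbalanced result only appears when bal is handed an input that the callers
are expected never to produce.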

> Another question is that why OCaml implementation allows
> a balancing factor up to *2*, which is usually allowed only up to 1?

It probably avoids having to do two rebalancings in a single operation. Or
it trades the average number of rebalancings against a slightly less
balanced tree, i.e. tolerating more imbalance turns out to be faster in
practice.
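
To put a rough number on "slightly unbalanced", one can compute the fewest
nodes a tree of a given height can hold when sibling heights may differ by at
most d. This is only an illustrative sketch; min_nodes is a hypothetical
helper, and it says nothing about actual running times.

```ocaml
(* min_nodes d h: fewest nodes in a tree of height h when the two
   subtrees of every node may differ in height by at most d.
   The sparsest tree has one subtree of height h - 1 and the other
   as short as allowed, i.e. h - 1 - d (or empty if that is <= 0). *)
let rec min_nodes d h =
  if h <= 0 then 0
  else 1 + min_nodes d (h - 1) + min_nodes d (h - 1 - d)

let () =
  List.iter
    (fun h ->
      Printf.printf "height %d: d=1 needs %d nodes, d=2 needs %d nodes\n"
        h (min_nodes 1 h) (min_nodes 2 h))
    [ 1; 2; 3; 4; 5 ]
```

For height 5 this gives 12 nodes with d = 1 (classical AVL) and 8 nodes with
d = 2, so for the same number of elements the looser bound permits somewhat
taller trees in exchange for fewer rotations.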

> Maybe my question is naive one, but I would appreciate if your could comment it.
>
> Regards,

MfG
        Goswin


* Re: [Caml-list] Balancing algorithm of Set/Map implementation
  2010-05-14  6:17 Balancing algorithm of Set/Map implementation Yoriyuki Yamagata
                   ` (2 preceding siblings ...)
  2010-05-14 15:13 ` Goswin von Brederlow
@ 2010-05-14 18:48 ` "Stanisław T. Findeisen"
  3 siblings, 0 replies; 5+ messages in thread
From: "Stanisław T. Findeisen" @ 2010-05-14 18:48 UTC (permalink / raw)
  To: Yoriyuki Yamagata; +Cc: Caml List

On 2010-05-14 08:17, Yoriyuki Yamagata wrote:
> When I read the balancing function of stdlib's Set/Map several years
> ago, I thought I have understand how it works.  But now, I read it again
> and I'm less confident now.  Could someone answer my questions?  Here is
> the snippet of the code.
> 
>  let bal l v r =
>       let hl = match l with Empty -> 0 | Node(_,_,_,h) -> h in
>       let hr = match r with Empty -> 0 | Node(_,_,_,h) -> h in
>       if hl > hr + 2 then begin
>         match l with
>           Empty -> invalid_arg "Set.bal"
>         | Node(ll, lv, lr, _) ->
>             if height ll >= height lr then
>               create ll lv (create lr v r)
>             else begin
>               match lr with
>                 Empty -> invalid_arg "Set.bal"
>               | Node(lrl, lrv, lrr, _)->
>                   create (create ll lv lrl) lrv (create lrr v r)
>             end
>       end else if hr > hl + 2 then begin
>         match r with
>           Empty -> invalid_arg "Set.bal"
>         | Node(rl, rv, rr, _) ->
>             if height rr >= height rl then
>               create (create l v rl) rv rr
>             else begin
>               match rl with
>                 Empty -> invalid_arg "Set.bal"
>               | Node(rll, rlv, rlr, _) ->
>                   create (create l v rll) rlv (create rlr rv rr)
>             end
>       end else
>         Node(l, v, r, (if hl >= hr then hl + 1 else hr + 1))
[...]
> Another question is that why OCaml implementation allows 
> a balancing factor up to *2*, which is usually allowed only up to 1?

I guess the balancing factor of -2/+2 can only occur temporarily
during insert/delete operations (and such). See for instance (set.ml):

let rec add x = function
    Empty -> Node(Empty, x, Empty, 1)
  | Node(l, v, r, _) as t ->
      let c = Ord.compare x v in
      if c = 0 then t else
      if c < 0 then bal (add x l) v r else bal l v (add x r)

let rec remove x = function
    Empty -> Empty
  | Node(l, v, r, _) ->
      let c = Ord.compare x v in
      if c = 0 then merge l r else
      if c < 0 then bal (remove x l) v r else bal l v (remove x r)

This is what bal is for: to restore the balance at the root of the tree.

I guess this: http://en.wikipedia.org/wiki/AVL_tree#Operations is
more or less correct. :-)
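
One caveat on the guess about -2/+2: the tests in the quoted bal are strict
(hl > hr + 2 and hr > hl + 2), so a height difference of exactly 2 is not
rebalanced and can remain in the finished tree. A small sketch to check this,
using a standalone copy of the quoted code specialised to int elements with
the polymorphic compare; skew and balanced are hypothetical helpers, not
stdlib functions:

```ocaml
type t = Empty | Node of t * int * t * int

let height = function Empty -> 0 | Node (_, _, _, h) -> h

let create l v r =
  let hl = height l and hr = height r in
  Node (l, v, r, (if hl >= hr then hl + 1 else hr + 1))

(* Same logic as the bal quoted in the thread. *)
let bal l v r =
  let hl = height l and hr = height r in
  if hl > hr + 2 then begin
    match l with
    | Empty -> invalid_arg "Set.bal"
    | Node (ll, lv, lr, _) ->
        if height ll >= height lr then create ll lv (create lr v r)
        else begin
          match lr with
          | Empty -> invalid_arg "Set.bal"
          | Node (lrl, lrv, lrr, _) ->
              create (create ll lv lrl) lrv (create lrr v r)
        end
  end else if hr > hl + 2 then begin
    match r with
    | Empty -> invalid_arg "Set.bal"
    | Node (rl, rv, rr, _) ->
        if height rr >= height rl then create (create l v rl) rv rr
        else begin
          match rl with
          | Empty -> invalid_arg "Set.bal"
          | Node (rll, rlv, rlr, _) ->
              create (create l v rll) rlv (create rlr rv rr)
        end
  end else create l v r

let rec add x = function
  | Empty -> Node (Empty, x, Empty, 1)
  | Node (l, v, r, _) as t ->
      let c = compare x v in
      if c = 0 then t
      else if c < 0 then bal (add x l) v r
      else bal l v (add x r)

(* Hypothetical helper: height of right subtree minus height of left. *)
let skew = function Empty -> 0 | Node (l, _, r, _) -> height r - height l

(* Hypothetical helper: every node's subtrees differ in height by <= 2. *)
let rec balanced = function
  | Empty -> true
  | Node (l, _, r, _) ->
      abs (height l - height r) <= 2 && balanced l && balanced r

(* Insert 1, 2, 3 in ascending order. *)
let t3 = add 3 (add 2 (add 1 Empty))

(* Insert 1..1000 in ascending order. *)
let big =
  List.fold_left (fun t x -> add x t) Empty (List.init 1000 (fun i -> i + 1))

let () =
  Printf.printf "t3: height %d, skew %d; big balanced: %b\n"
    (height t3) (skew t3) (balanced big)
```

After inserting 1, 2, 3 the result is a right-leaning chain of height 3 whose
root has a height difference of 2, and bal leaves it as it is. A classical
AVL tree would rotate at that point. Only a difference of 3 is transient.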

STF

http://eisenbits.homelinux.net/~stf/
OpenPGP: DFD9 0146 3794 9CF6 17EA  D63F DBF5 8AA8 3B31 FE8A

