From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id OAA31532; Fri, 30 May 2003 14:00:57 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id OAA31565 for ; Fri, 30 May 2003 14:00:56 +0200 (MET DST) Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h4UC0tH00548 for ; Fri, 30 May 2003 14:00:55 +0200 (MET DST) Received: from ibm3.cicrp.jussieu.fr (ibm3.cicrp.jussieu.fr [134.157.15.3]) by shiva.jussieu.fr (8.12.9/jtpda-5.4) with ESMTP id h4UC0su2087914 ; Fri, 30 May 2003 14:00:54 +0200 (CEST) Received: from ibm1.cicrp.jussieu.fr (ibm1.cicrp.jussieu.fr [134.157.15.1]) by ibm3.cicrp.jussieu.fr (8.8.8/jtpda/mob-V8) with ESMTP id OAA27420 ; Fri, 30 May 2003 14:00:04 +0200 Received: from localhost (fernande@localhost) by ibm1.cicrp.jussieu.fr (8.8.8/jtpda/mob-v8) with ESMTP id OAA1343638 ; Fri, 30 May 2003 14:00:02 +0200 Date: Fri, 30 May 2003 14:00:00 +0200 (DST) From: Diego Olivier Fernandez Pons To: Yamagata Yoriyuki cc: caml-list@inria.fr Subject: Re : [Caml-list] Set/Map with intervals and/or order-related operations In-Reply-To: <20030529.224025.59459731.yoriyuki@mbg.sphere.ne.jp> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Antivirus: scanned by sophie at shiva.jussieu.fr X-Spam: no; 0.00; pons:01 etu:99 caml-list:01 schema:01 baire:01 encapsulate:01 failwith:01 fernandez:01 bindings:01 ocaml:01 int:01 rec:01 diet:98 constructors:01 trivial:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Bonjour, > Perhaps I will develop my own since I need balancing schema, and a > Map-like structure (not only a Set-like structure.) based on Diet. > I only need queries for individual elements, not intervals. So I > don't think a k-d tree is appropriate. Also I need the data > structure purely functional, because the cost of copying is not > negligible. So, even though several people points out the existence > of C implementations, I will opt an ocaml one. The dom type of > FaCiLe is essentially a list of intervals. Maybe I can use Baire > implementations. I am afraid I haven't yet understood your problem and I will need more explanations to be able to help you. There are several points in your message I would like to comment : - Tree-like data structure balancing - Diet (discrete interval encoding scheme) 1. Balancing tree-like data structure You should separate the design of the tree-like data structure and its balancing. First write down your unbalanced data structure and get your algorithms right, only then choose one and implement one of the multiple balancing schemes available The trick is very simple (Stephen Adams) : encapsulate the balancing information in << smart constructors >>. At the beginning this constructors will be trivial let makeTree = fun l v r -> N (l, v, r) and your functions will look like this let rec insert x = function | E -> single x | N (l, v, r) -> match compare x v with | n when n < 0 -> makeTree (insert x l) v r | n when n > 0 -> makeTree l v (insert x r) | _ -> failwith "already in the data structure" Once everything goes right, you will add a balancing constructor and replace your [makeTree] by the balancing constructor [makeBTree]. Sometimes you will find usefull not to recompute all the balancing information at every node, then you just need to add a cache. This is of course optional. ... | E -> injection x | N (l, v, r, _) -> (* the last int is a cache for the constructor *) ... Your balanced function will look like this let rec insert x = function | E -> single x | N (l, v, r, _) -> match compare x v with | n when n < 0 -> makeBTree (insert x l) v r | n when n > 0 -> makeBTree l v (insert x r) | _ -> failwith "already in the data structure" Conclusion : You do not need to bother with balancing schemes from the beginning. Once your unbalanced data structure works fine, you can add them very easely. Baire contains a few examples : - unbalancedSet.ml contains unbalanced trees and AVL-trees without cache (does not change the data type) (0) - avlSet.ml contains avl sets with cache (1) - weighBalancedSet.ml contains WB sets with cache (2) ... (1) and (2) are just a copy-paste of (0) with the described modifications and (1) and (2) share all but 40 lines of code (namely the 4 balancing constructors). 2. Discrete interval encoding tree The idea of diet is to replace successive integers by an interval {3} {5} {6} {7} {9} -> {3} [5-7] {9} Then, a Diet is just a tree of intervals and singletons with some join/separate instructions. Since I do not understand what you mean by 'map-like operations' I don't know if this is appropriate. {3, 4} {5, 2} {6, 2} {7, 3} {9, 4} -> {3, 4} [5-6, 2] {7, 3} {9, 4} This will work (I mean with a significant compression rate) only if you can ensure that consecutive bindings will have very often the same image. Or are you binding sigletons and intervals to values ? {3, 4} {[5-7], 2} {[6-10], 5} {9, 4} find 5 myDataStructure -> [(5, 2); (5, 5)] Are the intervals supposed to be disjoint ? {3, 4} {[5-7], 2} {[8-10], 3} find 5 myDataStructure -> (5, 2) Or are you binding keys to multiple values {3, {4}} {5, {2, 5}} {6, {2, 5}} {7, {2, 5}} {8, 5} {9, {4, 5}} {10, {5}} find 5 myDataStructure -> [(5, 2); (5, 5)] Multidimensional data structure (k-d trees, range trees, cartesian trees) may be what you are looking for. It depends highly on what you mean by "set/map with intervals" and "order related operations". Could you give a few examples of what you expect of such a data structure ? Diego Olivier ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners