From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA20423; Fri, 11 Apr 2003 13:18:01 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA20526 for ; Fri, 11 Apr 2003 13:17:59 +0200 (MET DST) Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h3BBHxX05393 for ; Fri, 11 Apr 2003 13:17:59 +0200 (MET DST) Received: from ibm3.cicrp.jussieu.fr (ibm3.cicrp.jussieu.fr [134.157.15.3]) by shiva.jussieu.fr (8.12.9/jtpda-5.4) with ESMTP id h3BBHwAv087036 for ; Fri, 11 Apr 2003 13:17:58 +0200 (CEST) Received: from ibm1.cicrp.jussieu.fr (ibm1.cicrp.jussieu.fr [134.157.15.1]) by ibm3.cicrp.jussieu.fr (8.8.8/jtpda/mob-V8) with ESMTP id NAA31104 for ; Fri, 11 Apr 2003 13:17:03 +0200 Received: from localhost (fernande@localhost) by ibm1.cicrp.jussieu.fr (8.8.8/jtpda/mob-v8) with ESMTP id NAA1630374 for ; Fri, 11 Apr 2003 13:17:02 +0200 Date: Fri, 11 Apr 2003 13:17:02 +0200 (DST) From: Diego Olivier Fernandez Pons To: caml-list@inria.fr Subject: [Caml-list] Using zippers to handle huge trees Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE X-Antivirus: scanned by sophie at shiva.jussieu.fr X-Spam: no; 0.00; pons:01 cicrp:01 shouxun:01 dataset:01 passing:01 allocates:01 deconstruct:01 bounded:01 developped:01 huet:01 sept:99 n-ary:01 pointers:01 baire:01 ralf:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Bonjour, On Wednesday, 09 Apr 2003, Yang Shouxun wrote: > I don't know how to write a tail recursive version to build trees. > If there are not that many continuous attributes and the dataset is > not so large, the tree stops growing before stack overflow. On Wednesday 09 April 2003, Markus Mottl wrote: > The trick is to use continuation passing style (CPS): you pass a > function closure (continuation) containing everything that's needed > in subsequent computations. Instead of returning a result, the > sub-function calls the continuation with the result, which makes the > functions tail-recursive. Zippers are a simple way to handle huge (in fact infinite) trees. 1. Explanation of zippers 2. Related work 3. Examples of code 1. Explanation of zippers Zippers may be seen as 'functional pointers' since they offer : - purely functional and typed operations - O(1) acces to the pointed element - O(1) pointer movements The restrictions are that only one pointer is allowed by data structure and every pointer movement allocates memory. Take a classical type declaration for binary search trees type 'a tree =3D E | N of 'a tree * 'a * 'a tree * int Consider a binary search tree and an inner node to which you want to point. To have a 0(1) acces to the pointed subtree, it has to be directly available from the base of the data structure. Then, the rest of the tree must be kept in a separate place. We will deconstruct it along the path from the root to the pointed node type 'a path =3D | Root | Left of 'a * 'a tree * 'a path | Right of 'a tree * 'a * 'a path type 'a zipper =3D 'a tree * 'a path The zipper contrains as annouced : - the pointed subtree - the rest of the tree breaked along the path to the root Then we define the pointer movements (one for each pointer in the data structure) : exception Bottom (* To be replaced by a balancing constructor *) let makeDTree =3D fun l v r -> N (l, v, r, 0) let move_left =3D fun (tree, path) -> match tree with | E -> raise Bottom | N (l, v, r, _) -> (l, Left (v, r, path)) let move_right =3D fun (tree, path) -> match tree with | E -> raise Bottom | N (l, v, r, _) -> (r, Right (l, v, path)) let move_up =3D fun (tree, path) -> match path with | Root -> raise Top | Left (v, r, tail) -> (makeDTree tree v r, tail) | Right (l, v, tail) -> (makeDTree l v tree, tail) Now we can build an arbitrary large tree by the following procedure : - build a tree of bounded depth - choose the node which will be developped next - move the current pointer to that node - continue building the tree This procedure uses a bounded stack space 2. Related work Zippers were invented by G=E9rard Huet. There is a paper on the JFP G. Huet. The Zipper. J. Functional Programming, 7 (5), Sept 1997, pp. 549--= 554. He uses n-ary trees and binary trees in his examples. The main difference is that in binary trees the pointers are not organized in the same way (his 'left' operation is what in Baire is (left o up)) Ralf Hinze has tried to give a general framework for functional pointers named a web (you give your data structure and the basic transformations and the data structure does the rest) Ralf Hinze and Johan Jeuring. Functional Pearl: Weaving a Web. Journal of Functional Programming, 11(6):681-689, November 2001. Available on the net via Hinze's home page. In my opinion his article is not really convincing and not very clear. Several libraries already use zippers - Zen (G=E9rard Huet, Caml) uses zippers to handle acyclic automata minimization - SML/NJ Standard library (John Reppy) uses zippers to handle deletion in red-black trees - MetaPRL (Caml) uses zippers to handle insertion and deletion in splay and red-black trees - Grammatical Framework (Aarne Ranta, Haskell) uses zippers to navigate through n-ary trees. All this code is available on the web. 3. Examples of code Here are some examples of usual binary search trees operations made whith zippers (insert, delete, move_pointer_to, ...) extracted from Baire (current version 11 avril 2003, plenty of bugs and breaked code, you will find it in Baire's download pages) : let rec move_to_top =3D function ((tree, path) as pointer) -> match path with | Root -> pointer | Left (v, r, tail) -> move_to_top (makeDTree tree v r, tail) | Right (l, v, tail) -> move_to_top (makeDTree l v tree, tail) let rec move_to x =3D function ((tree, path) as pointer) -> match tree with | E -> =09(match path with =09 | Right (_, rv, _) when x <=3D rv -> move_to x (move_up pointer) =09 | Left (lv, _, _) when x >=3D lv -> move_to x (move_up pointer) =09 | _ -> pointer =09) | N (_, v, _, _) -> =09match compare x v with =09 | n when n < 0 -> =09 (match path with =09=09 | Right (_, rv, _) when x < rv -> move_to x (move_up pointer) =09=09 | Right _ | Root | Left _ -> move_to x (move_left pointer) =09 ) =09 | n when n > 0 -> =09 (match path with =09=09 | Left (lv, _, _) when x > lv -> move_to x (move_up pointer) =09=09 | Left _ | Root | Right _ -> move_to x (move_right pointer) =09 ) =09 | _ -> pointer let rec member_path x =3D function | Right (l, v, tail) -> (match compare x v with =09 | n when n < 0 -> member x l =09 | 0 -> true =09 | _ -> member_path x tail ) | Left (v, r, tail) -> (match compare x v with =09 | n when n > 0 -> member x r =09 | 0 -> true =09 | _ -> member_path x tail ) | Root -> false let rec zipper_member x =3D function (tree, path) -> match tree with | E -> member_path x path | N (l, v, r, _) -> =09match compare x v with =09 | n when n < 0 -> =09 (match path with =09=09 | Right (_, rv, _) when x <=3D rv -> member_path x path =09=09 | Right _ | Root | Left _ -> member x l =09 ) =09 | n when n > 0 -> =09 (match path with =09=09 | Left (lv, _, _) when x >=3D lv -> member_path x path =09=09 | Left _ | Root | Right _ -> member x r =09 ) =09 | _ -> true let current_tree =3D function (tree, _) -> tree let current_value =3D function (tree, _) -> match tree with | E -> None | N (_, v, _, _) -> Some v let current_value' =3D function (tree, _) -> match tree with | E -> raise Empty | N (_, v, _, _) -> v let rec zipper_insert x =3D function ((tree, path) as pointer) -> match tree with | E -> =09(match path with =09 | Right (_, rv, _) when x <=3D rv -> zipper_insert x (move_up pointer) =09 | Left (lv, _, _) when x >=3D lv -> zipper_insert x (move_up pointer) =09 | _ -> (makeTree E x E, path) =09) | N (_, v, _, _) -> =09match compare x v with =09 | n when n < 0 -> =09 (match path with =09=09 | Right (_, rv, _) when x < rv -> =09=09 zipper_insert x (move_up pointer) =09=09 | Right _ | Root | Left _ -> =09=09 zipper_insert x (move_left pointer) =09 ) =09 | n when n > 0 -> =09 (match path with =09=09 | Left (lv, _, _) when x > lv -> =09=09 zipper_insert x (move_up pointer) =09=09 | Left _ | Root | Right _ -> =09=09 zipper_insert x (move_right pointer) =09 ) =09 | _ -> pointer let rec zipper_delete x =3D function ((tree, path) as pointer) -> match tree with | E -> =09(match path with =09 | Right (_, rv, _) when x <=3D rv -> zipper_delete x (move_up pointer) =09 | Left (lv, _, _) when x >=3D lv -> zipper_delete x (move_up pointer) =09 | _ -> pointer =09) | N (l, v, r, _) -> =09match compare x v with =09 | n when n < 0 -> =09 (match path with =09=09 | Right (_, rv, _) when x < rv -> =09=09 zipper_delete x (move_up pointer) =09=09 | Right _ | Root | Left _ -> =09=09 zipper_delete x (move_left pointer) =09 ) =09 | n when n > 0 -> =09 (match path with =09=09 | Left (lv, _, _) when x > lv -> =09=09 zipper_delete x (move_up pointer) =09=09 | Left _ | Root | Right _ -> =09=09 zipper_delete x (move_right pointer) =09 ) =09 | _ -> move_to x (appendB l r, path) let rec path_to_list result =3D function | Root -> result | Left (v, r, path) -> path_to_list (result @ v :: to_ordered_list r) path | Right (l, v, path) -> path_to_list (to_ordered_list_rec (v :: result) l) path let zipper_to_list =3D function (tree, path) -> path_to_list (to_list tree) path Diego Olivier ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners