From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id BBFE6BC69 for ; Mon, 30 Jul 2007 15:56:26 +0200 (CEST) Received: from pih-relay04.plus.net (pih-relay04.plus.net [212.159.14.131]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id l6UDuQ5A014978 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO) for ; Mon, 30 Jul 2007 15:56:26 +0200 Received: from [80.229.56.224] (helo=beast.local) by pih-relay04.plus.net with esmtp (Exim) id 1IFVjV-0005TT-Ck for caml-list@inria.fr; Mon, 30 Jul 2007 14:56:25 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@inria.fr Subject: Re: [Caml-list] Ropes and rope-like functional extensible vectors with O(1) prepend/append. Date: Mon, 30 Jul 2007 14:47:23 +0100 User-Agent: KMail/1.9.7 References: <20070728233305.GB28879@tux-chan> <200707300146.20940.jon@ffconsultancy.com> <20070730115129.GD28879@tux-chan> In-Reply-To: <20070730115129.GD28879@tux-chan> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200707301447.24010.jon@ffconsultancy.com> X-Miltered: at concorde with ID 46ADEE0A.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; vectors:01 prepend:01 mutation:01 node:01 nodes:01 bounded:01 functor:01 functor:01 reimplement:01 inserting:01 hofs:01 recursive:01 iter:01 iteri:01 allocations:01 On Monday 30 July 2007 12:51:29 Mauricio Fernandez wrote: > The structure is persistent and all operations are non-destructive. > Mutation is used only in the rebalancing operation and in set, but they > affect an ephemeral forest of ropes/vects and a new string/array > respectively, so the original structure is never modified and in principle > all operations should be thread-safe. > > Ropes/vects being functional doesn't mean that they will perform well in a > persistent setting however, see the clarification at > > http://eigenclass.org/repos/oropes/head/doc/Vect.html Ok. > > I have some suggestions: > > > > I'd like metadata in every node, so I can provide a constructor that > > combines the metadata of child nodes and a cull function to accelerate > > searches. > > If I understand it correctly, that scheme could in the limit turn some O(n) > searches into O(log n)?), right? Something like that, yes. > Unlike Vec, Vect uses "compact" leaves (Leaf of 'a array) of bounded size > (leaf_size, typically 16-64), which might not fit very well. I can think of two solutions: 1. Linear search. 2. Store an array of metadata corresponding to a binary search of the array of elements in a leaf. The latter sounds sexy but the former would probably suffice. :-) > Vect would need to know how to combine the metadata, wouldn't it? Users would have to provide a function to do that, yes. > I was > thinking about something like > Leaf of ('meta -> 'meta -> 'meta) * 'meta * 'a array > but I've realized that this wouldn't suffice. So, given that Vect.t is > currently > > type 'a t = > Empty > > | Concat of 'a t * int * 'a t * int * int > | Leaf of 'a array > > something like this maybe? > > type ('a, 'meta) t = > Empty of ('meta -> 'meta -> 'meta) > > | Concat of ('meta -> 'meta -> 'meta) * 'meta * > > ('a, 'meta) t * int * > ('a, 'meta) t * int * > int > > | Leaf of ('meta -> 'meta -> meta) * 'meta array * 'a array > > (* maybe also * 'meta to cache the last computation? *) > > or even without the ('meta -> 'meta -> 'meta) part, forcing the user to > pass the function on each modification? Just thinking out loud. I would use a functor to pass it the "combine" and "cull" functions, rather than putting them in the tree itself. This is analogous to the Set.Make functor accepting a comparison function. > At any rate, it'd be better to provide it as a separate structure, any > suggestions for the name?. You could just call it Tree and try to make it as generic as possible. You could then reimplement the Set module on top of your data structure by searching for the index of the given element and inserting it if it is new. Hmm, maybe a function that combines a search with an update (to avoid two traversals) would be a good idea... > > The usual HOFs, like map. > > I just pushed a patch with filter and map. The former is trivially > implemented with fold + append (thanks to the O(1) append). I was going to > code map the same way but I ended up making one that returns an isomorphic > vect and is faster (since there's no need to rebalance). Yes. You could also use recursive subdivision to create a perfectly balanced result. > So Vect currently has iter, iteri, rangeiter, fold, map and filter. I'm > considering renaming fold to fold_left and providing fold_right too. I'd definitely provide both folds, yes. I'd also like specialized rewrite functions that avoid allocations when the outputs are the same as the inputs. So I'd make the set function bail via an exception when the element is left unchanged, returning the original Vect and doing no allocation in this case. I'd also like an id_map function that did a map but reused old branches whenever they were left unchanged. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e