From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id DACB5BE3C for ; Fri, 20 Jul 2007 10:23:37 +0200 (CEST) Received: from ptb-relay01.plus.net (ptb-relay01.plus.net [212.159.14.212]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l6K8Nbja002355 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2007 10:23:37 +0200 Received: from [80.229.56.224] (helo=beast.local) by ptb-relay01.plus.net with esmtp (Exim) id 1IBnlw-00043l-EL for caml-list@yquem.inria.fr; Fri, 20 Jul 2007 09:23:36 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Vec: a functional implementation of extensible arrays Date: Fri, 20 Jul 2007 09:14:18 +0100 User-Agent: KMail/1.9.7 References: <28fa90930707181032q7681340pc30fb47434aff5fc@mail.gmail.com> <28fa90930707190959g66cb2e6wc7a446af2a6dfb72@mail.gmail.com> <6f9f8f4a0707200035l4351c6b6g585b4edd2c51faa6@mail.gmail.com> In-Reply-To: <6f9f8f4a0707200035l4351c6b6g585b4edd2c51faa6@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200707200914.18557.jon@ffconsultancy.com> X-Miltered: at discorde with ID 46A07109.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; arrays:01 camlp:01 notation:01 node:01 node:01 unbalanced:01 unbalanced:01 subnodes:01 subnode:01 ocaml:01 ocaml:01 frog:98 wrote:01 syntactic:01 extensible:01 On Friday 20 July 2007 08:35:39 Loup Vaillant wrote: > > ;-). I am open to new ideas. In part, I wanted a simple data structure > > (easier to extend, among other things). Also, I use Set, Map, etc, quite > > often, and those are also balanced trees: I thought that if I can live > > with those, I can probably live with Vec as well. This is the beginnings of an awesome data structure! > So can I. Your current implementation is already very attractive, and > looks very usable. For the new idea, have you thought of making (or > specifying) syntactic sugar to use your array? Should be very easy using the new camlp4. You might like to add a slicing notation as well. :-) > About improving performance... I have two suggestions: 1. Add an extra node representing single elements that replaces Node(Empty, _, Empty). The reduces GC stress enormously and makes the whole thing ~30% faster. 2. Allow unbalanced sub trees. Balancing is slow and folds and maps don't need to rebalance, but "get" should force rebalancing. Extracting subarrays should return an unbalanced result. > Is there a way to obtain a efficient filter? Yes. I discovered a most-excellent way to do this. It requires arbitrary metadata in every node, a constructor that composes subnodes to create the metadata for the parent and a filter function that can cull branches from the search tree. I used this in my Mathematica implementation to provide asymptotically fast filtering based upon lazily evaluated sets of symbols in each subnode. This gave huge performance improvements with no significant performance overhead. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e