From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id 68626BC69 for ; Sun, 10 Jun 2007 14:17:09 +0200 (CEST) Received: from ptb-relay02.plus.net (ptb-relay02.plus.net [212.159.14.213]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l5ACH8RB030296 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO) for ; Sun, 10 Jun 2007 14:17:09 +0200 Received: from [80.229.56.224] (helo=beast.local) by ptb-relay02.plus.net with esmtp (Exim) id 1HxMLy-00028o-7L for caml-list@yquem.inria.fr; Sun, 10 Jun 2007 13:17:06 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Comparison of OCaml and MLton for numerics Date: Sun, 10 Jun 2007 13:10:51 +0100 User-Agent: KMail/1.9.7 References: <5195a210705302250u6a9e5adey4ed857480f9e5cd8@mail.gmail.com> <200706011813.48515.jon@ffconsultancy.com> <46641BC1.9090305@cs.umd.edu> In-Reply-To: <46641BC1.9090305@cs.umd.edu> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200706101310.52165.jon@ffconsultancy.com> X-Miltered: at discorde with ID 466BEBC4.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; ocaml:01 speedup:01 ocaml's:01 inserting:01 node:01 higher-order:01 val:01 rewriting:01 ocaml:01 frog:98 equality:01 equality:01 wrote:01 rec:01 rec:01 On Monday 04 June 2007 15:03:45 Mike Furr wrote: > I have already started using your suggestion of > including a 1-element constructor for my trees as it does indeed seem to > give a noticeable speedup. I have another suggestion for you: OCaml's exceptions are very fast and can be used to accelerate many no-op paths. For example, when inserting an element into a set that already contains that element, the current Set implementation will reallocate every node on the path to the element even though the resulting set is identical. You can improve performance enormously for the no-op case, avoiding this allocation, by raising an exception as soon as you realise the element is already present and catching it on the outside of the "insert" function to return the input set as the output. A related optimization is to use physical equality to avoid allocation (copying) when the output will be equivalent to the input. Insertion sort is a good example of this. A naive insertion sort may be written: let rec insertion = function | [] -> [] | h1::t -> match insertion t with | h2::t when h1>h2 -> h2::insertion(h1::t) | t -> h1::t but this copies already-sorted tail lists unnecessarily. It can be optimized by returning the original "list" when possible: let rec insertion = function | [] -> [] | h1::t as list -> match insertion t with | h2::t when h1>h2 -> h2::insertion(h1::t) | t' -> if t==t' then list else h1::t' More generally, applications such as term rewriters can benefit from specialized higher-order functions that incorporate this physical equality based optimization. My term rewriter used an "id_map" function: val id_map : ('a -> 'a) -> 'a array -> 'a array which returns the input when possible. This can avoid a lot of unnecessary allocation during rewriting and doubled the performance of the whole program! My implementation of id_map was as follows: let id_map f a = if a = [||] then a else let b = ref a in try for i = 0 to length a - 1 do let e = f a.(i) in if e != a.(i) then begin b := Array.copy a; (!b).(i) <- e; raise (Start (i+1)) end done; a with Start start -> let b = !b in for i = start to length a - 1 do let e = f a.(i) in if e != a.(i) then b.(i) <- e; done; b Another useful function along similar lines combines a map and a fold_left into one operation: let id_map_fold_left f x a = if a = [||] then x, a else let r = ref x in let b = ref a in try for i = 0 to length a - 1 do let r', e = f !r a.(i) in r := r'; if e != a.(i) then begin b := Array.copy a; (!b).(i) <- e; raise (Start (i+1)) end done; !r, a with Start start -> let b = !b in for i = start to length a - 1 do let r', e = f !r a.(i) in r := r'; if e != a.(i) then b.(i) <- e; done; !r, b I used the "map" to rewrite one expression into another and the "fold_left" to accumulate the state of the interpreter. I'm sure you can think of many combinations along similar lines. I think a new standard library would do very well to work such optimizations into the existing framework (e.g. the Set module) and providing some useful additional functions would be nice too. I'll write articles about these sorts of things for the OCaml Journal. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e