From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id 675FABC69 for ; Fri, 1 Jun 2007 19:19:15 +0200 (CEST) Received: from pih-relay06.plus.net (pih-relay06.plus.net [212.159.14.133]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l51HJEnS011291 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO) for ; Fri, 1 Jun 2007 19:19:15 +0200 Received: from [80.229.56.224] (helo=beast.local) by pih-relay06.plus.net with esmtp (Exim) id 1HuAmQ-0000ir-AN for caml-list@yquem.inria.fr; Fri, 01 Jun 2007 18:19:14 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Comparison of OCaml and MLton for numerics Date: Fri, 1 Jun 2007 18:13:48 +0100 User-Agent: KMail/1.9.7 References: <5195a210705302250u6a9e5adey4ed857480f9e5cd8@mail.gmail.com> <3d13dcfc0706010449k53f1c364gfd4db47c7c258725@mail.gmail.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200706011813.48515.jon@ffconsultancy.com> X-Miltered: at discorde with ID 46605512.002 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; ocaml:01 markus:01 mottl:01 hash:01 trivial:01 functors:01 higher-level:01 ocaml:01 inlining:01 compiler:01 stdlib:01 nodes:01 stdlib:01 hash:01 functorize:01 On Friday 01 June 2007 17:14:36 Markus Mottl wrote: > Absolutely! E.g. we had to specialize hash tables for integer and > string keys, because the generic implementation calls a function for > each key comparison rather than generating specialized code for e.g. > integer comparisons. This has a noticable impact in production > systems. Sets are another example. In that case, function call in trivial comparison is a side effect of using functors. > ... > I'd surely be happy to see the addition of some (optional) > higher-level code transformations to OCaml. Not just inlining, maybe > some partial evaluation of the resulting code, which could also reduce > code size if the compiler can prove that certain branches will not be > taken. The stdlib has a lot of scope for optimization. The Set implementation currently does not specialize 1-element nodes. Doing this improves performance by ~30%, partly by relieving the GC and partly by avoiding branches in common cases. I believe some compilers automate this optimization (trimming the bottom layer from trees). Non tail-recursive functions in the stdlib is another example. You can easily get quadratic instead of linear behaviour without realising. Hash tables can have big hidden performance costs, especially for soft real-time work. Actually, while we're here. I've long thought that the stdlib should provide abstract implementations of concrete data structures like RB trees and AVL trees, and functorize the Set and Map modules over the tree type they use. This would let people add new abstract data structures (I like purely functional sequences based on AVL trees) built upon solid concrete data structures from the stdlib rather than cutting and pasting code (one of the more embarassing OCaml FAQs). Making this feasible by optimizing away the abstractions requires more than just defunctorizing though. You need to partially specialize by type, as Markus says. You also need to do whole-program transformations to flatten data structures. For example, a set would require: Node of 'a t * 'a * 'a t * height and a sequence would require: Node of 'a t * 'a * 'a t * height * size So a generic OCaml solution would add an indirection to the metadata that could be flattened out. Not being hardcore enough to tinker with the OCaml compiler itself, I'd write an OCaml program that generated OCaml implementations of data structures with the necessary specialization. Indeed, I've already done this for low-dimensional vectors and matrices, but doing it for trees would be more interesting. Just out of curiosity, how many of the optimizations being discussed can be done with camlp4? Anyway, we should try to build a coherent list of optimizations we'd like and then try to prioritize them. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e