From: Jon Harrop
Organization: Flying Frog Consultancy Ltd.
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Comparison of OCaml and MLton for numerics
Date: Thu, 31 May 2007 07:17:01 +0100
Message-Id: <200705310717.01553.jon@ffconsultancy.com>
In-Reply-To: <5195a210705302250u6a9e5adey4ed857480f9e5cd8@mail.gmail.com>

On Thursday 31 May 2007 06:50:05 Yuanchen Zhu wrote:
> The performance numbers were as following:
>
> Ocaml (unsafe) : user: 39.674s, real: 41.356s
> MLton (safe): user: 17.981s, real: 21.968s

You may be interested to know that there are no optimizing SML compilers for
AMD64, which is a much better platform for numerical work:

  http://www.ffconsultancy.com/languages/ray_tracer/results.html

OCaml is over 60% faster on this benchmark. Having said that, I notice that
twice as many people are downloading the x86 demos on my site compared to the
x64.

> let hconvolve kern (img:t) r =
>   let sup = Array.length kern - 1 in
>   let img' = create (size img) in
>   for y = 0 to height img - 1 do
>     for x = 0 to width img - 1 do
>       let s = ref 0.0 in
>       for i = 0 to sup do
>         let (kx, ky) = kern.(i) in
>         s := !s +. ky *. getReflected img y (x + kx) 1.0 r

I can think of various ways to rearrange this that might help performance.

> The new running time is:
>
> Ocaml (unsafe) : user: 21.477s, real: 23.366s

What is the running time for safe OCaml?

> which is much in line with MLton:
>
> MLton (safe): user: 17.981s, real: 21.968s

What platforms and architectures did you benchmark on? May we have the code to
benchmark it ourselves?

> Although note that the MLton version has array-bound check enabled and
> used the two-line high order function version of hconvolve.

You might also try an FFT-based convolution if your filter is dense.

> So the moral of the story: To use Ocaml for numerically intensive
> work, code in C style in the inner loops.

Absolutely.

> This brings me to the next question: is there any plan to implement
> type specialization optimization for ocamlopt? For numerics, this is
> really crucial if you want write both in an elegant functional style
> and get good performance.
> Also, I remember reading somewhere that the
> current code base of Ocaml is ill-suited for implementing this kind of
> optimization. May I ask what exactly needs to be done to the current
> code base in order to support that? I have some compiler-writing
> background and this sounds like an interesting project to work in my
> past time.

Writing OCaml programs that generate OCaml programs is by far your best bet
here. We use a replacement standard library that uses autogenerated code to
eliminate boxing and perform unrolling and type specialization where possible.

As I can autogenerate my code, I would much rather the OCaml developers
concentrated on things that I cannot get around, like the lack of a 32-bit
float storage type and a more efficient internal representation of complex
numbers.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e
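
As a rough illustration of the "OCaml programs that generate OCaml programs"
idea above, here is a minimal sketch of a generator that emits a 1D
convolution specialised to a fixed kernel, with the loop over kernel taps
fully unrolled and the weights inlined as literals. This is not the
replacement standard library mentioned above, whose code is not shown in this
thread. The names gen_convolve, conv3, src and dst are hypothetical, and the
kernel is assumed to be an array of (offset, weight) pairs as in the quoted
hconvolve. Unlike getReflected, the generated loop simply skips the border
elements.

```ocaml
(* Hypothetical sketch: emit OCaml source for a 1D convolution specialised
   to one kernel. The per-tap loop of the quoted hconvolve is unrolled at
   generation time, so the emitted code has no tuple reads and no ref cell. *)
let gen_convolve (name : string) (kern : (int * float) array) : string =
  let buf = Buffer.create 256 in
  (* The smallest and largest offsets bound how far the emitted code reads
     to the left and right of x, and therefore which x values are safe. *)
  let lo = Array.fold_left (fun m (k, _) -> min m k) 0 kern in
  let hi = Array.fold_left (fun m (k, _) -> max m k) 0 kern in
  Printf.bprintf buf
    "let %s (src : float array) (dst : float array) =\n" name;
  Printf.bprintf buf
    "  for x = %d to Array.length src - 1 - %d do\n" (-lo) hi;
  Buffer.add_string buf "    dst.(x) <-\n      0.0\n";
  (* One emitted line per tap. %h prints the weight as a hexadecimal float
     literal, so no precision is lost in the round trip through text. *)
  Array.iter
    (fun (k, w) ->
      Printf.bprintf buf "      +. (%h) *. src.(x + (%d))\n" w k)
    kern;
  Buffer.add_string buf "  done\n";
  Buffer.contents buf

(* Print the specialised source for an assumed 3-tap smoothing kernel. *)
let () =
  print_string (gen_convolve "conv3" [| (-1, 0.25); (0, 0.5); (1, 0.25) |])
```

The printed text can be written to a .ml file and compiled with ocamlopt
alongside the rest of the program. Whether this beats a hand-written loop
would need to be measured on the platform in question.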