From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jon Harrop
Organization: Flying Frog Consultancy Ltd.
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: Mini ray tracer
Date: Wed, 4 May 2005 08:27:11 +0100
References: <200504281037.46186.jon@ffconsultancy.com> <42786A96.3030204@bik-gmbh.de>
In-Reply-To: <42786A96.3030204@bik-gmbh.de>
Message-Id: <200505040827.11366.jon@ffconsultancy.com>

On Wednesday 04 May 2005 07:24, Florian Hars wrote:
> I did some comparisions (the ml code on that page doesn't compile, but the
> fix is trivial) and told gcc 3.3.5 to actually optimize the c++ (-O2
> -march=x86-64 -msse2 -ffast-math)

As I said last night, my timings were for optimised C++ and optimised OCaml
on x86.

> and the c++ was consistently faster than
> the ocaml code for detail levels of 4 and greater, marginally at 4, about
> 20% at a level of 10, twice as fast at 12, and infinitely faster at a level
> of 14 (the c++ program finished in less than three minutes, the ocaml
> program started to trigger the oom killer after about an hour, and I
> finally had to push the friendly red button labeled "reset" to get my
> computer back).

Yes, the OCaml program seems to use ~50% more memory. I assume C++ is
inlining those structs. It would be ugly to work around this in the OCaml,
AFAIK. If you want to work at the extremes of memory usage then you'll
probably want to ditch that data structure and use a single-precision float
Bigarray.

> This looks like a suboptimal example for the speed of ocaml.

These results appear to be specific to 64-bit.

As far as the shootout is concerned, we seem to have agreed that you can
optimise your code as long as it stays within 100 LOC. So the OCaml can have
some algorithmic optimisations which increase both its performance and its
verbosity.

With these algorithmic optimisations, I get 35.6s (82 LOC OCaml) vs 36.8s
(92 LOC C++) on AMD64 for n=11.

The optimisation is to trace shadow rays using an intersection routine which
returns as soon as any intersection is found, rather than returning the
parameter of the first intersection.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists
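
For anyone wanting to try the single-precision Bigarray suggestion above, here is a minimal sketch. It is not taken from the ray tracer itself: the flat layout (four floats per sphere: centre x, y, z, then radius) and the names floats_per_sphere, create_spheres, set_sphere and get_sphere are assumptions made purely for illustration. It also assumes an OCaml version in which Bigarray is available without extra linking (older versions need the bigarray library).

```ocaml
(* Assumed layout, for illustration only: each sphere occupies four
   consecutive float32 slots: centre x, y, z, then radius. *)
let floats_per_sphere = 4

type store = (float, Bigarray.float32_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Allocate unboxed single-precision storage for n spheres.
   The contents are uninitialised until set_sphere is called. *)
let create_spheres n : store =
  Bigarray.Array1.create Bigarray.float32 Bigarray.c_layout
    (n * floats_per_sphere)

(* Write sphere i as (x, y, z, radius). *)
let set_sphere (a : store) i (x, y, z, r) =
  let k = i * floats_per_sphere in
  a.{k} <- x;
  a.{k + 1} <- y;
  a.{k + 2} <- z;
  a.{k + 3} <- r

(* Read sphere i back as (x, y, z, radius). *)
let get_sphere (a : store) i =
  let k = i * floats_per_sphere in
  (a.{k}, a.{k + 1}, a.{k + 2}, a.{k + 3})
```

Each sphere then costs 16 bytes of unboxed storage, at the price of rounding every coordinate to single precision and of index arithmetic in place of record field access.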
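
The shadow-ray optimisation described above can be sketched roughly as follows. This is not the code from the benchmark: the types vec and sphere, the flat list of spheres, and the function names ray_sphere, nearest_hit and any_hit are all hypothetical, and the ray direction is assumed to be a unit vector. The point is only the contrast between a query that must find the nearest intersection parameter and one that can stop at the first hit.

```ocaml
(* Hypothetical minimal types; the real ray tracer's definitions are
   not shown in this message. *)
type vec = { x : float; y : float; z : float }
type sphere = { center : vec; radius : float }

let sub a b = { x = a.x -. b.x; y = a.y -. b.y; z = a.z -. b.z }
let dot a b = (a.x *. b.x) +. (a.y *. b.y) +. (a.z *. b.z)

(* Parameter t of the nearest intersection of the ray orig + t * dir
   with sphere s, or infinity on a miss. dir is assumed to be unit length. *)
let ray_sphere orig dir s =
  let v = sub s.center orig in
  let b = dot v dir in
  let disc = (b *. b) -. dot v v +. (s.radius *. s.radius) in
  if disc < 0. then infinity
  else
    let d = sqrt disc in
    let t2 = b +. d in
    if t2 < 0. then infinity
    else
      let t1 = b -. d in
      if t1 > 0. then t1 else t2

(* Primary rays need the nearest hit, so every sphere is examined. *)
let nearest_hit orig dir spheres =
  List.fold_left
    (fun best s -> min best (ray_sphere orig dir s))
    infinity spheres

(* Shadow rays only need to know whether anything blocks the light.
   List.exists stops at the first sphere that is hit. *)
let any_hit orig dir spheres =
  List.exists (fun s -> ray_sphere orig dir s < infinity) spheres
```

A shadow test would call any_hit with the surface point (offset slightly along the normal) as origin and the direction towards the light, and treat the point as shadowed when it returns true.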