From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jon@ffconsultancy.com>
Delivered-To: caml-list@yquem.inria.fr
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78])
	by yquem.inria.fr (Postfix) with ESMTP id 684BFBC8E
	for <caml-list@yquem.inria.fr>; Wed,  4 May 2005 09:27:13 +0200 (CEST)
Received: from ptb-relay01.plus.net (ptb-relay01.plus.net [212.159.14.212])
	by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j447RCxf015174
	(version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO)
	for <caml-list@yquem.inria.fr>; Wed, 4 May 2005 09:27:13 +0200
Received: from [80.229.56.224] (helo=chetara)
	 by ptb-relay01.plus.net with esmtp (Exim) id 1DTEHo-0001wp-5a
	for caml-list@yquem.inria.fr; Wed, 04 May 2005 08:27:12 +0100
From: Jon Harrop <jon@ffconsultancy.com>
Organization: Flying Frog Consultancy Ltd.
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: Mini ray tracer
Date: Wed, 4 May 2005 08:27:11 +0100
User-Agent: KMail/1.7.1
References: <200504281037.46186.jon@ffconsultancy.com> <slrnd7d1d1.dg5.brown@panic.cs.bris.ac.uk> <42786A96.3030204@bik-gmbh.de>
In-Reply-To: <42786A96.3030204@bik-gmbh.de>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200505040827.11366.jon@ffconsultancy.com>
X-Miltered: at nez-perce with ID 42787950.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Spam: no; 0.00; caml-list:01 tracer:01 florian:01 hars:01 trivial:01 gcc:01 timings:01 ocaml:01 ocaml:01 inlining:01 structs:01 afaik:01 suboptimal:01 optimise:01 intersection:01 
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.0.2
X-Spam-Level: 

On Wednesday 04 May 2005 07:24, Florian Hars wrote:
> I did some comparisions (the ml code on that page doesn't compile, but the
> fix is trivial) and told gcc 3.3.5 to actually optimize the c++ (-O2
> -march=x86-64 -msse2 -ffast-math)

As I said last night, my timings were for optimised C++ and optimised OCaml on 
x86.

> and the c++ was consistently faster than 
> the ocaml code for detail levels of 4 and greater, marginally at 4, about
> 20% at a level of 10, twice as fast at 12, and infinitely faster at a level
> of 14 (the c++ program finished in less than three minutes, the ocaml
> program started to trigger the oom killer after about an hour, and I
> finally had to push the friendly red button labeled "reset" to get my
> computer back).

Yes, the OCaml program seems to use ~50% more memory. I assume C++ is inlining 
those structs. It would be ugly to work around this in the OCaml, AFAIK. If 
you want to work at the extremes of memory usage then you'll probably want to 
ditch that data structure and use a single-precision float big array.

> This looks like a suboptimal example for the speed of ocaml.

These results appear to be specific to 64-bit.

As far as the shootout is concerned, we seem to have agreed that you can 
optimise your code up to 100 LOC. So the OCaml can have some algorithmic 
optimisations which increase its performance and verbostity.

With these algorithmic optimisations, I get 35.6s (82 LOC OCaml) vs 36.8s (92 
LOC C++) on AMD64 for n=11.

The optimisation is to trace shadow rays using an intersection routine which 
returns as soon as any intersection is found, rather than returning the 
parameter of the first intersection.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists