From: Jon Harrop
Organization: Flying Frog Consultancy Ltd.
To: caml-list@yquem.inria.fr
Subject: Ray tracer language comparison
Date: Tue, 4 Oct 2005 00:18:24 +0100

I've updated my language comparison with four implementations in Scheme and
one in Lisp:

  http://www.ffconsultancy.com/free/ray_tracer/languages.html

In short, Stalin's run-time performance is excellent (36% faster than
ocamlopt) but its compile times are poor (2,000x slower than ocamlopt!) and
SBCL-compiled Lisp is 6x slower than ocamlopt. Both Scheme and Lisp are >2x as
verbose as OCaml.

Juho Snellman posted an interesting Lisp variant that used a macro to factor
out unboxing, greatly reducing the stress on the GC and improving Lisp's
performance to that of OCaml. Applying the same optimisation to the OCaml
implementations makes them much faster again, but I have yet to factor this
out into a camlp4 macro.

This raises two interesting questions:

1. Can a camlp4 macro be written to factor out this unboxing transformation:

  let v = center -| orig in
  let b = dot v dir in
  let disc = b *. b -. dot v v +. radius *. radius in

where "-|" denotes vector subtraction (likewise, "+|" denotes vector addition
and "*|" denotes scalar*vector multiplication). Transformed to:

  let vx = center.x -. orig.x in
  let vy = center.y -. orig.y in
  let vz = center.z -. orig.z in
  let b = vx *. dir.x +. vy *. dir.y +. vz *. dir.z in
  let vv = vx *. vx +. vy *. vy +. vz *. vz in
  let disc = b *. b -. vv +. radius *. radius in

So the intermediate vectors are not allocated and collected.

2. Manually inlining the code gives a huge performance improvement, but using
-inline does not appear to give this improvement. I assume ocamlopt can inline
this but will not remove the boxing and subsequent unboxing. Is that right?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists
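
For reference, the two forms in question 1 can be compared with a small
self-contained sketch. The record type vec, the definitions of "-|" and dot,
and the function names disc_boxed and disc_unboxed are assumptions made for
illustration, not the code from the ray tracer itself. Under those
assumptions, both functions should compute the same discriminant, with the
second keeping v as three separate floats instead of a record.

```ocaml
(* Sketch only: the vec type, the operators and the function names are
   assumed for illustration and are not taken from the ray tracer. *)
type vec = { x : float; y : float; z : float }

(* Vector subtraction: builds a new record for the result. *)
let ( -| ) a b = { x = a.x -. b.x; y = a.y -. b.y; z = a.z -. b.z }

(* Dot product of two vectors. *)
let dot a b = a.x *. b.x +. a.y *. b.y +. a.z *. b.z

(* Original form: the intermediate vector v is a heap-allocated record
   unless the compiler manages to eliminate it. *)
let disc_boxed orig dir center radius =
  let v = center -| orig in
  let b = dot v dir in
  b *. b -. dot v v +. radius *. radius

(* Hand-transformed form: v is kept as three floats, so no intermediate
   record is constructed. *)
let disc_unboxed orig dir center radius =
  let vx = center.x -. orig.x in
  let vy = center.y -. orig.y in
  let vz = center.z -. orig.z in
  let b = vx *. dir.x +. vy *. dir.y +. vz *. dir.z in
  let vv = vx *. vx +. vy *. vy +. vz *. vz in
  b *. b -. vv +. radius *. radius

let () =
  let orig = { x = 0.; y = 0.; z = -4. } in
  let dir = { x = 0.; y = 0.; z = 1. } in
  let center = { x = 0.; y = 1.; z = 0. } in
  Printf.printf "boxed = %g, unboxed = %g\n"
    (disc_boxed orig dir center 1.)
    (disc_unboxed orig dir center 1.)
```

Timing the two functions in a loop under ocamlopt, with and without -inline,
would be one way to reproduce the difference described in question 2.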