Subject: Re: [Caml-list] I'm sure it's been discussed a few times, but here we go.... single-precision floats
From: skaller
To: Asfand Yar Qazi
Cc: Caml-list@yquem.inria.fr
Date: Tue, 07 Mar 2006 00:13:18 +1100
Message-Id: <1141650798.13144.138.camel@budgie.wigram>
In-Reply-To: <440C29DC.4000701@asfandyar.cjb.net>

On Mon, 2006-03-06 at 12:23 +0000, Asfand Yar Qazi wrote:

> All the OCaml discussions about floating point precision I have seen so far
> revolve around how fast operations are performed on them - but the critical
> thing for things like collision detection, etc.
> in games is the amount of data
> that can fit into the CPU cache and be operated on before the cache must be
> reloaded. Obviously, twice as many single precision floats can fit into any
> CPU's cache than double precision floats.

No, it isn't obvious. I have some routines using the Taka algorithm,
which is heavily recursive and therefore depends heavily on cache use.
The code for gcc and Felix uses single precision floats. It is competing
with OCaml, which is not only using double precision: it might be
boxing as well! [Perhaps gcc is passing the single precision floats
as double anyhow?]

http://felix.sourceforge.net/current/speed/en_flx_perf_0012.html

This is an older set of results. On my newer AMD64x2 3800, gcc opt wins,
but OCaml is second.

The effect of cache usage, ignoring FP calculation speed, is seen here:

http://felix.sourceforge.net/current/speed/en_flx_perf_0005.html

where Felix trashes everything by a clear margin simply because
it happens to use one less word on the stack than its nearest
competitor.

Perhaps Xavier can explain how OCaml manages to be so dang fast
on the FP stuff: if you try C or Felix with doubles, they
drop right off the chart.

> We're talking huge dynamic data structures with millions of floating point
> coordinates that all have to be iterated over many times a second - preferably
> by using multithreaded algorithms, so that multiple CPUs can be used
> efficiently.

OCaml doesn't currently permit multi-processing. Felix does, so it
might be a better alternative for games and possibly High Performance
Computing apps. Other options include Haskell and MLton.

--
John Skaller
Async PL, Realtime software consultants
Check out Felix: http://felix.sourceforge.net
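
For readers who want to try the comparison discussed above, here is a minimal C sketch. It assumes that "Taka" refers to the Takeuchi (Tak) function; the actual benchmark source behind the linked results is not shown in this message, so the function names tak_f, tak_d and elems_in_cache are hypothetical and chosen only for illustration. The last helper merely restates the quoted capacity argument as arithmetic; it says nothing about measured speed.

```c
#include <assert.h>
#include <stddef.h>

/* Tak function on single-precision floats.
   Assumption: this is the kind of heavily recursive routine meant by
   "Taka" in the message; it is not the benchmark's actual source. */
float tak_f(float x, float y, float z)
{
    if (y >= x)
        return z;
    return tak_f(tak_f(x - 1.0f, y, z),
                 tak_f(y - 1.0f, z, x),
                 tak_f(z - 1.0f, x, y));
}

/* Same recursion on doubles, for a side-by-side timing. */
double tak_d(double x, double y, double z)
{
    if (y >= x)
        return z;
    return tak_d(tak_d(x - 1.0, y, z),
                 tak_d(y - 1.0, z, x),
                 tak_d(z - 1.0, x, y));
}

/* Number of elements of elem_size bytes that fit in cache_bytes bytes.
   Ignores cache-line granularity and other data sharing the cache. */
size_t elems_in_cache(size_t cache_bytes, size_t elem_size)
{
    assert(elem_size > 0);
    return cache_bytes / elem_size;
}
```

Calling tak_f(18.0f, 12.0f, 6.0f) and tak_d(18.0, 12.0, 6.0) in a timing loop gives the two variants to compare. Where float is 4 bytes and double is 8, elems_in_cache(64 * 1024, sizeof(float)) is twice the count for sizeof(double), which is the capacity claim in the quoted text; whether that shows up as a speed difference is exactly what the results above call into question.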