I'm sure it's been discussed a few times, but here we go.... single-precision floats

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* I'm sure it's been discussed a few times, but here we go.... single-precision floats
@ 2006-03-06 12:23 Asfand Yar Qazi
  2006-03-06 13:13 ` [Caml-list] " skaller
  0 siblings, 1 reply; 3+ messages in thread
From: Asfand Yar Qazi @ 2006-03-06 12:23 UTC (permalink / raw)
  To: Caml-list

Hi,

I recently performed some tests on GNU C++, and found that (for a small fast
fourier transform operation anyway) double precision out-performs single
precision floating point.

However, on the Ogre 3D engine forum, I had a lengthy discussion and a
conclusion was reached that the reason single-precision floats are preferred
for games (and I assume all high-performance applications that require huge
amounts of data) is that more of them can fit into the cache of the processor.

All the OCaml discussions about floating point precision I have seen so far
evolve around how fast operations are performed on them - but the critical
thing for things like collision detection, etc. in games is the amount of data
that can fit into the CPU cache and be operated on before the cache must be
reloaded.  Obviously, twice as many single precision floats can fit into any
CPU's cache than double precision floats.

We're talking huge dynamic data structures with millions of floating point
coordinates that all have to be iterated over many times a second - preferably
by using multithreaded algorithms, so that multiple CPUs can be used
efficiently.  Since doing this sort of work (i.e. parallel computing) in C++
is a pain in the **** ('scuse my French :-), I want to learn a language that
will make it easy and less error-prone - hence my study of OCaml.

So, is there any way (I'm thinking similar to 'nativeint') to use floats in
OCaml to maximize the data that can be stored and operated on in the CPUs 
cache such that system memory is accessed as little as possible?

Thanks

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Caml-list] I'm sure it's been discussed a few times, but here we go.... single-precision floats
  2006-03-06 12:23 I'm sure it's been discussed a few times, but here we go.... single-precision floats Asfand Yar Qazi
@ 2006-03-06 13:13 ` skaller
  2006-03-06 17:01   ` Asfand Yar Qazi
  0 siblings, 1 reply; 3+ messages in thread
From: skaller @ 2006-03-06 13:13 UTC (permalink / raw)
  To: Asfand Yar Qazi; +Cc: Caml-list

On Mon, 2006-03-06 at 12:23 +0000, Asfand Yar Qazi wrote:

> All the OCaml discussions about floating point precision I have seen so far
> evolve around how fast operations are performed on them - but the critical
> thing for things like collision detection, etc. in games is the amount of data
> that can fit into the CPU cache and be operated on before the cache must be
> reloaded.  Obviously, twice as many single precision floats can fit into any
> CPU's cache than double precision floats.

No, it isn't obvious. I have some routines using the Taka algorithm
which is heavily recursive and therefore depends heavily on
cache use.

Code for gcc and Felix uses single precision floats.
It is competing with Ocaml .. which is not only using
double precision .. it might be boxing as well!???
[Perhaps gcc is passing the single precision floats
as double anyhow ..?]

http://felix.sourceforge.net/current/speed/en_flx_perf_0012.html

This is an older set of results. On my newer AMD64x2 3800,
gcc opt wins, but Ocaml is second.

The effect of cache usage ignoring FP calculation speed is
seen here:

http://felix.sourceforge.net/current/speed/en_flx_perf_0005.html

where Felix trashes everything by a clear margin simply because
it happens to use one less word on the stack than it's nearest
competitor.

Perhaps Xavier can explain how Ocaml manages to be so dang
fast on the FP stuff .. if you try C or Felix with
doubles they drop right off the chart.

> We're talking huge dynamic data structures with millions of floating point
> coordinates that all have to be iterated over many times a second - preferably
> by using multithreaded algorithms, so that multiple CPUs can be used
> efficiently. 

Ocaml doesn't currently permit multi-processing.
Felix does, it might be a better alternative for games
and possibly High Performance Computing apps.
Other options include Haskell and MLton.

-- 
John Skaller <skaller at users dot sourceforge dot net>
Async PL, Realtime software consultants
Checkout Felix: http://felix.sourceforge.net

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Caml-list] I'm sure it's been discussed a few times, but here we go.... single-precision floats
  2006-03-06 13:13 ` [Caml-list] " skaller
@ 2006-03-06 17:01   ` Asfand Yar Qazi
  0 siblings, 0 replies; 3+ messages in thread
From: Asfand Yar Qazi @ 2006-03-06 17:01 UTC (permalink / raw)
  To: Caml-list

skaller wrote:
> On Mon, 2006-03-06 at 12:23 +0000, Asfand Yar Qazi wrote:
> 
> 
>>All the OCaml discussions about floating point precision I have seen so far
>>evolve around how fast operations are performed on them - but the critical
>>thing for things like collision detection, etc. in games is the amount of data
>>that can fit into the CPU cache and be operated on before the cache must be
>>reloaded.  Obviously, twice as many single precision floats can fit into any
>>CPU's cache than double precision floats.
> 
> 
> No, it isn't obvious. I have some routines using the Taka algorithm
> which is heavily recursive and therefore depends heavily on
> cache use.
> 
> Code for gcc and Felix uses single precision floats.
> It is competing with Ocaml .. which is not only using
> double precision .. it might be boxing as well!???
> [Perhaps gcc is passing the single precision floats
> as double anyhow ..?]
> 
> http://felix.sourceforge.net/current/speed/en_flx_perf_0012.html
> 
> This is an older set of results. On my newer AMD64x2 3800,
> gcc opt wins, but Ocaml is second.

Point taken - but I'm not experienced enough to comment - perhaps John Carmack 
would be a good person to contacta about this :-)

> 
> The effect of cache usage ignoring FP calculation speed is
> seen here:
> 
> http://felix.sourceforge.net/current/speed/en_flx_perf_0005.html
> 
> where Felix trashes everything by a clear margin simply because
> it happens to use one less word on the stack than it's nearest
> competitor.
> 
> Perhaps Xavier can explain how Ocaml manages to be so dang
> fast on the FP stuff .. if you try C or Felix with
> doubles they drop right off the chart.
>
> 
>>We're talking huge dynamic data structures with millions of floating point
>>coordinates that all have to be iterated over many times a second - preferably
>>by using multithreaded algorithms, so that multiple CPUs can be used
>>efficiently. 
> 
> 
> Ocaml doesn't currently permit multi-processing.
> Felix does, it might be a better alternative for games
> and possibly High Performance Computing apps.
> Other options include Haskell and MLton.
> 

Ah, didn't know that.  On some comments made by Tim Sweeny (Epic Games of 
Unreal Engine fame), I thought functional programming make multi-threaded 
coding easy.

I'm referring to some slides found here: 
http://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced-fp/docs/sweeny.pdf

Here's what he says about 'Performance' for example:
§ When updating 10,000 objects at 60 FPS,
everything is performance-sensitive
§ But:
– Productivity is just as important
– Will gladly sacrifice 10% of our performance
for 10% higher productivity
– We never use assembly language
§ There is not a simple set of “hotspots” to
optimize!
That’s all!

Somebody commented about it here 
(http://www.gamedev.net/community/forums/topic.asp?topic_id=373751) by saying:
The bottom line is that C/C++ is becoming way too costly to be used in the 
future. Both in terms of money (bugs cost money) but also in terms of 
performance. It's very, very hard to leverage multiple threads in C/C++, and 
it's only getting worse as we get more cores in our systems -- in contrast, 
concurrency in languages such as Haskell is almost as simple as 
single-threaded programming (due to it being purely functional, and having STM).

That's why I thought about learning OCaml - I thought it had the same 
functional programming ability as Haskell, but with more (non-functional) 
features.

But I still want to learn it, it seems a thousand light years different (in a 
good way) than C++.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2006-03-06 16:59 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-03-06 12:23 I'm sure it's been discussed a few times, but here we go.... single-precision floats Asfand Yar Qazi
2006-03-06 13:13 ` [Caml-list] " skaller
2006-03-06 17:01   ` Asfand Yar Qazi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).