caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Q: float arrays
@ 1996-10-29 11:12 Jocelyn Serot
  1996-10-30 13:40 ` Xavier Leroy
  0 siblings, 1 reply; 6+ messages in thread
From: Jocelyn Serot @ 1996-10-29 11:12 UTC (permalink / raw)
  To: caml-list


Hello,

Trying to quantify to cost of handling arrays of tuples instead of 
multiple "parallel" arrays, i recently came to some results, which at 
first sight, seems a little surprising.
Can someone please enlight me (or point me the bugs i made !..) ?

What i am comparing here is two functionnaly equivalent implementations
of the same operation, namely multiplying two arrays of floats.
The first one use a map2 higher-order-function (similar to List.map2 but
for arrays), the second works on an single array of pairs.

Here's the code:

*********************** farrays.ml ******************************************

open Array

let map f a =		(* this is simply copied from stdlib *)
  let l = length a in
  if l = 0 then [||] else begin
    let r = create l (f(unsafe_get a 0)) in
    for i = 1 to l - 1 do
      unsafe_set r i (f (unsafe_get a i))
    done;
    r
  end

let map2 f a a' =		(* a slight generalization *)
  let l = length a in
  if ( length a' != l ) then invalid_arg "map2" else
  if l = 0 then [||] else begin
    let r = create l (f(unsafe_get a 0)(unsafe_get a' 0)) in
    for i = 1 to l - 1 do
      unsafe_set r i (f (unsafe_get a i) (unsafe_get a' i))
    done;
    r
  end

let main () =
  let a1 = Array.create (1024*1024) 123.456
  and a2 = Array.create (1024*1024) 123.456
  and a3 = Array.create (1024*1024) (123.456,123.456) in
  let _ = Profile.start () in
  let f1 = Profile.f2 "map2" (map2 (fun p p' -> p*.p')) in
  let _ = for i = 0 to 10 do let r1 = f1 a1 a2 in () done in
  let f2 = Profile.f1 "map_pairs" (map (function (p,p') -> (p*.p'))) in
  let _ = for i = 0 to 10 do let r2 = f2 a3 in () done in
  let _ = Profile.stop () in
  ()

let _ = Printexc.catch main ()

*******************************************************************************

Time profiling is done using the time-profiling module posted by
Mark Hayden (hayden@cs.cornell.edu, posted 4/96) to caml mailing list.
Both the bytecode and and the native code have been profiled.

-----------------------------------------------------------------------RESULTS
BYTECODE:

jserot@tufa$ main
 Number of samples is 11049
 Each sample counts as 0.02 seconds.

 %       self     self&            self    total
 time    seconds  children  calls  ms/call ms/call name
-------------------------------------------------------
51.42   98.894   98.894       11   8.990   8.990   map2
48.58   93.446   93.446       11   8.495   8.495   map_pairs

NATIVE:

jserot@tufa$ omain
 Number of samples is 1584
 Each sample counts as 0.02 seconds.

 %       self     self&            self    total
 time    seconds  children  calls  ms/call ms/call name
-------------------------------------------------------
62.56   16.198   16.198       11   1.473   1.473   map2
37.44    9.692    9.692       11   0.881   0.881   map_pairs

nb: tufa is a sun4-51 Sparc server. Compiler is ocaml-1.02
------------------------------------------------------------------------------

What i cannot understand is that map2 seems to be slower than map_pairs (in
the range 1->2 in native compiled mode). The few i know about the Caml compiler
is that it doesnt boxe floats in arrays. But i guess that an in an array of
float*float, each pair has to be boxed. So accessing arguments from the
map_pairs function should be definitly slower than from map2, 
Does that mean that accessing two unboxed floats takes longer than
accessing a boxed pair of float (twice the time in native mode ?!)

There's also the possibility from significative innaccurcies in my measuring
protocol ..

COuld someone comment ?

	Jocelyn

-- 
E-mail: jserot@epcc.ed.ac.uk (->Nov 29 1996) ................................
S-mail: LASMEA - URA 1793 CNRS, Universite Blaise Pascal, 63177 Aubiere cedex
Tel: (33) 73.40.73.30 - Fax: (33) 73.40.72.62 ...............................
.... http://wwwlasmea.univ-bpclermont.fr/Personnel/Jocelyn.Serot/Welcome.html





^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~1996-11-06  7:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1996-10-29 11:12 Q: float arrays Jocelyn Serot
1996-10-30 13:40 ` Xavier Leroy
1996-10-30 14:08   ` Jocelyn Serot
1996-10-30 14:52   ` Thorsten Ohl
1996-10-31 15:26     ` Xavier Leroy
1996-11-05 16:56     ` Controlling inlining [was Re: Q: float arrays] rowan+

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).