caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* bigarrays much lower than normal ones
@ 2004-10-31 16:05 Hal Daume III
  2004-10-31 17:26 ` [Caml-list] " John Prevost
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Hal Daume III @ 2004-10-31 16:05 UTC (permalink / raw)
  To: caml-list

I've been hitting the limiting size of normal float arrays and was having 
a look at the Bigarray module.  Unfortunately, it seems roughly 3-4 times 
*slower* than the standard array, which is pretty much unacceptable for 
me.  Am I doing something naively wrong, or are the Bigarrays truly this 
slow?  The timing results I get (i686, redhat) are along the liens of:

stdarray, safe:

  12.000u 0.030s 0:12.18 98.7%    0+0k 0+0io 107pf+0w
  12.060u 0.030s 0:12.22 98.9%    0+0k 0+0io 107pf+0w

stdarray, unsafe:

  11.990u 0.070s 0:12.21 98.7%    0+0k 0+0io 107pf+0w
  12.130u 0.040s 0:12.31 98.8%    0+0k 0+0io 107pf+0w

bigarray, 64 bit:

  39.760u 0.040s 0:40.35 98.6%    0+0k 0+0io 110pf+0w
  39.750u 0.030s 0:40.09 99.2%    0+0k 0+0io 110pf+0w

bigarray, 32 bit:

  41.950u 0.050s 0:42.60 98.5%    0+0k 0+0io 110pf+0w
  42.070u 0.040s 0:42.53 99.0%    0+0k 0+0io 110pf+0w


(safe vs. unsafe is when compiled normally or with -unsafe; 64bit vs 32bit 
is the 'kind' used for the bigarrays.)

I'm also really shocked that the 32 bit float bigarrays are slower than 
the 64 bit ones!

Can someone explain this to me?

The code is:

<standard array>

open Array

let normalize a = 
  let s = fold_left (+.) 0. a in
    for i = 0 to length a - 1 do
      a.(i) <- a.(i) /. s;
    done;
    ()

let _ =
  let a = make 1000000 0. in
    for iter = 1 to 100 do
      for i = 0 to 999999 do
        let i' = float_of_int i in
          a.(i) <- log (0.01 *. i' *. i' +. 3. *. i' +. 4.);
      done;
      normalize a;
    done;
    ()



<big array>

open Bigarray

let normalize a =
  let _N = Array1.dim a in
  let rec sum n acc =
    if n >= _N then acc
    else sum (n+1) (acc +. Array1.get a n) in
  let s = sum 0 0. in
    for i = 0 to _N - 1 do
      Array1.set a i (Array1.get a i /. s);
    done;
    ()

let _ =
  let a = Array1.create float32 c_layout 1000000 in
    for iter = 1 to 100 do
      for i = 0 to 999999 do
        let i' = float_of_int i in
          Array1.set a i (log (0.01 *. i' *. i' +. 3. *. i' +. 4.));
      done;
      normalize a;
    done;
    ()





If you put the array allocation inside the iter loop, nothing changes 
much, relatively, on the timing results.

 - Hal


-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume





^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] bigarrays much lower than normal ones
  2004-10-31 16:05 bigarrays much lower than normal ones Hal Daume III
@ 2004-10-31 17:26 ` John Prevost
  2004-10-31 17:41 ` malc
  2004-11-01  0:05 ` skaller
  2 siblings, 0 replies; 4+ messages in thread
From: John Prevost @ 2004-10-31 17:26 UTC (permalink / raw)
  To: caml-list, Hal Daume III

On Sun, 31 Oct 2004 08:05:46 -0800 (PST), Hal Daume III <hdaume@isi.edu> wrote:
> I've been hitting the limiting size of normal float arrays and was having
> a look at the Bigarray module.  Unfortunately, it seems roughly 3-4 times
> *slower* than the standard array, which is pretty much unacceptable for
> me.  Am I doing something naively wrong, or are the Bigarrays truly this
> slow?

It is indeed possible to speed things up by quite a lot.  There are a
few factors at play that make your code slower than it has to be, but
the dominant factor is that your Bigarray version of normalize is much
more polymorphic than it needs to be:

val normalize : (float, 'a, 'b) Bigarray.Array1.t -> unit = <fun>

Compared to this for the normal array version:

val normalize : float array -> unit = <fun>

In the Array version, there's only one type parameter to worry about
at compile time--what goes into the array.  That defines everything
you need to know.  This is important because the compiler makes
optimizations for arrays of floating point numbers when it has the
ability to.  When a function is polymorphic, on the other hand, it has
to generate more generic code.

You noted that float32 was slower than float64: That's because
O'Caml's native float representation is always a 64-bit value.  In the
polymorphic version of normalize, the code has to figure out whether
it's working with a float32 or a float64 representation when it pulls
the values out.  The other type variable, which defines the array
layout (C or Fortran) also needs to be cut down to avoid over-generic
code.

I tried some other modifications, trying to remove overhead from
bounds checking--but it turns out that those modifications actually
slowed things down.  :)  In any case, the version with restricted
polymorphism on normalize sped things up a *lot*.

Unmodified Array:
real    1m30.292s
user    1m30.190s
sys     0m0.110s

Unmodified Bigarray:
real    3m31.446s
user    3m31.310s
sys     0m0.130s

Modified Bigarray (restricted polymorphism):
real    1m37.916s
user    1m37.810s
sys     0m0.120s

------------
open Bigarray

let normalize
 (a : (float, Bigarray.float64_elt, Bigarray.c_layout) Bigarray.Array1.t) =
 let _N = Array1.dim a in
 let rec sum n acc =
   if n >= _N then acc
   else sum (n+1) (acc +. Array1.get a n) in
 let s = sum 0 0. in
   for i = 0 to _N - 1 do
     Array1.set a i (Array1.get a i /. s);
   done;
   ()

let _ =
 let a = Array1.create float64 c_layout 1000000 in
   for iter = 1 to 100 do
     for i = 0 to 999999 do
       let i' = float_of_int i in
         Array1.set a i (log (0.01 *. i' *. i' +. 3. *. i' +. 4.));
     done;
     normalize a;
   done;
   ()
------------

You see that the one thing I changed here was to add the type
constraint in the definition of normalize, and it became almost as
fast as the normal array version.

The other thing I'll point out is that you can write:

Array1.set a i x; Array1.get a i

as

a.{i} <- x; a.{i}

Which can be quite a bit easier to read.  If I recall right, this
works for arrays of more than one dimension, as well.  I can't seem to
find the documentation for this feature, however.

John.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] bigarrays much lower than normal ones
  2004-10-31 16:05 bigarrays much lower than normal ones Hal Daume III
  2004-10-31 17:26 ` [Caml-list] " John Prevost
@ 2004-10-31 17:41 ` malc
  2004-11-01  0:05 ` skaller
  2 siblings, 0 replies; 4+ messages in thread
From: malc @ 2004-10-31 17:41 UTC (permalink / raw)
  To: Hal Daume III; +Cc: caml-list

On Sun, 31 Oct 2004, Hal Daume III wrote:

> I've been hitting the limiting size of normal float arrays and was having
> a look at the Bigarray module.  Unfortunately, it seems roughly 3-4 times
> *slower* than the standard array, which is pretty much unacceptable for
> me.  Am I doing something naively wrong, or are the Bigarrays truly this
> slow?  The timing results I get (i686, redhat) are along the liens of:

http://groups.google.com/groups?hl=en&lr=&c2coff=1&safe=off&threadm=fa.hj2u7jv.t1ms25%40ifi.uio.no&rnum=1&prev=/groups%3Fq%3Dbigarray%2Bmalc%26hl%3Den%26lr%3D%26c2coff%3D1%26safe%3Doff%26selm%3Dfa.hj2u7jv.t1ms25%2540ifi.uio.no%26rnum%3D1

-- 
mailto:malc@pulsesoft.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] bigarrays much lower than normal ones
  2004-10-31 16:05 bigarrays much lower than normal ones Hal Daume III
  2004-10-31 17:26 ` [Caml-list] " John Prevost
  2004-10-31 17:41 ` malc
@ 2004-11-01  0:05 ` skaller
  2 siblings, 0 replies; 4+ messages in thread
From: skaller @ 2004-11-01  0:05 UTC (permalink / raw)
  To: Hal Daume III; +Cc: caml-list

On Mon, 2004-11-01 at 03:05, Hal Daume III wrote:

> I'm also really shocked that the 32 bit float bigarrays are slower than 
> the 64 bit ones!

IEEE hardware loads/saves 64 bit and converts internally to 80 bit.
32 bit requires extra conversions. OTOH you might save on memory
and gain some performance wrt caches.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2004-11-01  0:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-31 16:05 bigarrays much lower than normal ones Hal Daume III
2004-10-31 17:26 ` [Caml-list] " John Prevost
2004-10-31 17:41 ` malc
2004-11-01  0:05 ` skaller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).