caml-list - the Caml user's mailing list
* [Caml-list] Bigarray vs. array - mixing?
From: Daniel Andor @ 2003-04-23 14:56 UTC
  To: caml-list

Hi All,

I've looked through the archives, but could not find any talk of this 
particular problem.

Basically I have numerical code that uses Bigarrays in some parts (for example 
in interfacing with Lacaml), but other parts that use arrays.  It doesn't 
seem to be that clean to make them co-exist.  Which should I use?

Since I was forced to use Bigarrays for Lacaml (which is a wonderful interface 
to LAPACK -- but missing some drivers. :(((  ), I decided to use Bigarrays 
for much of the rest of my program.  I made judicious use of blit and splice, 
since I assume that they only do two bounds checks. But my code still spends 
a lot of time in Bigarray.  In fact approx the *same amount of time* as it 
spends calculating! (according to gprof)

Even though Bigarrays are efficient to access from C/Fortran in LAPACK, I 
still have to set the matrices up.  And to do that I have to shuffle around 
lots of itsy bitsy matrices -- and that's what seems to be killing me.
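
As a rough illustration of the kind of shuffling described above, here is a minimal sketch that copies a small matrix into a larger one in two ways. The names `mat`, `blit_block` and `copy_block` are made up for this example, and C layout is an assumption. For very small blocks the plain loop may well be cheaper, since the view-based version allocates a few bigarray views per row; only a profile can say.

```ocaml
open Bigarray

(* Hypothetical alias: a C-layout matrix of double-precision floats. *)
type mat = (float, float64_elt, c_layout) Array2.t

(* Copy [src] into [dst] with its top-left corner at row [r0], column [c0],
   one row at a time via slice_left/sub/blit. Each row allocates a few
   small views and performs one blit. *)
let blit_block (src : mat) (dst : mat) r0 c0 =
  let cols = Array2.dim2 src in
  for i = 0 to Array2.dim1 src - 1 do
    let src_row = Array2.slice_left src i in
    let dst_row = Array1.sub (Array2.slice_left dst (r0 + i)) c0 cols in
    Array1.blit src_row dst_row
  done

(* Same copy with a plain element loop: no views are allocated, but every
   access is bounds-checked. *)
let copy_block (src : mat) (dst : mat) r0 c0 =
  for i = 0 to Array2.dim1 src - 1 do
    for j = 0 to Array2.dim2 src - 1 do
      dst.{r0 + i, c0 + j} <- src.{i, j}
    done
  done

let () =
  let small =
    Array2.of_array float64 c_layout [| [| 1.; 2. |]; [| 3.; 4. |] |] in
  let big = Array2.create float64 c_layout 4 4 in
  Array2.fill big 0.;
  blit_block small big 1 2;
  Printf.printf "%g %g\n" big.{1, 2} big.{2, 3}
```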

So I don't know if I should be using arrays or Bigarrays.  They both have 
their advantages and disadvantages (arrays: fast access; Bigarrays: nice 
splicing etc, but cannot turn off bounds checking, so slow access).  

At the moment I've ended up with a mess, since my algorithm code uses arrays, 
but the code that interfaces with Lacaml and Gnuplot uses Bigarrays.
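
One way to keep such a mix manageable is to convert only at the boundary. The sketch below is an illustration, not a recommendation from the thread: `vec`, `vec_of_array` and `array_of_vec` are hypothetical names, and Fortran layout is assumed because that is what LAPACK-style bindings typically expect. Note that each conversion copies the data.

```ocaml
open Bigarray

(* Hypothetical alias for the vector type handed to the library side.
   Fortran layout is an assumption made for this example. *)
type vec = (float, float64_elt, fortran_layout) Array1.t

(* float array -> bigarray: one allocation and one copy. *)
let vec_of_array (a : float array) : vec =
  Array1.of_array float64 fortran_layout a

(* bigarray -> float array. Fortran-layout indices start at 1. *)
let array_of_vec (v : vec) : float array =
  Array.init (Array1.dim v) (fun i -> v.{i + 1})

let () =
  let xs = [| 1.0; 2.0; 3.0 |] in
  let v = vec_of_array xs in
  v.{1} <- 10.0;  (* the bigarray is a copy, so xs is unchanged *)
  let ys = array_of_vec v in
  Printf.printf "%g %g %g\n" xs.(0) ys.(0) ys.(2)
```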

What should I do?  It's hard to be consistent.

Daniel.
PS: I read "Array Optimizations in OCaml" by Michael Clarkson & Vaibhav Vaish, 
but it's no help to me at this time.  

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


* Re: [Caml-list] Bigarray vs. array - mixing?
From: Chris Hecker @ 2003-04-23 18:47 UTC
  To: Daniel Andor, caml-list


>I've looked through the archives, but could not find any talk of this
>particular problem.
>Basically I have numerical code that uses Bigarrays in some parts (for 
>example  in interfacing with Lacaml), but other parts that use arrays.  It 
>doesn't seem to be that clean to make them co-exist.  Which should I use?

There's been some discussion of this on the list a long time ago (by 
me:).  I decided to just use bigarrays for all arbitrary sized matrices 
used for linear algebra, but use arrays for fixed things like 3x3 matrices 
and whatnot.  I still end up doing tons of submatrix stuff and calculations 
into and out of the bigarrays, just like you do.

I think the Right Thing is to optimize the native compiler to output better 
bigarray access code, so that using bigarrays is less painful.  I've looked 
at it and done some minor prototyping, and it doesn't seem like it would be 
that hard.  It seems like the Caml team doesn't feel this is necessary (at 
least based on previous replies) because you're only supposed to use 
bigarrays to interface with external libraries, but this ignores how you 
get the data into the bigarray in the first place, as you've found.

I think we only need a few optimizations to get most of the way there (I 
would do all of this on bigarray1d, using a different implementation for it 
than the genarray structure and using a real conversion function instead of 
the current typecast, since any real numerical code in caml is going to 
want to manage strides and row/column access manually on a 1d array, in my 
opinion):

- a lighter weight "pointer"-ish slice type, so caml code can reference 
individual rows of a bigarray1d in an outer loop without allocating a 
new reference counted bigarray C structure and data.  blas and lapack use 
the fortran version of this all over the place, and C code does this using 
pointers everywhere as well...we need a way of expressing that operation in 
caml

- some simple/naive special-cased hoisting of the bigarray data pointer 
load out of loops, it's currently fetched every access...having the above 
lightweight slice be in caml code would help with this I think...these 
pointer reloads are the big culprit, I think

- allow unsafe access (both with unsafe_get and with operators...I was 
going to use foo.\{i} for unsafe, and then also extend it to arrays and 
strings with foo.\(i) and foo.\[i] while I was at it, since why not make it 
consistent and that operator combination is available and works)...not 
bounds checking will begin to make a difference if you get a lot of the 
other inefficiencies out of the system (it's common to hear on this list 
that bounds checking doesn't cost that much, which is true until you've 
optimized the rest of the access code)

- then look for low-hanging fruit in the FPU code, especially for x86 
(which has been talked about recently on the list)...in general, I think 
the entire x86 code generator could greatly benefit from a simple peephole 
optimization pass, too
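
To make the first three points concrete, here is a small sketch of managing strides by hand on a one-dimensional bigarray, using only what the standard Bigarray module offers in current OCaml releases. The names `vec`, `get`, `row` and `row_sum` are invented for this example. It is not the compiler change proposed above, just the source-level equivalent: `Array1.sub` stands in for the slice type (and still allocates a view per call), the row offset is computed once outside the loop, and `Array1.unsafe_get` skips the per-element bounds check.

```ocaml
open Bigarray

type vec = (float, float64_elt, c_layout) Array1.t

(* A rows x cols matrix stored row-major in one flat vector; element
   (i, j) lives at index i * cols + j. *)
let get (m : vec) ~cols i j = m.{(i * cols) + j}

(* Row [i] as a view sharing storage with [m]. Array1.sub copies no
   elements, but it does allocate a new bigarray view per call. *)
let row (m : vec) ~cols i : vec = Array1.sub m (i * cols) cols

(* Sum of row [i]: the row offset is computed once, and reads inside the
   loop are unchecked. The single explicit range check up front is what
   makes unsafe_get acceptable here. *)
let row_sum (m : vec) ~cols i =
  let base = i * cols in
  if i < 0 || cols < 0 || base + cols > Array1.dim m then
    invalid_arg "row_sum";
  let acc = ref 0.0 in
  for j = 0 to cols - 1 do
    acc := !acc +. Array1.unsafe_get m (base + j)
  done;
  !acc

let () =
  let rows, cols = 2, 3 in
  let m = Array1.create float64 c_layout (rows * cols) in
  for k = 0 to (rows * cols) - 1 do
    m.{k} <- float_of_int k
  done;
  let r1 = row m ~cols 1 in
  r1.{0} <- 30.0;  (* writes through to the underlying storage of m *)
  Printf.printf "%g %g\n" (get m ~cols 1 0) (row_sum m ~cols 1)
```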

I'm not a compiler person, and I just messed around with the above, so I 
have no idea how hard it would really be (especially the data pointer 
hoisting), but it didn't seem too bad when I looked at it.  However, I do 
think it's worth doing stuff like this, because the 
often mentioned retort of "use C for numerical stuff" just doesn't scale 
well to projects that do a lot of math dispersed throughout the program.

Chris


* Re: [Caml-list] Bigarray vs. array - mixing?
From: Xavier Leroy @ 2003-04-24 13:41 UTC
  To: Daniel Andor; +Cc: caml-list

> Basically I have numerical code that uses Bigarrays in some parts
> (for example in interfacing with Lacaml), but other parts that use
> arrays.  It doesn't seem to be that clean to make them co-exist.
> Which should I use?  Since I was forced to use Bigarrays for Lacaml
> (which is a wonderful interface to LAPACK -- but missing some
> drivers. :((( ), I decided to use Bigarrays for much of the rest of
> my program.  I made judicious use of blit and splice, since I assume
> that they only do two bounds checks. But my code still spends a lot
> of time in Bigarray.  In fact approx the *same amount of time* as it
> spends calculating! (according to gprof)

If the profile shows that significant time is spent in the bigarray_get_*
and bigarray_set_* functions, this indicates that your Caml code is too
polymorphic.  ocamlopt can inline bigarray accesses only when the
type of the bigarray is fully, statically known.

It is easy to get unwanted polymorphism for Caml code that uses
bigarrays.  For instance,

        let f a = a.{0} <- 3.14

has type (float, 'a, 'b) Bigarray.Array1.t -> unit.  The assignment
determines that the Caml type of the array elements is float,
but it doesn't determine fully the underlying representation type
(could be float32 as well as float64), nor the layout of the array
(could be C or Fortran layout).  
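
A small sketch of what that polymorphism means in practice: the same unannotated `f` accepts both a float32 and a float64 array, so the store cannot be compiled to a single fixed-width write. The arrays `a32` and `a64` are names made up for this example; C layout is used for both so that index 0 is valid.

```ocaml
open Bigarray

(* No annotation: inferred as (float, 'a, 'b) Array1.t -> unit. *)
let f a = a.{0} <- 3.14

let () =
  let a32 = Array1.create float32 c_layout 1 in
  let a64 = Array1.create float64 c_layout 1 in
  f a32;  (* stored as a 4-byte single-precision value *)
  f a64;  (* stored as an 8-byte double-precision value *)
  (* Reading back shows the two stores really were different: the
     float32 element has been rounded to single precision. *)
  Printf.printf "%b %b\n" (a64.{0} = 3.14) (a32.{0} = 3.14)
```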

Thus, the assignment cannot be inlined and must be performed by a C
function that discriminates at run-time on the actual representation
types and layout.  This is quite slow indeed.  To avoid this, consider
adding a type constraint:

        open Bigarray

        type floatarray = (float, float64_elt, c_layout) Array1.t

        let f (a: floatarray) = a.{0} <- 3.14

That should improve performance somewhat.  Still, the extra
flexibility of bigarrays over regular arrays causes the inlined
bigarray access code to be a bit slower than regular array accesses.
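
The same idea applied to a loop, as a minimal sketch: `dot` is a hypothetical function, and the alias is the one from the example above. Annotating both arguments fixes the element kind and layout for every access in the loop body. How much this gains depends on the compiler version and the surrounding code, so it is worth checking against a profile.

```ocaml
open Bigarray

type floatarray = (float, float64_elt, c_layout) Array1.t

(* Both arguments are annotated, so kind and layout are statically known
   at every access in the loop. *)
let dot (x : floatarray) (y : floatarray) =
  let n = Array1.dim x in
  if Array1.dim y <> n then invalid_arg "dot: dimension mismatch";
  let acc = ref 0.0 in
  for i = 0 to n - 1 do
    acc := !acc +. (x.{i} *. y.{i})
  done;
  !acc

let () =
  let x = Array1.of_array float64 c_layout [| 1.0; 2.0; 3.0 |] in
  let y = Array1.of_array float64 c_layout [| 4.0; 5.0; 6.0 |] in
  Printf.printf "%g\n" (dot x y)
```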

Hope this helps.

- Xavier Leroy


* Re: [Caml-list] Bigarray vs. array - mixing?
From: Markus Mottl @ 2003-04-24 15:12 UTC
  To: Daniel Andor; +Cc: caml-list

On Wed, 23 Apr 2003, Daniel Andor wrote:
> Since I was forced to use Bigarrays for Lacaml (which is a wonderful interface 
> to LAPACK -- but missing some drivers. :(((  ), I decided to use Bigarrays 
> for much of the rest of my program.

In case you need specific drivers provided by LAPACK or BLAS but not
yet interfaced from LACAML, there are two ways to change this situation:

  * Contribute code that interfaces the function. Since LAPACK is a huge
    library with tons of specialized algorithms, this is the preferred
    solution :-)

  * Send your wish to either <Christophe.Troestler@umh.ac.be> or to me.

It's really not that difficult to add further drivers. Just follow
the numerous examples of already implemented functions. In the case of
difficulties, don't hesitate to ask questions!

Regards,
Markus Mottl

-- 
Markus Mottl          http://www.oefai.at/~markus          markus@oefai.at
