On Sat, Jan 15, 2011 at 7:23 PM, Guillaume Yziquel <guillaume.yziquel@citycable.ch> wrote:
On Saturday 15 Jan 2011 at 16:00:21 (+0200), Eray Ozkural wrote:
>    On Sat, Jan 15, 2011 at 2:38 PM, Guillaume Yziquel
>    <guillaume.yziquel@citycable.ch> wrote:
>
>      Then, for instance, given a datatype, you may wish to construct the
>      datatype of an array of such types. Such a function needs to know
>      details about the way OCaml boxes or unboxes different kinds of
>      arrays,
>      and it can be done (though rather awkwardly in my case).


Awkwardly, but how? :)

 
>    That's a good idea.
>    Theoretically a functor transforms programs. Radical program rewriting
>    would be just the thing to do with a functor, but I'd rather have it in
>    the compiler.

I wasn't thinking of applying functors to rewrite / specialise (whatever
you call it) some code to a datatype.


Ok, you mean exactly like C++ type traits, where a static namespace provides further type information. In OCaml that'd be a module, right?
 
I was more thinking of having a first-class module as a regular value
that provides, when you unpack it, sufficient information to know how to
cross the barriers from OCaml to C or Fortran or whatever, and then send
it or receive it via an MPI implementation (since that's what I'm
looking at). Which means all your HPC primitives must know how to
properly read the datatype info enclosed in your first-class module.

I didn't really have MPI types in mind, but it would surely be nice to integrate cleanly with MPI as well, though I think the Marshal module isn't costly (I made a small benchmark).
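
For reference, the Marshal route looks roughly like the sketch below. The `cell` record and the `encode`/`decode` helpers are made-up names for illustration, and no MPI binding is involved here: the point is only that an arbitrary OCaml value becomes a byte buffer that could be shipped as an opaque message. It says nothing about the cost of doing so.

```ocaml
(* Hypothetical simulation record that is not just a float array. *)
type cell = { id : int; state : Complex.t; neighbours : int list }

(* Serialize any value to a byte buffer using the standard Marshal module. *)
let encode (v : 'a) : Bytes.t = Marshal.to_bytes v []

(* Decoding is not type-safe: the type expected at the call site is
   trusted, so sender and receiver must agree on it. *)
let decode (b : Bytes.t) : 'a = Marshal.from_bytes b 0

let cells : cell array =
  [| { id = 0; state = { Complex.re = 1.0; im = 0.0 }; neighbours = [1] };
     { id = 1; state = { Complex.re = 0.0; im = 1.0 }; neighbours = [0] } |]

(* Round trip through the byte buffer, as a receiver would do. *)
let roundtrip : cell array = decode (encode cells)
```

The type annotation on `roundtrip` is what tells `decode` which type to produce; getting it wrong is undefined behaviour, which is one reason a typed description of the data could be attractive.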

What I had in mind was, say, a CA simulation, a spiking neural net simulation, a cell simulation, a quantum chromodynamics simulation, maybe a visualization of an irregular mesh, or some other non-trivial scientific computing application where it's difficult to reduce everything to float arrays. Usually you will have either vectors or graphs of complex atomic structures, and then this boxing is going to seriously hurt performance, just as performance is hurt when you write any serious algorithm in Java in an OO fashion, because everything is a pointer. When you have to write every algorithm in an awkward and bloated way to maintain some sense of performance, the benefit of the language quickly vanishes. (Main reason why Java should never be used except for toy web apps!) And then the HPC guy will have to turn to the portable assembly of C++, right?
 
I'm saying first-class module, because it can be typed as 'value datatype.
You only know which 'value type it is supposed to encode, and all the
typing info on how to deal with it stays encapsulated in the first-class
module and does not leak into the rest of your code.
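
One possible reading of the 'value datatype idea, as a minimal sketch rather than the actual code under discussion. The signature `DATATYPE` and the names `float64`, `complex128`, `array_of`, `pack` and `unpack` are all hypothetical, and the little-endian byte layout is an arbitrary choice. Note that this only describes how to encode values into a flat buffer; it does not change how OCaml itself lays the values out in memory.

```ocaml
(* Hypothetical signature: what a transport layer needs to know about
   how values of type [t] are laid out in a flat buffer. *)
module type DATATYPE = sig
  type t
  val name : string
  val size : int                           (* bytes per encoded value *)
  val write : Bytes.t -> int -> t -> unit  (* encode at a byte offset *)
  val read : Bytes.t -> int -> t           (* decode from a byte offset *)
end

(* The first-class module type: only the encoded value type is visible. *)
type 'a datatype = (module DATATYPE with type t = 'a)

let float64 : float datatype =
  (module struct
    type t = float
    let name = "float64"
    let size = 8
    let write buf off x = Bytes.set_int64_le buf off (Int64.bits_of_float x)
    let read buf off = Int64.float_of_bits (Bytes.get_int64_le buf off)
  end)

let complex128 : Complex.t datatype =
  (module struct
    type t = Complex.t
    let name = "complex128"
    let size = 16
    let write buf off z =
      Bytes.set_int64_le buf off (Int64.bits_of_float z.Complex.re);
      Bytes.set_int64_le buf (off + 8) (Int64.bits_of_float z.Complex.im)
    let read buf off =
      { Complex.re = Int64.float_of_bits (Bytes.get_int64_le buf off);
        im = Int64.float_of_bits (Bytes.get_int64_le buf (off + 8)) }
  end)

(* Given a datatype, build the datatype of fixed-length arrays of it. *)
let array_of (type a) (n : int) (dt : a datatype) : a array datatype =
  let module D = (val dt) in
  (module struct
    type t = a array
    let name = Printf.sprintf "%s[%d]" D.name n
    let size = n * D.size
    let write buf off xs =
      Array.iteri (fun i x -> D.write buf (off + i * D.size) x) xs
    let read buf off =
      Array.init n (fun i -> D.read buf (off + i * D.size))
  end)

(* Generic primitives unpack the module locally; callers only ever
   see the ['a datatype] value. *)
let pack (type a) (dt : a datatype) (xs : a array) : Bytes.t =
  let module D = (val dt) in
  let buf = Bytes.create (D.size * Array.length xs) in
  Array.iteri (fun i x -> D.write buf (i * D.size) x) xs;
  buf

let unpack (type a) (dt : a datatype) (buf : Bytes.t) : a array =
  let module D = (val dt) in
  Array.init (Bytes.length buf / D.size) (fun i -> D.read buf (i * D.size))
```

With this shape, `pack complex128 zs` yields a contiguous buffer of 16 bytes per element, and `array_of 4 complex128` would describe a flat vector of four complex numbers without any layout detail leaking into the calling code.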

Ok, care to give a minimal example? How do you pass and use the module value? This sounds interesting enough. You seem to be using the module to encapsulate encoding/decoding functions. Which is fine but how is it enough? How would that apply to changing the memory layout of a data type (or to provide an unboxed array of such values)? I thought you would be generating another module that represents the same type as an array of ints, and somehow convert the types transparently. How do you propose to do it?

>    What I would like is something like (thinking of a typical simulation
>    datatype):
>    type cvector4 = ][ (complex * complex * complex * complex)
>    where ][ would be a "type operator" enforcing a flattened
>    representation of the type expression it is applied to. It would just
>    change the layout so it would be equivalent to the same type without
>    the unboxing op.

If I'm not mistaken, tuples or records of floats are already unboxed at
runtime. Not seeing the great benefit here.
 
Yes, but the above is not a tuple of floats.
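
To make the distinction concrete, here is a small sketch that inspects the runtime representation with the Obj module (for illustration only; Obj is not meant for application code). The names `pair_rec`, `cvector4_flat`, `flatten` and `get` are made up. In the standard OCaml runtime, a record whose fields are all floats, such as Complex.t, is stored as a flat block of doubles, whereas a tuple is an ordinary block of pointers, so a 4-tuple of Complex.t is one block pointing to four separate blocks. The second half shows what a flattened layout would look like when written by hand.

```ocaml
(* Hypothetical record of two floats. *)
type pair_rec = { x : float; y : float }

let tag v = Obj.tag (Obj.repr v)

let z : Complex.t = { Complex.re = 1.0; im = 2.0 }
let v4 = (z, z, z, z)
let ft = (1.0, 2.0)

(* Records whose fields are all floats are flat blocks of doubles. *)
let record_is_flat = tag { x = 1.0; y = 2.0 } = Obj.double_array_tag
let complex_is_flat = tag z = Obj.double_array_tag

(* A tuple of floats is an ordinary block whose fields point to
   individually boxed floats. *)
let float_tuple_is_boxed =
  tag ft = 0 && Obj.tag (Obj.field (Obj.repr ft) 0) = Obj.double_tag

(* A tuple of complex numbers is a block of four pointers, each to a
   separate two-double block. *)
let v4_is_boxed =
  tag v4 = 0
  && Obj.size (Obj.repr v4) = 4
  && Obj.tag (Obj.field (Obj.repr v4) 0) = Obj.double_array_tag

(* Hand-flattened layout: re0; im0; re1; im1; ... as 8 consecutive doubles. *)
type cvector4_flat = float array

let flatten ((a, b, c, d) : Complex.t * Complex.t * Complex.t * Complex.t)
    : cvector4_flat =
  [| a.Complex.re; a.Complex.im; b.Complex.re; b.Complex.im;
     c.Complex.re; c.Complex.im; d.Complex.re; d.Complex.im |]

(* Read component [i] (0..3) back out of the flat representation. *)
let get (v : cvector4_flat) (i : int) : Complex.t =
  { Complex.re = v.(2 * i); im = v.(2 * i + 1) }
```

Writing `flatten` and `get` by hand for every simulation datatype is exactly the kind of bookkeeping a flattening type operator would make unnecessary.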

Best, 

--
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct