From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82])
	by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p0INo2dm019928
	for <caml-list@sympa-roc.inria.fr>; Wed, 19 Jan 2011 00:50:02 +0100
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AugDALq1NU1V2gB4hWdsb2JhbACkVwEBAQoLCgUTJMI9DYVDBI5c
X-IronPort-AV: E=Sophos;i="4.60,341,1291590000"; 
   d="scan'208";a="95890636"
Received: from emailfrontal1.citycable.ch ([85.218.0.120])
  by mail1-smtp-roc.national.inria.fr with SMTP; 19 Jan 2011 00:49:57 +0100
X-Alinto-smtpauth-localdomain: Yes
Received: from seldon (unknown [85.218.92.99])
	(Authenticated sender: guillaume.yziquel@citycable.ch)
	by emailfrontal1.citycable.ch (Postfix) with ESMTPA id 7043612C6DB;
	Wed, 19 Jan 2011 00:49:55 +0100 (CET)
Received: from yziquel by seldon with local (Exim 4.72)
	(envelope-from <guillaume.yziquel@citycable.ch>)
	id 1PfLIw-0005oz-VV; Wed, 19 Jan 2011 00:49:39 +0100
Date: Wed, 19 Jan 2011 00:49:38 +0100
From: Guillaume Yziquel <guillaume.yziquel@citycable.ch>
To: Eray Ozkural <examachine@gmail.com>
Cc: Caml List <caml-list@inria.fr>
Message-ID: <20110118234938.GQ4195@localhost>
References: <AANLkTi=g2bny3VaSEOhYxEJRO_iGur365FgEsJe6d-_G@mail.gmail.com>
 <20110115123838.GS4195@localhost>
 <AANLkTinAptP4G6JZH+Msyf_99UDkWkSDF6AwH1hRfHHo@mail.gmail.com>
 <20110115172300.GN4195@localhost>
 <AANLkTi=_J3Y2kOGpEExv4Hzqkw_5yUOEtuiOVus1OfX_@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <AANLkTi=_J3Y2kOGpEExv4Hzqkw_5yUOEtuiOVus1OfX_@mail.gmail.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id p0INo2dm019928
Subject: Re: [Caml-list] Unboxing: how to do it best?

Le Saturday 15 Jan 2011 à 20:33:33 (+0200), Eray Ozkural a écrit :
>  On Sat, Jan 15, 2011 at 7:23 PM, Guillaume Yziquel
>  <[1]guillaume.yziquel@citycable.ch> wrote:
> 
>    Le Saturday 15 Jan 2011 à 16:00:21 (+0200), Eray Ozkural a écrit :
> 
>  >  On Sat, Jan 15, 2011 at 2:38 PM, Guillaume Yziquel
> 
>  >  <[1][2]guillaume.yziquel@citycable.ch> wrote:
>  >
>  >  Then, for instance, given a datatype, you may wish to construct
>  >  the datatype of an array of such types. Such a function needs to
>  >  know details about the way OCaml boxes or unboxes different kinds of
>  >  arrays, and it can be done (though rather awkwardly in my case).
> 
>  Awkwardly, but how? :)

See the function 'array' at the end of the file:

https://github.com/yziquel/OCaml-MPI/blob/master/ocamltype.p4.ml

(This code is far from clean, quite compact and ugly, but it shows it can be
done).

You here have a function that converts the a given datatype for type t
to a datatype for type t array.

>  Ok, you mean exactly like C++ type traits, where a static namespace
>  provides further type information. In OCaml that'd be a module, right.

Less static than type traits, but yes, similar in a sense.

>  I was more thinking of having a first-class module as a regular
>  value that provides, when you unpack it, sufficient information to know
>  how to cross the barriers from OCaml to C or Fortran or whatever, and then
>  send it or receive it via an MPI implementation (since that's what I'm
>  looking at). Which means all your HPC primitives must know how to
>  read properly the datatype info enclosed in your first-class module.

The contents of this first-class module is available here:

https://github.com/yziquel/OCaml-MPI/blob/master/ocamltype.p4.sig.mli

The real C hack is the 'how_to_fill' value in this module, which maps to
the switch(how) block in the following X-macro:

https://github.com/yziquel/OCaml-MPI/blob/master/messages.macro.c

This macro is used by the stub functions in

https://github.com/yziquel/OCaml-MPI/blob/master/messages.c

and these stub functions receive the 'how' value from the first-class
module datatype. It then gets arguably easier to write SPMD code which
transfers values that type-check properly with their datatype's type.

>    What I had in mind was, say, I have this CA simulation or spiking
>    neural net simulation code or a cell simulation, or a quantum
>    chromodynamics simulation, maybe a visualization of an irregular mesh,
>    or some other non-trivial scientific computing application where it's
>    difficult to reduce everything to float arrays. Because usually you
>    will have either vectors, or graphs of complex atomic structures and
>    then this boxing is going to seriously hurt performance, as performance
>    is hurt when you try to write any serious algorithm in Java in an OO
>    fashion because everything is a pointer. When you have to start writing
>    every algorithm in an awkward and bloated way to maintain some sense of
>    performance, the benefit of the language quickly vanishes. (Main reason
>    why Java should never be used except for toy web apps!) And then the
>    HPC guy will have to turn to the portable assembly of C++, right?

Possible.
 
>      I'm saying first-class module, because it can be typed as 'value
>      datatype.
>      You only know what the 'value it is supposed to encode is, and have
>      all
>      the typing info of how to deal with it encapsulated in the
>      first-class
>      module and not leaking into the rest of your code.
> 
>    Ok, care to give a minimal example? How do you pass and use the module
>    value? This sounds interesting enough. You seem to be using the module
>    to encapsulate encoding/decoding functions.

Yes.

>    Which is fine but how is it enough? How would that apply to changing
>    the memory layout of a data type (or to provide an unboxed array of such
>    values)? I thought you would be generating another module that represents
>    the same type as an array of ints, and somehow convert the types
>    transparently. How do you propose to do it?

No, it doesn't suit the specific unboxing needs you care about. What you
want really is to change the way the values are represented in memory.
You really want somehow another type of block in which you can quickly
read an OCaml value. Possibly a specific block tag for your purposes
that would advise the GC how your block packs values and how to update
pointers inside, or some workaround. You could possibly use
first-class modules to know how to read and write such a packed block
in a type-safe way, but I'd guess you'd hurt performance too much for
your purposes doing that depending on what you do.

>    >    What I would like is something like (thinking of a typical
>    simulation
>    >    datatype):
>    >    type cvector4 = ][ (complex * complex * complex * complex)
>    >    where ][ would be a "type operator" enforcing a flattened
>    >    representation of the type expression it is applied to. It would
>    just
>    >    change the layout so it would be equivalent to the same type
>    without
>    >    the unboxing op.
> 
>      If I'm not mistaking tuples or records of floats are already unboxed
>      at
>      runtime. Not seeing the great benefit here.
> 
> 
>    Yes, but the above is not a tuple of floats.
>    Best,

Well then, you could probably do something with Camlp4, but you're in
for quite some pain as far as I can see.

I imagine you could generate the first-class module for packed values
(with readers/decoders/encoders/writers in them) depending on the
']['-like type declarations. But it hardly seems fun to do.

-- 
     Guillaume Yziquel