caml-list - the Caml user's mailing list
* [Caml-list] String Problem
From: Thomas Fischbacher @ 2004-06-09  9:58 UTC
  To: caml-list


Dear Caml hackers,

I am currently running a fairly large (string theory) calculation in OCaml.
It walks a huge tree and does some computation at every node, and I have to
parallelize it effectively. My present approach is to set an alarm for the
process doing the calculation, then split the outstanding work into chunks
and serialize everything that corresponds to nodes that have been touched
but whose calculation has not yet finished. The serialized strings are then
compressed and sent over the network to other machines, which help with the
calculation.

I'd love to avoid temporary files, as these are not necessary, and my 
design is simpler and cleaner without having to worry about filesystem 
issues.

Now I have run into the problem that, in memory, OCaml can only serialize
to strings, and these are limited to 16 MB in size. My data set is
structured in a complicated way, i.e. it would be quite some effort to
write specialized readers and printers, so once it gets large enough this
entire approach breaks down.

So, would it be that much of a problem to take the length information
for strings out of the header word (I suppose that is where the limit comes
from) and use a proper 32-bit quantity on 32-bit machines instead? I simply
cannot believe that there are not many more people running into this
16 MB limitation on string lengths.
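
The arithmetic behind the 16 MB figure can be sketched as follows. The sketch assumes the standard block header layout, in which 10 bits of every header word are taken by the tag and the GC color and the remaining bits hold the block size in words; `max_string_length_for` is a hypothetical helper name, not a library function.

```ocaml
(* Assumed header layout: 8 tag bits + 2 GC color bits, so the block size
   (counted in words) gets the remaining [word_bits - 10] bits. *)
let max_string_length_for ~word_bits =
  let bytes_per_word = word_bits / 8 in
  let max_words = (1 lsl (word_bits - 10)) - 1 in
  (* The last byte of a string block is reserved for padding information. *)
  (max_words * bytes_per_word) - 1

let () =
  (* 32-bit case: 16777211 bytes, i.e. just under 16 MB. *)
  Printf.printf "32-bit limit:  %d bytes\n" (max_string_length_for ~word_bits:32);
  (* What the runtime executing this program actually reports. *)
  Printf.printf "this runtime:  %d bytes\n" Sys.max_string_length
```

On a 64-bit runtime the same layout leaves 54 bits for the size, which is why the limit is not a practical concern there.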


-- 
regards,               tf@cip.physik.uni-muenchen.de              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] String Problem
From: Olivier Andrieu @ 2004-06-09 10:35 UTC
  To: Thomas.Fischbacher; +Cc: caml-list

 Thomas Fischbacher [Wed, 9 Jun 2004]:
 > Now I encounter the problem that ocaml can only serialize to
 > strings, but these are limited to 16 MB in size. If my data set
 > (which is structured in a complicated way, i.e. it would be quite
 > some effort to write specialized readers and printers) gets large
 > enough, this entire approach therefore breaks down.

It's quite easy to serialize to a Bigarray with a bit of C code
(warning, not tested): 

,----
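| #include <caml/mlvalues.h>
| #include <caml/intext.h>
| #include <caml/bigarray.h>
| 
| /* Later OCaml releases spell these caml_output_value_to_malloc,
|    caml_ba_alloc, Caml_ba_array_val and caml_input_value_from_block. */
| 
| /* Marshal v into a malloc()ed buffer and wrap it in a byte Bigarray. */
| CAMLprim value ml_marshal_to_bigarray(value v, value flags)
| {
|   char *buf;
|   long len;
|   output_value_to_malloc(v, flags, &buf, &len);
|   /* BIGARRAY_MANAGED: buf is free()d when the Bigarray is collected. */
|   return alloc_bigarray(BIGARRAY_UINT8 | BIGARRAY_C_LAYOUT | BIGARRAY_MANAGED, 
|                         1, buf, &len);
| }
| 
| /* Rebuild a value from the marshaled bytes held in the Bigarray. */
| CAMLprim value ml_demarshal_from_bigarray(value b)
| {
|   struct caml_bigarray *b_arr = Bigarray_val(b);
|   return input_value_from_block(b_arr->data, b_arr->dim[0]);
| }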
`----

,----
| open Bigarray
| 
| external marshal_to_bigarray : 
|   'a -> Marshal.extern_flags list -> 
|   (char, int8_unsigned_elt, c_layout) Array1.t 
|      = "ml_marshal_to_bigarray"
| 
| external demarshal_from_bigarray :
|   (char, int8_unsigned_elt, c_layout) Array1.t -> 'a 
|      = "ml_demarshal_from_bigarray"
`----
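
For readers who would rather avoid C stubs, a pure-OCaml workaround can be sketched as follows. It is not proposed anywhere in this thread: the idea is to marshal the pending work in several smaller batches so that no single marshaled string comes near the limit. It assumes the work can be flattened into a list of independent items, and `batches`, `marshal_batches` and `unmarshal_batches` are hypothetical helper names.

```ocaml
(* Split [items] into consecutive sublists of at most [n] elements. *)
let batches n items =
  if n <= 0 then invalid_arg "batches: n must be positive";
  let rec go acc current count = function
    | [] ->
      let acc =
        match current with
        | [] -> acc
        | _ -> List.rev current :: acc
      in
      List.rev acc
    | x :: rest ->
      if count = n then go (List.rev current :: acc) [ x ] 1 rest
      else go acc (x :: current) (count + 1) rest
  in
  go [] [] 0 items

(* Marshal each batch on its own; every result is an ordinary string. *)
let marshal_batches ~batch_size items =
  List.map (fun batch -> Marshal.to_string batch []) (batches batch_size items)

(* Inverse operation. Marshal is not type-safe, so the caller has to pin
   down the element type of the result. *)
let unmarshal_batches (chunks : string list) : 'a list =
  List.concat
    (List.map (fun s -> (Marshal.from_string s 0 : 'a list)) chunks)

let () =
  let work = [ 1; 2; 3; 4; 5 ] in
  let chunks = marshal_batches ~batch_size:2 work in
  let restored : int list = unmarshal_batches chunks in
  Printf.printf "%d chunks, round trip ok: %b\n"
    (List.length chunks) (restored = work)
```

Marshal only preserves sharing within a single call, so structure shared between items that land in different batches is duplicated on the receiving side. Choosing a batch size that keeps every chunk under the limit is left to the caller.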

Alternatively, buy a 64-bit computer :)

-- 
   Olivier


* Re: [Caml-list] String Problem
From: Thomas Fischbacher @ 2004-06-09 11:05 UTC
  To: Olivier Andrieu; +Cc: caml-list


On Wed, 9 Jun 2004, Olivier Andrieu wrote:

> It's quite easy to serialize to a Bigarray with a bit of C code
> (warning, not tested): 

At least the language should provide out-of-the-box support for this.

> Alternatively, buy a 64-bit computer :)

How funny. The only reason I am doing this in OCaml is that I want to 
be able to abuse a few of the Windows boxen here to help with the 
calculation.

Otherwise, I'd have done it in LISP.

