caml-list - the Caml user's mailing list
* mmap() and strings
@ 2004-12-08 20:04 Julien Cristau
  2004-12-08 20:24 ` [Caml-list] " Basile STARYNKEVITCH
  0 siblings, 1 reply; 6+ messages in thread
From: Julien Cristau @ 2004-12-08 20:04 UTC (permalink / raw)
  To: caml-list; +Cc: david.baelde

Hello list,

I'm wondering if somebody has an idea for the following problem:

I'm working on a program which manipulates a buffer. A writer process 
regularly changes this buffer, and reader processes have to work on it 
after each change. Currently, the buffer is a string and is passed to 
the readers through pipes. However, this is costly because the buffer is 
copied many times at each iteration.
We thought we could use mmap(2), but there seems to be no easy solution 
to mmap() a memory region and treat it as a string in ocaml. Does 
anybody have a better idea how we could solve this problem without 
copying the buffer?

Julien



* Re: [Caml-list] mmap() and strings
  2004-12-08 20:04 mmap() and strings Julien Cristau
@ 2004-12-08 20:24 ` Basile STARYNKEVITCH
  2004-12-08 20:50   ` Julien Cristau
  0 siblings, 1 reply; 6+ messages in thread
From: Basile STARYNKEVITCH @ 2004-12-08 20:24 UTC (permalink / raw)
  To: Julien Cristau; +Cc: caml-list

Le Wed, Dec 08, 2004 at 09:04:59PM +0100, Julien Cristau écrivait/wrote:
> Hello list,
> 
> I'm wondering if somebody has an idea for the following problem:
> 
> I'm working on a program which manipulates a buffer. A writer process 
> regularly changes this buffer, and reader processes have to work on it 
> after each change. Currently, the buffer is a string and is passed to 
> the readers through pipes. However, this is costly because the buffer is 
> copied many times at each iteration.

Actually, pipes perform quite well on Linux....

> We thought we could use mmap(2), 
> but there seems to be no easy solution 
> to mmap() a memory region and treat it as a string in ocaml. 

Use Bigarray-s for that. They can mmap files (on Unix & Linux) and are
already in Ocaml 3.08
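
A minimal sketch of this suggestion, assuming a current OCaml (4.06 or
later), where the mapping function is Unix.map_file from the unix library
rather than the Bigarray.Array1.map_file of the 3.08 era. The name
map_shared and the temporary file are hypothetical, chosen only for
illustration:

```ocaml
(* Needs the unix library, e.g. ocamlfind ocamlopt -package unix -linkpkg. *)

(* Map the first [size] bytes of the file at [path] as a shared array of
   chars. With shared = true the file is grown to [size] if it is shorter. *)
let map_shared path size =
  let fd = Unix.openfile path [ Unix.O_RDWR; Unix.O_CREAT ] 0o600 in
  match Unix.map_file fd Bigarray.char Bigarray.c_layout true [| size |] with
  | ga ->
      (* The mapping stays valid after the descriptor is closed. *)
      Unix.close fd;
      Bigarray.array1_of_genarray ga
  | exception e ->
      Unix.close fd;
      raise e

(* Two views of the same file: a write through one is visible in the other. *)
let () =
  let path = Filename.temp_file "buffer" ".bin" in
  let writer_view = map_shared path 16 in
  let reader_view = map_shared path 16 in
  Bigarray.Array1.set writer_view 0 'x';
  assert (Bigarray.Array1.get reader_view 0 = 'x');
  Sys.remove path
```

Here both views live in one process; in the writer/readers setting each
process would call map_shared on the same path, and some notification
(for instance a byte sent over the existing pipe) would still be needed to
tell readers that the buffer changed.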

You might also use IPC shared memory segments, but there is no support
in Ocaml for these (shmget & related system calls), so you'll probably
need to code the C stub (for the Ocaml binding) yourself.

Maybe you'll need JoCaml... (but I don't know if it is dead or not -
it was designed for coding cooperative programs communicating thru
channels).


-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/ 
email: basile<at>starynkevitch<dot>net 
aliases: basile<at>tunes<dot>org = bstarynk<at>nerim<dot>net
8, rue de la Faïencerie, 92340 Bourg La Reine, France



* Re: [Caml-list] mmap() and strings
  2004-12-08 20:24 ` [Caml-list] " Basile STARYNKEVITCH
@ 2004-12-08 20:50   ` Julien Cristau
  2004-12-09  1:09     ` Jacques Garrigue
  0 siblings, 1 reply; 6+ messages in thread
From: Julien Cristau @ 2004-12-08 20:50 UTC (permalink / raw)
  To: caml-list

I wrote:
> > We thought we could use mmap(2), 
> > but there seems to be no easy solution 
> > to mmap() a memory region and treat it as a string in ocaml. 
> 
On 08/12/2004-21:18, Basile STARYNKEVITCH wrote:
> Use Bigarray-s for that. They can mmap files (on Unix & Linux) and are
> already in Ocaml 3.08
> 
Actually, I had a look at bigarrays, and it's one of the solutions I 
considered. However, I'd like to keep strings as the data structure, because 
the operations I have to perform take a string as an argument, and not a 
(char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t, 
and it would be a pain to change all these functions (if I change them, 
I'll probably bind mmap() and munmap() directly and call them with 
MAP_ANONYMOUS, but I'd rather not do that).
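
One way to limit the cost of such a change, sketched here under the
assumption that the existing functions only need the length and individual
characters of the buffer, is to write them once against a small signature
and instantiate them for both strings and bigarrays. All names below
(CHARS, Of_string, Of_bigarray, Ops, index_of) are hypothetical:

```ocaml
(* The minimal read-only interface the buffer operations rely on. *)
module type CHARS = sig
  type t
  val length : t -> int
  val get : t -> int -> char
end

module Of_string : CHARS with type t = string = struct
  type t = string
  let length = String.length
  let get = String.get
end

module Of_bigarray :
  CHARS
    with type t =
      (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t =
struct
  type t =
    (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t
  let length = Bigarray.Array1.dim
  let get = Bigarray.Array1.get
end

(* An example operation, written once for any CHARS implementation. *)
module Ops (C : CHARS) = struct
  (* Index of the first occurrence of [c] in [buf], if any. *)
  let index_of (buf : C.t) (c : char) : int option =
    let n = C.length buf in
    let rec go i =
      if i >= n then None
      else if C.get buf i = c then Some i
      else go (i + 1)
    in
    go 0
end

module String_ops = Ops (Of_string)
module Bigarray_ops = Ops (Of_bigarray)
```

Functions that call into libraries expecting a real string would still
need a copy, so this only helps for code that can be rewritten against
the signature.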

Thanks,
Julien

[please don't Cc: me on replies, I'm subscribed]



* Re: [Caml-list] mmap() and strings
  2004-12-08 20:50   ` Julien Cristau
@ 2004-12-09  1:09     ` Jacques Garrigue
  2004-12-09  1:42       ` Jacques Garrigue
  0 siblings, 1 reply; 6+ messages in thread
From: Jacques Garrigue @ 2004-12-09  1:09 UTC (permalink / raw)
  To: caml-list

From: Julien Cristau <julien.cristau@ens-lyon.fr>
> I wrote:
> > > We thought we could use mmap(2), 
> > > but there seems to be no easy solution 
> > > to mmap() a memory region and treat it as a string in ocaml. 
> > 
> On 08/12/2004-21:18, Basile STARYNKEVITCH wrote:
> > Use Bigarray-s for that. They can mmap files (on Unix & Linux) and are
> > already in Ocaml 3.08
> > 
> Actually, I had a look at bigarrays, and it's one of the solutions I 
> considered. However, I'd like to keep strings as the data structure, because 
> the operations I have to perform take a string as an argument, and not a 
> (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t, 
> and it would be a pain to change all these functions (if I change them, 
> I'll probably bind mmap() and munmap() directly and call them with 
> MAP_ANONYMOUS, but I'd rather not do that).

I don't know exactly your goal, but if it is just that you don't want
to write a single line of C (and all the boilerplate), then you can
always do some magic (note that this is going to be very dark magic!)

The main problem is the way string length is represented.
What you have to do is create a pseudo block header inside a bigarray.
The simplest way is to first create a string of the right size, and
then copy it byte by byte to the bigarray, starting with index (-4)
(for a 32-bit machine) and ending at ((len/4+1)*4) (the last byte of the
last word of the string encodes part of the length), using
String.unsafe_get or String.unsafe_blit (more subtle).
Then you want to get a pointer at offset 4 in the string.
Not too hard either:
    (Obj.magic
       (!(snd (Obj.magic biga : Obj.t * int ref)) + 2)
       : string)

Now this has lots of dependencies on the behavior of the compiler and
how bigarrays are represented, but I believe this should work.
Note however that if you have problems with that, debugging can become
hairy, in this completely unsafe world.

Cheers,

Jacques Garrigue



* Re: [Caml-list] mmap() and strings
  2004-12-09  1:09     ` Jacques Garrigue
@ 2004-12-09  1:42       ` Jacques Garrigue
  2004-12-09 10:32         ` David Baelde
  0 siblings, 1 reply; 6+ messages in thread
From: Jacques Garrigue @ 2004-12-09  1:42 UTC (permalink / raw)
  To: caml-list

From: Jacques Garrigue <garrigue@math.nagoya-u.ac.jp>

> The main problem is the way string length is represented.
> What you have to do is create a pseudo block header inside a bigarray.
> The simplest way is to first create a string of the right size, and
> then copy it byte by byte to the bigarray, starting with index (-4)
> (for a 32-bit machine) and ending at ((len/4+1)*4) (the last byte of the
> last word of the string encodes part of the length), using
> String.unsafe_get or String.unsafe_blit (more subtle).
> Then you want to get a pointer at offset 4 in the string.
> Not too hard either:
>     (Obj.magic
>        (!(snd (Obj.magic biga : Obj.t * int ref)) + 2)
>        : string)

Sorry, the above code is wrong.
The right one is
    let str_of_bigarray biga =
      (Obj.magic (snd (Obj.magic biga : Obj.t * int) + 2) : string)
The "+2" is intended to add 4 to the pointer stored in the second word
of the bigarray, which happens to be the pointer to the raw data.
And for initialization
    let copy_string s biga =
      String.unsafe_blit s (-4) (str_of_bigarray biga) (-4)
         ((String.length s / 4 + 2)*4)
After this, you can use it as a normal string.
(Again, this code comes with no warranty, use at your own risk)
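
The sizes used above follow from the block layout of a string: one header
word, then len/w + 1 data words for a string of length len on a machine
with w-byte words, where the very last byte holds the amount of padding so
that the length can be recovered. The sketch below only spells out that
arithmetic (it does not perform the hack itself); the function names are
hypothetical, and w defaults to the word size of the running machine
rather than the 4 bytes assumed in the thread:

```ocaml
(* Bytes per machine word: 4 on a 32-bit machine, 8 on a 64-bit one. *)
let word_bytes = Sys.word_size / 8

(* Number of data words in the block of a string of length [len]. *)
let string_words ?(w = word_bytes) len = len / w + 1

(* Value stored in the last byte of the block:
   (size of the data in bytes) - 1 - (string length). *)
let last_byte ?(w = word_bytes) len = (string_words ~w len * w) - 1 - len

(* Bytes to copy to reproduce the header word plus all data words. *)
let image_bytes ?(w = word_bytes) len = (string_words ~w len + 1) * w

let () =
  (* A 5-byte string on a 32-bit machine: 2 data words, last byte = 2,
     and 12 bytes to copy, which is (len / 4 + 2) * 4 as in copy_string. *)
  assert (string_words ~w:4 5 = 2);
  assert (last_byte ~w:4 5 = 2);
  assert (image_bytes ~w:4 5 = (5 / 4 + 2) * 4);
  (* The runtime agrees on the word count for the current machine. *)
  assert (Obj.size (Obj.repr (String.make 5 'a')) = string_words 5)
```

Since OCaml 4.06 strings are immutable by default, so on a current
compiler a string aliased onto shared memory this way would also break
the assumption that string contents never change.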

Jacques Garrigue



* Re: [Caml-list] mmap() and strings
  2004-12-09  1:42       ` Jacques Garrigue
@ 2004-12-09 10:32         ` David Baelde
  0 siblings, 0 replies; 6+ messages in thread
From: David Baelde @ 2004-12-09 10:32 UTC (permalink / raw)
  To: caml-list


Hello and thank you for this precise (and impressive) answer,
(I'm working together with Julien)

But actually, we also need to get a bigarray from a string. The writer 
could then get a string as usual, copy it once into a shared bigarray, and 
let the readers read it without any further copies.
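
A sketch of that single-copy scheme, assuming the shared buffer is a char
bigarray (however it was obtained, for example by mapping a file) and that
readers can work on it in place. The names buffer, write_string and
count_char are hypothetical, and count_char merely stands in for whatever
the readers actually compute:

```ocaml
type buffer =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Writer side: the one and only copy, from an ordinary string into the
   start of the shared buffer. *)
let write_string (buf : buffer) (s : string) : unit =
  if String.length s > Bigarray.Array1.dim buf then
    invalid_arg "write_string: buffer too small";
  (* Indices are in range thanks to the check above. *)
  String.iteri (fun i c -> Bigarray.Array1.unsafe_set buf i c) s

(* Reader side: scan the first [len] bytes in place, without building a
   string. Out-of-range accesses raise Invalid_argument. *)
let count_char (buf : buffer) (len : int) (c : char) : int =
  let n = ref 0 in
  for i = 0 to len - 1 do
    if Bigarray.Array1.get buf i = c then incr n
  done;
  !n
```

The readers then need the current length of the data, which could be sent
over the existing pipe or stored in a fixed position of the buffer.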

Anyway, I'm also quite afraid of including such black magic in our 
project ;) We're OK with writing some C, we already did. Maybe it would 
be easier than the Obj.magic stuff.

Avoiding copies is our goal, but it seems impossible to get a string 
whose block Caml doesn't own, which is what we would need for an mmapped 
memory area to back many strings in the writer and reader processes.

--
David

