caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Shared data space: How to manage Ancient keys?
@ 2010-03-18 18:32 Hugo Ferreira
  2010-03-19 10:12 ` [Caml-list] " Goswin von Brederlow
  0 siblings, 1 reply; 3+ messages in thread
From: Hugo Ferreira @ 2010-03-18 18:32 UTC (permalink / raw)
  To: caml-list

Hello,

I am trying to implement a parallel algorithm and have opted to use
the Ancient module to share data across processes because it dynamically
reallocates memory when mapping data (I don't know the size of the
objects before hand). However I am having trouble managing its use of
keys.

Maybe someone here has a solution to my problem.

To make things clear - a little background. I have a set of worker
processes that cooperate to analyse a set of sequences. I assume
all sequences are in a global blackboard-type /LINDA-like data space.
Each worker removes a single sequence. It then scans all other existing
sequences and selects another one. It then a) removes the selected
sequence b) merges the 2 sequences at hand and generates 0, 1 or more
new sequences. The process repeats itself until no more sequences exist
in the global data space.

In the above scenario all sequences would be assigned a unique id which
increases sequentially. If I used a common data structure such as a
binary tree I could simply increment a counter and use that as a key
for the insertion/deletion of sequences into/from the binary tree.
However I cannot do this here because I can only share linear memory
via mapped files in Ocaml.

So far I have made a small experiment that allows me to insert data
into a shared memory space. Ancient allows one to copy various objects
into the memory space via an integer index. From the code it seems that
the indexing table is an array like structure. So when I add and object
I must identify an index (slot) that is empty (unused) and reuse that
otherwise memory will grow unchecked.

My question is how can I generate and share keys for Ancient's use
without exceeding memory usage?

TIA,
Hugo F.



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Caml-list] Shared data space: How to manage Ancient keys?
  2010-03-18 18:32 Shared data space: How to manage Ancient keys? Hugo Ferreira
@ 2010-03-19 10:12 ` Goswin von Brederlow
  2010-03-19 11:07   ` Hugo Ferreira
  0 siblings, 1 reply; 3+ messages in thread
From: Goswin von Brederlow @ 2010-03-19 10:12 UTC (permalink / raw)
  To: Hugo Ferreira; +Cc: caml-list

Hugo Ferreira <hmf@inescporto.pt> writes:

> Hello,
>
> I am trying to implement a parallel algorithm and have opted to use
> the Ancient module to share data across processes because it dynamically
> reallocates memory when mapping data (I don't know the size of the
> objects before hand). However I am having trouble managing its use of
> keys.
>
> Maybe someone here has a solution to my problem.
>
> To make things clear - a little background. I have a set of worker
> processes that cooperate to analyse a set of sequences. I assume
> all sequences are in a global blackboard-type /LINDA-like data space.
> Each worker removes a single sequence. It then scans all other existing
> sequences and selects another one. It then a) removes the selected
> sequence b) merges the 2 sequences at hand and generates 0, 1 or more
> new sequences. The process repeats itself until no more sequences exist
> in the global data space.
>
> In the above scenario all sequences would be assigned a unique id which
> increases sequentially. If I used a common data structure such as a
> binary tree I could simply increment a counter and use that as a key
> for the insertion/deletion of sequences into/from the binary tree.
> However I cannot do this here because I can only share linear memory
> via mapped files in Ocaml.
>
> So far I have made a small experiment that allows me to insert data
> into a shared memory space. Ancient allows one to copy various objects
> into the memory space via an integer index. From the code it seems that
> the indexing table is an array like structure. So when I add and object
> I must identify an index (slot) that is empty (unused) and reuse that
> otherwise memory will grow unchecked.
>
> My question is how can I generate and share keys for Ancient's use
> without exceeding memory usage?
>
> TIA,
> Hugo F.

How about a simple solution. Only one thread handles indexes.

You said you have worker processes. Add one controling process on top of
it that spawns all the worker processes and keeps a communication line
open with them, e.g. a pipe. Each worker would then tell the controller
to remove an index or request an unused index to store a result. Its
probably a good idea to request unused indexes in bunches, say 100 at a
time to cut down on latenzy.

MfG
        Goswin


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Caml-list] Shared data space: How to manage Ancient keys?
  2010-03-19 10:12 ` [Caml-list] " Goswin von Brederlow
@ 2010-03-19 11:07   ` Hugo Ferreira
  0 siblings, 0 replies; 3+ messages in thread
From: Hugo Ferreira @ 2010-03-19 11:07 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

Goswin von Brederlow wrote:
> Hugo Ferreira <hmf@inescporto.pt> writes:
>
snip...
> 
> How about a simple solution. Only one thread handles indexes.
> 
> You said you have worker processes. Add one controling process on top of
> it that spawns all the worker processes and keeps a communication line
> open with them, e.g. a pipe. Each worker would then tell the controller
> to remove an index or request an unused index to store a result. Its
> probably a good idea to request unused indexes in bunches, say 100 at a
> time to cut down on latenzy.
>

To be honest I have already considered (but not discarded) this option
(which may provide other advantages). This however has several issues:

1) I still need to deal with problem of managing indexes (identifying
    and reusing slots).
2) Each worker has to scan all sequences that have been generated but
    not consumed. So I will also need to generate and send a list of
    available indexes. This may be a very long list.
3) Interaction between the controller and workers need to exchange
    several types of messages:
    a) Request for a sequence to process
    b) Request for a list of sequences to scan
    c) Request an index to use
    d) Acknowledge a used index (maybe)
    e) Request an index to free
    f) Acknowledge a freed index (maybe)
    This implies a complicated protocol with state (worker aware).

Advantages:
1) The workers scan the sequences asynchronously. Scans may have
    various processes access data in read only. Insertion/deletion
    requires locking for read/writes. Use of a controller will
    facilitate this.
2) The controller may avoid having the same sequences scanned
    by two or more workers.

Needless to say these problems are more than I bargained for. 8-(

Thanks,
Hugo.

> MfG
>         Goswin
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-03-19 11:07 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-18 18:32 Shared data space: How to manage Ancient keys? Hugo Ferreira
2010-03-19 10:12 ` [Caml-list] " Goswin von Brederlow
2010-03-19 11:07   ` Hugo Ferreira

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).