caml-list - the Caml user's mailing list
From: Alain Frisch <alain@frisch.fr>
To: Philippe Veber <philippe.veber@gmail.com>,
	 Martin Jambon <martin.jambon@ens-lyon.org>
Cc: caml users <caml-list@inria.fr>
Subject: Re: [Caml-list] Master-slave architecture behind an ocsigen server.
Date: Thu, 28 Mar 2013 09:47:17 +0100
Message-ID: <51540395.50202@frisch.fr>
In-Reply-To: <CAOOOohTMNd6pW=3Gp8wBc8nggLUCEd9OAEFFV91jz8wEUJMMXg@mail.gmail.com>

On 03/28/2013 08:37 AM, Philippe Veber wrote:
> Hi Martin,
> nproc meets exactly my needs: a simple lwt-friendly interface to
> dispatch function calls on a pool of processes that run on the same
> machine. I have only one concern, that should probably be discussed on
> the ocsigen list, that is I wonder if it is okay to fork the process
> running the ocsigen server. I think I remember warnings on having parent
> and children processes sharing connections/channels but it's really not
> clear to me.

FWIW, LexiFi uses an architecture quite close to this for our 
application.  The main process manages the GUI and dispatches 
computation tasks to external processes.  Some points to be noted:

- Since this is a Windows application, we cannot rely on fork.  Instead, 
we restart the application (Sys.argv.(0)) with a specific command-line
flag, which is captured by the library in charge of managing
computations.  The capture is done by calling a special function in this
library: the function does nothing in the main process, while in the
sub-processes it starts the special mode and never returns.  This gives
the main application a chance to do some global initialization common to
the main process and the sub-processes (for instance, we dynlink
external plugins in this initialization phase).

- Computation functions are registered as global values.  Registration 
returns an opaque handle which can be used to call such a function.  We 
don't rely on marshaling closures.

- The GUI process actually spawns a single sub-process (the Scheduler), 
which itself manages more worker sub-sub-processes (with a maximal 
number of workers).  Currently, we don't do very clever scheduling based 
on task priorities, but this could easily be added.

- An external computation can spawn sub-computations (by applying a 
parallel "map" to a list) either synchronously (direct style) or 
asynchronously (by providing a continuation function, which will be 
applied to the list of results, maybe in a different process).  In both 
cases, this is done by sending those tasks to the Scheduler.  The
Scheduler dispatches computation tasks to available workers.  In the 
synchronous parallel map, the caller runs an inner event loop to 
communicate with the Scheduler (and it only accepts sub-tasks created by 
itself or one of its descendants).

- Top-level external computations can be stopped by the main process 
(e.g. on user request).  Concretely, this kills all workers currently 
working on that task or one of its sub-tasks.

- In addition to sending back the final results, computations can report
progress and intermediate results to their caller.  This is useful
to show a progress bar/status and partial results in the GUI before the
end of the entire computation.

- Communication between processes is done by exchanging marshaled 
"variants" (a tagged representation of OCaml values, generated 
automatically using our runtime types).  Since we can attach special 
variantizers/devariantizers to specific types, this gives a chance to 
customize how some values have to be exchanged between processes (e.g. 
values relying on internal hash-consing are treated specially to 
recreate the maximal sharing in the sub-process).

- Concretely, the communication between processes is done through queues 
of messages implemented with shared memory.  (This component was 
developed by Fabrice Le Fessant and OCamlPro.)   Large computation 
arguments or results (above a certain size) are stored on the file 
system, to avoid having to keep them in RAM for too long (if all workers
are busy, the computation might wait for some time before being started).

- The API makes it easy to distribute computation tasks to several
machines.  We have done some experiments with using our application's
database to dispatch computations, but we don't use this in production.
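
As a rough illustration of the first two points above (restarting the
executable in a special mode, and registering computations behind opaque
handles), a minimal sketch could look like the following.  It is not
LexiFi's actual code: the module name Compute, the functions register,
call_local and init, and the "--worker" flag are all hypothetical, and
plain Marshal on raw values stands in for the variant-based encoding
described above.

```ocaml
(* Hypothetical sketch; all names and the "--worker" flag are assumptions. *)
module Compute : sig
  type ('a, 'b) handle
  (* Register a computation under a unique global name. *)
  val register : string -> ('a -> 'b) -> ('a, 'b) handle
  (* Run a registered computation in-process, through the same
     serialize / look up by name / deserialize path a worker would use. *)
  val call_local : ('a, 'b) handle -> 'a -> 'b
  (* No-op in the main process; in a process started with "--worker",
     serves requests from stdin and never returns. *)
  val init : unit -> unit
end = struct
  (* The handle is only a name: no closure is ever marshaled. *)
  type ('a, 'b) handle = string

  let table : (string, string -> string) Hashtbl.t = Hashtbl.create 16

  let register name f =
    if Hashtbl.mem table name then invalid_arg ("duplicate: " ^ name);
    (* Wrap f so that it consumes and produces serialized values. *)
    let wrapped arg = Marshal.to_string (f (Marshal.from_string arg 0)) [] in
    Hashtbl.add table name wrapped;
    name

  let run_by_name name arg =
    match Hashtbl.find_opt table name with
    | Some f -> f arg
    | None -> failwith ("unknown computation: " ^ name)

  let call_local h x =
    Marshal.from_string (run_by_name h (Marshal.to_string x [])) 0

  let worker_flag = "--worker"

  let init () =
    if Array.exists (String.equal worker_flag) Sys.argv then begin
      set_binary_mode_in stdin true;
      set_binary_mode_out stdout true;
      (try
         while true do
           (* Each request is a (name, serialized argument) pair. *)
           let (name, arg) : string * string = input_value stdin in
           output_value stdout (run_by_name name arg);
           flush stdout
         done
       with End_of_file -> ());
      exit 0
    end
end

(* Registration happens at module initialization, so the main process
   and every restarted copy of the executable build the same table. *)
let square = Compute.register "square" (fun (x : int) -> x * x)

(* Global initialization shared by all processes would go here, before
   the call that diverts sub-processes into worker mode. *)
let () = Compute.init ()

let () = Printf.printf "%d\n" (Compute.call_local square 7)
```

In this sketch the parent would start a worker by launching Sys.argv.(0)
with the extra flag (for instance with Unix.create_process) and then
write (name, argument) pairs to the child's stdin.  Because only the
name and the serialized argument cross the process boundary, this works
without fork and without marshaling closures.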

Alain
