caml-list - the Caml user's mailing list
From: Philippe Veber <philippe.veber@gmail.com>
To: Alain Frisch <alain@frisch.fr>
Cc: Martin Jambon <martin.jambon@ens-lyon.org>,
	caml users <caml-list@inria.fr>
Subject: Re: [Caml-list] Master-slave architecture behind an ocsigen server.
Date: Thu, 28 Mar 2013 10:39:30 +0100	[thread overview]
Message-ID: <CAOOOohR7UT1SjFHMS5w8vpfxg8JD__XA+K2FQC0yzBf6J3kgaQ@mail.gmail.com> (raw)
In-Reply-To: <51540395.50202@frisch.fr>

Thanks Alain for this detailed description. I did not quite get why you
avoid marshalling for the computation functions: it should be safe given
that the same code runs in the GUI process and in the calculation
sub-processes, right?
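
For instance, I would naively expect something along these lines to work,
as long as both ends run the exact same binary (just a sketch to check my
understanding, using the stdlib Marshal module):

  let send_closure oc (f : int -> int) =
    Marshal.to_channel oc f [Marshal.Closures]

  let receive_closure ic : int -> int =
    (Marshal.from_channel ic : int -> int)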

Compared to your setting, I'm afraid I cannot use the same trick for
running the sub-processes, as the ocsigen server is in charge of the main
process here. Maybe there are hooks that could help?

Also, I like the ability to send partial results; that would be a nice
feature in my case. I'll have to think about how to achieve it ...

Cheers
  ph.



2013/3/28 Alain Frisch <alain@frisch.fr>

> On 03/28/2013 08:37 AM, Philippe Veber wrote:
>
>> Hi Martin,
>> nproc meets exactly my needs: a simple lwt-friendly interface to
>> dispatch function calls on a pool of processes running on the same
>> machine. I have only one concern, which should probably be discussed on
>> the ocsigen list: is it okay to fork the process running the ocsigen
>> server? I think I remember warnings about parent and child processes
>> sharing connections/channels, but it's really not clear to me.
>>
>
> FWIW, LexiFi uses an architecture quite close to this for our application.
>  The main process manages the GUI and dispatches computation tasks to
> external processes.  Some points worth noting:
>
> - Since this is a Windows application, we cannot rely on fork.  Instead,
> we restart the application (Sys.argv.(0)) with a specific command-line
> flag, captured by the library in charge of managing computations.  This is
> done by calling a special function in this library; the function does
> nothing in the main process, while in the sub-processes it enters the
> special mode and never returns.  This gives the main application a chance
> to do some global initialization common to the main and sub-processes (for
> instance, we dynlink external plugins in this initialization phase).
>
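(If I follow, the startup trick would look roughly like the sketch below;
the flag name and the worker-loop argument are made up, and the spawning
side uses the Unix library, just to make the shape concrete.)

  (* Hypothetical flag name; [worker_loop] stands for whatever enters the
     special mode.  The main process falls through and keeps initializing. *)
  let maybe_become_worker ~worker_loop =
    if Array.exists (fun a -> a = "--computation-worker") Sys.argv
    then begin worker_loop (); exit 0 end

  let spawn_worker () =
    Unix.create_process Sys.argv.(0)
      [| Sys.argv.(0); "--computation-worker" |]
      Unix.stdin Unix.stdout Unix.stderr
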
> - Computation functions are registered as global values.  Registration
> returns an opaque handle which can be used to call such a function.  We
> don't rely on marshaling closures.
>
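(To check my understanding of the registration scheme, here is how I
picture it; the names and the string-based transport are my guesses, not
your actual API.)

  (* A handle is just a stable key; both the main process and the workers
     perform the same registrations at startup, so no closure ever needs
     to cross a process boundary. *)
  type ('a, 'b) handle = Handle of string

  let registry : (string, string -> string) Hashtbl.t = Hashtbl.create 16

  let register (key : string) (f : 'a -> 'b) : ('a, 'b) handle =
    Hashtbl.add registry key
      (fun s -> Marshal.to_string (f (Marshal.from_string s 0)) []);
    Handle key

  (* Worker side: look the key up and apply it to the marshalled argument. *)
  let run_task key marshalled_arg = (Hashtbl.find registry key) marshalled_arg
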
> - The GUI process actually spawns a single sub-process (the Scheduler),
> which itself manages more worker sub-sub-processes (with a maximal number
> of workers).  Currently, we don't do very clever scheduling based on task
> priorities, but this could easily be added.
>
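(The Scheduler's bookkeeping, as I imagine it: a FIFO of pending tasks and
at most a fixed number of busy workers.  Everything below is hypothetical,
with [start] standing for whatever hands a task to an idle worker process.)

  type task = string * string   (* registered key, marshalled argument *)

  let max_workers = 4
  let pending : task Queue.t = Queue.create ()
  let busy = ref 0

  let rec dispatch ~start =
    if !busy < max_workers && not (Queue.is_empty pending) then begin
      incr busy;
      start (Queue.pop pending);
      dispatch ~start
    end

  let submit ~start t = Queue.push t pending; dispatch ~start
  let task_finished ~start = decr busy; dispatch ~start
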
> - An external computation can spawn sub-computations (by applying a
> parallel "map" to a list) either synchronously (direct style) or
> asynchronously (by providing a continuation function, which will be applied
> to the list of results, maybe in a different process).  In both cases,
>  this is done by sending those tasks to the Scheduler.  The Scheduler
> dispatches computation tasks to available workers.  In the synchronous
> parallel map, the caller runs an inner event loop to communicate with the
> Scheduler (and it only accepts sub-tasks created by itself or one of its
> descendants).
>
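(The two flavours of parallel map would then have interfaces along these
lines; the signatures are pure guesswork on my part, just to fix ideas.)

  module type PARALLEL = sig
    type ('a, 'b) handle

    (* Synchronous: the caller blocks in an inner event loop until all the
       results have come back from the Scheduler. *)
    val parallel_map : ('a, 'b) handle -> 'a list -> 'b list

    (* Asynchronous: returns immediately; the continuation is applied to
       the result list later, possibly in a different process. *)
    val parallel_map_async :
      ('a, 'b) handle -> 'a list -> continuation:('b list -> unit) -> unit
  end
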
> - Top-level external computations can be stopped by the main process (e.g.
> on user request).  Concretely, this kills all workers currently working on
> that task or one of its sub-tasks.
>
> - In addition to sending back the final results, computations can report
> progress to their caller and send intermediate results.  This is useful to
> show a progress bar/status and partial results in the GUI before the end of
> the entire computation.
>
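(This is the part I would most like to reproduce.  I imagine the
computation-side interface as something like the following, which is again
only a guess.)

  module type REPORTING = sig
    (* Called from inside a running computation. *)
    val report_progress : float -> unit   (* fraction of work done, 0. to 1. *)
    val report_partial  : string -> unit  (* marshalled intermediate result *)
  end
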
> - Communication between processes is done by exchanging marshaled
> "variants" (a tagged representation of OCaml values, generated
> automatically using our runtime types).  Since we can attach special
> variantizers/devariantizers to specific types, this gives us a chance to
> customize how some values are exchanged between processes (e.g.
> values relying on internal hash-consing are treated specially to recreate
> the maximal sharing in the sub-process).
>
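(The hash-consing point makes sense: marshalling loses sharing across
messages, so the receiving side has to re-intern values.  A generic
illustration of the idea, unrelated to your runtime-types machinery:)

  (* Re-intern values after unmarshalling so that structurally equal values
     share a single physical representation again. *)
  let intern (table : ('a, 'a) Hashtbl.t) (v : 'a) : 'a =
    match Hashtbl.find_opt table v with
    | Some shared -> shared
    | None -> Hashtbl.add table v v; v

  let symbols : (string, string) Hashtbl.t = Hashtbl.create 1024
  let intern_symbol s = intern symbols s
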
> - Concretely, the communication between processes is done through queues
> of messages implemented with shared memory.  (This component was developed
> by Fabrice Le Fessant and OCamlPro.)  Large computation arguments or
> results (above a certain size) are stored on the file system, to avoid
> having to keep them in RAM for too long (if all workers are busy, a
> computation might wait for some time before being started).
>
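(The size threshold could be as simple as the sketch below: small
marshalled values travel through the queue, large ones are written to a
temporary file and only the path is sent.  Threshold and constructor names
are invented.)

  type payload =
    | Inline of string        (* marshalled value, sent through the queue *)
    | On_disk of string       (* path of a file holding the marshalled value *)

  let size_threshold = 1_000_000   (* bytes; arbitrary *)

  let make_payload v =
    let s = Marshal.to_string v [] in
    if String.length s <= size_threshold then Inline s
    else begin
      let path = Filename.temp_file "computation" ".bin" in
      let oc = open_out_bin path in
      output_string oc s;
      close_out oc;
      On_disk path
    end
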
> - The API supports easily distributing computation tasks to several
> machines.  We have done some experiments with using our application's
> database to dispatch computations, but we don't use it in production.
>
> Alain
>

Thread overview: 19+ messages
2013-03-26 14:29 Philippe Veber
2013-03-26 19:02 ` Martin Jambon
2013-03-26 21:01 ` Martin Jambon
2013-03-27  1:11   ` Francois Berenger
2013-03-28  7:37   ` Philippe Veber
2013-03-28  8:47     ` Alain Frisch
2013-03-28  9:39       ` Philippe Veber [this message]
2013-03-28 10:54         ` Alain Frisch
2013-03-28 11:02       ` Anil Madhavapeddy
2013-03-28 11:23         ` Alain Frisch
2013-03-28 12:18         ` AW: " Gerd Stolpmann
2013-03-27 10:00 ` Sébastien Dailly
2013-03-28  8:34   ` Philippe Veber
2013-03-27 16:07 ` Gerd Stolpmann
2013-03-28  9:18   ` Philippe Veber
2013-03-28 12:29     ` AW: " Gerd Stolpmann
2013-03-27 22:42 ` Denis Berthod
2013-03-27 22:49   ` AW: " Gerd Stolpmann
     [not found]     ` <4A6314AA-0C59-4E35-9EA4-F465C0A5AF3A@gmail.com>
2013-03-28  9:23       ` Philippe Veber
