From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <philippe.veber@gmail.com>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by sympa.inria.fr (Postfix) with ESMTPS id 3740C7EE7A
	for <caml-list@sympa.inria.fr>; Thu, 28 Mar 2013 10:39:53 +0100 (CET)
Received: from mail-ie0-f174.google.com ([209.85.223.174])
  by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 28 Mar 2013 10:39:52 +0100
Received: by mail-ie0-f174.google.com with SMTP id aq17so8421912iec.33
        for <caml-list@inria.fr>; Thu, 28 Mar 2013 02:39:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=x-received:mime-version:in-reply-to:references:from:date:message-id
         :subject:to:cc:content-type;
        bh=YuXn1J3CQEZVXjxFNNiteYRJAr97FjV9SaRxw/+Te7Q=;
        b=Yg10NkCHvTXCOO9YeHpJrA63wULgjDS7ZCUGvJX+Jw3PtbqP55WtwN1J+xedk5zqSz
         AEIZfQC4EaxKQ+I28ToE9xAG8yy4I4dYLNu3jHWulXdE1AhezaWIUsFkugg51kY9zXEr
         cfrDQVy9pvYS7DW15BsyHOPChVPhmuWPFj43g9Pzf3xbJrJlA2XMddBCmycK1vE291Bj
         sa4YqAsyzBTixd3ZDpe8aN/Iq6YK8OHUnm1yUxeLfToNNqvDLveH5hW0wGcUe+lY28MF
         oDnEJqu/iYYEFR7CHbSSw9+DHHXhA+CFU6sEAnlqvf+cKYfdmrzt+Z6VGmOSSxadYTTv
         W56Q==
X-Received: by 10.50.216.164 with SMTP id or4mr2561934igc.38.1364463590621;
 Thu, 28 Mar 2013 02:39:50 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.64.136.8 with HTTP; Thu, 28 Mar 2013 02:39:30 -0700 (PDT)
In-Reply-To: <51540395.50202@frisch.fr>
References: <CAOOOohSzgcZxLOu9qUX1Box1eKyK-DEX7zrEM2GXzjs372jLpQ@mail.gmail.com>
 <51520CAE.6020009@ens-lyon.org> <CAOOOohTMNd6pW=3Gp8wBc8nggLUCEd9OAEFFV91jz8wEUJMMXg@mail.gmail.com>
 <51540395.50202@frisch.fr>
From: Philippe Veber <philippe.veber@gmail.com>
Date: Thu, 28 Mar 2013 10:39:30 +0100
Message-ID: <CAOOOohR7UT1SjFHMS5w8vpfxg8JD__XA+K2FQC0yzBf6J3kgaQ@mail.gmail.com>
To: Alain Frisch <alain@frisch.fr>
Cc: Martin Jambon <martin.jambon@ens-lyon.org>, caml users <caml-list@inria.fr>
Content-Type: multipart/alternative; boundary=14dae93406dfa0c79604d8f8efb3
X-Validation-by: philippe.veber@gmail.com
Subject: Re: [Caml-list] Master-slave architecture behind an ocsigen server.


--14dae93406dfa0c79604d8f8efb3
Content-Type: text/plain; charset=ISO-8859-1

Thanks Alain for this detailed description. I did not understand why you
avoid marshalling closures for the computation functions: this should be
safe, given that the same code runs in the GUI process and in the
calculation sub-processes, right?
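
To make the question concrete, here is a minimal sketch of what I had in
mind, using only the standard Marshal module. The names add_offset,
rejected_without_flag and roundtrip are made up for illustration. As far
as I understand, a closure written with the Closures flag can only be read
back by the very same program binary, which is the assumption I am relying
on:

```ocaml
(* Illustrative only: build a closure that captures [offset]. *)
let add_offset (offset : int) : int -> int = fun x -> x + offset

(* Marshalling a functional value without the Closures flag is rejected. *)
let rejected_without_flag (f : int -> int) : bool =
  try
    ignore (Marshal.to_string f []);
    false
  with Invalid_argument _ -> true

(* With the flag, the captured environment is serialized together with a
   reference to the code; reading it back requires the same program. *)
let roundtrip (f : int -> int) : int -> int =
  let payload = Marshal.to_string f [Marshal.Closures] in
  (Marshal.from_string payload 0 : int -> int)

let () =
  let f = add_offset 10 in
  assert (rejected_without_flag f);
  let g = roundtrip f in
  Printf.printf "g 32 = %d\n" (g 32)
```

The round trip above stays inside one process; between processes it would
presumably only work if every worker runs the exact same executable.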

Compared to your setting, I'm afraid I cannot use the same trick for
running the sub-processes, as the ocsigen server is in charge here. Maybe
there are some hooks that can help?

Also, I like the ability to send partial results; this would be a nice
feature in my case. I'll have to think about how to achieve this ...
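
One way I could imagine doing it, as a rough sketch: the worker writes a
stream of tagged messages and the caller reads them until the final one
arrives. The message type and the function names (run_worker, read_all,
demo) are my own invention, and a temporary file stands in for the pipe or
socket a real worker would use:

```ocaml
(* Hypothetical message type exchanged between a worker and its caller. *)
type 'a message =
  | Progress of int  (* percentage completed *)
  | Partial of 'a    (* an intermediate result *)
  | Final of 'a      (* the final result; ends the stream *)

(* Worker side: sum a list, reporting after each element. *)
let run_worker (oc : out_channel) (xs : int list) : unit =
  let send (m : int message) =
    Marshal.to_channel oc m [];
    flush oc
  in
  let n = List.length xs in
  let acc = ref 0 in
  List.iteri
    (fun i x ->
       acc := !acc + x;
       send (Progress (100 * (i + 1) / n));
       send (Partial !acc))
    xs;
  send (Final !acc)

(* Caller side: hand every update to [on_update] until Final arrives. *)
let rec read_all (ic : in_channel) ~(on_update : int message -> unit) : int =
  match (Marshal.from_channel ic : int message) with
  | Final r -> r
  | m ->
    on_update m;
    read_all ic ~on_update

(* Run both sides in one process, with a file playing the role of the pipe.
   Returns the final result and the number of updates seen before it. *)
let demo () : int * int =
  let path = Filename.temp_file "partial" ".bin" in
  let oc = open_out_bin path in
  run_worker oc [1; 2; 3; 4];
  close_out oc;
  let ic = open_in_bin path in
  let updates = ref 0 in
  let result = read_all ic ~on_update:(fun _ -> incr updates) in
  close_in ic;
  Sys.remove path;
  (result, !updates)
```

In the ocsigen setting the reading loop would presumably have to live in
an Lwt thread rather than block like this one does.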

Cheers
  ph.


2013/3/28 Alain Frisch <alain@frisch.fr>

> On 03/28/2013 08:37 AM, Philippe Veber wrote:
>
>> Hi Martin,
>> nproc meets exactly my needs: a simple lwt-friendly interface to
>> dispatch function calls on a pool of processes that run on the same
>> machine. I have only one concern, that should probably be discussed on
>> the ocsigen list, that is I wonder if it is okay to fork the process
>> running the ocsigen server. I think I remember warnings on having parent
>> and children processes sharing connections/channels but it's really not
>> clear to me.
>>
>
> FWIW, LexiFi uses an architecture quite close to this for our application.
> The main process manages the GUI and dispatches computation tasks to
> external processes.  Some points to be noted:
>
> - Since this is a Windows application, we cannot rely on fork.  Instead,
> we restart the application (Sys.argv.(0)) with a specific command-line
> flag, captured by the library in charge of managing computations.  This is
> done by calling a special function in this library: the function does
> nothing in the main process; in the sub-processes, it starts the special
> mode and never returns.  This gives a chance to the main application to do
> some global initialization common to the main process and the
> sub-processes (for instance, we dynlink external plugins in this
> initialization phase).
>
> - Computation functions are registered as global values.  Registration
> returns an opaque handle which can be used to call such a function.  We
> don't rely on marshaling closures.
>
> - The GUI process actually spawns a single sub-process (the Scheduler),
> which itself manages more worker sub-sub-processes (with a maximal number
> of workers).  Currently, we don't do very clever scheduling based on task
> priorities, but this could easily be added.
>
> - An external computation can spawn sub-computations (by applying a
> parallel "map" to a list) either synchronously (direct style) or
> asynchronously (by providing a continuation function, which will be applied
> to the list of results, maybe in a different process).  In both cases,
>  this is done by sending those tasks to the Scheduler.  The Scheduler
> dispatches computation tasks to available workers.  In the synchronous
> parallel map, the caller runs an inner event loop to communicate with the
> Scheduler (and it only accepts sub-tasks created by itself or one of its
> descendants).
>
> - Top-level external computations can be stopped by the main process (e.g.
> on user request).  Concretely, this kills all workers currently working on
> that task or one of its sub-tasks.
>
> - In addition to sending back the final results, computations can report
> progress and intermediate results to their caller.  This is useful to
> show a progress bar/status and partial results in the GUI before the end of
> the entire computation.
>
> - Communication between processes is done by exchanging marshaled
> "variants" (a tagged representation of OCaml values, generated
> automatically using our runtime types).  Since we can attach special
> variantizers/devariantizers to specific types, this gives a chance to
> customize how some values have to be exchanged between processes (e.g.
> values relying on internal hash-consing are treated specially to recreate
> the maximal sharing in the sub-process).
>
> - Concretely, the communication between processes is done through queues
> of messages implemented with shared memory.  (This component was developed
> by Fabrice Le Fessant and OCamlPro.)   Large computation arguments or
> results (above a certain size) are stored on the file system, to avoid
> having to keep them in RAM for too long (if all workers are busy, the
> computation might wait for some time before being started).
>
> - The API supports easily distributing computation tasks to several
> machines.  We have done some experiments with using our application's
> database to dispatch computations, but we don't use it in production.
>
>
>
>
>
> Alain
>
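
If I read the points about registration and restarting correctly, a
stripped-down version could look like the sketch below. All names
(register, call, serve, init_worker_mode, the --worker flag) are
hypothetical and not LexiFi's actual API. The request is plain data, so no
closure is marshalled:

```ocaml
(* Hypothetical sketch; none of these names come from LexiFi's library. *)
type handle = int  (* would be abstract in an interface file *)

let registry : (handle, string -> string) Hashtbl.t = Hashtbl.create 16
let next_handle = ref 0

(* Register a computation function and get a handle for it.  Handles match
   across processes only because every process runs the same registrations
   in the same order at start-up. *)
let register (f : string -> string) : handle =
  let h = !next_handle in
  incr next_handle;
  Hashtbl.replace registry h f;
  h

let call (h : handle) (arg : string) : string =
  match Hashtbl.find_opt registry h with
  | Some f -> f arg
  | None -> invalid_arg "call: unknown handle"

let upper = register String.uppercase_ascii
let rev_words =
  register (fun s -> String.concat " " (List.rev (String.split_on_char ' ' s)))

(* A request carries a handle and an argument: plain data, no closure. *)
type request = { fn : handle; arg : string }

(* Decode a request, run it, encode the reply. *)
let serve (payload : string) : string =
  let req : request = Marshal.from_string payload 0 in
  Marshal.to_string (call req.fn req.arg) []

let worker_flag = "--worker"  (* hypothetical command-line flag *)

(* Does nothing in the main process; in a worker it serves requests read
   from stdin until end of input and then exits, so it never returns. *)
let init_worker_mode () : unit =
  if Array.length Sys.argv > 1 && Sys.argv.(1) = worker_flag then begin
    set_binary_mode_in stdin true;
    set_binary_mode_out stdout true;
    (try
       while true do
         let req : request = Marshal.from_channel stdin in
         Marshal.to_channel stdout (call req.fn req.arg) [];
         flush stdout
       done
     with End_of_file -> ());
    exit 0
  end

let () =
  init_worker_mode ();
  (* Main process: build a request and serve it locally as a smoke test. *)
  let payload = Marshal.to_string { fn = rev_words; arg = "a b c" } [] in
  let reply : string = Marshal.from_string (serve payload) 0 in
  print_endline reply
```

The part I still cannot transpose is the restart itself: here the main
process would have to spawn its own executable with the flag, and I do not
know yet whether ocsigen leaves room for that.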

--14dae93406dfa0c79604d8f8efb3--