caml-list - the Caml user's mailing list
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Anil Madhavapeddy <anil@recoil.org>
Cc: Alain Frisch <alain@frisch.fr>,
	"cl-mirage@lists.cam.ac.uk List" <cl-mirage@lists.cam.ac.uk>,
	Philippe Veber <philippe.veber@gmail.com>,
	Martin Jambon <martin.jambon@ens-lyon.org>,
	caml users <caml-list@inria.fr>
Subject: AW: [Caml-list] Master-slave architecture behind an ocsigen server.
Date: Thu, 28 Mar 2013 13:18:59 +0100
Message-ID: <1364473139.14693.2@samsung>
In-Reply-To: <A9720A40-D685-493B-99B5-2E8271A7617C@recoil.org> (from anil@recoil.org on Thu Mar 28 12:02:46 2013)

On 28.03.2013 12:02:46, Anil Madhavapeddy wrote:
> On 28 Mar 2013, at 08:47, Alain Frisch <alain@frisch.fr> wrote:
> 
> > On 03/28/2013 08:37 AM, Philippe Veber wrote:
> >> Hi Martin,
> >> nproc meets exactly my needs: a simple lwt-friendly interface to
> >> dispatch function calls on a pool of processes that run on the same
> >> machine. I have only one concern, that should probably be  
> discussed on
> >> the ocsigen list, that is I wonder if it is okay to fork the  
> process
> >> running the ocsigen server. I think I remember warnings on having  
> parent
> >> and children processes sharing connections/channels but it's  
> really not
> >> clear to me.
> >
> > FWIW, LexiFi uses an architecture quite close to this for our
> > application.  The main process manages the GUI and dispatches
> > computation tasks to external processes.  Some points to be noted:
> >
> > - Since this is a Windows application, we cannot rely on fork.
> > Instead, we restart the application (Sys.argv.(0)) with a specific
> > command-line flag, captured by the library in charge of managing
> > computations.  This is done by calling a special function in this
> > library; the function does nothing in the main process, while in the
> > sub-processes it starts the special mode and never returns.  This
> > gives the main application a chance to do some global initialization
> > common to the main and sub-processes (for instance, we dynlink
> > external plugins in this initialization phase).
> >
> > - Computation functions are registered as global values.
> > Registration returns an opaque handle which can be used to call such
> > a function.  We don't rely on marshaling closures.
> >
> > - The GUI process actually spawns a single sub-process (the
> > Scheduler), which itself manages more worker sub-sub-processes (with
> > a maximal number of workers).  Currently, we don't do very clever
> > scheduling based on task priorities, but this could easily be added.
> >
> > - An external computation can spawn sub-computations (by applying a
> > parallel "map" to a list) either synchronously (direct style) or
> > asynchronously (by providing a continuation function, which will be
> > applied to the list of results, maybe in a different process).  In
> > both cases, this is done by sending those tasks to the Scheduler.
> > The Scheduler dispatches computation tasks to available workers.  In
> > the synchronous parallel map, the caller runs an inner event loop to
> > communicate with the Scheduler (and it only accepts sub-tasks created
> > by itself or one of its descendants).
> >
> > - Top-level external computations can be stopped by the main
> > process (e.g. on user request).  Concretely, this kills all workers
> > currently working on that task or one of its sub-tasks.
> >
> > - In addition to sending back the final results, computations can
> > report progress and intermediate results to their caller.  This is
> > useful to show a progress bar/status and partial results in the GUI
> > before the end of the entire computation.
> >
> > - Communication between processes is done by exchanging marshaled
> > "variants" (a tagged representation of OCaml values, generated
> > automatically using our runtime types).  Since we can attach special
> > variantizers/devariantizers to specific types, this gives a chance to
> > customize how some values have to be exchanged between processes
> > (e.g. values relying on internal hash-consing are treated specially
> > to recreate the maximal sharing in the sub-process).
> >
> > - Concretely, the communication between processes is done through
> > queues of messages implemented with shared memory.  (This component
> > was developed by Fabrice Le Fessant and OCamlPro.)  Large
> > computation arguments or results (above a certain size) are stored on
> > the file system, to avoid having to keep them in RAM for too long (if
> > all workers are busy, the computation might wait for some time before
> > being started).
> 
> Are all of the messages through these queues persistent, or just the  
> larger ones that are too big to fit in the shared memory segment, and  
> are they always point-to-point streams?
> 
> We've got a similar need in Xen/Mirage for shared memory  
> communication and queues, and have been breaking them out into  
> standalone libs such as:
> 
> https://github.com/djs55/shared-memory-ring
> 
> ...which is ABI-compatible with the existing Xen shared memory  
> interfaces, and also an OCaml version of the transport-agnostic API  
> sketched out in:
> http://anil.recoil.org/papers/2012-resolve-fable.pdf

Interesting that there are now other shared memory implementations for  
OCaml. Note that there are a number of them in Ocamlnet, with some  
specialities not yet mentioned. There is the Netcamlbox library  
providing message boxes of limited size for exchanging OCaml values  
directly. That means the value is copied to the shared memory block by  
the sender, and the receiver can pick it up there without copying it  
again. Sender and receiver can map the memory at different addresses  
(the copy procedure invoked by the sender takes care of possible  
offsets, so that Netcamlbox also allows communication between
processes that don't have a fork relation). There is no need for  
marshalling the value.

http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/doc/html-main/Netcamlbox.html
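
To give a rough idea of the usage pattern, here is a minimal sketch. It
is not from the original mail; the function names (create_camlbox,
camlbox_sender, camlbox_send, camlbox_wait, camlbox_get, camlbox_delete)
follow the Netcamlbox interface linked above as far as I recall it, so
please check the manual for the exact signatures and for the naming
rules of the box address:

  (* Sketch only.  A message type both sides agree on. *)
  type msg = { id : int; payload : string }

  (* Receiver: create a box with 16 slots of at most 4096 bytes each. *)
  let receiver () =
    let box =
      (Netcamlbox.create_camlbox "demo_box" 16 4096 : msg Netcamlbox.camlbox) in
    let rec loop () =
      let new_slots = Netcamlbox.camlbox_wait box in   (* blocks for messages *)
      List.iter
        (fun k ->
           (* The value is read directly from shared memory -- no second copy. *)
           let m = Netcamlbox.camlbox_get box k in
           Printf.printf "got #%d: %s\n%!" m.id m.payload;
           Netcamlbox.camlbox_delete box k)            (* free the slot again *)
        new_slots;
      loop ()
    in
    loop ()

  (* Sender, possibly a process without any fork relation to the receiver:
     the value is copied exactly once, into the shared memory block. *)
  let sender () =
    let sbox =
      (Netcamlbox.camlbox_sender "demo_box" : msg Netcamlbox.camlbox_sender) in
    Netcamlbox.camlbox_send sbox { id = 1; payload = "hello" }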

Going even beyond that, Netmulticore implements an "ancient" heap in  
shared memory (like Richard's Ancient lib, but with more options). This  
heap is organized like OCaml's major heap, and there is even a GC  
implementation for it. There are a number of data structures (arrays,  
hash tables, queues, buffers) which are aware of residing in shared  
memory. For synchronization there are mutexes, semaphores and condition  
variables. As long as the values to manipulate are already in shared
memory, programming with Netmulticore feels a lot like programming with
multi-threading. In practice, however, you need to frequently copy
values in and out, so it is not exactly as convenient. For  
Netmulticore, all processes must map the shared memory to the same  
address (easy with "fork").

http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/doc/html-main/Intro.html#netmulticore
http://projects.camlcity.org/projects/dl/ocamlnet-3.6.3/doc/html-main/Netmcore_tut.html
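
For completeness, a very rough sketch of the shared data structure side
of Netmulticore. Again this is not from the original mail; the module
and function names (Netmcore_mempool.create_mempool, Netmcore_ref.sref,
deref_ro, deref_c, assign) are written from memory of the Netmcore_tut
page linked above, so treat them as assumptions and consult the manual.
The process management part (Netmcore.startup and the fork/join points)
is omitted here; the pool is created in the master process and the
functions below would run in workers that inherited it:

  (* Sketch only.  A shared memory pool ("ancient" heap) of 64 MB;
     all Netmcore worker processes must inherit it so that it is
     mapped at the same address (the "fork" requirement). *)
  let pool = Netmcore_mempool.create_mempool (64 * 1024 * 1024)

  type stats = { hits : int; misses : int }

  (* A shared reference whose value lives in the pool. *)
  let stats = Netmcore_ref.sref pool { hits = 0; misses = 0 }

  let report () =
    (* deref_ro reads the value in place, directly in shared memory. *)
    let s = Netmcore_ref.deref_ro stats in
    Printf.printf "hits=%d misses=%d\n%!" s.hits s.misses

  let record_hit () =
    (* deref_c copies the value out to the process-local heap, and
       assign copies the new value back into the shared heap -- this is
       the "copy in and out" cost mentioned above. *)
    let s = Netmcore_ref.deref_c stats in
    Netmcore_ref.assign stats { s with hits = s.hits + 1 }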

> The missing link currently is the persistent queuing service, but  
> we're investigating the options here (ocamlmq looks rather nice).

There is also Netamqp, which can be used together with RabbitMQ.

http://projects.camlcity.org/projects/netamqp.html

Gerd


> -anil
> 
> 
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
> 



-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------


Thread overview: 19+ messages
2013-03-26 14:29 Philippe Veber
2013-03-26 19:02 ` Martin Jambon
2013-03-26 21:01 ` Martin Jambon
2013-03-27  1:11   ` Francois Berenger
2013-03-28  7:37   ` Philippe Veber
2013-03-28  8:47     ` Alain Frisch
2013-03-28  9:39       ` Philippe Veber
2013-03-28 10:54         ` Alain Frisch
2013-03-28 11:02       ` Anil Madhavapeddy
2013-03-28 11:23         ` Alain Frisch
2013-03-28 12:18         ` Gerd Stolpmann [this message]
2013-03-27 10:00 ` Sébastien Dailly
2013-03-28  8:34   ` Philippe Veber
2013-03-27 16:07 ` Gerd Stolpmann
2013-03-28  9:18   ` Philippe Veber
2013-03-28 12:29     ` AW: " Gerd Stolpmann
2013-03-27 22:42 ` Denis Berthod
2013-03-27 22:49   ` AW: " Gerd Stolpmann
     [not found]     ` <4A6314AA-0C59-4E35-9EA4-F465C0A5AF3A@gmail.com>
2013-03-28  9:23       ` Philippe Veber
