From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stephanh@planet.nl>
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by yquem.inria.fr (Postfix) with ESMTP id A4E5CBC57
	for <caml-list@yquem.inria.fr>; Tue, 30 Nov 2010 15:04:34 +0100 (CET)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AkgCAC2T9EzVSycKkWdsb2JhbACDUJE0jgwVAQEBAQkLCgcRAx+ySpEbgSGDM3MEimKDDwY
X-IronPort-AV: E=Sophos;i="4.59,280,1288566000"; 
   d="scan'208";a="81991630"
Received: from cpsmtpb-ews07.kpnxchange.com ([213.75.39.10])
  by mail2-smtp-roc.national.inria.fr with ESMTP; 30 Nov 2010 15:04:34 +0100
Received: from cpbrm-ews21.kpnxchange.com ([10.94.84.152]) by cpsmtpb-ews07.kpnxchange.com with Microsoft SMTPSVC(6.0.3790.4675);
	 Tue, 30 Nov 2010 15:04:33 +0100
Received: from CPSMTPM-CMT104.kpnxchange.com ([195.121.3.20]) by cpbrm-ews21.kpnxchange.com with Microsoft SMTPSVC(6.0.3790.4675);
	 Tue, 30 Nov 2010 15:04:33 +0100
Received: from [192.168.2.4] ([84.80.25.83]) by CPSMTPM-CMT104.kpnxchange.com with Microsoft SMTPSVC(7.0.6001.18000);
	 Tue, 30 Nov 2010 15:04:33 +0100
Message-ID: <4CF50470.802@planet.nl>
Date: Tue, 30 Nov 2010 15:04:32 +0100
From: Stephan Houben <stephanh@planet.nl>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6
MIME-Version: 1.0
Newsgroups: fa.caml
To: oliver@first.in-berlin.de
Cc: caml-list@yquem.inria.fr
Subject: Re: Threading and SharedMem (Re: [Caml-list] Re: Is OCaml fast?)
References: <fa.B9mcuN46iEGhXlge41VUCLz69+Y@ifi.uio.no> <fa.D3cDWzaD9Uu03+KvekpwpBGCx7o@ifi.uio.no> <fa.xsCCCeDYPj8J16i9UrdqxoOIQ0Y@ifi.uio.no> <fa.SW2Swldk88Bs5ujaNHT8Yh4bXkg@ifi.uio.no> <fa.V+M6RbukE/w/Aftpwxkx2MvkxlU@ifi.uio.no> <fa.+OkqNL3AB4+5LA8wOnQD9WS59QQ@ifi.uio.no>
In-Reply-To: <fa.+OkqNL3AB4+5LA8wOnQD9WS59QQ@ifi.uio.no>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 30 Nov 2010 14:04:33.0592 (UTC) FILETIME=[83CD4380:01CB9097]
X-RcptDomain: yquem.inria.fr
X-Spam: no; 0.00; threading:01 ocaml:01 in-berlin:01 bigarray:01 inherently:01 bigarray:01 ocaml:01 corresponds:01 byte:01 corresponds:01 byte:01 waitpid:01 threading:01 ths:98 up-front:98 

On 11/30/2010 12:55 PM, oliver@first.in-berlin.de wrote:
> There is one problem with this... when you have forked, then
> you obviously have separated processes and also in each process
> your own ocaml-program with it's own GC running...

...neatly sidestepping the problem that the GC needs to lock out all threads...

> .with such a mem-mapping trick (never used Bigarray, so I'm astouned it uses
> mmap) you then have independent processes, working on shared mem without
> synchronisation.

> This is a good possibility to get corrupted data, and therefore unreliable behaviour.

Well, not more possibility than inherently in any code that updates a shared data structure
in parallel. It is certainly not the case that the independently executing GCs in both
processes are causing data corruption, since the GC only operates on unshared memory.
Note that the GC doesn't move the Bigarray data around.

(I am not sure if this was in particular your worry or that it was just the lack of
  synchronisation mechanisms which you bring up next.
  Apologies if I am addressing some non-concern.)

> So, you have somehow to create a way of communicating of these processes.
>
> This already is easily done in the Threads-module, because synchronisation
> mechanisms are bound there to the OCaml API and can be used easily.
>
> In the Unix module there is not much of ths IPC stuff...

In fact there is the Unix.pipe function which can be used for message passing
communication between processes.
A pipe can also be used as a semaphore:
operation V corresponds to writing a byte to the pipe, operation P corresponds to reading a byte.
It's a bit heavy since it always makes a kernel call even for the non-contended case, but
otherwise it works perfectly.

For many purposes (e.g. something "embarrassingly parallel" like computing the Mandelbrot set)
you can just divide the work up-front and only rely on the implicit synchronization given
by waitpid.

If you allow me a final observation/rant: I personally feel that the use of fork() and
pipes as a way to exploit multiple CPUs is underrated. When appropriate (lots of computation
and not so much synchronisation/communication) it works really great and is very robust because
all data is process-private by default, as opposed to threading, where everything is shared
and you have to stand on your head to get a thread-local variable. Performance can also be better
since you don't run into cache coherency issues.

I am not sure why it is not used more; possibly because it is not supported on Windows.

Stephan