From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105])
	by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p2OEvusM026430
	for <caml-list@sympa-roc.inria.fr>; Thu, 24 Mar 2011 15:58:00 +0100
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AhwCABP5ik3U4366kGdsb2JhbACERKEFFAEBAQEJCQ0HFAMiiE2pCJETAoElg0t3BIgqiBc
X-IronPort-AV: E=Sophos;i="4.63,237,1299452400"; 
   d="scan'208";a="91053121"
Received: from moutng.kundenserver.de ([212.227.126.186])
  by mail4-smtp-sop.national.inria.fr with ESMTP; 24 Mar 2011 15:57:59 +0100
Received: from office1.lan.sumadev.de (dslb-188-097-066-234.pools.arcor-ip.net [188.97.66.234])
	by mrelayeu.kundenserver.de (node=mreu4) with ESMTP (Nemesis)
	id 0LqrbD-1PXGa60fC7-00edhr; Thu, 24 Mar 2011 15:57:57 +0100
Received: from [192.168.1.111] (546BE640.cm-12-4d.dynamic.ziggo.nl [84.107.230.64])
	by office1.lan.sumadev.de (Postfix) with ESMTPA id C23A35F701;
	Thu, 24 Mar 2011 15:57:56 +0100 (CET)
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Alexy Khrabrov <deliverable@gmail.com>
Cc: caml-list <caml-list@inria.fr>
In-Reply-To: <4587C749-DEFE-47B9-8D9A-44FE5980B58F@gmail.com>
References: <4587C749-DEFE-47B9-8D9A-44FE5980B58F@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Date: Thu, 24 Mar 2011 15:57:55 +0100
Message-ID: <1300978675.8429.227.camel@thinkpad>
Mime-Version: 1.0
X-Mailer: Evolution 2.28.1 
Content-Transfer-Encoding: 7bit
X-Provags-ID: V02:K0:+PuXYQETRVNtyv/JW6MyNQuAbReU6fE0oBhc1n4RmKV
 K0Y3gSzDr/2F3Knk5WU9wM8W5FN1fxxsdBqdMeFJg5Y41uoqTS
 jjcS8cT/54MvfNG76a73AdD1QFRjWmS6cf5rrjjVedJEXCewfy
 RW8ofz/tf7GXlwsid/Y/BFKdCBM7cQ6XpUQs2jU4hnYeQIO9j8
 dpzJzasd1MWKuokeVscrg==
Subject: Re: [Caml-list] Efficient OCaml multicore -- roadmap?

Am Donnerstag, den 24.03.2011, 09:44 -0400 schrieb Alexy Khrabrov:
> Where does the OCaml team stand on the multicore issues?  A year or so ago, when there was a prototype parallel GC implementation, IIRC, Xavier said it has to be done right.  So what are the official plans and the status of integrating what volunteers had done?
> 
> WIth Scala having a robust actors model and AKKA kernel, and Clojure built around efficient shared memory concurrency with agents and references and STM, and Haskell also really parallel, OCaml is lacking behind.  Furthermore, F# builds on strongly parallel .NET, overcoming granddaddy.  With multicores common even in laptops and iPads, we need an efficient  multicore OCaml!  Due to the model different from Haskell or Scala and Clojure, now all on github, OCaml is both more stable and also is slower to advance -- what do folks think about this situation?  How do you do shared memory parallelism now?

I'm a bit reluctant to answering to this at all, because there is a big
misunderstanding in the world about what parallelism is good for. Some
people think it is a technique for doing micro optimizations in their
code (i.e. parallelize some inner loops). Questions about "shared memory
concurrency" often go into this direction because existing code can
profit from it. Although there is nothing wrong in principal about this,
one should be very aware that the usefulness is very limited - often
only code speedups of a few percent of the total runtime can be
achieved, and it is often restricted to two cores. If you look at
programs that are designed for parallelism, the whole story is a
completely different one, though.

So, here is my answer to your latter question: I've started a library
for using independent processes as workers that can communicate via
explicit shared memory (see
https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/src/netmulticore/, and https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/multicore/chameneos.ml for a coding sample). This seems at least to be a solution in the Unix world for coarse-grained parallelism that is not dominated by the communication overhead between the processes. Available features:

- manage worker processes
- the processes can use files, shared memory, and semaphores
- processes can communicate to each other by sending messages to
  "camlboxes". By tricky implementation, there is only overhead for 
  serialization but not deserialization which makes message passing
  really fast

Of course, this is all very experimental. At least it demonstrates that
exploiting shared memory between independent processes is possible. For
getting a real programming environment, one needs to put more effort
into this, though, especially we would need data structures that can
live in shared memory (e.g. hashtables).

The resulting programming model is of course not the usual
multi-threading model. In particular, values are not automatically put
into shared memory. This has to be explicitly managed. IMHO, this is not
a weakness, but rather a strong point (and not everybody will understand
this - it's a somehow comparable to banning "goto" from higher-level
languages). It also means that you cannot use parallelism for doing
micro optimizations - the whole program has to be re-designed.

Generally, I'd say there is right now no perfect solution to the
problem. As language developer, you have to foresee future hardware to
some degree. When more and more cores are added, the characteristics of
the hardware will gradually shift from SMP to NUMA, and the more it
does, message passing techniques will become more attractive.

I'd like to see Ocaml doing here well. Let the others focus on SMP
architectures (which will, at a certain point of hardware development,
only be found in laptops but not in real compute hardware).

Gerd

> 
> Cheers,
> Alexy
> 


-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------