From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jon@ffconsultancy.com>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105])
	by yquem.inria.fr (Postfix) with ESMTP id CCC34BBCA
	for <caml-list@yquem.inria.fr>; Fri,  9 May 2008 21:55:03 +0200 (CEST)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AmgDAHxIJEjUnw7Ub2dsb2JhbACCMY9WAQwFAgQHEwOZOg
X-IronPort-AV: E=Sophos;i="4.27,461,1204498800"; 
   d="scan'208";a="26011112"
Received: from ptb-relay01.plus.net ([212.159.14.212])
  by mail4-smtp-sop.national.inria.fr with ESMTP; 09 May 2008 21:55:03 +0200
Received: from [80.229.56.224] (helo=beast.local)
	 by ptb-relay01.plus.net with esmtp (Exim) id 1JuYgI-0005jM-3P; Fri, 09 May 2008 20:55:02 +0100
From: Jon Harrop <jon@ffconsultancy.com>
Organization: Flying Frog Consultancy Ltd.
To: Gerd Stolpmann <info@gerd-stolpmann.de>
Subject: Re: [Caml-list] Re: Why OCaml rocks
Date: Fri, 9 May 2008 19:10:41 +0100
User-Agent: KMail/1.9.9
References: <200805090139.54870.jon@ffconsultancy.com> <200805090609.36123.jon@ffconsultancy.com> <1210331526.17578.32.camel@flake.lan.gerd-stolpmann.de>
In-Reply-To: <1210331526.17578.32.camel@flake.lan.gerd-stolpmann.de>
Cc: caml-list@yquem.inria.fr
MIME-Version: 1.0
Content-Disposition: inline
Message-Id: <200805091910.41381.jon@ffconsultancy.com>
Content-Type: text/plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
X-Plusnet-Relay: 3a47016084c8286cd257196038df414e
X-Spam: no; 0.00; ocaml:01 gerd:01 stolpmann:01 parallelism:01 o'caml:01 parallelism:01 o'caml:01 ocaml:01 mutable:01 mutable:01 frog:98 threads:01 threads:01 wrote:01 caml-list:01 

On Friday 09 May 2008 12:12:00 Gerd Stolpmann wrote:
> I think the parallelism capabilities are already excellent. We have been
> able to implement the application backend of Wink's people search in
> O'Caml, and it is of course a highly parallel system of programs. This
> is not the same class raytracers or desktop parallelism fall into - this
> is highly professional supercomputing. I'm talking about a cluster of
> ~20 computers with something like 60 CPUs.
>
> Of course, we did not use multithreading very much. We are relying on
> multi-processing (both "fork"ed style and separately started programs),
> and multiplexing (i.e. application-driven micro-threading). I especially
> like the latter: Doing multiplexing in O'Caml is fun, and a substitute
> for most applications of multithreading. For example, you want to query
> multiple remote servers in parallel: Very easy with multiplexing,
> whereas the multithreaded counterpart would quickly run into scalability
> problems (threads are heavy-weight, and need a lot of resources).

If OCaml is good for concurrency on distributed systems that is great but it 
is completely different to CPU-bound parallelism on multicores.

> > There are two problems with that:
> >
> > . You go back to manual memory management between parallel
> > threads/processes.
>
> I guess you refer to explicit references between processes. This is a
> kind of problem, and best handled by avoiding it.

Then you take a massive performance hit.

> > . Parallelism is for performance and performance requires mutable data
> > structures.
>
> In our case, the mutable data structures that count are on disk.
> Everything else is only temporary state.

Exactly. That is a completely different kettle of fish to writing high 
performance numerical codes for scientific computing.

> I admit that it is a challenge to structure programs in a way such that
> parallel programs not sharing memory profit from mutable state. Note
> that it is also a challenge to debug locks in a multithreaded program so
> that they run 24/7. Parallelism is not easy after all.

Parallelism is easy in F#.

> > Then you almost always end up copying data unnecessarily because you
> > cannot collect it otherwise, which increases memory consumption and
> > massively degrades performance that, in turn, completely undermines the
> > original point of parallelism.
>
> Ok, I understand. We are complete fools. :-)
>
> I think that the cost of copying data is totally overrated. We are doing
> this often, and even over the network, and hey, we are breaking every
> speed limit.

You cannot afford to pay that price for parallel implementations of most 
numerical algorithms.

> > The cost of interthread communication is then so high in
> > OCaml that you will rarely be able to obtain any performance improvement
> > for the number of cores desktop machines are going to see over the next
> > ten years, by which time OCaml will be 10-100x slower than the
> > competition.
>
> This is a quite theoretical statement. We will rather see that most
> application programmers will not learn parallelism at all, and that
> consumers will start question the sense of multicores, and the chip
> industry will search for alternatives.

On the contrary, that is not a theoretical statement at all: it already 
happened. F# already makes it much easier to write high performance parallel 
algorithms and its concurrent GC is the crux of that capability.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e