From: Mike Lin
Date: Mon, 14 Jul 2008 13:04:01 -0400
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] thousands of CPU cores

On Mon, Jul 14, 2008 at 8:08 AM, Jon Harrop <jon@ffconsultancy.com> wrote:

> Perhaps the parallel GC could enable support for things like OpenMP but I
> personally would rather see a shift to similar functionality to that of
> Microsoft's TPL because (I assume) it is better for parallel tree operations
> that are themselves more common in languages like OCaml.

OpenMP is really great for parallelizing tight loops in numerical code, which is one scenario in which I'd agree shared memory is much better than message passing, at least as far as it scales. I wish I had this for my OCaml CRF and M^3 network code!

But for higher-level, map/reduce types of stuff, I really think message passing tends to get you there. In such applications I am usually interested in distributing across a compute farm anyway, for both CPU and memory requirements. I started with a lame home-rolled fork+Marshal library, then moved on to Gerd's RPC stuff, and now I'm finally playing with ocamlp3l...
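
To give a flavour of what I mean by a home-rolled fork+Marshal approach, here is a minimal sketch (not the actual library I use; the chunking and the parallel_map name are just for illustration). Each chunk of inputs goes to a forked child process, which marshals its results back to the parent over a pipe:

(* Poor man's parallel map: one forked child per chunk of inputs.
   Results come back to the parent marshalled over a pipe.
   Needs the unix library, e.g. ocamlopt unix.cmxa pmap.ml *)
let parallel_map (f : 'a -> 'b) (chunks : 'a list list) : 'b list list =
  let spawn chunk =
    let rd, wr = Unix.pipe () in
    match Unix.fork () with
    | 0 ->
        (* child: compute, send the marshalled result, and exit *)
        Unix.close rd;
        let oc = Unix.out_channel_of_descr wr in
        Marshal.to_channel oc (List.map f chunk) [];
        close_out oc;
        exit 0
    | pid ->
        (* parent: remember the child's pid and the read end of its pipe *)
        Unix.close wr;
        (pid, Unix.in_channel_of_descr rd)
  in
  let workers = List.map spawn chunks in
  List.map
    (fun (pid, ic) ->
       let result : 'b list = Marshal.from_channel ic in
       close_in ic;
       ignore (Unix.waitpid [] pid);
       result)
    workers

(* e.g. square 1..8 using two worker processes *)
let () =
  let results = parallel_map (fun x -> x * x) [ [1; 2; 3; 4]; [5; 6; 7; 8] ] in
  List.iter (Printf.printf "%d ") (List.concat results);
  print_newline ()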

Incidentally, it occurs to me that when one is optimizing the kind of tight numerical loops that can really benefit from shared memory, the FIRST step, before parallelizing, is to do away with any heap allocations in the loop. The following is not a serious proposal, but just to kick the idea around: what is the feasibility of removing the global interpreter lock for segments of code which perform no heap allocations? That is, what besides the GC is stopping us?
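
For concreteness, here is the sort of allocation-free loop I have in mind (a toy example; the no-allocation claim assumes ocamlopt and float arrays, which are stored unboxed):

(* y <- a*x + y over float arrays.  With ocamlopt, float array reads and
   writes are unboxed and the intermediate a *. x.(i) stays unboxed, so
   the loop body never touches the heap. *)
let saxpy a (x : float array) (y : float array) =
  for i = 0 to Array.length x - 1 do
    y.(i) <- a *. x.(i) +. y.(i)
  done

(* Compare the "functional" version: it allocates a fresh result array,
   and the generic Array.mapi boxes each float it passes to the closure. *)
let saxpy_alloc a (x : float array) (y : float array) =
  Array.mapi (fun i xi -> a *. xi +. y.(i)) x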

Mike
