From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 223CABB84 for ; Mon, 14 Jul 2008 19:29:45 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AoQKAEMqe0jUnw6FX2dsb2JhbACCN49tHAIGBhIDmFU X-IronPort-AV: E=Sophos;i="4.30,360,1212357600"; d="scan'208";a="15099877" Received: from pih-relay06.plus.net ([212.159.14.133]) by mail3-smtp-sop.national.inria.fr with ESMTP; 14 Jul 2008 19:29:44 +0200 Received: from [90.192.139.241] (helo=beast.local) by pih-relay06.plus.net with esmtpa (Exim) id 1KIRrr-0002Wz-T0 for caml-list@yquem.inria.fr; Mon, 14 Jul 2008 18:29:44 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] thousands of CPU cores Date: Mon, 14 Jul 2008 18:28:41 +0100 User-Agent: KMail/1.9.9 References: <200807141308.23474.jon@ffconsultancy.com> <2a1a1a0c0807141004p28ca6805hdf8f442840c61e7b@mail.gmail.com> In-Reply-To: <2a1a1a0c0807141004p28ca6805hdf8f442840c61e7b@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200807141828.41808.jon@ffconsultancy.com> X-Plusnet-Relay: cec80716e60caf5f6d473eb19021561c X-Spam: no; 0.00; allocations:01 allocations:01 ocaml:01 compiler:01 subset:01 ocaml:01 camlp:01 run-time:01 afresh:98 frog:98 wrote:01 heap:01 heap:01 caml-list:01 loops:02 On Monday 14 July 2008 18:04:01 Mike Lin wrote: > Incidentally, it occurs to me that when one is optimizing the kind of tight > numerical loops that can really benefit from shared memory, the FIRST step, > before parallelizing, is to do away with any heap allocations in the loop. > The following is not a serious proposal, but just to kick the idea around - > what is the feasibility of removing the global interpreter lock for > segments of code which perform no heap allocations? i.e. what besides the > GC is stopping us? I have had similar ideas but I think it would be much wiser to move away from OCaml and start afresh. You can easily write a compiler for a subset of OCaml suitable for high-performance numerics using LLVM. Users can easily develop their numerical code with OCaml and then quote it using camlp4 to have it compiled at run-time. However, this only works for a tiny DSL where no allocations are required (you can easily improve upon OCaml by not boxing floats and by using value types though). -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/products/?e