Subject: Re: [Caml-list] Re: OCaml is broken
From: Gerd Stolpmann
To: Jon Harrop
Cc: caml-list@yquem.inria.fr
Date: Mon, 21 Dec 2009 02:08:14 +0100
Message-Id: <1261357694.6545.89.camel@flake.lan.gerd-stolpmann.de>
In-Reply-To: <200912202114.42669.jon@ffconsultancy.com>

> The following web page describes a commercial machine sold by Azul Systems
> that has up to 16 54-core CPUs (=864 cores) and 768 GB
> of memory in a flat SMP configuration:
>
> http://www.azulsystems.com/products/compute_appliance.htm
>
> As you can see, a GC with shared memory can already scale across dozens of
> cores and memory access is no more heterogeneous than it was 20 years ago.
> Also, note that homogeneous memory access is a red herring in this context
> because it does not undermine the utility of a shared heap on a multicore.

The benchmarks they mention can all easily be parallelized - that stuff
you can also do with multi-processing. The interesting thing would be an
inherently parallel algorithm where the same memory region is accessed by
multiple threads. Or at least a numeric program (your examples seem to
be mostly from that area).

> > - Have you considered that many Ocaml users prefer a GC that offers maximum
> > single core performance,
>
> OCaml's GC is nowhere near offering maximum single core performance. Its
> uniform data representation renders OCaml many times slower than its
> competitors for many tasks. For example, filling a 10M float->float hash
> table is over 18x slower with OCaml than with F#. FFT with a complex number
> type is 5.5x slower with OCaml than F#. Fibonacci with floats is 3.3x slower
> with OCaml than my own HLVM project (!).

Sure, but these micro benchmarks are, first, seldom correct and, second,
do not really count for real-world programs. For example, an important
parameter of such benchmarks is how often the GC runs. OCaml runs the GC
very often - good for latencies, but bad for micro benchmarks, because
other runtimes simply delay the GC until some limits are exceeded, so
these other runtimes often haven't run the GC even once in the short
period of time the benchmark runs.

It is simply a fact that the OCaml developers had some preferences. E.g.
allocating and freeing short-lived values is extremely fast (often <10ns).
This is very good when you do symbolic computations, or have lots of
small strings, but matters little for numeric stuff, or for programs
where the lifetime of allocated memory is bound to server sessions. The
minor GC is very fast, but, as you observe, the uniform representation
has costs elsewhere.

> > because their application is parallelised via multiple processes
> > communicating via message passing?
>
> A circular argument based upon the self-selected group of remaining OCaml
> users. Today's OCaml users use OCaml despite its shortcomings. If you want to
> see the impact of OCaml's multicore unfriendliness, consider why the OCaml
> community has haemorrhaged 50% of its users in only 2 years.

Don't see that. That's just speculation - maybe some win32 OCaml users
switched to F#, but there are surely also reasons other than multicore
support, e.g. GUIs and better Windows integration. Btw, where do you get
your numbers from?

There are many, many users for whom multicore is just useless hype.
Either the algorithms are inherently difficult to parallelize (and this
is the vast majority), or they are so easy (like all client/server stuff)
that multi-processing is sufficient. You can consider multicore as a
marketing trick of the chip industry to let the ordinary desktop user pay
for a feature that is mostly interesting for datacenters.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------
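
Note on the GC-frequency point above: one rough way to see how much
collection work lands inside a short benchmark is to read the runtime's
own counters before and after the measured code. The sketch below is only
an illustration, not a benchmark from this thread. It uses Gc.quick_stat
from the OCaml standard library; the helper names count_collections and
build, and the workload size, are hypothetical choices made for the
example.

```ocaml
(* Run [f] and report how many minor and major collections it triggered.
   [count_collections] is a hypothetical helper, not a library function. *)
let count_collections f =
  let before = Gc.quick_stat () in
  let result = f () in
  let after = Gc.quick_stat () in
  (result,
   after.Gc.minor_collections - before.Gc.minor_collections,
   after.Gc.major_collections - before.Gc.major_collections)

(* Hypothetical allocation-heavy workload: a list of [n] boxed floats,
   built tail-recursively so that large [n] does not overflow the stack. *)
let build n =
  let rec go i acc =
    if i = 0 then acc else go (i - 1) (float_of_int i :: acc)
  in
  go n []

let () =
  let (xs, minor, major) = count_collections (fun () -> build 1_000_000) in
  Printf.printf "length=%d minor=%d major=%d\n" (List.length xs) minor major
```

If the counters are well above zero for a run that lasts only a few
milliseconds, the timing includes collection work that a runtime with a
larger or lazier young generation might not have paid for yet, which is
the effect described above. The actual numbers depend on the OCaml
version and on the minor heap size in use.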