From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id F20E4BBC4 for ; Wed, 4 Mar 2009 20:00:06 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjMGAIBerknUnwckZmdsb2JhbACCHZJlDQsDBQkPBsM0hAgG X-IronPort-AV: E=Sophos;i="4.38,302,1233529200"; d="scan'208";a="22054229" Received: from relay.ptn-ipout02.plus.net ([212.159.7.36]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 04 Mar 2009 20:00:06 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: An4FAENerknUnw4R/2dsb2JhbACCHdZWhAgG Received: from pih-relay04.plus.net ([212.159.14.17]) by relay.ptn-ipout02.plus.net with ESMTP; 04 Mar 2009 19:00:06 +0000 Received: from [87.112.234.120] (helo=leper.local) by pih-relay04.plus.net with esmtp (Exim) id 1LewK5-00084O-Qm for caml-list@yquem.inria.fr; Wed, 04 Mar 2009 19:00:06 +0000 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Odd performance result with HLVM Date: Wed, 4 Mar 2009 19:05:28 +0000 User-Agent: KMail/1.9.9 References: <200902280112.24115.jon@ffconsultancy.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200903041905.28823.jon@ffconsultancy.com> X-Plusnet-Relay: 779feb9b7e139f5409fb57343674c5a4 X-Spam: no; 0.00; mikkel:01 rgensen:01 haskell:01 ocaml:01 ocaml:01 haskell:01 lennart:01 augustsson:01 vaguely:01 extensively:01 parallelism:01 ocaml's:01 parallelism:01 2009:98 frog:98 On Wednesday 04 March 2009 16:17:55 Mikkel Fahn=C3=B8e J=C3=B8rgensen wrote: > When looking at the benchmark game and other benchmarks I have seen, I > noticed that Haskell is almost as fast as OCaml and sometimes faster. > Some Lisp implementations are also pretty fast. =46rom my ray tracer language comparison, my OCaml is ~50% faster than Hask= ell=20 written by Lennart Augustsson and Lisp written by Juho Snellman, both of wh= om=20 have extensive experience writing optimizing compilers for those languages= =20 whereas I did not: http://www.ffconsultancy.com/languages/ray_tracer/results.html Moreover, I received dozens of implementations in Haskell and Lisp and thes= e=20 were the only vaguely competitive ones: most programmers are not able to=20 write fast code in Haskell or Lisp primarily because their performance is s= o=20 wildly unpredictable. The Burrows-Wheeler block sorting data compression algorithm implemented in= =20 Haskell and discussed extensively for weeks in the context of performance i= s=20 a good example of this: they never got within 10,000x the performance of C.= =20 There are many other examples where nobody was able to get within orders of= =20 magnitude of the performance of common languages. GHC does have rudimentary support for parallelism and that makes it much=20 easier to leverage 2-6 cores in Haskell compared to OCaml. However, that is= =20 merely a deficiency in the current OCaml implementation and is something th= at=20 can be addressed. Moreover, the current Haskell implementation scales very= =20 poorly and is easily maxed out even on today's 8 core computers. For exampl= e,=20 on a recent Mandelbrot benchmark from comp.lang.functional the Haskell is=20 faster for 1-6 cores but stops seeing improvements beyond that whereas OCam= l=20 with process forking continues to see improvements up to all 8 cores and it= =20 faster overall on this machine as a consequence. Although efficient concurrent garbage collectors are hard to write, paralle= l=20 ones like the one in GHC are comparatively easy to write and still very=20 useful. > However, when you look at memory consumption OCaml uses considerably > less memory, except for languages in the C family. > > I suspect that many real world performance scenarios, such as heavily > loaded web servers and complex simulations, depend very much on memory > consumption. This is both because of GC overhead and because of the > slower memory pipeline the more cache levels are involved. > > So in case of a new JIT solution for OCaml, I believe it is important > to observe this aspect as well. OCaml's memory efficiency is certainly extremely good and it may be=20 theoretically possible to preserve that in a new implementation that suppor= ts=20 parallelism. That is absolutely not the goal of my work though: I only inte= nt=20 to get the simplest possible parallel GC working because I am interested=20 primarily in high-performance numerics, string processing and visualization= =20 and not web servers. However, I will endeavour to make the implementation as extensible as possi= ble=20 so that other people can create drop-in replacements that provide this kind= =20 of functionality. Improving upon my GC should be quite easy for anyone vers= ed=20 in the subject. Interestingly, my GC is written entirely in my own=20 intermediate representation. =2D-=20 Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e