From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jon Harrop <jonathandeanharrop@googlemail.com>
To: <caml-list@yquem.inria.fr>
References: <w2t320e992a1005051158pcdb30e5w48409aa8b6e464e4@mail.gmail.com>	<951508.20587.qm@web58708.mail.re1.yahoo.com>	<g2j90823c941005060343t88c91d06k731ccd01ec225c07@mail.gmail.com>	<201005061233.07551.peng.zang@gmail.com>	<07b101caf08b$3e5022c0$baf06840$@com>	<AANLkTik7gzvUrLntCLbL7gQow4RCzkXDPwCNiCHIE6Or@mail.gmail.com>	<088201caf1ce$b5060cb0$1f122610$@com>	<20100512151137.26894ywcpv71ixvk@imp.ovh.net>	<012601caf351$e9a362e0$bcea28a0$@com>	<44A730DD-54EB-4A1C-BD1A-6E9EFB31B5A2@x9c.fr>	<01f001caf536$c923b4c0$5b6b1e40$@com>	<20100517095327.14271x0lnao43sao@imp.ovh.net>	<002001caf6e8$b408ed90$1c1ac8b0$@com> <20100519094634.63006zi1h04x95z4@imp.ovh.net>
In-Reply-To: <20100519094634.63006zi1h04x95z4@imp.ovh.net>
Subject: RE: [Caml-list] about OcamIL
Date: Thu, 20 May 2010 03:03:09 +0100
Organization: Flying Frog Consultancy
Message-ID: <009b01caf7c0$99968340$ccc389c0$@com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Mailer: Microsoft Office Outlook 12.0
Thread-Index: Acr3J302LfkIM1EyTeOlgvNC2mz2ZgAg+ISQ
Content-Language: en-gb

Xavier Clerc wrote:
> Jon Harrop <jonathandeanharrop@googlemail.com> wrote:
> > I don't think this is heated at all. We were talking about "high
> > performance" languages and you cited a bunch of languages that get
> > whipped by Python on this benchmark:
> >
> > http://flyingfrogblog.blogspot.com/2009/04/f-vs-ocaml-vs-haskell-hash-table.html
>
> Acknowledged.
> "Whipped" is here 2 times slower on that particular benchmark,
> while Python is rarely within an order of magnitude of OCaml code
> (cf. the language shootout).
> Moreover, hashtables are ubiquitous in Python (and hence probably
> particularly optimized), while they are not so common in Haskell
> or Caml.

I greatly value efficient generic collections.

> >> and references to benchmarks that back up your claims in this
> >> thread.
> >
> >   http://fsharpnews.blogspot.com/2010/05/java-vs-f.html
>
> Point taken.
> Just notice that the 17x factor is observed on the micro-benchmark,
> while on the larger one the two platforms seem on par.

Sure, but those two benchmarks are testing completely different things. The
shootout's k-nucleotide is a larger benchmark that still uses hash tables, and
Java is still 4x slower because the JVM cannot express a generic hash table
efficiently. There are many such problems where the lack of value types on
the JVM is a serious impediment to performance.

> Here is a question about the micro-benchmark: do you know if F#
> monomorphizes the collection in this example?
> If it turns out to be done, one may probably argue that the problem
> is more related to the language than to the platform (just recycling
> an objection made on the page you pointed out).

I'm not sure what you mean here. Both of the programs are just using the
generic hash table implementation native to their platform. Neither language
does anything of relevance to optimize the code. .NET is a lot faster and
uses a lot less memory than the JVM for two reasons: the keys and values are
value types, so .NET stores them unboxed in the spine of the hash table
whereas the JVM boxes them, and the insertion function is monomorphized at
JIT time.
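
To make the boxing point concrete, here is a minimal Java sketch. It is not
taken from either benchmark, and every name in it (UnboxedTableSketch,
IntDoubleTable, sumBoxed, sumUnboxed) is hypothetical. It contrasts the
generic java.util.HashMap, which has to box int keys and double values into
Integer and Double objects, with a table specialized by hand for int and
double that keeps both directly in primitive arrays. The hand-written table
only approximates what value types plus JIT-time monomorphization give a
generic .NET hash table automatically.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the thread.
final class UnboxedTableSketch {

    // Open-addressing table specialized by hand for int keys and double values.
    // Keys and values sit directly in primitive arrays (the "spine"),
    // so no per-entry objects are allocated.
    static final class IntDoubleTable {
        private int[] keys = new int[16];          // capacity stays a power of two
        private double[] values = new double[16];
        private boolean[] used = new boolean[16];
        private int size = 0;

        void put(int key, double value) {
            if (2 * (size + 1) > keys.length) grow();   // keep load factor <= 0.5
            int i = slot(key);
            if (!used[i]) { used[i] = true; keys[i] = key; size++; }
            values[i] = value;
        }

        // Returns the stored value, or `missing` when the key is absent.
        double get(int key, double missing) {
            int i = slot(key);
            return used[i] ? values[i] : missing;
        }

        int size() { return size; }

        // Linear probing: index of the slot holding `key`, or of the first free slot.
        private int slot(int key) {
            int mask = keys.length - 1;
            int i = mix(key) & mask;
            while (used[i] && keys[i] != key) i = (i + 1) & mask;
            return i;
        }

        // Cheap bit mixing so that consecutive keys spread over the table.
        private static int mix(int h) {
            h *= 0x9E3779B9;
            return h ^ (h >>> 16);
        }

        // Doubles the capacity and reinserts every live entry.
        private void grow() {
            int[] oldKeys = keys;
            double[] oldValues = values;
            boolean[] oldUsed = used;
            keys = new int[oldKeys.length * 2];
            values = new double[keys.length];
            used = new boolean[keys.length];
            size = 0;
            for (int i = 0; i < oldKeys.length; i++) {
                if (oldUsed[i]) put(oldKeys[i], oldValues[i]);
            }
        }
    }

    // Generic path: autoboxing wraps each key and value in Integer and Double objects.
    static double sumBoxed(int n) {
        Map<Integer, Double> m = new HashMap<>();
        for (int i = 0; i < n; i++) m.put(i, 1.0 / (i + 1));
        double s = 0.0;
        for (int i = 0; i < n; i++) s += m.get(i);
        return s;
    }

    // Specialized path: the same work with no boxing at all.
    static double sumUnboxed(int n) {
        IntDoubleTable t = new IntDoubleTable();
        for (int i = 0; i < n; i++) t.put(i, 1.0 / (i + 1));
        double s = 0.0;
        for (int i = 0; i < n; i++) s += t.get(i, 0.0);
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000000));
        System.out.println(sumUnboxed(1000000));
    }
}
```

Both methods compute the same sum; the difference lies in allocation and
memory layout, so any timing comparison would need a proper benchmark harness
rather than this sketch.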

> > Scala on the JVM is 7x slower than C++ on this benchmark:
> >
> > http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=scala&lang2=gpp
>
> Agreed, but it seems that if you aggregate the results of the different
> benchmarks, Scala is on average only 1.5x from C++ (but far away in
> terms of memory consumption). The 7x factor is the worst result
> observed, the median being 2x.

Sure. This seems to be a difference in our interpretation of "high
performance". If a language or platform can be 17x slower than others then I
would not call it "high performance" even if it is competitively performant
elsewhere. Indeed, I would completely disregard averages on benchmark suites
and focus on outliers because they convey more useful information. Granted,
that was a microbenchmark, but the effect is severe enough to afflict many
real programs.
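
The arithmetic behind the averages-versus-outliers argument can be sketched
as follows. The slowdown ratios are made up purely for illustration and are
not measurements from the shootout or from either blog post; the class and
method names are hypothetical as well. The sketch computes the mean, median
and worst case of a set of per-benchmark slowdowns, to show how one severe
outlier barely moves the median.

```java
import java.util.Arrays;

// Hypothetical illustration; the ratios in main are invented, not measured.
final class SlowdownSummarySketch {

    // Arithmetic mean of the slowdown ratios.
    static double mean(double[] xs) {
        double s = 0.0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    // Median of the slowdown ratios (average of the two middle values when
    // the count is even). Sorts a copy so the input is left untouched.
    static double median(double[] xs) {
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1)
            ? sorted[n / 2]
            : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    // Largest slowdown, i.e. the outlier the text argues one should look at.
    static double worst(double[] xs) {
        double w = xs[0];
        for (double x : xs) w = Math.max(w, x);
        return w;
    }

    public static void main(String[] args) {
        // Made-up ratios: run time divided by the fastest run time, per benchmark.
        double[] ratios = {1.0, 1.1, 1.2, 1.3, 17.0};
        System.out.printf("mean=%.2f median=%.2f worst=%.2f%n",
                          mean(ratios), median(ratios), worst(ratios));
    }
}
```

With numbers shaped like these, the median says almost nothing about the one
benchmark that is an order of magnitude slower, which is the case a user
hitting that code path would care about.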

> >> It may just end up that we have different perceptions of "high
> >> performance", and of the trade-offs we are going to make in our
> >> language / platform choices.
> >
> > Probably. What languages do you not consider to be high
> > performance?
>
> I am not sure it is that easy to compare languages, but measuring
> compiler performances: any compiler that produces code that runs
> within -let's say- 5x of the fastest one around, on a bunch of
> wide-spectrum benchmarks (e.g. numerical code *plus* string matching
> *plus* tree walking, etc).
> Maybe it should also be mentioned that I am more versed in symbolic
> computations.

Then you're probably more interested in OCaml's GC vs parallelism when
performance is important.

> Regarding trade-offs, I am also inclined to favor Open Source solutions
> and higher-level languages (the trade-off being here execution time vs
> programming/debugging time).

I agree in general, of course, but I'm not sure "higher-level languages"
means much in this context. Is C# higher level than Java? Maybe, but I'm
interested in the value types and approach to generics here.

Monomorphization and unboxed tuples get you a long way towards decent
performance without a top-notch GC tuned for functional languages. That
makes it feasible to implement with a naïve multicore-capable GC, which is
precisely the current direction of HLVM.

> PS: as an aside, I used the word "references" for academic publications
> that went through a reviewing process, not blog entries.

I see. I value reproducible results far more highly than peer-reviewed
academic papers, particularly in this context.

Cheers,
Jon.