Subject: Re: ocamlopt LLVM support (Was: [Caml-list] OCamlJIT2 vs. OCamlJIT)
From: Benedikt Meurer
Date: Sun, 5 Dec 2010 22:21:37 +0100
To: caml-list@yquem.inria.fr

On Dec 5, 2010, at 21:12 , Jon Harrop wrote:

> Benedikt wrote:
>> On Dec 3, 2010, at 21:07 , Jon Harrop wrote:
>>> Benedikt wrote:
>>>> So, let's take that LLVM thing to a somewhat more serious level (which
>>>> means no more arguing with toy projects that simply ignore the
>>>> difficult stuff).
>>>
>>> You can learn a lot from prototypes.
>>
>> Indeed, but you need to stay on topic. HLVM does not prototype the use of
>> LLVM within OCaml, because you're using a quite different data
>> representation.
>
> My point was that HLVM's data representation is far more flexible than
> ocamlopt's, so you can use it to measure the effect of changing data
> representations on the efficiency of the code LLVM generates more easily
> than with ocamlopt.
>
> In particular, I am saying that (from my own measurements) LLVM does not
> cope with data representations like ocamlopt's at all well, specifically
> when you box and cast away type information. My guess is that retargeting
> ocamlopt to LLVM IR (which is a considerable undertaking due to the amount
> of hacking required on LLVM itself) would result in more performance
> degradation than improvement. Moreover, that degradation is most likely to
> occur on code like Coq, because ocamlopt has been so heavily optimized for
> that one application domain, and consequently I doubt your hard work would
> ever be accepted into mainstream OCaml.
>
> Ultimately, LLVM was built specifically to exploit a typed intermediate
> representation, whereas ocamlopt removes type information very early.

From what I've seen so far, LLVM makes little use of that type information;
in particular, integer and pointer types are treated nearly the same, so
LLVM won't benefit from it. Any other data is still distinguished at the
Cmm level (i.e. doubles, strings, ...), which is necessary for the register
allocation and instruction selection phases in ocamlopt. You could easily
change the OCaml compiler's intermediate representations (lambda, ulambda,
cmm) to include additional type information, so that is not the problem.
But as said, it won't help a lot with LLVM.
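For readers who have not looked at the runtime, here is a minimal C sketch
of the uniform value representation being discussed; it is modeled on the
macros in the OCaml runtime's caml/mlvalues.h, but simplified for
illustration (it is not the actual header):

    /* One machine word per OCaml value: either a tagged immediate integer
       (low bit set) or a pointer to a heap block (low bit clear). */
    #include <stdint.h>
    #include <stdio.h>

    typedef intptr_t value;

    #define Val_long(x)  (((value)(x) << 1) | 1)  /* n is stored as 2n+1   */
    #define Long_val(v)  ((v) >> 1)
    #define Is_long(v)   (((v) & 1) != 0)         /* immediate integer     */
    #define Is_block(v)  (((v) & 1) == 0)         /* pointer into the heap */

    int main(void)
    {
      value a = Val_long(21), b = Val_long(2);
      /* (2a+1) + (2b+1) - 1 == 2(a+b)+1: tagged addition costs one extra
         subtraction, the "subq $1" discussed further down in this thread. */
      value sum = a + b - 1;
      printf("%ld\n", (long)Long_val(sum));       /* prints 23 */
      return 0;
    }

This single uniform word is what a hypothetical ocamlopt-to-LLVM backend
would have to describe to LLVM, which is the "box and cast away type
information" situation mentioned above.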
As for the GHC work you mentioned: they also start at the Cmm level (a
slightly different C--, but comparable to ocamlopt's), with mostly the same
amount of type information available. And as you said, they did quite well
in some cases.

>> Using a different data representation within OCaml would indeed be
>> possible, but one has to ask what it's worth? You'd get better floating
>> point performance (most likely)
>
> And faster tuples, ints, chars, complex numbers, low-dimensional
> vectors/matrices, hash tables and so on. More types (e.g. int16 and
> float32). Even arbitrary-precision arithmetic can benefit by keeping small
> numbers unboxed when possible. Bigarrays disappear. The FFI gets simpler,
> easier to use and more powerful (e.g. you can generate AoS layouts for
> OpenGL). The benefits are enormous in the general case but that is beside
> the point here. Moreover, this would never be accepted either because it
> would degrade the performance of applications like Coq. If it is done at
> all, such work must be kept separate from the OCaml compilers.

ocamlopt already supports float32 and int16, though they are not exposed to
the upper layers (I don't know why, but then I also don't know why one
would need them). Keeping Int32/Int64 values unboxed would be possible,
similar to what is done with doubles; maybe there is simply no need to do
so (honestly, I've never needed Int32/Int64 so far; int/Nativeint is
usually what you want, and those are optimized).

>> 3. The current garbage collector is mostly straight-forward because of
>> the data representation. No need to carry around/load type information,
>
> Note that conveying run-time type information also facilitates useful
> features like generic printing.

Generic printing is currently implemented using the run-time type
information (the three bits of information mentioned below).

>> you just need the following bits of information: Is it a block in the
>> heap? Does it contain pointers? If so, how many? All this information is
>> immediately available with the current data representation (also
>> improves cache locality of the GC loops).
>
> That information is also immediately available with HLVM's design. Each
> value contains a pointer to its run-time type. Each run-time type contains
> a pointer to a "visit" function for the GC that returns the references in
> any value of that type.

This is indeed possible, yes, and it is what most Java VMs implement: each
object ships a type-info pointer. OCaml avoids the type-info pointer by
using clever header field encoding. A matter of taste, probably.
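To make the "clever header field encoding" concrete: the questions above
("does it contain pointers, and how many?") are answered from the one-word
header that precedes every heap block. The following is a simplified C
sketch modeled on the 64-bit layout in caml/mlvalues.h; the macro names
mirror the real ones, but the sketch is illustrative, not the actual
runtime code:

    /* Block header, 64-bit layout: [ wosize | 2-bit GC color | 8-bit tag ] */
    #include <stdint.h>
    #include <stdio.h>

    typedef uintptr_t header_t;

    #define No_scan_tag   251           /* strings, float arrays, custom blocks, ... */
    #define Tag_hd(hd)    ((unsigned)((hd) & 0xFF))
    #define Color_hd(hd)  ((unsigned)(((hd) >> 8) & 3))
    #define Wosize_hd(hd) ((hd) >> 10)  /* number of fields, in words */

    /* Everything the GC loops need, with no per-object type-info pointer. */
    static void describe(header_t hd)
    {
      printf("size = %lu words, tag = %u: %s\n",
             (unsigned long)Wosize_hd(hd), Tag_hd(hd),
             Tag_hd(hd) < No_scan_tag ? "fields may be pointers, scan them"
                                      : "no pointers inside, skip");
    }

    int main(void)
    {
      describe(((header_t)3 << 10) | 0);    /* e.g. a 3-field tuple/record */
      describe(((header_t)4 << 10) | 252);  /* e.g. a string of 4 words    */
      return 0;
    }

Whether a word is a block at all is decided by the tag bit shown in the
earlier sketch, so the header plus the tag bit cover exactly the three
pieces of information listed above.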
>> So, is it really worth spending years on a new data representation
>> (including fixing all C bindings, etc.)? Just to get better floating
>> point performance? Integer performance will be almost the same:
>> avoiding the shift/tag bit just replaces an "addq r1, r2; subq $1, r2"
>> with "addq r1, r2"; even doing this thousands of times will not cause a
>> noticeable difference.
>
> On the contrary, there are many examples of huge performance gains other
> than floating point. The int-based random number generator from the
> SciMark2 benchmark is 6x faster with LLVM than with OCaml. Generic hash
> tables are 17x faster in F# than Java because of value types. And so on.

I doubt that this is really related to the use of "value types", and I also
doubt that the integer performance improvements are related to the missing
"subq" (that must have been an unusual CPU, then). The Java performance is
probably related to the GC in most JVMs, which is not as well optimized for
high-speed allocation and compaction as the GCs found in the runtimes of
most functional programming languages. But this is just guessing; we'd need
some facts to be sure of the cause. I suggest you present some
proof-of-concept code in C or LLVM, simply using malloc() for memory
allocation, comparing a straight-forward implementation to a "value type"
implementation (with everything else the same). If there is still a 17x
improvement, I promise to rewrite the whole OCaml system during the next
three months to support value types.

> Cheers,
> Jon.

greets,
Benedikt
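For reference, here is a minimal C sketch of the kind of comparison
proposed above: the same computation over a boxed layout (an array of
pointers to individually malloc'd cells, roughly what ocamlopt does for
tuples of floats) and a flat "value type" layout (one contiguous array of
structs). It is a hypothetical starting point, not code from this thread;
timing, warm-up and allocator effects are left to whoever runs it:

    #include <stdlib.h>
    #include <stdio.h>

    #define N 1000000

    typedef struct { double re, im; } complex_t;

    int main(void)
    {
      /* boxed: one malloc per element, an extra indirection on every access */
      complex_t **boxed = malloc(N * sizeof *boxed);
      /* "value type": a single contiguous allocation, no indirection */
      complex_t *flat = malloc(N * sizeof *flat);

      for (size_t i = 0; i < N; i++) {
        boxed[i] = malloc(sizeof *boxed[i]);
        boxed[i]->re = flat[i].re = (double)i;
        boxed[i]->im = flat[i].im = 1.0;
      }

      /* identical computation over both layouts; time the two loops */
      double s1 = 0.0, s2 = 0.0;
      for (size_t i = 0; i < N; i++) s1 += boxed[i]->re * boxed[i]->im;
      for (size_t i = 0; i < N; i++) s2 += flat[i].re * flat[i].im;
      printf("%f %f\n", s1, s2);

      for (size_t i = 0; i < N; i++) free(boxed[i]);
      free(boxed);
      free(flat);
      return 0;
    }

Whatever difference such a test shows between the two loops is attributable
to the layout alone rather than to the GC, which is the separation of
causes argued for above.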