From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 0BC4BBC57 for ; Tue, 30 Nov 2010 23:48:23 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtkAAPsN9UzRVdc0k2dsb2JhbACjBQgVAQEBAQkJCgkRAx+pZYt8AQWOJAEEghODNA X-IronPort-AV: E=Sophos;i="4.59,282,1288566000"; d="scan'208";a="68884791" Received: from mail-ew0-f52.google.com ([209.85.215.52]) by mail3-smtp-sop.national.inria.fr with ESMTP; 30 Nov 2010 23:48:22 +0100 Received: by ewy23 with SMTP id 23so2962771ewy.39 for ; Tue, 30 Nov 2010 14:48:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:received:received:content-type:mime-version :subject:from:in-reply-to:date:content-transfer-encoding:message-id :references:to:x-mailer; bh=TmFDmiENXcpFUFSJikZ2ZUlE2o6xPp88OB99VuHqGK4=; b=DsUFZkW07xj9jmy0YeOVLfICxrBqd49YCbk+tu9B+q3/v3Jip9zIGrQzMYQ2MIbUh+ eRy3BveXQOBg00fyivcG18DE0tpe9ypkpvRwG/BZcJirVt+l1mL/AlYGlb2AryZeRah0 Oj5M6D4KNENU7v8Alu1bwquJ2WkEygUKEhTkg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=content-type:mime-version:subject:from:in-reply-to:date :content-transfer-encoding:message-id:references:to:x-mailer; b=tvY/eFz5yivmjlnkAzE+n2uvfF6XJz9OW0rJVvB2B3Z0BKHSyC0XMoujOt77a9EDiH MsXmmxjgT7u/xyBWc/f7bRgjnkmf6BjR5unNj0rCOWE4ZeIRl/59ppwoMD9vcvfJbauw oMahRTkfAKHv3PDnwYvWTd84ha5huhO3MuVLs= Received: by 10.14.127.2 with SMTP id c2mr6615912eei.3.1291157302243; Tue, 30 Nov 2010 14:48:22 -0800 (PST) Received: from bespin.kosmos.all (ip-95-223-170-32.unitymediagroup.de [95.223.170.32]) by mx.google.com with ESMTPS id q58sm6821658eeh.21.2010.11.30.14.48.21 (version=SSLv3 cipher=RC4-MD5); Tue, 30 Nov 2010 14:48:21 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Apple Message framework v1081) Subject: Re: [Caml-list] OCamlJIT2 vs. OCamlJIT From: Benedikt Meurer In-Reply-To: <0a8b01cb90da$da5e6240$8f1b26c0$@com> Date: Tue, 30 Nov 2010 23:48:19 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <5E2DA3F1-7998-4F62-B617-7B6451D1001D@googlemail.com> References: <3DCEA910-1382-47E5-876B-059178F8F82E@googlemail.com> <20101130124803.7952fca1@deb0> <0a8b01cb90da$da5e6240$8f1b26c0$@com> To: caml-list@yquem.inria.fr X-Mailer: Apple Mail (2.1081) X-Spam: no; 0.00; ocamlopt:01 compilation:01 ocamlopt:01 compilation:01 ocaml:01 polymorphism:01 trivial:01 polymorphism:01 integers:01 recursive:01 invokes:01 doable:01 ulambda:01 low-level:01 ocaml:01 On Nov 30, 2010, at 23:06 , Jon Harrop wrote: > Because benchmarks like my HLVM ones have proven that LLVM can = generate > *much* faster code than ocamlopt does. >=20 > For example, Fibonacci function over floats in HLVM with optimization = passes > disabled and compilation time included in the measurement: >=20 > # let rec fib (x: float) : float =3D > if x < 1.5 then x else fib(x - 1.0) + fib(x - 2.0);; > # fib 40.0;; > - : `Float =3D 1.02334e+08 > Live: 0 > 2.48074s total; 0s suspend; 0s mark; 0s sweep >=20 > And ocamlopt without compilation time: >=20 > $ cat >fib.ml > let rec fib x =3D if x < 1.5 then x else fib(x -. 1.0) +. fib(x -. = 2.0);; > fib 40.0;; > $ ocamlopt fib.ml -o fib > $ time ./fib >=20 > real 0m7.811s > user 0m7.808s > sys 0m0.000s >=20 > Note that HLVM's *REPL* is over 3x faster than ocamlopt-compiled = OCaml. This has nothing to do with LLVM, but is simply due to the fact that = your code does not box the float parameters/results. The following peace = of C code is most probably even faster than your code, so what? double fib(double x) { if (x < 1.5) return x else return fib(x-1) + = fib(x-2); } So this is about data representation not code generation (using LLVM = with boxed floats would result in same/lower performance); HLVM ignores = complex stuff like polymorphism, modules, etc. (at least last time I = checked), which makes code generation almost trivial. The literature = includes various solutions to implement stuff like ML polymorphism: = tagged integers/boxed floats/objects is just one solution, not = necessarily the best; but examples that simply ignore the complex stuff, = and therefore deliver better performance don't really help to make = progress. A possible solution to improve ocamlopt's performance in this case would = be to compile the recursive fib calls in a way that the parameter/result = is passed in a floating point register (not boxed), and provide a = wrapper for calls from outside, which unboxes the parameter, invokes the = actual function code, and boxes the result. This should be doable on the = Ulambda/Cmm level, so it is actually quite portable and completely = independent of the low-level code generation (which is where LLVM comes = into play). That way ocamlopt code will be as fast as the C code for = this example. > I have microbenchmarks where LLVM generates code over 10x faster than > ocamlopt (specifically, floating point code on x86) and larger = numerical > programs that also wipe the floor with OCaml. ocamlopt's x86 floating point backend is easy to beat, as demonstrated = in the original post. Even a simple byte-code JIT engine (which still = boxes floating point results, etc.) is able to beat it. Your benchmarks do not prove that LLVM generates faster code than = ocamlopt. Instead you proved that OCaml's floating point representation = comes at a cost for number crunching applications (which is obvious). = Use the same data representation with LLVM (or C) and you'll notice that = the performance is the same (most likely worse) compared to ocamlopt. > LLVM is also much better documented than ocamlopt's internals. Definitely, but LLVM replaces just the low-level stuff of ocamlopt (most = probably starting at the Cmm level), so a lot of (undocumented) ocamlopt = internals remain. > Cheers, > Jon. regards, Benedikt=