From: Kuba Ober
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Odd performance result with HLVM
Date: Mon, 2 Mar 2009 15:09:46 -0500
In-Reply-To: <49ABED07.30800@imag.fr>

> Jon Harrop wrote:
>> There are really two major advantages over the current ocamlopt
>> design and both stem from the use of JIT compilation:
>>
>> . Run-time types allow per-type functions like generic pretty
>> printers and comparison.
>>
>> . Monomorphisation during JIT compilation completely removes the
>> performance cost of polymorphism, e.g. floats, tuples and records
>> are never boxed.
>
> Do you mean that each polymorphic function is compiled into a
> different native piece of code each time it is called with different
> parameter types? How does the JIT'ed code size compare to ocamlopt'ed
> code size?

I have done this, although in a plain old whole-project compiler rather
than a JIT, and for my use cases the code size actually shrinks. The
functions usually end up inlined and sometimes reduce to a few machine
instructions.

Most of the runtime library is written using polymorphic functions.
Case in point: all sorts of string-processing functions that can take
as arguments strings stored either in RAM or in ROM; those data types
are very much orthogonal on my platform. An invocation of a
tail-recursive "strlen" reduces to about as many bytes of code as it
would take to push the arguments on the stack and call a
non-polymorphic version of itself.

That's how I initially got a statically typed LISP to compile for
"tiny" 8-bit microcontrollers without using all of the whopping 1 KB of
RAM and 16 KB of program flash on a Z8F162x device.

Right now I'm hacking away to get rid of the last traces of LISPiness
and to get the project fully working in OCaml, using ML-like syntax for
user code. I like it much better than LISP's.

I have also found that by doing whole-project compilation with
aggressive constant propagation and compile-time execution of functions
that depend only on known constants, I could get rid of about 85% of
the LISP macros in my code. The other macros ended up being rewritten
to just invoke ct_eval: string -> function, which is a compile-time
eval function.
It's just like LISP macros, but since in the ML family code isn't data,
it was easier to just generate strings and feed them into the compiler,
rather than expose all of the AST machinery to "userland".

Cheers, Kuba
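
To make the RAM/ROM "strlen" case above a bit more concrete, here is a
minimal sketch in standard OCaml. It is not the code from the compiler
described in this message. The names STORE, Ram, Rom and Strlen are
hypothetical, and Bytes.t and string merely stand in for the two
unrelated storage types. The functor only models the idea of one
specialised instance per storage type; whether a given compiler actually
inlines each instance down to a few instructions is a separate matter.

```ocaml
(* Hypothetical interface: anything we can read a byte from by offset. *)
module type STORE = sig
  type t
  val get : t -> int -> char
end

(* Stand-in for strings held in RAM (mutable bytes). *)
module Ram : STORE with type t = Bytes.t = struct
  type t = Bytes.t
  let get = Bytes.get
end

(* Stand-in for strings held in ROM (immutable strings). *)
module Rom : STORE with type t = string = struct
  type t = string
  let get = String.get
end

(* One generic, tail-recursive strlen over NUL-terminated data.
   Assumes the data contains a '\000' terminator; otherwise the
   underlying get raises Invalid_argument at the end of the buffer. *)
module Strlen (S : STORE) = struct
  let strlen (s : S.t) : int =
    let rec go n = if S.get s n = '\000' then n else go (n + 1) in
    go 0
end

(* Each application yields a separate instance for one storage type. *)
module Ram_strlen = Strlen (Ram)
module Rom_strlen = Strlen (Rom)

let () =
  Printf.printf "%d %d\n"
    (Rom_strlen.strlen "hello\000")
    (Ram_strlen.strlen (Bytes.of_string "hi\000junk"))
```

The two instances share a single source definition, so the library stays
polymorphic at the source level while each call site sees a concrete
type.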