From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ober.14@osu.edu>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: *
X-Spam-Status: No, score=1.2 required=5.0 tests=AWL,SPF_NEUTRAL 
	autolearn=disabled version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by yquem.inria.fr (Postfix) with ESMTP id 73FEABBC4
	for <caml-list@yquem.inria.fr>; Mon,  2 Mar 2009 21:09:49 +0100 (CET)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AjIJAKvLq0lDWxLCaWdsb2JhbACBWZMXFAQiBLJvhRaITIQaBg
X-IronPort-AV: E=Sophos;i="4.38,291,1233529200"; 
   d="scan'208";a="21938311"
Received: from ip67-91-18-194.z18-91-67.customer.algx.net (HELO server1.bertec.net) ([67.91.18.194])
  by mail2-smtp-roc.national.inria.fr with ESMTP; 02 Mar 2009 21:09:48 +0100
Received: from kuba-laptop.bertec.net (kuba-laptop.bertec.net [192.168.2.48])
	by server1.bertec.net (Postfix) with ESMTP id D967B10575C
	for <caml-list@yquem.inria.fr>; Mon,  2 Mar 2009 15:09:46 -0500 (EST)
Message-Id: <C1456ED1-E6DD-476B-8B46-C327F7B19CC3@osu.edu>
From: Kuba Ober <ober.14@osu.edu>
To: caml-list@yquem.inria.fr
In-Reply-To: <49ABED07.30800@imag.fr>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Apple Message framework v930.3)
Subject: Re: [Caml-list] Odd performance result with HLVM
Date: Mon, 2 Mar 2009 15:09:46 -0500
References: <200902280112.24115.jon@ffconsultancy.com> <C61E9F1C-36D0-4429-BDFB-6F36A3F67AA5@osu.edu> <200902282152.18430.jon@ffconsultancy.com> <49ABED07.30800@imag.fr>
X-Mailer: Apple Mail (2.930.3)
X-Spam: no; 0.00; ocamlopt:01 compilation:01 run-time:01 compilation:01 polymorphism:01 compiler:01 inlined:01 runtime:01 orthogonal:01 invocation:01 recursive:01 statically:01 ocaml:01 ml-like:01 syntax:01 


> Jon Harrop a =E9crit :
>> There are really two major advantages over the current ocamlopt =20
>> design and both stem from the use of JIT compilation:
>>
>> . Run-time types allow per-type functions like generic pretty =20
>> printers and comparison.
>>
>> . Monomorphisation during JIT compilation completely removes the =20
>> performance cost of polymorphism, e.g. floats, tuples and records =20
>> are never boxed.
>
> Do you mean that each polymorphic function is compiled into a =20
> different
> native piece of code each time it is called with different parameter
> types? How does the JIT'ed code size compare to ocamlopt'ed code size?

Having done it, although not in a JIT but in your plain-old whole-=20
project compiler,
for my use cases the code size actually shrinks. The functions usually =20=

end up inlined
and sometimes reduce to a few machine instructions. Most of the =20
runtime library is written
using polymorphic functions. Case in point: all sorts of string-=20
processing functions which
can take as arguments either strings stored in RAM or stored in ROM, =20
and those data types
are very much orthogonal on my platform. An invocation of a tail-=20
recursive
"strlen" reduces to about as many bytes of code than it'd take to push =20=

the arguments on
the stack and call a non-polymorphic version of itself.

That's how I initially got a statically typed LISP to compile for =20
"tiny" 8 bit microcontrollers
without using all of the whopping 1kb of RAM and 16kb of program flash =20=

on a Z8F162x
device.

Right now I'm hacking away to get rid of last traces of LISPiness and =20=

to get the project fully
working in OCaml, using ML-like syntax for user code. I like it much =20
better than LISP's.

I have also found that by doing whole-project compilation with =20
aggressive constant propagation
and compile-time execution of functions that depend only on known =20
constants, I could get
rid of about 85% of LISP macros in my code. The other macros ended up =20=

being rewritten
to just invoke ct_eval: string -> function, which is a compile-time =20
eval function.
It's just like LISP macros, but since in ML family code isn't data, it =20=

was easier to just
generate strings and feed them into compiler, rather than expose all =20
of the AST machinery
to "userland".

Cheers, Kuba=