From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id DAC34BC57 for ; Fri, 19 Nov 2010 11:05:15 +0100 (CET) X-IronPort-AV: E=Sophos;i="4.59,222,1288566000"; d="scan'208";a="88719047" Received: from peermanence.saclay.inria.fr (HELO [195.83.212.205]) ([195.83.212.205]) by mail1-relais-roc.national.inria.fr with ESMTP; 19 Nov 2010 11:05:15 +0100 Message-ID: <4CE64B2B.3050009@inria.fr> Date: Fri, 19 Nov 2010 11:02:19 +0100 From: Fabrice Le Fessant User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Thunderbird/3.0.10 MIME-Version: 1.0 To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Re: Native toplevel? References: <4CE395D4.4000105@frisch.fr> <2025993285.616104.1290144569061.JavaMail.root@zmbs4.inria.fr> In-Reply-To: <2025993285.616104.1290144569061.JavaMail.root@zmbs4.inria.fr> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Spam: no; 0.00; toplevel:01 ocamlopt:01 compiler:01 byte-code:01 compiler:01 ocamlopt:01 ocamlc:01 allocating:01 hotspot:98 wrote:01 graph:01 inline:01 caml-list:01 measurements:01 alain:01 Benedikt Meurer wrote, On 11/19/2010 06:29 AM: > This is indeed very interesting. I haven't thought of the native top-level. I haven't done any measurements yet, but from my experience with ocamlopt, I know that the optimizing native compiler is somewhat slower than the byte-code compiler. I doubt that this is related solely to the external as/ld invocations, so there's certainly more work to do than just replacing the calls to as/ld with an in-process assembler/linker. Indeed, there are some more passes in ocamlopt than in ocamlc, but I think that most of the time is spent in allocating registers and generating the assembler file and calling the assembler. As Alain said, we already have the inline assembler, so the next step would be to improve register allocation, either by spilling more variables to avoid the quadratic behavior of graph coloring, either by using some linear scan algorithm (for example, the one of HotSpot). --Fabrice