From mboxrd@z Thu Jan 1 00:00:00 1970 X-Sympa-To: caml-list@inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id pB96w7x9029790 for ; Fri, 9 Dec 2011 07:58:07 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AoQJAOCw4U7RRAGzWmdsb2JhbABDmm8BkBkBFgcECwQXJYFyAQEEegUodwYeBCCHbLcpiCODGgSUaQGSJw X-IronPort-AV: E=Sophos;i="4.71,324,1320620400"; d="scan'208";a="134662157" Received: from eeoth.pair.com ([209.68.1.179]) by mail1-smtp-roc.national.inria.fr with SMTP; 09 Dec 2011 07:57:59 +0100 Received: (qmail 94307 invoked by uid 3366); 9 Dec 2011 06:57:58 -0000 Date: 9 Dec 2011 06:57:58 -0000 Message-ID: <20111209065758.94306.qmail@eeoth.pair.com> From: oleg@okmij.org To: caml-list@inria.fr CC: ontologiae@gmail.com X-Validation-by: oleg@okmij.org Subject: [Caml-list] Why NOT to compile OCaml via C Pierre-Alexandre Voye wrote: > Note that if Ocaml compiler would have a C backend, all these problems or > architecture port would disappear... > Ocaml would have more than 30 target[1] > In my Opinion, trying to generate assembler is a bad idea because modern CPU > require a lot of work to generate good assembler. There are many good reasons to avoid C when compiling functional languages, especially strict ones. One often hears that ``C is a portable assembler''. That has never been true. One of the reasons is that every assembler I know has the "jmp" instruction, which, without affecting SP, transfers control anywhere, out of a procedure or in the middle of a procedure, out of a module or into a module. C is built around the stack discipline -- after all, C is a descendant of Algol 60. (Although C has labels, they are limited, even in GCC). Although Algol-60 researchers quickly recognized the value of tail recursion, all that knowledge was lost in the Dark Ages. In strict functional languages, tail-recursion is a big deal. There is *huge* amount of literature on emulating tail-recursion in standard C or GCC. That alone should tell that there is no simple solution. When GHC (Glasgow Haskell Compiler) was emitting C, it was targeting specifically GCC. Furthermore, it employed a Perl program, aptly named `evil mangler', to post-process the assembly code generated by GCC. One can only shudder when thinking about Simon Marlow's maintaining it. There are at least two other reasons ocamlopt emitting assembly. One is garbage collection. OCaml GC is precise. Therefore, when sweeping through the stack, GC has to know if 0x80aa4000 is an unboxed integer or a heap pointer. So-called frame tables generated by the compiler tell GC which stack slots contain OCaml values. GC ignores other slots and hence expensive tests it had to do otherwise. One can build frame tables only when one has full control of the generated code and the frame layout. The third reason is exceptions. In OCaml, exceptions are pervasive and should be fast. They are indeed well implemented, via special exception frames and a dedicated exception frame `register' (which is a real register on x64).