From mboxrd@z Thu Jan 1 00:00:00 1970 X-Sympa-To: caml-list@inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id pBA7ZhRe008067 for ; Sat, 10 Dec 2011 08:35:43 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ah0GAAIL407RRAGzWmdsb2JhbABDqxIBHQQLBBclgXIBAQQBeQUWgQkUFCCHbLVJiFODGgSUcAGSJw X-IronPort-AV: E=Sophos;i="4.71,331,1320620400"; d="scan'208";a="134807485" Received: from eeoth.pair.com ([209.68.1.179]) by mail1-smtp-roc.national.inria.fr with SMTP; 10 Dec 2011 08:35:37 +0100 Received: (qmail 48009 invoked by uid 3366); 10 Dec 2011 07:35:36 -0000 Date: 10 Dec 2011 07:35:36 -0000 Message-ID: <20111210073536.48008.qmail@eeoth.pair.com> From: oleg@okmij.org To: goswin-v-b@web.de CC: caml-list@inria.fr In-reply-to: <87aa712v26.fsf@frosties.localnet> X-Validation-by: oleg@okmij.org Subject: Re: [Caml-list] Why NOT to compile OCaml via C > Well, write the code as ONE function and do use lables. Sure, the C > source will be huge for larger projects but then again you get the > single source optimization bonus from gcc. A realistic version of this approach indeed has been used, for example, in Gambit-C Scheme and Stalin Scheme compilers. The approach as suggested above has several drawbacks. First of all, it prevents separate compilation. If we do need separate compilation, we have to be content with several C functions (one per each separately compiled unit), and have to arrange for tail-recursive calls among them. So, we need a fall-back solution. Gambit-C uses trampolining. Second, you probably do not appreciate how huge the produced C functions might become. Past a certain threshold, gcc or other C compilers slow down, almost to a halt. Subjectively, it seems that the quality of optimizations deteriorates past a threshold (although I haven't investigated that). Past another threshold, gcc just bombs. Therefore, Gambit for example had to take measures limiting the size of the generated C functions. > Note: gcc does know something about tail recusion. So it is not > completly lost there. One must really draw the distinction between an optional optimization and a required behavior. GCC indeed _may_ optimize a tail call -- in simple circumstances and when the stars are well-aligned. BTW, implementing tail-recursion is much more complex than it may appear. Consider, for example let rec f x y = if y = 0 then x else f (x+y) (x-y) Here, we can't _just_ jump on the recursive call. We have to mutate the arguments in the current frame, possibly using stack for scratch storage. GCC won't do tail-call optimization in this (and many much simpler) cases. As I said earlier, there really is huge literature on implementing functional languages by compiling into C. Lots of approaches have been suggested and tried -- it is very hard to say anything new here. The fact that we are still having this discussion tells that no solution has been truly satisfactory. BTW, Gambit avoids other pitfalls of C compilation and so does manages precise GC and very efficient call/cc, exceptions and dynamic binding. The reason is that the generated C code is essentially a Gambit Virtual machine interpreter specialized to the source code. It does not share its stack with the C stack. Perhaps Gambit is the first example of the First Futamura Projection that I have seen. It works out quite well in the end.