From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA08959; Thu, 8 Jul 2004 23:06:11 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id XAA09089 for ; Thu, 8 Jul 2004 23:06:10 +0200 (MET DST) Received: from outbound28-2.lax.untd.com (outbound28-2.lax.untd.com [64.136.28.160]) by concorde.inria.fr (8.12.10/8.12.10) with SMTP id i68L67SH017179 for ; Thu, 8 Jul 2004 23:06:08 +0200 Received: from outbound28-2.lax.untd.com (smtp01.lax.untd.com [10.130.24.121]) by smtpout05.lax.untd.com with SMTP id AABAQ5P2KATD5EDA for (sender ); Thu, 8 Jul 2004 14:05:13 -0700 (PDT) Received: (qmail 29418 invoked from network); 8 Jul 2004 21:04:47 -0000 Received: from unknown (HELO vangogh) (66.52.239.74) by smtp01.lax.untd.com with SMTP; 8 Jul 2004 21:04:47 -0000 From: "Brandon J. Van Every" To: "caml" Subject: [Caml-list] tail recursion and register poor Intel architecture Date: Thu, 8 Jul 2004 14:14:58 -0700 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook IMO, Build 9.0.6604 (9.0.2911.0) In-Reply-To: <20040708155148.GA21304@fichte.ai.univie.ac.at> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1409 Importance: Normal X-ContentStamp: 17:8:2666295873 X-UNTD-OriginStamp: CI84cOLHFqh7Zd2QWkwvEFvwyO3T/pIsPQZphDk9MRifc1rFBlcbjvzxG/7XTFLW X-Miltered: at concorde with ID 40EDB73F.002 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; brandon:99 recursion:01 baretta:01 ocamlopt:01 tail-call:01 closures:01 recursion:01 closures:01 intel's:01 vectors:01 scalars:01 intel's:01 caml-list:01 bayesian:01 crap:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Markus Mottl wrote: > On Thu, 08 Jul 2004, Alex Baretta wrote: > > Luc Maranget wrote: > > > + ocamlopt does it less often. Namely, calls in tail position > > > become real tail calls when all their arguments are > > > passed in registers. > > > (This does not apply to self-tail calls which are always > > > optimized) > > > > > > What?! Is this true? This effectively means that I cannot count on > > tail-call elimination in general? > > No, you can't. If you happen to use a register-starved > architecture like > IA-32 and call an OCaml-function with more than six > parameters, then you > are out of luck, because you'll have to throw things on the > stack. AFAIK, > the OCaml compiler generates closures on the heap to work around this, > but only in cases where the recursion is obvious (self-tail calls). > Hm, shouldn't be too difficult to extend this to mutual recursion? > Or does OCaml already do this? > > I usually avoid recursive functions with more than six parameters for > performance reasons. I wrap them in closures, which bind constant > parameters, or, if there are too many volatile parameters, I > modify the > volatile parameters in references outside of the function. > Another (less > efficient) solution would be to use tuples to pass volatile > parameters. > > The ultimate solution is, of course: buy a better architecture :-) Or implement more of what Intel's crufty architecture actually offers. For instance, supporting SSE would provide 8 additional 32-bit FPU registers. One doesn't have to use use the XMM registers for vectors, they could be used as scalars, and that's the more straightforward benefit of SSE architecture. SSE2 allows for 64-bit FPU registers and also integer registers, of size 32-bit and 64-bit IIRC (but I haven't much cared about integer code). So, if you were willing to compile for a Pentium III for SSE, or a Pentium4 for SSE2, the optimization problem would be less painful. Of course, you'd have to write the painful support for SSE/SSE2 in the first place. :-) For the record, I hate Intel's architectures. They were talking about Merced when I was writing real code on DEC Alpha. Alpha is dead for marketing reasons, not technical ones. Meanwhile, Itanium is still handwaving. AMD also offers more registers on their newest chips. I'm too fazed to remember the details right now; I do recall twice as many FPU registers. Cheers, www.indiegamedesign.com Brand*n Van Every S*attle, WA Praise Be to the caml-list Bayesian filter! It blesseth my postings, it is evil crap! evil crap! evil crap! ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners