From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id QAA04217 for caml-red; Mon, 11 Dec 2000 16:29:28 +0100 (MET)
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id LAA22249 for <caml-list@pauillac.inria.fr>; Fri, 8 Dec 2000 11:50:08 +0100 (MET)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by concorde.inria.fr (8.11.1/8.10.0) with ESMTP id eB8Ao5b08107;
	Fri, 8 Dec 2000 11:50:05 +0100 (MET)
Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id LAA04651; Fri, 8 Dec 2000 11:50:05 +0100 (MET)
Date: Fri, 8 Dec 2000 11:50:05 +0100
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: Joe Lisp <joelisp@yahoo.com>
Cc: caml-list@inria.fr
Subject: Re: callcc/cps-style programming
Message-ID: <20001208115005.B27811@pauillac.inria.fr>
References: <20001206201313.28498.qmail@web9203.mail.yahoo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <20001206201313.28498.qmail@web9203.mail.yahoo.com>; from joelisp@yahoo.com on Wed, Dec 06, 2000 at 12:13:13PM -0800
Sender: weis@pauillac.inria.fr

> Is anyone working on callcc for OCaml?

No one that I know.  As Basile said, the problem is that the OCaml
system is resolutely stack-based, meaning that the only simple
implementation of call/cc is to copy the whole stack both at call/cc
time and when the continuation is invoked.  This is extremely slow and
not really usable.  I wrote a trivial implementation of call/cc along
these lines some time ago for an experiment with Olivier Danvy, but
it was really slow.

Faster implementations techniques for call/cc are known in the Scheme
world, such as split stacks with lazy copying, but this is hard to
implement and (in my opinion) not worth the effort given the limited
usefulness of call/cc.

Yet another direction is limited-extent continuations such as prompts,
for which a stack copying implementation might be fast enough (because
only part of the stack is copied).  Unfortunately, while there are a
bunch of papers on this topic, none gives any hints on how to do an
implementation in a stack-based execution model, and I haven't been
able to figure out the details myself.

> If the answer is no, and I want to use lightweight
> threads, I can program in CPS to make my own
> (non-preemptive) thread system.  Would large programs
> written in CPS incur any particular extra overhead
> from the current OCaml compiler/runtime?

You will incur the "normal" overhead for CPS, that is:
- lots of closure allocations (in the heap) to represent the continuations;
- no branch prediction when a continuation is invoked
  (current hardware predicts direct calls and direct returns very
  efficiently, but doesn't predict indirect jumps such as those you
  get out of invoking a continuation).

I don't think OCaml will add overhead in addition to this.  If you try
it, let us know how it goes.

> An unrelated question: suppose I have some combinators
> or whatnot that I use all the time.  If they are in
> their own file, then I assume I pay a
> cross-compilation module penalty every time I use
> them.  Is there a way to #include them instead?

Actually, there is no cross-module penalty.  The bytecode interpreter
uses the same generic, moderately efficient function call instructions
for inter-module calls and for cross-module calls.  The native-code
compiler optimizes calls to known functions, but is able to do this
equally well for local functions and for functions from other modules
(using cross-module optimization information stored in the .cmx files).
The only exception is functors, whereas calls to functions in the
functor argument are not optimized.

So, you shouldn't have to make your code less modular in order to make
it run fast, except perhaps manually inlining functors.

- Xavier Leroy