[Caml-list] calling native code from bytecode?

caml-list - the Caml user's mailing list

* [Caml-list] calling native code from bytecode?
From: Chris Hecker @ 2001-08-01 23:45 UTC
  To: caml-list

Is there any way to compile part of a project in bytecode and another
part with the native compiler and link them?  It seems odd that you
can call C from bytecode but not other caml code.  The gc and
everything is the same between the asm and bytecode runtimes, no?  Are
datastructures in memory (except code, of course) compatible?

Basically, I've got some numerical code that I'd like to compile to
native code for performance, but I'd like to keep most of the
non-performance stuff in bytecode so I can use the toplevel and
whatnot.  I suppose I could do some sort of heinous bytecode -> C ->
native code shim, but it seems like this could "just work".

Obviously, the holy grail would be complete intermingling of bytecode
and native code, and the linker just figures it out and does the right
thing.  That would rock.  But, I'd settle for bytecode -> native calls
only at this point.

Thoughts?

Chris

-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr

* Re: [Caml-list] calling native code from bytecode?
From: Xavier Leroy @ 2001-08-03 12:03 UTC
  To: Chris Hecker; +Cc: caml-list

> Is there any way to compile part of a project in bytecode and another
> part with the native compiler and link them?  It seems odd that you
> can call C from bytecode but not other caml code.  The gc and
> everything is the same between the asm and bytecode runtimes, no?  Are
> datastructures in memory (except code, of course) compatible?

Yes, they are compatible, except function closures, which contain
native code pointers for ocamlopt and byte-code pointers for ocamlc.
But that's where the problem is for mixed-mode execution: treating
pointers to bytecode and pointers to native-code differently.  

One solution would be to have two code pointers per closure, one
bytecode and one native-code.  For a bytecode closure, the native-code
pointer would point to the bytecode interpreter.  For a native-code
closure, the bytecode pointer would point to a special "switch mode"
instruction of the virtual machine.  But that's far from easy to
implement.
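
A minimal sketch of the two-pointer idea, written as a toy OCaml model
rather than as runtime code.  Real closures are heap blocks handled by
the C runtime; the type and function names below (mode, closure, call,
make_bytecode_closure, make_native_closure) are hypothetical and only
illustrate how each closure could carry one entry point per mode.

```ocaml
(* Toy model only: all names are hypothetical, not part of the OCaml runtime. *)

(* The execution mode of the code performing the call. *)
type mode = Bytecode | Native

(* A closure carrying one entry point per mode.  Each entry returns a
   label describing the path taken, plus the result. *)
type closure = {
  bytecode_entry : int -> string * int;  (* used by bytecode callers *)
  native_entry : int -> string * int;    (* used by native-code callers *)
}

(* Every call goes through the entry that matches the caller's mode. *)
let call mode c arg =
  match mode with
  | Bytecode -> c.bytecode_entry arg
  | Native -> c.native_entry arg

(* A function compiled to bytecode: its native entry re-enters the
   bytecode interpreter. *)
let make_bytecode_closure body = {
  bytecode_entry = (fun x -> ("interpreted", body x));
  native_entry = (fun x -> ("native stub -> interpreter", body x));
}

(* A function compiled to native code: its bytecode entry stands for the
   special "switch mode" instruction. *)
let make_native_closure body = {
  bytecode_entry = (fun x -> ("switch-mode -> native", body x));
  native_entry = (fun x -> ("native", body x));
}
```

In this model the caller never inspects the closure; it only picks the
field for its own mode, which is what makes the scheme symmetric.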

Another approach is Fabrice Le Fessant's asmdynlink library, which
basically is a bytecode interpreter written in Caml and compiled with
ocamlopt.  This gives native-code programs the ability to execute
bytecode, albeit at a fairly large cost in execution speed.

> Basically, I've got some numerical code that I'd like to compile to
> native code for performance, but I'd like to keep most of the
> non-performance stuff in bytecode so I can use the toplevel and
> whatnot.  I suppose I could do some sort of heinous bytecode -> C ->
> native code shim

Besides problems with potential callbacks from native-code to
bytecode, there are also (non-essential, but intricate) GC issues that
would come in the way.

> Obviously, the holy grail would be complete intermingling of bytecode
> and native code, and the linker just figures it out and does the right
> thing.  That would rock.  But, I'd settle for bytecode -> native calls
> only at this point.
> Thoughts?

Nothing is impossible, but I shudder at the idea of implementing all
this.

- Xavier Leroy

* Re: [Caml-list] calling native code from bytecode?
From: Chris Hecker @ 2001-08-03 14:31 UTC
  To: Xavier Leroy; +Cc: caml-list

Just a couple of clarifying questions (in case I decide to try to hack a prototype of this into the compiler):

>One solution would be to have two code pointers per closure, one
>bytecode and one native-code.

I think, if I were just trying to do leaf functions (ones that don't take closures as parameters) in native code to start with, I'd just add another bytecode instruction for native calls.  Having two pointers does seem better for the complete solution, but I think you're right that it would be hard to implement.  It seems like an OC_CALL* instruction could mostly mirror the implementation of C_CALL* up to the point of call, because currying and whatnot would work the same way.

So, just to be excruciatingly clear, assuming there are no closures in any of the parameters to a function call, the "values" passed in and any native code and GC operations on those values are completely compatible?  Even for arbitrarily complex datastructures, as long as no closures are in the mix?

If I constrained the problem to just leaf functions (where leaf is defined as never calling back into bytecode) then it would work, runtime library-wise?

I wonder whether there would be issues with the standard library functions that are implemented in C.  I'd assume they'd link, but would there be synchronization issues?

I assume threads would be a mess, too, so I'd disable them as well.

Since bytecode and native code both use .cmi files, I assume module structure and signature layouts are compatible?

>> I suppose I could do some sort of heinous bytecode -> C ->
>> native code shim
>Besides problems with potential callbacks from native-code to
>bytecode, there are also (non-essential, but intricate) GC issues that
>would come in the way.

You mean for the shim solution in that sentence, not if I was just passing values straight to the native code?

Thanks,
Chris


* Re: [Caml-list] calling native code from bytecode?
From: Xavier Leroy @ 2001-08-08  8:46 UTC
  To: Chris Hecker; +Cc: caml-list

> So, just to be excruciatingly clear, assuming there are no closures
> in any of the parameters to a function call, the "values" passed in
> and any native code and GC operations on those values are completely
> compatible?  Even for arbitrarily complex datastructures, as long as
> no closures are in the mix?

As far as I can remember, yes.

> If I constrained the problem to just leaf functions (where leaf is
> defined as never calling back into bytecode) then it would work,
> runtime library-wise?

No :-)  From the standpoint of the runtime system, Caml-generated
native code and the bytecode interpreter differ in several points:

- the location of GC roots (e.g. native stack vs. bytecode interpreter
  stack);
- how to raise exceptions from C code;
- how to call back from C to Caml.

To handle these differences, the runtime system comes in two variants
(libcamlrun.a and libasmrun.a), with suitable #ifdefs, different
definitions of some runtime functions, etc.  Linking with only one
version of the runtime system (e.g. libcamlrun.a) will result in wrong
behavior for mixed-mode code, e.g. the GC will overlook memory roots
residing in the native code stack.  Linking with both versions is not
possible, as they define the same function names differently...

(It might be possible to play linker tricks so that there are actually
two copies of the runtime system in the executable, but then you'd
have two different Caml heaps, one for the bytecode system and one for
the native code, and you'd need to copy all data structures when
switching between bytecode and native code.  Quite messy.)

My advice is: don't do it.  If all you need is to have the
efficiency of native code and the debugging comfort of bytecode, just
compile all your sources twice, to native-code and to bytecode.  For
dynamic loading of bytecode in a native-code application, Fabrice Le
Fessant's asmdynlink library (or something similar) should suffice.
And I can't see any other reason why you'd want mixed-mode execution.
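
A small sketch of the "compile everything twice" approach: one source
file, built once with ocamlc and once with ocamlopt, that reports which
backend it is running on.  Sys.backend_type is part of the standard
library only since OCaml 4.04, so it was not available when this was
written; the dot function is a hypothetical stand-in for a numerical
kernel.

```ocaml
(* Assumes OCaml >= 4.04 for Sys.backend_type. *)
let backend_name () =
  match Sys.backend_type with
  | Sys.Native -> "native"
  | Sys.Bytecode -> "bytecode"
  | Sys.Other name -> name

(* Hypothetical numerical kernel, shared unchanged by both builds. *)
let dot a b =
  let n = min (Array.length a) (Array.length b) in
  let acc = ref 0.0 in
  for i = 0 to n - 1 do
    acc := !acc +. a.(i) *. b.(i)
  done;
  !acc

let () =
  Printf.printf "backend=%s dot=%g\n"
    (backend_name ()) (dot [| 1.; 2. |] [| 3.; 4. |])
```

Assuming the file is saved as prog.ml, "ocamlc -o prog.byte prog.ml"
gives the bytecode build and "ocamlopt -o prog.opt prog.ml" the native
one; both run the same dot code and differ only in the reported backend.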

- Xavier Leroy

* Re: [Caml-list] calling native code from bytecode?
From: Chris Hecker @ 2001-08-08 18:34 UTC
  To: Xavier Leroy; +Cc: caml-list

>To handle these differences, the runtime system comes in two variants
>(libcamlrun.a and libasmrun.a), with suitable #ifdefs, different
>definitions of some runtime functions, etc.

Oh.  Bummer.

> My advice is: don't do it.  If all you need is to have the
>efficiency of native code and the debugging comfort of bytecode, just
>compile all your sources twice, to native-code and to bytecode.

It's not quite that simple.  I'm working on a video game, which is by nature a realtime simulation, and its behavior changes depending on how long frames take to compute.  So, if I can't optimize enough stuff in the bytecode version to keep it playable, I'm just going to have to switch to native code and give up on bytecode completely (giving up "debugging comfort" and the toplevel, which I've grown fond of).

The shame is that a few specific pieces of leaf code are taking the time, and since linking native code into bytecode is turning out not to be possible, they're going to force my entire project into native code.

If I get a complex bug that won't reproduce in the bytecode version because of the time-dependence and feedback, then I'm screwed.  Well, I'd be stuck debugging with printfs, which is actually what I'm doing now (plus #trace in the toplevel, which is very useful), but I was planning on getting the bytecode debugger to work under Windows as soon as I ran into something truly heinous.  That won't be an option if I have to switch to native code.

It seems like the right thing to do from an engineering standpoint is to rewrite these leaf functions in C because that will allow me to continue with bytecode and native code (plus I can debug the C trivially).  However, my entire "experiment" was to see if I could develop a commercial quality game in ocaml without dropping to C very often.  I find it slightly ironic that ocaml's environment is in some ways more friendly towards C than other ocaml code.
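
A minimal sketch of what the C route looks like from the OCaml side.  An external declaration binds an OCaml name to a C symbol, and the same declaration works with both ocamlc and ocamlopt.  To stay self-contained, this example binds to caml_sqrt_float, a primitive the runtime already provides for the standard library's sqrt; that choice is an assumption made for illustration, and a real leaf function would name a hand-written stub instead.  The names c_sqrt and norm2 are hypothetical.

```ocaml
(* Bind an OCaml name to a C function.  caml_sqrt_float ships with the
   runtime; it stands in here for a user-written C stub. *)
external c_sqrt : float -> float = "caml_sqrt_float"

(* Hypothetical leaf function whose hot part runs in C. *)
let norm2 x y = c_sqrt (x *. x +. y *. y)

let () = Printf.printf "norm2 3 4 = %g\n" (norm2 3.0 4.0)
```

Because the call goes through the C interface, the caller does not care whether it was compiled to bytecode or to native code, which is the asymmetry noted above.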

It does sound like I'm the only person who's ever requested this, so maybe others don't run into this problem.  

If I am actually insane enough to try to make this work correctly (fixing the closure problem and the runtime problem), would this be something that could make it into the distribution?

Chris


* Re: [Caml-list] calling native code from bytecode?
From: Florian Hars @ 2001-08-09  6:57 UTC
  To: caml-list

On Wed, Aug 08, 2001 at 10:46:15AM +0200, Xavier Leroy wrote:
> My advice is: don't do it.  If all you need is to have the
> efficiency of native code and the debugging comfort of bytecode, just
> compile all your sources twice, to native-code and to bytecode.

Or debug by inserting print statements in the native code. This is
what I did, once the first part of the code worked (which takes about
30 to 40 seconds if compiled natively). Waiting five minutes to debug
every new function is no fun...

Yours, Florian Hars. 

* Re: [Caml-list] calling native code from bytecode?
From: Fabrice Le Fessant @ 2001-08-14 13:34 UTC
  To: Xavier Leroy; +Cc: Chris Hecker, caml-list

A few years ago :), I implemented the mixing of bytecode and native
code in JoCaml. The bytecode interpreter included in native code
runtimes was modified so that bytecode functions were wrapped inside
native code functions, using a special tag. This tag was checked
before executing a closure, to either call a wrapper for real native
code functions, or to find the bytecode pointer in the closure for
bytecode functions. In native code, directly calling the wrapped
function started a new interpreter to execute its bytecode.

For exceptions, a flag is used to know whether the first exception
handler is executing native code or bytecode. Depending on this flag,
the exception is raised in bytecode or native style.
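
A toy sketch of the tag check described above, as a plain OCaml model
and not the actual JoCaml code.  The three-instruction stack machine and
the names instr, closure, interpret and apply are hypothetical; the
point is only that the tag is inspected before a closure is executed,
and that a wrapped bytecode function starts a fresh interpreter.

```ocaml
(* Toy model only: all names are hypothetical, not JoCaml's code. *)

(* A tiny stack-machine instruction set standing in for bytecode. *)
type instr = Push of int | Add | Mul

(* The constructor plays the role of the special tag. *)
type closure =
  | Native_fn of (int -> int)        (* real native code function *)
  | Wrapped_bytecode of instr list   (* bytecode wrapped as a closure *)

(* Start a new interpreter: the argument is the initial stack content,
   and the top of the stack at the end is the result. *)
let interpret code arg =
  let step stack instr =
    match instr, stack with
    | Push n, _ -> n :: stack
    | Add, a :: b :: rest -> (a + b) :: rest
    | Mul, a :: b :: rest -> (a * b) :: rest
    | (Add | Mul), _ -> failwith "stack underflow"
  in
  match List.fold_left step [arg] code with
  | result :: _ -> result
  | [] -> failwith "empty stack"

(* Check the tag before executing the closure. *)
let apply closure arg =
  match closure with
  | Native_fn f -> f arg
  | Wrapped_bytecode code -> interpret code arg
```

Both apply (Native_fn (fun x -> x * 2 + 1)) 20 and
apply (Wrapped_bytecode [Push 2; Mul; Push 1; Add]) 20 compute the same
function, so callers need not know which kind of closure they hold.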

I did not measure the cost of this, but I noticed a slowdown. I don't
know if there is a better solution. One of my (unfortunately long list
of) projects is to improve asmdynlink to produce JIT native code for
bytecode ... maybe next year!

- Fabrice
