caml-list - the Caml user's mailing list
* Using OCaml's run-time from LLVM-generated native code
@ 2008-02-01 21:24 Jon Harrop
  2008-02-03 21:24 ` [Caml-list] " Alain Frisch
  0 siblings, 1 reply; 9+ messages in thread
From: Jon Harrop @ 2008-02-01 21:24 UTC
  To: caml-list


I would like to use OCaml's run-time in code that is run-time generated by 
LLVM as simply as possible (i.e. with no regard for efficiency yet). There 
are two main motivations for this:

1. Write a native-code compiler for MiniML as an educational exercise but 
without having to implement a GC.

2. Write a tool that autogenerates bindings to native-code libraries that are 
immediately usable from OCaml without having to go via C or use external 
tools (just LLVM).

I believe the easiest solution is to autogenerate equivalents of the CAML* 
macros from OCaml's C FFI. For example, the LLVM backend can then generate 
native code equivalent to (from [1]):

CAMLprim value
create_tuple( value a, value b, value c )
{
    CAMLparam3( a, b, c );
    CAMLlocal1( abc );

    abc = caml_alloc(3, 0);

    Store_field( abc, 0, a );
    Store_field( abc, 1, b );
    Store_field( abc, 2, c );

    CAMLreturn( abc );
}

This raises several questions for me:

. Is this even possible?

. Has anyone written any self-contained toy C programs that use OCaml's 
run-time for garbage collection?

. Do I need to worry about what the C program puts on the stack or will this 
take care of itself?

. Has anyone already done this?

For anyone who hasn't already used LLVM and its excellent OCaml bindings, I 
cannot recommend it highly enough: it is a tremendous achievement.

[1] - http://www.linux-nantes.org/~fmonnier/OCaml/ocaml-wrapping-c.php

Many thanks,
-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-01 21:24 Using OCaml's run-time from LLVM-generated native code Jon Harrop
@ 2008-02-03 21:24 ` Alain Frisch
  2008-02-03 23:19   ` Jon Harrop
  0 siblings, 1 reply; 9+ messages in thread
From: Alain Frisch @ 2008-02-03 21:24 UTC
  To: Jon Harrop, caml-list

Jon Harrop wrote:
> This raises several questions for me:
> 
> . Is this even possible?

Yes. How could it not be possible? Run your example through cpp, you'll 
get a ``self-contained'' C program that uses only functions exported 
from OCaml's runtime.

-- Alain



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-03 21:24 ` [Caml-list] " Alain Frisch
@ 2008-02-03 23:19   ` Jon Harrop
  2008-02-04  7:03     ` Alain Frisch
  0 siblings, 1 reply; 9+ messages in thread
From: Jon Harrop @ 2008-02-03 23:19 UTC
  To: Alain Frisch; +Cc: caml-list

On Sunday 03 February 2008 21:24:12 Alain Frisch wrote:
> Jon Harrop wrote:
> > This raises several questions for me:
> >
> > . Is this even possible?
>
> Yes. How could it not be possible? Run your example through cpp, you'll
> get a ``self-contained'' C program that uses only functions exported
> from OCaml's runtime.

How does OCaml's stack walker work with C code, for example? In particular, 
how does it know what is a pointer into the heap from a C stack frame? Must 
it be explicitly disabled?

I assume local variables must be explicitly registered as global roots upon 
entry to each function and unregistered upon exit. If so, what are the 
performance implications of this?

I tried and failed to write such an example myself. Here's "forstr.ml":

let print_stat() =
  let stat = Gc.stat() in
  Printf.printf "%d minor collections\n%!" stat.Gc.minor_collections;
  Printf.printf "%d major collections\n%!" stat.Gc.major_collections;
  Gc.print_stat stdout
let _ = Callback.register "gc_print_stat" print_stat
let _ = Callback.register "gc_full_major" Gc.full_major

Here's "str.c":

#include <stdio.h>
#include <string.h>
#include <caml/mlvalues.h>
#include <caml/alloc.h>
#include <caml/memory.h>
#include <caml/fail.h>
#include <caml/callback.h>
#include <caml/custom.h>
#include <caml/intext.h>

value *full_major, *print_stat;

CAMLprim value fib(value nv) {
  int64 n = Int64_val(nv);
  return (n < 2 ? nv : copy_int64(Int64_val(fib(copy_int64(n-1))) +
                                  Int64_val(fib(copy_int64(n-2)))));
}

int apply(int n) {
  return Int64_val(fib(copy_int64(n)));
}

int main(int argc, char* argv[]) {
  caml_main(argv);
  print_stat = caml_named_value("gc_print_stat");
  full_major = caml_named_value("full_major");
  printf("%d\n", apply(argc == 2 ? atoi(argv[1]) : 10));
  callback(*print_stat, 0);
  callback(*full_major, 0);
  callback(*print_stat, 0);
  return 0;
}

Compile and run with:

$ ocamlopt -dtypes -output-obj forstr.ml -o forstring.o && \
  gcc -Wall -o test forstring.o str.c -L/usr/lib/ocaml/3.10.0 -lasmrun -ldl -lm && \
  time ./test 30
832040
369 minor collections
0 major collections
Segmentation fault

real    0m0.361s
user    0m0.344s
sys     0m0.004s

So this is several times slower than native ocamlopt-generated code, as you 
might expect, but it doesn't work correctly because it segfaults when 
full_major is called to invoke the GC. How can I fix this example?

If I can get simple examples like this working as C code then it should be 
trivial to generate them using LLVM at which point you've got a mediocre 
compiler that can be built upon. I think a lot of people would be interested 
in that...

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-03 23:19   ` Jon Harrop
@ 2008-02-04  7:03     ` Alain Frisch
  2008-02-04 10:32       ` Jon Harrop
  0 siblings, 1 reply; 9+ messages in thread
From: Alain Frisch @ 2008-02-04  7:03 UTC
  To: Jon Harrop; +Cc: caml-list

Jon Harrop wrote:
> How does OCaml's stack walker work with C code, for example? In particular, 
> how does it know what is a pointer into the heap from a C stack frame? Must 
> it be explicitly disabled?

The OCaml runtime does not scan the stack frames corresponding to C 
functions.

Jon, it is somewhat weird that you spend so much time writing about 
forking OCaml and do not take a few minutes to read the source code. The 
macros CAMLparam*, CAMLlocal* are not really that mysterious.
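
The bookkeeping behind those macros can be modelled in a few lines of plain C. The following is a minimal sketch that does not include the OCaml headers: each C frame pushes a small record listing the addresses of its live values onto a linked chain, a collector would follow that chain instead of scanning the C stack, and the frame restores the chain before returning. The names value, roots_block, local_roots, count_roots, inner and outer are illustrative stand-ins chosen for this sketch, not the runtime's own definitions, and the comments mapping steps to CAMLparam, CAMLlocal and CAMLreturn are only an approximation.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;  /* stand-in for the runtime's word-sized `value` type */

/* One record per C frame that holds live values (illustrative layout). */
struct roots_block {
    struct roots_block *next;  /* record pushed by the calling frame */
    int nitems;                /* number of registered locals */
    value *items[5];           /* addresses of those locals */
};

static struct roots_block *local_roots = NULL;  /* head of the chain */

/* What a collector would do: follow the chain, never the raw C stack. */
static int count_roots(void) {
    int n = 0;
    for (struct roots_block *b = local_roots; b != NULL; b = b->next) {
        assert(b->nitems <= 5);
        n += b->nitems;
    }
    return n;
}

static int inner(value x) {
    struct roots_block *saved = local_roots;    /* roughly CAMLparam's job */
    value tmp = x + 1;
    struct roots_block blk = { local_roots, 2, { &x, &tmp } };
    local_roots = &blk;                         /* roughly CAMLlocal's job */

    int seen = count_roots();                   /* pretend a GC runs here */

    local_roots = saved;                        /* roughly CAMLreturn's job */
    return seen;
}

static int outer(value a) {
    struct roots_block *saved = local_roots;
    struct roots_block blk = { local_roots, 1, { &a } };
    local_roots = &blk;

    int seen = inner(a);                        /* inner sees both frames */

    local_roots = saved;
    return seen;
}
```

Calling outer(41) from a frame with no registered roots reports three visible roots (two from inner, one from outer), and the chain is empty again once it returns. A function that pushes a record but skips the final restore would leave the chain pointing into a dead stack frame.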


-- Alain



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-04  7:03     ` Alain Frisch
@ 2008-02-04 10:32       ` Jon Harrop
  2008-02-04 11:11         ` Alain Frisch
  0 siblings, 1 reply; 9+ messages in thread
From: Jon Harrop @ 2008-02-04 10:32 UTC
  To: Alain Frisch; +Cc: caml-list

On Monday 04 February 2008 07:03:17 Alain Frisch wrote:
> Jon Harrop wrote:
> > How does OCaml's stack walker work with C code, for example? In
> > particular, how does it know what is a pointer into the heap from a C
> > stack frame? Must it be explicitly disabled?
>
> The OCaml runtime does not scan the stack frames corresponding to C
> functions.

How does it know which stack frames correspond to C functions?

> Jon, it is somewhat weird that you spend so much time writing about
> forking OCaml and do not take a few minutes to read the source code. The
> macros CAMLparam*, CAMLlocal* are not really that mysterious.

Despite the availability of that code it seems that few people can use it 
correctly and I am one of them.

This seems to work even though it calls full_major aggressively:

#include <stdio.h>
#include <string.h>
#include <caml/mlvalues.h>
#include <caml/alloc.h>
#include <caml/memory.h>
#include <caml/fail.h>
#include <caml/callback.h>
#include <caml/custom.h>
#include <caml/intext.h>

extern value caml_gc_full_major(value v);

CAMLprim value fib(value nv) {
  CAMLparam1(nv);
  CAMLlocal5(a, b, c, d, e);
  int64 n = Int64_val(nv);
  if (n < 2) CAMLreturn(nv);
  a = copy_int64(n-1);
  b = copy_int64(n-2);
  c = fib(a);
  d = fib(b);
  e = copy_int64(Int64_val(c) + Int64_val(d));
  caml_gc_full_major(0);
  CAMLreturn(e);
}

int apply(int n) {
  CAMLlocal2(nv, fibn);
  nv = copy_int64(n);
  fibn = fib(nv);
  caml_gc_full_major(0);
  return Int64_val(fib(nv));
}

int main(int argc, char* argv[]) {
  caml_main(argv);
  printf("%d\n", apply(argc == 2 ? atoi(argv[1]) : 10));
  return 0;
}

Is that correct code?

Rather than messing around with these macros in each and every function it is 
probably easier and more efficient to register a single global root at entry, 
pointing to a shadow stack and push and pop elements to and from that 
directly.
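
A minimal sketch of that shadow-stack idea, in plain C and without the OCaml headers, might look as follows. It assumes a fixed-size array as the shadow stack and stores plain integers where a real embedding would store boxed values; how the array is made visible to the collector (the single registration mentioned above) is deliberately left out. The names value, shadow, shadow_top, shadow_enter, shadow_leave and SHADOW_MAX are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;  /* stand-in for the runtime's `value` type */

#define SHADOW_MAX 1024

/* The one structure a collector would need to know about (assumption:
   it is registered once at start-up, which this sketch does not do). */
static value shadow[SHADOW_MAX];
static size_t shadow_top = 0;

/* Reserve n zeroed slots for the current frame; return the first index. */
static size_t shadow_enter(size_t n) {
    assert(shadow_top + n <= SHADOW_MAX);
    size_t base = shadow_top;
    for (size_t i = 0; i < n; i++)
        shadow[base + i] = 0;
    shadow_top += n;
    return base;
}

/* Release every slot reserved since the matching shadow_enter. */
static void shadow_leave(size_t base) {
    shadow_top = base;
}

/* Fibonacci with its intermediate results kept in shadow-stack slots. */
static long fib(long n) {
    size_t fp = shadow_enter(2);
    long r;
    if (n < 2) {
        r = n;
    } else {
        shadow[fp]     = fib(n - 1);
        shadow[fp + 1] = fib(n - 2);
        r = shadow[fp] + shadow[fp + 1];
    }
    shadow_leave(fp);
    return r;
}
```

Entry and exit cost one bounds check and an index update per frame, and fib(30) yields 832040, matching the output shown earlier in the thread. Whether this is actually cheaper than the macro-based registration would have to be measured.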

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-04 10:32       ` Jon Harrop
@ 2008-02-04 11:11         ` Alain Frisch
  2008-02-04 13:36           ` Jon Harrop
  0 siblings, 1 reply; 9+ messages in thread
From: Alain Frisch @ 2008-02-04 11:11 UTC
  To: Jon Harrop; +Cc: caml-list

Jon Harrop wrote:
> Despite the availability of that code it seems that few people can use it 
> correctly and I am one of them.

What part of memory.h do you fail to understand?

> int apply(int n) {
>   CAMLlocal2(nv, fibn);
>   nv = copy_int64(n);
>   fibn = fib(nv);
>   caml_gc_full_major(0);
>   return Int64_val(fib(nv));
> }
> 
> Is that correct code?

No, this function does not follow the rules of Section 18.5.1.


-- Alain



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-04 11:11         ` Alain Frisch
@ 2008-02-04 13:36           ` Jon Harrop
  2008-02-04 15:20             ` Alain Frisch
  2008-02-05 23:08             ` Florent Monnier
  0 siblings, 2 replies; 9+ messages in thread
From: Jon Harrop @ 2008-02-04 13:36 UTC
  To: Alain Frisch; +Cc: caml-list

On Monday 04 February 2008 11:11:33 Alain Frisch wrote:
> Jon Harrop wrote:
> > Despite the availability of that code it seems that few people can use it
> > correctly and I am one of them.
>
> What part of memory.h do you fail to understand?

That file doesn't even mention the stack walker AFAICT.

> > int apply(int n) {
> >   CAMLlocal2(nv, fibn);
> >   nv = copy_int64(n);
> >   fibn = fib(nv);
> >   caml_gc_full_major(0);
> >   return Int64_val(fib(nv));
> > }
> >
> > Is that correct code?
>
> No, this function does not follow the rules of Section 18.5.1.

Perhaps this does:

int apply(int n) {
  CAMLparam0();
  CAMLlocal2(nv, fibn);
  nv = copy_int64(n);
  fibn = fib(nv);
  caml_gc_full_major(0);
  CAMLreturn(Int64_val(fib(nv)));
}

Is that correct?

Next, this C code is 4x slower than the ocamlopt-generated equivalent. What 
can be done to improve its performance without leaving C?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-04 13:36           ` Jon Harrop
@ 2008-02-04 15:20             ` Alain Frisch
  2008-02-05 23:08             ` Florent Monnier
  1 sibling, 0 replies; 9+ messages in thread
From: Alain Frisch @ 2008-02-04 15:20 UTC
  To: Jon Harrop; +Cc: caml-list

Jon Harrop wrote:
> On Monday 04 February 2008 11:11:33 Alain Frisch wrote:
>> Jon Harrop wrote:
>>> Despite the availability of that code it seems that few people can use it
>>> correctly and I am one of them.
>> What part of memory.h do you fail to understand?
> 
> That file doesn't even mention the stack walker AFAICT.

Nevertheless, the file contains all the info you need to produce GC-safe 
C code. If you want efficient GC-safe code, you need to understand how 
ocamlopt records information about stack frames. It is probably 
impossible to do it with pure portable C code. For LLVM-generated code, 
you should be able to use the GC infrastructure improved by Gordon.

> Perhaps this does:
> 
> int apply(int n) {
>   CAMLparam0();
>   CAMLlocal2(nv, fibn);
>   nv = copy_int64(n);
>   fibn = fib(nv);
>   caml_gc_full_major(0);
>   CAMLreturn(Int64_val(fib(nv)));
> }

CAMLreturnT would be better.
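
The reason is presumably the return type: apply returns an int, whereas CAMLreturn is meant for functions that return a value, and CAMLreturnT takes the C type explicitly. The sketch below mimics that pattern with mock macros in plain C, without the OCaml headers. The result is first evaluated into a temporary of the stated type while the locals are still registered, then the root chain is restored, and only then does the function return. MOCK_PARAM0, MOCK_LOCAL1, MOCK_RETURN_T, live_blocks and the struct layout are hypothetical names for illustration, and the expansion shown is an assumption about the general shape, not a copy of memory.h.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;  /* stand-in for the runtime's `value` type */

struct roots_block { struct roots_block *next; value *item; };
static struct roots_block *local_roots = NULL;

/* Remember the chain as it was on entry to the function. */
#define MOCK_PARAM0() struct roots_block *mock_frame = local_roots

/* Declare one local and push a record holding its address. */
#define MOCK_LOCAL1(x) \
    value x = 0; \
    struct roots_block mock_blk_##x = { local_roots, &x }; \
    local_roots = &mock_blk_##x

/* Evaluate the result with the stated C type, restore the chain, return. */
#define MOCK_RETURN_T(type, expr) \
    do { \
        type mock_result = (expr); \
        local_roots = mock_frame; \
        return mock_result; \
    } while (0)

/* Number of records currently on the chain. */
static int live_blocks(void) {
    int n = 0;
    for (struct roots_block *b = local_roots; b != NULL; b = b->next)
        n++;
    return n;
}

/* A function returning a plain int rather than a `value`. */
static int apply(int n) {
    MOCK_PARAM0();
    MOCK_LOCAL1(nv);
    nv = n;
    assert(live_blocks() >= 1);  /* nv is registered at this point */
    MOCK_RETURN_T(int, (int)(nv * 2));
}
```

With the real headers the corresponding line in apply would read CAMLreturnT(int, Int64_val(fib(nv))).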


-- Alain



* Re: [Caml-list] Using OCaml's run-time from LLVM-generated native code
  2008-02-04 13:36           ` Jon Harrop
  2008-02-04 15:20             ` Alain Frisch
@ 2008-02-05 23:08             ` Florent Monnier
  1 sibling, 0 replies; 9+ messages in thread
From: Florent Monnier @ 2008-02-05 23:08 UTC
  To: caml-list; +Cc: Jon Harrop

Hi,

Jon Harrop wrote:
> int apply(int n) {
>   CAMLparam0();
>   CAMLlocal2(nv, fibn);
>   nv = copy_int64(n);
>   fibn = fib(nv);
>   caml_gc_full_major(0);
>   CAMLreturn(Int64_val(fib(nv)));
> }
>
> Is that correct?
>
> Next, this C code is 4x slower than the ocamlopt-generated equivalent. What 
> can be done to improve its performance without leaving C?

I don't know whether this is a typo on your part or whether I misunderstand 
the code, but it seems that fib(nv) is computed twice.
Shouldn't the second call be replaced by fibn, the result of the first one?


