Hello caml-list,
I have a GC-related question. To give you some context, I'm writing a
tool to parse .cmi files and generate .h and .c files, to facilitate
constructing OCaml variants from C bindings.
For instance, given the following source:
type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]
the tool produces C functions:
CAMLprim value Leaf(value arg1)
{
    CAMLparam1(arg1);
    CAMLlocal1(obj);
    obj = caml_alloc_small(1, 0);
    Field(obj, 0) = arg1;
    CAMLreturn(obj);
}
CAMLprim value Tree(value arg1, value arg2)
{
    // similar code here
}
From there, it's tempting to nest calls to variant constructors from C
and write code such as:
CAMLprim value left_comb(value a, value b, value c)
{
    CAMLparam3(a, b, c);
    CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
}
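The problem with the above is that the intermediate results are never
registered as GC roots, because the calls to allocating functions are
nested.
C leaves the evaluation order of function arguments unspecified, so say
Leaf(c) is constructed first and the resulting value is kept in a
register. If Leaf(b) then triggers a collection, the block may be moved
or freed, the register contents become invalid, and the top Tree ends
up holding a dangling pointer.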
Here is the actual disassembly of left_comb from an ocamlopt build, with
the result of Leaf(c) kept in rbx across the later calls:
0x000000000040dbf4 <+149>: callq 0x40d8fd <Leaf>
0x000000000040dbf9 <+154>: mov %rax,%rbx
0x000000000040dbfc <+157>: mov -0x90(%rbp),%rax
0x000000000040dc03 <+164>: mov %rax,%rdi
0x000000000040dc06 <+167>: callq 0x40d8fd <Leaf>
0x000000000040dc0b <+172>: mov %rax,%r12
0x000000000040dc0e <+175>: mov -0x88(%rbp),%rax
0x000000000040dc15 <+182>: mov %rax,%rdi
0x000000000040dc18 <+185>: callq 0x40d8fd <Leaf>
0x000000000040dc1d <+190>: mov %r12,%rsi
0x000000000040dc20 <+193>: mov %rax,%rdi
0x000000000040dc23 <+196>: callq 0x40da19 <Tree>
0x000000000040dc28 <+201>: mov %rbx,%rsi
0x000000000040dc2b <+204>: mov %rax,%rdi
0x000000000040dc2e <+207>: callq 0x40da19 <Tree>
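The same failure can be reproduced without the OCaml runtime. The following is only a toy model, not OCaml's collector: a two-space copying collector that runs on every allocation, so any pointer that is not registered as a root goes stale immediately. All the names here (cell, alloc_cell, copy_cell, the roots array, unrooted_leaf_is_lost) are hypothetical and exist only for this sketch; Leaf, Tree and left_comb reuse the names from above but operate on toy cells with long payloads. The roots array stands in for what CAMLparam and CAMLlocal do.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only, NOT the OCaml runtime: a two-space copying collector
   that runs on every allocation, so unrooted pointers go stale at once. */
enum { LEAF, TREE, DEAD };

typedef struct cell {
    int tag;
    long item;                  /* payload of a LEAF */
    struct cell *left, *right;  /* children of a TREE */
    struct cell *forward;       /* new address once the cell was copied */
} cell;

#define SPACE 32
static cell heap[2 * SPACE];    /* two semispaces */
static size_t cur = 0, used = 0;
static cell **roots[16];        /* stand-in for the CAMLparam/CAMLlocal root list */
static size_t nroots = 0;

static cell *copy_cell(cell *c)
{
    if (c == NULL) return NULL;
    if (c->forward != NULL) return c->forward;   /* already moved */
    cell *n = &heap[cur * SPACE + used++];
    *n = *c;
    c->forward = n;
    n->left = copy_cell(c->left);
    n->right = copy_cell(c->right);
    return n;
}

static cell *alloc_cell(int tag)
{
    size_t old = cur;
    cur = 1 - cur;                               /* flip semispaces */
    used = 0;
    for (size_t i = 0; i < nroots; i++)
        *roots[i] = copy_cell(*roots[i]);        /* update every registered root */
    for (size_t i = 0; i < SPACE; i++) {         /* poison the old space */
        cell dead = { DEAD, 0, NULL, NULL, NULL };
        heap[old * SPACE + i] = dead;
    }
    cell *c = &heap[cur * SPACE + used++];
    cell fresh = { tag, 0, NULL, NULL, NULL };
    *c = fresh;
    return c;
}

cell *Leaf(long item)
{
    cell *c = alloc_cell(LEAF);
    c->item = item;
    return c;
}

cell *Tree(cell *l, cell *r)
{
    size_t mark = nroots;
    roots[nroots++] = &l;                        /* plays the role of CAMLparam2 */
    roots[nroots++] = &r;
    cell *c = alloc_cell(TREE);
    c->left = l;                                 /* l and r were updated by the collector */
    c->right = r;
    nroots = mark;
    return c;
}

/* Rooted version: every intermediate result lives in a registered local. */
cell *left_comb(long a, long b, long c)
{
    cell *la = NULL, *lb = NULL, *lc = NULL, *t = NULL;
    size_t mark = nroots;
    roots[nroots++] = &la;                       /* plays the role of CAMLlocal4 */
    roots[nroots++] = &lb;
    roots[nroots++] = &lc;
    roots[nroots++] = &t;
    lc = Leaf(c);
    lb = Leaf(b);
    la = Leaf(a);
    t = Tree(la, lb);
    t = Tree(t, lc);
    nroots = mark;
    return t;
}

/* Unrooted version of the first two steps: returns 1 when the first
   leaf has been invalidated by the second allocation. */
int unrooted_leaf_is_lost(void)
{
    cell *lc = Leaf(3);     /* like the temporary held in rbx */
    cell *lb = Leaf(2);     /* moves the heap; lc is not a root */
    (void)lb;
    return lc->tag == DEAD;
}
```

In the real bindings, the corresponding change would be to declare one CAMLlocal variable per intermediate result and to assign each constructor call to it in its own statement, rather than nesting the calls.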
While the C code clearly violates the spirit of the GC rules, I can't
help but feel this is still a pitfall.
Rule 2 of the manual states: "Local variables of type value must be
declared with one of the CAMLlocal macros. [...]"
But here, I'm not declaring local variables, unless you count compiler
temporaries as local variables?
I can see other people making the same mistake I did. Should there be
an explicit warning in the rules? It could point out that compiler
temporaries count as variables, or discourage nesting calls to
functions that return freshly allocated values, as in the code above.
Thanks,
Frédéric Perriot
PS: this is also my first time posting to the list, so I take this
opportunity to thank you for the great Q's and A's I've read here over
the years.