Frederic,

It's been a while since I did this sort of thing, but I suspect that if
you declare CAMLlocal variables for each intermediate expression, and
stick in the assignments, that should solve your problem (while not
making your code too ugly). E.g.

CAMLprim value left_comb(value a, value b, value c)
{
  CAMLparam3(a, b, c);
  CAMLlocal5(l1, l2, l3, l4, l5);
  CAMLreturn(l1 = Tree((l2 = Tree((l3 = Leaf(a)), (l4 = Leaf(b)))),
                       (l5 = Leaf(c))));
}

Even better, you could linearize the tree of expressions into a sequence
of statements, one allocating call per statement, and that should solve
your problem, also.

Uh, I think. Been a while since I wrote a lotta C/C++ code to interface
with OCaml, but this oughta work.

--chet--

On Wed, May 2, 2018 at 9:09 AM, Frederic Perriot wrote:
> Hello caml-list,
>
> I have a GC-related question. To give you some context, I'm writing a
> tool to parse .cmi files and generate .h and .c files, to facilitate
> constructing OCaml variants from C bindings.
>
> For instance, given the following source:
>
> type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]
>
>
> the tool produces C functions:
>
> CAMLprim value Leaf(value arg1)
> {
>   CAMLparam1(arg1);
>   CAMLlocal1(obj);
>
>   obj = caml_alloc_small(1, 0);
>
>   Field(obj, 0) = arg1;
>
>   CAMLreturn(obj);
> }
>
> CAMLprim value Tree(value arg1, value arg2)
> {
>   // similar code here
> }
>
>
> From there, it's tempting to nest calls to variant constructors from C
> and write code such as:
>
> CAMLprim value left_comb(value a, value b, value c)
> {
>   CAMLparam3(a, b, c);
>   CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
> }
>
>
> The problem with the above is the GC root loss due to the nesting of
> calls to allocating functions.
>
> Say Leaf(c) is constructed first, and the resulting value cached in a
> register, then Leaf(b) triggers a collection, thus invalidating the
> register contents, and leaving a dangling pointer in the top Tree.
>
> Here is an actual ocamlopt output, with Leaf(c) getting cached in rbx:
>
> 0x000000000040dbf4 <+149>: callq 0x40d8fd
> 0x000000000040dbf9 <+154>: mov %rax,%rbx
> 0x000000000040dbfc <+157>: mov -0x90(%rbp),%rax
> 0x000000000040dc03 <+164>: mov %rax,%rdi
> 0x000000000040dc06 <+167>: callq 0x40d8fd
> 0x000000000040dc0b <+172>: mov %rax,%r12
> 0x000000000040dc0e <+175>: mov -0x88(%rbp),%rax
> 0x000000000040dc15 <+182>: mov %rax,%rdi
> 0x000000000040dc18 <+185>: callq 0x40d8fd
> 0x000000000040dc1d <+190>: mov %r12,%rsi
> 0x000000000040dc20 <+193>: mov %rax,%rdi
> 0x000000000040dc23 <+196>: callq 0x40da19
> 0x000000000040dc28 <+201>: mov %rbx,%rsi
> 0x000000000040dc2b <+204>: mov %rax,%rdi
> 0x000000000040dc2e <+207>: callq 0x40da19
>
>
> While the C code clearly violates the spirit of the GC rules, I can't
> help but feel this is still a pitfall.
>
> Rule 2 of the manual states: "Local variables of type value must be
> declared with one of the CAMLlocal macros. [...]"
>
> But here, I'm not declaring local variables, unless you count compiler
> temporaries as local variables?
>
> I can see some other people making the same mistake I did. Should
> there be an explicit warning in the rules? maybe underlining that
> compiler temps count as variables, or discouraging the kind of nested
> calls returning values displayed above?
>
> thanks,
> Frédéric Perriot
>
> PS: this is also my first time posting to the list, so I take this
> opportunity to thank you for the great Q's and A's I've read here over
> the years
>
> --
> Caml-list mailing list. Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
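
To make the linearized form concrete, here is a rough, self-contained toy model in plain C. It is NOT the OCaml runtime: the `node` struct, `toy_alloc`, `toy_collect` and the `roots` array are all made up for illustration, and `Leaf`/`Tree` are stand-ins that only mimic the shape of the generated constructors. The toy collector moves the whole heap on every allocation, which is the worst case for an unregistered temporary. Because every intermediate result sits in a registered variable, and each statement makes a single allocating call, the pointers are still valid at the end.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names belong to the OCaml runtime. */
typedef struct node node;
struct node { int tag; long leaf; node *left, *right; }; /* tag 0 = Leaf, 1 = Tree */

enum { HEAP_CELLS = 64, MAX_ROOTS = 16 };
static node space[2][HEAP_CELLS];   /* two semispaces */
static int cur = 0;                 /* index of the live semispace */
static size_t used = 0;             /* cells allocated so far */
static node **roots[MAX_ROOTS];     /* addresses of registered locals */
static size_t nroots = 0;

/* Translate a pointer into the old space to the same cell in the new one. */
static node *relocate(node *p, node *from, node *to) {
    return p ? to + (p - from) : NULL;
}

/* "Collect": move every cell to the other semispace, then fix up the
   interior pointers and the registered roots. Unregistered copies of a
   pointer (e.g. a compiler temporary) are left pointing at the old space. */
static void toy_collect(void) {
    node *from = space[cur], *to = space[1 - cur];
    for (size_t i = 0; i < used; i++) {
        to[i] = from[i];
        to[i].left = relocate(from[i].left, from, to);
        to[i].right = relocate(from[i].right, from, to);
    }
    for (size_t i = 0; i < nroots; i++)
        *roots[i] = relocate(*roots[i], from, to);
    cur = 1 - cur;
}

/* Worst case on purpose: every allocation moves the heap first. */
static node *toy_alloc(void) {
    toy_collect();
    assert(used < HEAP_CELLS);
    return &space[cur][used++];
}

node *Leaf(long v) {
    node *n = toy_alloc();
    n->tag = 0; n->leaf = v; n->left = n->right = NULL;
    return n;
}

node *Tree(node *l, node *r) {
    size_t saved = nroots;
    roots[nroots++] = &l;           /* plays the role of CAMLparam2 */
    roots[nroots++] = &r;
    node *n = toy_alloc();          /* l and r are updated if the heap moves */
    n->tag = 1; n->leaf = 0; n->left = l; n->right = r;
    nroots = saved;
    return n;
}

/* Linearized left_comb: one allocating call per statement, with every
   intermediate result held in a registered local. */
node *left_comb(long a, long b, long c) {
    node *l1 = NULL, *l2 = NULL, *l3 = NULL, *l4 = NULL;
    size_t saved = nroots;
    roots[nroots++] = &l1; roots[nroots++] = &l2;   /* plays the role of CAMLlocal4 */
    roots[nroots++] = &l3; roots[nroots++] = &l4;
    l1 = Leaf(a);
    l2 = Leaf(b);
    l3 = Tree(l1, l2);
    l4 = Leaf(c);
    node *result = Tree(l3, l4);    /* no allocation happens after this call */
    nroots = saved;
    return result;
}
```

In a real stub the manual root bookkeeping is what CAMLparam3 and CAMLlocal4 do for you, so the body of left_comb would shrink to the four assignments plus a CAMLreturn of the final Tree call. The point of the toy is only that no pointer to a heap block ever lives solely in a compiler temporary while another allocation runs.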