caml-list - the Caml user's mailing list
* [Caml-list] an implicit GC rule?
From: Frederic Perriot @ 2018-05-02 16:09 UTC
  To: caml-list

Hello caml-list,

I have a GC-related question. To give you some context, I'm writing a
tool to parse .cmi files and generate .h and .c files, to facilitate
constructing OCaml variants from C bindings.

For instance, given the following source:

type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]


the tool produces C functions:

CAMLprim value Leaf(value arg1)
{
    CAMLparam1(arg1);
    CAMLlocal1(obj);

    obj = caml_alloc_small(1, 0);

    Field(obj, 0) = arg1;

    CAMLreturn(obj);
}

CAMLprim value Tree(value arg1, value arg2)
{
  // similar code here
}


From there, it's tempting to nest calls to variant constructors from C
and write code such as:

CAMLprim value left_comb(value a, value b, value c)
{
    CAMLparam3(a, b, c);
    CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
}


The problem with the above is the GC root loss due to the nesting of
calls to allocating functions.

Say Leaf(c) is constructed first and the resulting value is cached in
a register. If Leaf(b) then triggers a collection, the block may be
moved, which invalidates the register contents and leaves a dangling
pointer in the top Tree.
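
To make the mechanism concrete, here is a small self-contained toy
model in plain C. It is NOT the OCaml runtime: toy_collect, root_push,
toy_leaf, toy_tree and the other names are all hypothetical, and the
"GC" simply moves every cell to the other half of a static array on each
allocation and never frees anything. It only sketches why a pointer the
collector does not know about goes stale, while a registered one is
updated.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only, NOT the OCaml runtime: every name here is hypothetical. */
#define HEAP_CELLS 16
#define MAX_ROOTS 8

typedef struct cell { int is_leaf; int payload; struct cell *l, *r; } cell;

static cell heap[2 * HEAP_CELLS];   /* two semispaces, back to back */
static cell *active = heap;         /* start of the semispace in use */
static size_t used = 0;             /* cells allocated so far */
static cell **roots[MAX_ROOTS];     /* addresses of registered locals */
static size_t nroots = 0;

static void root_push(cell **slot) { assert(nroots < MAX_ROOTS); roots[nroots++] = slot; }
static void root_pop(size_t n) { assert(n <= nroots); nroots -= n; }

static cell *relocate(cell *p, cell *from, cell *to) {
    return p ? to + (p - from) : NULL;
}

/* Move every cell to the other semispace, fix up cell fields and registered
   roots, and poison the old copies so stale pointers are easy to spot. */
static void toy_collect(void) {
    cell *from = active;
    cell *to = (active == heap) ? heap + HEAP_CELLS : heap;
    for (size_t i = 0; i < used; i++) {
        to[i] = from[i];
        to[i].l = relocate(to[i].l, from, to);
        to[i].r = relocate(to[i].r, from, to);
        from[i].payload = -1;
        from[i].l = from[i].r = NULL;
    }
    for (size_t i = 0; i < nroots; i++)
        *roots[i] = relocate(*roots[i], from, to);
    active = to;
}

/* Worst-case schedule: every allocation moves the heap first. */
static cell *toy_alloc(void) {
    assert(used < HEAP_CELLS);
    toy_collect();
    return &active[used++];
}

static cell *toy_leaf(int payload) {
    cell *c = toy_alloc();
    c->is_leaf = 1; c->payload = payload; c->l = c->r = NULL;
    return c;
}

static cell *toy_tree(cell *l, cell *r) {
    root_push(&l); root_push(&r);       /* plays the role of CAMLparam2 */
    cell *c = toy_alloc();
    c->is_leaf = 0; c->payload = 0; c->l = l; c->r = r;
    root_pop(2);
    return c;
}

/* Unregistered temporary: the second allocation moves the leaf away. */
static int stale_read(void) {
    cell *c = toy_leaf(3);
    (void)toy_leaf(2);
    return c->payload;                  /* reads the poisoned old copy */
}

/* Registered temporary: the collector updates c when it moves the leaf. */
static int rooted_read(void) {
    cell *c = NULL;
    root_push(&c);
    c = toy_leaf(3);
    (void)toy_leaf(2);
    int v = c->payload;
    root_pop(1);
    return v;
}

/* left_comb with every intermediate result held in a registered local. */
static cell *toy_left_comb(int a, int b, int c) {
    cell *la = NULL, *lb = NULL, *lc = NULL, *inner = NULL;
    root_push(&la); root_push(&lb); root_push(&lc); root_push(&inner);
    la = toy_leaf(a); lb = toy_leaf(b); lc = toy_leaf(c);
    inner = toy_tree(la, lb);
    cell *top = toy_tree(inner, lc);
    root_pop(4);
    return top;
}
```

In this model stale_read() returns the poison value -1, rooted_read()
returns 3, and toy_left_comb(1, 2, 3) yields a well-formed tree. With
the real API, the analogous approach would be to assign each
intermediate result to a variable declared with one of the CAMLlocal
macros before making the next allocating call.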

Here is an actual ocamlopt output, with Leaf(c) getting cached in rbx:

   0x000000000040dbf4 <+149>:    callq  0x40d8fd <Leaf>
   0x000000000040dbf9 <+154>:    mov    %rax,%rbx
   0x000000000040dbfc <+157>:    mov    -0x90(%rbp),%rax
   0x000000000040dc03 <+164>:    mov    %rax,%rdi
   0x000000000040dc06 <+167>:    callq  0x40d8fd <Leaf>
   0x000000000040dc0b <+172>:    mov    %rax,%r12
   0x000000000040dc0e <+175>:    mov    -0x88(%rbp),%rax
   0x000000000040dc15 <+182>:    mov    %rax,%rdi
   0x000000000040dc18 <+185>:    callq  0x40d8fd <Leaf>
   0x000000000040dc1d <+190>:    mov    %r12,%rsi
   0x000000000040dc20 <+193>:    mov    %rax,%rdi
   0x000000000040dc23 <+196>:    callq  0x40da19 <Tree>
   0x000000000040dc28 <+201>:    mov    %rbx,%rsi
   0x000000000040dc2b <+204>:    mov    %rax,%rdi
   0x000000000040dc2e <+207>:    callq  0x40da19 <Tree>
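
In this listing the result of the first Leaf call ends up as the second
argument of the outer Tree, so the arguments were evaluated right to
left. C leaves the evaluation order of function arguments unspecified,
so another compiler or optimization level could pick a different order.
A minimal plain-C sketch of that point follows; note, pair_sum and
left_ran_first are hypothetical helpers, not part of any OCaml header.

```c
#include <assert.h>
#include <stddef.h>

static int call_log[4];          /* ids in the order the calls happened */
static size_t call_count = 0;

/* Hypothetical helper: records its argument, then returns it. */
static int note(int id) {
    assert(call_count < 4);
    call_log[call_count++] = id;
    return id;
}

static int pair_sum(int x, int y) { return x + y; }

/* Returns 1 if note(1) ran before note(2), 0 otherwise.
   Either answer is allowed by the C standard. */
static int left_ran_first(void) {
    call_count = 0;
    (void)pair_sum(note(1), note(2));
    return call_log[0] == 1;
}
```

Both calls always happen, but code must not depend on which one runs
first; the same holds for the nested Leaf calls in left_comb.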


While the C code clearly violates the spirit of the GC rules, I can't
help but feel this is still a pitfall.

Rule 2 of the manual states: "Local variables of type value must be
declared with one of the CAMLlocal macros. [...]"

But here, I'm not declaring local variables, unless you count compiler
temporaries as local variables?

I can see other people making the same mistake I did. Should there be
an explicit warning in the rules, perhaps stressing that compiler
temporaries count as variables, or discouraging nested calls to
value-returning functions like the ones shown above?

thanks,
Frédéric Perriot

PS: this is also my first time posting to the list, so I take this
opportunity to thank you for the great Q's and A's I've read here over
the years




Thread overview: 6+ messages
2018-05-02 16:09 [Caml-list] an implicit GC rule? Frederic Perriot
2018-05-05  3:24 ` Chet Murthy
2018-05-05  7:42   ` Xavier Leroy
2018-05-05 14:11     ` [Caml-list] [ANN] Release 2.8.5 of Caph, a functional/dataflow language for programming FPGAs Jocelyn Sérot
2018-05-06 19:23     ` [Caml-list] an implicit GC rule? Chet Murthy
2018-05-07 17:01       ` Frederic Perriot
