caml-list - the Caml user's mailing list
From: Frederic Perriot <fperriot@gmail.com>
To: caml-list@inria.fr
Subject: [Caml-list] an implicit GC rule?
Date: Wed, 2 May 2018 18:09:03 +0200
Message-ID: <CAFY7FBM6GBZXoc6wpCkPBwLwrDWjvOkEe+pkE58EUh47WzOSfA@mail.gmail.com>

Hello caml-list,

I have a GC-related question. To give you some context, I'm writing a
tool to parse .cmi files and generate .h and .c files, to facilitate
constructing OCaml variants from C bindings.

For instance, given the following source:

type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]


the tool produces C functions:

CAMLprim value Leaf(value arg1)
{
    CAMLparam1(arg1);
    CAMLlocal1(obj);

    obj = caml_alloc_small(1, 0);

    Field(obj, 0) = arg1;

    CAMLreturn(obj);
}

CAMLprim value Tree(value arg1, value arg2)
{
  // similar code here
}


From there, it's tempting to nest calls to variant constructors from C
and write code such as:

CAMLprim value left_comb(value a, value b, value c)
{
    CAMLparam3(a, b, c);
    CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
}


The problem with the above is the GC root loss due to the nesting of
calls to allocating functions.

Say Leaf(c) is constructed first and the resulting value is cached in a
register. If Leaf(b) then triggers a collection, the GC may move the
block without updating the register, which leaves a dangling pointer in
the top Tree.
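
To make the failure mode concrete, here is a minimal, self-contained sketch
of a moving collector. It is a toy model and not the OCaml runtime: every
name in it (toy_cell, toy_alloc, toy_register_root, toy_collect, the two
toy_demo functions) is hypothetical and exists only for illustration. It
assumes a two-space heap where a collection copies registered cells and
poisons the old space.

```c
#include <stddef.h>

/* Toy model only: none of these names exist in the OCaml runtime. */
#define TOY_HEAP_SIZE 4
#define TOY_MAX_ROOTS 4
#define TOY_POISON (-1)

typedef struct { int payload; } toy_cell;

static toy_cell toy_space_a[TOY_HEAP_SIZE], toy_space_b[TOY_HEAP_SIZE];
static toy_cell *toy_from = toy_space_a, *toy_to = toy_space_b;
static size_t toy_used = 0;

/* Registered roots: addresses of pointer variables the collector may update. */
static toy_cell **toy_roots[TOY_MAX_ROOTS];
static size_t toy_nroots = 0;

static void toy_reset(void)
{
    toy_from = toy_space_a;
    toy_to = toy_space_b;
    toy_used = 0;
    toy_nroots = 0;
}

static toy_cell *toy_alloc(int payload)
{
    if (toy_used == TOY_HEAP_SIZE) return NULL;  /* toy heap exhausted */
    toy_cell *cell = &toy_from[toy_used++];
    cell->payload = payload;
    return cell;
}

static void toy_register_root(toy_cell **slot)
{
    if (toy_nroots < TOY_MAX_ROOTS) toy_roots[toy_nroots++] = slot;
}

/* Moving collection: copy each rooted cell to the other space, update the
   root, then poison the old space so stale pointers are easy to spot. */
static void toy_collect(void)
{
    size_t next = 0;
    for (size_t i = 0; i < toy_nroots; i++) {
        toy_to[next] = **toy_roots[i];
        *toy_roots[i] = &toy_to[next];
        next++;
    }
    for (size_t i = 0; i < TOY_HEAP_SIZE; i++)
        toy_from[i].payload = TOY_POISON;
    toy_cell *old = toy_from;
    toy_from = toy_to;
    toy_to = old;
    toy_used = next;
}

/* Unregistered temporary: the collector cannot see or update it. */
int toy_demo_unrooted(void)
{
    toy_reset();
    toy_cell *tmp = toy_alloc(42);  /* plays the role of Leaf(c) in a register */
    toy_collect();                  /* plays the role of a GC during Leaf(b) */
    return tmp->payload;            /* reads the poisoned old copy */
}

/* Registered variable: the collector relocates the cell and fixes the pointer. */
int toy_demo_rooted(void)
{
    toy_reset();
    toy_cell *tmp = toy_alloc(42);
    toy_register_root(&tmp);
    toy_collect();
    int result = tmp->payload;
    toy_nroots = 0;                 /* unregister before tmp goes out of scope */
    return result;
}
```

With these definitions, toy_demo_unrooted() returns TOY_POISON (-1) and
toy_demo_rooted() returns 42. In an actual stub, the analogous remedy would
be to assign each intermediate result to its own CAMLlocal variable before
making the next allocating call.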

Here is an actual ocamlopt output, with Leaf(c) getting cached in rbx:

   0x000000000040dbf4 <+149>:    callq  0x40d8fd <Leaf>
   0x000000000040dbf9 <+154>:    mov    %rax,%rbx
   0x000000000040dbfc <+157>:    mov    -0x90(%rbp),%rax
   0x000000000040dc03 <+164>:    mov    %rax,%rdi
   0x000000000040dc06 <+167>:    callq  0x40d8fd <Leaf>
   0x000000000040dc0b <+172>:    mov    %rax,%r12
   0x000000000040dc0e <+175>:    mov    -0x88(%rbp),%rax
   0x000000000040dc15 <+182>:    mov    %rax,%rdi
   0x000000000040dc18 <+185>:    callq  0x40d8fd <Leaf>
   0x000000000040dc1d <+190>:    mov    %r12,%rsi
   0x000000000040dc20 <+193>:    mov    %rax,%rdi
   0x000000000040dc23 <+196>:    callq  0x40da19 <Tree>
   0x000000000040dc28 <+201>:    mov    %rbx,%rsi
   0x000000000040dc2b <+204>:    mov    %rax,%rdi
   0x000000000040dc2e <+207>:    callq  0x40da19 <Tree>


While the C code clearly violates the spirit of the GC rules, I can't
help but feel this is still a pitfall.

Rule 2 of the manual states: "Local variables of type value must be
declared with one of the CAMLlocal macros. [...]"

But here, I'm not declaring local variables, unless you count compiler
temporaries as local variables?

I can see other people making the same mistake I did. Should there be
an explicit warning in the rules, perhaps stressing that compiler
temporaries count as variables, or discouraging nested calls to
functions returning values, like those shown above?

thanks,
Frédéric Perriot

PS: this is also my first time posting to the list, so I take this
opportunity to thank you for the great Q's and A's I've read here over
the years

