caml-list - the Caml user's mailing list
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: Chet Murthy <murthy.chet@gmail.com>
Cc: Frederic Perriot <fperriot@gmail.com>, caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] an implicit GC rule?
Date: Sat, 05 May 2018 07:42:03 +0000
Message-ID: <CAH=h3gFdsCaNDOnF2oJFAYbOMEZihS-A7tMO5EiAnTaH0QwUjw@mail.gmail.com>
In-Reply-To: <CA++P_gcfkvcW33MOQtbU_yq_68F0miGhxgEdEW_ErStVSvdMvQ@mail.gmail.com>


On Sat, May 5, 2018 at 5:25 AM Chet Murthy <murthy.chet@gmail.com> wrote:

> It's been a while since I did this sort of thing, but I suspect if you
> declare CAMLlocal variables for each intermediate expression, and stick in
> the assignments, that should solve your problem (while not making your code
> too ugly).  E.g.
>
> CAMLprim value left_comb(value a, value b, value c)
> {
>     CAMLparam3(a, b, c);
>     CAMLlocal5(l1, l2, l3, l4, l5);
>     CAMLreturn(l1 = Tree(l2 = Tree(l3 = Leaf(a), l4 = Leaf(b)),
>                          l5 = Leaf(c)));
> }
>

That's bold C/C++ programming!  It might even work in C++, where assignment
expressions are l-values if I remember correctly.

However, I'm afraid it won't work in C because an assignment expression "lv
= rv" is an r-value: its value is that of rv converted to the type of lv,
taken at the time the assignment is evaluated.  So, if lv is a local variable
registered with the GC, the GC will update lv when needed, but the value of
the "lv = rv" expression will keep its initial, possibly stale, value.
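
The point above can be sketched in plain C without the OCaml runtime. In
this minimal illustration, toy_value, toy_root and toy_gc_move are
hypothetical stand-ins (not OCaml API) for value, a CAMLlocal registration,
and a collection that moves the block:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the OCaml runtime: a "value" is a plain integer
   address, and one registered root plays the role of a CAMLlocal. */
typedef long toy_value;

static toy_value *toy_root = NULL;   /* address of the registered local */

/* Pretend the collector moved the block: rewrite the registered root. */
static void toy_gc_move(toy_value new_address) {
    assert(toy_root != NULL);
    *toy_root = new_address;
}

/* Returns 1 if the variable was updated but the value of the
   assignment expression was not. */
int assignment_value_is_stale(void) {
    toy_value lv = 0;
    toy_root = &lv;                    /* like CAMLlocal1(lv) */
    toy_value snapshot = (lv = 100);   /* value of "lv = rv": a copy */
    toy_gc_move(200);                  /* the collector only knows about lv */
    toy_root = NULL;
    return lv == 200 && snapshot == 100;
}
```

Here snapshot plays the part of the argument that a nested call would
receive: it is a copy made when the assignment ran, so the later update of
lv does not reach it.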

There are also the rules concerning sequence points.  I think the code above
respects the C99 rules but I'm less sure about the C11 rules.

>
> Even better, you could linearize the tree of expressions into a sequence,
> and that should solve your problem, also.
>

Yes, that's the robust solution.  Spelling it out:

CAMLprim value left_comb(value a, value b, value c)
{
  CAMLparam3(a, b, c);
  CAMLlocal5(la, lb, lc, tab, t);
  la = Leaf(a);
  lb = Leaf(b);
  lc = Leaf(c);
  tab = Tree(la, lb);
  t = Tree(tab, lc);
  CAMLreturn(t);
}

You can also do "CAMLreturn(Tree(tab, lc))" directly.
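
To see why the linearized form is robust while the nested form is not, here
is a rough, self-contained simulation. It is a toy model and makes several
assumptions: it does not use the OCaml runtime, every allocation triggers a
"collection", and a collection only rewrites values reachable from
registered roots. The names toy_value, add_root, relocate, collect and
all_fresh are hypothetical; Leaf and Tree merely mirror the thread.

```c
#include <assert.h>

/* Toy model, NOT the OCaml runtime. A toy_value packs (epoch, id). Each
   collection bumps the epoch and rewrites only values reachable from the
   registered roots, so anything else is recognisably stale afterwards. */
typedef long toy_value;
enum { MAX_OBJS = 64, MAX_ROOTS = 32, EPOCH_SHIFT = 8, ID_MASK = 255 };

static struct { int is_tree; toy_value f0, f1; } objs[MAX_OBJS];
static int n_objs = 0;
static long epoch = 1;               /* starts at 1, so 0 acts like Val_unit */
static toy_value *roots[MAX_ROOTS];  /* addresses of registered locals */
static int n_roots = 0;

static void add_root(toy_value *p) {
    assert(n_roots < MAX_ROOTS);
    roots[n_roots++] = p;
}

/* Bring a value from the previous epoch up to date, following Tree fields. */
static toy_value relocate(toy_value v) {
    if ((v >> EPOCH_SHIFT) != epoch - 1) return v;  /* unit, stale, or done */
    int id = (int)(v & ID_MASK);
    if (objs[id].is_tree) {
        objs[id].f0 = relocate(objs[id].f0);
        objs[id].f1 = relocate(objs[id].f1);
    }
    return (epoch << EPOCH_SHIFT) | id;
}

static void collect(void) {
    epoch++;
    for (int i = 0; i < n_roots; i++) *roots[i] = relocate(*roots[i]);
}

static toy_value new_obj(int is_tree, toy_value f0, toy_value f1) {
    assert(n_objs < MAX_OBJS);
    objs[n_objs].is_tree = is_tree;
    objs[n_objs].f0 = f0;
    objs[n_objs].f1 = f1;
    return (epoch << EPOCH_SHIFT) | n_objs++;
}

static toy_value Leaf(long payload) {
    collect();                       /* worst case: every allocation collects */
    return new_obj(0, payload, 0);
}

static toy_value Tree(toy_value l, toy_value r) {
    int saved = n_roots;
    add_root(&l); add_root(&r);      /* like CAMLparam2(l, r) */
    collect();
    n_roots = saved;
    return new_obj(1, l, r);
}

/* 1 if v and everything reachable from it belongs to the current epoch. */
static int all_fresh(toy_value v) {
    if ((v >> EPOCH_SHIFT) != epoch) return 0;
    int id = (int)(v & ID_MASK);
    return !objs[id].is_tree
        || (all_fresh(objs[id].f0) && all_fresh(objs[id].f1));
}

int nested_is_valid(void) {
    /* Unregistered temporaries, standing in for values held in registers. */
    toy_value lc = Leaf(3);
    toy_value lb = Leaf(2);
    toy_value la = Leaf(1);
    return all_fresh(Tree(Tree(la, lb), lc));
}

int linearized_is_valid(void) {
    toy_value la = 0, lb = 0, lc = 0, tab = 0, t = 0;
    int saved = n_roots;
    add_root(&la); add_root(&lb); add_root(&lc);   /* like CAMLlocal5 */
    add_root(&tab); add_root(&t);
    la = Leaf(1);
    lb = Leaf(2);
    lc = Leaf(3);
    tab = Tree(la, lb);
    t = Tree(tab, lc);
    n_roots = saved;
    return all_fresh(t);
}
```

Under this model nested_is_valid() returns 0, because the unregistered
leaves are stale by the time they are stored, while linearized_is_valid()
returns 1, because every intermediate result sits in a registered local
across each allocation.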

- Xavier Leroy


> Uh, I think.  Been a while since I wrote a lotta C/C++ code to interface
> with Ocaml, but this oughta work.
>
> --chet--
>
>
> On Wed, May 2, 2018 at 9:09 AM, Frederic Perriot <fperriot@gmail.com>
> wrote:
>
>> Hello caml-list,
>>
>> I have a GC-related question. To give you some context, I'm writing a
>> tool to parse .cmi files and generate .h and .c files, to facilitate
>> constructing OCaml variants from C bindings.
>>
>> For instance, given the following source:
>>
>> type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]
>>
>>
>> the tool produces C functions:
>>
>> CAMLprim value Leaf(value arg1)
>> {
>>     CAMLparam1(arg1);
>>     CAMLlocal1(obj);
>>
>>     obj = caml_alloc_small(1, 0);
>>
>>     Field(obj, 0) = arg1;
>>
>>     CAMLreturn(obj);
>> }
>>
>> CAMLprim value Tree(value arg1, value arg2)
>> {
>>   // similar code here
>> }
>>
>>
>> From there, it's tempting to nest calls to variant constructors from C
>> and write code such as:
>>
>> CAMLprim value left_comb(value a, value b, value c)
>> {
>>     CAMLparam3(a, b, c);
>>     CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
>> }
>>
>>
>> The problem with the above is the GC root loss due to the nesting of
>> calls to allocating functions.
>>
>> Say Leaf(c) is constructed first and the resulting value is cached in a
>> register. If Leaf(b) then triggers a collection that moves the block,
>> the register contents become invalid, leaving a dangling pointer in the
>> top Tree.
>>
>> Here is an actual ocamlopt output, with Leaf(c) getting cached in rbx:
>>
>>    0x000000000040dbf4 <+149>:    callq  0x40d8fd <Leaf>
>>    0x000000000040dbf9 <+154>:    mov    %rax,%rbx
>>    0x000000000040dbfc <+157>:    mov    -0x90(%rbp),%rax
>>    0x000000000040dc03 <+164>:    mov    %rax,%rdi
>>    0x000000000040dc06 <+167>:    callq  0x40d8fd <Leaf>
>>    0x000000000040dc0b <+172>:    mov    %rax,%r12
>>    0x000000000040dc0e <+175>:    mov    -0x88(%rbp),%rax
>>    0x000000000040dc15 <+182>:    mov    %rax,%rdi
>>    0x000000000040dc18 <+185>:    callq  0x40d8fd <Leaf>
>>    0x000000000040dc1d <+190>:    mov    %r12,%rsi
>>    0x000000000040dc20 <+193>:    mov    %rax,%rdi
>>    0x000000000040dc23 <+196>:    callq  0x40da19 <Tree>
>>    0x000000000040dc28 <+201>:    mov    %rbx,%rsi
>>    0x000000000040dc2b <+204>:    mov    %rax,%rdi
>>    0x000000000040dc2e <+207>:    callq  0x40da19 <Tree>
>>
>>
>> While the C code clearly violates the spirit of the GC rules, I can't
>> help but feel this is still a pitfall.
>>
>> Rule 2 of the manual states: "Local variables of type value must be
>> declared with one of the CAMLlocal macros. [...]"
>>
>> But here, I'm not declaring local variables, unless you count compiler
>> temporaries as local variables?
>>
>> I can see some other people making the same mistake I did. Should
>> there be an explicit warning in the rules, perhaps underlining that
>> compiler temporaries count as variables, or discouraging the kind of
>> nested calls returning values displayed above?
>>
>> thanks,
>> Frédéric Perriot
>>
>> PS: this is also my first time posting to the list, so I take this
>> opportunity to thank you for the great Q's and A's I've read here over
>> the years
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>
>

