caml-list - the Caml user's mailing list
From: Chet Murthy <murthy.chet@gmail.com>
To: Xavier Leroy <Xavier.Leroy@inria.fr>
Cc: Frederic Perriot <fperriot@gmail.com>, caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] an implicit GC rule?
Date: Sun, 6 May 2018 12:23:41 -0700
Message-ID: <CA++P_gcLtW+CFpmxnOS79OSqen5nSNKJ-ueRrZm4oezaCEzf2Q@mail.gmail.com>
In-Reply-To: <CAH=h3gFdsCaNDOnF2oJFAYbOMEZihS-A7tMO5EiAnTaH0QwUjw@mail.gmail.com>


Oh shit, and also, uh, oh right!  I forgot that (for example) in
Tree((v1 = e1), (v2 = e2)) it could transpire that after evaluating e1 and
assigning to v1, the evaluation of e2 could end up moving the value
pointed-to by v1, updating v1, but NOT updating the result of the expression
(v1 = e1) (b/c of course, it's already evaluated and sitting on-stack,
waiting for the call to Tree()).

Oof.
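
To make that stale-copy problem concrete, here is a minimal sketch in plain
C.  It does not use the OCaml runtime at all: toy_register, toy_collect and
toy_alloc are hypothetical stand-ins for a moving collector with a single
registered root, and the explicit toy_collect() call stands in for whatever
allocation happens while e2 is being evaluated.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a "value" is a pointer into one of two tiny semispaces. */
typedef long *value;
enum { POISON = -1 };

static long space_a[1], space_b[1];
static long *from_space = space_a;
static value *root = NULL;   /* one registered root, like a single CAMLlocal */

/* Register (or, with NULL, unregister) the variable the collector may update. */
static void toy_register(value *v) { root = v; }

/* Allocate the single cell of the current semispace. */
static value toy_alloc(long payload)
{
    from_space[0] = payload;
    return &from_space[0];
}

/* "Collect": move the rooted cell to the other semispace, poison the old
   cell, and update the registered variable so it follows the move. */
static void toy_collect(void)
{
    long *to_space = (from_space == space_a) ? space_b : space_a;
    assert(root != NULL && *root != NULL);
    to_space[0] = **root;
    **root = POISON;
    *root = &to_space[0];
    from_space = to_space;
}

/* Returns 1 when the saved result of (v1 = e1) no longer matches v1. */
int assignment_result_goes_stale(void)
{
    value v1 = NULL;
    toy_register(&v1);

    /* In C the assignment expression yields a copy of the pointer. */
    value saved = (v1 = toy_alloc(42));

    /* Stand-in for evaluating e2: the collector moves the cell. */
    toy_collect();

    /* v1 was updated and still reads 42; the saved copy points at the
       poisoned old cell. */
    int stale = (saved != v1) && (*v1 == 42) && (*saved == POISON);
    toy_register(NULL);
    return stale;
}
```

Calling assignment_result_goes_stale() returns 1 in this toy model: the
registered variable follows the move, while the copy produced by the
assignment expression is left pointing at the old location.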

On Sat, May 5, 2018 at 12:42 AM, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:

>
>
> On Sat, May 5, 2018 at 5:25 AM Chet Murthy <murthy.chet@gmail.com> wrote:
>
>> It's been a while since I did this sort of thing, but I suspect if you
>> declare CAMLlocal variables for each intermediate expression, and stick in
>> the assignments, that should solve your problem (while not making your code
>> too ugly).  E.g.
>>
>> CAMLprim value left_comb(value a, value b, value c)
>> {
>>     CAMLparam3(a, b, c);
>>     CAMLlocal5(l1, l2, l3, l4, l5);
>>     CAMLreturn(l1 = Tree((l2 = Tree((l3 = Leaf(a)), (l4 = Leaf(b)))),
>>                          (l5 = Leaf(c))));
>> }
>>
>
> That's bold C/C++ programming!  It might even work in C++, where
> assignment expressions are l-values if I remember correctly.
>
> However, I'm afraid it won't work in C because an assignment expression
> "lv = rv" is an r-value equal to the value of rv converted to the type of lv
> at the time the assignment is evaluated.  So, if lv is a local variable
> registered with the GC, the GC will update lv when needed, but the "lv =
> rv" expression will keep its initial value.
>
> There are also the rules concerning sequence points.  I think the code above
> respects the C99 rules but I'm less sure about the C11 rules.
>
>>
>> Even better, you could linearize the tree of expressions into a sequence,
>> and that should solve your problem, also.
>>
>
> Yes, that's the robust solution.  Spelling it out:
>
> CAMLprim value left_comb(value a, value b, value c)
> {
>   CAMLparam3(a, b, c);
>   CAMLlocal5(la, lb, lc, tab, t);
>   la = Leaf(a);
>   lb = Leaf(b);
>   lc = Leaf(c);
>   tab = Tree(la, lb);
>   t = Tree(tab, lc);
>   CAMLreturn(t);
> }
>
> You can also do "CAMLreturn(Tree(tab, lc))" directly.
>
> - Xavier Leroy
>
>
>> Uh, I think.  Been a while since I wrote a lotta C/C++ code to interface
>> with Ocaml, but this oughta work.
>>
>> --chet--
>>
>>
>> On Wed, May 2, 2018 at 9:09 AM, Frederic Perriot <fperriot@gmail.com>
>> wrote:
>>
>>> Hello caml-list,
>>>
>>> I have a GC-related question. To give you some context, I'm writing a
>>> tool to parse .cmi files and generate .h and .c files, to facilitate
>>> constructing OCaml variants from C bindings.
>>>
>>> For instance, given the following source:
>>>
>>> type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]
>>>
>>>
>>> the tool produces C functions:
>>>
>>> CAMLprim value Leaf(value arg1)
>>> {
>>>     CAMLparam1(arg1);
>>>     CAMLlocal1(obj);
>>>
>>>     obj = caml_alloc_small(1, 0);
>>>
>>>     Field(obj, 0) = arg1;
>>>
>>>     CAMLreturn(obj);
>>> }
>>>
>>> CAMLprim value Tree(value arg1, value arg2)
>>> {
>>>   // similar code here
>>> }
>>>
>>>
>>> From there, it's tempting to nest calls to variant constructors from C
>>> and write code such as:
>>>
>>> CAMLprim value left_comb(value a, value b, value c)
>>> {
>>>     CAMLparam3(a, b, c);
>>>     CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
>>> }
>>>
>>>
>>> The problem with the above is the GC root loss due to the nesting of
>>> calls to allocating functions.
>>>
>>> Say Leaf(c) is constructed first and the resulting value is cached in a
>>> register. If Leaf(b) then triggers a collection, the register contents
>>> become invalid, leaving a dangling pointer in the top Tree.
>>>
>>> Here is actual ocamlopt output, with Leaf(c) getting cached in rbx:
>>>
>>>    0x000000000040dbf4 <+149>:    callq  0x40d8fd <Leaf>
>>>    0x000000000040dbf9 <+154>:    mov    %rax,%rbx
>>>    0x000000000040dbfc <+157>:    mov    -0x90(%rbp),%rax
>>>    0x000000000040dc03 <+164>:    mov    %rax,%rdi
>>>    0x000000000040dc06 <+167>:    callq  0x40d8fd <Leaf>
>>>    0x000000000040dc0b <+172>:    mov    %rax,%r12
>>>    0x000000000040dc0e <+175>:    mov    -0x88(%rbp),%rax
>>>    0x000000000040dc15 <+182>:    mov    %rax,%rdi
>>>    0x000000000040dc18 <+185>:    callq  0x40d8fd <Leaf>
>>>    0x000000000040dc1d <+190>:    mov    %r12,%rsi
>>>    0x000000000040dc20 <+193>:    mov    %rax,%rdi
>>>    0x000000000040dc23 <+196>:    callq  0x40da19 <Tree>
>>>    0x000000000040dc28 <+201>:    mov    %rbx,%rsi
>>>    0x000000000040dc2b <+204>:    mov    %rax,%rdi
>>>    0x000000000040dc2e <+207>:    callq  0x40da19 <Tree>
>>>
>>>
>>> While the C code clearly violates the spirit of the GC rules, I can't
>>> help but feel this is still a pitfall.
>>>
>>> Rule 2 of the manual states: "Local variables of type value must be
>>> declared with one of the CAMLlocal macros. [...]"
>>>
>>> But here, I'm not declaring local variables, unless you count compiler
>>> temporaries as local variables?
>>>
>>> I can see other people making the same mistake I did. Should there be
>>> an explicit warning in the rules? Maybe one underlining that compiler
>>> temps count as variables, or discouraging the kind of nested calls
>>> returning values displayed above?
>>>
>>> thanks,
>>> Frédéric Perriot
>>>
>>> PS: this is also my first time posting to the list, so I take this
>>> opportunity to thank you for the great Q's and A's I've read here over
>>> the years
>>>
>>> --
>>> Caml-list mailing list.  Subscription management and archives:
>>> https://sympa.inria.fr/sympa/arc/caml-list
>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>>
>>
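
For contrast between the nested calls and the linearized version quoted
above, here is a second sketch, again in plain C with no OCaml runtime.
All the toy_* names are hypothetical, and the toy collector runs on every
allocation so that the failure shows up every time; toy_sum merely reads
its two arguments, standing in for a constructor that stores them.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a "value" is a pointer into one of two small semispaces. */
typedef long *value;
enum { SPACE = 8, MAX_ROOTS = 8, POISON = -1000 };

static long space_a[SPACE], space_b[SPACE];
static long *from_space = space_a;
static size_t used = 0;
static value *roots[MAX_ROOTS];
static size_t nroots = 0;

/* Register a local so the collector can update it (think CAMLlocal). */
static void toy_root(value *v) { assert(nroots < MAX_ROOTS); roots[nroots++] = v; }
static void toy_unroot_all(void) { nroots = 0; }

/* Copy every rooted cell to the other semispace, update the roots, and
   poison everything left behind.  Unrooted cells are simply lost. */
static void toy_collect(void)
{
    long *to_space = (from_space == space_a) ? space_b : space_a;
    size_t n = 0;
    for (size_t i = 0; i < nroots; i++) {
        if (*roots[i] != NULL) {
            to_space[n] = **roots[i];
            *roots[i] = &to_space[n];
            n++;
        }
    }
    for (size_t i = 0; i < SPACE; i++) from_space[i] = POISON;
    from_space = to_space;
    used = n;
}

/* Every allocation collects first, the worst case for root loss. */
static value toy_leaf(long payload)
{
    toy_collect();
    assert(used < SPACE);
    from_space[used] = payload;
    return &from_space[used++];
}

static long toy_sum(value x, value y) { return *x + *y; }

/* Nested calls: whichever leaf is built first lives only in a compiler
   temporary, so the second allocation poisons it. */
long nested_sum(void)
{
    return toy_sum(toy_leaf(1), toy_leaf(2));
}

/* Linearized: each intermediate result sits in a registered local. */
long linearized_sum(void)
{
    value la = NULL, lb = NULL;
    toy_root(&la);
    toy_root(&lb);
    la = toy_leaf(1);
    lb = toy_leaf(2);
    long s = toy_sum(la, lb);
    toy_unroot_all();
    return s;
}
```

In this toy model linearized_sum() returns 3, while nested_sum() returns
something else: one of its operands is read from a poisoned cell, and which
one depends on the argument evaluation order the compiler picks.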



