From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by c5ff346549e7 (Postfix) with ESMTP id 002455D5 for ; Sat, 5 May 2018 07:42:30 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.49,365,1520895600"; d="scan'208,217";a="325875967" Received: from sympa.inria.fr ([193.51.193.213]) by mail2-relais-roc.national.inria.fr with ESMTP; 05 May 2018 09:42:28 +0200 Received: by sympa.inria.fr (Postfix, from userid 20132) id 84D628244D; Sat, 5 May 2018 09:42:28 +0200 (CEST) Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 4C89B8240C for ; Sat, 5 May 2018 09:42:20 +0200 (CEST) Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=Xavier.Leroy@inria.fr; spf=Pass smtp.mailfrom=xavier.leroy@gmail.com; spf=None smtp.helo=postmaster@mail-ua0-f170.google.com IronPort-PHdr: =?us-ascii?q?9a23=3Af3gHOBE5M1JabVjTVEJFK51GYnF86YWxBRYc798d?= =?us-ascii?q?s5kLTJ7zocywAkXT6L1XgUPTWs2DsrQY07GQ6/iocFdDyK7JiGoFfp1IWk1Nou?= =?us-ascii?q?QttCtkPvS4D1bmJuXhdS0wEZcKflZk+3amLRodQ56mNBXdrXKo8DEdBAj0OxZr?= =?us-ascii?q?KeTpAI7SiNm82/yv95HJbAhEmDSwbaluIBmqsA7cqtQYjYx+J6gr1xDHuGFIe+?= =?us-ascii?q?NYxWNpIVKcgRPx7dqu8ZBg7ipdpesv+9ZPXqvmcas4S6dYDCk9PGAu+MLrrxjD?= =?us-ascii?q?QhCR6XYaT24bjwBHAwnB7BH9Q5fxri73vfdz1SWGIcH7S60/VC+85Kl3VhDnlC?= =?us-ascii?q?YHNyY48G7JjMxwkLlbqw+lqxBm3oLYfJ2ZOP94c6zaYN0aWHFBXt5PWCNdHoOy?= =?us-ascii?q?YYwPD+8bMuZZqYn2ul8CoBS6CAWpAu7k1z1GiWLs3aAi3OshHwPJ0gwuEdwNrX?= =?us-ascii?q?rassn6ObwIXuyp1qTF1ynPY+9U1Dr79YPGcgohofaJXb9ocMXe01cvFwLbgVWK?= =?us-ascii?q?tIfrOS2a1v4Ks2mb8uFtUuOghHQ5qwFwvDev3N0ghI/XiYIPzVDF9T50wIczJd?= =?us-ascii?q?2iSU50e8SoEJVKtyyDMYZ9X80sQ2ZtuCkgy70Gv4a2fCkMyJQ9xh7QceaLc4aS?= =?us-ascii?q?4h77VOeeOzd4hHVieL6lmxmy9k2gx+vhXce3yFZHtjRJnsXIu3wX1BHe6tKLRu?= =?us-ascii?q?Vg8kqjwzqDygLe5v1CLEspj6TUMYQhzaQ1lpcLsUTMACv2mELuga+TbEok++yo?= =?us-ascii?q?5/36Yrr8upOQLoF0hhz8P6gygMC/DuM4Mg8BX2if5+uwzqHs/Ur8QLlSj/02lL?= =?us-ascii?q?fWsIzCKMgFuqK0BxVZ34Uj5hqlETuqzdYVkWMaIF9HZB6Ll43pNEvPIPD8A/e/?= =?us-ascii?q?mVOskDJzyvHJJLLhHJTNIWbZkLv7ebZy9VRcyA0zzN1E6JJUD6sOIPP3WkPrqN?= =?us-ascii?q?PYCRo5PxSuw+n7ENV9yp8eWWWXD6CFKqzStFuI6vsrI+mNf48VpC3wK+Ml5v7r?= =?us-ascii?q?lX82g0URfaiv3ZsNaXC3BO5qI0uDYSmkvtBUOmcHokIbUfb2iEzKBTtOfWqyTu?= =?us-ascii?q?Q35jwnII2jBIbHAIuqherS8j28G8hmb35HB0rENXrycJTMD8cFdiOfOIlFnyYD?= =?us-ascii?q?RJCgTZUg3FegrlmpmPJcMuPI93hA5trY399v6riWyEhrpG5ESv+F2mTIdFla22?= =?us-ascii?q?YBRjs4xqd6+BUvxVKK0Kw+iPtdR4UKu6F5FzwiPJuZ9NRUTsjoU1uZLNaPUlev?= =?us-ascii?q?BNu8U2loE4ABhuQWakM4IO2MyxDO2y3wXe0Qnr2PQYE9qufShiepYcl6zHnC2e?= =?us-ascii?q?8qiFx0GsY=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0DuAQDKXu1ahqrZVdFcGwEBAQEDAQEBC?= =?us-ascii?q?QEBAYQXDhdjKIMvP4Edk1iBeYEPgUCFPIwoggMjhwMZBwEEMhYBAgEBAQEBAQE?= =?us-ascii?q?BARMBAQEICwsIKCMMgjUkgk8BAQICAQwSBTAbCwULCQILDQ0dAgIhARIBBQEKE?= =?us-ascii?q?gYTEgmEbgMNCA+NIJAAPIsFghyHDwMKgSuCOBKIE4FUP4MbSjWCTzeEbYJUAoY?= =?us-ascii?q?QCJFgLAhxhHSFa4J9gTU8hgOEbodBggJJhi4PIYEEDBcBggIzGidMMQaCDAmCF?= =?us-ascii?q?xeDRYE+iRQ/MI1xKoIcAQE?= X-IPAS-Result: =?us-ascii?q?A0DuAQDKXu1ahqrZVdFcGwEBAQEDAQEBCQEBAYQXDhdjKIM?= =?us-ascii?q?vP4Edk1iBeYEPgUCFPIwoggMjhwMZBwEEMhYBAgEBAQEBAQEBARMBAQEICwsIK?= =?us-ascii?q?CMMgjUkgk8BAQICAQwSBTAbCwULCQILDQ0dAgIhARIBBQEKEgYTEgmEbgMNCA+?= =?us-ascii?q?NIJAAPIsFghyHDwMKgSuCOBKIE4FUP4MbSjWCTzeEbYJUAoYQCJFgLAhxhHSFa?= =?us-ascii?q?4J9gTU8hgOEbodBggJJhi4PIYEEDBcBggIzGidMMQaCDAmCFxeDRYE+iRQ/MI1?= =?us-ascii?q?xKoIcAQE?= X-IronPort-AV: E=Sophos;i="5.49,365,1520895600"; d="scan'208,217";a="264410274" Received: from mail-ua0-f170.google.com ([209.85.217.170]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 05 May 2018 09:42:18 +0200 Received: by mail-ua0-f170.google.com with SMTP id y8so15471038ual.5 for ; Sat, 05 May 2018 00:42:18 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=hCb7YDqS0OlndwCq0uKa7+JoRFkNYCq26qeRWykS68A=; b=iIGdjZsyyRHrSb5lQ/VFvnOR/G6WG+X1B5zbR/hnc29V+aQlUxG6IgIY0+b1EB5Uta WqB7S6EVs4Z09zFZ8QlyClF69lZjSt1NyELo/06soxE4//LJRudXB0ZOvoRdBLkCsq16 G3jhLAAmHDO0lwVIyzq9ji08AM2bqgNZCElfZc/PdMoo1UmPSIZAxw7eTyWlwv3BZMSM 7f3hxMv1mDjNqpzNz6yfjosYUJ99/a6Oem3kGEJMAFAys63BDkkC1trRdl4TpwYrn55B ET6oJJI8jo8w0g/j13KIpS7Bj9VP73UtZA9AUZcX5fmK5aQLt287Vpt+Jqq32DNQUL86 rOFg== X-Gm-Message-State: ALQs6tD97RC+svnu2Cg+xpJNDU4xlMwnSYcrp43I1hPD5Pgvt1glrTQ7 PgLMpFMhZstxReGw6/i8kkkMzkWPJQ5iwB/LsxrHeg== X-Google-Smtp-Source: AB8JxZqQuqP56OSRliUwD0NVSQM13RJo6FxPT+fff1TI7ajyFHxGsja/myUuWdYB+NVZieeDW5Al0Cb+0HKVMSynQy8= X-Received: by 10.159.36.131 with SMTP id 3mr26805668uar.7.1525506134378; Sat, 05 May 2018 00:42:14 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Xavier Leroy Date: Sat, 05 May 2018 07:42:03 +0000 Message-ID: To: Chet Murthy Cc: Frederic Perriot , caml-list Content-Type: multipart/alternative; boundary="001a1136eaf83e0d17056b709308" Subject: Re: [Caml-list] an implicit GC rule? Reply-To: Xavier Leroy X-Loop: caml-list@inria.fr X-Sequence: 16876 Errors-to: caml-list-owner@inria.fr Precedence: list Precedence: bulk Sender: caml-list-request@inria.fr X-no-archive: yes List-Id: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --001a1136eaf83e0d17056b709308 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sat, May 5, 2018 at 5:25 AM Chet Murthy wrote: > It's been a while since I did this sort of thing, but I suspect if you > declare CAMLlocal variables for each intermediate expression, and stick in > the assignments, that should solve your problem (while not making your co= de > too ugly). E.g. > > CAMLprim value left_comb(value a, value b, value c) > { > CAMLparam3(a, b, c); > CAMLlocal5(l1, l2, l3, l4, l5); > CAMLreturn(l1 =3D Tree(l2 =3D Tree((l3 =3D Leaf(a)), (l4 =3D Leaf(b))= , (l5 =3D > Leaf(c)))); > } > That's bold C/C++ programming! It might even work in C++, where assignment expressions are l-values if I remember correctly. However, I'm afraid it won't work in C because an assignment expression "lv =3D rv" is a r-value equal to the value of rv converted to the type of lv at the time the assignment is evaluated. So, if lv is a local variable registered with the GC, the GC will update lv when needed, but the "lv =3D rv" expression will keep its initial value. There's also the rules concerning sequence points. I think the code above respects the C99 rules but I'm less sure about the C11 rules. > > Even better, you could linearize the tree of expressions into a sequence, > and that should solve your problem, also. > Yes, that's the robust solution. Spelling it out: CAMLprim value left_comb(value a, value b, value c) { CAMLparam3(a, b, c); CAMLlocal5(la, lb, lc, tab, t); la =3D Leaf(a); lb =3D Leaf(b); lc =3D Leaf(c); tab =3D Tree(la, lb); t =3D Tree(tab, lc); CAMLreturn(t); } You can also do "CAMLreturn(Tree(tab, lc))" directly. - Xavier Leroy > Uh, I think. Been a while since I wrote a lotta C/C++ code to interface > with Ocaml, but this oughta work. > > --chet-- > > > On Wed, May 2, 2018 at 9:09 AM, Frederic Perriot > wrote: > >> Hello caml-list, >> >> I have a GC-related question. To give you some context, I'm writing a >> tool to parse .cmi files and generate .h and .c files, to facilitate >> constructing OCaml variants from C bindings. >> >> For instance, given the following source: >> >> type 'a tree =3D Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file] >> >> >> the tool produces C functions: >> >> CAMLprim value Leaf(value arg1) >> { >> CAMLparam1(arg1); >> CAMLlocal1(obj); >> >> obj =3D caml_alloc_small(1, 0); >> >> Field(obj, 0) =3D arg1; >> >> CAMLreturn(obj); >> } >> >> CAMLprim value Tree(value arg1, value arg2) >> { >> // similar code here >> } >> >> >> From there, it's tempting to nest calls to variant constructors from C >> and write code such as: >> >> CAMLprim value left_comb(value a, value b, value c) >> { >> CAMLparam3(a, b, c); >> CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c))); >> } >> >> >> The problem with the above is the GC root loss due to the nesting of >> calls to allocating functions. >> >> Say Leaf(c) is constructed first, and the resulting value cached in a >> register, then Leaf(b) triggers a collection, thus invalidating the >> register contents, and leaving a dangling pointer in the top Tree. >> >> Here is an actual ocamlopt output, with Leaf(c) getting cached in rbx: >> >> 0x000000000040dbf4 <+149>: callq 0x40d8fd >> 0x000000000040dbf9 <+154>: mov %rax,%rbx >> 0x000000000040dbfc <+157>: mov -0x90(%rbp),%rax >> 0x000000000040dc03 <+164>: mov %rax,%rdi >> 0x000000000040dc06 <+167>: callq 0x40d8fd >> 0x000000000040dc0b <+172>: mov %rax,%r12 >> 0x000000000040dc0e <+175>: mov -0x88(%rbp),%rax >> 0x000000000040dc15 <+182>: mov %rax,%rdi >> 0x000000000040dc18 <+185>: callq 0x40d8fd >> 0x000000000040dc1d <+190>: mov %r12,%rsi >> 0x000000000040dc20 <+193>: mov %rax,%rdi >> 0x000000000040dc23 <+196>: callq 0x40da19 >> 0x000000000040dc28 <+201>: mov %rbx,%rsi >> 0x000000000040dc2b <+204>: mov %rax,%rdi >> 0x000000000040dc2e <+207>: callq 0x40da19 >> >> >> While the C code clearly violates the spirit of the GC rules, I can't >> help but feel this is still a pitfall. >> >> Rule 2 of the manual states: "Local variables of type value must be >> declared with one of the CAMLlocal macros. [...]" >> >> But here, I'm not declaring local variables, unless you count compiler >> temporaries as local variables? >> >> I can see some other people making the same mistake I did. Should >> there be an explicit warning in the rules? maybe underlining that >> compiler temps count as variables, or discouraging the kind of nested >> calls returning values displayed above? >> >> thanks, >> Fr=C3=A9d=C3=A9ric Perriot >> >> PS: this is also my first time posting to the list, so I take this >> opportunity to thank you for the great Q's and A's I've read here over >> the years >> >> -- >> Caml-list mailing list. Subscription management and archives: >> https://sympa.inria.fr/sympa/arc/caml-list >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-bugs > > > --=20 Caml-list mailing list. Subscription management and archives: https://sympa.inria.fr/sympa/arc/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs= --001a1136eaf83e0d17056b709308 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Sat= , May 5, 2018 at 5:25 AM Chet Murthy <murthy.chet@gmail.com> wrote:
It's been a while since I did this sort = of thing, but I suspect if you declare CAMLlocal variables for each interme= diate expression, and stick in the assignments, that should solve your prob= lem (while not making your code too ugly).=C2=A0 E.g.

<= div>
CAMLprim value left_comb(value a= , value b, value c)
{
=C2=A0 =C2=A0 CAMLparam3(a, b, c)= ;
=C2=A0 CAMLlocal5(l1, l2, l3,= =C2=A0 l4, l5);
=C2=A0 =C2=A0 CAMLreturn(l1 =3D Tree(l2 =3D Tree(= (l3 =3D Leaf(a)), (l4 =3D Leaf(b)), (l5 =3D Leaf(c))));
}

That's bold C/C++ programmin= g!=C2=A0 It might even work in C++, where assignment expressions are l-valu= es if I remember correctly. =C2=A0

However, I'm afraid it won&#= 39;t work in C because an assignment expression "lv =3D rv" is a = r-value equal to the value of rv converted to the type of lv at the time th= e assignment is evaluated.=C2=A0 So, if lv is a local variable registered w= ith the GC, the GC will update lv when needed, but the "lv =3D rv"= ; expression will keep its initial value.

There's als= o the rules concerning sequence points.=C2=A0 I think the code above respec= ts the C99 rules but I'm less sure about the C11 rules.

Even better, = you could linearize the tree of expressions into a sequence, and that shoul= d solve your problem, also.

Yes= , that's the robust solution.=C2=A0 Spelling it out:

CAMLprim value left_comb(value a, value b, value c)
{
=C2=A0 CAMLparam3(a, b, c);
=
=C2=A0 CAMLlocal5(la, lb, lc, tab, t);
=C2=A0 la =3D Lea= f(a);
=C2=A0 lb =3D Leaf(b);
=C2=A0 lc =3D Leaf= (c);
=C2=A0 tab =3D Tree(la, lb);
=C2=A0 t =3D = Tree(tab, lc);
=C2=A0 CAMLreturn(t);
}
You can also do "CAMLreturn(Tree(tab, lc))" directly= .

- Xavier Leroy


Uh, I think.=C2=A0 Been a while si= nce I wrote a lotta C/C++ code to interface with Ocaml, but this oughta wor= k.

--chet--

<= /div>

On Wed= , May 2, 2018 at 9:09 AM, Frederic Perriot <fperriot@gmail.com> wrote:
Hello caml-list,

I have a GC-related question. To give you some context, I'm writing a tool to parse .cmi files and generate .h and .c files, to facilitate
constructing OCaml variants from C bindings.

For instance, given the following source:

type 'a tree =3D Leaf of 'a | Tree of 'a tree * 'a tree [@@= h_file]


the tool produces C functions:

CAMLprim value Leaf(value arg1)
{
=C2=A0 =C2=A0 CAMLparam1(arg1);
=C2=A0 =C2=A0 CAMLlocal1(obj);

=C2=A0 =C2=A0 obj =3D caml_alloc_small(1, 0);

=C2=A0 =C2=A0 Field(obj, 0) =3D arg1;

=C2=A0 =C2=A0 CAMLreturn(obj);
}

CAMLprim value Tree(value arg1, value arg2)
{
=C2=A0 // similar code here
}


>From there, it's tempting to nest calls to variant constructors from C<= br> and write code such as:

CAMLprim value left_comb(value a, value b, value c)
{
=C2=A0 =C2=A0 CAMLparam3(a, b, c);
=C2=A0 =C2=A0 CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
}


The problem with the above is the GC root loss due to the nesting of
calls to allocating functions.

Say Leaf(c) is constructed first, and the resulting value cached in a
register, then Leaf(b) triggers a collection, thus invalidating the
register contents, and leaving a dangling pointer in the top Tree.

Here is an actual ocamlopt output, with Leaf(c) getting cached in rbx:

=C2=A0 =C2=A00x000000000040dbf4 <+149>:=C2=A0 =C2=A0 callq=C2=A0 0x40= d8fd <Leaf>
=C2=A0 =C2=A00x000000000040dbf9 <+154>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %rax,%rbx
=C2=A0 =C2=A00x000000000040dbfc <+157>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= -0x90(%rbp),%rax
=C2=A0 =C2=A00x000000000040dc03 <+164>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %rax,%rdi
=C2=A0 =C2=A00x000000000040dc06 <+167>:=C2=A0 =C2=A0 callq=C2=A0 0x40= d8fd <Leaf>
=C2=A0 =C2=A00x000000000040dc0b <+172>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %rax,%r12
=C2=A0 =C2=A00x000000000040dc0e <+175>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= -0x88(%rbp),%rax
=C2=A0 =C2=A00x000000000040dc15 <+182>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %rax,%rdi
=C2=A0 =C2=A00x000000000040dc18 <+185>:=C2=A0 =C2=A0 callq=C2=A0 0x40= d8fd <Leaf>
=C2=A0 =C2=A00x000000000040dc1d <+190>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %r12,%rsi
=C2=A0 =C2=A00x000000000040dc20 <+193>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %rax,%rdi
=C2=A0 =C2=A00x000000000040dc23 <+196>:=C2=A0 =C2=A0 callq=C2=A0 0x40= da19 <Tree>
=C2=A0 =C2=A00x000000000040dc28 <+201>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %rbx,%rsi
=C2=A0 =C2=A00x000000000040dc2b <+204>:=C2=A0 =C2=A0 mov=C2=A0 =C2=A0= %rax,%rdi
=C2=A0 =C2=A00x000000000040dc2e <+207>:=C2=A0 =C2=A0 callq=C2=A0 0x40= da19 <Tree>


While the C code clearly violates the spirit of the GC rules, I can't help but feel this is still a pitfall.

Rule 2 of the manual states: "Local variables of type value must be
declared with one of the CAMLlocal macros. [...]"

But here, I'm not declaring local variables, unless you count compiler<= br> temporaries as local variables?

I can see some other people making the same mistake I did. Should
there be an explicit warning in the rules? maybe underlining that
compiler temps count as variables, or discouraging the kind of nested
calls returning values displayed above?

thanks,
Fr=C3=A9d=C3=A9ric Perriot

PS: this is also my first time posting to the list, so I take this
opportunity to thank you for the great Q's and A's I've read he= re over
the years

--
Caml-list mailing list.=C2=A0 Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocam= l_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
<= /blockquote>

--001a1136eaf83e0d17056b709308--