caml-list - the Caml user's mailing list
From: Jeremy Yallop <yallop@gmail.com>
To: Malcolm Matalka <mmatalka@gmail.com>
Cc: "David Sheets" <sheets@alum.mit.edu>,
	"Jeremie Dimino" <jdimino@janestreet.com>,
	"Christoph Höger" <christoph.hoeger@tu-berlin.de>,
	"caml users" <caml-list@inria.fr>
Subject: Re: [Caml-list] Save callbacks from OCaml to C
Date: Wed, 3 Feb 2016 16:14:55 -0800
Message-ID: <CAAxsn=H5OSbFmp=KB9QHXEuzHfXcUiNDmkD8=p7wUBXsR0HDeQ@mail.gmail.com>
In-Reply-To: <86mvrhr1o1.fsf@gmail.com>

On 3 February 2016 at 12:15, Malcolm Matalka <mmatalka@gmail.com> wrote:
> Jeremy Yallop <yallop@gmail.com> writes:
>
>> On 3 February 2016 at 05:44, David Sheets <sheets@alum.mit.edu> wrote:
>>> On Wed, Feb 3, 2016 at 12:26 PM, Malcolm Matalka <mmatalka@gmail.com> wrote:
>>>> Jeremie Dimino <jdimino@janestreet.com> writes:
>>>>> You need to register [ml_t], [ml_x] and [ml_g] as GC roots. Otherwise,
>>>>> if the GC runs in caml_ba_alloc for instance, [ml_t] might end up
>>>>> containing garbage even before reaching [caml_callback3]. You can use
>>>>> the normal macros for that:
>>>>>
>>>> If one is using ctypes, is all of this taken care of?  I have a library
>>>> that registers a bunch of Ocaml functions in C code, which the C code
>>>> calls.  I haven't experienced anything bad happening yet, but that
>>>> doesn't mean much...
>>>
>>> If you use ctypes and pass OCaml closures to C, you *must* retain a
>>> reference to the closure to avoid it being GCed. If you do not, you
>>> may experience the exception CallToExpiredClosure sporadically.
>>
>> Besides David's caveat, the answer is yes: ctypes will take care of
>> registering arguments as GC roots as necessary.
>
> Can you clarify this a bit?  I'm not that familiar with how the C FFI
> works.  If I pass in a closure to a C function and it is registered as a
> GC root, doesn't that mean it won't be GCd if my Ocaml program forgets
> about it or?

That's how roots behave, yes: while a value is registered as a root,
the value won't be collected.  There are (roughly speaking) two types
of root in OCaml: local roots, which persist for the duration of a
function call, and global roots, which persist until explicitly
released.  A C function binding written by hand must ensure that OCaml
values passed to it as arguments are registered as local roots, so
that if a collection occurs while the function is running the values
won't be prematurely collected.

A C binding written using ctypes can generally ignore the matter of
roots.  That's partly because ctypes takes care of root registration,
but also because most types passed between OCaml and C in a ctypes
binding are C values, not OCaml values.  For example, if you want to
pass a structure with several fields between OCaml and C there are two
approaches.  One approach is to represent the structure as an OCaml
record, which involves accessing the fields of the value in your C
binding using various macros, taking care to register values as roots
to protect them from the GC.  The other approach is to represent the
structure as a C struct, which involves accessing its fields in OCaml
using the functions ctypes provides.  (If you enjoy programming in an
untyped dialect of C with ubiquitous concurrency, you'll probably
favour the first approach.  If you prefer programming in OCaml then
the second approach might have some appeal.)
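
As a rough illustration of the second approach, here is a minimal sketch that describes a C struct from OCaml and reads and writes its fields with the functions ctypes provides.  The struct `point` and its fields `x` and `y` are hypothetical, chosen only for the example, and the code assumes the ctypes library is installed and linked.

```ocaml
(* Requires the ctypes library, e.g. ocamlfind ocamlopt -package ctypes ... *)
open Ctypes

(* Hypothetical C struct, for illustration only:
     struct point { int x; int y; };                                  *)
type point
let point : point structure typ = structure "point"
let x = field point "x" int
let y = field point "y" int
let () = seal point   (* no more fields: the layout is now fixed *)

(* Allocate a struct in C-managed memory and fill in its fields. *)
let make_point a b =
  let p = make point in
  setf p x a;
  setf p y b;
  p

(* Read the fields back from OCaml. *)
let sum p = getf p x + getf p y
```

No rooting macros appear anywhere: the struct contents are C values, so the GC has no OCaml pointers to track inside them.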

Using the C value representation for values that cross the C-OCaml
boundary generally works well, but when things become higher-order,
the situation changes a bit.  When a C library expects to be given a
first-order value such as a struct we have to give it a struct with
the appropriate layout, since C functions can directly access the
representation of values.  However, when the library expects a
function pointer we have a bit more freedom, since the representation
of functions isn't accessible -- in fact, the only thing that can be
done with a function pointer, besides passing it from place to place,
is calling it.  This freedom means that we can pass an OCaml function,
suitably packaged up, where a C function pointer is expected.
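
To make that concrete, here is a small sketch that passes an OCaml comparison function to the C library's qsort using Foreign.funptr.  It assumes the ctypes and ctypes.foreign libraries are available; the helper name `sort_ints` is made up for the example.  Declaring the array argument as `ptr int` rather than `ptr void` is a simplification.

```ocaml
(* Requires ctypes and ctypes.foreign (libffi). *)
open Ctypes
open Foreign

(* Type of the comparison callback that qsort expects. *)
let compare_t = ptr int @-> ptr int @-> returning int

(* void qsort(void *base, size_t nmemb, size_t size,
              int (*compar)(const void *, const void *)); *)
let qsort =
  foreign "qsort"
    (ptr int @-> size_t @-> size_t @-> funptr compare_t @-> returning void)

(* Hypothetical helper: sort a list of ints by round-tripping through C. *)
let sort_ints xs =
  let arr = CArray.of_list int xs in
  (* An ordinary OCaml function; ctypes packages it as a function pointer. *)
  let cmp a b = Stdlib.compare (!@ a) (!@ b) in
  qsort (CArray.start arr)
    (Unsigned.Size_t.of_int (CArray.length arr))
    (Unsigned.Size_t.of_int (sizeof int))
    cmp;
  CArray.to_list arr
```

qsort is the easy case: it only calls the function pointer before it returns, so the closure is in use for the whole time C needs it.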

Passing OCaml functions to C as function pointers raises some
interesting issues relating to object lifetime and the garbage
collector.  The main difficulty arises from the fact that once you
pass a function pointer to a C library there's no way of knowing how
long the library holds on to it: for example, the library might
discard the function pointer when the call into the library returns,
or it might store the function pointer in a global variable to be
retrieved and called later.  In order to prevent the associated
function from being collected prematurely, some kind of action is
needed on the OCaml side, whether registering a global root, or
ensuring that the function is reachable from the OCaml program.
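
The lifetime problem can be imitated in pure OCaml.  The following is only a simulation, not ctypes itself: a weak pointer stands in for a C library that holds a callback the collector doesn't know about, and a list on the OCaml side plays the role of "keeping the function reachable".  All names here (`c_side`, `retained`, `register`, `invoke`, `make_callback`) are hypothetical.

```ocaml
(* Stand-in for storage inside a C library: a weak pointer does not keep
   its target alive, much as the GC cannot see a pointer held by C. *)
let c_side : (unit -> int) Weak.t = Weak.create 1

(* OCaml-side references that keep callbacks reachable. *)
let retained : (unit -> int) list ref = ref []

let register ~keep f =
  if keep then retained := f :: !retained;
  Weak.set c_side 0 (Some f)

(* What the "C library" does later: call the stored callback, if it is
   still there. *)
let invoke () =
  match Weak.get c_side 0 with
  | Some f -> Ok (f ())
  | None -> Error `Expired

(* Capturing [n] forces the closure to be allocated on the heap. *)
let make_callback n = fun () -> n + 1

let () =
  register ~keep:true (make_callback 41);
  Gc.full_major ();
  match invoke () with
  | Ok v -> Printf.printf "callback survived: %d\n" v
  | Error `Expired -> print_endline "callback was collected"
```

With `~keep:true` the callback is still reachable after the collection, so the call succeeds.  With `~keep:false` nothing on the OCaml side refers to the closure, and a major collection is free to reclaim it before the call.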

> Also, David and I were talking about how to solve this on IRC.  In my
> specific case, callbacks are one-shot, which means I know they need to
> be remembered until they are called then they can (possibly) be freed.
> Is there a nice solution here?  I'd prefer not to store them in some
> other data structure and remove them later just to keep a reference
> alive, if possible.

Storing some kind of reference to the functions in a place that the
collector can see is essential to prevent the functions from being
collected prematurely.  The situation is the same whether you use
ctypes or write bindings by hand.

Storing the functions in a table and removing them automatically
after they're called is one approach.  An alternative is to use the
new Ctypes.Roots module, which will be available in the next release:

   https://github.com/ocamllabs/ocaml-ctypes/blob/182a9e64/src/ctypes/ctypes.mli#L419-L435
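
A minimal sketch of the table approach for one-shot callbacks, using only the standard library, might look like this.  The names `pending`, `register` and `fire` are hypothetical, and the sketch assumes the C API lets you associate some integer token (user data, for instance) with each callback so that the entry can be found again when the callback fires.

```ocaml
(* Pending one-shot callbacks, keyed by an integer token.  While an
   entry is in the table the closure is reachable, so it cannot be
   collected. *)
let pending : (int, unit -> unit) Hashtbl.t = Hashtbl.create 1024

let next_id = ref 0

(* Store the callback and return the token to associate with it. *)
let register f =
  let id = !next_id in
  incr next_id;
  Hashtbl.replace pending id f;
  id

(* Run the callback for [id] at most once.  The entry is removed before
   the call, so the closure becomes collectable once it has run.
   Returns [false] if the token is unknown or has already fired. *)
let fire id =
  match Hashtbl.find_opt pending id with
  | None -> false
  | Some f ->
    Hashtbl.remove pending id;
    f ();
    true
```

In a real binding `fire` would be called from the single OCaml function handed to C, with the token recovered from whatever argument the library passes back.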

> That is overhead I'd prefer to avoid, if possible.
> I plan on having possibly hundreds of thousands of these callbacks alive
> at any point in time.

In that case it sounds like there'll be an overhead of up to a few megabytes.
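
For a back-of-the-envelope version of that estimate, the arithmetic can be sketched as below.  Every figure is an illustrative assumption rather than a measurement: 200,000 live callbacks, four words of bookkeeping per callback (for example a block header plus three fields in a table cell), and 8-byte words on a 64-bit system.

```ocaml
(* Rough bookkeeping cost of keeping callbacks alive.  All inputs are
   assumptions supplied by the caller, not measured values. *)
let estimated_bytes ~callbacks ~words_per_entry ~bytes_per_word =
  callbacks * words_per_entry * bytes_per_word

let () =
  let bytes =
    estimated_bytes ~callbacks:200_000 ~words_per_entry:4 ~bytes_per_word:8
  in
  (* 200_000 * 4 * 8 = 6_400_000 bytes *)
  Printf.printf "%.1f MB\n" (float_of_int bytes /. 1e6)
```

The closures themselves take additional space, but they would be allocated whichever way they are kept alive.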
