caml-list - the Caml user's mailing list
* [Caml-list] C bindings: memory management
From: Thomas Braibant @ 2011-08-12 10:10 UTC
  To: caml-list

Hi,

During my summer vacation, I decided to have fun trying to make an
OCaml binding for a C library (my first time). My requirements were to
have an "OCaml feeling" (i.e. to have an OCaml interface that looks
like the library was written in OCaml), and to have good memory
management (no leaks).

Following the manual, it was easy to get a working binding for a
subset of the library (enough to follow the tutorial of the given
library). However, I ended up bitten by a nasty problem.

The OCaml interface looks like this (this is a 2D physics library):

module Body : sig
type t (* == body* *)
val make : ... -> t
end

module Space : sig
type t (* == space* *)
val make : unit -> t
val add_body : t -> Body.t -> unit
val step : t -> unit
end

On the C side, Space.make and Body.make correspond to functions that
allocate custom blocks holding a space* and a body* respectively (the
finalizers of these custom blocks call the relevant freeing functions
in C).

However, this is wrong: in the following piece of code, the GC is free
to collect the bodies once execution reaches the loop, since no OCaml
reference to them remains. The space still points to them on the C
side, so I end up with a segmentation fault.

let body1 = Body.make ... in
let body2 = Body.make ... in
let space = Space.make () in
let _ = Space.add_body space body1 in
let _ = Space.add_body space body2 in
for i = 0 to ... do
   Space.step space
done;;

These bodies are not global roots (as far as I understand the
terminology), so I do not see a way to tell the GC not to free the
bodies while there is still a reference to the space they have been
added to. At least, I see no such thing in the documentation.

The solutions I can imagine are:
- define Space.t as a record/tuple that contains a space* and an
OCaml list of the bodies that have been added. This seems a bit of a
duplication of the underlying C library.
- use some reference counting and memory management as an interface
between the target C library and the OCaml library.
- require the user to call a "free" OCaml function to do the memory
management (this does not meet my requirements, but this is how my
target C library is bound in other functional languages...).

Since this problem must be quite frequent (I know of one other
instance of it in a C binding), I hope that there are elegant and
general solutions. If not, I would be glad to know of the tricks used
by other binding maintainers.

With best regards,
Thomas Braibant


* RE: [Caml-list] C bindings: memory management
From: David Allsopp @ 2011-08-12 10:56 UTC
  To: Thomas Braibant, caml-list

Thomas Braibant wrote:
> During  my summer vacations, I decided to have fun trying to make an OCaml
> binding for a C library (my first time). My requirements were to have an
> "OCaml feeling" (i.e. to have an OCaml interface that looks like the
> library was written in OCaml)

This is obviously a good idea! However, it's often worth making the C stubs the simplest possible bindings to the underlying C functions and then creating the "OCaml feeling" in the OCaml code for your library. It's much easier debugging higher level OCaml wrapping around your stubs than it is debugging a faulty C stub (obviously, performance considerations sometimes override this).
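
A minimal sketch of that layering, written in plain OCaml so that it runs on its own. The `Raw` module only stands in for the C stubs (in a real binding each of its functions would be an `external` declaration), and every name in it, including the integer error code, is hypothetical rather than taken from the library under discussion.

```ocaml
(* Stand-in for the C stubs: one function per C call, no policy.
   In a real binding these would be `external` declarations. *)
module Raw = struct
  type body = { mutable mass : float }

  let body_new () = { mass = 1.0 }

  (* Hypothetical C-style convention: 0 on success, -1 on error. *)
  let body_set_mass b m = if m > 0.0 then (b.mass <- m; 0) else -1

  let body_get_mass b = b.mass
end

(* The "OCaml feeling" lives here, in ordinary OCaml that is easy to debug. *)
module Body : sig
  type t
  val make : ?mass:float -> unit -> t
  val mass : t -> float
end = struct
  type t = Raw.body

  let make ?(mass = 1.0) () =
    let b = Raw.body_new () in
    (* Turn the C-style error code into an exception. *)
    if Raw.body_set_mass b mass <> 0 then
      invalid_arg "Body.make: mass must be positive";
    b

  let mass = Raw.body_get_mass
end
```

The stubs stay trivial to review, while optional arguments, exceptions and abstract types are added in a layer where a mistake produces a type error or an exception rather than a crash.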

> and to have good memory management (no leaks).

That's compulsory ;o)
 
> Following the manual, it was easy to get a working binding for a subset of
> the library (enough to follow the tutorial of the given library). However,
> I ended up bitten by a nasty problem.

<snip>

> However, this is wrong, since with the following piece of code, the GC has
> the right to remove the bodies once in the loop (there is no more
> reference to them). I end up with a segmentation fault.
> 
> let body1 = Body.make ... in
> let body2 = Body.make ... in
> let space = Space.make () in
> let _ = Space.add_body space body1 in
> let _ = Space.add_body space body2 in
> for i = 0 to ... do
>    Space.step space
> done;;
> 
> This bodies are not global roots (as far as I understand the terminology),
> so I do not see a way to tell the GC not to free the bodies while there is
> still a reference to the space they have been added to. At least, I see no
> such thing in the documentation.

You need to link the values [space], [body1] and [body2] together so that the GC knows that [body1] and [body2] are still reachable. There's no way around that: if you make [body1] and [body2] part of a global root instead, they'll never be collected.

> The solutions I can imagine are:
> - either to define Space.t as a record/tuple that contains a space* and an
> OCaml list of the bodies that have been added. This seems a bit of a
> duplication of the underlying C library.

Your problem, if I understand it correctly, is that there is a relationship between the value [space] and the values [body1] and [body2] which was set up by (the C stub) [Space.add_body]. In that case, you have to make the GC aware of that relationship, and this is the best way of doing it. Presumably when your variable [space] is garbage collected, it would then be okay to collect [body1] and [body2] if they're not referenced elsewhere. This happens automatically: once [space] has been collected, there will be no more references to [body1] and [body2] and they'll be collected too.

Your C library stores references to the C body* pointers in the space object as part of its own operation - your C stubs store a list of body values with a space value as part of automatic memory management (which your C library presumably does not provide). That's not duplication: they're doing different things with different values.
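
A rough sketch of what that could look like on the OCaml side. The `Raw` module is a pure-OCaml stand-in for the C library and its stubs, so the example runs without any C code; its functions, the integer ids, `body_count` and `steps` are all assumptions made for illustration. The part that matters is the `bodies` field of `Space.t`.

```ocaml
(* Stand-in for the C library. It records only integer ids, to mimic the
   fact that raw body* pointers stored by C are invisible to the GC. *)
module Raw = struct
  type body = { id : int }
  type space = { mutable ids : int list; mutable steps : int }

  let body_new id = { id }
  let space_new () = { ids = []; steps = 0 }
  let space_add_body s b = s.ids <- b.id :: s.ids
  let space_step s = s.steps <- s.steps + 1
  let space_steps s = s.steps
end

module Body = struct
  type t = Raw.body            (* would be the custom block holding body* *)
  let make id = Raw.body_new id
end

module Space : sig
  type t
  val make : unit -> t
  val add_body : t -> Body.t -> unit
  val step : t -> unit
  val body_count : t -> int
  val steps : t -> int
end = struct
  type t = {
    handle : Raw.space;            (* would be the custom block holding space* *)
    mutable bodies : Body.t list;  (* keeps added bodies reachable from the space *)
  }

  let make () = { handle = Raw.space_new (); bodies = [] }

  let add_body s b =
    Raw.space_add_body s.handle b;
    (* Record the relationship where the GC can see it. *)
    s.bodies <- b :: s.bodies

  let step s = Raw.space_step s.handle
  let body_count s = List.length s.bodies
  let steps s = Raw.space_steps s.handle
end
```

With this shape, a body stays alive for as long as the space it was added to is alive, and becomes collectable together with the space once nothing else refers to it. A binding that also exposes a "remove body" operation would need to drop the body from `bodies` at the same time.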

> - either to use some reference counting and memory management as an
> interface between the target C library, and the OCaml library.

Yuck - definitely not. Your reference counters would be no better than the list of values. That's why OCaml has a GC - definitely use it!

> -  either to require the user to use a "free" OCaml function to do the
> memory management (this does not meet my requirements, but this is how my
> target C library is binded in other functional languages...).

This is correct if your underlying C "things" aren't just memory - usually if a resource is "precious" (e.g. file descriptor, socket, etc.) then you should provide close functions on the OCaml side (because end-users' code *should* be worrying about releasing them). Bear in mind that OCaml does not call finalizers when a program terminates (Java and .NET, I *think*, do, for example - but that's a hazy memory!) so you should never have critical release code in a finalizer.
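
For the "precious resource" case, one possible shape is an explicit, idempotent close function plus a scoped helper, so that release never depends on a finalizer running. This is only a sketch: `Resource`, `open_`, `close` and `with_resource` are hypothetical names, and the record stands in for whatever handle the C library would return.

```ocaml
module Resource : sig
  type t
  val open_ : unit -> t
  val close : t -> unit                 (* safe to call more than once *)
  val is_closed : t -> bool
  val with_resource : (t -> 'a) -> 'a   (* closes even if the callback raises *)
end = struct
  type t = { mutable closed : bool }

  let open_ () = { closed = false }

  let close r =
    if not r.closed then begin
      (* A real binding would call the C release function here. *)
      r.closed <- true
    end

  let is_closed r = r.closed

  let with_resource f =
    let r = open_ () in
    (* Fun.protect (OCaml >= 4.08) runs ~finally on normal return and on exceptions. *)
    Fun.protect ~finally:(fun () -> close r) (fun () -> f r)
end
```

A finalizer can still be attached as a safety net for handles the user forgot to close, but with this design the critical release path is the explicit `close`.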

HTH,


David



* Re: [Caml-list] C bindings: memory management
From: Romain Beauxis @ 2011-08-12 14:15 UTC
  To: Thomas Braibant; +Cc: caml-list

2011/8/12 Thomas Braibant <thomas.braibant@gmail.com>:
> Hi,

Hi!

> During  my summer vacations, I decided to have fun trying to make an
> OCaml binding for a C library (my first time). My requirements were to
> have an "OCaml feeling" (i.e. to have an OCaml interface that looks
> like the library was written in OCaml), and to have good memory
> management (no leaks).
>
> Following the manual, it was easy to get a working binding for a
> subset of the library (enough to follow the tutorial of the given
> library). However, I ended up bitten by a nasty problem.
>
> The OCaml interface looks like this (this is a 2D physic library) :
>
> module Body : sig
> type t (* == body* *)
> val make : ... -> t
> end
>
> module Space : sig
> type t (* == space* *)
> val make : unit -> t
> val add_body : t -> Body.t -> unit
> val step : t -> unit
> end
>
> On the C side, Space.make and Body.make correspond to functions that
> allocates custom blocks that hold space* and body* (the finalizers of
> these custom blocks correspond to the relevant free-ing functions in
> C).
>
> However, this is wrong, since with the following piece of code, the GC
> has the right to remove the bodies once in the loop (there is no more
> reference to them). I end up with a segmentation fault.
>
> let body1 = Body.make ... in
> let body2 = Body.make ... in
> let space = Space.make () in
> let _ = Space.add_body space body1 in
> let _ = Space.add_body space body2 in
> for i = 0 to ... do
>   Space.step space
> done;;
>
> This bodies are not global roots (as far as I understand the
> terminology), so I do not see a way to tell the GC not to free the
> bodies while there is still a reference to the space they have been
> added to. At least, I see no such thing in the documentation.
>
> The solutions I can imagine are:
> - either to define Space.t as a record/tuple that contains a space*
> and an OCaml list of the bodies that have been added. This seems a bit
> of a duplication of the underlying C library.

This is the solution I go for when there is no strong need for the
user to control close/collection themselves, i.e. when the resource
is not a file or something similar. Otherwise, I use your third
option below.

> - either to use some reference counting and memory management as an
> interface between the target C library, and the OCaml library.
> -  either to require the user to use a "free" OCaml function to do the
> memory management (this does not meet my requirements, but this is how
> my target C library is binded in other functional languages...).


Romain



* Re: [Caml-list] C bindings: memory management
From: Richard W.M. Jones @ 2011-08-13 14:40 UTC
  To: Thomas Braibant; +Cc: caml-list

On Fri, Aug 12, 2011 at 12:10:17PM +0200, Thomas Braibant wrote:
> The solutions I can imagine are:
> - either to define Space.t as a record/tuple that contains a space*
> and an OCaml list of the bodies that have been added. This seems a bit
> of a duplication of the underlying C library.

This is the way to do it.

It's a pretty common problem (and a commonly overlooked problem IME),
but you'll find it occurs in many C bindings.  A couple of examples in
libraries that I have been involved with:

ocaml-libvirt (http://git.annexia.org/?p=ocaml-libvirt.git;a=summary)
- The relationship between connections (owner) and domains (owned).

ocamlode
- Many objects are related, similar to your example.
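
For an owner/owned relationship, one common shape is for each owned value to carry a reference to its owner, so that the owner cannot be collected while anything it owns is still in use. The sketch below assumes that direction of dependency; it is not code from either library, and the `Connection` and `Domain` modules, their functions and the URI string are hypothetical.

```ocaml
module Connection = struct
  type t = { uri : string }     (* stand-in for the custom block of the owner *)
  let connect uri = { uri }
  let uri c = c.uri
end

module Domain = struct
  (* Pairing the owned handle with its owner keeps the connection
     reachable for as long as any of its domains is reachable. *)
  type t = { name : string; owner : Connection.t }

  let lookup conn name = { name; owner = conn }
  let name d = d.name
  let connection d = d.owner
end

(* The connection stays reachable through [d] even though no other
   variable refers to it. *)
let d = Domain.lookup (Connection.connect "example://host") "guest"
let () = print_endline (Connection.uri (Domain.connection d))
```

This is the mirror image of the space/body case: there the container keeps its members alive, here the member keeps its container alive.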

Rich.

-- 
Richard Jones
Red Hat
