caml-list - the Caml user's mailing list
* Calling C from OCaml, GC problems
@ 2000-02-16 15:41 David Mentré
  2000-02-18  1:26 ` Markus Mottl
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: David Mentré @ 2000-02-16 15:41 UTC (permalink / raw)
  To: caml-list

Hello all Camlists,

A friend of mine and I are trying to interface the CMU BDD library (in C)
with OCaml. We have rough working code, but one bug remains. I also have
some questions about interfacing OCaml and C:

 1. The OCaml doc (paragraph 15.5) states that the CAMLparam and CAMLreturn
    macros should be used. However, the example code in 15.7 does not use
    them (even though the functions have value-typed parameters). The Unix
    interfacing code (ocaml-2.04/otherlibs/unix/) doesn't use those macros
    either. Is the doc faulty? Are those macros not mandatory? What is the
    rationale?

 2. When compiling, I get tons of warnings like this:
bdd_interface.c: In function `mlbdd_alloc_manager':
bdd_interface.c:136: warning: left-hand operand of comma expression has no effect
bdd_interface.c:136: warning: unused variable `caml__dummy_result'

    The corresponding C code is:
value mlbdd_alloc_manager(MANAGER m) {
        CAMLparam0();
        CAMLlocal1 (result);  

        bdd_overflow_closure(m, mlbdd_overflow, NULL);
        result = alloc_final(Size_ml_manager, mlbdd_free_manager, 0, 1);
        Manager_store_pointer(result, (value)m);
        PRINT_DEBUG("Alloc manager");
        CAMLreturn result;
}

    Did I use those CAML* macros incorrectly?

 3. The code runs on small examples but segfaults on bigger
    ones. From the gdb backtrace, it seems clear that my bug is related to
    a GC problem:
(gdb) bt
#0  0x8065c45 in mark_slice ()
#1  0x806607b in major_collection_slice ()
#2  0x8066663 in minor_collection ()
#3  0x80666ac in check_urgent_gc ()
#4  0x805ba53 in alloc_final ()
#5  0x804e0fc in mlbdd_alloc_vbdd (m=0x809f020, b=0x80f55b9) at bdd_interface.c:204
#6  0x804ec85 in mlbdd_or (vb1=717431320, vb2=717431288) at bdd_interface.c:441
#7  0x80645bb in interprete ()
#8  0x80656d3 in caml_main ()
#9  0x805b1d5 in main ()
#10 0x2ab457e2 in __libc_start_main () from /lib/libc.so.6

    The BDD library itself uses a memory management library that calls
    sbrk(2). Can this cause problems with the OCaml GC (such as the GC
    scanning BDD structures)?

Below are URLs to the C and OCaml interface code, in case somebody needs them:
  http://www.irisa.fr/paris/pages-perso/David-Mentre/bdd.ml
  http://www.irisa.fr/paris/pages-perso/David-Mentre/bdd_interface.c


Many thanks in advance for any help.

best regards,
david
-- 
 David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/
 Opinions expressed here are only mine.




* Re: Calling C from OCaml, GC problems
  2000-02-16 15:41 Calling C from OCaml, GC problems David Mentré
@ 2000-02-18  1:26 ` Markus Mottl
  2000-02-18  9:45 ` Xavier Leroy
  2000-02-18 10:48 ` Juan J. Quintela
  2 siblings, 0 replies; 9+ messages in thread
From: Markus Mottl @ 2000-02-18  1:26 UTC (permalink / raw)
  To: David Mentré; +Cc: OCAML

>  1. The OCaml doc (paragraph 15.5) states that the CAMLparam and CAMLreturn
>     macros should be used. However, the example code in 15.7 does not use
>     them (even though the functions have value-typed parameters). The Unix
>     interfacing code (ocaml-2.04/otherlibs/unix/) doesn't use those macros
>     either. Is the doc faulty? Are those macros not mandatory? What is the
>     rationale?

"CAMLparam" + "CAMLreturn" are used in cases where the C-program has to
allocate memory on the OCaml-heap. This can, however, trigger a garbage
collection. If you are unlucky, the values you have received from OCaml are
not referenced from any other value in OCaml anymore, i.e. they are not
reachable.

The garbage collector may therefore reclaim the values in question, or move
them to a different address. If you still need them in the C-program and you
try to access them later, you might cause a segfault, because the memory of
these values has been freed or relocated - or even worse: you manipulate some
new value which was allocated in this place, which may cause problems much
later.

To prevent this from happening, you have to "secure" the values by using
the macros mentioned above. "CAMLparam" registers the values as roots, so
they stay reachable (and your C variables stay up to date) for as long as
you need them; "CAMLreturn" removes this protection again and returns the
value.

You need not protect the value parameters if you do not allocate anything
on the OCaml-heap, or if you no longer need them by the time of the next
allocation. This is probably the case in the Unix-library, where you mostly
just extract some system value (file descriptors, etc.) and pass it to the
appropriate system call.
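
To make the mechanism above concrete, here is a minimal standalone sketch.
It is a toy model, not the OCaml runtime: every toy_* name is hypothetical,
the "heap" is two small static arrays, and the collector runs on every
allocation so that the effect always shows. It only illustrates why a
registered local survives a collection that moves data, while an
unregistered copy of the same pointer goes stale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in the OCaml runtime. */
typedef long *toy_value;              /* a "value" points to one heap cell */

#define TOY_WORDS 16
#define TOY_MAX_ROOTS 8

static long toy_space_a[TOY_WORDS], toy_space_b[TOY_WORDS];
static long *toy_heap = toy_space_a;  /* space currently holding live cells */
static long *toy_spare = toy_space_b; /* space the next collection copies to */
static size_t toy_used = 0;

static toy_value *toy_roots[TOY_MAX_ROOTS]; /* addresses of registered locals */
static size_t toy_nroots = 0;

/* Plays the role of CAMLparam/CAMLlocal: tell the collector where a local lives. */
void toy_register_root(toy_value *slot)
{
    assert(toy_nroots < TOY_MAX_ROOTS);
    toy_roots[toy_nroots++] = slot;
}

/* Plays the role of CAMLreturn: drop the most recent registration. */
void toy_unregister_root(void)
{
    assert(toy_nroots > 0);
    toy_nroots--;
}

/* Copy every rooted cell into the spare space, update the rooted locals,
   poison the old space, then swap the two spaces.
   Simplification: two roots pointing at one cell would copy it twice. */
void toy_collect(void)
{
    size_t used = 0;
    for (size_t i = 0; i < toy_nroots; i++) {
        toy_spare[used] = **toy_roots[i];
        *toy_roots[i] = &toy_spare[used];
        used++;
    }
    memset(toy_heap, 0xEE, TOY_WORDS * sizeof(long));
    long *tmp = toy_heap;
    toy_heap = toy_spare;
    toy_spare = tmp;
    toy_used = used;
}

/* Worst case for the caller: every allocation triggers a collection. */
toy_value toy_alloc(long init)
{
    toy_collect();
    assert(toy_used < TOY_WORDS);
    toy_heap[toy_used] = init;
    return &toy_heap[toy_used++];
}

/* Returns 1 if the registered local stayed valid and the copy went stale. */
int toy_demo(void)
{
    toy_value a = toy_alloc(41);
    toy_value stale = a;          /* unregistered copy of the same pointer */
    toy_register_root(&a);
    toy_value b = toy_alloc(1);   /* the collection moves a's cell, updates a */
    int ok = (*a == 41) && (*b == 1) && (stale != a) && (*stale != 41);
    toy_unregister_root();
    return ok;
}
```

In the real runtime a collection happens only now and then, which is why a
missing registration can go unnoticed on small inputs.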

>  2. When compiling, I get tons of warnings like this:
[...]
>     Did I use those CAML* macros incorrectly?

I know that the macros require parentheses now (OCaml 2.99) - so you will
probably have to put them around "result" in "CAMLreturn result" to stay
compatible.
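
As a rough illustration of why the parentheses matter, here is a standalone
sketch with hypothetical TOY_* macros written in the function-like style
(these are not the definitions from the OCaml headers). The preprocessor
only expands a function-like macro when its name is followed by "(", so the
unparenthesised form is left unexpanded and fails to compile.

```c
#include <assert.h>

static int toy_depth = 0;   /* stands in for the runtime's local-root stack */

/* Hypothetical function-like macros, for illustration only. */
#define TOY_ENTER()        (toy_depth++)
#define TOY_RETURN(result) do { toy_depth--; return (result); } while (0)

int toy_add_one(int x)
{
    int result;
    TOY_ENTER();
    result = x + 1;
    TOY_RETURN(result);     /* "TOY_RETURN result;" would not compile */
}

/* Lets a caller check that every TOY_ENTER was matched by a TOY_RETURN. */
int toy_depth_now(void)
{
    return toy_depth;
}
```

Writing the parenthesised form everywhere costs nothing, since the
parentheses simply become part of the returned expression if the macro is
an object-like one.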

Otherwise the CAML-macros seem to be ok, but I am not sure that you are
using values such as "MANAGER m" correctly. I'd have to see the whole
code; unfortunately, your host is unreachable at the moment...

>  3. The code runs on small examples but segfaults on bigger
>     ones. From the gdb backtrace, it seems clear that my bug is related to
>     a GC problem:

This might be a problem of wrong "protection" of some values: small
examples are unlikely to cause many allocations. This means that the
probability of causing and finding the bug is lower.

Unfortunately, depending on what C does with the values, the problem can be
hidden for quite some time until the code really causes a segfault. So the
true location of the problem may be far from the point in the source where
the fault appears.

>     The BDD library itself uses a memory management library that calls
>     sbrk(2). Can this cause problems with the OCaml GC (such as the GC
>     scanning BDD structures)?

Hm, no idea. The man page of "sbrk" says somewhere:

  It is unspecified whether the pointer returned by sbrk() is aligned
  suitably for any purpose.

And the OCaml manual says:

  Any word-aligned pointer to an address outside the heap can be safely
  cast to and from the type value. This includes pointers returned
  by malloc, and pointers to C variables (of size at least one word)
  obtained with the & operator.

It is beyond my understanding whether this might be the cause of problems,
but "sbrk" seems to be somewhat dangerous in this respect.
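
For anyone who wants to check the alignment condition quoted from the manual,
here is a small standalone sketch. The helper names are hypothetical, and the
test assumes the usual flat address space in which converting a pointer to
uintptr_t keeps its low bits. Word alignment matters because OCaml integers
are represented with the lowest bit set, so a word-aligned pointer, whose
lowest bit is zero, cannot be mistaken for one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if p is aligned on a machine word. */
int toy_is_word_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(void *)) == 0;
}

/* malloc returns memory aligned suitably for any object type. */
int toy_malloc_is_aligned(void)
{
    void *p = malloc(64);
    int ok = (p != NULL) && toy_is_word_aligned(p);
    free(p);
    return ok;
}

/* An address one byte past an aligned cell is odd, hence not word-aligned. */
int toy_odd_address_is_aligned(void)
{
    static long cell[2];
    return toy_is_word_aligned((const char *)cell + 1);
}
```

A check like toy_is_word_aligned could be applied, as a debugging aid, to
each pointer coming out of the BDD allocator before it is cast to a value.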

Good luck chasing the bug!

- Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl




* Re: Calling C from OCaml, GC problems
  2000-02-16 15:41 Calling C from OCaml, GC problems David Mentré
  2000-02-18  1:26 ` Markus Mottl
@ 2000-02-18  9:45 ` Xavier Leroy
  2000-02-21 16:54   ` David Mentré
  2000-02-18 10:48 ` Juan J. Quintela
  2 siblings, 1 reply; 9+ messages in thread
From: Xavier Leroy @ 2000-02-18  9:45 UTC (permalink / raw)
  To: David Mentré, caml-list

>  1. The OCaml doc (paragraph 15.5) states that the CAMLparam and CAMLreturn
>     macros should be used. However, the example code in 15.7 does not use
>     them (even though the functions have value-typed parameters). The Unix
>     interfacing code (ocaml-2.04/otherlibs/unix/) doesn't use those macros
>     either. Is the doc faulty? Are those macros not mandatory? What is the
>     rationale?

We've been through several designs for the "local root registration" API.
The CAMLxxx macros are the latest design, and the one that we think is
the easiest to use.  Most parts of the system were written before
those macros were introduced, and thus still use an older API in 2.04
(but the next release of OCaml will use the new API).

>  2. When compiling, I get tons of warnings like this:
> bdd_interface.c: In function `mlbdd_alloc_manager':
> bdd_interface.c:136: warning: left-hand operand of comma expression has no effect
> bdd_interface.c:136: warning: unused variable `caml__dummy_result'

I think those warnings are harmless, and are due to the way the
CAMLxxx macros are written in 2.04.  But only Damien Doligez knows for
sure...

>  3. The code runs on small examples but segfaults on bigger
>     ones. From the gdb backtrace, it seems clear that my bug is related to
>     a GC problem:

That is often due to a local root not being registered, or being
incorrectly registered.

>     The BDD library itself uses a memory management library that calls
>     sbrk(2). Can this cause problems with the OCaml GC (such as the GC
>     scanning BDD structures)?

This shouldn't be a problem.  OCaml allocates its heap using malloc(),
and scans only the portions of the memory space that it allocated itself.
There might be funky interactions between sbrk() and malloc(), but
this is unlikely, as it would cause problems with the BDD library even
in C programs.

- Xavier Leroy




* Re: Calling C from OCaml, GC problems
  2000-02-16 15:41 Calling C from OCaml, GC problems David Mentré
  2000-02-18  1:26 ` Markus Mottl
  2000-02-18  9:45 ` Xavier Leroy
@ 2000-02-18 10:48 ` Juan J. Quintela
  2000-02-21 13:40   ` David Mentré
  2 siblings, 1 reply; 9+ messages in thread
From: Juan J. Quintela @ 2000-02-18 10:48 UTC (permalink / raw)
  To: David Mentré; +Cc: caml-list


> value mlbdd_alloc_manager(MANAGER m) {
>        CAMLparam0();
>        CAMLlocal1 (result);  

>        bdd_overflow_closure(m, mlbdd_overflow, NULL);
>        result = alloc_final(Size_ml_manager, mlbdd_free_manager, 0, 1);
>        Manager_store_pointer(result, (value)m);
>        PRINT_DEBUG("Alloc manager");
>        CAMLreturn result;
>}

I think you need to protect m with 
CAMLparam1(m);
otherwise, during the alloc_final allocation, the garbage collector can be
called and the address of m can change.

I hope this helps.

Later, Juan.

-- 
In theory, practice and theory are the same, but in practice they 
are different -- Larry McVoy




* Re: Calling C from OCaml, GC problems
  2000-02-18 10:48 ` Juan J. Quintela
@ 2000-02-21 13:40   ` David Mentré
  0 siblings, 0 replies; 9+ messages in thread
From: David Mentré @ 2000-02-21 13:40 UTC (permalink / raw)
  To: Juan J. Quintela; +Cc: caml-list

"Juan J. Quintela" <quintela@fi.udc.es> writes:

> I think you need to protect m with 
> CAMLparam1(m);

No. You need to protect only Caml 'values'; m is a plain C pointer.

(I also tested your suggestion: it segfaults at the beginning of the
program.)

d.
-- 
 David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/
 Opinions expressed here are only mine.




* Re: Calling C from OCaml, GC problems
  2000-02-18  9:45 ` Xavier Leroy
@ 2000-02-21 16:54   ` David Mentré
  2000-02-21 23:41     ` Max Skaller
  0 siblings, 1 reply; 9+ messages in thread
From: David Mentré @ 2000-02-21 16:54 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> >  2. When compiling, I get tons of warnings like this:
> > bdd_interface.c: In function `mlbdd_alloc_manager':
> > bdd_interface.c:136: warning: left-hand operand of comma expression has no effect
> > bdd_interface.c:136: warning: unused variable `caml__dummy_result'
> 
> I think those warnings are harmless, and are due to the way the
> CAMLxxx macros are written in 2.04.  But only Damien Doligez knows for
> sure...

Yes. Having a deeper look at those macros, I've found that the first
warning has been fixed in OCaml 2.99 (the CAMLxparam macros referenced
the caml__frame variable without using it).

However, the second warning is still there (even if harmless). The
problem is that the caml__dummy_##name variables are no longer
used. Such a variable exists only to trigger side effects through C's
'(stmt1, stmt2, ..., stmtN)' notation. I see no way to fix this. C lacks
the Caml 'let _ ='. :)
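
The dummy-variable trick can be sketched in standalone C as follows. The
TOY_* names are hypothetical and this is not the code from the OCaml
headers; it only shows a macro that expands to a declaration whose
initializer is a comma expression run for its side effect. One possible way
to silence the "unused variable" warning, assuming a GCC-compatible
compiler, is the unused attribute; on other compilers the macro below
expands to nothing and the warning may remain.

```c
#include <assert.h>

static int toy_registered = 0;   /* counts simulated root registrations */

#if defined(__GNUC__)
#define TOY_UNUSED __attribute__((unused))
#else
#define TOY_UNUSED
#endif

/* Expands to a declaration, so it may sit among other declarations in C89.
   The comma expression runs the side effect, then yields 0 for the dummy. */
#define TOY_PARAM(name) \
    TOY_UNUSED int toy__dummy_##name = (toy_registered++, 0)

int toy_twice(int x)
{
    TOY_PARAM(x);
    int y = 2 * x;   /* still a declaration: legal after TOY_PARAM in C89 */
    return y;
}

/* Lets a caller observe that the side effect in the initializer ran. */
int toy_registered_count(void)
{
    return toy_registered;
}
```

The reason for hiding the side effect in an initializer is that C89 does
not allow a statement before the remaining declarations of a block.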

Anyway, many thanks for the feedback.
Best regards,
d. 
-- 
 David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/
 Opinions expressed here are only mine.





* Re: Calling C from OCaml, GC problems
  2000-02-21 16:54   ` David Mentré
@ 2000-02-21 23:41     ` Max Skaller
  0 siblings, 0 replies; 9+ messages in thread
From: Max Skaller @ 2000-02-21 23:41 UTC (permalink / raw)
  Cc: caml-list

 
> However, the second warning is still there (even if harmless). The
> problem is that the caml__dummy_##name variables are no longer
> used. Such a variable exists only to trigger side effects through C's
> '(stmt1, stmt2, ..., stmtN)' notation. I see no way to fix this. C lacks
> the Caml 'let _ ='. :)

I'd recommend getting rid of those macros. Anyone writing
a CAML/C interface needs to understand details of how the GC
works so as to optimise code to exactly the required
functions to create temporary roots, etc .. it would be better
to provide the raw functions and a good explanation.

For example, the documentation says 'it is safe to
cast C malloced pointers to caml values' which, in English
at least, does NOT tell me what I need to know -- that the
caml GC 'knows' that such values are not pointers into its heap.

For example, I did some work on a copy of mlgtk, and the requirements
led to fairly idiosyncratic code -- code wrapping a single
simple object never needs to make the temporary roots; but if
there are two objects, the first created needs to be rooted.

I do wonder if it would not be useful to have a function(s)
that did allocations without collecting.

-- 
John (Max) Skaller at OTT [Open Telecommications Ltd]
mailto:maxs@in.ot.com.au      -- at work
mailto:skaller@maxtal.com.au  -- at home





* Re: Calling C from OCaml, GC problems
  2000-02-23 19:39 Damien Doligez
@ 2000-02-24 23:36 ` Max Skaller
  0 siblings, 0 replies; 9+ messages in thread
From: Max Skaller @ 2000-02-24 23:36 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

Damien Doligez wrote:
> 
> >From: Max Skaller <maxs@in.ot.com.au>
> >
> >I'd recommend getting rid of those macros. Anyone writing
> >a CAML/C interface needs to understand details of how the GC
> >works so as to optimise code to exactly the required
> >functions to create temporary roots, etc .. it would be better
> >to provide the raw functions and a good explanation.
> 
> Even with the raw functions and a perfect understanding of the system,
> I found it extremely difficult to write bug-free code, and very
> time-consuming to fish out the inevitable bugs.  That's why the macros
> exist.  

	But, use of the macros _also_ leads to code with bugs,
as we are hearing, but programmers now have extra information to
cope with, and no good explanation of exactly what the
GC routines DO.

	Generally I agree with your (elided) comments about
correctness and efficiency, but for some kinds of interfacing,
the best possible performance is required. Anyone doing C interfacing
needs to have considerable expertise -- I think it would be best
to address primary documentation to such people.


-- 
John (Max) Skaller at OTT [Open Telecommications Ltd]
mailto:maxs@in.ot.com.au      -- at work
mailto:skaller@maxtal.com.au  -- at home




* Re: Calling C from OCaml, GC problems
@ 2000-02-23 19:39 Damien Doligez
  2000-02-24 23:36 ` Max Skaller
  0 siblings, 1 reply; 9+ messages in thread
From: Damien Doligez @ 2000-02-23 19:39 UTC (permalink / raw)
  To: caml-list

>From: Max Skaller <maxs@in.ot.com.au>
>
>I'd recommend getting rid of those macros. Anyone writing
>a CAML/C interface needs to understand details of how the GC
>works so as to optimise code to exactly the required
>functions to create temporary roots, etc .. it would be better
>to provide the raw functions and a good explanation.

Even with the raw functions and a perfect understanding of the system,
I found it extremely difficult to write bug-free code, and very
time-consuming to fish out the inevitable bugs.  That's why the macros
exist.  For most people, it is more economical to write slightly less
efficient code if that means less debugging time and fewer bugs in the
released version.


>I do wonder if it would not be useful to have a function(s)
>that did allocations without collecting.

If only we knew how to do that, we could get rid of the GC
altogether.  (tongue in cheek)

-- Damien


