caml-list - the Caml user's mailing list
* Still strange GC problems with OCaml and C: OCaml 2.04 bug?
@ 2000-02-21 14:09 David Mentré
  2000-02-21 20:10 ` Markus Mottl
From: David Mentré @ 2000-02-21 14:09 UTC
  To: caml-list; +Cc: colcombe

Hi all Camlists,

I'm still fighting against my bug while trying to interface OCaml with
CMU bdd library (in C).

Following Xavier's comments, I've protected my values (with the
CAMLparamXX and CAMLlocalXX macros).

The actual code is available at:
  http://www.irisa.fr/paris/pages-perso/David-Mentre/bdd.ml
  http://www.irisa.fr/paris/pages-perso/David-Mentre/bdd_interface.c

 1. I still get a segfault; however, the gdb backtrace has changed:
(gdb) bt
#0  0x80655b5 in mark_slice ()
#1  0x80659eb in major_collection_slice ()
#2  0x8065fd3 in minor_collection ()
#3  0x8063859 in interprete ()
#4  0x8065043 in caml_main ()
#5  0x805ab45 in main ()
#6  0x2ab457e2 in __libc_start_main () from /lib/libc.so.6

    It is still a GC bug; however, the crash no longer happens while our
    C interface is allocating CAML memory. So I think the bug has moved
    (or this is another bug).

 2. More interestingly, when I compile this code with OCaml 2.99 on the
    sun4 architecture, the segfault no longer occurs. Instead I get:

Fatal error: uncaught exception Invalid_argument("Array.get")

    So it is possible that some Invalid_argument exceptions are
    not properly caught in OCaml 2.04.

    For the CAML team: I was not able to produce a small/simple example
    that triggers this bug. Only my gas-plant program seems to
    segfault. Sorry.

 3. I've found an interesting message on Damien Doligez's site[1]:

http://pauillac.inria.fr/~doligez/caml-guts/Fahndrich99.txt

    This message explains that you can get strange CAML GC-related
    problems when deallocating C structures. However, I think my code
    falls under case 2 of Manuel's proposed fixes (i.e. box every C
    pointer inside an abstract CAML block). So, in my opinion, my bug is
    not related to this problem. But, as I am paranoid now :), I've set
    pointers to 1 after deallocating them. This did not fix the bug.


Anyway, if anybody has advice on how to track down this bug, I'll
gladly accept it.


David -- once happy in the Caml-only world


[1]http://pauillac.inria.fr/~doligez/caml-guts/
-- 
 David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/
 Opinions expressed here are only mine.




* Re: Still strange GC problems with OCaml and C: OCaml 2.04 bug?
  2000-02-21 14:09 Still strange GC problems with OCaml and C: OCaml 2.04 bug? David Mentré
@ 2000-02-21 20:10 ` Markus Mottl
  2000-02-22  9:16   ` Congratulation! Bug found!! (GC & C interfacing problems) David Mentré
From: Markus Mottl @ 2000-02-21 20:10 UTC
  To: David Mentré; +Cc: OCAML

> I'm still fighting against my bug while trying to interface OCaml with
> CMU bdd library (in C).

Hm, I have just taken a look at the code. I do not want to be too fast with
my suggestion (I have not tried it), but I am pretty sure that the
following might be the bug:

You use "Store_field" throughout the code to assign pointers to fields in
structures which were allocated using "alloc_final".

I once had a similar bug in my PCRE-library, but Gerd Stolpmann was kind
enough to send me a patch and explain the problem. Here is his translated
explanation (it seems reasonable):

  - after "alloc_small" the fields have to be initialized with
    "Field(var, n) = ...", not with "Store_field". The latter writes
    (with some bad luck) the address of the field into a list of addresses
    which have to be moved in case of a minor GC.

  - The fields of "alloc_final" are not considered by the GC. Therefore,
    they, too, have to be written to using "Field(var, n)" (or you may
    cast them to a normal C-struct). "Store_field" has, again, unexpected
    side effects.
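
As a rough illustration of the difference described above, here is a toy
model in plain C. It is not the OCaml runtime: toy_value, toy_ref_table,
toy_field_assign and toy_store_field are hypothetical names, and the real
"Store_field" records an address only under certain conditions (the "bad
luck" mentioned above), whereas this sketch records it on every call.

```c
/* Toy model only: none of these names exist in the OCaml runtime. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t toy_value;

#define TOY_REF_TABLE_MAX 16

/* Stand-in for the list of field addresses the collector revisits later. */
static toy_value *toy_ref_table[TOY_REF_TABLE_MAX];
static size_t toy_ref_table_len = 0;

/* Models "Field(v, n) = w": a plain store with no bookkeeping. */
static void toy_field_assign(toy_value *block, size_t n, toy_value w)
{
    block[n] = w;
}

/* Models a barriered store: performs the store AND remembers the address
   of the field (unconditionally here, which is a simplification). */
static void toy_store_field(toy_value *block, size_t n, toy_value w)
{
    assert(toy_ref_table_len < TOY_REF_TABLE_MAX);
    block[n] = w;
    toy_ref_table[toy_ref_table_len++] = &block[n];
}
```

Calling toy_field_assign leaves toy_ref_table untouched, while
toy_store_field leaves a pointer into the block behind; that leftover
pointer is the side effect at issue.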

Luckily, you have used access macros throughout the code, so you can
quickly test my suggestion by changing them.

I hope that helps!

In case this is really the bug (probably), I'd suggest a revision of the
C-interface-documentation. At least to me it was not obvious that
"Store_field" leads to such additional, unexpected behaviour.

Good luck squeezing the bug,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl





* Congratulation! Bug found!! (GC & C interfacing problems)
  2000-02-21 20:10 ` Markus Mottl
@ 2000-02-22  9:16   ` David Mentré
  2000-02-22 16:27     ` Markus Mottl
From: David Mentré @ 2000-02-22  9:16 UTC
  To: Markus Mottl; +Cc: OCAML, xavier.leroy, colcombe, damien.doliguez

Hi Markus, Hi all camlists,

You were right, Markus. Using the Field macro directly fixed my bug.

Markus Mottl <mottl@miss.wu-wien.ac.at> writes:

> You use "Store_field" throughout the code to assign pointers to fields in
> structures which were allocated using "alloc_final".
> 
> I once had a similar bug in my PCRE-library, but Gerd Stolpmann was kind
> enough to send me a patch and explain the problem. Here is his translated
> explanation (it seems reasonable):
> 
>   - after "alloc_small" the fields have to be initialized with
>     "Field(var, n) = ...", not with "Store_field". The latter writes
>     (with some bad luck) the address of the field into a list of addresses
>     which have to be moved in case of a minor GC.
> 
>   - The fields of "alloc_final" are not considered by the GC. Therefore,
>     they, too, have to be written to using "Field(var, n)" (or you may
>     cast them to a normal C-struct). "Store_field" has, again, unexpected
>     side effects.

The explanation (or a guess ;) :

  1. A memory block is allocated with alloc_final; therefore this block's
     internals should not be considered by the GC.

  2. I use the Store_field macro to update the block's content.

  3. However, this macro is calling modify (function defined in
     byterun/memory.c) which in turn calls the Modify macro (defined in
     byterun/memory.h). As Markus said, this macro adds the address
     given in argument to a list of memory addresses (ref_table_ptr)
     that should be examined by the GC at collection time.

  4. So, we have a GC-opaque memory block whose field addresses have
     been added to a GC to-examine-later list. Therefore, at GC time:
     crash.
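
The four steps above can be pictured with a small stand-alone C sketch.
It is a toy model of the guess, not the real collector: sim_value,
sim_young, sim_is_int, sim_is_young and sim_count_suspect_slots are
invented names, and a real GC accepts more kinds of values than the two
modelled here (tagged integers and pointers into a young area). Instead
of crashing, the sketch only counts the recorded slots whose contents a
collector following those two rules would not recognise.

```c
/* Toy model only: none of these names exist in the OCaml runtime. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t sim_value;

#define SIM_YOUNG_WORDS 8

/* Stand-in for the young generation: pointers into this array count
   as "young pointers" in the toy model. */
static sim_value sim_young[SIM_YOUNG_WORDS];

/* Toy convention: a word with its low bit set is an immediate integer. */
static int sim_is_int(sim_value v)
{
    return (v & 1u) != 0;
}

static int sim_is_young(sim_value v)
{
    uintptr_t lo = (uintptr_t)&sim_young[0];
    uintptr_t hi = lo + sizeof sim_young;
    return v >= lo && v < hi;
}

/* Walks a list of remembered field addresses, as in step 3, and counts
   the fields holding something the toy collector cannot interpret,
   e.g. a raw C pointer stored in a GC-opaque block. */
static size_t sim_count_suspect_slots(sim_value **table, size_t len)
{
    size_t suspect = 0;
    assert(table != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        sim_value v = *table[i];
        if (!sim_is_int(v) && !sim_is_young(v))
            suspect++;
    }
    return suspect;
}
```

A block whose fields hold a tagged integer and a young pointer yields
zero suspect slots; a block whose single field holds the address of an
ordinary C object yields one.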


> In case this is really the bug (probably), I'd suggest a revision of the
> C-interface-documentation. At least to me it was not obvious that
> "Store_field" leads to such additional, unexpected behaviour.

I also subscribe to this documentation revision. I also volunteer, if
needed, to review/rewrite the doc part related to Interfacing C with
OCaml.


> Good luck squeezing the bug,

I've squeezed it, with your help. :)

Once again, many many thanks,
Best regards,
david
-- 
 David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/
 Opinions expressed here are only mine.





* Re: Congratulation! Bug found!! (GC & C interfacing problems)
  2000-02-22  9:16   ` Congratulation! Bug found!! (GC & C interfacing problems) David Mentré
@ 2000-02-22 16:27     ` Markus Mottl
From: Markus Mottl @ 2000-02-22 16:27 UTC
  To: David Mentré; +Cc: OCAML

>   3. However, this macro is calling modify (function defined in
>      byterun/memory.c) which in turn calls the Modify macro (defined in
>      byterun/memory.h). As Markus said, this macro adds the address
>      given in argument to a list of memory addresses (ref_table_ptr)
>      that should be examined by the GC at collection time.
[...]
> I also subscribe to this documentation revision. I also volunteer, if
> needed, to review/rewrite the doc part related to Interfacing C with
> OCaml.

I guess that the confusion about the real things happening as explained in
"3" above comes from "rule 6" in the documentation of the C-interface,
which says:

  Direct assignment to a field of a block, as in

          Field(v, n) = w;

  is safe only if v is a block newly allocated by alloc_small; that
  is, if no allocation took place between the allocation of v and the
  assignment to the field. In all other cases, never assign directly.

The phrases "safe only" and "in all other cases, never assign directly"
leave the impression that the "Field" macro is a bit "evil" and should be
avoided, possibly by using this nice "Store_field" macro. However, it is
not only "safe" to use "Field" in this case, it seems (?) that this is the
only way to do it correctly. Furthermore, I did not find any documentation
on correctly placing values into blocks created with "alloc_final", which
seems to be pretty similar to "alloc_small" in this respect.

The only information I found concerning "alloc_small" which appears to
indicate correct usage is:

  alloc_small(n, t) returns a fresh small block of size n <=
  Max_young_wosize words, with tag t. If this block is a structured block
  (i.e. if t < No_scan_tag), then the fields of the block (initially
  containing garbage) must be initialized with legal values (using direct
  assignment to the fields of the block) before the next allocation.

The phrase "using direct assignment to the fields" is obviously meant as
a hint to use the "Field" macro. But because most people don't know that
"Store_field" not only assigns but also does other, unexpected things,
this information probably does not help...
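
One way to read the two documentation excerpts together with the
"alloc_final" case discussed in this thread is as a simple predicate.
The following stand-alone C sketch encodes that reading. It is a toy
built on assumptions, not the runtime's API: demo_block, demo_alloc,
demo_direct_assign_ok and the gc_scanned flag are invented for
illustration only.

```c
/* Toy model only: none of these names exist in the OCaml runtime. */
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 8

typedef struct {
    unsigned long born_at; /* allocation counter right after creation */
    int gc_scanned;        /* 1: fields are scanned by the GC (alloc_small-like)
                              0: fields are opaque to it (alloc_final-like) */
    long fields[4];
} demo_block;

static demo_block demo_pool[DEMO_POOL_SIZE];
static unsigned long demo_alloc_count = 0;

/* Every call counts as "an allocation took place". */
static demo_block *demo_alloc(int gc_scanned)
{
    assert(demo_alloc_count < DEMO_POOL_SIZE);
    demo_block *b = &demo_pool[demo_alloc_count];
    demo_alloc_count++;
    b->born_at = demo_alloc_count;
    b->gc_scanned = gc_scanned;
    return b;
}

/* Direct assignment is acceptable if the GC never looks at the fields,
   or if no allocation happened since the block was created. */
static int demo_direct_assign_ok(const demo_block *b)
{
    return !b->gc_scanned || b->born_at == demo_alloc_count;
}
```

Under this reading, a scanned block stops qualifying for direct
assignment as soon as another allocation happens, while an opaque block
qualifies for as long as it lives.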

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl




* Re:  Congratulation! Bug found!! (GC & C interfacing problems)
@ 2000-02-23 19:33 Damien Doligez
From: Damien Doligez @ 2000-02-23 19:33 UTC
  To: caml-list

>From: David.Mentre@irisa.fr (David Mentré)

>I also subscribe to this documentation revision. I also volunteer, if
>needed, to review/rewrite the doc part related to Interfacing C with
>OCaml.

We are always very grateful for any contribution to the system and
especially the docs.  I've added a few notes in the doc based on your
story (as found in my mailbox when I came back from vacation), but I
guess what we really need is a better example to show all the features
of the interface.

-- Damien



