Thanks for your help, Sylvain, David & Michel, and for all these explanations.
These insightful comments have clarified my view of the OCaml runtime. It all
makes sense, so I’ll follow your advice and stop worrying about the GC.
Let’s come back to my problem. I played with the OCaml interpreter a bit, and
there is a behavior that I cannot understand. This behavior actually causes a
fairly large range of problems all over my code base. See the following
snippet:
# let q = Queue.create () in
  Queue.push 0 q;
  q == q;;
- : bool = true
Standard behavior. Now let’s see this:
# let q = Queue.create () in
  Queue.push 0 q;
  q = q;;
which hangs forever…
I would have thought physical equality implies structural equality,
but it doesn’t seem like it.
Can you please explain to me what’s wrong there?
Thanks again for your help!
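A possible explanation, offered as a sketch rather than a diagnosis: structural equality (=) descends into both values and does not first check whether they are the same pointer, so it may not terminate on a cyclic value, and it can even return false when a value is compared with itself. Whether the Queue of this OCaml version is cyclic internally is an assumption here, not something verified. The list `l` below is a hypothetical stand-in for the queue; `compare`, by contrast, is documented to return 0 whenever its arguments are physically equal.

```ocaml
(* Hypothetical stand-in for the queue: a list holding nan. *)
let l = [nan]

(* Physical equality: both sides are the same heap block. *)
let same_pointer = l == l

(* Structural equality descends into the list and compares nan with nan,
   which is false for floats, so the whole comparison is false. *)
let same_structure = l = l

(* compare is documented to return 0 whenever e1 == e2 holds. *)
let same_by_compare = compare l l = 0

let () =
  Printf.printf "l == l: %b, l = l: %b, compare l l = 0: %b\n"
    same_pointer same_structure same_by_compare
```

If the queue does turn out to be cyclic, using `==` or `compare q q = 0` instead of `q = q` would be one way to avoid the traversal.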
From: David Allsopp [mailto:dra-news@metastack.com]
Sent: Sunday 9 August 2009 10:39
To: 'ivan chollet'
Subject: RE: [Caml-list] Re: ocaml sefault in bytecode: unanswered questions
Chapter 18.2 of the manual is what you need: it explains the value type used
internally for the heap. In hyper-simplistic terms, when the garbage collector
runs it treats every word that is not tagged as an integer as a pointer and
applies a colouring to the heap blocks to determine reachable values (the tag
bit that separates integers from pointers is why integer values are only 31 or
63 bits in OCaml: you get the performance of their being unboxed, but the hit
of losing a bit for the tag).
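The 31/63-bit integer width mentioned above can be checked from inside a program. This is a minimal sketch, assuming the standard native or bytecode compiler and a more recent OCaml than the 3.x releases discussed in this thread (it relies on `Sys.int_size`); the names `reserved_bits` and `max_int_matches` are made up for the illustration.

```ocaml
(* Sys.word_size is the machine word size in bits (32 or 64);
   Sys.int_size is the number of bits in an OCaml int. *)
let reserved_bits = Sys.word_size - Sys.int_size

(* max_int has int_size - 1 value bits below the sign bit. *)
let max_int_matches = max_int = (1 lsl (Sys.int_size - 1)) - 1

let () =
  Printf.printf "word: %d bits, int: %d bits, reserved: %d bit(s)\n"
    Sys.word_size Sys.int_size reserved_bits
```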
Personally, I wouldn’t spend so much time worrying about the Garbage Collector
– it just works! To answer part of your original question: using only the
standard library, the only way that you can segfault a program (other than by
abusing Marshal) is to abuse the Obj module (Obj.magic in particular allows
you to circumvent the type system). However, the documentation for Obj says it
all – “Operations on internal representations of values. Not for the casual
user.”
If you’ve seen an example in the past which segfaulted the bytecode runtime,
then it was a bug in the compiler... if you can still produce a repro case
then raise a bug in Mantis. However, the Garbage Collector received a lot of
attention in both 3.10 (big overhaul of lazy values) and 3.11 (change to the
way the memory tables are implemented) so if it was with an older version of
OCaml that you saw it then the error may have disappeared. Similarly, the
semantics of the bytecode runtime and native runtime are supposed to be the
same – but there are a few instances where the native runtime intentionally
segfaults (for performance) where the bytecode runtime would raise an
exception (stack overflow is the principal one).
Best,
D
From: ivan chollet [mailto:ivan.chollet@free.fr]
Sent: 09 August 2009 08:59
To: 'David Allsopp'
Cc: caml-list@yquem.inria.fr; 'Edgar Friendly'
Subject: RE: [Caml-list] Re: ocaml sefault in bytecode: unanswered questions
Definitely.
Actually I had my real-world case in mind, so let me explain further with the
following snippet:
(* myfun ignores the element it is called on *)
let myfun _ =
  doSomeWork ();
  myref := List.filter somefilterfunction !myref in
List.iter myfun !myref
In this case, a new linked list is allocated by List.filter on each iteration
of List.iter.
Then, if doSomeWork () does a lot of work and lots of allocations, the GC will
be called on a regular basis while in function myfun.
Then List.iter is tail-recursive, so it doesn’t push its successive arguments
on the stack. So the successively created myref lists become unreachable while
still iterating on them.
So my question is, how does the GC know whether all these myref lists created
throughout the iteration are collectable or not? I’m curious about how these
lists are tagged/untagged by the garbage collector. Maybe pointing me to the
relevant portions of the ocamlrun source code would be nice.
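For what it is worth, the scenario can be tried directly. The sketch below is only an illustration: `myfun`, `visited` and the filter are hypothetical stand-ins for the real code. `!myref` is evaluated once, before the iteration starts, and List.iter keeps the remaining tail of that original list as its own argument, so the cells still to be visited stay reachable even when a full collection is forced on every step.

```ocaml
let myref = ref [1; 2; 3; 4; 5; 6]

(* Elements seen by List.iter, most recent first. *)
let visited : int list ref = ref []

(* Hypothetical stand-in for the real callback: it records the element,
   replaces the list held by myref with a freshly allocated one, and then
   forces a full collection. *)
let myfun x =
  visited := x :: !visited;
  myref := List.filter (fun y -> y > x) !myref;
  Gc.full_major ()

let () =
  (* !myref is evaluated once, before the iteration starts. *)
  List.iter myfun !myref;
  Printf.printf "visited %d elements, myref now has %d\n"
    (List.length !visited) (List.length !myref)
```

The iteration walks the original six elements regardless of what myref points to afterwards.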
Anyway, no worries: once I have a bit more free time I’ll just try to read
about this topic by myself. Also I’ll try to send you some source code for
that. All this will take me a little while, so see you next time!
From: David Allsopp [mailto:dra-news@metastack.com]
Sent: Saturday 8 August 2009 19:25
To: 'ivan chollet'
Cc: caml-list@yquem.inria.fr
Subject: RE: [Caml-list] Re: ocaml sefault in bytecode: unanswered questions
When you pass a value to a function, you create a pointer to that value in the
OCaml runtime – the GC can’t collect the old value until List.iter completes
because the value is still live (internally, it’s part of a local root but, in
practice, as List.iter is implemented in OCaml directly it’s because an OCaml
function parameter references the value). Note that in this example:
let a = [1; 2; 3]
and b = [4; 5; 6]
and c = [7; 8; 9] in
let myref = ref a in
(* No allocations are done after here *)
myref := a;
myref := b;
myref := c;;
the assignments to [myref] do not result in any memory being allocated at all
(my point is that the action of assigning to a reference does not implicitly
result in an allocation).
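One way to observe this from OCaml itself, as a rough sketch: after an assignment, the reference holds the very same heap block as the list it was assigned from, which physical equality (==) can confirm. The names `shares_a`, `shares_b` and `fresh_copy` are invented for the illustration.

```ocaml
let a = [1; 2; 3]
let b = [4; 5; 6]
let myref = ref a

(* After each assignment the reference holds the very same heap block as the
   list it was assigned from: (==) compares pointers, not contents. *)
let shares_a = !myref == a
let () = myref := b
let shares_b = !myref == b

(* A structurally equal but separately built list is a different block. *)
let fresh_copy = List.map (fun x -> x) b
let copy_is_distinct = not (!myref == fresh_copy) && !myref = fresh_copy
```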
David
From: caml-list-bounces@yquem.inria.fr [mailto:caml-list-bounces@yquem.inria.fr] On Behalf Of ivan chollet
Sent: 08 August 2009 18:10
To: ivan.chollet@free.fr
Cc: caml-list@yquem.inria.fr
Subject: [Caml-list] Re: ocaml sefault in bytecode: unanswered questions
Yes, it was FreeBSD 6.4 with OCaml 3.10.2.
I’ll run the program on Linux later and see how it goes.
Thanks for your advice regarding debugging. I pretty much tried all of these
though… the thing is my error is not an OCaml error at runtime but an error of
the OCaml runtime. And to analyze a core dump of ocamlrun, I just thought my
best bet was gdb. Whatever.
OK, I’ll try to provide you with a minimal OCaml example that produces an
ocamlrun error. Might take a little while as I’m not free.
In the meantime, I’ve got a newbie question regarding the OCaml garbage
collector and the same List.iter stuff:
Say you do a “List.iter myfun !myref”, where !myref is a list (hehe…), and
where myfun is a function that reassigns myref (that is, assignments like
myref := [some new or old objects]). The lists that myref points to during
this process are dropped each time myref is reassigned. Of course the
underlying linked lists that are not referenced anymore shouldn’t be collected
by the GC before the end of the main “List.iter”, otherwise it’s iterating
through a linked list that has been garbage collected.
My question is: does the GC know that it cannot collect the unreferenced myref
lists before the end of the List.iter?
Sorry, I just wanted to ask this question to rule it out.
Thanks again.
On 07-08-2009, ivan chollet <ivan.chollet@free.fr> wrote:
>
> This GDB was configured as "i386-marcel-freebsd"...(no debugging symbols
> found)...
>
> Not very informative. So here are my questions:
I suppose you are running FreeBSD? Which version of FreeBSD, and of OCaml?
>
> - What is the best way to produce and analyze core dumps in ocaml?
> Should I compile in bytecode or native? Is there any special gdb "trick"
> that gives you more information? Is there any special "trick" while
> compiling the ocaml runtime to make it throw more information?
>
gdb is not the perfect tool to debug an OCaml program. You should give
ocamldebug a try, which is a better option for bytecode (see below for
options). Bytecode is more informative when it comes to reporting backtraces
(at least with old versions of OCaml).
Compile every program with the "-g" option (just like gcc).
If you have compiled everything with the "-g" option, you can also use the
environment variable OCAMLRUNPARAM="b" to get a backtrace for your exception
at runtime.
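As a complement to the environment variable, recent OCaml versions also let a program switch backtrace recording on by itself through the Printexc module. The following is a small sketch under that assumption; `risky` is a hypothetical failing function, and the printed backtrace is only informative when the program was compiled with "-g".

```ocaml
(* Turn backtrace recording on from inside the program. *)
let () = Printexc.record_backtrace true

(* Hypothetical failing function, for illustration only. *)
let risky () : unit = raise Not_found

let caught =
  try
    risky ();
    false
  with Not_found ->
    (* Prints the recorded backtrace to stderr; it is only informative if
       the program was compiled with -g. *)
    Printexc.print_backtrace stderr;
    true
```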
> - Then, my main question is actually: in bytecode, what can produce
> segfaults? My ocaml code is completely standard, excepted that I use the
> Marshal module. So my question is rather: outside the Marshal module, what
> can cause segfault?
Some parts of the bytecode runtime are just standard C, so anything can cause
a segfault there, just as in C. These errors are not very common, but it is
possible that some cases are not well handled on FreeBSD. Most probably a
porting issue.
The Marshal module can easily trigger a segfault when you map the loaded data
to a type which doesn't match the dumped data.
Example:
List.length (Marshal.from_string (Marshal.to_string 1234 []) 0);;
Here the integer value is marshalled and then unmarshalled as a list ->
segfault.
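By contrast, here is a sketch of the same round trip read back at the type that was actually written. The type annotation is a convention, not a safety check: Marshal does not verify it against the bytes, so it only makes the assumption visible at the call site. The names `data` and `n` are illustrative.

```ocaml
(* Marshal an int, as in the example above. *)
let data = Marshal.to_string 1234 []

(* Annotate the expected type at the call site. The annotation is NOT checked
   against the marshalled bytes; it only makes the assumption visible. *)
let n = (Marshal.from_string data 0 : int)

let () = Printf.printf "read back: %d\n" n
```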
>
> - Slightly unrelated question: I have been able to generate
> segfaults by running ocaml code that: 1) iterates recursively through a list
> reference 2) changes the reference while still iterating on it. For example,
> you just do a "List.iter myfun !myref", and within the function myfun, you
> do stuff like "myref := List.filter somefilterfunction !myref". It is not
> good to program like this, but for some reason I thought ocaml would not
> segfault on that. Is this expected behavior? If it's not, I'll be happy to
> provide some simple source code that illustrates it. (nevermind I have
> actually cleaned all my code base from these dirty uses of references)
>
Could you provide a minimal example code for this error? I don't think this
should generate a segfault.
Regards
Sylvain Le Gall