caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Re: [Caml-list] How does OCaml update references when values are moved by the GC?
       [not found] <418632253.26199.1288302511712.JavaMail.root@zmbs1.inria.fr>
@ 2010-10-29  7:47 ` Damien Doligez
  2010-10-30 11:14   ` Elias Gabriel Amaral da Silva
                     ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Damien Doligez @ 2010-10-29  7:47 UTC (permalink / raw)
  To: caml users


On 2010-10-28, at 23:48, Jon Harrop wrote:

> How does OCaml update references in the stacks and heap when values are
> moved by the GC?


They are updated by the GC, of course.

-- Damien


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] How does OCaml update references when values are moved by the GC?
  2010-10-29  7:47 ` [Caml-list] How does OCaml update references when values are moved by the GC? Damien Doligez
@ 2010-10-30 11:14   ` Elias Gabriel Amaral da Silva
  2010-10-30 17:15   ` Jon Harrop
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Elias Gabriel Amaral da Silva @ 2010-10-30 11:14 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml users

2010/10/29 Damien Doligez <damien.doligez@inria.fr>:
>
> On 2010-10-28, at 23:48, Jon Harrop wrote:
>
>> How does OCaml update references in the stacks and heap when values are
>> moved by the GC?
>
>
> They are updated by the GC, of course.

can't the GC just put a new reference for it in a data structure? it
has to physically move it?

(i think you are referring to moving data to the older generation; it
could be a tree or a linked list, marking what exactly is old)

-- 
Elias Gabriel Amaral da Silva <tolkiendili@gmail.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [Caml-list] How does OCaml update references when values are moved by the GC?
  2010-10-29  7:47 ` [Caml-list] How does OCaml update references when values are moved by the GC? Damien Doligez
  2010-10-30 11:14   ` Elias Gabriel Amaral da Silva
@ 2010-10-30 17:15   ` Jon Harrop
  2010-10-30 17:38     ` Michael Ekstrand
  2010-10-30 20:40   ` Jon Harrop
       [not found]   ` <1033013046.78315.1288458970997.JavaMail.root@zmbs4.inria.fr>
  3 siblings, 1 reply; 8+ messages in thread
From: Jon Harrop @ 2010-10-30 17:15 UTC (permalink / raw)
  To: 'Damien Doligez', caml-list

I was hoping for a little more detail, of course. :-)

How is the mapping from old to new pointers stored? Does the GC rewrite all
of the thread-local stacks in series before allowing any of them to
continue? Does the write barrier record pointers written into the major heap
so only those specific locations are rewritten at minor heap collections but
the entire major heap is rewritten upon compaction? Can the GC distinguish
between an array of ints and an array of pointers at run-time in order to
avoid traversing all of the ints when trying to rewrite pointers?

Also, any idea what the maximum proportion of the running time of a program
is spent doing this rewriting? For example, if you fill a float->float hash
table with values does updating the pointers in the major heap account for a
significant proportion of the total running time of the program?

Cheers,
Jon.

> -----Original Message-----
> From: caml-list-bounces@yquem.inria.fr [mailto:caml-list-
> bounces@yquem.inria.fr] On Behalf Of Damien Doligez
> Sent: 29 October 2010 08:48
> To: caml users
> Subject: Re: [Caml-list] How does OCaml update references when values
> are moved by the GC?
> 
> 
> On 2010-10-28, at 23:48, Jon Harrop wrote:
> 
> > How does OCaml update references in the stacks and heap when values
> are
> > moved by the GC?
> 
> 
> They are updated by the GC, of course.
> 
> -- Damien
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] How does OCaml update references when values are moved by the GC?
  2010-10-30 17:15   ` Jon Harrop
@ 2010-10-30 17:38     ` Michael Ekstrand
  2010-10-30 20:40       ` Jon Harrop
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Ekstrand @ 2010-10-30 17:38 UTC (permalink / raw)
  To: caml-list

On 10/30/2010 12:15 PM, Jon Harrop wrote:
> I was hoping for a little more detail, of course. :-)
> 
> How is the mapping from old to new pointers stored? Does the GC rewrite all
> of the thread-local stacks in series before allowing any of them to
> continue?

I imagine so.

> Does the write barrier record pointers written into the major heap
> so only those specific locations are rewritten at minor heap collections but
> the entire major heap is rewritten upon compaction?

Yes.  The runtime maintains a 'remembered set', a list of pointers from
the major heap back to the minor heap.  Maintaining this set is why
mutable data can be expensive in OCaml - any time you store a pointer
into a mutable field, the runtime must check whether the new link is
from the major to the minor heap and update the refs list accordingly.
Richard WM Jones has details here:

http://rwmj.wordpress.com/2009/08/08/ocaml-internals-part-5-garbage-collection/

> Can the GC distinguish
> between an array of ints and an array of pointers at run-time in order to
> avoid traversing all of the ints when trying to rewrite pointers?

Not that I know of.  The tag block does not have a documented reserved
value to indicate that - there are values to indicate an unboxed float
array, a string, and an array of opaque values, but not an integer array
(unless the opaque value flag is set for integer arrays).

> Also, any idea what the maximum proportion of the running time of a program
> is spent doing this rewriting? For example, if you fill a float->float hash
> table with values does updating the pointers in the major heap account for a
> significant proportion of the total running time of the program?

In my data analysis jobs (which wind up allocating quite large heaps),
the compactor almost never (detectably) runs.  Minor cycles and major
slices are a much larger concern in my experience.  I work around that
by increasing the minor heap size to decrease minor heap thrashing (my
general rule of thumb is that I want each "work unit", whatever that may
be, to fit in the minor heap).

It could well be that other applications will have different
characteristics that trigger more compactions.  I cannot speak for those
applications.  Further, when I have huge floating-point data structures,
I'm usually using bigarrays (not because I choose them over arrays,
typically, but because such code in my work frequently has to interact
with BLAS or SPARSKIT at some point).

- Michael


^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [Caml-list] How does OCaml update references when values are moved by the GC?
  2010-10-29  7:47 ` [Caml-list] How does OCaml update references when values are moved by the GC? Damien Doligez
  2010-10-30 11:14   ` Elias Gabriel Amaral da Silva
  2010-10-30 17:15   ` Jon Harrop
@ 2010-10-30 20:40   ` Jon Harrop
       [not found]   ` <1033013046.78315.1288458970997.JavaMail.root@zmbs4.inria.fr>
  3 siblings, 0 replies; 8+ messages in thread
From: Jon Harrop @ 2010-10-30 20:40 UTC (permalink / raw)
  To: caml-list

I just found some interesting statements made by others on this post about
optimizing OCaml's write barrier:

  http://eigenclass.org/R2/writings/optimizing-caml_modify

In the comments, Mauricio said that caml_modify (the write barrier)
accounted for over 30% of the total running time of the whole application.

> -----Original Message-----
> From: Jon Harrop [mailto:jon@ffconsultancy.com]
> Sent: 30 October 2010 18:16
> To: 'Damien Doligez'; 'caml-list@inria.fr'
> Subject: RE: [Caml-list] How does OCaml update references when values
> are moved by the GC?
> 
> I was hoping for a little more detail, of course. :-)
> 
> How is the mapping from old to new pointers stored? Does the GC rewrite
> all of the thread-local stacks in series before allowing any of them to
> continue? Does the write barrier record pointers written into the major
> heap so only those specific locations are rewritten at minor heap
> collections but the entire major heap is rewritten upon compaction? Can
> the GC distinguish between an array of ints and an array of pointers at
> run-time in order to avoid traversing all of the ints when trying to
> rewrite pointers?
> 
> Also, any idea what the maximum proportion of the running time of a
> program is spent doing this rewriting? For example, if you fill a
> float->float hash table with values does updating the pointers in the
> major heap account for a significant proportion of the total running
> time of the program?
> 
> Cheers,
> Jon.
> 
> > -----Original Message-----
> > From: caml-list-bounces@yquem.inria.fr [mailto:caml-list-
> > bounces@yquem.inria.fr] On Behalf Of Damien Doligez
> > Sent: 29 October 2010 08:48
> > To: caml users
> > Subject: Re: [Caml-list] How does OCaml update references when values
> > are moved by the GC?
> >
> >
> > On 2010-10-28, at 23:48, Jon Harrop wrote:
> >
> > > How does OCaml update references in the stacks and heap when values
> > are
> > > moved by the GC?
> >
> >
> > They are updated by the GC, of course.
> >
> > -- Damien
> >
> > _______________________________________________
> > Caml-list mailing list. Subscription management:
> > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> > Archives: http://caml.inria.fr
> > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> > Bug reports: http://caml.inria.fr/bin/caml-bugs


^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [Caml-list] How does OCaml update references when values are moved by the GC?
  2010-10-30 17:38     ` Michael Ekstrand
@ 2010-10-30 20:40       ` Jon Harrop
  2010-11-03  9:49         ` Richard Jones
  0 siblings, 1 reply; 8+ messages in thread
From: Jon Harrop @ 2010-10-30 20:40 UTC (permalink / raw)
  To: 'Michael Ekstrand', caml-list

Hi Michael,

Thanks for the info. I stumbled upon Richard's excellent web page describing
the internals of OCaml and the write barrier but it does not describe how
the pointers actually get rewritten.

Cheers,
Jon.

> -----Original Message-----
> From: caml-list-bounces@yquem.inria.fr [mailto:caml-list-
> bounces@yquem.inria.fr] On Behalf Of Michael Ekstrand
> Sent: 30 October 2010 18:38
> To: caml-list@yquem.inria.fr
> Subject: Re: [Caml-list] How does OCaml update references when values
> are moved by the GC?
> 
> On 10/30/2010 12:15 PM, Jon Harrop wrote:
> > I was hoping for a little more detail, of course. :-)
> >
> > How is the mapping from old to new pointers stored? Does the GC
> rewrite all
> > of the thread-local stacks in series before allowing any of them to
> > continue?
> 
> I imagine so.
> 
> > Does the write barrier record pointers written into the major heap
> > so only those specific locations are rewritten at minor heap
> collections but
> > the entire major heap is rewritten upon compaction?
> 
> Yes.  The runtime maintains a 'remembered set', a list of pointers from
> the major heap back to the minor heap.  Maintaining this set is why
> mutable data can be expensive in OCaml - any time you store a pointer
> into a mutable field, the runtime must check whether the new link is
> from the major to the minor heap and update the refs list accordingly.
> Richard WM Jones has details here:
> 
> http://rwmj.wordpress.com/2009/08/08/ocaml-internals-part-5-garbage-
> collection/
> 
> > Can the GC distinguish
> > between an array of ints and an array of pointers at run-time in
> order to
> > avoid traversing all of the ints when trying to rewrite pointers?
> 
> Not that I know of.  The tag block does not have a documented reserved
> value to indicate that - there are values to indicate an unboxed float
> array, a string, and an array of opaque values, but not an integer
> array
> (unless the opaque value flag is set for integer arrays).
> 
> > Also, any idea what the maximum proportion of the running time of a
> program
> > is spent doing this rewriting? For example, if you fill a float-
> >float hash
> > table with values does updating the pointers in the major heap
> account for a
> > significant proportion of the total running time of the program?
> 
> In my data analysis jobs (which wind up allocating quite large heaps),
> the compactor almost never (detectably) runs.  Minor cycles and major
> slices are a much larger concern in my experience.  I work around that
> by increasing the minor heap size to decrease minor heap thrashing (my
> general rule of thumb is that I want each "work unit", whatever that
> may
> be, to fit in the minor heap).
> 
> It could well be that other applications will have different
> characteristics that trigger more compactions.  I cannot speak for
> those
> applications.  Further, when I have huge floating-point data
> structures,
> I'm usually using bigarrays (not because I choose them over arrays,
> typically, but because such code in my work frequently has to interact
> with BLAS or SPARSKIT at some point).
> 
> - Michael
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] How does OCaml update references when values are moved by the GC?
       [not found]   ` <1033013046.78315.1288458970997.JavaMail.root@zmbs4.inria.fr>
@ 2010-11-02 10:25     ` Xavier Leroy
  0 siblings, 0 replies; 8+ messages in thread
From: Xavier Leroy @ 2010-11-02 10:25 UTC (permalink / raw)
  To: Jon Harrop; +Cc: 'Damien Doligez', caml-list

Jon Harrop wrote:
> I was hoping for a little more detail, of course. :-)
> 
> How is the mapping from old to new pointers stored?

With forwaring pointers.  Just take 10 minutes to familiarize yourself
with the standard stop&copy algorithm and everything will be clear:

http://en.wikipedia.org/wiki/Cheney's_algorithm

- Xavier Leroy


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] How does OCaml update references when values are moved by the GC?
  2010-10-30 20:40       ` Jon Harrop
@ 2010-11-03  9:49         ` Richard Jones
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Jones @ 2010-11-03  9:49 UTC (permalink / raw)
  To: Jon Harrop; +Cc: 'Michael Ekstrand', caml-list

On Sat, Oct 30, 2010 at 09:40:32PM +0100, Jon Harrop wrote:
> Thanks for the info. I stumbled upon Richard's excellent web page describing
> the internals of OCaml and the write barrier but it does not describe how
> the pointers actually get rewritten.

I found out the details by reading the C code of the garbage collector
and sometimes the generated assembler from ocamlopt -S.  It's pretty
straightforward to follow.

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-11-03  9:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <418632253.26199.1288302511712.JavaMail.root@zmbs1.inria.fr>
2010-10-29  7:47 ` [Caml-list] How does OCaml update references when values are moved by the GC? Damien Doligez
2010-10-30 11:14   ` Elias Gabriel Amaral da Silva
2010-10-30 17:15   ` Jon Harrop
2010-10-30 17:38     ` Michael Ekstrand
2010-10-30 20:40       ` Jon Harrop
2010-11-03  9:49         ` Richard Jones
2010-10-30 20:40   ` Jon Harrop
     [not found]   ` <1033013046.78315.1288458970997.JavaMail.root@zmbs4.inria.fr>
2010-11-02 10:25     ` Xavier Leroy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).