caml-list - the Caml user's mailing list
* [Caml-list] What is triggering a lot of GC work?
@ 2013-02-25  2:08 Francois Berenger
  2013-02-25  8:02 ` Mark Shinwell
  2013-02-25 13:31 ` AW: " Gerd Stolpmann
  0 siblings, 2 replies; 10+ messages in thread
From: Francois Berenger @ 2013-02-25  2:08 UTC (permalink / raw)
  To: caml-list

Hello,

Is there a way to profile a program in order
to know which places in the source code
trigger a lot of garbage collection work?

I've seen some profiling traces of OCaml programs
of mine; sometimes the trace is very flat,
and the only things that stand out are GC-related.

I think it may mean some performance-critical part
is written in a functional style and may benefit
from some more imperative style.

Regards,
F.

* Re: [Caml-list] What is triggering a lot of GC work?
  2013-02-25  2:08 [Caml-list] What is triggering a lot of GC work? Francois Berenger
@ 2013-02-25  8:02 ` Mark Shinwell
  2013-02-25 10:32   ` ygrek
  2013-02-25 13:31 ` AW: " Gerd Stolpmann
  1 sibling, 1 reply; 10+ messages in thread
From: Mark Shinwell @ 2013-02-25  8:02 UTC (permalink / raw)
  To: Francois Berenger; +Cc: caml-list

On 25 February 2013 02:08, Francois Berenger <berenger@riken.jp> wrote:
> Is there a way to profile a program in order
> to know which places in the source code
> trigger a lot of garbage collection work?

Well, as of last week, there is!

I'm working on a compiler and runtime patch which allows the
identification, without excessive overhead, of every location (source
file name / line number) which causes a minor or major heap allocation
together with the number of words allocated at that point.

There should be something available within the next couple of weeks.
It only works on native code compiled for x86-64 machines at present.
Currently it has only been tested on Linux---although I expect it to
work on other Unix-like platforms with little or no modification.

Mark

* Re: [Caml-list] What is triggering a lot of GC work?
  2013-02-25  8:02 ` Mark Shinwell
@ 2013-02-25 10:32   ` ygrek
  2013-02-26  3:46     ` Francois Berenger
  0 siblings, 1 reply; 10+ messages in thread
From: ygrek @ 2013-02-25 10:32 UTC (permalink / raw)
  To: caml-list

On Mon, 25 Feb 2013 08:02:54 +0000
Mark Shinwell <mshinwell@janestreet.com> wrote:

> On 25 February 2013 02:08, Francois Berenger <berenger@riken.jp> wrote:
> > Is there a way to profile a program in order
> > to know which places in the source code
> > trigger a lot of garbage collection work?
> 
> Well, as of last week, there is!
> 
> I'm working on a compiler and runtime patch which allows the
> identification, without excessive overhead, of every location (source
> file name / line number) which causes a minor or major heap allocation
> together with the number of words allocated at that point.
> 
> There should be something available within the next couple of weeks.
> It only works on native code compiled for x86-64 machines at present.
> Currently it has only been tested on Linux---although I expect it to
> work on other Unix-like platforms with little or no modification.

Meanwhile, you can use the poor man's allocation profiler:
- http://ygrek.org.ua/p/code/pmpa
- https://sympa-roc.inria.fr/wws/arc/caml-list/2011-08/msg00050.html
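
Independently of either tool, a crude manual check is possible with the standard Gc module: read the minor-heap allocation counter before and after a suspect section. The sketch below is only an illustration; the helper name minor_words_of and the two toy workloads are assumptions, not part of any profiler mentioned in this thread.

```ocaml
(* Run [f] and return its result together with the number of words
   allocated on the minor heap while it ran. *)
let minor_words_of f =
  let before = Gc.minor_words () in
  let result = f () in
  let after = Gc.minor_words () in
  (result, after -. before)

(* Functional style: one fresh cons cell (3 words) per element. *)
let squares_list n = List.init n (fun i -> i * i)

(* Imperative style: a single integer accumulator, no per-element block. *)
let sum_squares n =
  let acc = ref 0 in
  for i = 0 to n - 1 do
    acc := !acc + (i * i)
  done;
  !acc

let () =
  let _, w_list = minor_words_of (fun () -> squares_list 1000) in
  let _, w_loop = minor_words_of (fun () -> sum_squares 1000) in
  Printf.printf "list: %.0f words, loop: %.0f words\n" w_list w_loop
```

Wrapping a few candidate functions this way narrows down where the allocation comes from, although it says nothing about how expensive each allocation is for the GC.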

-- 
 ygrek
 http://ygrek.org.ua

* AW: [Caml-list] What is triggering a lot of GC work?
  2013-02-25  2:08 [Caml-list] What is triggering a lot of GC work? Francois Berenger
  2013-02-25  8:02 ` Mark Shinwell
@ 2013-02-25 13:31 ` Gerd Stolpmann
  2013-02-25 15:45   ` Alain Frisch
  1 sibling, 1 reply; 10+ messages in thread
From: Gerd Stolpmann @ 2013-02-25 13:31 UTC (permalink / raw)
  To: Francois Berenger; +Cc: caml-list

On 25.02.2013 03:08:14, Francois Berenger wrote:
> Hello,
> 
> Is there a way to profile a program in order
> to know which places in the source code
> trigger a lot of garbage collection work?
> 
> I've seen some profiling traces of OCaml programs
> of mine, sometimes the trace is very flat,
> and the obvious things are only GC-related.
> 
> I think it may mean some performance-critical part
> is written in a functional style and may benefit
> from some more imperative style.

This is a really hard question, and I fear an allocation profiler
cannot always answer it. Imperative style means using assignments, and
assignments often have to go through caml_modify, so they are not as
cheap as you would think. In contrast, allocating something new can
usually avoid caml_modify.

This can have counter-intuitive consequences. Yesterday I sped an  
imperative program up by adding allocations! The idea is so strange  
that I need to report it here. The program uses an array for storing  
intermediate values. Originally, there was only one such array, and  
sooner or later this array was moved to the major heap by the GC.  
Assigning the elements of an array in the major heap with young values  
is the most expensive form of assignment - the array elements are  
temporarily registered as roots by the OCaml runtime. So my idea was to  
create a fresh copy of the array now and then so it is more often in
the minor heap (the array was quite small). Assignments within the
minor heap are cheaper - no root registration. In the end, the program
was 10% faster.
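
The trick can be illustrated with a synthetic sketch. This is not the original program; the function name run, the refresh_every parameter, the frame size of 8 and the use of string_of_int as the workload are all assumptions made for the example. Whether the refreshed variant is actually faster depends on the runtime version and the workload, so it should be measured rather than assumed.

```ocaml
(* Synthetic stand-in for an interpreter loop: every step stores a freshly
   allocated (young) string in slot 0 of a small frame array. *)
let run ?refresh_every steps =
  let frame = ref (Array.make 8 "") in
  for i = 1 to steps do
    (match refresh_every with
     | Some n when i mod n = 0 ->
       (* Assumption: a copy this small is allocated in the minor heap, so
          the stores that follow target a young block again. *)
       frame := Array.copy !frame
     | _ -> ());
    let slots = !frame in
    (* Storing a boxed value into an array goes through caml_modify. *)
    slots.(0) <- string_of_int i
  done;
  (!frame).(0)

(* CPU time of [f ()], in seconds. *)
let time f =
  let t0 = Sys.time () in
  ignore (f ());
  Sys.time () -. t0

let () =
  let steps = 2_000_000 in
  Printf.printf "single frame: %.3fs\n" (time (fun () -> run steps));
  Printf.printf "refreshed:    %.3fs\n"
    (time (fun () -> run ~refresh_every:64 steps))
```

Without refresh_every the frame is promoted to the major heap after the first minor collection and stays there; with it, the frame is replaced by a young copy every 64 steps.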

My general experience is that optimizing the memory behavior is one of
the most difficult tasks, especially because the OCaml runtime is
designed for functional programming, and short-lived allocations are
really cheap. Usual rules like "assignment is cheaper than new
allocation" just do not hold. It depends.

Gerd


-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------

* Re: AW: [Caml-list] What is triggering a lot of GC work?
  2013-02-25 13:31 ` AW: " Gerd Stolpmann
@ 2013-02-25 15:45   ` Alain Frisch
  2013-02-25 16:26     ` Gerd Stolpmann
  2013-02-25 16:32     ` Gabriel Scherer
  0 siblings, 2 replies; 10+ messages in thread
From: Alain Frisch @ 2013-02-25 15:45 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Francois Berenger, caml-list

On 02/25/2013 02:31 PM, Gerd Stolpmann wrote:
> This can have counter-intuitive consequences. Yesterday I sped an
> imperative program up by adding allocations!

This is really an interesting scenario, thanks for sharing!

Two other approaches to addressing the same performance issue could have 
been:

  1. increase the size of the minor heap so that your array stays in it 
long enough;

  2. try to reduce the number of other allocations.

Did you try either of these approaches as well?  (The first in particular
is easy to test.)
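
Approach 1 can be tried without touching the program logic, either by setting the s parameter of OCAMLRUNPARAM (the minor heap size in words, for example OCAMLRUNPARAM='s=4M') or from inside the program through the Gc module. The helper below is a minimal sketch; the name with_minor_heap is hypothetical and the size passed to it is just an example value.

```ocaml
(* Run [f] with the minor heap resized to [words] words, then restore the
   previous GC settings even if [f] raises. *)
let with_minor_heap words f =
  let saved = Gc.get () in
  Gc.set { saved with Gc.minor_heap_size = words };
  Fun.protect ~finally:(fun () -> Gc.set saved) f

let () =
  let result = with_minor_heap (1 lsl 20) (fun () -> List.init 10 succ) in
  Printf.printf "computed %d elements\n" (List.length result)
```

Changing the minor heap size triggers a minor collection, so a helper like this belongs around a whole phase of the program rather than inside a hot loop.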



Gabriel Scherer recently called on the community to share representative
"benchmarks", in order to help core developers target optimization 
efforts to where they are useful:

http://gallium.inria.fr/~scherer/gagallium/we-need-a-representative-benchmark-suite/

Gabriel: apart from LexiFi's contribution, did you get any code?  Gerd:
it would be great if you could share the code you mention above; is it
an option?  There are a number of optimizations which have been proposed
(related to boxing of floats, compilation strategy for let-binding on
tuples, etc.), which could significantly reduce the allocation rate of
some programs.  In my experience, this reduction can be observed on
real-sized programs, but it does not translate to noticeable speedups.
It might be the case that your program would benefit from such 
optimizations.  Having access to the code would be very useful!


Alain

* Re: AW: [Caml-list] What is triggering a lot of GC work?
  2013-02-25 15:45   ` Alain Frisch
@ 2013-02-25 16:26     ` Gerd Stolpmann
  2013-02-25 16:32     ` Gabriel Scherer
  1 sibling, 0 replies; 10+ messages in thread
From: Gerd Stolpmann @ 2013-02-25 16:26 UTC (permalink / raw)
  To: Alain Frisch; +Cc: Francois Berenger, caml-list

On Monday, 25.02.2013, at 16:45 +0100, Alain Frisch wrote:
> On 02/25/2013 02:31 PM, Gerd Stolpmann wrote:
> > This can have counter-intuitive consequences. Yesterday I sped an
> > imperative program up by adding allocations!
> 
> This is really an interesting scenario, thanks for sharing!
> 
> Two other approaches to addressing the same performance issue could have 
> been:
> 
>   1. increase the size of the minor heap so that your array stays in it 
> long enough;
> 
>   2. try to reduce the number of other allocations.
> 
> Did you try one of these approaches as well?  (1 in particular is 
> particularly easy to test.)

No, there was no chance of keeping this array in the minor heap
otherwise; the program was running for too long.

> Gabriel Scherer recently called the community to share representative 
> "benchmarks", in order to help core developers target optimization 
> efforts to where they are useful:
> 
> http://gallium.inria.fr/~scherer/gagallium/we-need-a-representative-benchmark-suite/
> 
> Gabriel: except from LexiFi's contribution, did you get any code?  Gerd: 
> it would be great if you could share the code you mention above; is it 
> an option?  

Unfortunately not - it's an interpreter I developed for my customer. I
can try to create a synthetic demo case just to show the effect. (In
this program the array is actually a kind of stack frame, and it is
interpreting some data manipulation code. When executing a statement,
the current data item is put into the first cell of the frame, so we
really have a lot of assignments here. The data items are strings, and
every data manipulation creates new strings, which results in a certain
allocation rate (though not a really high one, as e.g. in a term rewriter).)

Gerd


* Re: AW: [Caml-list] What is triggering a lot of GC work?
  2013-02-25 15:45   ` Alain Frisch
  2013-02-25 16:26     ` Gerd Stolpmann
@ 2013-02-25 16:32     ` Gabriel Scherer
  2013-02-25 16:52       ` [Caml-list] OCaml benchmarks Török Edwin
  1 sibling, 1 reply; 10+ messages in thread
From: Gabriel Scherer @ 2013-02-25 16:32 UTC (permalink / raw)
  To: Alain Frisch; +Cc: Gerd Stolpmann, Francois Berenger, caml-list

Thanks for the friendly poking. I did get some code (I've actually
been surprised by how dedicated some submitters are, e.g. Edwin Török),
but my plate has been full non-stop since and I haven't yet taken the
time to put this into shape. It's on my TODO list and I hope to share
some results in the coming weeks.

Regarding the interesting battle story from Gerd, my own idea was to
"oldify" the values before inserting them in the array, in order not
to fire the write barrier. Oldifying values is costly as well, so I'm
not sure if that's interesting if the array is long-lived but the
elements short-lived. And more importantly, the oldifying interface
is, to my knowledge, not exposed to end-users (while it's possible
through the C interface to allocate directly in the old region), so
this cannot be written and tested without ugly hacks right now. I'd
still be curious to know how this solution would compare to the
others.


* [Caml-list] OCaml benchmarks
  2013-02-25 16:32     ` Gabriel Scherer
@ 2013-02-25 16:52       ` Török Edwin
  0 siblings, 0 replies; 10+ messages in thread
From: Török Edwin @ 2013-02-25 16:52 UTC (permalink / raw)
  To: caml-list

On 02/25/2013 06:32 PM, Gabriel Scherer wrote:
> Thanks for the friendly poking. I did get some code (I've actually
> been surprised by how dedicated some submitters are, e.g. Edwin Török),
> but my plate has been full non-stop since and I haven't yet taken the
> time to put this into shape. It's on my TODO list and I hope to share
> some results in the coming weeks.


Thanks, it's not yet finished though: I meant to add a benchmark for ocaml-re too and then publish it.
I got sidetracked trying to find some meaningful way to easily represent the results though (the text output is a bit too verbose).

But since you brought it up I'd like your opinion on plots:

Currently I'm thinking of generating from the .csv:
 - one SVG boxplot for (weighted) median/mean of OCaml version X vs Y performance
 - one SVG paired barplot with confidence intervals for the individual benchmarks
  - instead of X-axis labels have on-mouse-over tooltips (SVG title element) describing benchmark name and time statistics
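
The tooltip idea can be sketched in a few lines of OCaml that emit one rect per benchmark with a title child element. Everything here is illustrative: the function names bar and svg, the fixed 400x100 canvas and the bar width are assumptions, confidence intervals are left out, and labels are assumed to contain no characters that need XML escaping.

```ocaml
(* One bar; the <title> child is what browsers typically show on hover. *)
let bar ~x ~height ~label =
  Printf.sprintf
    "<rect x=\"%d\" y=\"%d\" width=\"18\" height=\"%d\"><title>%s</title></rect>"
    x (100 - height) height label

(* A whole plot from (label, height) pairs, bars spaced 20 units apart. *)
let svg bars =
  String.concat "\n"
    ([ "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"400\" height=\"100\">" ]
     @ List.mapi (fun i (label, h) -> bar ~x:(i * 20) ~height:h ~label) bars
     @ [ "</svg>" ])

let () = print_endline (svg [ ("re-match: 1.2s", 40); ("re-compile: 0.3s", 10) ])
```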

Initially I tried boxplots for the individual measurements (using PNG/PDF output of archimedes),
but the graphs either looked too crowded (not enough room to place all labels),
or there were too many graphs and it was hard to get an overall picture (if I put fewer benchmarks/page).


Best regards,
--Edwin

* Re: [Caml-list] What is triggering a lot of GC work?
  2013-02-25 10:32   ` ygrek
@ 2013-02-26  3:46     ` Francois Berenger
  2013-02-26  4:29       ` ygrek
  0 siblings, 1 reply; 10+ messages in thread
From: Francois Berenger @ 2013-02-26  3:46 UTC (permalink / raw)
  To: caml-list

On 02/25/2013 07:32 PM, ygrek wrote:
> On Mon, 25 Feb 2013 08:02:54 +0000
> Mark Shinwell <mshinwell@janestreet.com> wrote:
>
>> On 25 February 2013 02:08, Francois Berenger <berenger@riken.jp> wrote:
>>> Is there a way to profile a program in order
>>> to know which places in the source code
>>> trigger a lot of garbage collection work?
>>
>> Well, as of last week, there is!
>>
>> I'm working on a compiler and runtime patch which allows the
>> identification, without excessive overhead, of every location (source
>> file name / line number) which causes a minor or major heap allocation
>> together with the number of words allocated at that point.
>>
>> There should be something available within the next couple of weeks.
>> It only works on native code compiled for x86-64 machines at present.
>> Currently it has only been tested on Linux---although I expect it to
>> work on other Unix-like platforms with little or no modification.
>
> Meanwhile you can use poor man's allocation profiler :
> - http://ygrek.org.ua/p/code/pmpa
> - https://sympa-roc.inria.fr/wws/arc/caml-list/2011-08/msg00050.html

Did the changes reported for mldonkey to reduce allocations have a
significant impact on performance?


* Re: [Caml-list] What is triggering a lot of GC work?
  2013-02-26  3:46     ` Francois Berenger
@ 2013-02-26  4:29       ` ygrek
  0 siblings, 0 replies; 10+ messages in thread
From: ygrek @ 2013-02-26  4:29 UTC (permalink / raw)
  To: Francois Berenger; +Cc: caml-list

On Tue, 26 Feb 2013 12:46:05 +0900
Francois Berenger <berenger@riken.jp> wrote:

> On 02/25/2013 07:32 PM, ygrek wrote:
> > On Mon, 25 Feb 2013 08:02:54 +0000
> > Mark Shinwell <mshinwell@janestreet.com> wrote:
> >
> >> On 25 February 2013 02:08, Francois Berenger <berenger@riken.jp> wrote:
> >>> Is there a way to profile a program in order
> >>> to know which places in the source code
> >>> trigger a lot of garbage collection work?
> >>
> >> Well, as of last week, there is!
> >>
> >> I'm working on a compiler and runtime patch which allows the
> >> identification, without excessive overhead, of every location (source
> >> file name / line number) which causes a minor or major heap allocation
> >> together with the number of words allocated at that point.
> >>
> >> There should be something available within the next couple of weeks.
> >> It only works on native code compiled for x86-64 machines at present.
> >> Currently it has only been tested on Linux---although I expect it to
> >> work on other Unix-like platforms with little or no modification.
> >
> > Meanwhile you can use poor man's allocation profiler :
> > - http://ygrek.org.ua/p/code/pmpa
> > - https://sympa-roc.inria.fr/wws/arc/caml-list/2011-08/msg00050.html
> 
> Did the changes reported for mldonkey to reduce allocations have a
> significant impact on performance?

Unfortunately, there was no feedback from the users of embedded versions of mldonkey,
who have constrained memory and cpu resources, but there were no performance problems reported
since then either, so it is hard to tell.

-- 
 ygrek
 http://ygrek.org.ua
