caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: "Török Edwin" <edwintorok@gmail.com>
To: Damien Doligez <Damien.Doligez@inria.fr>
Cc: OCaml mailing list <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] OCaml GC [was Is OCaml fast?]
Date: Thu, 2 Dec 2010 17:07:59 +0200	[thread overview]
Message-ID: <20101202170759.1c9feec6@deb0> (raw)
In-Reply-To: <0EC166A9-4CBE-48D2-89CD-E50CD8191E11@inria.fr>

[-- Attachment #1: Type: text/plain, Size: 3951 bytes --]

On Thu, 2 Dec 2010 13:57:23 +0100
Damien Doligez <Damien.Doligez@inria.fr> wrote:

> 
> On 2010-11-29, at 23:27, Török Edwin wrote:
> 
> > This seems to be in concordance with the "smaller minor heap => more
> > minor collections => slower program" observation, since it is:
> > smaller minor heap => more minor collections => more major slices =>
> > major slices can't collect long-lived objects => slower program
> > (long lived objects are sweeped too many times).
> > So more minor collections => higher cost for a major slice
> 
> This is a bit naive, because the size of a major slice depends on the
> amount of data promoted by the corresponding minor GC.  A smaller
> minor heap promotes less data each time, so you get more major
> slices, but they are smaller.  The total amount of work done by the
> major GC doesn't change because of that.  It only changes because a
> bigger minor heap gives more time for data to die before being
> promoted to the major heap.

Thanks for the explanation. 

Is there a way I could use more than 2 bits for the color?
I thought to try excluding some objects from collection if they failed
to get collected for a few cycles (i.e. just mark them black from the
start). Tried modifying Wosize_hd, and Make_header, but it is not
enough, anything else I missed?

> 
> > I think OCaml's GC should take into account how successful the last
> > major GC was (how many words it freed), and adjust speed
> > accordingly: if we know that the major slice will collect next to
> > nothing there is no point in wasting time and running it.
> > 
> > So this formula should probably be changed:
> >  p = (double) caml_allocated_words * 3.0 * (100 + caml_percent_free)
> >      / Wsize_bsize (caml_stat_heap_size) / caml_percent_free / 2.0;
> > 
> > Probably to something that also does:
> >  p =  p * major_slice_successrate
> 
> 
> The success rate is already taken into account by the major GC.  In
> fact a target success rate is set by the user: it is
>    caml_percent_free / (100 + caml_percent_free)

Thanks, that is probably what I had in mind, but adjusting it doesn't
work as I initially expected.

> and the major GC adjusts its speed according to this target.  If the
> target is not met, the free list is emptied faster than the GC can
> replenish it, so it gets depleted at some point and the major heap
> is then extended to satisfy allocation requests.  The bigger major
> heap then helps the major GC meet its target, because the success rate
> is simply (heap_size - live_data) / heap_size, and that gets higher
> as heap_size increases.
> 
> I didn't do the math, but I think your modification would make the
> major heap size increase without bound.

Yeah, I've been experimenting, and unless I put a lower bound on
'p' (like half of initial one) it used way too much memory (sure it was
also faster, but then it is also faster by not running the GC at all
which is not good).

Which parameter do you think could be automatically tuned to accomodate
bursts of allocations? Maybe major_heap_increment (based on recent
history of heap expansions, and promoted words)?

BTW I was also looking at ways to speed up mark_slice and sweep_slice,
(which is not easy since it is mostly optimized already).
Found a place where GCC is not smart enough though. Attached patch
improves GC sweep_slice by 3% - 10% in ocamlopt, depending on CPU (3%
Intel, 10% AMD Phenom II X6). Should I open a bug and attach it?

Using gcc -O3 gives would sometimes give the same benefits, but I
don't really trust -O3, -O2 is as far as I would go. See here for all
the details: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46763
[speaking of which why is the GC compiled with -O, is -O2 too risky
too?]

P.S.: minor_heap_size got bumped in 3.12 branch, so it is now greater
than major_heap_increment. Is that intended?

Best regards,
--Edwin

[-- Attachment #2: gc_opt.patch --]
[-- Type: text/x-patch, Size: 2499 bytes --]

diff -ru /home/edwin/ocaml-3.12.0+rc1/byterun//major_gc.c ./major_gc.c
--- /home/edwin/ocaml-3.12.0+rc1/byterun//major_gc.c	2009-11-04 14:25:47.000000000 +0200
+++ ./major_gc.c	2010-12-02 17:02:38.000000000 +0200
@@ -286,21 +286,25 @@
 {
   char *hp;
   header_t hd;
+  /* speed opt: keep global in local var */
+  char *gc_sweep_hp = caml_gc_sweep_hp;
 
   caml_gc_message (0x40, "Sweeping %ld words\n", work);
   while (work > 0){
-    if (caml_gc_sweep_hp < limit){
-      hp = caml_gc_sweep_hp;
+    if (gc_sweep_hp < limit){
+      hp = gc_sweep_hp;
       hd = Hd_hp (hp);
       work -= Whsize_hd (hd);
-      caml_gc_sweep_hp += Bhsize_hd (hd);
+      gc_sweep_hp += Bhsize_hd (hd);
+      PREFETCH_READ_NT(gc_sweep_hp);
       switch (Color_hd (hd)){
       case Caml_white:
+	caml_gc_sweep_hp = gc_sweep_hp;
         if (Tag_hd (hd) == Custom_tag){
           void (*final_fun)(value) = Custom_ops_val(Val_hp(hp))->finalize;
           if (final_fun != NULL) final_fun(Val_hp(hp));
         }
-        caml_gc_sweep_hp = caml_fl_merge_block (Bp_hp (hp));
+        gc_sweep_hp = caml_fl_merge_block (Bp_hp (hp));
         break;
       case Caml_blue:
         /* Only the blocks of the free-list are blue.  See [freelist.c]. */
@@ -311,7 +315,7 @@
         Hd_hp (hp) = Whitehd_hd (hd);
         break;
       }
-      Assert (caml_gc_sweep_hp <= limit);
+      Assert (gc_sweep_hp <= limit);
     }else{
       chunk = Chunk_next (chunk);
       if (chunk == NULL){
@@ -320,11 +324,12 @@
         work = 0;
         caml_gc_phase = Phase_idle;
       }else{
-        caml_gc_sweep_hp = chunk;
+        gc_sweep_hp = chunk;
         limit = chunk + Chunk_size (chunk);
       }
     }
   }
+  caml_gc_sweep_hp = gc_sweep_hp;
 }
 
diff -ru /home/edwin/ocaml-3.12.0+rc1/byterun//memory.h ./memory.h
--- /home/edwin/ocaml-3.12.0+rc1/byterun//memory.h	2008-12-03 20:09:09.000000000 +0200
+++ ./memory.h	2010-12-02 17:06:12.000000000 +0200
@@ -215,6 +215,16 @@
   #define CAMLunused
 #endif
 
+/* non-temporal prefetch for read (it need not be left in the cache after the access) */
+#if defined (__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ > 1))
+  #define PREFETCH_READ_NT(addr) __builtin_prefetch((addr), 0, 0)
+  #define PREFETCH_READ(addr) __builtin_prefetch((addr), 0, 3)
+#else
+  #define PREFETCH_READ_NT(addr)
+  #define PREFETCH_READ(addr)
+#endif
+
+
 #define CAMLxparam1(x) \
   struct caml__roots_block caml__roots_##x; \
   CAMLunused int caml__dummy_##x = ( \

  reply	other threads:[~2010-12-02 15:08 UTC|newest]

Thread overview: 113+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-22 13:21 Is OCaml fast? Thanassis Tsiodras
2010-11-22 13:35 ` [Caml-list] " Gregory Bellier
2010-11-22 13:39   ` Lukasz Stafiniak
2010-11-22 13:42   ` Thomas Fischbacher
2010-11-22 13:43 ` Sylvain Le Gall
2010-11-22 13:55 ` [Caml-list] " Dario Teixeira
2010-11-23  2:11   ` Isaac Gouy
     [not found]   ` <999386690.746882.1290478813177.JavaMail.root@zmbs4.inria.fr>
2010-11-23  9:12     ` [Caml-list] " Fabrice Le Fessant
2010-11-23 17:56       ` Isaac Gouy
2010-11-23 19:54         ` [Caml-list] " Mark Diekhans
2010-11-23 20:04           ` Isaac Gouy
2010-11-23 22:56   ` [Caml-list] " oliver
2010-11-23 23:54     ` Jon Harrop
2010-11-24  1:24       ` Erik de Castro Lopo
2010-11-25 11:17         ` Jon Harrop
2010-11-22 14:04 ` Gerd Stolpmann
2010-11-22 14:22   ` [was: Re: Is OCaml fast?] OCaml Shootout task force Sylvain Le Gall
2010-11-22 14:36   ` [Caml-list] Is OCaml fast? bluestorm
2010-11-22 15:01     ` Török Edwin
2010-11-22 15:54       ` Goswin von Brederlow
2010-11-22 15:02     ` Gerd Stolpmann
     [not found]     ` <582306206.731582.1290438133628.JavaMail.root@zmbs4.inria.fr>
2010-11-22 16:46       ` Fabrice Le Fessant
2010-11-22 18:33         ` Török Edwin
2010-11-27 21:11           ` Pierre Etchemaïté
2010-11-28 13:26             ` OCaml GC [was Is OCaml fast?] Christophe Raffalli
2010-11-28 14:29               ` [Caml-list] " Benedikt Meurer
2010-11-28 18:57                 ` Eray Ozkural
2010-11-28 19:40                   ` Jon Harrop
2010-11-28 19:59                     ` Benedikt Meurer
2010-11-28 23:34                       ` Jon Harrop
2010-11-29 12:11                         ` Benedikt Meurer
2010-11-29 22:27                 ` Török Edwin
     [not found]                 ` <916556265.243293.1291069665032.JavaMail.root@zmbs1.inria.fr>
2010-12-02 12:57                   ` Damien Doligez
2010-12-02 15:07                     ` Török Edwin [this message]
2010-12-03 21:42                       ` [Caml-list] optimizing caml_modify [was OCaml GC [was Is OCaml fast?]] ygrek
2010-11-23  2:03     ` Is OCaml fast? Isaac Gouy
2010-11-23 10:37       ` [Caml-list] " Christophe TROESTLER
2010-11-23 15:50         ` Jon Harrop
2010-11-23 18:06           ` Isaac Gouy
2010-11-23 21:34             ` [Caml-list] " Jon Harrop
2010-11-23 16:08         ` Jon Harrop
2010-11-23 18:03           ` Isaac Gouy
2010-11-23 19:14             ` [Caml-list] " Török Edwin
2010-11-23 20:17               ` Isaac Gouy
2010-11-23 21:14             ` [Caml-list] " Christophe TROESTLER
2010-11-23 21:55               ` Isaac Gouy
2010-11-23 22:25             ` [Caml-list] " Richard Jones
2010-11-24  0:11               ` Isaac Gouy
2010-11-23 17:53         ` Isaac Gouy
2010-11-23 19:24           ` [Caml-list] " Gerd Stolpmann
2010-11-23 20:28             ` Isaac Gouy
2010-11-23 20:55               ` [Caml-list] " Gerd Stolpmann
2010-11-23 21:32                 ` Isaac Gouy
2010-11-24 22:28             ` [Caml-list] " Goswin von Brederlow
2010-11-23 23:21           ` evil sloot
2010-11-24  6:54             ` [Caml-list] " David Rajchenbach-Teller
2010-11-22 17:02   ` [Caml-list] " Oliver Bandel
2010-11-22 17:08     ` David Rajchenbach-Teller
2010-11-22 17:23       ` Oliver Bandel
2010-11-22 17:54         ` David Rajchenbach-Teller
2010-11-22 23:55         ` Jeff Schultz
2010-11-22 23:28       ` Eray Ozkural
2010-11-23  2:01       ` Isaac Gouy
2010-11-23 23:27         ` [Caml-list] " oliver
2010-11-24  0:23           ` Isaac Gouy
2010-11-24  1:36             ` [Caml-list] " Eray Ozkural
2010-11-24  2:13               ` Isaac Gouy
2010-11-24  4:39                 ` [Caml-list] " Jeff Meister
2010-11-24  6:22                   ` Andrew
2010-11-24  7:16                     ` Isaac Gouy
2010-11-24  6:50                   ` Isaac Gouy
2010-11-24 10:24                     ` [Caml-list] " Christophe Troestler
2010-11-24 11:33                       ` Eray Ozkural
2010-11-24 17:32                       ` Isaac Gouy
2010-11-24 17:57                         ` [Caml-list] " Christophe Raffalli
2010-11-24 19:07                         ` Ed Keith
2010-11-24 19:13                           ` Isaac Gouy
2010-11-24 19:17                             ` [Caml-list] " David Rajchenbach-Teller
2010-11-24 22:25                               ` Oliver Bandel
2010-11-25 16:59                   ` Stefan Monnier
2010-11-25 18:21                     ` Isaac Gouy
2010-11-25 22:11                     ` [Caml-list] " Jon Harrop
     [not found]                     ` <1534555381.33107.1290723160355.JavaMail.root@zmbs4.inria.fr>
2010-11-25 22:50                       ` Fabrice Le Fessant
2010-11-26 20:25                         ` Isaac Gouy
2010-11-27 18:55                         ` Stefan Monnier
2010-11-28 18:14                         ` [Caml-list] " oliver
2010-11-29 14:19                           ` Gerd Stolpmann
2010-11-29 16:12                             ` Threading and SharedMem (Re: [Caml-list] Re: Is OCaml fast?) Oliver Bandel
2010-11-29 16:24                               ` Gerd Stolpmann
2010-11-29 16:33                                 ` Oliver Bandel
2010-11-27 15:58                     ` [Caml-list] Re: Is OCaml fast? Christophe Raffalli
2010-11-28 18:17                       ` oliver
2010-11-29  7:33                         ` Christophe Raffalli
2010-11-29 11:25                           ` Jon Harrop
2010-11-29 11:44                             ` oliver
2010-11-29 17:29                               ` Christophe Raffalli
2010-11-24  6:55               ` David Rajchenbach-Teller
2010-11-24  7:23                 ` Isaac Gouy
2010-11-23  2:06   ` Isaac Gouy
     [not found] ` <1110536178.728445.1290434684177.JavaMail.root@zmbs4.inria.fr>
2010-11-22 16:39   ` [Caml-list] " Fabrice Le Fessant
2010-11-22 17:21     ` Philippe Strauss
2010-11-22 18:33 ` Oliver Bandel
2010-11-23  1:58 ` Isaac Gouy
2010-11-24 10:29 ` [Caml-list] " David Allsopp
2010-11-24 18:39   ` Isaac Gouy
2010-11-24 20:59     ` [Caml-list] " David Allsopp
2010-11-25  0:16       ` Isaac Gouy
2010-11-24 14:07 ` [Caml-list] " Cedric Cellier
2010-11-24 14:34 ` Vincent Aravantinos
2010-11-24 15:30   ` Thanassis Tsiodras
2010-11-24 16:26     ` Oliver Bandel
2010-11-24 16:27     ` Vincent Aravantinos
2010-11-24 18:16 ` Isaac Gouy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101202170759.1c9feec6@deb0 \
    --to=edwintorok@gmail.com \
    --cc=Damien.Doligez@inria.fr \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).