From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Damien.Doligez@inria.fr>
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by yquem.inria.fr (Postfix) with ESMTP id C3828BC57
	for <caml-list@yquem.inria.fr>; Thu,  2 Dec 2010 13:57:23 +0100 (CET)
X-IronPort-AV: E=Sophos;i="4.59,288,1288566000"; 
   d="scan'208";a="82171089"
Received: from 250-120.msr-inria.inria.fr (HELO [10.0.1.2]) ([193.55.250.120])
  by mail2-relais-roc.national.inria.fr with ESMTP/TLS/AES128-SHA; 02 Dec 2010 13:57:23 +0100
Content-Type: text/plain; charset=iso-8859-1
Mime-Version: 1.0 (Apple Message framework v1082)
Subject: Re: [Caml-list] OCaml GC [was Is OCaml fast?]
From: Damien Doligez <Damien.Doligez@inria.fr>
In-Reply-To: <916556265.243293.1291069665032.JavaMail.root@zmbs1.inria.fr>
Date: Thu, 2 Dec 2010 13:57:23 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <0EC166A9-4CBE-48D2-89CD-E50CD8191E11@inria.fr>
References: <AANLkTi=MFt3mGK2L_o=4=o=3FOtA2onvdvqTpw9SObYP@mail.gmail.com> <1290434674.16005.354.camel@thinkpad> <AANLkTi=nsgA3qEmpj0=89AodNN4M1tru+WJSsyYJ0ZqV@mail.gmail.com> <582306206.731582.1290438133628.JavaMail.root@zmbs4.inria.fr> <AANLkTinrvZRXzn5eFvyB2E4_CR_7+XU1zRtbNQLjMhVS@mail.gmail.com> <20101122203334.7adc5ee6@deb0> <20101127221121.0920db65@ordinaves.concept-micro.com> <4CF25878.4060202@univ-savoie.fr> <F909DA08-5F0C-4B3D-8FD2-4EB948098176@googlemail.com> <916556265.243293.1291069665032.JavaMail.root@zmbs1.inria.fr>
To: OCaml mailing list <caml-list@yquem.inria.fr>
X-Mailer: Apple Mail (2.1082)
X-Spam: no; 0.00; ocaml:01 ocaml:01 damien:01 damien:01 ocaml's:01 wsize:01 bsize:01 edwin:98 3.0:98 2.0:98 doligez:01 doligez:01 wrote:01 heap:01 heap:01 


On 2010-11-29, at 23:27, T=F6r=F6k Edwin wrote:

> This seems to be in concordance with the "smaller minor heap =3D> more
> minor collections =3D> slower program" observation, since it is:
> smaller minor heap =3D> more minor collections =3D> more major slices =
=3D>
> major slices can't collect long-lived objects =3D> slower program =
(long
> lived objects are sweeped too many times).
> So more minor collections =3D> higher cost for a major slice

This is a bit naive, because the size of a major slice depends on the
amount of data promoted by the corresponding minor GC.  A smaller minor
heap promotes less data each time, so you get more major slices, but
they are smaller.  The total amount of work done by the major GC doesn't
change because of that.  It only changes because a bigger minor heap
gives more time for data to die before being promoted to the major heap.

> I think OCaml's GC should take into account how successful the last
> major GC was (how many words it freed), and adjust speed accordingly:
> if we know that the major slice will collect next to nothing there is
> no point in wasting time and running it.
>=20
> So this formula should probably be changed:
>  p =3D (double) caml_allocated_words * 3.0 * (100 + caml_percent_free)
>      / Wsize_bsize (caml_stat_heap_size) / caml_percent_free / 2.0;
>=20
> Probably to something that also does:
>  p =3D  p * major_slice_successrate


The success rate is already taken into account by the major GC.  In fact
a target success rate is set by the user: it is
   caml_percent_free / (100 + caml_percent_free)
and the major GC adjusts its speed according to this target.  If the
target is not met, the free list is emptied faster than the GC can
replenish it, so it gets depleted at some point and the major heap
is then extended to satisfy allocation requests.  The bigger major
heap then helps the major GC meet its target, because the success rate
is simply (heap_size - live_data) / heap_size, and that gets higher
as heap_size increases.

I didn't do the math, but I think your modification would make the
major heap size increase without bound.

-- Damien