caml-list - the Caml user's mailing list
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: Benjamin Canou <benjamin.canou@gmail.com>
Cc: Caml List <caml-list@inria.fr>,
	Philippe Wang <philippe.wang@gmail.com>,
	adrien jonquet <adjonk@gmail.com>,
	Benjamin Canou <benjamin.canoou@gmail.com>,
	Emmanuel Chailloux <Emmanuel.Chailloux@lip6.fr>,
	Mathias Bourgoin <mathias.bourgoin@gmail.com>
Subject: Re: [Caml-list] OCaml Summer Project decisions are in
Date: Sun, 20 Apr 2008 18:12:57 +0200
Message-ID: <480B6B89.2070403@inria.fr>
In-Reply-To: <1208614908.6790.53.camel@benjamin-laptop>

Berke Durak wrote:

> - How "stoppy" would a stop-the-world parallel GC be in practice?
> The more parallelism you have, the more work is done, the higher the
> frequency of a major collection.

By Amdahl's law, a stop-the-world GC cannot scale well, but it might
be good enough for 2 to 4 cores.  For instance, assuming a sequential
program that spends 25% of its time in garbage collection and
parallelizes with low overhead, you would get a speedup of slightly
more than 2 on a 4-core machine (1 / (0.25 + 0.75/4) is about 2.3).
A related question is how long it takes to stop a
large number of threads.  But only experimentation can tell.
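
As a rough illustration of the estimate above, here is a minimal OCaml
sketch of Amdahl's formula.  It assumes the 25% spent in GC stays fully
serial (stop-the-world) and the remaining 75% parallelizes perfectly;
the function name amdahl_speedup is only illustrative.

```ocaml
(* Amdahl's law: speedup on [cores] cores when a fraction [serial]
   of the sequential run time cannot be parallelized.
   Assumption: the serial fraction is the time spent in a
   stop-the-world GC; everything else scales perfectly. *)
let amdahl_speedup ~serial ~cores =
  1.0 /. (serial +. (1.0 -. serial) /. float_of_int cores)

let () =
  (* Print the predicted speedup for a few core counts. *)
  List.iter
    (fun cores ->
      Printf.printf "%2d cores: speedup %.2f\n" cores
        (amdahl_speedup ~serial:0.25 ~cores))
    [1; 2; 4; 8; 16]
```

Under these assumptions the 4-core figure is about 2.3, and the speedup
can never exceed 1 / 0.25 = 4 however many cores are added, which is why
such a collector would only be reasonable for small core counts.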

> - I'm afraid true concurrency will introduce an awful lot of bugs in
> native bindings.  Thread-unsafe libraries will have to be replaced
> (Str, etc.)

Libraries with stateful APIs (such as Str, but also Hashtbl, etc)
already need user-level locking to be safely used in concurrent programs,
even with today's concurrent-but-not-parallel threading model for Caml.
The only additional problem I can see is with C bindings that expose a
functional API but have global state in their implementation, but I'm
not sure there are many. So, I wouldn't worry too much about this at
the moment.  But there are more acute problems with global C state in
the Caml run-time system itself (see below).
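
To make the point about user-level locking concrete, here is a minimal
sketch of a Hashtbl guarded by a mutex.  The type locked_table and the
helpers with_lock, find_opt and incr_count are hypothetical names chosen
for this example, not part of any library.  It assumes the Mutex module
is available: on OCaml 4.x that means linking the threads library, and
Fun.protect needs OCaml 4.08 or later.

```ocaml
(* A hash table paired with the mutex that protects it. *)
type ('k, 'v) locked_table = {
  lock : Mutex.t;
  table : ('k, 'v) Hashtbl.t;
}

let create n = { lock = Mutex.create (); table = Hashtbl.create n }

(* Run [f] on the underlying table while holding the lock; the lock
   is released even if [f] raises an exception. *)
let with_lock t f =
  Mutex.lock t.lock;
  Fun.protect
    ~finally:(fun () -> Mutex.unlock t.lock)
    (fun () -> f t.table)

let find_opt t k = with_lock t (fun tbl -> Hashtbl.find_opt tbl k)

(* A read-modify-write must happen under a single lock acquisition,
   otherwise two threads could both read the same old count. *)
let incr_count t k =
  with_lock t (fun tbl ->
    let n = Option.value (Hashtbl.find_opt tbl k) ~default:0 in
    Hashtbl.replace tbl k (n + 1))
```

The same discipline is already needed with the current threading model,
because a thread switch can occur in the middle of a multi-step update
such as incr_count.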

> Also what would be the CPU and memory costs?  Don't concurrent GCs
> require extra colors?

Not the ones I know.

Benjamin Canou wrote:

> So our proposal is to let this project be more "a first reusable step
> toward parallelism in OCaml" than "a parallel OCaml".
> More practically, we propose the following subtasks:
>   1. To strip down the current run-time library, rewriting some parts
> that depend too heavily on the current GC
>   2. To clean the (small) parts of the compiler preventing us from
> changing the allocator (for example, OCaml inlines some allocations by
> directly emitting code which modifies the heap pointer).
>   3. To define a clean and documented interface for adding new GCs,
> ideally adding a run-time switch to choose the GC.
>   4. To reinject the current GC, or a simpler sequential GC we
> already wrote for another work, using this interface to validate the
> approach.
>   5. To design a first parallel GC, simple enough for us to be able to
> test and benchmark it before the end of the project and to implement it
> within our interface.

This sounds broadly reasonable for a first step, although you should
be careful not to completely shift your objectives from parallelism to
plug-in GCs.

There are two points that I'd like to emphasize. The first step towards
any form of true parallelism in Caml (including message-passing
between threads having their own heaps and sequential GCs) is to clean
up the numerous global C variables that the run-time system uses.
But maybe this was implicit in your point 1.

Another crucial point is the ability to stop all threads and obtain
their roots, which is not obvious at all and a prerequisite for any
form of concurrent garbage collection.

At any rate, it's probably better not to ask too many questions at
this point and let you concentrate on the task at hand.  Feel free to
contact me and Damien Doligez for specific questions or general
advice.

- Xavier Leroy

