Subject: Re: [Caml-list] OCaml Summer Project decisions are in
From: Benjamin Canou
To: Caml List
Cc: Benjamin Canou, Emmanuel Chailloux, Philippe Wang, adrien jonquet, Mathias Bourgoin
Date: Sat, 19 Apr 2008 16:21:48 +0200

Hi,

I can understand why you are all excited about parallelism in OCaml, and most of your questions/suggestions seem relevant.
However, it is far too early for us to answer the technical questions about implementation choices.

About the bugs in library bindings, I'm afraid you have spotted a really thorny problem. For example, if I remember correctly, gtk C functions have to be called from the same thread in C (which is a clever way of saying that their code isn't reentrant), and this may cause problems if we add real parallelism to OCaml. But you will understand that rewriting lablgtk and/or the programs using it is out of scope.

You also noticed that the task is not trivial, and are concerned about its feasibility. In fact, we most probably won't have the time to find the best solution.
So our proposal is to let this project be more "a first reusable step toward parallelism in OCaml" than "a parallel OCaml".
More practically, we propose the following subtasks:
  1. To strip down the current run-time library, rewriting the parts which depend too heavily on the current GC.
  2. To clean up the (small) parts of the compiler that prevent us from changing the allocator (for example, OCaml inlines some allocations by directly emitting code which modifies the heap pointer).
  3. To define a clean and documented interface for adding new GCs, ideally adding a run-time switch to choose the GC.
  4. To reinject the current GC, or a simpler sequential GC we already wrote for another project, using this interface to validate the approach.
  5. To design a first parallel GC, simple enough for us to test and benchmark before the end of the project, and to implement it within our interface.
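
To illustrate point 2, here is a minimal C sketch of what an inlined allocation boils down to, next to an allocation routed through a replaceable hook. All names (`heap_ptr`, `alloc_inlined`, `alloc_via_hook`, `current_alloc`) are hypothetical and the heap layout is a toy; it does not reproduce what the OCaml compiler or run-time actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is not the OCaml run-time's code. */

#define HEAP_WORDS 1024

static uintptr_t heap[HEAP_WORDS];
static uintptr_t *heap_ptr = heap;               /* next free word */
static uintptr_t *heap_end = heap + HEAP_WORDS;  /* one past the last word */

/* What an "inlined" allocation amounts to: the emitted code bumps the
   heap pointer itself, so it silently assumes a bump allocator. */
static inline uintptr_t *alloc_inlined(size_t words)
{
    assert(words > 0);
    if ((size_t)(heap_end - heap_ptr) < words)
        return NULL;  /* a real run-time would trigger a collection here */
    uintptr_t *block = heap_ptr;
    heap_ptr += words;
    return block;
}

/* The decoupled alternative: emitted code only calls through a pointer
   that each GC is free to replace. */
typedef uintptr_t *(*alloc_fn)(size_t words);
static alloc_fn current_alloc = alloc_inlined;

static uintptr_t *alloc_via_hook(size_t words)
{
    return current_alloc(words);
}
```

The indirect call costs more than the inlined bump, which is presumably why the compiler inlines it in the first place; the sketch only shows where the coupling between compiler and allocator sits.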

With such an approach, we believe that the project has a much greater chance of surviving and perhaps of being integrated upstream.
Also, with such a generic GC plug-in interface, libraries will be able to provide specific GCs, for example ones including some tricks to run gtk-like non-reentrant C calls, or ones dedicated to tasks which are currently problematic because of the current allocation mechanism, like MLGMP.
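
One possible shape for such an interface, sketched in C under stated assumptions: a table of function pointers per collector, plus a lookup by name standing in for the run-time switch. The `gc_ops` structure, the `bump` collector and `select_gc` are invented for illustration; none of them come from the OCaml run-time or from the project itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical interface: every name here is illustrative only. */
struct gc_ops {
    const char *name;
    void       (*init)(size_t heap_words);
    uintptr_t *(*alloc)(size_t words);
    void       (*collect)(void);
};

/* A toy "collector" that never frees: just enough to exercise the interface. */
static uintptr_t *arena;
static size_t arena_words, arena_used;

static void bump_init(size_t heap_words)
{
    free(arena);
    arena = calloc(heap_words, sizeof *arena);  /* calloc checks the size product */
    assert(arena != NULL);
    arena_words = heap_words;
    arena_used = 0;
}

static uintptr_t *bump_alloc(size_t words)
{
    if (words > arena_words - arena_used)
        return NULL;  /* a real GC would collect, then retry */
    uintptr_t *block = arena + arena_used;
    arena_used += words;
    return block;
}

static void bump_collect(void)
{
    /* nothing to reclaim in this toy collector */
}

static const struct gc_ops bump_gc = { "bump", bump_init, bump_alloc, bump_collect };

/* Registry of available collectors; a library could contribute its own entry. */
static const struct gc_ops *const registry[] = { &bump_gc };

/* Run-time switch: look a collector up by name, falling back to the first entry. */
static const struct gc_ops *select_gc(const char *name)
{
    size_t n = sizeof registry / sizeof registry[0];
    for (size_t i = 0; name != NULL && i < n; i++)
        if (strcmp(registry[i]->name, name) == 0)
            return registry[i];
    return registry[0];
}
```

A program could then pick its collector at start-up, for instance with `select_gc(getenv("TOY_GC"))`, where `TOY_GC` is a made-up environment variable.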

We'll probably open a blog or wiki to keep you informed of our progress and to collect suggestions and concerns.

Regards, and many thanks for your interest.
  Benjamin Canou.

On Saturday 19 April 2008 at 10:46 +0200, Berke Durak wrote:
The concurrent GC is a great idea.  A few questions.

- How "stoppy" would a stop-the-world parallel GC be in practice?  The more parallelism
you have, the more work is done, the higher the frequency of a major collection.

- Would major allocations be serialized?  What about other serialization points?

- I'm afraid true concurrency will introduce an awful lot of bugs in native bindings.  Thread-unsafe libraries will have to be replaced (Str, etc.)  Also what would be the CPU
and memory costs?  Don't concurrent GCs require extra colors?

- In case of performance impacts, will the old single-threaded mode still be available?

The argument that "you'll get the same old performance if you run it in single-threaded mode"
is not valid IMHO.  Many people will use a thread here or there and then you won't realistically be able to run in single-threaded mode.

But then we can't pretend multi-core doesn't exist.  A suggestion: making the parallel GC available only on 64-bit seems a reasonable restriction (if that's ever needed.)

Also Damien Doligez (in addition to Xavier Leroy) certainly has nice things to say about all this.
--
Berke Durak

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs