caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* The Future Possibility of Concurrent Garbage Collection?
@ 2006-09-14 15:40 Jim Battin
  2006-09-14 17:04 ` [Caml-list] " skaller
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Jim Battin @ 2006-09-14 15:40 UTC (permalink / raw)
  To: caml-list

Hello,

It seems Moore's law is taking us in the direction of more cores per
microprocessor with less effort placed on exploring ILP.  With the
advent of multi-core processors, and their inevitable ubiquity, are
there any plans, considerations, or ideas for a concurrent garbage
collector in Ocaml?

To my knowledge, Caml Light had a concurrent garbage collector under
development by Xavier Leroy but was abandoned due to significant
technical challenges.  Prior to that, there appears to have been some
academic research regarding concurrent GC (Doligez, Leroy).

Thanks,
Jim


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-14 15:40 The Future Possibility of Concurrent Garbage Collection? Jim Battin
@ 2006-09-14 17:04 ` skaller
  2006-09-14 21:00   ` Andrej Bauer
  2006-09-15 11:36 ` Damien Doligez
  2006-09-15 20:26 ` [Caml-list] " Florian Weimer
  2 siblings, 1 reply; 16+ messages in thread
From: skaller @ 2006-09-14 17:04 UTC (permalink / raw)
  To: Jim Battin; +Cc: caml-list

On Thu, 2006-09-14 at 10:40 -0500, Jim Battin wrote:
>  Prior to that, there appears to have been some
> academic research regarding concurrent GC (Doligez, Leroy).

There's some more recent stuff eg

"Scalable Real-time parallel GC for SMP", Cheng 

(sorry lost the URL ..) which suggests an implementation
for ML like language is quite possible. This paper
describes an algorithms for *parallel* collection,
not merely concurrent collection, that is, multiple
threads (multiple CPUs) participating in the collection.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage  Collection?
  2006-09-14 17:04 ` [Caml-list] " skaller
@ 2006-09-14 21:00   ` Andrej Bauer
  0 siblings, 0 replies; 16+ messages in thread
From: Andrej Bauer @ 2006-09-14 21:00 UTC (permalink / raw)
  Cc: caml-list

skaller wrote:
> On Thu, 2006-09-14 at 10:40 -0500, Jim Battin wrote:
>>  Prior to that, there appears to have been some
>> academic research regarding concurrent GC (Doligez, Leroy).
> 
> There's some more recent stuff eg
> 
> "Scalable Real-time parallel GC for SMP", Cheng 
> 
> (sorry lost the URL ..)

12 page journal version: 
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/guyb/www/papers/gc2001.pdf

Ph.D. thesis:
http://reports-archive.adm.cs.cmu.edu/anon/2001/CMU-CS-01-174.ps

Perry Cheng was a schoolmate of mine. The world is so small.

Best regards,

Andrej


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-14 15:40 The Future Possibility of Concurrent Garbage Collection? Jim Battin
  2006-09-14 17:04 ` [Caml-list] " skaller
@ 2006-09-15 11:36 ` Damien Doligez
  2006-09-15 14:29   ` Yaron Minsky
  2006-09-15 20:26 ` [Caml-list] " Florian Weimer
  2 siblings, 1 reply; 16+ messages in thread
From: Damien Doligez @ 2006-09-15 11:36 UTC (permalink / raw)
  To: caml-list


On 2006-09-14, at 17:40, Jim Battin wrote:

> It seems Moore's law is taking us in the direction of more cores per
> microprocessor with less effort placed on exploring ILP.  With the
> advent of multi-core processors, and their inevitable ubiquity, are
> there any plans, considerations, or ideas for a concurrent garbage
> collector in Ocaml?

It's very nice to have a concurrent run-time system, and we know how
to do it (at significant cost), but it's not really worth the trouble
until we have an answer to this question: what programming language
features do we put on top of it?

Shared memory with threads, locks, and conditions just doesn't cut
it, in terms of writing programs that work.


> To my knowledge, Caml Light had a concurrent garbage collector under
> development by Xavier Leroy but was abandoned due to significant
> technical challenges.  Prior to that, there appears to have been some
> academic research regarding concurrent GC (Doligez, Leroy).

The development was done by myself, it was done before the
publications, and as part of the academic research (like everything
we do here).

The most significant challenges are in making Posix threads work
under Unix (threads and Unix signals just don't mix well, given the
currently available APIs).


In summary, you shouldn't hold your breath.  In my opinion, we
will need some major breakthrough before we can make good use
of multicore architectures in normal programs.  In any language.

-- Damien


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 11:36 ` Damien Doligez
@ 2006-09-15 14:29   ` Yaron Minsky
  2006-09-15 16:24     ` Mike Lin
  2006-09-18  8:24     ` Hendrik Tews
  0 siblings, 2 replies; 16+ messages in thread
From: Yaron Minsky @ 2006-09-15 14:29 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 2808 bytes --]

On 9/15/06, Damien Doligez <damien.doligez@inria.fr> wrote:
>
>
> On 2006-09-14, at 17:40, Jim Battin wrote:
>
> > It seems Moore's law is taking us in the direction of more cores per
> > microprocessor with less effort placed on exploring ILP.  With the
> > advent of multi-core processors, and their inevitable ubiquity, are
> > there any plans, considerations, or ideas for a concurrent garbage
> > collector in Ocaml?
>
> It's very nice to have a concurrent run-time system, and we know how
> to do it (at significant cost), but it's not really worth the trouble
> until we have an answer to this question: what programming language
> features do we put on top of it?
>
> Shared memory with threads, locks, and conditions just doesn't cut
> it, in terms of writing programs that work.


I understand where you're coming from, but this point of view implies, I
think, an inappropriate division of labor between language designers and
language users.  I agree, ordinary shared-state concurrency with threads is
a disaster.  But the burden of figuring out how to write concurrent programs
in a reasonable way is not just the responsibility of the language designer
--- library writers can come up with good abstractions to take advantage of
threads as well.

That is, they could do so if they had access to threads with real
concurrency.  Right now, ocaml doesn't provide that, and it's a real
weakness of the language.  The lack of a concurrent GC, in my opinion,
stifles innovation in this area, at least within the caml world.

That said, I do understand that a concurrent GC is a big technical
challenge, and I can understand why the ocaml team isn't eager to take it on
right now.

> To my knowledge, Caml Light had a concurrent garbage collector under
> > development by Xavier Leroy but was abandoned due to significant
> > technical challenges.  Prior to that, there appears to have been some
> > academic research regarding concurrent GC (Doligez, Leroy).
>
> The development was done by myself, it was done before the
> publications, and as part of the academic research (like everything
> we do here).
>
> The most significant challenges are in making Posix threads work
> under Unix (threads and Unix signals just don't mix well, given the
> currently available APIs).
>
>
> In summary, you shouldn't hold your breath.  In my opinion, we
> will need some major breakthrough before we can make good use
> of multicore architectures in normal programs.  In any language.
>
> -- Damien
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

[-- Attachment #2: Type: text/html, Size: 3628 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 14:29   ` Yaron Minsky
@ 2006-09-15 16:24     ` Mike Lin
  2006-09-15 16:44       ` Gerd Stolpmann
  2006-09-18  8:24     ` Hendrik Tews
  1 sibling, 1 reply; 16+ messages in thread
From: Mike Lin @ 2006-09-15 16:24 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 873 bytes --]

Slightly OT question: I often want to parallelize algorithms in
computational biology in which (a) the parallel computations take a long
time (seconds/minutes) to complete, (b) they use a very large heap (gigs) of
immutable data, and (c) they don't really need to synchronize at
intermediate points in the computation. This seems best accomplished with
fork(). What would be your favorite way to collect the results from the
child processes?
Right now I have a hacked up thing to marshal them through pipes. The parent
process reads the values from the pipes serially, which is obviously
sub-optimal, but I was too lazy to write a select() loop. Is this what you
would do or can you think of a better way?
For my purposes (embarassingly parallelizable computational biology), a
convenient and type-safe little library for doing this would satisfy 80% of
my SMP needs.
Mike

[-- Attachment #2: Type: text/html, Size: 964 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 16:24     ` Mike Lin
@ 2006-09-15 16:44       ` Gerd Stolpmann
  0 siblings, 0 replies; 16+ messages in thread
From: Gerd Stolpmann @ 2006-09-15 16:44 UTC (permalink / raw)
  To: Mike Lin; +Cc: caml-list

Am Freitag, den 15.09.2006, 12:24 -0400 schrieb Mike Lin:
> Slightly OT question: I often want to parallelize algorithms in
> computational biology in which (a) the parallel computations take a
> long time (seconds/minutes) to complete, (b) they use a very large
> heap (gigs) of immutable data, and (c) they don't really need to
> synchronize at intermediate points in the computation. This seems best
> accomplished with fork(). What would be your favorite way to collect
> the results from the child processes?
> Right now I have a hacked up thing to marshal them through pipes. The
> parent process reads the values from the pipes serially, which is
> obviously sub-optimal, but I was too lazy to write a select() loop. Is
> this what you would do or can you think of a better way?
> For my purposes (embarassingly parallelizable computational biology),
> a convenient and type-safe little library for doing this would satisfy
> 80% of my SMP needs.

Use my sunrpc library. It allows you to do asynchronous calls. One
process can call the other worker processes and wait until all the
results are back. Of course, the calls are type-safe.

There is even now a manager for forking subprocesses called netplex.
I've recently used these libraries to develop a highly parallelized
system that runs on a cluster of seven machines.

Both libraries (rpc and netplex) are part of the not yet released
ocamlnet2 package. You find the sources here:

https://gps.dynxs.de/svn/lib-ocamlnet2/trunk/

A test release is here (this version is used for the mentioned cluster
and very stable):

http://ocaml-programming.de/packages/ocamlnet-2.2test11.tar.gz

Gerd

-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-14 15:40 The Future Possibility of Concurrent Garbage Collection? Jim Battin
  2006-09-14 17:04 ` [Caml-list] " skaller
  2006-09-15 11:36 ` Damien Doligez
@ 2006-09-15 20:26 ` Florian Weimer
  2 siblings, 0 replies; 16+ messages in thread
From: Florian Weimer @ 2006-09-15 20:26 UTC (permalink / raw)
  To: Jim Battin; +Cc: caml-list

* Jim Battin:

> It seems Moore's law is taking us in the direction of more cores per
> microprocessor with less effort placed on exploring ILP.  With the
> advent of multi-core processors, and their inevitable ubiquity, are
> there any plans, considerations, or ideas for a concurrent garbage
> collector in Ocaml?

Right now, concurrent garbage collection seems to offer significantly
less throughput.  I'm not sure if it's worth all the effort.

Another thing that might be interesting is a way to execute multiple
run-times with independent heaps in a single process, with Ada-style
rendezvous betweeen them and a special low-overhead form of
marshalling.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 14:29   ` Yaron Minsky
  2006-09-15 16:24     ` Mike Lin
@ 2006-09-18  8:24     ` Hendrik Tews
  2006-09-18  8:38       ` skaller
  2006-09-18 13:09       ` Stefan Monnier
  1 sibling, 2 replies; 16+ messages in thread
From: Hendrik Tews @ 2006-09-18  8:24 UTC (permalink / raw)
  To: caml-list

"Yaron Minsky" <yminsky@cs.cornell.edu> writes:

   That said, I do understand that a concurrent GC is a big technical
   challenge, and I can understand why the ocaml team isn't eager to take it on
   right now.

The ocaml team could document the GC interface and modularize
everything, such that the user can choose between different
garbage collectors (at compile time, or even better at
application start).

Some parallel enthusiast might then come up with a parallel GC,
that is good enough for experiments until the right solutions are
available. 

Bye,

Hendrik


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-18  8:24     ` Hendrik Tews
@ 2006-09-18  8:38       ` skaller
  2006-09-18 13:09       ` Stefan Monnier
  1 sibling, 0 replies; 16+ messages in thread
From: skaller @ 2006-09-18  8:38 UTC (permalink / raw)
  To: Hendrik Tews; +Cc: caml-list

On Mon, 2006-09-18 at 10:24 +0200, Hendrik Tews wrote:
> "Yaron Minsky" <yminsky@cs.cornell.edu> writes:
> 
>    That said, I do understand that a concurrent GC is a big technical
>    challenge, and I can understand why the ocaml team isn't eager to take it on
>    right now.
> 
> The ocaml team could document the GC interface and modularize
> everything, such that the user can choose between different
> garbage collectors (at compile time, or even better at
> application start).

I doubt this is possible with native code compilers.
AFAIK allocation, write barrier, and int/box flag bit handling
aren't modelled using standard C ABI calls?

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: The Future Possibility of Concurrent Garbage Collection?
  2006-09-18  8:24     ` Hendrik Tews
  2006-09-18  8:38       ` skaller
@ 2006-09-18 13:09       ` Stefan Monnier
  2006-09-18 13:23         ` [Caml-list] " skaller
                           ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Stefan Monnier @ 2006-09-18 13:09 UTC (permalink / raw)
  To: caml-list

>    That said, I do understand that a concurrent GC is a big technical
>    challenge, and I can understand why the ocaml team isn't eager to take
>    it on right now.

> The ocaml team could document the GC interface and modularize
> everything, such that the user can choose between different
> garbage collectors (at compile time, or even better at
> application start).

The main cost of a concurrent GC is that the application code has to be
changed to cooperate with the GC.  So if you want to be able to use the same
application code for both the concurrent and the non-concurrent GC, you end
up paying the price of concurrent GC in both cases :-(


        Stefan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] Re: The Future Possibility of Concurrent Garbage Collection?
  2006-09-18 13:09       ` Stefan Monnier
@ 2006-09-18 13:23         ` skaller
  2006-09-18 13:42         ` Rafael 'Dido' Sevilla
  2006-09-19  9:19         ` Hendrik Tews
  2 siblings, 0 replies; 16+ messages in thread
From: skaller @ 2006-09-18 13:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: caml-list


> The main cost of a concurrent GC is that the application code has to be
> changed to cooperate with the GC.  So if you want to be able to use the same
> application code for both the concurrent and the non-concurrent GC, you end
> up paying the price of concurrent GC in both cases :-(

Can you explain this in more detail?

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] Re: The Future Possibility of Concurrent Garbage Collection?
  2006-09-18 13:09       ` Stefan Monnier
  2006-09-18 13:23         ` [Caml-list] " skaller
@ 2006-09-18 13:42         ` Rafael 'Dido' Sevilla
  2006-09-19  0:09           ` Stefan Monnier
  2006-09-19  9:19         ` Hendrik Tews
  2 siblings, 1 reply; 16+ messages in thread
From: Rafael 'Dido' Sevilla @ 2006-09-18 13:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 623 bytes --]

Stefan Monnier wrote:
> The main cost of a concurrent GC is that the application code has to be
> changed to cooperate with the GC.

Frankly, I don't think so.  Lorenz Huelsbergen and Phil Winterbottom
implemented their VCGC algorithm [1] for the Limbo programming language
under Inferno and for SMLNJ as well, and their use of that concurrent
garbage collection algorithm required no changes to any Limbo or SMLNJ
application code.

[1] http://cm.bell-labs.com/who/lorenz/papers/ismm98.pdf

-- 
We must remember that we have more power than our enemies to
worsen our fate.
http://stormwyrm.blogspot.com/


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 553 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: The Future Possibility of Concurrent Garbage Collection?
  2006-09-18 13:42         ` Rafael 'Dido' Sevilla
@ 2006-09-19  0:09           ` Stefan Monnier
  2006-09-19  2:02             ` [Caml-list] " Rafael 'Dido' Sevilla
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2006-09-19  0:09 UTC (permalink / raw)
  To: caml-list

>> The main cost of a concurrent GC is that the application code has to be
>> changed to cooperate with the GC.

> Frankly, I don't think so.  Lorenz Huelsbergen and Phil Winterbottom
> implemented their VCGC algorithm [1] for the Limbo programming language
> under Inferno and for SMLNJ as well, and their use of that concurrent
> garbage collection algorithm required no changes to any Limbo or SMLNJ
> application code.

I meant *compiled* application code, of course.  I.e. you can't change the
GC after compilation.  Admittedly, the VCGC algorithm is sufficiently
non-intrusive on the mutator that it might be OK to pay its cost even
when not using the VCGC algorithm.


        Stefan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] Re: The Future Possibility of Concurrent Garbage Collection?
  2006-09-19  0:09           ` Stefan Monnier
@ 2006-09-19  2:02             ` Rafael 'Dido' Sevilla
  0 siblings, 0 replies; 16+ messages in thread
From: Rafael 'Dido' Sevilla @ 2006-09-19  2:02 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1205 bytes --]

Stefan Monnier wrote:
> I meant *compiled* application code, of course.  I.e. you can't change the
> GC after compilation.  Admittedly, the VCGC algorithm is sufficiently
> non-intrusive on the mutator that it might be OK to pay its cost even
> when not using the VCGC algorithm.

Well, any garbage collection algorithm that requires read or write
barriers on the mutator (not just concurrent algorithms I believe) would
need such changes.  The VCGC algorithm has a write barrier used only
when references are changed, and given that most Ocaml code (like SML
code) is written in a functional style where references are seldom used,
it seems that it could be an excellent algorithm.

On the other hand, if you're talking about byte-compiled code, it might
be possible to insert any required write or read barriers in the virtual
machine itself.  For Limbo this only required Winterbottom and
Huelsbergen to change the Dis virtual machine; all previously compiled
Limbo code ran properly without change once they replaced the older mark
and sweep collector with it.

-- 
We must remember that we have more power than our enemies to
worsen our fate.
http://stormwyrm.blogspot.com/


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 553 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] Re: The Future Possibility of Concurrent Garbage Collection?
  2006-09-18 13:09       ` Stefan Monnier
  2006-09-18 13:23         ` [Caml-list] " skaller
  2006-09-18 13:42         ` Rafael 'Dido' Sevilla
@ 2006-09-19  9:19         ` Hendrik Tews
  2 siblings, 0 replies; 16+ messages in thread
From: Hendrik Tews @ 2006-09-19  9:19 UTC (permalink / raw)
  To: caml-list

Stefan Monnier <monnier@iro.umontreal.ca> writes:

   > The ocaml team could document the GC interface and modularize
   > everything, such that the user can choose between different
   > garbage collectors (at compile time, or even better at
   > application start).

   The main cost of a concurrent GC is that the application code has to be
   changed to cooperate with the GC.  So if you want to be able to use the same
   application code for both the concurrent and the non-concurrent GC, you end
   up paying the price of concurrent GC in both cases :-(

For a start it would be sufficient to have a compiler switch that
disables all inlining of GC operations and uses some well defined
library interface instead. Nobody would have to pay anything
then. Optizing performance bottemlacks can wait until the first
alternative garbage collector is out. 

Hendrik


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2006-09-19  9:19 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-14 15:40 The Future Possibility of Concurrent Garbage Collection? Jim Battin
2006-09-14 17:04 ` [Caml-list] " skaller
2006-09-14 21:00   ` Andrej Bauer
2006-09-15 11:36 ` Damien Doligez
2006-09-15 14:29   ` Yaron Minsky
2006-09-15 16:24     ` Mike Lin
2006-09-15 16:44       ` Gerd Stolpmann
2006-09-18  8:24     ` Hendrik Tews
2006-09-18  8:38       ` skaller
2006-09-18 13:09       ` Stefan Monnier
2006-09-18 13:23         ` [Caml-list] " skaller
2006-09-18 13:42         ` Rafael 'Dido' Sevilla
2006-09-19  0:09           ` Stefan Monnier
2006-09-19  2:02             ` [Caml-list] " Rafael 'Dido' Sevilla
2006-09-19  9:19         ` Hendrik Tews
2006-09-15 20:26 ` [Caml-list] " Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).