caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
@ 2006-09-15 20:16 Jonathan Bryant
  0 siblings, 0 replies; 10+ messages in thread
From: Jonathan Bryant @ 2006-09-15 20:16 UTC (permalink / raw)
  To: Yaron Minsky

My two cents,

> It's very nice to have a concurrent run-time system, and we know how
> to do it (at significant cost), but it's not really worth the trouble
> until we have an answer to this question: what programming language
> features do we put on top of it?
>
> Shared memory with threads, locks, and conditions just doesn't cut
> it, in terms of writing programs that work.

For writing shared memory concurrent programs in OCaml, the Event  
module is a far better fit than locks, mutexes and conditions (oh,  
my!).  It fits the functional paradigm much better and makes program  
design far easier to reason about.  Even simple programs can benefit  
(design wise) from its use.  One of the major advantages is that it's  
defined the same way as most of the other major types in the language  
(lists, sets, etc.): recursively, so it flows natrually from the  
language.  For example, in C/C++/Java (shared memory concurrency), an  
algorithm such as merge sort is tricky, and parallelizing it is  
downright nasty, but in OCaml with the Event module (message passing  
concurrency), merge sort borders on trivial, and parallelizing it is  
almost as simple because the recursion of the algorithm and the  
recursion of the parallelism go hand in hand.  Try it and you'll see.

Mutexes, conditions, and locks really only need to be there for  
transliterating sequential, array based parallel code, but if you are  
doing that, the question really is why?  OCaml outperforms C pretty  
much everywhere but that, and the true benefits of OCaml in this  
situation (cleanliness, safety, readability, etc.) are much better  
exploited by rethinking the algorithm using message passing.  I feel  
that the (possible) slight hit in speed is completely justified by  
the numerous other benefits of the Event module.

> I understand where you're coming from, but this point of view  
> implies, I think, an inappropriate division of labor between  
> language designers and language users.  I agree, ordinary shared- 
> state concurrency with threads is a disaster.  But the burden of  
> figuring out how to write concurrent programs in a reasonable way  
> is not just the responsibility of the language designer --- library  
> writers can come up with good abstractions to take advantage of  
> threads as well.

Again, great abstractions to take advantage of these multi-cored  
processors is already in the OCaml standard library: the Event  
module.  All that we need is the ability to actually take advantage  
of it.

> ...threads with real concurrency.  Right now, ocaml doesn't provide  
> that, and it's a real weakness of the language.  The lack of a  
> concurrent GC, in my opinion, stifles innovation in this area...

Agreed.

--Jonathan Bryant


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-18  8:24     ` Hendrik Tews
@ 2006-09-18  8:38       ` skaller
  0 siblings, 0 replies; 10+ messages in thread
From: skaller @ 2006-09-18  8:38 UTC (permalink / raw)
  To: Hendrik Tews; +Cc: caml-list

On Mon, 2006-09-18 at 10:24 +0200, Hendrik Tews wrote:
> "Yaron Minsky" <yminsky@cs.cornell.edu> writes:
> 
>    That said, I do understand that a concurrent GC is a big technical
>    challenge, and I can understand why the ocaml team isn't eager to take it on
>    right now.
> 
> The ocaml team could document the GC interface and modularize
> everything, such that the user can choose between different
> garbage collectors (at compile time, or even better at
> application start).

I doubt this is possible with native code compilers.
AFAIK allocation, write barrier, and int/box flag bit handling
aren't modelled using standard C ABI calls?

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 14:29   ` Yaron Minsky
  2006-09-15 16:24     ` Mike Lin
@ 2006-09-18  8:24     ` Hendrik Tews
  2006-09-18  8:38       ` skaller
  1 sibling, 1 reply; 10+ messages in thread
From: Hendrik Tews @ 2006-09-18  8:24 UTC (permalink / raw)
  To: caml-list

"Yaron Minsky" <yminsky@cs.cornell.edu> writes:

   That said, I do understand that a concurrent GC is a big technical
   challenge, and I can understand why the ocaml team isn't eager to take it on
   right now.

The ocaml team could document the GC interface and modularize
everything, such that the user can choose between different
garbage collectors (at compile time, or even better at
application start).

Some parallel enthusiast might then come up with a parallel GC,
that is good enough for experiments until the right solutions are
available. 

Bye,

Hendrik


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-14 15:40 Jim Battin
  2006-09-14 17:04 ` [Caml-list] " skaller
  2006-09-15 11:36 ` Damien Doligez
@ 2006-09-15 20:26 ` Florian Weimer
  2 siblings, 0 replies; 10+ messages in thread
From: Florian Weimer @ 2006-09-15 20:26 UTC (permalink / raw)
  To: Jim Battin; +Cc: caml-list

* Jim Battin:

> It seems Moore's law is taking us in the direction of more cores per
> microprocessor with less effort placed on exploring ILP.  With the
> advent of multi-core processors, and their inevitable ubiquity, are
> there any plans, considerations, or ideas for a concurrent garbage
> collector in Ocaml?

Right now, concurrent garbage collection seems to offer significantly
less throughput.  I'm not sure if it's worth all the effort.

Another thing that might be interesting is a way to execute multiple
run-times with independent heaps in a single process, with Ada-style
rendezvous betweeen them and a special low-overhead form of
marshalling.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 16:24     ` Mike Lin
@ 2006-09-15 16:44       ` Gerd Stolpmann
  0 siblings, 0 replies; 10+ messages in thread
From: Gerd Stolpmann @ 2006-09-15 16:44 UTC (permalink / raw)
  To: Mike Lin; +Cc: caml-list

Am Freitag, den 15.09.2006, 12:24 -0400 schrieb Mike Lin:
> Slightly OT question: I often want to parallelize algorithms in
> computational biology in which (a) the parallel computations take a
> long time (seconds/minutes) to complete, (b) they use a very large
> heap (gigs) of immutable data, and (c) they don't really need to
> synchronize at intermediate points in the computation. This seems best
> accomplished with fork(). What would be your favorite way to collect
> the results from the child processes?
> Right now I have a hacked up thing to marshal them through pipes. The
> parent process reads the values from the pipes serially, which is
> obviously sub-optimal, but I was too lazy to write a select() loop. Is
> this what you would do or can you think of a better way?
> For my purposes (embarassingly parallelizable computational biology),
> a convenient and type-safe little library for doing this would satisfy
> 80% of my SMP needs.

Use my sunrpc library. It allows you to do asynchronous calls. One
process can call the other worker processes and wait until all the
results are back. Of course, the calls are type-safe.

There is even now a manager for forking subprocesses called netplex.
I've recently used these libraries to develop a highly parallelized
system that runs on a cluster of seven machines.

Both libraries (rpc and netplex) are part of the not yet released
ocamlnet2 package. You find the sources here:

https://gps.dynxs.de/svn/lib-ocamlnet2/trunk/

A test release is here (this version is used for the mentioned cluster
and very stable):

http://ocaml-programming.de/packages/ocamlnet-2.2test11.tar.gz

Gerd

-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 14:29   ` Yaron Minsky
@ 2006-09-15 16:24     ` Mike Lin
  2006-09-15 16:44       ` Gerd Stolpmann
  2006-09-18  8:24     ` Hendrik Tews
  1 sibling, 1 reply; 10+ messages in thread
From: Mike Lin @ 2006-09-15 16:24 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 873 bytes --]

Slightly OT question: I often want to parallelize algorithms in
computational biology in which (a) the parallel computations take a long
time (seconds/minutes) to complete, (b) they use a very large heap (gigs) of
immutable data, and (c) they don't really need to synchronize at
intermediate points in the computation. This seems best accomplished with
fork(). What would be your favorite way to collect the results from the
child processes?
Right now I have a hacked up thing to marshal them through pipes. The parent
process reads the values from the pipes serially, which is obviously
sub-optimal, but I was too lazy to write a select() loop. Is this what you
would do or can you think of a better way?
For my purposes (embarassingly parallelizable computational biology), a
convenient and type-safe little library for doing this would satisfy 80% of
my SMP needs.
Mike

[-- Attachment #2: Type: text/html, Size: 964 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-15 11:36 ` Damien Doligez
@ 2006-09-15 14:29   ` Yaron Minsky
  2006-09-15 16:24     ` Mike Lin
  2006-09-18  8:24     ` Hendrik Tews
  0 siblings, 2 replies; 10+ messages in thread
From: Yaron Minsky @ 2006-09-15 14:29 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 2808 bytes --]

On 9/15/06, Damien Doligez <damien.doligez@inria.fr> wrote:
>
>
> On 2006-09-14, at 17:40, Jim Battin wrote:
>
> > It seems Moore's law is taking us in the direction of more cores per
> > microprocessor with less effort placed on exploring ILP.  With the
> > advent of multi-core processors, and their inevitable ubiquity, are
> > there any plans, considerations, or ideas for a concurrent garbage
> > collector in Ocaml?
>
> It's very nice to have a concurrent run-time system, and we know how
> to do it (at significant cost), but it's not really worth the trouble
> until we have an answer to this question: what programming language
> features do we put on top of it?
>
> Shared memory with threads, locks, and conditions just doesn't cut
> it, in terms of writing programs that work.


I understand where you're coming from, but this point of view implies, I
think, an inappropriate division of labor between language designers and
language users.  I agree, ordinary shared-state concurrency with threads is
a disaster.  But the burden of figuring out how to write concurrent programs
in a reasonable way is not just the responsibility of the language designer
--- library writers can come up with good abstractions to take advantage of
threads as well.

That is, they could do so if they had access to threads with real
concurrency.  Right now, ocaml doesn't provide that, and it's a real
weakness of the language.  The lack of a concurrent GC, in my opinion,
stifles innovation in this area, at least within the caml world.

That said, I do understand that a concurrent GC is a big technical
challenge, and I can understand why the ocaml team isn't eager to take it on
right now.

> To my knowledge, Caml Light had a concurrent garbage collector under
> > development by Xavier Leroy but was abandoned due to significant
> > technical challenges.  Prior to that, there appears to have been some
> > academic research regarding concurrent GC (Doligez, Leroy).
>
> The development was done by myself, it was done before the
> publications, and as part of the academic research (like everything
> we do here).
>
> The most significant challenges are in making Posix threads work
> under Unix (threads and Unix signals just don't mix well, given the
> currently available APIs).
>
>
> In summary, you shouldn't hold your breath.  In my opinion, we
> will need some major breakthrough before we can make good use
> of multicore architectures in normal programs.  In any language.
>
> -- Damien
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

[-- Attachment #2: Type: text/html, Size: 3628 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-14 15:40 Jim Battin
  2006-09-14 17:04 ` [Caml-list] " skaller
@ 2006-09-15 11:36 ` Damien Doligez
  2006-09-15 14:29   ` Yaron Minsky
  2006-09-15 20:26 ` Florian Weimer
  2 siblings, 1 reply; 10+ messages in thread
From: Damien Doligez @ 2006-09-15 11:36 UTC (permalink / raw)
  To: caml-list


On 2006-09-14, at 17:40, Jim Battin wrote:

> It seems Moore's law is taking us in the direction of more cores per
> microprocessor with less effort placed on exploring ILP.  With the
> advent of multi-core processors, and their inevitable ubiquity, are
> there any plans, considerations, or ideas for a concurrent garbage
> collector in Ocaml?

It's very nice to have a concurrent run-time system, and we know how
to do it (at significant cost), but it's not really worth the trouble
until we have an answer to this question: what programming language
features do we put on top of it?

Shared memory with threads, locks, and conditions just doesn't cut
it, in terms of writing programs that work.


> To my knowledge, Caml Light had a concurrent garbage collector under
> development by Xavier Leroy but was abandoned due to significant
> technical challenges.  Prior to that, there appears to have been some
> academic research regarding concurrent GC (Doligez, Leroy).

The development was done by myself, it was done before the
publications, and as part of the academic research (like everything
we do here).

The most significant challenges are in making Posix threads work
under Unix (threads and Unix signals just don't mix well, given the
currently available APIs).


In summary, you shouldn't hold your breath.  In my opinion, we
will need some major breakthrough before we can make good use
of multicore architectures in normal programs.  In any language.

-- Damien


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage  Collection?
  2006-09-14 17:04 ` [Caml-list] " skaller
@ 2006-09-14 21:00   ` Andrej Bauer
  0 siblings, 0 replies; 10+ messages in thread
From: Andrej Bauer @ 2006-09-14 21:00 UTC (permalink / raw)
  Cc: caml-list

skaller wrote:
> On Thu, 2006-09-14 at 10:40 -0500, Jim Battin wrote:
>>  Prior to that, there appears to have been some
>> academic research regarding concurrent GC (Doligez, Leroy).
> 
> There's some more recent stuff eg
> 
> "Scalable Real-time parallel GC for SMP", Cheng 
> 
> (sorry lost the URL ..)

12 page journal version: 
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/guyb/www/papers/gc2001.pdf

Ph.D. thesis:
http://reports-archive.adm.cs.cmu.edu/anon/2001/CMU-CS-01-174.ps

Perry Cheng was a schoolmate of mine. The world is so small.

Best regards,

Andrej


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] The Future Possibility of Concurrent Garbage Collection?
  2006-09-14 15:40 Jim Battin
@ 2006-09-14 17:04 ` skaller
  2006-09-14 21:00   ` Andrej Bauer
  2006-09-15 11:36 ` Damien Doligez
  2006-09-15 20:26 ` Florian Weimer
  2 siblings, 1 reply; 10+ messages in thread
From: skaller @ 2006-09-14 17:04 UTC (permalink / raw)
  To: Jim Battin; +Cc: caml-list

On Thu, 2006-09-14 at 10:40 -0500, Jim Battin wrote:
>  Prior to that, there appears to have been some
> academic research regarding concurrent GC (Doligez, Leroy).

There's some more recent stuff eg

"Scalable Real-time parallel GC for SMP", Cheng 

(sorry lost the URL ..) which suggests an implementation
for ML like language is quite possible. This paper
describes an algorithms for *parallel* collection,
not merely concurrent collection, that is, multiple
threads (multiple CPUs) participating in the collection.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2006-09-18  8:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-15 20:16 [Caml-list] The Future Possibility of Concurrent Garbage Collection? Jonathan Bryant
  -- strict thread matches above, loose matches on Subject: below --
2006-09-14 15:40 Jim Battin
2006-09-14 17:04 ` [Caml-list] " skaller
2006-09-14 21:00   ` Andrej Bauer
2006-09-15 11:36 ` Damien Doligez
2006-09-15 14:29   ` Yaron Minsky
2006-09-15 16:24     ` Mike Lin
2006-09-15 16:44       ` Gerd Stolpmann
2006-09-18  8:24     ` Hendrik Tews
2006-09-18  8:38       ` skaller
2006-09-15 20:26 ` Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).