caml-list - the Caml user's mailing list
* Raising an old issue : true concurrency in OCaml [Xavier, Damien, any]
@ 2005-04-07 21:47 Yoann Fabre
  2005-04-07 22:18 ` [Caml-list] " Christian Szegedy
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Yoann Fabre @ 2005-04-07 21:47 UTC
  To: caml-list


Well... shame on me!
I'm afraid I'm going to restart the "annual discussion on threads". I'm very
sorry about that, Xavier, but I do believe that the recent trend in
general-purpose CPU design calls for a new discussion. I can only hope all of
you will forgive my hastily written English... and of course the fact that I'm
definitely "just a nameless student" :-)

Since it seems to be fashionable here nowadays, I'll introduce myself briefly
in the .sig. I also invite you to read Xavier's last (full) reply on
this topic (link and quotes included below).

My point is the following (let's be blunt): I'm afraid any pure-OCaml
program will be limited to at most 50% of the CPU power on about 50% of the
machines in the next five years, and probably 25-30% on the remaining 40%
of medium to high-end machines.

Why? Because current CPU design is moving toward true parallelism with
multi-core and hyper-threading-like technologies.
This is /not/ yet another marketing hype... there are profound physical
reasons behind such a move. It seems to be impossible to scale up in frequency
any more due to leakage current, etc. See articles on
http://www.anandtech.com/cpuchipsets/ for a short introduction and more
links. (I'm ashamed of such a lack of proper references, but I'm too lazy to
find them among my papers right now...)

So Intel, AMD, IBM/Motorola and Sun are all releasing multi-core/HT CPUs
right now. SMP systems will no longer be the exception, but the rule!

See:
Intel Pentium 4C (1 HT core ~ 1.7 cores)
Intel Pentium 4D (2 cores)
Intel Pentium EE 840 (2 HT cores ~ 3.4 cores)
New Sun micro-arch, a kind of Intel HT
New (incoming) multi-core AMD64
New multi-core PowerPC
IBM Cell (at least 1+6 cores)
etc...

And of course, we cannot execute more than one Caml thread at a time...

So the questions are:
- Do we need to do something about it?
- If yes, what can be done in the short term?
- And in the long term?

IMHO, the Caml team needs to do something... I do /love/ OCaml but I
also run CPU-hungry code (Non-ODE solving, temporal series discovery...) and
it seems that a lot of that code can be parallelized quite easily. I'm
already tired of not being able to go above about 65% CPU usage on my P4C 3.2
(Intel VTune, not Task-Manager figures). I've run the FPU part of some
constraint solving based on interval arithmetic in a C thread while the
high-level constraint management was still in OCaml (tricky business). The FPU
ASM was nearly the same between ocamlopt and GCC 3.2 but I nearly doubled
the perf! A simple and efficient use of hyper-threading between strongly
FPU- and ALU-oriented threads (and maybe better cache management provided
by the use of the two context entries). For some projects I've seriously
considered using a language with a concurrent GC (IBM's Java VM? C#?).
What an awful thought, isn't it? Their god, help me! Spare me C++ at all
costs... (See also my short reply to Xavier at the end of this mail.)

So what can we do?
Here's my two-cent proposal for the short/long term:

Phase 1 - Maybe provide some module in the stdlib to:
- allow easy management of multiple Caml /processes/
- allow easy communication with message passing (MPI like + Marshal)
- allow easy synchronisation between these pseudo-threads
Phase 2 - Add another module to:
- provide a standard interface to memory sharing
- allow some awful dirty hack to GC-allocate some special blocks into that
region? Some "custom2" with reference counting (ouch!) between the GCs
Phase 3 - write a concurrent GC (CGC) (re-ouch!)
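
A minimal sketch of what the Phase 1 items could look like using only what the
distribution already provides, assuming the Unix library is linked and the
system supports fork. The names future, spawn, await and parallel_map are
hypothetical, not an existing stdlib API. The function is not marshalled (the
child inherits it through fork), only its result is, so the result must not
contain closures.

```ocaml
(* Sketch only. Needs the unix library (e.g. ocamlfind ocamlopt -package unix
   -linkpkg) and a system with fork. Unix._exit requires OCaml >= 4.12.
   All names defined below are hypothetical. *)

(* A result being computed in a child process; 'b is the result type. *)
type 'b future = { ic : in_channel; pid : int }

(* Run [f x] in a forked child; the result comes back marshalled on a pipe. *)
let spawn (f : 'a -> 'b) (x : 'a) : 'b future =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      (* Child process: compute, send the result, then leave without
         running at_exit handlers inherited from the parent. *)
      Unix.close rd;
      let oc = Unix.out_channel_of_descr wr in
      Marshal.to_channel oc (f x) [];
      close_out oc;
      Unix._exit 0
  | pid ->
      (* Parent process: keep only the read end of the pipe. *)
      Unix.close wr;
      { ic = Unix.in_channel_of_descr rd; pid }

(* Block until the child has sent its result, then reap the child. *)
let await (fut : 'b future) : 'b =
  let v : 'b = Marshal.from_channel fut.ic in
  close_in fut.ic;
  ignore (Unix.waitpid [] fut.pid);
  v

(* Start one child per element, then collect the results in order. *)
let parallel_map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  let futures = List.map (spawn f) xs in
  List.map await futures

(* Usage: each square is computed in its own process. *)
let squares = parallel_map (fun n -> n * n) [1; 2; 3; 4]
```

Each child is a full process with its own heap and GC, so the children can run
on separate cores; the price is one fork and one marshalling round trip per
task, which only pays off for coarse-grained work.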

Well, I've studied the CGC field a bit and read Damien's PhD... It's
frightening! But, to recast the issue in a somewhat broader context, can we
really still pretend that a "modern" general-purpose language can live without
true concurrency in 2005?

After all, Xavier has recognized the importance of threading for OCaml, since
he wrote the first Linux pthread library for that very purpose...
Granted, it was not aimed at performance, but that is now an open question. The
old dream of "never two times slower than C" is (was?) a reality. I use
OCaml partly for that kind of perf... Is "never eight times slower than C"
(4 cores) still an acceptable tradeoff? I don't think so and I'm worried.

Only hoping to trigger a constructive discussion,
Cordially,
Yoann.Fabre@lip6.fr

I've been coding in Caml since 1995 (grateful victim of the ENS push on the
Prepa program). I've read this list nearly daily for six years. I've written
more than 200,000 lines of OCaml code (wc -l), ranging from 3D systems to an
ML-like bytecode compiler and type system, going through C bindings for things
like CrystalSpace... I've also done some dirty, unsatisfactory hacks in the
runtime system to implement an ultra-lightweight concurrent language (a la
Peyton-Jones feather-weight continuations) and some true interval arithmetic
etc...
Some poor fellows were also victims of my hard-line OCaml evangelism -- Hi
Nadji, I hear you're now doing a PhD with Francois, how is it going? Hi SebC!
Well, I do not want to be conceited... I've only tried to establish that I've
got some experience in OCaml, since I know from practice that it does help in a
debate with some of the world-class coders who are wandering here...

-----------------------------------------------------------------------

Xavier Leroy 
Re: [Caml-list] Why systhreads?
2002-11-25 (10:01)              
http://caml.inria.fr/pub/ml-archives/caml-list/2002/11/64c14acb90cb14bedb2cacb73338fb15.en.html

What about parallelism on SMP machines?  The main issue here is that the
runtime system, and in particular the garbage collector and memory manager,
must be MP-safe. [...] a concurrent GC avoids this problem, but adds
tremendous complexity.
(Of course, all this SMP support stuff slows down the runtime system even if
there is only one processor, which is the case for almost all our users...)
[...]
Why was Concurrent Caml Light abandoned?
- Too complex; too hard to debug
- Dubious practical interest. Shared-memory multiprocessors have never
really "taken off", at least in the general public
[...]

>>> True, until now.

[...]
Even if you have a 4-processor SMP machine, it isn't clear whether you
should write your program using shared memory or using message passing
[...]

>>> OK if you only need to copy large linear blocks of memory. But seriously,
do you really want to marshal ~50 MB of complex Caml data structures? This
/has/ a non-negligible cost, and I don't want to think about maintaining
sharing (which may be mandatory). That's, what, 1.5 million accesses to the
hash table in marshal.c (with a mean of 8*4 bytes per block).
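
To make that estimate concrete: 50 MB at 8*4 = 32 bytes per block is roughly
1.56 million blocks, each of which has to be tracked when sharing is preserved.
The sketch below, using only the standard Marshal module, shows the trade-off
on a small value; the value names and sizes are illustrative assumptions, not
measurements of the workload described above.

```ocaml
(* Sketch only, standard library Marshal; sizes are illustrative. *)

(* Back-of-the-envelope count from the text: 50 MB of blocks
   averaging 8*4 = 32 bytes each. *)
let approx_blocks = 50_000_000 / (8 * 4)   (* 1_562_500 *)

(* One 1000-element list referenced from all 100 slots of an array. *)
let shared = List.init 1000 (fun i -> i)
let value = Array.make 100 shared

(* Default mode: visited blocks are tracked, so sharing survives. *)
let with_sharing = Marshal.to_string value []

(* No_sharing: no tracking, but the shared list is written out 100 times. *)
let without_sharing = Marshal.to_string value [Marshal.No_sharing]

(* Reading both back: only the first keeps the 100 slots physically equal. *)
let back : int list array = Marshal.from_string with_sharing 0
let back_dup : int list array = Marshal.from_string without_sharing 0

let () =
  Printf.printf "about %d blocks; %d bytes with sharing, %d without\n"
    approx_blocks (String.length with_sharing) (String.length without_sharing)
```

Dropping the sharing bookkeeping is therefore only an option when the data is
a tree; on a structure with real sharing it multiplies the message size, and
on a cyclic one it does not terminate.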

[...]
What about hyperthreading?  Well, I believe it's the last convulsive
movement of SMP's corpse :-)  We'll see how it goes market-wise.
[...] 

>>> All current CPU designs seem to disagree.

[...]
At any rate, the speedups announced for hyperthreading in the Pentium 4 are
below a factor of 1.5; probably not enough to offset the overhead of making
the OCaml runtime system thread-safe.  
[...]

>>> The ratio is more like 1.7. I've used HT for more than a year, and I
think it's a truly effective tech. Maybe the most effective tech (from a
complexity/perf ratio viewpoint) introduced by Intel since the P2.

[...]
In summary: there is no SMP support in OCaml, and it is very very unlikely
that there will ever be.

>>> Ouch!





* Re: [Caml-list] Raising an old issue : true concurrency in OCaml [Xavier, Damien, any]
  2005-04-07 21:47 Raising an old issue : true concurrency in OCaml [Xavier, Damien, any] Yoann Fabre
@ 2005-04-07 22:18 ` Christian Szegedy
  2005-04-08  0:07 ` Chris Campbell
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Christian Szegedy @ 2005-04-07 22:18 UTC
  To: caml-list

Yoann Fabre wrote:

>My point is the following (let's be blunt): I'm afraid any pure-OCaml
>program will be limited to at most 50% of the CPU power on about 50% of the
>machines in the next five years, and probably 25-30% on the remaining 40%
>of medium to high-end machines.
>
I am really shocked reading this. I am well into an OCaml
high-performance computing project. I have only tested it on
single-processor machines, but I expected that (given the Thread
module) it would be no problem to parallelize it. Now it
turns out that it is impossible.
The typical platform for my project will be Linux on
multiprocessor AMD64. As in your case, most of my
algorithms can be very easily parallelized using
shared-memory parallelism in C/C++.

I second your opinion that OCaml will be out of the running
if it does not support shared-memory parallelism very soon.
Multiprocessor workstations have become very affordable and
multi-core CPUs are becoming standard.

I really hope that the OCaml team will accept the challenge,
otherwise one should look for alternatives. Does CML support
real concurrency via shared-memory parallelism?

Instead of Java, one could use Scala, which generates Java
bytecode, but the implementation did not seem to be very mature.
It is far from the stability and performance of OCaml.

Best regards, Christian



* Re: [Caml-list] Raising an old issue : true concurrency in OCaml [Xavier, Damien, any]
  2005-04-07 21:47 Raising an old issue : true concurrency in OCaml [Xavier, Damien, any] Yoann Fabre
  2005-04-07 22:18 ` [Caml-list] " Christian Szegedy
@ 2005-04-08  0:07 ` Chris Campbell
  2005-04-10  9:59 ` Oliver Bandel
  2005-04-10 18:09 ` Christophe TROESTLER
  3 siblings, 0 replies; 6+ messages in thread
From: Chris Campbell @ 2005-04-08  0:07 UTC
  To: Yoann Fabre; +Cc: caml-list

On Apr 7, 2005 10:47 PM, Yoann Fabre <Yoann.Fabre@wanadoo.fr> wrote:

> And of course, we cannot execute more than one Caml thread at a time...

See below.

[snip]

> So what can we do?
> Here's my two-cent proposal for the short/long term:
> 
> Phase 1 - Maybe provide some module in the stdlib to:
> - allow easy management of multiple Caml /processes/
> - allow easy communication with message passing (MPI like + Marshal)
> - allow easy synchronisation between these pseudo-threads
> Phase 2 - Add another module to:
> - provide a standard interface to memory sharing
> - allow some awful dirty hack to GC-allocate some special blocks into that
> region? Some "custom2" with reference counting (ouch!) between the GCs
> Phase 3 - write a concurrent GC (CGC) (re-ouch!)

Instead you could try something like the AliceML and Oz approach.  They
have a single runtime, but distribution support.  Oz makes it easy to
distribute a program by being network-transparent.  Alice, I'm not sure
about.  They do memory sharing on local machines and TCP on networks.

You gain parallelism by distributing the program.  Sure it's done
manually, but in some ways that's a plus.

I'm not sure if OCaml has distributed systems support, but if not, an
alternative to your proposal would be to build distribution support
for OCaml.

> Well, I've studied the CGC field a bit and read Damien's PhD... It's
> frightening! But, to recast the issue in a somewhat broader context, can we
> really still pretend that a "modern" general-purpose language can live without
> true concurrency in 2005?

Concurrency != parallelism.  


Chris



* Re: [Caml-list] Raising an old issue : true concurrency in OCaml [Xavier, Damien, any]
  2005-04-07 21:47 Raising an old issue : true concurrency in OCaml [Xavier, Damien, any] Yoann Fabre
  2005-04-07 22:18 ` [Caml-list] " Christian Szegedy
  2005-04-08  0:07 ` Chris Campbell
@ 2005-04-10  9:59 ` Oliver Bandel
  2005-04-10 16:53   ` Yaron Minsky
  2005-04-10 18:09 ` Christophe TROESTLER
  3 siblings, 1 reply; 6+ messages in thread
From: Oliver Bandel @ 2005-04-10  9:59 UTC
  To: caml-list

On Thu, Apr 07, 2005 at 11:47:46PM +0200, Yoann Fabre wrote:
[...]
> My point is the following (let's be blunt): I'm afraid any pure-OCaml
> program will be limited to at most 50% of the CPU power on about 50% of the
> machines in the next five years, and probably 25-30% on the remaining 40%
> of medium to high-end machines.
[...]

Maybe I'm not on the right track, but are you talking about
multiprocessor machines?
Isn't it the job of the system kernel to provide parallelism
to your application?
Why should the user have to code low-level hardware stuff in
user-space programs?  This does not make sense to me.

The OS should be able to hide special treatment of multiprocessor
architectures or CPU-internal modifications from the user and
do all this parallelism stuff transparently.

On Linux for example: don't forget to compile your kernel
with SMP-options and so on.

So, if I've overlooked something, let me know.


Ciao,
   Oliver



* Re: [Caml-list] Raising an old issue : true concurrency in OCaml [Xavier, Damien, any]
  2005-04-10  9:59 ` Oliver Bandel
@ 2005-04-10 16:53   ` Yaron Minsky
  0 siblings, 0 replies; 6+ messages in thread
From: Yaron Minsky @ 2005-04-10 16:53 UTC
  To: Oliver Bandel; +Cc: caml-list


On Apr 10, 2005 5:59 AM, Oliver Bandel <oliver@first.in-berlin.de> wrote: 

> [... a discussion of how SMP support is really the OS's responsibility, 
> not the language's...]
> 
> So, if I've overlooked something, let me know.
> 
> 
You have. Due to the lack of a concurrent GC, OCaml doesn't allow multiple
threads to execute Caml code at once. This means that OCaml (unlike,
say, C) doesn't allow you to take performance advantage of a multi-CPU (or
multi-core) machine by running multiple threads in the same executable. With
the predicted rush of multi-core CPUs, the argument can be made that being
able to take advantage of this kind of parallelism is increasingly
important, and a feature that should be on OCaml's roadmap, which it
currently is not.

Yaron


* Re: [Caml-list] Raising an old issue : true concurrency in OCaml [Xavier, Damien, any]
  2005-04-07 21:47 Raising an old issue : true concurrency in OCaml [Xavier, Damien, any] Yoann Fabre
                   ` (2 preceding siblings ...)
  2005-04-10  9:59 ` Oliver Bandel
@ 2005-04-10 18:09 ` Christophe TROESTLER
  3 siblings, 0 replies; 6+ messages in thread
From: Christophe TROESTLER @ 2005-04-10 18:09 UTC
  To: caml-list

On Thu, 7 Apr 2005, "Yoann Fabre" <Yoann.Fabre@wanadoo.fr> wrote:
> 
> Phase 3 - write a concurrent GC (CGC) (re-ouch!)

Is it conceivable to have two GCs, the current one and a concurrent
one, and select the latter with a compiler option (say -smp)?
Or are the requirements of the two GCs too different for that?

ChriS



end of thread, other threads:[~2005-04-10 21:36 UTC | newest]

Thread overview: 6+ messages
2005-04-07 21:47 Raising an old issue : true concurrency in OCaml [Xavier, Damien, any] Yoann Fabre
2005-04-07 22:18 ` [Caml-list] " Christian Szegedy
2005-04-08  0:07 ` Chris Campbell
2005-04-10  9:59 ` Oliver Bandel
2005-04-10 16:53   ` Yaron Minsky
2005-04-10 18:09 ` Christophe TROESTLER
