caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* thousands of CPU cores
@ 2008-07-10  5:57 J C
  2008-07-10  6:15 ` [Caml-list] " Erik de Castro Lopo
                   ` (5 more replies)
  0 siblings, 6 replies; 73+ messages in thread
From: J C @ 2008-07-10  5:57 UTC (permalink / raw)
  To: caml-list

I know that Caml team wanted to see if many-core shared-memory systems
were going to stick around before bothering with Caml development that
takes advantage of them.

Well, it looks like they are here to stay, after all:

http://news.cnet.com/8301-13924_3-9981760-64.html

As much as I hate to look a gift horse in the mouth, and I think Caml
has been a great and grossly underappreciated product, I need to see
if writing Caml is a viable code investment for the coming years or
something like Haskell, SML, F# or even Ada will be a better long-term
alternative.

Are there plans to make Caml threads OS-native threads, or add
OpenMP-style primitives, or otherwise support multiple CPU cores? And
if so, roughly in what time frame?


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10  5:57 thousands of CPU cores J C
@ 2008-07-10  6:15 ` Erik de Castro Lopo
  2008-07-10 12:47   ` Oliver Bandel
  2008-07-10 11:35 ` Jon Harrop
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 73+ messages in thread
From: Erik de Castro Lopo @ 2008-07-10  6:15 UTC (permalink / raw)
  To: caml-list

J C wrote:

> As much as I hate to look a gift horse in the mouth, and I think Caml
> has been a great and grossly underappreciated product,

Agreed.

> I need to see
> if writing Caml is a viable code investment for the coming years or
> something like Haskell,

I think Haskell's STM is way overhyped and while it may be a good
solution for small numbers of cores (ie < 8) I am not convinced
it will scale to thousands of cores.

The haskell community also seems to be working towards some other
solutions:

   http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell

but I don't know enough about these.

> SML,

The life signs on this one are rather weak from what I have
seen.

> F#

Seriously? Being shackled to the .Net platform makes that one
a bit of a non-starter for me.

> Are there plans to make Caml threads OS-native threads,

This doesn't help because there is still a global lock around
the garbage collector. However, I believe that a concurrent
GC is being worked on as part of Jane ST Capital's Ocaml 
Summer of Code.

> or add OpenMP-style primitives,

Not exactly OpenMP, but see

   JoCaml : http://jocaml.inria.fr/
   CoThreads : http://cothreads.sf.net/


HTH,
Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"The day Microsoft makes something that doesn't suck is probably the
day they start making vacuum cleaners." -- Ernst Jan Plugge


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10  5:57 thousands of CPU cores J C
  2008-07-10  6:15 ` [Caml-list] " Erik de Castro Lopo
@ 2008-07-10 11:35 ` Jon Harrop
  2008-07-14 11:32   ` J C
  2008-07-10 13:21 ` Jon Harrop
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 73+ messages in thread
From: Jon Harrop @ 2008-07-10 11:35 UTC (permalink / raw)
  To: caml-list

On Thursday 10 July 2008 06:57:44 J C wrote:
> As much as I hate to look a gift horse in the mouth, and I think Caml
> has been a great and grossly underappreciated product, I need to see
> if writing Caml is a viable code investment for the coming years or
> something like Haskell, SML, F# or even Ada will be a better long-term
> alternative.
>
> Are there plans to make Caml threads OS-native threads, or add
> OpenMP-style primitives, or otherwise support multiple CPU cores? And
> if so, roughly in what time frame?

OCaml already has OS native threads (albeit with a global lock), already 
supports OpenMP and can already be used to write parallel programs that 
exploit multiple cores.

While F# can be up to 100x faster than OCaml for some parallel tasks, there is 
no evidence indicating that SML, Haskell and Ada will be as good as OCaml, 
let alone better.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10  6:15 ` [Caml-list] " Erik de Castro Lopo
@ 2008-07-10 12:47   ` Oliver Bandel
  2008-07-10 13:48     ` Hezekiah M. Carty
  0 siblings, 1 reply; 73+ messages in thread
From: Oliver Bandel @ 2008-07-10 12:47 UTC (permalink / raw)
  To: caml-list

Hello,

Zitat von Erik de Castro Lopo <mle+ocaml@mega-nerd.com>:
]...]
>
> Not exactly OpenMP, but see
>
>    JoCaml : http://jocaml.inria.fr/
>    CoThreads : http://cothreads.sf.net/
[...]


Is  JoCaml somehow related to Camlp3l?


Ciao,
   Oliver


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10  5:57 thousands of CPU cores J C
  2008-07-10  6:15 ` [Caml-list] " Erik de Castro Lopo
  2008-07-10 11:35 ` Jon Harrop
@ 2008-07-10 13:21 ` Jon Harrop
  2008-07-10 13:44 ` Peng Zang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-10 13:21 UTC (permalink / raw)
  To: caml-list

On Thursday 10 July 2008 06:57:44 J C wrote:
> ...many-core shared-memory systems...OpenMP-style primitives...

Incidentally, MP is good for distributed parallelism but fails to take 
advantage of shared memory (with a concurrent GC).

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10  5:57 thousands of CPU cores J C
                   ` (2 preceding siblings ...)
  2008-07-10 13:21 ` Jon Harrop
@ 2008-07-10 13:44 ` Peng Zang
  2008-07-10 14:00   ` Jon Harrop
  2008-07-10 19:15 ` Gerd Stolpmann
  2008-07-11 14:06 ` Xavier Leroy
  5 siblings, 1 reply; 73+ messages in thread
From: Peng Zang @ 2008-07-10 13:44 UTC (permalink / raw)
  To: caml-list; +Cc: J C

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Thursday 10 July 2008 01:57:44 am J C wrote:
> I know that Caml team wanted to see if many-core shared-memory systems
> were going to stick around before bothering with Caml development that
> takes advantage of them.
>
> Well, it looks like they are here to stay, after all:
>
> http://news.cnet.com/8301-13924_3-9981760-64.html
>

This article doesn't say anything about whether the many-core system will be 
shared-memory.  Remember, a shared memory architecture has to deal with cache 
and memory coherence.  The prevailing view is that the overhead for such an 
approach does not scale.  For massively parallel computation we must turn to 
message passing or barrier/sync paradigms.

I am doubtful that a thousand core machine will be shared-memory based.

Also, this is a CNET article.. not exactly known for being in depth or well 
researched and this article is no exception.  It is an article based entirely 
on a few speculative comments of some Intel guys.  I wouldn't take it too 
seriously.

Personally, I can see why the Caml development team opted not to put effort 
into dealing with shared-memory systems.  It is a stop-gap solution.  That 
said, it is an important stop-gap solution and the gap may be a while so I 
can defintely understand why some people (eg. Jon) wish very hard for them to 
do something about it.  But as previous posts have mentioned, there's JoCaml, 
and MPI for OCaml, etc..

Peng

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)

iD8DBQFIdhI9fIRcEFL/JewRAtJAAKC2ec3IIMIdMPaUpEiOXIR+uICumwCfe88F
Ss7DtspzVZKK7sMiw/mXRqY=
=9lhT
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 12:47   ` Oliver Bandel
@ 2008-07-10 13:48     ` Hezekiah M. Carty
  0 siblings, 0 replies; 73+ messages in thread
From: Hezekiah M. Carty @ 2008-07-10 13:48 UTC (permalink / raw)
  To: Oliver Bandel; +Cc: caml-list

On Thu, Jul 10, 2008 at 8:47 AM, Oliver Bandel
<oliver@first.in-berlin.de> wrote:
> Is  JoCaml somehow related to Camlp3l?

Camlp3l is a set of libraries and helper tools which are written in
OCaml.  It is not tied to a particular OCaml release.

JoCaml provides a separate compiler and runtime.  It is, at least to
some extent, tied to a specific OCaml release to help ensure binary
compatibility with libraries compiled by the matching official OCaml
compiler.

Both Camlp3l and JoCaml have facilities for distributed computing, and
both seem to be maintained currently.  I don't know how Camlp3l,
JoCaml and MPI for OCaml compare in performance scaling.  They each
certainly have an interesting approach though.

Hez

-- 
Hezekiah M. Carty
Graduate Research Assistant
University of Maryland
Department of Atmospheric and Oceanic Science


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 13:44 ` Peng Zang
@ 2008-07-10 14:00   ` Jon Harrop
  2008-07-10 22:25     ` Richard Jones
                       ` (2 more replies)
  0 siblings, 3 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-10 14:00 UTC (permalink / raw)
  To: peng.zang, caml-list

On Thursday 10 July 2008 14:44:25 Peng Zang wrote:
> On Thursday 10 July 2008 01:57:44 am J C wrote:
> > I know that Caml team wanted to see if many-core shared-memory systems
> > were going to stick around before bothering with Caml development that
> > takes advantage of them.
> >
> > Well, it looks like they are here to stay, after all:
> >
> > http://news.cnet.com/8301-13924_3-9981760-64.html
>
> This article doesn't say anything about whether the many-core system will
> be shared-memory.  Remember, a shared memory architecture has to deal with
> cache and memory coherence.  The prevailing view is that the overhead for
> such an approach does not scale.  For massively parallel computation we
> must turn to message passing or barrier/sync paradigms.
>
> I am doubtful that a thousand core machine will be shared-memory based.

Today's biggest shared-memory supercomputers already have thousands of cores.

> Also, this is a CNET article.. not exactly known for being in depth or well
> researched and this article is no exception.  It is an article based
> entirely on a few speculative comments of some Intel guys.  I wouldn't take
> it too seriously.
>
> Personally, I can see why the Caml development team opted not to put effort
> into dealing with shared-memory systems.

The OCaml development team put huge effort into their concurrent run-time.

> It is a stop-gap solution... 

That is not true. Many-core machines will always be decomposed into 
shared-memory clusters of as many cores as possible because shared memory 
parallelism will always be orders of magnitude more efficient than 
distributed parallelism.

OCaml is already ~8x slower than F# on today's eight core desktops. If OCaml's 
shortcomings are not remedied then it will become exponentially slower than 
parallelized languages like F# over the next few years until we reach the 
limit of shared memory parallelism in ~10 years time.

Consequently, the parallel GC scheduled for this summer will be the single 
most important development in OCaml world ever.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10  5:57 thousands of CPU cores J C
                   ` (3 preceding siblings ...)
  2008-07-10 13:44 ` Peng Zang
@ 2008-07-10 19:15 ` Gerd Stolpmann
  2008-07-10 20:07   ` Sylvain Le Gall
                     ` (2 more replies)
  2008-07-11 14:06 ` Xavier Leroy
  5 siblings, 3 replies; 73+ messages in thread
From: Gerd Stolpmann @ 2008-07-10 19:15 UTC (permalink / raw)
  To: J C; +Cc: caml-list


Am Mittwoch, den 09.07.2008, 22:57 -0700 schrieb J C:
> I know that Caml team wanted to see if many-core shared-memory systems
> were going to stick around before bothering with Caml development that
> takes advantage of them.
> 
> Well, it looks like they are here to stay, after all:
> 
> http://news.cnet.com/8301-13924_3-9981760-64.html
> 
> As much as I hate to look a gift horse in the mouth, and I think Caml
> has been a great and grossly underappreciated product, I need to see
> if writing Caml is a viable code investment for the coming years or
> something like Haskell, SML, F# or even Ada will be a better long-term
> alternative.
> 
> Are there plans to make Caml threads OS-native threads, or add
> OpenMP-style primitives, or otherwise support multiple CPU cores? And
> if so, roughly in what time frame?

I wouldn't take this article too seriously. It's just speculation.
Actually, the whole multi-core technology is a challenge for software
development. You cannot simply take a program that runs well on 4 cores
and expect that it scales up to 400. Software must be designed from
grounds up differently for such architectures. 

Just open up your mind to this perspective: It's a big risk for the CPU
vendors to haven taken the direction to multi-core. Except for some
standard components and some specially-adapted programs multi-core is
more or less not exploited today. So these vendors are trying to push
the software developers into this direction, and hope they find new
ideas for designing multi-core-capable programs. This article is just
propaganda for this hidden agenda. It can also happen that multi-core
with too many cores turns out as failure - in the sense that the mass
market is not ready for it.

In Ocaml you can exploit multi-core currently only by using
multi-processing parallel programs that communicate over message passing
(and only on Unix). Actually, it's an excellent language for this style.
I've written (with some other guys) a big distributed system using
Ocamlnet's netplex and sunrpc libraries (actually, a search engine...,
http://wink.com). Ocaml is an excellent choice because you can quickly
develop working programs that run 24/7. In the distributed world
stability is quite important.

For a quick introduction to the technology I'm talking about, see my
blog article here: http://blog.camlcity.org/blog/parallelmm.html

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: thousands of CPU cores
  2008-07-10 19:15 ` Gerd Stolpmann
@ 2008-07-10 20:07   ` Sylvain Le Gall
  2008-07-10 20:24     ` [Caml-list] " Gerd Stolpmann
  2008-07-10 20:48     ` Basile STARYNKEVITCH
  2008-07-10 23:33   ` [Caml-list] " Oliver Bandel
  2008-07-11  3:01   ` [Caml-list] thousands of CPU cores Brian Hurt
  2 siblings, 2 replies; 73+ messages in thread
From: Sylvain Le Gall @ 2008-07-10 20:07 UTC (permalink / raw)
  To: caml-list

On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> In Ocaml you can exploit multi-core currently only by using
> multi-processing parallel programs that communicate over message passing
> (and only on Unix). Actually, it's an excellent language for this style.

Why only on Unix ?

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 20:07   ` Sylvain Le Gall
@ 2008-07-10 20:24     ` Gerd Stolpmann
  2008-07-10 21:02       ` Sylvain Le Gall
  2008-07-15 15:21       ` Kuba Ober
  2008-07-10 20:48     ` Basile STARYNKEVITCH
  1 sibling, 2 replies; 73+ messages in thread
From: Gerd Stolpmann @ 2008-07-10 20:24 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list


Am Donnerstag, den 10.07.2008, 20:07 +0000 schrieb Sylvain Le Gall:
> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > In Ocaml you can exploit multi-core currently only by using
> > multi-processing parallel programs that communicate over message passing
> > (and only on Unix). Actually, it's an excellent language for this style.
> 
> Why only on Unix ?

No fork() on Windows. And emulating its effects is hard.

I would subsume Cygwin under "pseudo-Unix", and its fork emulation is so
slow that it would be a problem for speedy programs.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 20:07   ` Sylvain Le Gall
  2008-07-10 20:24     ` [Caml-list] " Gerd Stolpmann
@ 2008-07-10 20:48     ` Basile STARYNKEVITCH
  2008-07-10 21:12       ` Jon Harrop
  1 sibling, 1 reply; 73+ messages in thread
From: Basile STARYNKEVITCH @ 2008-07-10 20:48 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

Sylvain Le Gall wrote:
> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
>> In Ocaml you can exploit multi-core currently only by using
>> multi-processing parallel programs that communicate over message passing
>> (and only on Unix). Actually, it's an excellent language for this style.
> 
> Why only on Unix ?


Rumors (and my remembering of old benchmarks) say than inter-process 
communication -using pipes, fifos, or unix sockets- on recent Unix 
systems (in particular Linux) are significantly faster than the 
equivalent Windows counterpart.

Also, I don't know Windows but it could happen that some system calls in 
Windows useful for communication might be missing in Ocaml (but I really 
don't know; I never coded on Windows).

Besides, who want to code parallel program with Ocaml or Jocaml on non 
Unix plateforms? :-) :-)

And I am not sure that Ocaml has been ported to non-unix free 
software... Does Ocaml runs on Hurd, or Syllable, or EROS?

But don't trust me on all that. I am only familiar with Unix. Ask some 
more knowledgable guy!

(I am still not sure that the current implementation of Ocaml would 
nicely run on a thousand cores machine; and I am not sure that such a 
machine would run the current Linux)


Regards.
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: thousands of CPU cores
  2008-07-10 20:24     ` [Caml-list] " Gerd Stolpmann
@ 2008-07-10 21:02       ` Sylvain Le Gall
  2008-07-10 21:19         ` [Caml-list] " Gerd Stolpmann
  2008-07-15 15:21       ` Kuba Ober
  1 sibling, 1 reply; 73+ messages in thread
From: Sylvain Le Gall @ 2008-07-10 21:02 UTC (permalink / raw)
  To: caml-list

On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
>
> Am Donnerstag, den 10.07.2008, 20:07 +0000 schrieb Sylvain Le Gall:
>> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
>> > In Ocaml you can exploit multi-core currently only by using
>> > multi-processing parallel programs that communicate over message passing
>> > (and only on Unix). Actually, it's an excellent language for this style.
>> 
>> Why only on Unix ?
>
> No fork() on Windows. And emulating its effects is hard.
>

open_process + stdin/stdout should do the trick... at least i think so.

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 20:48     ` Basile STARYNKEVITCH
@ 2008-07-10 21:12       ` Jon Harrop
  0 siblings, 0 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-10 21:12 UTC (permalink / raw)
  To: caml-list

On Thursday 10 July 2008 21:48:54 Basile STARYNKEVITCH wrote:
> Sylvain Le Gall wrote:
> > On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> >> In Ocaml you can exploit multi-core currently only by using
> >> multi-processing parallel programs that communicate over message passing
> >> (and only on Unix). Actually, it's an excellent language for this style.
> >
> > Why only on Unix ?
>
> Rumors (and my remembering of old benchmarks) say than inter-process
> communication -using pipes, fifos, or unix sockets- on recent Unix
> systems (in particular Linux) are significantly faster than the
> equivalent Windows counterpart.

While that may be true, no Windows developer in their right mind would use 
such Linux technologies when Windows is pioneering vastly simpler and faster 
alternatives: .NET and the TPL.

> (I am still not sure that the current implementation of Ocaml would
> nicely run on a thousand cores machine; and I am not sure that such a
> machine would run the current Linux)

Linux is run on today's thousand-core shared-memory supercomputers.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 21:02       ` Sylvain Le Gall
@ 2008-07-10 21:19         ` Gerd Stolpmann
  2008-07-10 21:35           ` Jon Harrop
  2008-07-15 15:57           ` Kuba Ober
  0 siblings, 2 replies; 73+ messages in thread
From: Gerd Stolpmann @ 2008-07-10 21:19 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list


Am Donnerstag, den 10.07.2008, 21:02 +0000 schrieb Sylvain Le Gall:
> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> >
> > Am Donnerstag, den 10.07.2008, 20:07 +0000 schrieb Sylvain Le Gall:
> >> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> >> > In Ocaml you can exploit multi-core currently only by using
> >> > multi-processing parallel programs that communicate over message passing
> >> > (and only on Unix). Actually, it's an excellent language for this style.
> >> 
> >> Why only on Unix ?
> >
> > No fork() on Windows. And emulating its effects is hard.
> >
> 
> open_process + stdin/stdout should do the trick... at least i think so.

After having ported godi to mingw I am not sure whether this works at
all. The point is that you usually want to inherit OS resources to the
child process (e.g. sockets). The CreateProcess Win32 call
(http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx) mentions
that you can inherit handles, but I would be careful with the
information given in MSDN. Often it works only as far as the presented
examples. Windows isn't written for multi-processing, and its syscalls
aren't as orthogonal as in Unix-type systems.

Furthermore, it looks like a pain in the ass - often you want to run
some initialization code, and without fork() you have to run it as often
as you start processes.

Also, Windows is just a bad platform for event-based programs, and you
want to do it to some extent (e.g. for watching all your child
processes). Only for socket handles there is a select() call. For all
other types of handles you cannot find out in advance whether the
operation would block or not.

So... if there is any chance you can select the OS, keep away from
Windows for this type of program.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 21:19         ` [Caml-list] " Gerd Stolpmann
@ 2008-07-10 21:35           ` Jon Harrop
  2008-07-10 22:39             ` Gerd Stolpmann
  2008-07-15 15:57           ` Kuba Ober
  1 sibling, 1 reply; 73+ messages in thread
From: Jon Harrop @ 2008-07-10 21:35 UTC (permalink / raw)
  To: caml-list

On Thursday 10 July 2008 22:19:05 Gerd Stolpmann wrote:
> After having ported godi to mingw I am not sure whether this works at
> all. The point is that you usually want to inherit OS resources to the
> child process (e.g. sockets). The CreateProcess Win32 call
> (http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx) mentions
> that you can inherit handles, but I would be careful with the
> information given in MSDN. Often it works only as far as the presented
> examples. Windows isn't written for multi-processing, and its syscalls
> aren't as orthogonal as in Unix-type systems.
>
> Furthermore, it looks like a pain in the ass - often you want to run
> some initialization code, and without fork() you have to run it as often
> as you start processes.
>
> Also, Windows is just a bad platform for event-based programs, and you
> want to do it to some extent (e.g. for watching all your child
> processes). Only for socket handles there is a select() call. For all
> other types of handles you cannot find out in advance whether the
> operation would block or not.
>
> So... if there is any chance you can select the OS, keep away from
> Windows for this type of program.

I think your conclusion needs qualification.

You are trying to shoehorn an existing process-based Linux solution onto 
Windows and discovering that it does not work well. However, Windows already 
provides different ways of achieving the same thing, e.g. F# with its 
first-class events and built-in asynchronous programming syntax using 
the .NET thread pool and mailboxes for message passing.

Moreover, the idiomatic solution on Windows is almost certainly faster than 
anything you can reasonably write under Linux in the absence of a concurrent 
GC (i.e. off the JVM).

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 14:00   ` Jon Harrop
@ 2008-07-10 22:25     ` Richard Jones
  2008-07-10 23:04       ` Jon Harrop
  2008-07-11 14:53     ` [Caml-list] thousands of CPU cores Peng Zang
  2008-07-15 14:39     ` Kuba Ober
  2 siblings, 1 reply; 73+ messages in thread
From: Richard Jones @ 2008-07-10 22:25 UTC (permalink / raw)
  To: Jon Harrop; +Cc: peng.zang, caml-list

On Thu, Jul 10, 2008 at 03:00:02PM +0100, Jon Harrop wrote:
> Today's biggest shared-memory supercomputers already have thousands of cores.

Distributed shared memory perhaps, but thousand core machines are
certainly not UMA SMP.  It's simply not possible for them to be.

> OCaml is already ~8x slower than F# on today's eight core desktops.

You don't half talk a load of nonsense.  MPI OCaml programs on 8 cores
are just as fast, _and_ crucially can scale over clusters and to
future multicore machines.

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 21:35           ` Jon Harrop
@ 2008-07-10 22:39             ` Gerd Stolpmann
  0 siblings, 0 replies; 73+ messages in thread
From: Gerd Stolpmann @ 2008-07-10 22:39 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list


Am Donnerstag, den 10.07.2008, 22:35 +0100 schrieb Jon Harrop:
> On Thursday 10 July 2008 22:19:05 Gerd Stolpmann wrote:
> > After having ported godi to mingw I am not sure whether this works at
> > all. The point is that you usually want to inherit OS resources to the
> > child process (e.g. sockets). The CreateProcess Win32 call
> > (http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx) mentions
> > that you can inherit handles, but I would be careful with the
> > information given in MSDN. Often it works only as far as the presented
> > examples. Windows isn't written for multi-processing, and its syscalls
> > aren't as orthogonal as in Unix-type systems.
> >
> > Furthermore, it looks like a pain in the ass - often you want to run
> > some initialization code, and without fork() you have to run it as often
> > as you start processes.
> >
> > Also, Windows is just a bad platform for event-based programs, and you
> > want to do it to some extent (e.g. for watching all your child
> > processes). Only for socket handles there is a select() call. For all
> > other types of handles you cannot find out in advance whether the
> > operation would block or not.
> >
> > So... if there is any chance you can select the OS, keep away from
> > Windows for this type of program.
> 
> I think your conclusion needs qualification.
> 
> You are trying to shoehorn an existing process-based Linux solution onto 
> Windows and discovering that it does not work well. However, Windows already 
> provides different ways of achieving the same thing, e.g. F# with its 
> first-class events and built-in asynchronous programming syntax using 
> the .NET thread pool and mailboxes for message passing.
> 
> Moreover, the idiomatic solution on Windows is almost certainly faster than 
> anything you can reasonably write under Linux in the absence of a concurrent 
> GC (i.e. off the JVM).

You are right, this is pretty much incomparable. MS has its own way of
doing things, making it hard to write portable code. I mean other OS
vendors managed to jump on the POSIX train.

We should stop here, it's getting off-topic.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 22:25     ` Richard Jones
@ 2008-07-10 23:04       ` Jon Harrop
  2008-07-10 23:41         ` Oliver Bandel
  0 siblings, 1 reply; 73+ messages in thread
From: Jon Harrop @ 2008-07-10 23:04 UTC (permalink / raw)
  To: caml-list

On Thursday 10 July 2008 23:25:36 you wrote:
> On Thu, Jul 10, 2008 at 03:00:02PM +0100, Jon Harrop wrote:
> > OCaml is already ~8x slower than F# on today's eight core desktops.
>
> You don't half talk a load of nonsense.  MPI OCaml programs on 8 cores
> are just as fast...

You may recall that we already tested this and your fastest (unsafe) OCaml 
implementation of the (embarassingly parallel) matrix multiply remains up to 
100x slower than my safe F# implementation:

  http://groups.google.com/group/fa.caml/msg/c3dbf6c5cdb3a898

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 19:15 ` Gerd Stolpmann
  2008-07-10 20:07   ` Sylvain Le Gall
@ 2008-07-10 23:33   ` Oliver Bandel
  2008-07-10 23:43     ` Oliver Bandel
  2008-07-11  6:26     ` Sylvain Le Gall
  2008-07-11  3:01   ` [Caml-list] thousands of CPU cores Brian Hurt
  2 siblings, 2 replies; 73+ messages in thread
From: Oliver Bandel @ 2008-07-10 23:33 UTC (permalink / raw)
  To: caml-list

Zitat von Gerd Stolpmann <info@gerd-stolpmann.de>:

>
> Am Mittwoch, den 09.07.2008, 22:57 -0700 schrieb J C:
> > I know that Caml team wanted to see if many-core shared-memory
> systems
> > were going to stick around before bothering with Caml development
> that
> > takes advantage of them.
> >
> > Well, it looks like they are here to stay, after all:
> >
> > http://news.cnet.com/8301-13924_3-9981760-64.html
> >
> > As much as I hate to look a gift horse in the mouth, and I think
> Caml
> > has been a great and grossly underappreciated product, I need to
> see
> > if writing Caml is a viable code investment for the coming years or
> > something like Haskell, SML, F# or even Ada will be a better
> long-term
> > alternative.
> >
> > Are there plans to make Caml threads OS-native threads, or add
> > OpenMP-style primitives, or otherwise support multiple CPU cores?
> And
> > if so, roughly in what time frame?
>
> I wouldn't take this article too seriously. It's just speculation.
> Actually, the whole multi-core technology is a challenge for software
> development. You cannot simply take a program that runs well on 4
> cores
> and expect that it scales up to 400. Software must be designed from
> grounds up differently for such architectures.
>
> Just open up your mind to this perspective: It's a big risk for the
> CPU
> vendors to haven taken the direction to multi-core.
[...]

Well Sun has machines with a lot of processors and cores and
their machines are really workhorses. :)

I had worked on such a monster, if I remember correctly,
they had 8 cores, and later had upgraded the system to use 32 (?!).

I'm not quite sure on the numbers.

So, if we talk about Intel, they now tell us about 1000 cores...
...a while ago they didn't wanted to go into the multicore dirction
and had only seen a higher cpu-frequency up into the many-GHz
as a solution.
There are better architectures. Some DSPs had made more
instructions per second with some hundred MHz than some GHz-Intel CPU's,
becuase if you only need one cycle to load (more than none) new
instructions
while working at the current one, this means you are much faster.

Now Intel has decided to try more than one Core and their marketing
was good.... there already were multicore processors from other
technology
companies. OK, and while they slept for a while, now they talk about
100'ds, when they have just a few... and who want's this old 32-Bit
bullshit?
32 Bit's technology was available on workstations in the middle of the
eighties. It's about 20 years old.

And Intel processors are good to use, when the winter is cold ;-)


But in general multicore machines are a good idea.
And also other architectures are a good idea. I mean,
there are many ways in which technology can progress,
and only selecting one mostly seems to be a marketing thing, IMHO.


> Except for some
> standard components and some specially-adapted programs multi-core is
> more or less not exploited today.

Well, on systems like Sun's stations or other high-end servers,
there is a lot of parallelism exploited since a while.

But not on our small-budget home PCs.


> So these vendors are trying to push
> the software developers into this direction, and hope they find new
> ideas for designing multi-core-capable programs. This article is just
> propaganda for this hidden agenda. It can also happen that multi-core
> with too many cores turns out as failure - in the sense that the mass
> market is not ready for it.

OK, mass market is the key-word here!


>
> In Ocaml you can exploit multi-core currently only by using
> multi-processing parallel programs that communicate over message
> passing
> (and only on Unix).

Using multi-processes instead of multi-threads is the
usual way on Unix, and it has a lot of advantages.
Threads-apologetes often say, threads are the ultimative
technology... but processes have the advantage of encapsulation
of the wohole environment of the program.

This means a higher degree of saveness/security.

If there would be processes on Windows like on Unix/Linux,
they also would be more interesting to the people. And since
Threads are the only thing to make parallelism on Windows,
it's the only way, most people know of.
Again this is marketing and influence of the marketing leader.

Shell was nothing one would mention under Windows, and now M$
created a PowerShell, and then people on Windows one day will
ask, if there is something like a shell on Linux. Didn't M$ invented
shells?

The same with threads... they are heavy in use on Windows, so it seems
they must be used. But Processes also can be used - at least on Linux
and Unix.



On the above mentioned Sun machines there was paralellism
used a lot. It's their business. They do it. For them it's normal
business. And they heavily use processes. (This does not mean that there
are no multithreaded programs... if you have both, it's even better, of
course, but the processes can be swapt much easier from processor to
processor than threads.)



> Actually, it's an excellent language for this
> style.

Yes.

> http://wink.com). Ocaml is an excellent choice because you can
> quickly
> develop working programs that run 24/7. In the distributed world
> stability is quite important.

Hehe, stability is important also in non-distributed software.
But on the systems that are main stream, this is not the main issue ;-)
And threads btw. make programming stable programs quite harder,
especially if using C or C++. Ocaml ihas it's advantage here,
because it's quite safe/stable.


Ciao,
    Oliver


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 23:04       ` Jon Harrop
@ 2008-07-10 23:41         ` Oliver Bandel
  2008-07-11  0:17           ` Oliver Bandel
  0 siblings, 1 reply; 73+ messages in thread
From: Oliver Bandel @ 2008-07-10 23:41 UTC (permalink / raw)
  To: caml-list

Zitat von Jon Harrop <jon@ffconsultancy.com>:

> On Thursday 10 July 2008 23:25:36 you wrote:
> > On Thu, Jul 10, 2008 at 03:00:02PM +0100, Jon Harrop wrote:
> > > OCaml is already ~8x slower than F# on today's eight core
> desktops.
> >
> > You don't half talk a load of nonsense.  MPI OCaml programs on 8
> cores
> > are just as fast...
>
> You may recall that we already tested this and your fastest (unsafe)
> OCaml
> implementation of the (embarassingly parallel) matrix multiply
> remains up to
> 100x slower than my safe F# implementation:
>
>   http://groups.google.com/group/fa.caml/msg/c3dbf6c5cdb3a898
[...]

Sorry, but following this link, I could not found details on your
comparison. Could please send the correct link, where you
showed details?

And a question to the message passing: which message passing technique
from Ocaml did you use?


Ciao,
   Oliver


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 23:33   ` [Caml-list] " Oliver Bandel
@ 2008-07-10 23:43     ` Oliver Bandel
  2008-07-11  6:26     ` Sylvain Le Gall
  1 sibling, 0 replies; 73+ messages in thread
From: Oliver Bandel @ 2008-07-10 23:43 UTC (permalink / raw)
  To: caml-list

Zitat von Oliver Bandel <oliver@first.in-berlin.de>:

[...]
> There are better architectures. Some DSPs had made more
> instructions per second with some hundred MHz than some GHz-Intel
> CPU's,
> becuase if you only need one cycle to load (more than none) new

    ... (more then ONE) ...



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 23:41         ` Oliver Bandel
@ 2008-07-11  0:17           ` Oliver Bandel
  2008-07-11  9:30             ` Richard Jones
  0 siblings, 1 reply; 73+ messages in thread
From: Oliver Bandel @ 2008-07-11  0:17 UTC (permalink / raw)
  To: caml-list

Hello Jon,


Zitat von Oliver Bandel <oliver@first.in-berlin.de>:
[...]
> And a question to the message passing: which message passing
> technique
> from Ocaml did you use?
[...]

Oh, sorry, I read your mail again and saw that you talked about
Richards OCaml implementation.

I googled for some keyords and found an implementation, mentioned on
this list (well, this thread was that bloated, that I didn't read it
completely, and so I've seen Richards mail the first time this night).

There was a module "Ancient", and I looked, what it does.

It allows using the swap-space for too-large files.
But that's IMHO a way to use more mem, but not to
provide fast executions.

I didn't use that module and heard of it just some minutes ago,
but it seems it's the wrong tool.


So, there should be better ways to go.


Ciao,
   Oliver


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 19:15 ` Gerd Stolpmann
  2008-07-10 20:07   ` Sylvain Le Gall
  2008-07-10 23:33   ` [Caml-list] " Oliver Bandel
@ 2008-07-11  3:01   ` Brian Hurt
  2008-07-11 13:01     ` Gerd Stolpmann
  2008-07-11 15:01     ` Peng Zang
  2 siblings, 2 replies; 73+ messages in thread
From: Brian Hurt @ 2008-07-11  3:01 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: J C, caml-list



On Thu, 10 Jul 2008, Gerd Stolpmann wrote:

>
> I wouldn't take this article too seriously. It's just speculation.

I would take the article seriously.

> Just open up your mind to this perspective: It's a big risk for the CPU
> vendors to haven taken the direction to multi-core.

*Precisely*.  It also stands in stark contrast to the last 50 or so years 
of CPU development, which focused around making single-threaded code 
faster.  And, I note, it's not just one CPU manufacturer who has done this 
(which could be chalked up to stupid management or stupid engineers)- but 
*every* CPU manufacturer.  And what do they get out of it, other than 
ticked off software developers grumbling about having to switch to 
multithreaded code?

I can only see one explanation: they had no choice.  They couldn't make 
single threaded code any faster by throwing more transistors at the 
problem.  We've maxed out speculative execution and instruction level 
parallelism, pushed cache out well past the point of diminishing returns, 
and added all the execution units that'll ever be used, what more can be 
done?  The only way left to increase speed is multicore.

And you still have the steady drum beat of Moore's law- which, by the way, 
only gaurentees that the number of transistors per square millimeter 
doubles every yeah so often (I think the current number is 2 years).  So 
we have the new process which gives us twice the number of transistors as 
the old process, but nothing we can use those new transistors on to make 
single threaded code go any faster.  So you might as well give them two 
cores where they used to have one.

At this point, there are only two things that can prevent kilo-core chips: 
one, some bright boy could come up with some way to use those extra 
transistors to make single-threaded code go faster, or two: Moore's law 
could "expire" within the next 16 years.  We're at quad-core already, 
another 8 doublings every 2 years, with all doublings spent on more cores, 
gets us to 1024 cores.

I think it'll be worse than this, actually, once it gets going.  The 
Pentium III (single core) was 9.5 million transistors, while the Core Duo 
was 291 million.  Even accounting for the 2 cores and some cost to put 
them together, you're looking at the P-III to be 1/16th the size of a 
Core.  If put on the same process so the P-III runs at more or less the 
same clock speed, how much slower would the P-III be?  1/10th?  1/2?  90% 
the speed of the Core?  So long as it's above 1/16th the speed, you're 
ahead.  If your code can scale that far, it's worthwhile to have 32 P-III 
cores instead of a the dual core Core Duo- it's faster.

Yes, there are limits to this (memory bandwidth springs to mind), but the 
point is that more, simpler cores could be a performance win, increasing 
the rate cores multiply faster than Moore's law would dictate.  If we 
decide to go to P-III cores instead of Core cores, we could have 1024-core 
chips comming out in 8 years (4 doublings, not 8, to go from 4x32=64 cores 
to 1024 cores).  And remember, if Moore's law holds out for another 8 
years after we hit 1K cores, that's another 4 doublings, and we're looking 
at CPUs with 16K cores- literally, tens of thousands of cores.

If Moore's law doesn't hold up, that's going to be a different, and much 
larger and smellier, pile of fecal matter to hit the rotary air impeller.

Brian


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: thousands of CPU cores
  2008-07-10 23:33   ` [Caml-list] " Oliver Bandel
  2008-07-10 23:43     ` Oliver Bandel
@ 2008-07-11  6:26     ` Sylvain Le Gall
  2008-07-11  8:50       ` [Caml-list] " Jon Harrop
  1 sibling, 1 reply; 73+ messages in thread
From: Sylvain Le Gall @ 2008-07-11  6:26 UTC (permalink / raw)
  To: caml-list

On 10-07-2008, Oliver Bandel <oliver@first.in-berlin.de> wrote:
> Using multi-processes instead of multi-threads is the
> usual way on Unix, and it has a lot of advantages.
> Threads-apologetes often say, threads are the ultimative
> technology... but processes have the advantage of encapsulation
> of the wohole environment of the program.
>

There is also the fact that using multi process allow you to go further
than the memory limit per process (3GB for Linux/ 1GB for Windows). With
the actual increase of amount of RAM, this can be an issue. For some
time, most of the vendor are selling computer with at least 1GB and
often 2GB or more.

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-11  6:26     ` Sylvain Le Gall
@ 2008-07-11  8:50       ` Jon Harrop
  2008-07-11  9:29         ` Sylvain Le Gall
  2008-07-13  3:17         ` Code Mobility [was Re: thousands of CPU cores] Robert Fischer
  0 siblings, 2 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-11  8:50 UTC (permalink / raw)
  To: caml-list

On Friday 11 July 2008 07:26:44 Sylvain Le Gall wrote:
> On 10-07-2008, Oliver Bandel <oliver@first.in-berlin.de> wrote:
> > Using multi-processes instead of multi-threads is the
> > usual way on Unix, and it has a lot of advantages.
> > Threads-apologetes often say, threads are the ultimative
> > technology... but processes have the advantage of encapsulation
> > of the wohole environment of the program.
>
> There is also the fact that using multi process allow you to go further
> than the memory limit per process...

Yes.

> (3GB for Linux/

Is that for 32-bit Linux?

> 1GB for Windows)...  

32-bit Windows XP has a 2Gb default process memory limit:

  http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx
  http://msdn.microsoft.com/en-us/library/aa366778.aspx

32-bit Windows Server can be increased to 3Gb.

However, any serious power users will already be on 64-bit where these limits 
have been relegated to quaint stories your grandpa will tell you.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: thousands of CPU cores
  2008-07-11  8:50       ` [Caml-list] " Jon Harrop
@ 2008-07-11  9:29         ` Sylvain Le Gall
  2008-07-15 16:01           ` [Caml-list] " Kuba Ober
  2008-07-13  3:17         ` Code Mobility [was Re: thousands of CPU cores] Robert Fischer
  1 sibling, 1 reply; 73+ messages in thread
From: Sylvain Le Gall @ 2008-07-11  9:29 UTC (permalink / raw)
  To: caml-list

On 11-07-2008, Jon Harrop <jon@ffconsultancy.com> wrote:
> On Friday 11 July 2008 07:26:44 Sylvain Le Gall wrote:
>> On 10-07-2008, Oliver Bandel <oliver@first.in-berlin.de> wrote:
>
> However, any serious power users will already be on 64-bit where these limits 
> have been relegated to quaint stories your grandpa will tell you.
>

As you cannot ignore people running on Windows, you cannot ignore people
running on older hardware.

If you plan to program a big DB that will use more than 3GB on 32 bits
hardware, you should take care of this memory limit and consider
splitting your application into several process... 

The "process" approach to parallelism:
- is basic but should fit to most OS around
- require work to split application correctly, wrt to require
  communication bandwidth
- cannot take advantage of shared memory (well you CAN share memory, but
  it is not as easy as in thread/single process)
- increase safety by really separating data

I mean, you can get really good performance with threaded app BUT you
have many drawbacks that create weird behavior/undetectable runtime
bugs. Process approach is portable and limit possible bugs to
communication...

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11  0:17           ` Oliver Bandel
@ 2008-07-11  9:30             ` Richard Jones
  2008-09-21 19:05               ` Michaël Grünewald
  0 siblings, 1 reply; 73+ messages in thread
From: Richard Jones @ 2008-07-11  9:30 UTC (permalink / raw)
  To: Oliver Bandel; +Cc: caml-list

If you also follow the rest of that thread, there's a message passing
OCaml version by Gerd Stolpmann which also scales properly.

To be honest, matrix multiplication interests me not at all since no
one is hand coding their own matrix multiplication when there are
perfectly good, parallel libraries available for most languages,
including OCaml.  Even if you were writing all your applications in C,
you'd still be stupid to hand roll your own matrix multiplication.
Let's have a real example instead.

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11  3:01   ` [Caml-list] thousands of CPU cores Brian Hurt
@ 2008-07-11 13:01     ` Gerd Stolpmann
  2008-07-11 13:43       ` Jon Harrop
  2008-07-12 17:35       ` Brian Hurt
  2008-07-11 15:01     ` Peng Zang
  1 sibling, 2 replies; 73+ messages in thread
From: Gerd Stolpmann @ 2008-07-11 13:01 UTC (permalink / raw)
  To: Brian Hurt; +Cc: caml-list


Am Donnerstag, den 10.07.2008, 23:01 -0400 schrieb Brian Hurt:
> 
> On Thu, 10 Jul 2008, Gerd Stolpmann wrote:
> 
> >
> > I wouldn't take this article too seriously. It's just speculation.
> 
> I would take the article seriously.
> 
> > Just open up your mind to this perspective: It's a big risk for the CPU
> > vendors to haven taken the direction to multi-core.
> 
> *Precisely*.  It also stands in stark contrast to the last 50 or so years 
> of CPU development, which focused around making single-threaded code 
> faster.  And, I note, it's not just one CPU manufacturer who has done this 
> (which could be chalked up to stupid management or stupid engineers)- but 
> *every* CPU manufacturer.  And what do they get out of it, other than 
> ticked off software developers grumbling about having to switch to 
> multithreaded code?
> 
> I can only see one explanation: they had no choice.  They couldn't make 
> single threaded code any faster by throwing more transistors at the 
> problem.  We've maxed out speculative execution and instruction level 
> parallelism, pushed cache out well past the point of diminishing returns, 
> and added all the execution units that'll ever be used, what more can be 
> done?  The only way left to increase speed is multicore.
> 
> And you still have the steady drum beat of Moore's law- which, by the way, 
> only gaurentees that the number of transistors per square millimeter 
> doubles every yeah so often (I think the current number is 2 years).  So 
> we have the new process which gives us twice the number of transistors as 
> the old process, but nothing we can use those new transistors on to make 
> single threaded code go any faster.  So you might as well give them two 
> cores where they used to have one.
> 
> At this point, there are only two things that can prevent kilo-core chips: 
> one, some bright boy could come up with some way to use those extra 
> transistors to make single-threaded code go faster, or two: Moore's law 
> could "expire" within the next 16 years.  We're at quad-core already, 
> another 8 doublings every 2 years, with all doublings spent on more cores, 
> gets us to 1024 cores.

Well, it is an open question whether this alternative holds. I mean
there is a market, and if the market says, "no we don't need that
multicore monsters", the chip companies cannot simply ignore it. Of
course, there are applications that would extremely benefit from them,
but it is the question whether this is only a niche, or something you
can make enough revenues to pay the development of such large
multicores.

In the past, it was very important for hardware vendors that existing
software runs quicker on new CPU generations. This is no longer true for
multicore. So unless there is a software revolution that makes it simple
to exploit multicore, we won't see 1024-cores for the masses.

> I think it'll be worse than this, actually, once it gets going.  The 
> Pentium III (single core) was 9.5 million transistors, while the Core Duo 
> was 291 million.  Even accounting for the 2 cores and some cost to put 
> them together, you're looking at the P-III to be 1/16th the size of a 
> Core.  If put on the same process so the P-III runs at more or less the 
> same clock speed, how much slower would the P-III be?  1/10th?  1/2?  90% 
> the speed of the Core?  So long as it's above 1/16th the speed, you're 
> ahead.  If your code can scale that far, it's worthwhile to have 32 P-III 
> cores instead of a the dual core Core Duo- it's faster.
> 
> Yes, there are limits to this (memory bandwidth springs to mind), but the 
> point is that more, simpler cores could be a performance win, increasing 
> the rate cores multiply faster than Moore's law would dictate.  If we 
> decide to go to P-III cores instead of Core cores, we could have 1024-core 
> chips comming out in 8 years (4 doublings, not 8, to go from 4x32=64 cores 
> to 1024 cores).  And remember, if Moore's law holds out for another 8 
> years after we hit 1K cores, that's another 4 doublings, and we're looking 
> at CPUs with 16K cores- literally, tens of thousands of cores.

Well, there is a another option for the chip industry. Instead of
keeping the die at some size and packing more and more cores on it, they
can also sell smaller chips for less. Basically, this alternate path
already exists (e.g. Intel's Atom). Of course, this makes this industry
more boring, and they would turn into more normal industrial component
suppliers.

At some point this will be unavoidable. The question is whether this
happens in the next years.

Gerd

> If Moore's law doesn't hold up, that's going to be a different, and much 
> larger and smellier, pile of fecal matter to hit the rotary air impeller.
> 
> Brian
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
> 
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 13:01     ` Gerd Stolpmann
@ 2008-07-11 13:43       ` Jon Harrop
  2008-07-11 14:03         ` Basile STARYNKEVITCH
  2008-07-11 17:54         ` Richard Jones
  2008-07-12 17:35       ` Brian Hurt
  1 sibling, 2 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-11 13:43 UTC (permalink / raw)
  To: caml-list

On Friday 11 July 2008 14:01:45 Gerd Stolpmann wrote:
> In the past, it was very important for hardware vendors that existing
> software runs quicker on new CPU generations. This is no longer true for
> multicore. So unless there is a software revolution that makes it simple
> to exploit multicore, we won't see 1024-cores for the masses.

That revolution happened several years ago when everyone migrated to the JVM 
and CLR and their concurrent GCs made it easy to exploit multicores.

Ironically, given the hype surrounding functional programming for parallelism, 
all open source FPLs were left behind. On Linux, even the future prospects 
are bleak: no tail calls on the JVM, prohibitively difficult to implement an 
efficient concurrent GC yourself and Mono is going nowhere.

> Well, there is a another option for the chip industry. Instead of
> keeping the die at some size and packing more and more cores on it, they
> can also sell smaller chips for less. Basically, this alternate path
> already exists (e.g. Intel's Atom). Of course, this makes this industry
> more boring, and they would turn into more normal industrial component
> suppliers.

Other factors are certainly playing an increasingly important role. ARM cores 
are now outselling Intel and AMD thanks to an exploding embedded market. ARM 
CPUs found their way into ultramobiles and are now moving up into laptops. 
Microsoft are supporting ARM-based Windows and .NET.

However, even ARM recently went multicore and their GHz quadcore Cortex-A8 
with integrated nVidia graphics is going into mobile phones. By the end of 
this year, OCaml will not even be able to tap the power of my mobile phone...

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 13:43       ` Jon Harrop
@ 2008-07-11 14:03         ` Basile STARYNKEVITCH
  2008-07-11 15:08           ` Jon Harrop
  2008-07-11 17:28           ` Jon Harrop
  2008-07-11 17:54         ` Richard Jones
  1 sibling, 2 replies; 73+ messages in thread
From: Basile STARYNKEVITCH @ 2008-07-11 14:03 UTC (permalink / raw)
  To: caml-list

Jon Harrop wrote:
> 
> Ironically, given the hype surrounding functional programming for parallelism, 
> all open source FPLs were left behind. On Linux, even the future prospects 
> are bleak: no tail calls on the JVM, prohibitively difficult to implement an 
> efficient concurrent GC yourself and Mono is going nowhere.

It is not specific to Linux (and probably not even to *opensource* 
functional programming languages; I believe proprietary functional 
languages implementations face the same problems). In my perception, 
functional programming requires *blindly fast* memory allocation for 
values which are becoming garbage quickly. This seems a property of 
functional programming (and more generally any programming style 
discouraging side effects), in other words functional programming need 
very efficient garbage collectors (A.Appel wrote stuff on this almost 
20? years ago).

And coding efficient parallel generational collector is really hard, at 
least on current hardware (ask Damien Doligez). Perhaps chip makers 
might acknowledge the importance of supporting read & write barriers on 
the bare metal (which seems easy, more a managerial/merket decision than 
a technical one; on x86 we've got MMX, then SSE1, ... SSE5, AVX but no 
additional instructions for read & write barriers...). There is a 
chicken&egg issue here (no hardware assist for good GC, so no good 
functional language implementation on multicore).

As a case in point, I suggest an experiment (which unfortunately I don't 
have the time or motivation to realize). Replace the current Ocaml GC 
either in bytecode or in nativecode ocaml by Boehm's collector (which is 
multithread compatible). I'm sure you'll get a significant performance 
loss, but you should gain the true multi-threading feature. Of course, 
synchronization issues will appear, very probably in application code 
(and some C function wrappers).

Regards

-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10  5:57 thousands of CPU cores J C
                   ` (4 preceding siblings ...)
  2008-07-10 19:15 ` Gerd Stolpmann
@ 2008-07-11 14:06 ` Xavier Leroy
  2008-07-11 15:20   ` Oliver Bandel
                     ` (3 more replies)
  5 siblings, 4 replies; 73+ messages in thread
From: Xavier Leroy @ 2008-07-11 14:06 UTC (permalink / raw)
  To: J C; +Cc: caml-list

J C wrote:
> I know that Caml team wanted to see if many-core shared-memory systems
> were going to stick around before bothering with Caml development that
> takes advantage of them.
> Well, it looks like they are here to stay, after all:
> http://news.cnet.com/8301-13924_3-9981760-64.html

As others mentioned already, nothing in this news item talks about
shared memory parallelism.  There are good reasons to think that the
illusion of shared memory cannot be maintained in the presence of
hundreds of computing elements, even using cc-NUMA techniques
(i.e. hardware emulation of shared memory on top of high-speed
point-to-point links).  Look at GPUs, which are the closest we have
today to a manycore system: 128 cores are available today, more is
in preparation, but the programming model is definitely not SMP.

I had the opportunity to discuss this with Anwar Ghuloum, the Intel
principal engineer quoted in the Cnet article (and the driving force
behind Intel's joining the Caml consortium, by the way).  His group is
definitely not committed to shared-memory approaches, but instead
investigates high-level data-parallel programming models with a strong
functional flavor that can be mapped both to shared memory and
non-shared memory systems.
(http://techresearch.intel.com/articles/Tera-Scale/1514.htm)

So, I am as convinced as ever that shared mutable data is a terrible
programming model for parallelism, because 1- it's awfully low-level,
error-prone and non-compositional, and 2- interesting parallel
hardware doesn't implement it anyway.  (Examples: clusters, GPUs, the
Cell processor.)  At best, shared memory is a low-level implementation
technique which, like manual memory management, pointer arithmetic and
self-modifying code, is best hidden in the OS and language runtime
system, but should never be exposed to the programmer.

The interesting question that this community should focus on
(rather than throwing fits about concurrent GC and the like) is coming
up with good programming models for parallelism.  I'm quite fond of
message passing myself, but agree that more constrained data-parallel
models have value as well.  As Gerd Stolpmann mentioned, various forms
of message passing can be exploited from OCaml today, but there is
certainly room for improvement there.

Jon Harrop wrote:
> Today's biggest shared-memory supercomputers already have thousands
> of cores.

??? All these supercomputers are clusters.

Sylvain Le Gall wrote:

> There is also the fact that using multi process allow you to go further
> than the memory limit per process (3GB for Linux/ 1GB for Windows). With
> the actual increase of amount of RAM, this can be an issue.

The correct solution to this issue isn't to artificially split your
program in multiple processes, but to move to 64-bit architectures.
You would be hard pressed to buy a desktop PC today that isn't 64-bit
capable.  Linux x86-64 works beautifully.

- Xavier Leroy


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 14:00   ` Jon Harrop
  2008-07-10 22:25     ` Richard Jones
@ 2008-07-11 14:53     ` Peng Zang
  2008-07-15 14:39     ` Kuba Ober
  2 siblings, 0 replies; 73+ messages in thread
From: Peng Zang @ 2008-07-11 14:53 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 10 July 2008 10:00:02 am Jon Harrop wrote:
> Today's biggest shared-memory supercomputers already have thousands of
> cores.
>
> > Also, this is a CNET article.. not exactly known for being in depth or
> > well researched and this article is no exception.  It is an article based
> > entirely on a few speculative comments of some Intel guys.  I wouldn't
> > take it too seriously.
> >
> > Personally, I can see why the Caml development team opted not to put
> > effort into dealing with shared-memory systems.
>
> The OCaml development team put huge effort into their concurrent run-time.

No, don't get me wrong, I'm all about concurrency and I'm glad the OCaml dev 
team put a lot of effort into it.  I'm talking about specific optimizations 
for shared-memory architectures.

> > It is a stop-gap solution...
>
> That is not true. Many-core machines will always be decomposed into
> shared-memory clusters of as many cores as possible because shared memory
> parallelism will always be orders of magnitude more efficient than
> distributed parallelism.

Hmm... that's a good point.  Although, I want to point out that parallel 
algorithm design (and hardware design) isn't nearly as well studied.

Peng
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)

iD8DBQFId3PnfIRcEFL/JewRApH6AKDBI5Wd95Gc6YIt/nvU41lIdiaw2ACfcONA
YX8PCVBkcnSYkN3R8MC1yys=
=rkJx
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11  3:01   ` [Caml-list] thousands of CPU cores Brian Hurt
  2008-07-11 13:01     ` Gerd Stolpmann
@ 2008-07-11 15:01     ` Peng Zang
  2008-07-12  0:23       ` Oliver Bandel
  1 sibling, 1 reply; 73+ messages in thread
From: Peng Zang @ 2008-07-11 15:01 UTC (permalink / raw)
  To: caml-list; +Cc: Brian Hurt, Gerd Stolpmann

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 10 July 2008 11:01:31 pm Brian Hurt wrote:
> On Thu, 10 Jul 2008, Gerd Stolpmann wrote:
> > I wouldn't take this article too seriously. It's just speculation.
>
> I would take the article seriously.
>
> > Just open up your mind to this perspective: It's a big risk for the CPU
> > vendors to haven taken the direction to multi-core.
>
> *Precisely*.  It also stands in stark contrast to the last 50 or so years
> of CPU development, which focused around making single-threaded code
> faster.  And, I note, it's not just one CPU manufacturer who has done this
> (which could be chalked up to stupid management or stupid engineers)- but
> *every* CPU manufacturer.  And what do they get out of it, other than
> ticked off software developers grumbling about having to switch to
> multithreaded code?

I think we can all agree that more computing units being used in parallel is 
going to be the future.  The main point here is that a shared-memory 
architecture is not necessarily (and in my opinion doubtful) the approach 
that will be taken for large numbers of CPUs.

Peng
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)

iD4DBQFId3XafIRcEFL/JewRAqWsAJQIUFRO7aMoyVOZGzmKbXITloOwAKCm+QZd
WR7HXzzrzuNL8q3q3HuztQ==
=2IcK
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 14:03         ` Basile STARYNKEVITCH
@ 2008-07-11 15:08           ` Jon Harrop
  2008-07-11 17:28           ` Jon Harrop
  1 sibling, 0 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-11 15:08 UTC (permalink / raw)
  To: caml-list

On Friday 11 July 2008 15:03:48 Basile STARYNKEVITCH wrote:
> It is not specific to Linux (and probably not even to *opensource*
> functional programming languages; I believe proprietary functional
> languages implementations face the same problems).

Indeed, Mathematica has the same problem but, I believe, Wolfram Research are 
migrating it to the JVM for this reason.

> In my perception, 
> functional programming requires *blindly fast* memory allocation for
> values which are becoming garbage quickly. This seems a property of 
> functional programming (and more generally any programming style
> discouraging side effects), in other words functional programming need
> very efficient garbage collectors (A.Appel wrote stuff on this almost
> 20? years ago).

Although that is established functional folklore, I believe it is misguided to 
try to apply that to more mainstream concerns. Moreover, the problem can be 
largely avoided by adopting a more modern JIT-based approach to language 
implementation anyway.

OCaml and its ancestors and relatives like Haskell have traditionally been 
used by academics for applications with the value lifetime distribution that 
you describe (very high allocation rates for short lived values) when it is 
not unusual to see 30% CPU time spent in the GC.

However, OCaml really pioneered the use of this family of languages in 
completely different applications such as numerical methods for scientific 
computing thanks to OCaml's unusually good floating point performance. Such 
applications do not share the characteristic that you describe but they still 
benefit enormously from first-class functions, tail calls, an expressive 
static type system and so on. These applications benefit far more from good 
code generation than from a fast GC and it is now unusual to see >5% CPU time 
spent in the GC for most OCaml programs.

Type specialization during JIT compilation removes the need for a uniform 
run-time representation of values which, amongst other things, obviates all 
boxing of floats. Value types allow custom data structures to be stored 
unboxed when appropriate (e.g. complex numbers).

This is why F# can be so productive for high-performance numerics even though 
it is built upon a run-time that was specifically designed for C#'s 
characteristics.

> As a case in point, I suggest an experiment (which unfortunately I don't
> have the time or motivation to realize). Replace the current Ocaml GC
> either in bytecode or in nativecode ocaml by Boehm's collector (which is
> multithread compatible). I'm sure you'll get a significant performance
> loss, but you should gain the true multi-threading feature. Of course,
> synchronization issues will appear, very probably in application code
> (and some C function wrappers).

That is an interesting idea and, in fact, perhaps LLVM+Boehm would be the 
easiest way to create a new functional language implementation that captures 
F#'s productivity.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 14:06 ` Xavier Leroy
@ 2008-07-11 15:20   ` Oliver Bandel
  2008-07-11 15:23   ` Bill
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 73+ messages in thread
From: Oliver Bandel @ 2008-07-11 15:20 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: J C, caml-list

Zitat von Xavier Leroy <Xavier.Leroy@inria.fr>:
[...]
> The interesting question that this community should focus on
> (rather than throwing fits about concurrent GC and the like) is
> coming
> up with good programming models for parallelism.

OK, I agree, but parallel programming is not really new.
It's just new on or small PCs we ahve at home
It's used since 20 years on high-performance/high-end computers.

I'm not sure there were functional programming models
used, I rather doubt this, but I'm not very familiar with that area.
If there already would be some kind of know-how in parallel functional
programming, this might be adapted to the PCs.

[...]
>  I'm quite fond of
> message passing myself, but agree that more constrained data-parallel
> models have value as well.  As Gerd Stolpmann mentioned, various
> forms
> of message passing can be exploited from OCaml today, but there is
> certainly room for improvement there.
[...]

Can you name it? I mean... the room for improvement.

Ciao,
   Oliver


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 14:06 ` Xavier Leroy
  2008-07-11 15:20   ` Oliver Bandel
@ 2008-07-11 15:23   ` Bill
  2008-07-11 18:14   ` Mattias Engdegård
  2008-07-12 23:05   ` J C
  3 siblings, 0 replies; 73+ messages in thread
From: Bill @ 2008-07-11 15:23 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: J C, caml-list

On Fri, 2008-07-11 at 16:06 +0200, Xavier Leroy wrote:
   . . .
> The interesting question that this community should focus on
> (rather than throwing fits about concurrent GC and the like) is coming
> up with good programming models for parallelism.  I'm quite fond of
> message passing myself, but agree that more constrained data-parallel
> models have value as well.  As Gerd Stolpmann mentioned, various forms
> of message passing can be exploited from OCaml today, but there is
> certainly room for improvement there.

Perhaps this is subsumed by some of the terminology flying around this
discussion, but what about (synchronous) dataflow?  I had some pretty
good-looking preliminary results implementing telecom algorithms in
dataflow networks.  One nice side-effect was that latency and throughput
could be tied to the "aspect ratio" (length vs. breadth) of the dataflow
network.  This could be an opening for the design-space trade-off design
style that hardware designers are used to but that is rare in software.

The resulting designs look upside-down to software designers -- instead
of a few big processes doing complicated work and
communicating/coordinating with each other there is a large number of
small functions each doing its thing to the next item on its input queue
and passing it on.

 -- Bill Wood



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 14:03         ` Basile STARYNKEVITCH
  2008-07-11 15:08           ` Jon Harrop
@ 2008-07-11 17:28           ` Jon Harrop
  1 sibling, 0 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-11 17:28 UTC (permalink / raw)
  To: caml-list

On Friday 11 July 2008 15:03:48 Basile STARYNKEVITCH wrote:
> As a case in point, I suggest an experiment (which unfortunately I don't
> have the time or motivation to realize). Replace the current Ocaml GC
> either in bytecode or in nativecode ocaml by Boehm's collector (which is
> multithread compatible).

Now that I come to think of it, doesn't OCaml extensively break Boehm's 
assumptions, e.g. that pointer-like values refer to the start of an allocated 
block? So Boehm is likely to not collect anything.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 13:43       ` Jon Harrop
  2008-07-11 14:03         ` Basile STARYNKEVITCH
@ 2008-07-11 17:54         ` Richard Jones
  2008-07-11 18:30           ` Raoul Duke
  1 sibling, 1 reply; 73+ messages in thread
From: Richard Jones @ 2008-07-11 17:54 UTC (permalink / raw)
  To: caml-list

On Fri, Jul 11, 2008 at 02:43:53PM +0100, Jon Harrop wrote:
> On Friday 11 July 2008 14:01:45 Gerd Stolpmann wrote:
> > In the past, it was very important for hardware vendors that existing
> > software runs quicker on new CPU generations. This is no longer true for
> > multicore. So unless there is a software revolution that makes it simple
> > to exploit multicore, we won't see 1024-cores for the masses.
> 
> That revolution happened several years ago when everyone migrated to the JVM 
> and CLR and their concurrent GCs made it easy to exploit multicores.

CLR & JVM running easily on 1024 cores, this I gotta see!

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 14:06 ` Xavier Leroy
  2008-07-11 15:20   ` Oliver Bandel
  2008-07-11 15:23   ` Bill
@ 2008-07-11 18:14   ` Mattias Engdegård
  2008-07-12 23:05   ` J C
  3 siblings, 0 replies; 73+ messages in thread
From: Mattias Engdegård @ 2008-07-11 18:14 UTC (permalink / raw)
  To: Xavier.Leroy; +Cc: jhc0033, caml-list

>[...] There are good reasons to think that the
>illusion of shared memory cannot be maintained in the presence of
>hundreds of computing elements, even using cc-NUMA techniques
>(i.e. hardware emulation of shared memory on top of high-speed
>point-to-point links).

I'm not arguing any of your points but just note that larger NUMA
machines than that are available and sometimes practical - SGI Altix
go up to 1024 cores with a single system image.

(To answer Richard Jones's question, I know Bea have tested their JVM
on such a machine but I have no idea whether it turned out to be
useful. I doubt there are many Java applications actually needing such
a wide JVM.)


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 17:54         ` Richard Jones
@ 2008-07-11 18:30           ` Raoul Duke
  0 siblings, 0 replies; 73+ messages in thread
From: Raoul Duke @ 2008-07-11 18:30 UTC (permalink / raw)
  To: caml-list

> CLR & JVM running easily on 1024 cores, this I gotta see!

not ideal but apparently (i don't work for them and have never used
them) if you stick to (unfortunately proprietary $ystem$ like) Azul,
you can get up to 864 cores.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 15:01     ` Peng Zang
@ 2008-07-12  0:23       ` Oliver Bandel
  2008-07-12 22:54         ` J C
  0 siblings, 1 reply; 73+ messages in thread
From: Oliver Bandel @ 2008-07-12  0:23 UTC (permalink / raw)
  To: peng.zang; +Cc: caml-list, Brian Hurt, Gerd Stolpmann

Zitat von Peng Zang <peng.zang@gmail.com>:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Thursday 10 July 2008 11:01:31 pm Brian Hurt wrote:
> > On Thu, 10 Jul 2008, Gerd Stolpmann wrote:
> > > I wouldn't take this article too seriously. It's just
> speculation.
> >
> > I would take the article seriously.
> >
> > > Just open up your mind to this perspective: It's a big risk for
> the CPU
> > > vendors to haven taken the direction to multi-core.
> >
> > *Precisely*.  It also stands in stark contrast to the last 50 or so
> years
> > of CPU development, which focused around making single-threaded
> code
> > faster.  And, I note, it's not just one CPU manufacturer who has
> done this
> > (which could be chalked up to stupid management or stupid
> engineers)- but
> > *every* CPU manufacturer.  And what do they get out of it, other
> than
> > ticked off software developers grumbling about having to switch to
> > multithreaded code?
>
> I think we can all agree that more computing units being used in
> parallel is
> going to be the future.  The main point here is that a shared-memory
> architecture is not necessarily (and in my opinion doubtful) the
> approach
> that will be taken for large numbers of CPUs.
[...]

For example, if you have a non-profit research project,
you can use the BOINC infrastructure, which provides
about 580000 PCs to help you :)

http://en.wikipedia.org/wiki/Berkeley_Open_Infrastructure_for_Network_Computing

There is no Shared-Mem as we know it from our local PCs, there
is distributed calculation around the whole planet.

Threads will not help there ;-)

Ciao,
   Oliver


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 13:01     ` Gerd Stolpmann
  2008-07-11 13:43       ` Jon Harrop
@ 2008-07-12 17:35       ` Brian Hurt
  1 sibling, 0 replies; 73+ messages in thread
From: Brian Hurt @ 2008-07-12 17:35 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: caml-list



On Fri, 11 Jul 2008, Gerd Stolpmann wrote:

> Well, it is an open question whether this alternative holds. I mean
> there is a market, and if the market says, "no we don't need that
> multicore monsters", the chip companies cannot simply ignore it.

You're still assuming the chip companies *have a choice*.  And that 
there for, if the markets are just demanding enough, they'll go "oh, well, 
we didn't realize how important this is to you- we'll just go back to 
single threaded chips now..."  My point is that *all* of the chip 
companies are not this stupid, and if there was a choice, most of them 
wouldn't have choosen this route.  If there was one holdout, one chip 
company not going multicore, I'd probably agree with you.  I don't see the 
holdout (not among the chip makers at all pushing the performance 
envelope- yeah, a number of players in the embedded market aren't going 
multicore).

> Well, there is a another option for the chip industry. Instead of
> keeping the die at some size and packing more and more cores on it, they
> can also sell smaller chips for less. Basically, this alternate path
> already exists (e.g. Intel's Atom). Of course, this makes this industry
> more boring, and they would turn into more normal industrial component
> suppliers.

This has two problems, and "boredom" isn't one of them:

1) It undercuts the main selling point to get people to upgrade- increased 
performance.  We're used to buying new CPUs every yeah so often because 
the ones on the market are significantly faster than the ones we have 
(although the definition of "significantly faster" has slowly gone down 
over the years).

One of the ways the next generation was made faster was by throwing more 
transistors at the problem (allowing you to do more work per clock cycle). 
The other way was clocking the CPUs faster, but that also seems to be 
hitting problems (I'm not sure if the Power 6 means we may be exiting the 
clock speed drought, or if it's just a "revenge of the RISC" (i.e. 
something specific to the Power architecture that allows it to be clocked 
~2x as fast as the x86).

In either case, you're going to see a drop in the unit sales of "high end" 
CPUs.  And the third world isn't going to help with this- when I first 
became a professional software engineer, I bragged about having access to 
400MHz CPUs with 64M of ram- basically, computers with the horse power of 
a OLPC.

2) It really cuts into margins.  Generally, the amount of money the chip 
maker gets from selling a single $200 CPU is greater than the amount they 
get from selling 2 $100 CPUs, let alone 8 $25 CPUs.

And those are just the problems from the CPU vendor's perspective.  Now, 
lets look at this from the Software Developer's perspective:

We've fallen solidly off the Clock Speed ramp- even if we ignore the P4's 
inflated clock rates, it's been over 8 years since we've hit 1GHz, we 
should be at 8GHz or more now.  Now the 2-3GHz we're at.  And if we can't 
throw more transistors at the problem, CPUs simply aren't getting 
significantly faster.  Oh, there may be percentage-point gains now and 
again, but basically, where we are is where we'll stay.

Moore's Law just ended, from the software developers perspective.

No longer will the increasing speed of CPUs offset the increasing bloat 
and slowness of our code.

This turns out to be just a radical change of a different sort.

Brian


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-12  0:23       ` Oliver Bandel
@ 2008-07-12 22:54         ` J C
  2008-07-19 12:06           ` Oliver Bandel
  0 siblings, 1 reply; 73+ messages in thread
From: J C @ 2008-07-12 22:54 UTC (permalink / raw)
  To: caml-list

On Fri, Jul 11, 2008 at 5:23 PM, Oliver Bandel
<oliver@first.in-berlin.de> wrote:

> For example, if you have a non-profit research project,
> you can use the BOINC infrastructure, which provides
> about 580000 PCs to help you :)
>
> http://en.wikipedia.org/wiki/Berkeley_Open_Infrastructure_for_Network_Computing
>
> There is no Shared-Mem as we know it from our local PCs, there
> is distributed calculation around the whole planet.
>
> Threads will not help there ;-)

But on each of those PCs there may be 1000 cores in the near future.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11 14:06 ` Xavier Leroy
                     ` (2 preceding siblings ...)
  2008-07-11 18:14   ` Mattias Engdegård
@ 2008-07-12 23:05   ` J C
  3 siblings, 0 replies; 73+ messages in thread
From: J C @ 2008-07-12 23:05 UTC (permalink / raw)
  To: caml-list

On Fri, Jul 11, 2008 at 7:06 AM, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
>  Look at GPUs, which are the closest we have
> today to a manycore system: 128 cores are available today, more is
> in preparation, but the programming model is definitely not SMP.

I was reading an article about CUDA written by an in-the-trenches
GPGPU programmer. I can't find it now, but one of the points of the
article, as I understood it, was that stream-oriented approaches (like
BrookGPU) look great in theory, but don't work very well in practice -
they can often be orders of magnitude slower than "dirty" approaches
that use some mutable shared memory block in the video card. In other
words, pure-functional programming for multi-core concurrency is just
a speculative promise (for academic funding purposes perhaps).

(If this sounds familiar and anyone has the link, please post)


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Code Mobility [was Re: thousands of CPU cores]
  2008-07-11  8:50       ` [Caml-list] " Jon Harrop
  2008-07-11  9:29         ` Sylvain Le Gall
@ 2008-07-13  3:17         ` Robert Fischer
  1 sibling, 0 replies; 73+ messages in thread
From: Robert Fischer @ 2008-07-13  3:17 UTC (permalink / raw)
  To: caml-list

One thing being left out on this conversation is code mobility.

If I want to execute an arbitrary piece of code in parallel, organizing a distributed message
passing system to accomplish that is nontrivial.

Since we're in a functional language, a key piece of our functionality is the ability to abstract
out what code we're executing in new and exciting ways -- CPS being a classic example of this.
However, if I'm in a multi-process system, my capability of doing CPS is exceedingly limited,
because OCaml isn't the kind of language where you can just schlep around arbitrary code.

So, until we have code mobility, there is a huge win of multithreaded code over multiprocess code.

Problems which can be simply broken down at a nice, obvious breakpoint work just fine in a
multiprocess style.  But if I want implicit or radical parallelism -- if I want the ability to go
"take this function -- whatever it is -- and execute it in parallel and give me back the result",
then I'm going to need truly parallel threads.  And that kind of stunts is exactly the kind of thing
which is going to make your code leverage kilocores effectively.

~~ Robert.

Jon Harrop wrote:
> On Friday 11 July 2008 07:26:44 Sylvain Le Gall wrote:
>> On 10-07-2008, Oliver Bandel <oliver@first.in-berlin.de> wrote:
>>> Using multi-processes instead of multi-threads is the
>>> usual way on Unix, and it has a lot of advantages.
>>> Threads-apologetes often say, threads are the ultimative
>>> technology... but processes have the advantage of encapsulation
>>> of the wohole environment of the program.
>> There is also the fact that using multi process allow you to go further
>> than the memory limit per process...
> 
> Yes.
> 
>> (3GB for Linux/
> 
> Is that for 32-bit Linux?
> 
>> 1GB for Windows)...  
> 
> 32-bit Windows XP has a 2Gb default process memory limit:
> 
>   http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx
>   http://msdn.microsoft.com/en-us/library/aa366778.aspx
> 
> 32-bit Windows Server can be increased to 3Gb.
> 
> However, any serious power users will already be on 64-bit where these limits 
> have been relegated to quaint stories your grandpa will tell you.
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 11:35 ` Jon Harrop
@ 2008-07-14 11:32   ` J C
  2008-07-14 12:08     ` Jon Harrop
  0 siblings, 1 reply; 73+ messages in thread
From: J C @ 2008-07-14 11:32 UTC (permalink / raw)
  To: caml-list

On Thu, Jul 10, 2008 at 4:35 AM, Jon Harrop <jon@ffconsultancy.com> wrote:

> OCaml already has OS native threads (albeit with a global lock), already
> supports OpenMP and can already be used to write parallel programs that
> exploit multiple cores.
>
> ...
> Incidentally, MP is good for distributed parallelism but fails to take
> advantage of shared memory (with a concurrent GC).

I think you are confusing stuff. OpenMP is a shared-memory API, MPI is
a message-passing interface, OpenMPI is one implementation of the
latter, OCamlMPI is another. OCaml has little to do with OpenMP
though. Am I wrong?


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-14 11:32   ` J C
@ 2008-07-14 12:08     ` Jon Harrop
  2008-07-14 17:04       ` Mike Lin
  2008-07-14 17:16       ` Richard Jones
  0 siblings, 2 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-14 12:08 UTC (permalink / raw)
  To: caml-list

On Monday 14 July 2008 12:32:53 J C wrote:
> On Thu, Jul 10, 2008 at 4:35 AM, Jon Harrop <jon@ffconsultancy.com> wrote:
> > OCaml already has OS native threads (albeit with a global lock), already
> > supports OpenMP and can already be used to write parallel programs that
> > exploit multiple cores.
> >
> > ...
> > Incidentally, MP is good for distributed parallelism but fails to take
> > advantage of shared memory (with a concurrent GC).
>
> I think you are confusing stuff.

Yes. You stated "OpenMP" in your question but my response was about "OpenMPI". 
Sorry. I was not aware of OpenMP, which appears to post-date my use of C and 
Fortran.

> OpenMP is a shared-memory API, MPI is 
> a message-passing interface, OpenMPI is one implementation of the
> latter, OCamlMPI is another. OCaml has little to do with OpenMP
> though. Am I wrong?

I believe you are correct. Moreover, I suspect that adding support for OpenMP 
to OCaml would be difficult because the current OCaml implementation is 
thread unsafe.

Perhaps the parallel GC could enable support for things like OpenMP but I 
personally would rather see a shift to similar functionality to that of 
Microsoft's TPL because (I assume) it is better for parallel tree operations 
that are themselves more common in languages like OCaml.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-14 12:08     ` Jon Harrop
@ 2008-07-14 17:04       ` Mike Lin
  2008-07-14 17:28         ` Jon Harrop
  2008-07-14 17:16       ` Richard Jones
  1 sibling, 1 reply; 73+ messages in thread
From: Mike Lin @ 2008-07-14 17:04 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 1458 bytes --]

On Mon, Jul 14, 2008 at 8:08 AM, Jon Harrop <jon@ffconsultancy.com> wrote:

>
> Perhaps the parallel GC could enable support for things like OpenMP but I
> personally would rather see a shift to similar functionality to that of
> Microsoft's TPL because (I assume) it is better for parallel tree
> operations
> that are themselves more common in languages like OCaml.


OpenMP is really great for parallelizing tight loops in numerical code,
which is one scenario in which I'd agree shared memory is much better than
message passing, at least as far as it scales. I wish I had this for my
OCaml CRF and M^3 network code!

But for higher level, map/reduce type of stuffs, I really think message
passing tends to gets you there. In such applications I am usually
interested in distributing across a compute farm anyway, for both CPU and
memory requirements. I started with a lame homerolled fork+Marshal library,
then moved on to Gerd's RPC stuff, now finally I'm playing with ocamlp3l...

Incidentally, it occurs to me that when one is optimizing the kind of tight
numerical loops that can really benefit from shared memory, the FIRST step,
before parallelizing, is to do away with any heap allocations in the loop.
The following is not a serious proposal, but just to kick the idea around -
what is the feasibility of removing the global interpreter lock for segments
of code which perform no heap allocations? i.e. what besides the GC is
stopping us?

Mike

[-- Attachment #2: Type: text/html, Size: 1748 bytes --]

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-14 12:08     ` Jon Harrop
  2008-07-14 17:04       ` Mike Lin
@ 2008-07-14 17:16       ` Richard Jones
  1 sibling, 0 replies; 73+ messages in thread
From: Richard Jones @ 2008-07-14 17:16 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

On Mon, Jul 14, 2008 at 01:08:23PM +0100, Jon Harrop wrote:
> I believe you are correct. Moreover, I suspect that adding support for OpenMP 
> to OCaml would be difficult because the current OCaml implementation is 
> thread unsafe.

OpenMP isn't your typical library.  It's a set of wierd preprocessor
directives which are added directly into C/C++ code (and I believe
FORTRAN too).  Things like:

  #pragma omp parallel for
  for (i = 0; i < 100; i++)
  {
     a[i] = k * b[i];
  }

The, erm, feature here is that the code still gives the same result
(just more slowly) if the #pragmas are simply ignored.

I recently saw Ulrich Drepper giving a talk about OpenMP and it was
interesting in a 'what are the C programmers smoking nowadays' kind of
way.  The barriers to porting OpenMP to OCaml go far beyond lack or
otherwise of concurrent garbage collection.

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-14 17:04       ` Mike Lin
@ 2008-07-14 17:28         ` Jon Harrop
  0 siblings, 0 replies; 73+ messages in thread
From: Jon Harrop @ 2008-07-14 17:28 UTC (permalink / raw)
  To: caml-list

On Monday 14 July 2008 18:04:01 Mike Lin wrote:
> Incidentally, it occurs to me that when one is optimizing the kind of tight
> numerical loops that can really benefit from shared memory, the FIRST step,
> before parallelizing, is to do away with any heap allocations in the loop.
> The following is not a serious proposal, but just to kick the idea around -
> what is the feasibility of removing the global interpreter lock for
> segments of code which perform no heap allocations? i.e. what besides the
> GC is stopping us?

I have had similar ideas but I think it would be much wiser to move away from 
OCaml and start afresh. You can easily write a compiler for a subset of OCaml 
suitable for high-performance numerics using LLVM. Users can easily develop 
their numerical code with OCaml and then quote it using camlp4 to have it 
compiled at run-time. However, this only works for a tiny DSL where no 
allocations are required (you can easily improve upon OCaml by not boxing 
floats and by using value types though).

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-10 14:00   ` Jon Harrop
  2008-07-10 22:25     ` Richard Jones
  2008-07-11 14:53     ` [Caml-list] thousands of CPU cores Peng Zang
@ 2008-07-15 14:39     ` Kuba Ober
  2008-07-19 12:41       ` Oliver Bandel
  2 siblings, 1 reply; 73+ messages in thread
From: Kuba Ober @ 2008-07-15 14:39 UTC (permalink / raw)
  To: caml-list

> > It is a stop-gap solution...
>
> That is not true. Many-core machines will always be decomposed into
> shared-memory clusters of as many cores as possible because shared memory
> parallelism will always be orders of magnitude more efficient than
> distributed parallelism.

The way "shared memory" on today's systems is implemented in hardware is
already by essentially message passing. It's just that hardcoded logic does it
all and provides an impression of shared memory, rather than having software
deal with it.

The fact that the software sees it as shared memory doesn't change the fact
that at current system bandwidths we've already run into physical 
implementation limits that make the smooth, fully-random-access memory
a mere illusion. When you read a single uncached byte out of RAM,
there's a big bunch of housekeeping and what-amounts-to-transactional
processing done at the hardware level.

If you count the "efficiency" of such out-of-the-blue uncached truly random
access in terms of clock cycles, current hardware may be 1-2 orders of
magnitude less efficient than almost any 8-bit microcontroller out there...
On most MCUs you can read a random byte out of the SRAM in say 1-4 clock
cycles. On your commonplace modern multicore CPU, it may take a hundred clock
cycles to do the same, and essentially the same amount of time in terms of the
wall clock (a 2GHz CPU has only 100 times faster clock than a run of the mill
20MHz MCU).

What I'm trying to say is that such random, small memory accesses highlight
the inherent message passing / transactional overhead of the hardware
implementation. Those overheads amortize when you run real number tasks,
not a made-up cold single byte access of course. But they are there.

It's akin to mmaped file: you can use CPU's MMU to implement it in the 
usual OS/stock hardware framework, or you can have an FPGA handle memory
transactions and talk directly to the hard drive. It doesn't change the
fact that it's still a mmaped file :)

Cheers, Kuba


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 20:24     ` [Caml-list] " Gerd Stolpmann
  2008-07-10 21:02       ` Sylvain Le Gall
@ 2008-07-15 15:21       ` Kuba Ober
  1 sibling, 0 replies; 73+ messages in thread
From: Kuba Ober @ 2008-07-15 15:21 UTC (permalink / raw)
  To: caml-list

On Thursday 10 July 2008, Gerd Stolpmann wrote:
> Am Donnerstag, den 10.07.2008, 20:07 +0000 schrieb Sylvain Le Gall:
> > On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > > In Ocaml you can exploit multi-core currently only by using
> > > multi-processing parallel programs that communicate over message
> > > passing (and only on Unix). Actually, it's an excellent language for
> > > this style.
> >
> > Why only on Unix ?
>
> No fork() on Windows. And emulating its effects is hard.
>
> I would subsume Cygwin under "pseudo-Unix", and its fork emulation is so
> slow that it would be a problem for speedy programs.

AFAIK, Cygwin's fork() emulation is quite limited since Cygwin didn't go the 
way of doing a custom process loader. I.e. instead of having a *tiny* 
statically linked loader executable which then actually brings in code and
data pages into the process space (and would allow sharing them copy-on-write
with other processes), they just use Windows for that, and that's why they
have to emulate (and make up) their process IDs. Doing an exec() when you have
your own loader is trivial: just tell the loader to deallocate all of
the process's virtual memory (save for that of the loader's), and load 
something else instead.

I have fiddled some time ago with a very basic custom loader which uses native
API and it worked OK for what it did. Of course Cygwin has to work under 
win 95, so it has no access to native API there, but on anything modern it
could easily have speeds comparable to native. I mean, Microsoft has
implemented the Posix subsystem using same native API and its fork() works
just fine (read: way better than Cygwin's).

I have last delved into all this ~5+ years ago, so things may have changed
in the meantime...

Cheers, Kuba


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-10 21:19         ` [Caml-list] " Gerd Stolpmann
  2008-07-10 21:35           ` Jon Harrop
@ 2008-07-15 15:57           ` Kuba Ober
  2008-07-15 18:03             ` Gerd Stolpmann
  1 sibling, 1 reply; 73+ messages in thread
From: Kuba Ober @ 2008-07-15 15:57 UTC (permalink / raw)
  To: caml-list

On Thursday 10 July 2008, Gerd Stolpmann wrote:
> Am Donnerstag, den 10.07.2008, 21:02 +0000 schrieb Sylvain Le Gall:
> > On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > > Am Donnerstag, den 10.07.2008, 20:07 +0000 schrieb Sylvain Le Gall:
> > >> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > >> > In Ocaml you can exploit multi-core currently only by using
> > >> > multi-processing parallel programs that communicate over message
> > >> > passing (and only on Unix). Actually, it's an excellent language for
> > >> > this style.
> > >>
> > >> Why only on Unix ?
> > >
> > > No fork() on Windows. And emulating its effects is hard.
> >
> > open_process + stdin/stdout should do the trick... at least i think so.
>
> After having ported godi to mingw I am not sure whether this works at
> all. The point is that you usually want to inherit OS resources to the
> child process (e.g. sockets). The CreateProcess Win32 call
> (http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx) mentions
> that you can inherit handles, but I would be careful with the
> information given in MSDN. Often it works only as far as the presented
> examples. Windows isn't written for multi-processing, and its syscalls
> aren't as orthogonal as in Unix-type systems.

Windows syscalls are quite reasonable IMHO, if a tad undocumented. ReactOS
folk have done a great job of reimplementing most of them, and there isn't
anything mucho broken about those. In fact, I'd posit that Windows native
syscalls expose some functionality that's traditionally unavailable on unices
and requires hacks to achieve (usually via executable code injection). Just
look at what Wine folks had to do in order to emulate some win32 (not even
native!) API on Linux: a lot of hard work for what amounts to a single API
call. This of course works both ways, and on Windows, while a fork()
implementation is simple, it AFAIK requires a custom loader or some other
ingenuity to work.

> Furthermore, it looks like a pain in the ass - often you want to run
> some initialization code, and without fork() you have to run it as often
> as you start processes.

On Windows, there's the native API, which is then used by the win32 subsystem
and posix subsystems to do the job. Native API allows fork() implementations
mostly on par with what you get on Unices. MS has a posix subsystem on which
fork() performs in the same ballpark as fork() on linux, and make Cygwin's
fork() look bad like it deserves. About the only good thing about Cygwin's 
fork() is that it works on win9x.

> Also, Windows is just a bad platform for event-based programs, and you
> want to do it to some extent (e.g. for watching all your child
> processes). Only for socket handles there is a select() call. For all
> other types of handles you cannot find out in advance whether the
> operation would block or not.

This is misinformation at best, FUD at worst. I'm no Microsoft fanboy,
but the reality is quite opposite to what you claim. Windows has quite
robust asynchronous I/O support.

Cheers, Kuba


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-11  9:29         ` Sylvain Le Gall
@ 2008-07-15 16:01           ` Kuba Ober
  0 siblings, 0 replies; 73+ messages in thread
From: Kuba Ober @ 2008-07-15 16:01 UTC (permalink / raw)
  To: caml-list

On Friday 11 July 2008, Sylvain Le Gall wrote:
> On 11-07-2008, Jon Harrop <jon@ffconsultancy.com> wrote:
> > On Friday 11 July 2008 07:26:44 Sylvain Le Gall wrote:
> >> On 10-07-2008, Oliver Bandel <oliver@first.in-berlin.de> wrote:
> >
> > However, any serious power users will already be on 64-bit where these
> > limits have been relegated to quaint stories your grandpa will tell you.
>
> As you cannot ignore people running on Windows, you cannot ignore people
> running on older hardware.
>
> If you plan to program a big DB that will use more than 3GB on 32 bits
> hardware, you should take care of this memory limit and consider
> splitting your application into several process...

Re-mapping stuff into and out of virtual memory space is relatively trivial
and doesn't require multiple processes. Multiple processes only give you
the advantage of having all this memory mapped at once, split across
processes though. So it may or may not be an actual improvement, depending
on the application...

Cheers, Kuba


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-15 15:57           ` Kuba Ober
@ 2008-07-15 18:03             ` Gerd Stolpmann
  2008-07-15 19:23               ` Adrien
                                 ` (3 more replies)
  0 siblings, 4 replies; 73+ messages in thread
From: Gerd Stolpmann @ 2008-07-15 18:03 UTC (permalink / raw)
  To: Kuba Ober; +Cc: caml-list


Am Dienstag, den 15.07.2008, 11:57 -0400 schrieb Kuba Ober:
> On Thursday 10 July 2008, Gerd Stolpmann wrote:
> > Am Donnerstag, den 10.07.2008, 21:02 +0000 schrieb Sylvain Le Gall:
> > > On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > > > Am Donnerstag, den 10.07.2008, 20:07 +0000 schrieb Sylvain Le Gall:
> > > >> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > > >> > In Ocaml you can exploit multi-core currently only by using
> > > >> > multi-processing parallel programs that communicate over message
> > > >> > passing (and only on Unix). Actually, it's an excellent language for
> > > >> > this style.
> > > >>
> > > >> Why only on Unix ?
> > > >
> > > > No fork() on Windows. And emulating its effects is hard.
> > >
> > > open_process + stdin/stdout should do the trick... at least i think so.
> >
> > After having ported godi to mingw I am not sure whether this works at
> > all. The point is that you usually want to inherit OS resources to the
> > child process (e.g. sockets). The CreateProcess Win32 call
> > (http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx) mentions
> > that you can inherit handles, but I would be careful with the
> > information given in MSDN. Often it works only as far as the presented
> > examples. Windows isn't written for multi-processing, and its syscalls
> > aren't as orthogonal as in Unix-type systems.
> 
> Windows syscalls are quite reasonable IMHO, if a tad undocumented. ReactOS
> folk have done a great job of reimplementing most of them, and there isn't
> anything mucho broken about those. In fact, I'd posit that Windows native
> syscalls expose some functionality that's traditionally unavailable on unices
> and requires hacks to achieve (usually via executable code injection). Just
> look at what Wine folks had to do in order to emulate some win32 (not even
> native!) API on Linux: a lot of hard work for what amounts to a single API
> call. This of course works both ways, and on Windows, while a fork()
> implementation is simple, it AFAIK requires a custom loader or some other
> ingenuity to work.

Sure, both systems follow different philosophies.

> > Furthermore, it looks like a pain in the ass - often you want to run
> > some initialization code, and without fork() you have to run it as often
> > as you start processes.
> 
> On Windows, there's the native API, which is then used by the win32 subsystem
> and posix subsystems to do the job. Native API allows fork() implementations
> mostly on par with what you get on Unices. MS has a posix subsystem on which
> fork() performs in the same ballpark as fork() on linux, and make Cygwin's
> fork() look bad like it deserves. About the only good thing about Cygwin's 
> fork() is that it works on win9x.

Well, there's now SFU for Windows (but only for XP Professional and
Windows 2003, not for XP Home and Vista, AFAIK). That's a cool solution
when you want to run Win32 and POSIX programs on the same system, and
maybe an alternative to using virtualization. But it is nothing for
developing consumer programs on Windows.

Btw, has something tried to compile O'Caml on SFU? It's a 230M free
download. There seems to be gcc and lots of GNU stuff, too (yes, it's
from MS...).

> > Also, Windows is just a bad platform for event-based programs, and you
> > want to do it to some extent (e.g. for watching all your child
> > processes). Only for socket handles there is a select() call. For all
> > other types of handles you cannot find out in advance whether the
> > operation would block or not.
> 
> This is misinformation at best, FUD at worst. I'm no Microsoft fanboy,
> but the reality is quite opposite to what you claim. Windows has quite
> robust asynchronous I/O support.

No, this is not misinformation, this is the result of digging deeply
into the Win32 API for an attempt to port Ocamlnet to Win32 (which will
finally happen to some degree). There's overlapped I/O, but the
difficulty is that you have to start an operation before you can watch
asynchronously for its completion. There is no way to check in advance
for that (and that was my claim). Also, there is a quite small limit for
the number of resources you can watch at the same time (I think 32 or
64).

Look at what Cygwin has done. Basically, they start helper threads for
emulating select(). For some cases, there is no real select() support,
e.g. the output side of pipes is always considered as writable. Only the
input side can be watched.

Of course, these difficulties result from porting Unix libraries to
Windows. You can say with some right: If you want to program in the
event-based way on Windows, you must do it the Windows style. Sure, we
are running into the same problems as with fork() - different
philosophies make it problematic to write portable programs. What's
quite interesting is that the Win32 APIs are less powerful in these
areas (process creation, watching events) than the Unix counterpart.
That's my whole claim. If I had to develop programs only for Windows,
I'd do it multi-threaded because Win32 is much better there.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-15 18:03             ` Gerd Stolpmann
@ 2008-07-15 19:23               ` Adrien
  2008-07-15 19:45                 ` Adrien
  2008-07-16  8:59               ` Michaël Grünewald
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 73+ messages in thread
From: Adrien @ 2008-07-15 19:23 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Kuba Ober, caml-list

2008/7/15 Gerd Stolpmann <info@gerd-stolpmann.de>:
>
> Well, there's now SFU for Windows (but only for XP Professional and
> Windows 2003, not for XP Home and Vista, AFAIK). That's a cool solution
> when you want to run Win32 and POSIX programs on the same system, and
> maybe an alternative to using virtualization. But it is nothing for
> developing consumer programs on Windows.
>
> Btw, has something tried to compile O'Caml on SFU? It's a 230M free
> download. There seems to be gcc and lots of GNU stuff, too (yes, it's
> from MS...).

Well, I did a few months ago. For those who don't know, SFU (Services
For Unix) provides approximately the same features as cygwin.
SFU works really well and is professional : you use the installer and
you're done. It's usually easier to get an SFU build than a mingw or a
cygwin one.

There's a catch however : it works too well !
It's certainly surprising but SFU gives you an Unix-like environment
and you'll certainly try to interact with windows and that's where the
problem lies. For example, paths will clash iirc.
However it is worth being investigated. These are s-nt.h :

First, the official one for windows :

#define OCAML_OS_TYPE "Win32"

#undef BSD_SIGNALS
#define HAS_STRERROR
#define HAS_SOCKETS
#define HAS_GETCWD
#define HAS_UTIME
#define HAS_DUP2
#define HAS_GETHOSTNAME
#define HAS_MKTIME
#define HAS_PUTENV
#define HAS_LOCALE


Then the one you can obtain when configure'ing under SFU :

#define OCAML_OS_TYPE "Unix"
#define OCAML_STDLIB_DIR "/usr/local/lib/ocaml"
#define POSIX_SIGNALS
#define HAS_GETRUSAGE
#define HAS_TIMES
#define HAS_TERMCAP
#define HAS_SOCKETS
#define HAS_INET_ATON
#define HAS_UNISTD
#define HAS_OFF_T
#define HAS_DIRENT
#define HAS_REWINDDIR
#define HAS_LOCKF
#define HAS_MKFIFO
#define HAS_GETCWD
#define HAS_GETWD
#define HAS_GETPRIORITY
#define HAS_UTIME
#define HAS_DUP2
#define HAS_FCHMOD
#define HAS_TRUNCATE
#define HAS_SYS_SELECT_H
#define HAS_SELECT
#define HAS_SYMLINK
#define HAS_WAITPID
#define HAS_GETGROUPS
#define HAS_TERMIOS
#define HAS_SETITIMER
#define HAS_GETHOSTNAME
#define HAS_UNAME
#define HAS_GETTIMEOFDAY
#define HAS_MKTIME
#define HAS_SETSID
#define HAS_PUTENV
#define HAS_LOCALE
#define HAS_MMAP
#define HAS_SIGWAIT


It's a bit longer !

I have since uninstalled SFU but could reinstall it, virtualBox will
be handy (except that compiling software inside a virtual machine is
everything but funny).
By the way, SFU is available for Vista and should work under XP Home
(Home is like Pro with a few things removed). Microsoft announced
there would be no other version of SFU [1] but SFU is not their
product, it is Interix's and I think Interix will continue to ship new
versions though these probably won't be free.

Also, the utilities are not necessarily Gnu, they provide BSD ones. ;)
But, but, ... I just saw debian-interix which claims "since the last
buildd run there are now over 1000 packages available for
interix-i386", and even better, the project is still active, the
latest is dated from "2008-06-30". I definitely have to install SFU
again ! (I'll just need a windows partition bigger than 4GB...).

[1] : http://www.microsoft-watch.com/content/operating_systems/its_the_end_of_the_line_for_microsofts_services_for_unix_product.html


The not-so-funny part is that applications compiled under SFU need SFU
installed to run so this limits portability but ocaml-compiled
programs may not have this problem. The only way to know more about
this is certainly to experiment. Also, I don't know how free is SFU's
license.


 ---

Adrien Nader


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-15 19:23               ` Adrien
@ 2008-07-15 19:45                 ` Adrien
  0 siblings, 0 replies; 73+ messages in thread
From: Adrien @ 2008-07-15 19:45 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Kuba Ober, caml-list

2008/7/15 Adrien <camaradetux@gmail.com>:
> 2008/7/15 Gerd Stolpmann <info@gerd-stolpmann.de>:
>>
>> Well, there's now SFU for Windows (but only for XP Professional and
>> Windows 2003, not for XP Home and Vista, AFAIK). That's a cool solution
>> when you want to run Win32 and POSIX programs on the same system, and
>> maybe an alternative to using virtualization. But it is nothing for
>> developing consumer programs on Windows.
>>
>> Btw, has something tried to compile O'Caml on SFU? It's a 230M free
>> download. There seems to be gcc and lots of GNU stuff, too (yes, it's
>> from MS...).
>
> Well, I did a few months ago. For those who don't know, SFU (Services
> For Unix) provides approximately the same features as cygwin.
> SFU works really well and is professional : you use the installer and
> you're done. It's usually easier to get an SFU build than a mingw or a
> cygwin one.
>
> There's a catch however : it works too well !
> It's certainly surprising but SFU gives you an Unix-like environment
> and you'll certainly try to interact with windows and that's where the
> problem lies. For example, paths will clash iirc.
> However it is worth being investigated. These are s-nt.h :
>
> First, the official one for windows :
>
> #define OCAML_OS_TYPE "Win32"
>
> #undef BSD_SIGNALS
> #define HAS_STRERROR
> #define HAS_SOCKETS
> #define HAS_GETCWD
> #define HAS_UTIME
> #define HAS_DUP2
> #define HAS_GETHOSTNAME
> #define HAS_MKTIME
> #define HAS_PUTENV
> #define HAS_LOCALE
>
>
> Then the one you can obtain when configure'ing under SFU :
>
> #define OCAML_OS_TYPE "Unix"
> #define OCAML_STDLIB_DIR "/usr/local/lib/ocaml"
> #define POSIX_SIGNALS
> #define HAS_GETRUSAGE
> #define HAS_TIMES
> #define HAS_TERMCAP
> #define HAS_SOCKETS
> #define HAS_INET_ATON
> #define HAS_UNISTD
> #define HAS_OFF_T
> #define HAS_DIRENT
> #define HAS_REWINDDIR
> #define HAS_LOCKF
> #define HAS_MKFIFO
> #define HAS_GETCWD
> #define HAS_GETWD
> #define HAS_GETPRIORITY
> #define HAS_UTIME
> #define HAS_DUP2
> #define HAS_FCHMOD
> #define HAS_TRUNCATE
> #define HAS_SYS_SELECT_H
> #define HAS_SELECT
> #define HAS_SYMLINK
> #define HAS_WAITPID
> #define HAS_GETGROUPS
> #define HAS_TERMIOS
> #define HAS_SETITIMER
> #define HAS_GETHOSTNAME
> #define HAS_UNAME
> #define HAS_GETTIMEOFDAY
> #define HAS_MKTIME
> #define HAS_SETSID
> #define HAS_PUTENV
> #define HAS_LOCALE
> #define HAS_MMAP
> #define HAS_SIGWAIT
>
>
> It's a bit longer !
>
> I have since uninstalled SFU but could reinstall it, virtualBox will
> be handy (except that compiling software inside a virtual machine is
> everything but funny).
> By the way, SFU is available for Vista and should work under XP Home
> (Home is like Pro with a few things removed). Microsoft announced
> there would be no other version of SFU [1] but SFU is not their
> product, it is Interix's and I think Interix will continue to ship new
> versions though these probably won't be free.
>
> Also, the utilities are not necessarily Gnu, they provide BSD ones. ;)
> But, but, ... I just saw debian-interix which claims "since the last
> buildd run there are now over 1000 packages available for
> interix-i386", and even better, the project is still active, the
> latest is dated from "2008-06-30". I definitely have to install SFU
> again ! (I'll just need a windows partition bigger than 4GB...).
>
> [1] : http://www.microsoft-watch.com/content/operating_systems/its_the_end_of_the_line_for_microsofts_services_for_unix_product.html
>
>
> The not-so-funny part is that applications compiled under SFU need SFU
> installed to run so this limits portability but ocaml-compiled
> programs may not have this problem. The only way to know more about
> this is certainly to experiment. Also, I don't know how free is SFU's
> license.
>
>
>  ---
>
> Adrien Nader
>


It may not run on XP Home contrary to what I stated. Debian-interix
says the following in its INSTALL file :
  * On Windows XP make sure, you
     DON'T "Use simple file sharing"
XP Home can only "use simple file sharing" so sfu may not work on XP
Home (though you can get the required functionnality).


 ---

Adrien Nader


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-15 18:03             ` Gerd Stolpmann
  2008-07-15 19:23               ` Adrien
@ 2008-07-16  8:59               ` Michaël Grünewald
  2008-07-16 16:43                 ` Gerd Stolpmann
  2008-07-16 11:46               ` Richard Jones
  2008-07-17 12:48               ` Kuba Ober
  3 siblings, 1 reply; 73+ messages in thread
From: Michaël Grünewald @ 2008-07-16  8:59 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Kuba Ober, caml-list

Gerd Stolpmann wrote:

> Well, there's now SFU for Windows (but only for XP Professional and
> Windows 2003, not for XP Home and Vista, AFAIK). That's a cool solution
> when you want to run Win32 and POSIX programs on the same system, and
> maybe an alternative to using virtualization. But it is nothing for
> developing consumer programs on Windows.
> 
> Btw, has something tried to compile O'Caml on SFU? It's a 230M free
> download. There seems to be gcc and lots of GNU stuff, too (yes, it's
> from MS...).

I did this a few monthes ago, I followed the NetBSD way, since SFU is 
supported by NetBSD's `pkgsrc'. This was really *easy*, thanks to the 
efforts of the `pkgsrc' maintainers. However, I did not play that much 
with the system, my point was to test SFU by running very Unix-oriented 
and complex proecdures in it.

See http://www.netbsd.org/docs/software/packages.html
for general information about NetBSD's pkgsrc; Microsoft SFU is refered 
to as Interix here, e.g. in the ``Supported Platforms'' section.

The `pkgsrc' software is a port infrastructure similar to what is found 
on *BSD and MacPorts, if you have used one of them, you certainly will 
feel comfortable with `pkgsrc'. Documentation for `pkgsrc' is available 
at http://www.netbsd.org/docs/pkgsrc/, besides the introduction, see 
especially sections 3.2 (Bootstrapping) and 4.2 (Installing ports), it 
shall be enough to get started!
-- 
Cheers,
Michaël


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-15 18:03             ` Gerd Stolpmann
  2008-07-15 19:23               ` Adrien
  2008-07-16  8:59               ` Michaël Grünewald
@ 2008-07-16 11:46               ` Richard Jones
  2008-07-16 18:35                 ` Erik de Castro Lopo
  2008-07-17 12:48               ` Kuba Ober
  3 siblings, 1 reply; 73+ messages in thread
From: Richard Jones @ 2008-07-16 11:46 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Kuba Ober, caml-list

On a different, but not unrelated topic, Debian have a cross-compiler
(based on MinGW) so you don't need to leave the safety & comfort of
Linux in order to build Windows DLLs and binaries.

  http://packages.debian.org/search?keywords=mingw32

Fedora are going to offer a MinGW cross-compiler and libraries too,
shortly:

  https://fedoraproject.org/wiki/SIGs/MinGW

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-16  8:59               ` Michaël Grünewald
@ 2008-07-16 16:43                 ` Gerd Stolpmann
  0 siblings, 0 replies; 73+ messages in thread
From: Gerd Stolpmann @ 2008-07-16 16:43 UTC (permalink / raw)
  To: Michaël Grünewald; +Cc: caml-list


Am Mittwoch, den 16.07.2008, 10:59 +0200 schrieb Michaël Grünewald:
> Gerd Stolpmann wrote:
> 
> > Well, there's now SFU for Windows (but only for XP Professional and
> > Windows 2003, not for XP Home and Vista, AFAIK). That's a cool solution
> > when you want to run Win32 and POSIX programs on the same system, and
> > maybe an alternative to using virtualization. But it is nothing for
> > developing consumer programs on Windows.
> > 
> > Btw, has something tried to compile O'Caml on SFU? It's a 230M free
> > download. There seems to be gcc and lots of GNU stuff, too (yes, it's
> > from MS...).
> 
> I did this a few monthes ago, I followed the NetBSD way, since SFU is 
> supported by NetBSD's `pkgsrc'. This was really *easy*, thanks to the 
> efforts of the `pkgsrc' maintainers. However, I did not play that much 
> with the system, my point was to test SFU by running very Unix-oriented 
> and complex proecdures in it.
> 
> See http://www.netbsd.org/docs/software/packages.html
> for general information about NetBSD's pkgsrc; Microsoft SFU is refered 
> to as Interix here, e.g. in the ``Supported Platforms'' section.

Good to know.

> The `pkgsrc' software is a port infrastructure similar to what is found 
> on *BSD and MacPorts, if you have used one of them, you certainly will 
> feel comfortable with `pkgsrc'. Documentation for `pkgsrc' is available 
> at http://www.netbsd.org/docs/pkgsrc/, besides the introduction, see 
> especially sections 3.2 (Bootstrapping) and 4.2 (Installing ports), it 
> shall be enough to get started!

Yes, I know pkgsrc very well. I used it years ago to build software on
Solaris. Later I took it as starting point for GODI.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-16 11:46               ` Richard Jones
@ 2008-07-16 18:35                 ` Erik de Castro Lopo
  0 siblings, 0 replies; 73+ messages in thread
From: Erik de Castro Lopo @ 2008-07-16 18:35 UTC (permalink / raw)
  To: caml-list

Richard Jones wrote:

> On a different, but not unrelated topic, Debian have a cross-compiler
> (based on MinGW) so you don't need to leave the safety & comfort of
> Linux in order to build Windows DLLs and binaries.
> 
>   http://packages.debian.org/search?keywords=mingw32

I am the main author of a libsndfile (a library for reading/writing
audio files like WAV, AIFF etc) written in C and widely used across
all the major platforms.

I have recently switched to doing all my windows builds for libsndfile
on a Debian/Ubuntu box, cross-compiling using these MinGW tools and
running the test suite under Wine (the windows emulator).

For me, this is about 100 times easier than dealing with the pain
that is windows.

Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Arguing that Java is better than C++ is like arguing that
grasshoppers taste better than tree bark." -- Thant Tessman


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Re: thousands of CPU cores
  2008-07-15 18:03             ` Gerd Stolpmann
                                 ` (2 preceding siblings ...)
  2008-07-16 11:46               ` Richard Jones
@ 2008-07-17 12:48               ` Kuba Ober
  3 siblings, 0 replies; 73+ messages in thread
From: Kuba Ober @ 2008-07-17 12:48 UTC (permalink / raw)
  To: caml-list

On Tuesday 15 July 2008, you wrote:
> Am Dienstag, den 15.07.2008, 11:57 -0400 schrieb Kuba Ober:
> > On Thursday 10 July 2008, Gerd Stolpmann wrote:
> > > Am Donnerstag, den 10.07.2008, 21:02 +0000 schrieb Sylvain Le Gall:
> > > > On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > > > > Am Donnerstag, den 10.07.2008, 20:07 +0000 schrieb Sylvain Le Gall:
> > > > >> On 10-07-2008, Gerd Stolpmann <info@gerd-stolpmann.de> wrote:
> > > > >> > In Ocaml you can exploit multi-core currently only by using
> > > > >> > multi-processing parallel programs that communicate over message
> > > > >> > passing (and only on Unix). Actually, it's an excellent language
> > > > >> > for this style.
> > > > >>
> > > > >> Why only on Unix ?
> > > > >
> > > > > No fork() on Windows. And emulating its effects is hard.
> > > >
> > > > open_process + stdin/stdout should do the trick... at least i think
> > > > so.
> > >
> > > After having ported godi to mingw I am not sure whether this works at
> > > all. The point is that you usually want to inherit OS resources to the
> > > child process (e.g. sockets). The CreateProcess Win32 call
> > > (http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx) mentions
> > > that you can inherit handles, but I would be careful with the
> > > information given in MSDN. Often it works only as far as the presented
> > > examples. Windows isn't written for multi-processing, and its syscalls
> > > aren't as orthogonal as in Unix-type systems.
> >
> > Windows syscalls are quite reasonable IMHO, if a tad undocumented.
> > ReactOS folk have done a great job of reimplementing most of them, and
> > there isn't anything mucho broken about those. In fact, I'd posit that
> > Windows native syscalls expose some functionality that's traditionally
> > unavailable on unices and requires hacks to achieve (usually via
> > executable code injection). Just look at what Wine folks had to do in
> > order to emulate some win32 (not even native!) API on Linux: a lot of
> > hard work for what amounts to a single API call. This of course works
> > both ways, and on Windows, while a fork() implementation is simple, it
> > AFAIK requires a custom loader or some other ingenuity to work.
>
> Sure, both systems follow different philosophies.
>
> > > Furthermore, it looks like a pain in the ass - often you want to run
> > > some initialization code, and without fork() you have to run it as
> > > often as you start processes.
> >
> > On Windows, there's the native API, which is then used by the win32
> > subsystem and posix subsystems to do the job. Native API allows fork()
> > implementations mostly on par with what you get on Unices. MS has a posix
> > subsystem on which fork() performs in the same ballpark as fork() on
> > linux, and make Cygwin's fork() look bad like it deserves. About the only
> > good thing about Cygwin's fork() is that it works on win9x.
>
> Well, there's now SFU for Windows (but only for XP Professional and
> Windows 2003, not for XP Home and Vista, AFAIK). That's a cool solution
> when you want to run Win32 and POSIX programs on the same system, and
> maybe an alternative to using virtualization. But it is nothing for
> developing consumer programs on Windows.
>
> Btw, has something tried to compile O'Caml on SFU? It's a 230M free
> download. There seems to be gcc and lots of GNU stuff, too (yes, it's
> from MS...).
>
> > > Also, Windows is just a bad platform for event-based programs, and you
> > > want to do it to some extent (e.g. for watching all your child
> > > processes). Only for socket handles there is a select() call. For all
> > > other types of handles you cannot find out in advance whether the
> > > operation would block or not.
> >
> > This is misinformation at best, FUD at worst. I'm no Microsoft fanboy,
> > but the reality is quite opposite to what you claim. Windows has quite
> > robust asynchronous I/O support.
>
> No, this is not misinformation, this is the result of digging deeply
> into the Win32 API for an attempt to port Ocamlnet to Win32 (which will
> finally happen to some degree).

If you limit yourself to Win32 API, you're right that you won't get it
to work the way you want to. As soon as you start digging into the DDK,
there are ways to do it.

> There's overlapped I/O, but the 
> difficulty is that you have to start an operation before you can watch
> asynchronously for its completion. There is no way to check in advance
> for that (and that was my claim). Also, there is a quite small limit for
> the number of resources you can watch at the same time (I think 32 or
> 64).
>
> Look at what Cygwin has done. Basically, they start helper threads for
> emulating select(). For some cases, there is no real select() support,
> e.g. the output side of pipes is always considered as writable. Only the
> input side can be watched.

A lot of that is done since they limit themselves to the API, and old
API at that. SFU achieves it without much in the way of limits by using
admittedly undocumented, albeit relatively understandable native APIs,
and by gaining more direct access to the network stack. There's no reason
nowadays why Cygwin folks couldn't do the same.

Cheers, Kuba


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-12 22:54         ` J C
@ 2008-07-19 12:06           ` Oliver Bandel
  0 siblings, 0 replies; 73+ messages in thread
From: Oliver Bandel @ 2008-07-19 12:06 UTC (permalink / raw)
  To: caml-list; +Cc: J C

Zitat von J C <jhc0033@gmail.com>:

> On Fri, Jul 11, 2008 at 5:23 PM, Oliver Bandel
> <oliver@first.in-berlin.de> wrote:
>
> > For example, if you have a non-profit research project,
> > you can use the BOINC infrastructure, which provides
> > about 580000 PCs to help you :)
> >
> >
>
http://en.wikipedia.org/wiki/Berkeley_Open_Infrastructure_for_Network_Computing
> >
> > There is no Shared-Mem as we know it from our local PCs, there
> > is distributed calculation around the whole planet.
> >
> > Threads will not help there ;-)
>
> But on each of those PCs there may be 1000 cores in the near future.
[...]

Yes, maybe.
I don't say that global networked computer power substitutes
multicore-machines.
But on multicore-machines one could use multiple processes.

And I don't say that threads will not help sometimes,
I just think, they are over estimated.

BTW: I've forgotten where the article was, but there was an
introductional article on programming of encryption algoritms,
and the author said: "don't use multithreaded code", because
it's too easy to have weak code that makes the whole system
unsafe.

I agree with him.
But I also think, not only in encryption-programs
threads should be avoided, if possible. I don't mean
never use threads, I just mean: use only, if really necessary.

Multiple processes are less sensitive to programming errors
than multithreaded code.
processes give you encapsulation. If one process crashes,
the others do not. If you crash one thread, the whole application
is affected.


Ciao,
   Oliver


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-15 14:39     ` Kuba Ober
@ 2008-07-19 12:41       ` Oliver Bandel
  0 siblings, 0 replies; 73+ messages in thread
From: Oliver Bandel @ 2008-07-19 12:41 UTC (permalink / raw)
  To: Kuba Ober; +Cc: caml-list

Zitat von Kuba Ober <ober.14@osu.edu>:
[...]
> If you count the "efficiency" of such out-of-the-blue uncached truly
> random
> access in terms of clock cycles, current hardware may be 1-2 orders
> of
> magnitude less efficient than almost any 8-bit microcontroller out
> there...
> On most MCUs you can read a random byte out of the SRAM in say 1-4
> clock
> cycles. On your commonplace modern multicore CPU, it may take a
> hundred clock
> cycles to do the same, and essentially the same amount of time in
> terms of the
> wall clock (a 2GHz CPU has only 100 times faster clock than a run of
> the mill
> 20MHz MCU).
[...]

Given a RAM with a certain clock frequency, when
the Microcontroller works at the same frequency as the RAM,
and a CPU of a typical computer uses a much higher
frequency, it's normal that the CPU has to wait longer.
That's thze reasion why cache-RAM is used on more then one level.

But there are also processors, that can lookup RAM while at the same
time working on instructions that were fetched before.
Some DSPs are really fast... for example Analog Devices'
TigerShark can do between one and four operations in one
clock cycle (on average two instructions per cycle).
And itÄs a single-core DSP.

When it runs at 600 MHz, it can do 32-Bit Floating-Point
operations  with 3.6 GFlops.

http://www.analog.com/en/embedded-processing-dsp/tigersharc/content/tigersharc_benchmarks/fca.html


OK, this is a very specific processor and comparing it with
CPU's of the computers we are using today, is maybe
a littlebid inproper. But I just wanted to say: a good design can do a
lot
things possible, which may not be used in many CPUs.

For example the TigerShark also has links to other processors
and therefore can be used in multi-processor systems.

http://www.analog.com/en/embedded-processing-dsp/tigersharc/content/tigersharc_processor__architectural_features/fca.html

http://www.analog.com/en/embedded-processing-dsp/tigersharc/content/tigersharc_architectural_backgrounder/fca.html



And the idea of Links between processors was used in the 1980's
by T-400 and T-800 from INMOS:
   http://en.wikipedia.org/wiki/Transputer

It seems, they were too far ahead to be commercially successful.

Ciao,
   Oliver

P.S.: Remember the Altivec unit of G4-processor, for example...
        ...they also gave good speedup and Math-speed.
       So, a good CPU-design can give advanatges in speed.









>
> What I'm trying to say is that such random, small memory accesses
> highlight
> the inherent message passing / transactional overhead of the hardware
> implementation. Those overheads amortize when you run real number
> tasks,
> not a made-up cold single byte access of course. But they are there.
>
> It's akin to mmaped file: you can use CPU's MMU to implement it in
> the
> usual OS/stock hardware framework, or you can have an FPGA handle
> memory
> transactions and talk directly to the hard drive. It doesn't change
> the
> fact that it's still a mmaped file :)
>
> Cheers, Kuba
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-07-11  9:30             ` Richard Jones
@ 2008-09-21 19:05               ` Michaël Grünewald
  2008-09-21 21:41                 ` Jon Harrop
  0 siblings, 1 reply; 73+ messages in thread
From: Michaël Grünewald @ 2008-09-21 19:05 UTC (permalink / raw)
  To: OCaml users

Richard Jones wrote:
> If you also follow the rest of that thread, there's a message passing
> OCaml version by Gerd Stolpmann which also scales properly.
> 
> To be honest, matrix multiplication interests me not at all since no
> one is hand coding their own matrix multiplication when there are
> perfectly good, parallel libraries available for most languages,
> including OCaml.  Even if you were writing all your applications in C,
> you'd still be stupid to hand roll your own matrix multiplication.
> Let's have a real example instead.

This is true while your are concerned with matrix over the real or
complex numbers, but if you want to use arbitrary precision arithmetic,
finite fields, quaternions or any ring you like, then you are stuck.
Linear algebra is useful in every mathematical field, not just numerical
computing.

It is not ridiculous at all to code matrix routines in OCaml, since you
can use functors to use your routines with any kind of scalar, not just
complex numbers. And I already had to code dense matrix operations for
these reasons.

BTW, if anybody here knows presentations about matrix implementation(s),
I would be very glad to know about it.
-- 
Cheers,
Michaël


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-09-21 19:05               ` Michaël Grünewald
@ 2008-09-21 21:41                 ` Jon Harrop
  2008-09-22  7:51                   ` Alan Schmitt
  0 siblings, 1 reply; 73+ messages in thread
From: Jon Harrop @ 2008-09-21 21:41 UTC (permalink / raw)
  To: caml-list

On Sunday 21 September 2008 20:05:15 Michaël Grünewald wrote:
> This is true while your are concerned with matrix over the real or
> complex numbers, but if you want to use arbitrary precision arithmetic,
> finite fields, quaternions or any ring you like, then you are stuck.
> Linear algebra is useful in every mathematical field, not just numerical
> computing.
>
> It is not ridiculous at all to code matrix routines in OCaml, since you
> can use functors to use your routines with any kind of scalar, not just
> complex numbers. And I already had to code dense matrix operations for
> these reasons.
>
> BTW, if anybody here knows presentations about matrix implementation(s),
> I would be very glad to know about it.

Exactly. OCaml's poor performance in the case of nxn matrix multiply stems 
almost entirely from the inefficiency of the gather operation which is O(n^2) 
and serial in OCaml but would be O(1) and parallel if each thread could write 
results directly into a shared data structure.

This is a fundamental problem that afflicts all parallel algorithms that 
gather a non-trivial result. In fact, matrix multiplication is not even worst 
case because gather is only O(n^2) of an O(n^3) total.

Also, note that matrix multiplication is embarassingly parallel. So OCaml's 
current problems with parallelism are not limited to slow interthread 
communication.

The good news is that the parallel GC is coming along nicely and this will be 
a solved problem before long... :-)

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-09-21 21:41                 ` Jon Harrop
@ 2008-09-22  7:51                   ` Alan Schmitt
  2008-09-22 19:03                     ` Jon Harrop
  0 siblings, 1 reply; 73+ messages in thread
From: Alan Schmitt @ 2008-09-22  7:51 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 227 bytes --]

On 21 sept. 08, at 23:41, Jon Harrop wrote:

> The good news is that the parallel GC is coming along nicely and  
> this will be
> a solved problem before long... :-)

I'd love to hear more about this. Could you develop?

Alan

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 186 bytes --]

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-09-22  7:51                   ` Alan Schmitt
@ 2008-09-22 19:03                     ` Jon Harrop
  2008-09-22 19:49                       ` David Teller
                                         ` (2 more replies)
  0 siblings, 3 replies; 73+ messages in thread
From: Jon Harrop @ 2008-09-22 19:03 UTC (permalink / raw)
  To: caml-list

On Monday 22 September 2008 08:51:03 Alan Schmitt wrote:
> On 21 sept. 08, at 23:41, Jon Harrop wrote:
> > The good news is that the parallel GC is coming along nicely and
> > this will be a solved problem before long... :-)
>
> I'd love to hear more about this. Could you develop?

Sure thing. I wrote to the guys doing this work a couple of times and they 
were very friendly. Apparently they are currently ironing out the last of the 
bugs before going public.

I don't think I am the only person struggling to contain my excitement. :-)

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-09-22 19:03                     ` Jon Harrop
@ 2008-09-22 19:49                       ` David Teller
  2008-09-23  6:42                       ` kirillkh
  2008-09-24 13:30                       ` [Caml-list] Link tracking Chris Clearwater
  2 siblings, 0 replies; 73+ messages in thread
From: David Teller @ 2008-09-22 19:49 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

On Mon, 2008-09-22 at 20:03 +0100, Jon Harrop wrote:
> Sure thing. I wrote to the guys doing this work a couple of times and they 
> were very friendly. Apparently they are currently ironing out the last of the 
> bugs before going public.
> 
> I don't think I am the only person struggling to contain my excitement. :-)

      *
*_o/
    |/
   / 


'nuff said
-- 
David Teller-Rajchenbach
 Security of Distributed Systems
  http://www.univ-orleans.fr/lifo/Members/David.Teller
 Angry researcher: French Universities need reforms, but the LRU act brings liquidations. 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] thousands of CPU cores
  2008-09-22 19:03                     ` Jon Harrop
  2008-09-22 19:49                       ` David Teller
@ 2008-09-23  6:42                       ` kirillkh
  2008-09-24 13:30                       ` [Caml-list] Link tracking Chris Clearwater
  2 siblings, 0 replies; 73+ messages in thread
From: kirillkh @ 2008-09-23  6:42 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

What about the standard library being single-threaded? How hard will it be
to adjust it for multiple threads, will OCaml maintainers even agree to such
adjustments and how will this affect performance?

On Mon, Sep 22, 2008 at 10:03 PM, Jon Harrop <
jonathandeanharrop@googlemail.com> wrote:

> On Monday 22 September 2008 08:51:03 Alan Schmitt wrote:
> > On 21 sept. 08, at 23:41, Jon Harrop wrote:
> > > The good news is that the parallel GC is coming along nicely and
> > > this will be a solved problem before long... :-)
> >
> > I'd love to hear more about this. Could you develop?
>
> Sure thing. I wrote to the guys doing this work a couple of times and they
> were very friendly. Apparently they are currently ironing out the last of
> the
> bugs before going public.
>
> I don't think I am the only person struggling to contain my excitement. :-)
>
> --
> Dr Jon Harrop, Flying Frog Consultancy Ltd.
> http://www.ffconsultancy.com/?e
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

[-- Attachment #2: Type: text/html, Size: 2063 bytes --]

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Link tracking
  2008-09-22 19:03                     ` Jon Harrop
  2008-09-22 19:49                       ` David Teller
  2008-09-23  6:42                       ` kirillkh
@ 2008-09-24 13:30                       ` Chris Clearwater
  2008-09-24 15:43                         ` Jon Harrop
  2 siblings, 1 reply; 73+ messages in thread
From: Chris Clearwater @ 2008-09-24 13:30 UTC (permalink / raw)
  To: caml-list

On Mon, 2008-09-22 at 20:03 +0100, Jon Harrop wrote:
> Dr Jon Harrop, Flying Frog Consultancy Ltd.
> http://www.ffconsultancy.com/?e

I notice that your caml-list postings tack "?e" onto the end of all
links to your company page. You seem to use "?u" on your c.l.f postings.
Is this used to gauge the effective of drawing eyeballs to your website?



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [Caml-list] Link tracking
  2008-09-24 13:30                       ` [Caml-list] Link tracking Chris Clearwater
@ 2008-09-24 15:43                         ` Jon Harrop
  0 siblings, 0 replies; 73+ messages in thread
From: Jon Harrop @ 2008-09-24 15:43 UTC (permalink / raw)
  To: caml-list

On Wednesday 24 September 2008 14:30:34 Chris Clearwater wrote:
> On Mon, 2008-09-22 at 20:03 +0100, Jon Harrop wrote:
> > Dr Jon Harrop, Flying Frog Consultancy Ltd.
> > http://www.ffconsultancy.com/?e
>
> I notice that your caml-list postings tack "?e" onto the end of all
> links to your company page. You seem to use "?u" on your c.l.f postings.
> Is this used to gauge the effective of drawing eyeballs to your website?

Yes. Forgive my posting this to the list but I noticed none of my posts have 
gone through lately so I want to make sure this works... ;-)

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


^ permalink raw reply	[flat|nested] 73+ messages in thread

end of thread, other threads:[~2008-09-24 14:42 UTC | newest]

Thread overview: 73+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-10  5:57 thousands of CPU cores J C
2008-07-10  6:15 ` [Caml-list] " Erik de Castro Lopo
2008-07-10 12:47   ` Oliver Bandel
2008-07-10 13:48     ` Hezekiah M. Carty
2008-07-10 11:35 ` Jon Harrop
2008-07-14 11:32   ` J C
2008-07-14 12:08     ` Jon Harrop
2008-07-14 17:04       ` Mike Lin
2008-07-14 17:28         ` Jon Harrop
2008-07-14 17:16       ` Richard Jones
2008-07-10 13:21 ` Jon Harrop
2008-07-10 13:44 ` Peng Zang
2008-07-10 14:00   ` Jon Harrop
2008-07-10 22:25     ` Richard Jones
2008-07-10 23:04       ` Jon Harrop
2008-07-10 23:41         ` Oliver Bandel
2008-07-11  0:17           ` Oliver Bandel
2008-07-11  9:30             ` Richard Jones
2008-09-21 19:05               ` Michaël Grünewald
2008-09-21 21:41                 ` Jon Harrop
2008-09-22  7:51                   ` Alan Schmitt
2008-09-22 19:03                     ` Jon Harrop
2008-09-22 19:49                       ` David Teller
2008-09-23  6:42                       ` kirillkh
2008-09-24 13:30                       ` [Caml-list] Link tracking Chris Clearwater
2008-09-24 15:43                         ` Jon Harrop
2008-07-11 14:53     ` [Caml-list] thousands of CPU cores Peng Zang
2008-07-15 14:39     ` Kuba Ober
2008-07-19 12:41       ` Oliver Bandel
2008-07-10 19:15 ` Gerd Stolpmann
2008-07-10 20:07   ` Sylvain Le Gall
2008-07-10 20:24     ` [Caml-list] " Gerd Stolpmann
2008-07-10 21:02       ` Sylvain Le Gall
2008-07-10 21:19         ` [Caml-list] " Gerd Stolpmann
2008-07-10 21:35           ` Jon Harrop
2008-07-10 22:39             ` Gerd Stolpmann
2008-07-15 15:57           ` Kuba Ober
2008-07-15 18:03             ` Gerd Stolpmann
2008-07-15 19:23               ` Adrien
2008-07-15 19:45                 ` Adrien
2008-07-16  8:59               ` Michaël Grünewald
2008-07-16 16:43                 ` Gerd Stolpmann
2008-07-16 11:46               ` Richard Jones
2008-07-16 18:35                 ` Erik de Castro Lopo
2008-07-17 12:48               ` Kuba Ober
2008-07-15 15:21       ` Kuba Ober
2008-07-10 20:48     ` Basile STARYNKEVITCH
2008-07-10 21:12       ` Jon Harrop
2008-07-10 23:33   ` [Caml-list] " Oliver Bandel
2008-07-10 23:43     ` Oliver Bandel
2008-07-11  6:26     ` Sylvain Le Gall
2008-07-11  8:50       ` [Caml-list] " Jon Harrop
2008-07-11  9:29         ` Sylvain Le Gall
2008-07-15 16:01           ` [Caml-list] " Kuba Ober
2008-07-13  3:17         ` Code Mobility [was Re: thousands of CPU cores] Robert Fischer
2008-07-11  3:01   ` [Caml-list] thousands of CPU cores Brian Hurt
2008-07-11 13:01     ` Gerd Stolpmann
2008-07-11 13:43       ` Jon Harrop
2008-07-11 14:03         ` Basile STARYNKEVITCH
2008-07-11 15:08           ` Jon Harrop
2008-07-11 17:28           ` Jon Harrop
2008-07-11 17:54         ` Richard Jones
2008-07-11 18:30           ` Raoul Duke
2008-07-12 17:35       ` Brian Hurt
2008-07-11 15:01     ` Peng Zang
2008-07-12  0:23       ` Oliver Bandel
2008-07-12 22:54         ` J C
2008-07-19 12:06           ` Oliver Bandel
2008-07-11 14:06 ` Xavier Leroy
2008-07-11 15:20   ` Oliver Bandel
2008-07-11 15:23   ` Bill
2008-07-11 18:14   ` Mattias Engdegård
2008-07-12 23:05   ` J C

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).