Looking for information regarding use of OCaml in scientific computing and simulation

caml-list - the Caml user's mailing list

* Looking for information regarding use of OCaml in scientific  computing and simulation
@ 2009-11-25 11:05 David MENTRE
  2009-11-25 11:59 ` Sylvain Le Gall
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: David MENTRE @ 2009-11-25 11:05 UTC (permalink / raw)
  To: caml users

Hello,

I'm considering doing a short presentation of OCaml to my colleagues
in my research lab. They are working in the telecommunication and
power electronic sectors, mainly doing signal processing and
simulations. I know OCaml[1] but not specifically those domains.

Therefore, I'm looking for reusable material for a presentation:
 - Slides on the use of OCaml in the signal processing and simulation domains;

 - Code snippets of OCaml used in scientific computing or simulation,
typically for advocacy like "it takes 10 lines in OCaml to do this,
you would use 50 lines in C++ to do the same thing";

 - Evidence of *actual use* of OCaml for scientific computing or
simulation, especially regarding usable libraries, bindings, etc.

 - Evidence of people having switched from C/C++ simulators to OCaml
ones : good and bad points, issues, things to look at, etc.

 - My colleagues are working a lot with Matlab, is there any synergy
there (bindings, ways to integrate Matlab within OCaml code or vice
versa, ...)?

You can reply to me on this list or off list. In case of personal
reply, let me know if I can reuse your name and affiliation.

If this presentation is done, I'll make the slides available under a
free license.

Many thanks in advance for any pointer,
Best regards,
david

[1] I know both OCaml's strong points (typing, GC, efficiency, ...) and
its weak ones (strange syntax, ...).


* Re: Looking for information regarding use of OCaml in scientific computing and simulation
  2009-11-25 11:05 Looking for information regarding use of OCaml in scientific computing and simulation David MENTRE
@ 2009-11-25 11:59 ` Sylvain Le Gall
  2009-11-25 12:32 ` [Caml-list] " blue storm
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Sylvain Le Gall @ 2009-11-25 11:59 UTC (permalink / raw)
  To: caml-list

Hello,

On 25-11-2009, David MENTRE <dmentre@linux-france.org> wrote:
> Hello,
>
> I'm considering doing a short presentation of OCaml to my colleagues
> in my research lab. They are working in the telecommunication and
> power electronic sectors, mainly doing signal processing and
> simulations. I know OCaml[1] but not specifically those domains.
>
> Therefore, I'm looking for reusable material for a presentation:
>  - Slides on the use of OCaml in the signal processing and simulation domains;
>

Maybe you can have a look at 
https://forge.ocamlcore.org/docman/view.php/77/34/VSYML-ocaml-meeting-2009.pdf

These are the slides from last year's OCaml Meeting talk about VHDL
simulation using OCaml. Maybe Florent Ouchet, the author, can give you
more hints.

(I think VSYML works at the symbolic level, so maybe it is the kind
of simulation you are looking for.)

>  - Code snippets of OCaml used in scientific computing or simulation,
> typically for advocacy like "it takes 10 lines in OCaml to do this,
> you would use 50 lines in C++ to do the same thing";
>

Maybe "OCaml for Scientists" by Jon Harrop can be a good start. There
are some examples at the end (with code) that can help you.

Regards,
Sylvain Le Gall



* Re: [Caml-list] Looking for information regarding use of OCaml in  scientific computing and simulation
  2009-11-25 11:05 Looking for information regarding use of OCaml in scientific computing and simulation David MENTRE
  2009-11-25 11:59 ` Sylvain Le Gall
@ 2009-11-25 12:32 ` blue storm
  2009-11-28 23:23 ` Jan Kybic
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: blue storm @ 2009-11-25 12:32 UTC (permalink / raw)
  To: David MENTRE; +Cc: caml users

On Wed, Nov 25, 2009 at 12:05 PM, David MENTRE <dmentre@linux-france.org> wrote:
>  - My colleagues are working a lot with Matlab, is there any synergy
> there (bindings, ways to integrate Matlab within OCaml code or vice
> versa, ...)?

See OcamlMex : http://ocamlmex.gforge.inria.fr/

When working with scientific software libraries, you may need to
communicate with Fortran code. The standard Bigarray library is very
useful for exchanging numeric data sets in a layout directly
compatible with C or Fortran arrays:
http://caml.inria.fr/pub/docs/manual-ocaml/manual043.html
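
A minimal sketch of the layout point, assuming OCaml 4.07 or later (where
Bigarray ships in the standard library; older compilers need to link the
bigarray library explicitly). The matrix size and values are made up for
illustration. It fills a small matrix in Fortran layout and then views the
same memory as a flat vector, which exposes the column-major order a
Fortran routine would see:

```ocaml
open Bigarray

(* A 2 x 3 matrix of doubles in Fortran layout: column-major storage,
   indices starting at 1. *)
let m = Array2.create float64 fortran_layout 2 3

let () =
  for i = 1 to 2 do
    for j = 1 to 3 do
      m.{i, j} <- float_of_int ((10 * i) + j)
    done
  done

(* View the same memory as a one-dimensional vector; no data is copied. *)
let flat = reshape_1 (genarray_of_array2 m) 6

let () =
  for k = 1 to 6 do
    Printf.printf "%g " flat.{k}
  done;
  print_newline ()
```

Swapping fortran_layout for c_layout gives row-major storage with indices
starting at 0, which is what a C library expects.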



* Re: [Caml-list] Looking for information regarding use of OCaml in scientific computing and simulation
  2009-11-25 11:05 Looking for information regarding use of OCaml in scientific computing and simulation David MENTRE
  2009-11-25 11:59 ` Sylvain Le Gall
  2009-11-25 12:32 ` [Caml-list] " blue storm
@ 2009-11-28 23:23 ` Jan Kybic
  2009-11-29 23:11 ` Jon Harrop
  2009-12-04  9:55 ` David MENTRE
  4 siblings, 0 replies; 12+ messages in thread
From: Jan Kybic @ 2009-11-28 23:23 UTC (permalink / raw)
  To: David MENTRE; +Cc: caml users

> I'm considering doing a short presentation of OCaml to my colleagues
> in my research lab. They are working in the telecommunication and
> power electronic sectors, mainly doing signal processing and
> simulations. I know OCaml[1] but not specifically those domains.
>
> Therefore, I'm looking for reusable material for a presentation:
>  - Slides on the use of OCaml in the signal processing and simulation domains;
>
>  - Code snippets of OCaml used in scientific computing or simulation,
> typically for advocacy like "it takes 10 lines in OCaml to do this,
> you would use 50 lines in C++ to do the same thing";
>
>  - Evidence of *actual use* of OCaml for scientific computing or
> simulation, especially regarding usable libraries, bindings, etc.

I am writing this offline, so I cannot provide any pointers now but it
should be easy to find them on the Web.

I used to be a proficient C programmer; currently I am writing the
majority of my code in Ocaml.

There are Ocaml bindings to many libraries for scientific (=numerical)
computing, such as BLAS+LAPACK, GSL, Cubpack, FFTW and others. You can
use MPI for parallel (cluster) programming. For visualization, I have
tried an interface to Gnuplot and OpenGL. For image processing, there
is the CamlImages library.

I have interfaced my Ocaml code with C, C++ (via C), Matlab, and Python.
Because of this interoperability, it is normally relatively easy to
switch to Ocaml gradually, rewriting your programs one by one.

I have been using Ocaml for scientific computing for about seven
years, in domains ranging from solving integral equations to medical
image segmentation (look on my web page for details). However, most of
this is research code (=alpha quality, works well enough for me), so I
am only making it available on request for research purposes. Sadly,
most people that have asked for it have been put off by the fact that
the code was written in Ocaml...

I do not really have any short and self-contained examples of Ocaml's
superiority. My observation is that for simple algorithms based on
high-level operations, I tend to use Matlab or Python, because these
languages are very concise and have comprehensive libraries, and I can
get a proof-of-concept implementation written extremely fast. If
maximum speed is paramount, I write the computational core in
C/C++. Ocaml is the language of choice for me if the algorithm at hand
is complex, uses complicated data structures and/or cannot be
decomposed into a few simple "computational cores". Ocaml is excellent
for this - it is expressive, sufficiently concise, takes care of
memory management, its strong typing prevents many errors and it is
only a few times slower than C (1.5-5, depending on the application).

I used to quote the generalization abilities of Ocaml (polymorphism,
functors, objects) among its advantages, but I have not been using them
much recently because of the rather high computational penalty they
involve. This is in my opinion one of the weak points of Ocaml for
high performance numerical computing.
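
One small, well-known instance of this penalty can be sketched as follows
(the function names are hypothetical, and how large the difference is
depends on the compiler version and optimisation flags). A polymorphic
comparison goes through the runtime's generic structural comparison, while
a type annotation lets the compiler emit a direct floating-point
comparison:

```ocaml
(* Polymorphic version: works on any type, but the comparison is
   dispatched to the runtime's generic structural comparison. *)
let max_poly a b = if a > b then a else b

(* Monomorphic version: the annotations tell the compiler that both
   arguments are floats, so it can compare them directly. *)
let max_float (a : float) (b : float) = if a > b then a else b

let () =
  Printf.printf "%g %g\n" (max_poly 1.5 2.5) (max_float 1.5 2.5)
```

Both functions return the same result; only the generated code differs,
which matters mostly inside tight numerical loops.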

As for Ocaml syntax, I got used to it. I quite liked the
indentation-based (Haskell/Python-like) syntax implemented through the TWT
preprocessor. However, I stopped using it because it interfered with
other preprocessors I wanted to use (e.g. macros in Camlp4) and
sometimes confused error reporting.

For some projects I have written both Ocaml and C++ implementations,
and in all cases writing the Ocaml version was a more enjoyable
experience for me: the code was easier and faster to write and there
were fewer errors - no segmentation faults.

If you do prepare a presentation about using Ocaml for scientific
computing, I would be interested in seeing it.

Yours,

Jan

-- 
-------------------------------------------------------------------------
Jan Kybic <kybic@fel.cvut.cz>                       tel. +420 2 2435 5721
http://cmp.felk.cvut.cz/~kybic                      ICQ 200569450


* Re: [Caml-list] Looking for information regarding use of OCaml in scientific computing and simulation
  2009-11-25 11:05 Looking for information regarding use of OCaml in scientific computing and simulation David MENTRE
                   ` (2 preceding siblings ...)
  2009-11-28 23:23 ` Jan Kybic
@ 2009-11-29 23:11 ` Jon Harrop
       [not found]   ` <4a708d20911291416x2be905f7p93f559543a77d97f@mail.gmail.com>
  2009-12-04  9:55 ` David MENTRE
  4 siblings, 1 reply; 12+ messages in thread
From: Jon Harrop @ 2009-11-29 23:11 UTC (permalink / raw)
  To: caml-list

On Wednesday 25 November 2009 11:05:14 David MENTRE wrote:
>  - Code snippets of OCaml used in scientific computing or simulation,
> typically for advocacy like "it takes 10 lines in OCaml to do this,
> you would use 50 lines in C++ to do the same thing";

You may find the examples from OCaml for Scientists enlightening:

  http://www.ffconsultancy.com/products/ocaml_for_scientists/complete/

For example, the "n"th-nearest neighbours example expresses a simple 
recurrence relation for finding neighbour shells in networks (bonded atoms in 
simulated non-crystalline materials in my case). The idiomatic OCaml 
implementation is a dozen lines of code or so and more than fast enough. The 
obvious C++ implementation is over a hundred lines of code and a hundred 
times slower. You can optimize the C++ by resorting to mutable data 
structures and working out when it is safe to overwrite them but it is 
tedious and error-prone to beat OCaml's performance from C++.
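
The recurrence can be sketched as follows. This is not the code from the
book: the graph representation (an array of adjacency lists), the names
and the six-vertex ring used as test data are all assumptions made for
illustration, and the graph is assumed to have no self-loops. The nth
shell is everything reachable in one step from shell n-1, minus shells n-1
and n-2:

```ocaml
module IntSet = Set.Make (struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end)

(* Direct neighbours of vertex i, as a set. *)
let neighbours graph i = IntSet.of_list graph.(i)

(* Vertices at graph distance exactly n from vertex i. *)
let rec shell graph n i =
  if n = 0 then IntSet.singleton i
  else if n = 1 then neighbours graph i
  else begin
    let prev = shell graph (n - 1) i in
    let pprev = shell graph (n - 2) i in
    (* Everything one step away from the previous shell ... *)
    let reached =
      IntSet.fold
        (fun j acc -> IntSet.union acc (neighbours graph j))
        prev IntSet.empty
    in
    (* ... minus the two shells already accounted for. *)
    IntSet.diff (IntSet.diff reached prev) pprev
  end

(* Hypothetical test graph: a ring of six vertices. *)
let ring = [| [1; 5]; [0; 2]; [1; 3]; [2; 4]; [3; 5]; [4; 0] |]

let () =
  IntSet.iter (Printf.printf "%d ") (shell ring 2 0);
  print_newline ()
```

This naive version recomputes earlier shells on every call; memoising
shell on the pair (n, i), for instance in a Hashtbl, avoids that without
changing the structure of the code.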

>  - Evidence of *actual use* of OCaml for scientific computing or
> simulation, especially regarding usable libraries, bindings, etc.

We had hundreds of customers using OCaml for scientific computing, mostly in 
the biological sector but also physics, chemistry and engineering.

>  - Evidence of people having switched from C/C++ simulators to OCaml
> ones : good and bad points, issues, things to look at, etc.

If you look back at the archives of the mailing list, lots of people who've 
used OCaml for technical computing came from a C++ background, myself 
included.

OCaml has lots of huge advantages over C++:

1. Interactive top-level makes it easy to test code and perform simple 
computations.

2. Safety: incorrect OCaml code breaks in nice (usually deterministic) ways.

3. Better static checking: OCaml catches far more bugs at compile time than 
C++ does, making it much faster to develop in and easier to maintain OCaml 
code bases. Static checking is so good in OCaml that you rarely need a 
debugger. Once you've written your own HM type inferencer (only ~100 LOC) you 
can understand OCaml's error messages really easily whereas template-related 
C++ errors are a nightmare.

4. Functional programming: much easier to express functional solutions (e.g. 
combinators for integration and differentiation) and factor code elegantly 
and concisely.

5. Exceptions fast enough for control flow: OCaml's exceptions are several 
times faster than C++'s.

6. First-class array and list literals.

7. Algebraic data types with pattern matching: makes code for manipulating 
trees simple and efficient and, consequently, trees are far more common in 
OCaml than in C++.

8. Parametric polymorphism: a constrained form of templates that gives almost 
all of the useful expressiveness but without the obfuscated error messages.

9. Garbage collection: efficient automatic collection of unreachable values 
removes a class of bugs without significant performance degradation. OCaml's 
GC is nice and incremental with good pause times for visualization whereas 
C++ relies heavily upon scope-based destruction that results in avalanches, 
stalling for arbitrarily long pauses. The first OCaml project I did was a 
port of a visualization code base from C++ to OCaml and, IIRC, the OCaml was 
4x less code and with 5x shorter pause times.

10. Fast compilation: OCaml compiles orders of magnitude faster than C++. In 
fact, Google's new Go programming language is specifically designed for very 
fast compilation at a significant cost in terms of run-time performance but 
my preliminary tests indicate that OCaml both compiles and runs faster than 
Go.

11. I found OCaml far easier to learn than C++.
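
To make point 4 concrete, here is a minimal sketch of combinators for
numerical differentiation and integration. The names d and integrate, the
default step sizes and the choice of central differences and the
trapezoidal rule are assumptions for illustration, not taken from any
particular library:

```ocaml
(* Central-difference approximation of the derivative of f, step h.
   The result is itself a function, so d can be applied repeatedly. *)
let d ?(h = 1e-5) f = fun x -> (f (x +. h) -. f (x -. h)) /. (2.0 *. h)

(* Composite trapezoidal rule on [a, b] with n equal sub-intervals. *)
let integrate ?(n = 1000) f a b =
  let h = (b -. a) /. float_of_int n in
  let sum = ref ((f a +. f b) /. 2.0) in
  for i = 1 to n - 1 do
    sum := !sum +. f (a +. (float_of_int i *. h))
  done;
  !sum *. h

let () =
  let f x = x *. x *. x in
  Printf.printf "f'(2)     ~ %g\n" (d f 2.0);
  Printf.printf "f''(2)    ~ %g\n" (d ~h:1e-3 (d ~h:1e-3 f) 2.0);
  Printf.printf "int_0^1 f ~ %g\n" (integrate f 0.0 1.0)
```

Because d takes a function and returns a function, the second derivative
is just d applied twice, with no extra machinery.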

The main disadvantages of OCaml are:

1. No longer competitively performant on today's computers, for several 
reasons:

a) OCaml's GC prevents threads from running in parallel, making it much harder 
or impossible to obtain speedups from multicores when other languages make it 
easy to obtain near-linear speedups for many problems on today's widest 
desktops.

b) OCaml's data representation makes numerics and abstractions inefficient 
because it introduces a huge amount of boxing, e.g. complex arithmetic can be 
5x slower and polymorphism can be 100x slower compared to F#.

c) OCaml's x86 code gen has been overtaken by others (e.g. the JVM).

2. OCaml's Foreign Function Interface (FFI) is comparatively cumbersome, 
error-prone and inefficient. This makes OCaml largely uninteroperable: 
compare the development of Qt and OpenGL 2/3 bindings in OCaml to other 
languages, for example.

3. Lack of operator overloading can make numerics tedious.

4. Printing, hashing, equality, comparison and marshalling functions should be 
associated with each type and automatically used by the compiler. OCaml's 
module-based solution is tedious, error prone and inefficient.
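
Point 3 can be illustrated with a short sketch using only the standard
library (the ComplexOps module and the sample values are hypothetical).
Integer and float arithmetic use different operators, and complex
arithmetic needs named functions unless the operators are rebound in a
module and opened locally:

```ocaml
(* Integer and float arithmetic use distinct operators. *)
let mean_int a b = (a + b) / 2
let mean_float a b = (a +. b) /. 2.0

(* Complex numbers from the standard library need named functions ... *)
let z1 = { Complex.re = 1.0; im = 2.0 }
let z2 = { Complex.re = 3.0; im = -1.0 }
let product = Complex.mul z1 z2

(* ... unless the operators are rebound inside a module. *)
module ComplexOps = struct
  let ( + ) = Complex.add
  let ( * ) = Complex.mul
end

(* Local open: inside the parentheses, + and * act on complex numbers. *)
let combined = ComplexOps.(z1 * z2 + z1)

let () =
  Printf.printf "%d %g\n" (mean_int 3 5) (mean_float 3.0 5.0);
  Printf.printf "%g + %gi\n" combined.Complex.re combined.Complex.im
```

The local-open syntax keeps formulas readable, but each numeric type still
needs its own operator module, which is the tedium the point refers to.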

The OCaml community has shrunk significantly in recent years (traffic here 
fell 28% from 2007-2008 and another 22% from 2008-2009, sales of OFS fell 30% 
from 2007-2008 and another 50% from 2008-2009). I believe this is because the 
OCaml community was ~80% technical users in 2006 and most of them have since 
left for languages that address the disadvantages I just listed, primarily 
parallelism. The main problem as I see it is that all of the other solutions 
for technical users outside Windows (i.e. except F#) introduce serious 
problems of their own (e.g. no TCO in Clojure and Scala, poor inference in 
Scala and Haskell, unpredictable time and space in Haskell, conservative GC 
on Mono and ATS).

The good news is that I've been working on a project (HLVM) specifically 
designed to address these problems, with high performance parallel numerics 
as well as having C-like data representation with an easy-to-use FFI and many 
other advantages including an optimizing native-code REPL, generic printing 
and serialization. This is just a hobby but I'm amazed at how easily I've 
been able to progress thanks to the awesome OCaml+LLVM combo.

>  - My colleagues are working a lot with Matlab, is there any synergy
> there (bindings, ways to integrate Matlab within OCaml code or vice
> versa, ...)?
>
> You can reply to me on this list or off list. In case of personal
> reply, let me know if I can reuse your name and affiliation.

Please do. :-)

> If this presentation is done, I'll make the slides available under a
> free license.

Great, thanks.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


* Re: Looking for information regarding use of OCaml in scientific  computing and simulation
  2009-11-25 11:05 Looking for information regarding use of OCaml in scientific computing and simulation David MENTRE
                   ` (3 preceding siblings ...)
  2009-11-29 23:11 ` Jon Harrop
@ 2009-12-04  9:55 ` David MENTRE
  4 siblings, 0 replies; 12+ messages in thread
From: David MENTRE @ 2009-12-04  9:55 UTC (permalink / raw)
  To: caml users

Hello,

2009/11/25 David MENTRE <dmentre@linux-france.org>:
> I'm considering doing a short presentation of OCaml to my colleagues
> in my research lab.
[...]
> Therefore, I'm looking for reusable material for a presentation:

Many thanks to all who replied. As usual, some answers were not
what I expected, but they gave me new ideas. I'm going to post the
slides when done, but don't hold your breath (probably at the beginning
of 2010). ;-)

Regards,
d.



* Re: [Caml-list] Looking for information regarding use of OCaml in  scientific computing and simulation
       [not found]       ` <4a708d20912200638q5e7d72acu9cae3b564ada085d@mail.gmail.com>
@ 2009-12-22  4:40         ` Linas Vepstas
  2009-12-22 13:11           ` Eray Ozkural
  2009-12-22 13:46           ` Gerd Stolpmann
  0 siblings, 2 replies; 12+ messages in thread
From: Linas Vepstas @ 2009-12-22  4:40 UTC (permalink / raw)
  To: Lukasz Stafiniak; +Cc: Dario Teixeira, Erik Rigtorp, caml-list

Hi Lukasz,

Yikes!  Care to start an argument on my behalf?

2009/12/20 Lukasz Stafiniak <lukstafi@gmail.com>:
> ---------- Forwarded message ----------
> From: Dario Teixeira <darioteixeira@yahoo.com>
> Date: Sun, Dec 20, 2009 at 3:27 PM
> Subject: Re: [Caml-list] Re: OCaml is  broken
> To: Erik Rigtorp <erik@rigtorp.com>
> Cc: caml-list <caml-list@yquem.inria.fr>
>
>
> Hi,
>
>> It's too bad that INRIA is not interested in fixing this bug. No
>> matter what people say I consider this a bug. Two cores is standard by
>> now, I'm used to 8, next year 32 and so on. OCaml will only become
>> more and more irrelevant. I hate to see that happening.

Hear, hear!

> This is a perennial topic in this list.  Without meaning to dwell too
> long on old arguments, I simply ask you to consider the following:
>
> - Do you really think a concurrent GC with shared memory will scale neatly
>  to those 32 cores?

Time to start funding GC research?  Is concurrent GC really
that bad?

> - Will memory access remain homogeneous for all cores as soon as we get into
>  the dozens of cores?

Yes, NUMA (non-uniform memory access) is well-known to be
painful to optimize for, due to the difficulty of predicting locality
of reference.  But is tuning for NUMA harder than creating/using
message-passing code?  Not by a long shot.

Anyway, CPU designers understand that NUMA is unpopular
with software types, and are trying hard to make memory access
as homogeneous as possible.   Look at 'Blue Waters' for an
extreme example.

> - Have you considered that many Ocaml users prefer a GC that offers maximum
>  single core performance, because their application is parallelised via
>  multiple processes communicating via message passing?

Have you ever tried writing a significant or complex algo using
message passing?  It's fun if you have nothing better to do --
it's a good intellectual challenge.  You can even learn some
interesting computer science while you do it.

However, if you are interested in merely using the system
to do your "real" work, then writing message-passing code
is an utter waste of time -- it's difficult, time-consuming, error
prone, hard to balance and optimize & tune, works well only
for "embarrassingly parallel" code, etc.  Even the evil
slow-down of NUMA is often better than trying to
performance-tune a message-passing system.

Let me put it this way: suggesting that programmers can
write their own message-passing system is kind of like
telling them that they can write their own garbage-collection
system, or design their own closures, or they can go
create their own type system. Of course they can ... and
if they wanted to do that, they would be programming in
C or assembly, and would probably be designing new
languages.  Cause by the time you get done with message
passing, you've created a significant and rich programming
system that resembles a poorly-designed language... been
there, done that.

> In this context,
>  your "bug" is actually a "feature".

Why not give people the choice?

--linas

disclaimer: I don't (currently) use caml -- this is an outsider's
viewpoint.


* Re: [Caml-list] Looking for information regarding use of OCaml in  scientific computing and simulation
  2009-12-22  4:40         ` Linas Vepstas
@ 2009-12-22 13:11           ` Eray Ozkural
  2009-12-22 13:44             ` Eray Ozkural
  2009-12-22 19:49             ` Jon Harrop
  2009-12-22 13:46           ` Gerd Stolpmann
  1 sibling, 2 replies; 12+ messages in thread
From: Eray Ozkural @ 2009-12-22 13:11 UTC (permalink / raw)
  To: linasvepstas; +Cc: Lukasz Stafiniak, caml-list

On Tue, Dec 22, 2009 at 6:40 AM, Linas Vepstas <linasvepstas@gmail.com> wrote:
> However, if you are  interested in merely using the system
> to do your "real" work, then writing message-passing code
> is an utter waste of time -- it's difficult, time-consuming, error
> prone, hard to balance and optimize & tune, works well only
> for "embarrassingly parallel" code, etc.  Even the evil
> slow-down of NUMA is often better than trying to
> performance-tune a message-passing system.

Message passing doesn't work well only for embarrassingly parallel
code. For instance, you can implement the aforementioned parallel
quicksort rather easily, but it's true that message passing is
low-level. The really bad thing about MPI is that it assumes some
C-like environment. C has the worst semantics ever, so programs that
require/encourage such a style of writing are inherently, well,  bad
=) And yes, MPI's difficult to debug.

What message passing really is, it is the perfect match to a
distributed memory architecture. Since, as you suggest, multicore
chips have more or less a shared memory architecture, message passing
is indeed not a good match.

However, let's not forget about the new GPU architectures, which are
sort of hybrid. The newer GPUs will have more exotic on-chip
interconnection networks.

> Let me put it this way: suggesting that programmers can
> write their own message-passing system is kind of like
> telling them that they can write their own garbage-collection
> system, or design their own closures, or they can go
> create their own type system. Of course they can ... and
> if they wanted to do that, they would be programming in
> C or assembly, and would probably be designing new
> languages.  Cause by the time you get done with message
> passing, you've created a significant and rich programming
> system that resembles a poorly-designed language... been
> there, done that.

For a functional language, am I right in expecting a high-level and
clean interface for explicit parallelism?

I suppose a "spawn" directive would not be very hard to implement. Or
a parallel let block. I don't know what kind of direction the caml
researchers have in mind (except for some INRIA projects I've read
about) but I suppose it can be done (closures could be a nice
interface, I think).

Message Passing/Distributed Memory can also be accommodated I suppose.
For distributed memory, you just need an addressing scheme to denote
the processor number. You could allocate with a parallel new that
takes a processor number argument for instance. For message passing,
you need one-to-one, collective, and one-sided communications. On top
of an imperative but failsafe and debuggable interface, you could
provide neat functional blocks. For instance, you could present the
user with a shuffle/map/reduce interface like in Google's grid
computing platform.
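
As a rough sketch of what such a functional interface could look like, the
following defines a hypothetical map/reduce signature (SKELETON,
map_reduce and Stats are invented names, not an existing library) together
with a purely sequential reference backend. A parallel backend could
implement the same signature, for instance by farming chunks out to worker
processes, without any change to client code:

```ocaml
(* Hypothetical interface: none of these names come from a real library. *)
module type SKELETON = sig
  val map_reduce :
    map:('a -> 'b) -> reduce:('b -> 'b -> 'b) -> init:'b -> 'a list -> 'b
end

(* Sequential reference backend: a single left fold over the input. *)
module Sequential : SKELETON = struct
  let map_reduce ~map ~reduce ~init xs =
    List.fold_left (fun acc x -> reduce acc (map x)) init xs
end

(* Client code is a functor over the signature, not over a backend. *)
module Stats (S : SKELETON) = struct
  let sum_of_squares xs =
    S.map_reduce ~map:(fun x -> x *. x) ~reduce:( +. ) ~init:0.0 xs
end

module SeqStats = Stats (Sequential)

let () = Printf.printf "%g\n" (SeqStats.sum_of_squares [1.0; 2.0; 3.0])
```

For a parallel backend to give the same answer as the sequential one,
reduce would have to be associative and init its identity element; the
signature alone cannot enforce that.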

I would also like to have some computing-abstraction (like Monads, but
more flexible) so I can easily build networks of sequential algorithms
(kind of like Communicating Sequential Processes). It would also be
nice to have functors that implement commonly occurring parallel
programming patterns (such as master-slave dynamic load balancing or
pipelining). Such things are difficult to manage in a low-level
language like C, but they would be a piece of cake in ocaml.  Or is
fate sending that my way? Oh, not again! :)

OcamlP3l looks pretty cool. Parallel combinators? Definitely what I'm
talking about, as usual the future is here with ocaml ;)

http://ocamlp3l.inria.fr/eng.htm

Best,

-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct


* Re: [Caml-list] Looking for information regarding use of OCaml in  scientific computing and simulation
  2009-12-22 13:11           ` Eray Ozkural
@ 2009-12-22 13:44             ` Eray Ozkural
  2009-12-22 19:49             ` Jon Harrop
  1 sibling, 0 replies; 12+ messages in thread
From: Eray Ozkural @ 2009-12-22 13:44 UTC (permalink / raw)
  To: linasvepstas; +Cc: Lukasz Stafiniak, caml-list

On Tue, Dec 22, 2009 at 3:11 PM, Eray Ozkural <examachine@gmail.com> wrote:
> However, let's not forget about the new GPU architectures, which are
> sort of hybrid. The newer GPUs will have more exotic on-chip
> interconnection networks.

Some more clarification: as the number of cores increases, you would
expect more of a MIMD architecture rather than SMP-like shared
memory+cache or the SIMD that some architectures were based on (like
the Cell processor etc.). Or a hybrid one, who knows? What is certain
at the moment is that architectures are getting *more*
complex to program and to optimize for.

I don't think we can neglect the use of a shared memory space. Among
other things, with multiple cores, it allows us to directly implement
the PRAM algorithms that are prevalent in parallel computing
literature. On the other hand, most of the existing parallel
applications assume the traditional cluster architecture. I don't
think we can say at the moment, this or that programming *paradigm* is
the best, but I think it's high time we experiment with higher-level
constructs that are fitting for functional languages. Sure, you can
have parallelism with just a multi-threading construct and an
atomicity construct but I think there is much more to parallelism than
that for a high-level language. That stuff we have in parallel
assembly, it would be the sort of code that a compiler generates
perhaps.

I also anticipate that it should not be very difficult to write a
parallelizing compiler for ocaml. It would be great to directly target
all those nifty register files etc. in the NVIDIA GPUs. :) And it
seems perhaps the only sane way to make the kind of fine optimizations
that a complex architecture would call for.

Best,

-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
http://myspace.com/arizanesil http://myspace.com/malfunct


* Re: [Caml-list] Looking for information regarding use of OCaml in  scientific computing and simulation
  2009-12-22  4:40         ` Linas Vepstas
  2009-12-22 13:11           ` Eray Ozkural
@ 2009-12-22 13:46           ` Gerd Stolpmann
  1 sibling, 0 replies; 12+ messages in thread
From: Gerd Stolpmann @ 2009-12-22 13:46 UTC (permalink / raw)
  To: linasvepstas; +Cc: Lukasz Stafiniak, caml-list

> Have you ever tried writing a significant or complex algo using
> message passing?  It's fun if you have nothing better to do --
> it's a good intellectual challenge.  You can even learn some
> interesting computer science while you do it.
> 
> However, if you are  interested in merely using the system
> to do your "real" work, then writing message-passing code
> is an utter waste of time -- it's difficult, time-consuming, error
> prone, hard to balance and optimize & tune, works well only
> for "embarrassingly parallel" code, etc.  Even the evil
> slow-down of NUMA is often better than trying to
> performance-tune a message-passing system.

Well, it is true that message passing is more expensive, and you need
bigger data sets until it is worth it (nonsense to do a 10x10 matrix
multiplication with message passing). However, I don't think it is as
complicated as you describe. Especially ocaml's uniform representation
of values can help a lot, and hide many of the low-level details. It
could be a lot like continuation-passing style.
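
A minimal sketch of how the uniform representation helps, using only the
standard Marshal module (the message type and the encode/decode helpers
are hypothetical). Any closure-free value can be turned into a byte string
and back without hand-written serialisation code, so the payload of a
message sent over a pipe or socket takes one line in each direction:

```ocaml
(* A hypothetical message type; any closure-free OCaml value would do. *)
type message =
  | Work of int * float array
  | Result of int * float
  | Stop

(* Serialise a message to a byte string suitable for a pipe or socket. *)
let encode (m : message) : string = Marshal.to_string m []

(* Deserialisation is not type-checked: the annotation is a promise by
   the programmer that the bytes really hold a [message]. *)
let decode (s : string) : message = Marshal.from_string s 0

let () =
  match decode (encode (Work (7, [| 1.0; 2.0; 3.0 |]))) with
  | Work (id, xs) ->
    Printf.printf "job %d, sum %g\n" id (Array.fold_left ( +. ) 0.0 xs)
  | Result _ | Stop -> print_endline "unexpected message"
```

The price of this convenience is that Marshal offers no type safety on the
receiving side, so both ends must agree on the message type.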

Hard to balance and optimize & tune: This is true for _any_
parallelization strategy.

> Let me put it this way: suggesting that programmers can
> write their own message-passing system is kind of like
> telling them that they can write their own garbage-collection
> system, or design their own closures, or they can go
> create their own type system. Of course they can ... and
> if they wanted to do that, they would be programming in
> C or assembly, and would probably be designing new
> languages.  Cause by the time you get done with message
> passing, you've created a significant and rich programming
> system that resembles a poorly-designed language... been
> there, done that.

See it this way: The typical ocaml programmer doesn't like system
programming, and will seldom/never touch C or assembly. The task is to
help this kind of programmer, and to make parallel programming available
in a higher-level way than it is available elsewhere. 

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



* Re: [Caml-list] Looking for information regarding use of OCaml in  scientific computing and simulation
  2009-12-22 13:11           ` Eray Ozkural
  2009-12-22 13:44             ` Eray Ozkural
@ 2009-12-22 19:49             ` Jon Harrop
  2009-12-22 21:11               ` Mike Lin
  1 sibling, 1 reply; 12+ messages in thread
From: Jon Harrop @ 2009-12-22 19:49 UTC (permalink / raw)
  To: caml-list

On Tuesday 22 December 2009 13:11:58 Eray Ozkural wrote:
> On Tue, Dec 22, 2009 at 6:40 AM, Linas Vepstas <linasvepstas@gmail.com> wrote:
> > However, if you are  interested in merely using the system
> > to do your "real" work, then writing message-passing code
> > is an utter waste of time -- it's difficult, time-consuming, error
> > prone, hard to balance and optimize & tune, works well only
> > for "embarrassingly parallel" code, etc.  Even the evil
> > slow-down of NUMA is often better than trying to
> > performance-tune a message-passing system.
>
> Message passing doesn't work well only for embarrassingly parallel
> code.

Message passing doesn't necessarily work well for embarrassingly-parallel 
problems either because you cannot use in-place algorithms and scatter and 
gather are O(n).

> For instance, you can implement the aforementioned parallel 
> quicksort rather easily,

But you cannot improve performance easily, and performance is the *only*
motivation for parallelism. So the fact that you can make naive use of
message passing easily from OCaml is useless in practice.

> What message passing really is, it is the perfect match to a
> distributed memory architecture. Since, as you suggest, multicore
> chips have more or less a shared memory architecture, message passing
> is indeed not a good match.

Yes. Conversely, shared memory is effectively a hardware accelerated form of 
message passing.

> > Let me put it this way: suggesting that programmers can
> > write their own message-passing system is kind of like
> > telling them that they can write their own garbage-collection
> > system, or design their own closures, or they can go
> > create their own type system. Of course they can ... and
> > if they wanted to do that, they would be programming in
> > C or assembly, and would probably be designing new
> > languages.  Cause by the time you get done with message
> > passing, you've created a significant and rich programming
> > system that resembles a poorly-designed language... been
> > there, done that.
>
> For a functional language, am I right in expecting a high-level and
> clean interface for explicit parallelism?

I think that is a perfectly reasonable thing to expect but you still need to 
understand its characteristics and how to leverage them in order to make good 
use of the feature.

> I suppose a "spawn" directive would not be very hard to implement.

You cannot implement it with useful efficiency in OCaml.

> Message Passing/Distributed Memory can also be accommodated I suppose.

Sure but it is worth remembering that distributed parallelism across clusters 
is a tiny niche compared to multicores.

> OcamlP3l looks pretty cool. Parallel combinators? Definitely what I'm
> talking about, as usual the future is here with ocaml ;)
>
> http://ocamlp3l.inria.fr/eng.htm

Try solving some real problems with OCamlP3L and F#. I'm sure you'll agree 
that the OCaml approach is certainly not the future.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e



* Re: [Caml-list] Looking for information regarding use of OCaml in  scientific computing and simulation
  2009-12-22 19:49             ` Jon Harrop
@ 2009-12-22 21:11               ` Mike Lin
  0 siblings, 0 replies; 12+ messages in thread
From: Mike Lin @ 2009-12-22 21:11 UTC (permalink / raw)
  To: caml-list

On Tue, Dec 22, 2009 at 2:49 PM, Jon Harrop <jon@ffconsultancy.com> wrote:
> Sure but it is worth remembering that distributed parallelism across clusters
> is a tiny niche compared to multicores.

I think the balance is slightly different than this in
scientific/research computing (the original subject of this thread).
At least in the U.S., a ten-page proposal will garner at least a
million core-hours at a TeraGrid site for pretty much any non-crackpot
academic project, free of charge (the systems are funded by the
gov't). NSF graduate fellows get like 100k core-hours just to mess
around, no PI and no particular project proposal needed.

The infrastructure, documentation, and tech support at the TeraGrid
sites generally assume MPI-based jobs, and give you access to [tens
of] thousands of cores. That's a pretty compelling resource if the
alternative is the relatively measly theoretical speedup from a
multicore (8x, etc.).
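
To put those figures side by side, here is a back-of-the-envelope sketch. The
million core-hours and the 8 cores come from the text above; the function name
wall_clock_hours, the assumption of perfect linear scaling, and the 365-day
year are only for illustration.

```ocaml
(* Wall-clock time needed to consume a core-hour budget on a machine
   with a given number of cores, assuming perfect linear scaling. *)
let wall_clock_hours ~core_hours ~cores =
  core_hours /. float_of_int cores

let () =
  let h = wall_clock_hours ~core_hours:1_000_000. ~cores:8 in
  Printf.printf "%.0f hours (about %.1f years)\n" h (h /. (24. *. 365.))
  (* prints "125000 hours (about 14.3 years)" *)
```

Under these assumptions, a single 8-core machine would have to run flat out
for over a decade to use up the same allocation.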

Mike



Thread overview: 12+ messages
2009-11-25 11:05 Looking for information regarding use of OCaml in scientific computing and simulation David MENTRE
2009-11-25 11:59 ` Sylvain Le Gall
2009-11-25 12:32 ` [Caml-list] " blue storm
2009-11-28 23:23 ` Jan Kybic
2009-11-29 23:11 ` Jon Harrop
     [not found]   ` <4a708d20911291416x2be905f7p93f559543a77d97f@mail.gmail.com>
     [not found]     ` <3ae3aa420911300830h63a04b21r2e09fb4e34cdb7f7@mail.gmail.com>
     [not found]       ` <4a708d20912200638q5e7d72acu9cae3b564ada085d@mail.gmail.com>
2009-12-22  4:40         ` Linas Vepstas
2009-12-22 13:11           ` Eray Ozkural
2009-12-22 13:44             ` Eray Ozkural
2009-12-22 19:49             ` Jon Harrop
2009-12-22 21:11               ` Mike Lin
2009-12-22 13:46           ` Gerd Stolpmann
2009-12-04  9:55 ` David MENTRE
