caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* How to write efficient threaded programs on OCaml
@ 2006-02-07 23:15 Christophe TROESTLER
  2006-02-08  8:21 ` [Caml-list] " Erik de Castro Lopo
  2006-02-13 10:25 ` Florian Weimer
  0 siblings, 2 replies; 10+ messages in thread
From: Christophe TROESTLER @ 2006-02-07 23:15 UTC (permalink / raw)
  To: OCaml Mailing List

Hi,

I was wondering if some among you would like to share some hints about
how to write efficient multithreaded applications in OCaml.  Indeed,
for example, the following concurrency test shows OCaml performing
quite poorly:
http://shootout.alioth.debian.org/debian/benchmark.php?test=chameneos&lang=all
A qprof profiling reveals that OCaml is spending 63-73% of its time on
the function caml_process_pending_signals and 13-18% on
pthread_cond_signal.  Is there a way to improve the performance of
this code ?

Now maybe this is not to be considered like a good analogous of a
possible "real life" application.  Hence my question: what do people
do to get decent performance in these cases?

Cheers,
ChriS


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-07 23:15 How to write efficient threaded programs on OCaml Christophe TROESTLER
@ 2006-02-08  8:21 ` Erik de Castro Lopo
  2006-02-08 14:38   ` Christophe TROESTLER
  2006-02-13 10:25 ` Florian Weimer
  1 sibling, 1 reply; 10+ messages in thread
From: Erik de Castro Lopo @ 2006-02-08  8:21 UTC (permalink / raw)
  To: caml-list

Christophe TROESTLER wrote:

> Hi,
> 
> I was wondering if some among you would like to share some hints about
> how to write efficient multithreaded applications in OCaml.  Indeed,
> for example, the following concurrency test shows OCaml performing
> quite poorly:

This question (or variants of it) comes around quite regularly. The 
standard response is:

    http://sardes.inrialpes.fr/~aschmitt/cwn/2002.11.26.html#8

If this has changed recently, I hope someone will speak up :-).

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
"Attacks by Microsoft Chairman Bill Gates on the GNU General
Public License, under which much open source and free software is
distributed, have been driven by a fear that the GPL creates a
domain of software that Microsoft cannot privatize and control"
   -- Richard Stallman


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-08  8:21 ` [Caml-list] " Erik de Castro Lopo
@ 2006-02-08 14:38   ` Christophe TROESTLER
  2006-02-08 15:17     ` Xavier Leroy
  0 siblings, 1 reply; 10+ messages in thread
From: Christophe TROESTLER @ 2006-02-08 14:38 UTC (permalink / raw)
  To: OCaml Mailing List

On Wed, 8 Feb 2006, Erik de Castro Lopo <ocaml-erikd@mega-nerd.com> wrote:
> 
> This question (or variants of it) comes around quite regularly. The 
> standard response is:
> 
>     http://sardes.inrialpes.fr/~aschmitt/cwn/2002.11.26.html#8

Thanks for the link but I know about that answer (and sorry to start
yet again a discussion on this topic it's just that I can't figure out
why there is such a difference between programs using the same
concepts).  The test is supposed to be executed on a single processor
and can be seen as related to (2)-(3) [i.e. _not_ SMP] as there are 4
different threads -- one can think some are doing I/O, others are
listening to user events.  What is special about this test is that the
4 threads communicate a lot (this will obviously not be so intensive
in a real app but still lots of sync can happen).

My question is twofold.  When using the Event module in the same way
MLton does, the running time is ~30 times larger.  I'd like an
explanation about why there is such a huge difference and possible
ways to reduce it...  Using an implementation close to C (Mutex,
Condition) divides the time by 2 but still is about 6 times slower
than C.  Is it to say a better implementation is possible for OCaml ???

ChriS


---
P.S. BTW, these slowdowns especially happen with native code --
bytecode (with -vmthreads) is immensely faster (of course this is a
different implementation).


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-08 14:38   ` Christophe TROESTLER
@ 2006-02-08 15:17     ` Xavier Leroy
  2006-02-08 16:58       ` Matthieu Dubuget
  2006-02-08 22:13       ` Christophe TROESTLER
  0 siblings, 2 replies; 10+ messages in thread
From: Xavier Leroy @ 2006-02-08 15:17 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: OCaml Mailing List

> My question is twofold.  When using the Event module in the same way
> MLton does, the running time is ~30 times larger.  I'd like an
> explanation about why there is such a huge difference and possible
> ways to reduce it...  Using an implementation close to C (Mutex,
> Condition) divides the time by 2 but still is about 6 times slower
> than C.  Is it to say a better implementation is possible for OCaml ???

Could you please tell us where to find the OCaml code you're
discussing?  I haven't seen it anywhere on the Web page you posted.

> P.S. BTW, these slowdowns especially happen with native code --
> bytecode (with -vmthreads) is immensely faster (of course this is a
> different implementation).

You've answered your own question: context switches with system
threads can be expensive depending on the characteristics of the
underlying POSIX threads implementation.  You're really testing
threading libraries, not language implementations.

My feelings at this point (without having seen the code nor the task
spec) is that you don't want to use threads at all, rather something
like CPS.

- Xavier Leroy


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-08 15:17     ` Xavier Leroy
@ 2006-02-08 16:58       ` Matthieu Dubuget
  2006-02-08 22:13       ` Christophe TROESTLER
  1 sibling, 0 replies; 10+ messages in thread
From: Matthieu Dubuget @ 2006-02-08 16:58 UTC (permalink / raw)
  To: OCaml Mailing List

Xavier Leroy a écrit :
>> My question is twofold.  When using the Event module in the same way
>> MLton does, the running time is ~30 times larger.  
I did this rewrite, mimicing MLton's version, and asked for improvements
on this list some times ago:
http://caml.inria.fr/pub/ml-archives/caml-list/2006/01/4eabc18ec6462a33b81c1b9969a6b18a.en.html

The attached code seems not retrievable from Caml-list official
archives. It's here:
http://shootout.alioth.debian.org/debian/benchmark.php?test=chameneos&lang=ocaml&id=0

>> I'd like an
>> explanation about why there is such a huge difference and possible
>> ways to reduce it...  Using an implementation close to C (Mutex,
>> Condition) divides the time by 2 but still is about 6 times slower
>> than C.  Is it to say a better implementation is possible for OCaml ???
>>     
>
> Could you please tell us where to find the OCaml code you're
> discussing?  I haven't seen it anywhere on the Web page you posted.
>
>   


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-08 15:17     ` Xavier Leroy
  2006-02-08 16:58       ` Matthieu Dubuget
@ 2006-02-08 22:13       ` Christophe TROESTLER
  2006-02-09 17:39         ` Alessandro Baretta
  1 sibling, 1 reply; 10+ messages in thread
From: Christophe TROESTLER @ 2006-02-08 22:13 UTC (permalink / raw)
  To: Xavier.Leroy; +Cc: caml-list

On Wed, 08 Feb 2006, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
> 
> Could you please tell us where to find the OCaml code you're
> discussing?  I haven't seen it anywhere on the Web page you posted.

Sorry.  You get the code by clicking on the language name:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all
The task description is at the bottom of the page.

The one similar to MLton is:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=ocaml&id=0

The one similar to C code:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=ocaml&id=3

> You're really testing threading libraries, not language implementations.

Point taken.  After a more careful investigation, MLton does not use
OS threads so it is not really a valid comparison.  The GCC one (which
uses pthreads and semaphores) still seems to hold -- maybe I
overlooked something.

> My feelings at this point (without having seen the code nor the task
> spec) is that you don't want to use threads at all, rather something
> like CPS.

Possibly.  I will investigate what I can do in that direction
(pointers are welcome :).

Thanks,
ChriS


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-08 22:13       ` Christophe TROESTLER
@ 2006-02-09 17:39         ` Alessandro Baretta
  2006-02-09 22:26           ` Matthew Hannigan
  0 siblings, 1 reply; 10+ messages in thread
From: Alessandro Baretta @ 2006-02-09 17:39 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1862 bytes --]

Christophe TROESTLER wrote:
> On Wed, 08 Feb 2006, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
> 
>>Could you please tell us where to find the OCaml code you're
>>discussing?  I haven't seen it anywhere on the Web page you posted.
> 
> 
> Sorry.  You get the code by clicking on the language name:
> http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all
> The task description is at the bottom of the page.
> 
> The one similar to MLton is:
> http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=ocaml&id=0

Wow! It's incredible to see just how slow an algorithm can become when threading 
is used. A vanilla userland scheduler can produce a three-order of magnitude 
speedup over a threading library. I'm attaching an implementation of chameneos 
which uses a batched queue à la Okasaki to schedule chameneos meetings. 
Obviously, this algorithms is absolutely deterministic, while the implementation 
using threads follows a random execution path, which is in general different at 
every execution of the program. Yet the program satifies the benchmark 
specification and is about 700 times faster than
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=ocaml&id=0

I have used this approach to write my Schopenhauer software PLC in Ocaml. This 
is one of the key design factors for achieving realtime performance without 
hogging the CPU.

Yet, what conclusion should I draw? Is the GNU/Debian/Linux-2.6 threading 
support creepingly slow, or does ocaml have an insurmountable aversion for threads?

Alex

-- 
*********************************************************************

Ing. Alessandro Baretta

Studio Baretta
http://studio.baretta.com/

Consulenza Tecnologica e Ingegneria Industriale
Technological Consulting and Industrial Engineering

tel. +39 02 370 111 55
fax. +39 02 370 111 54


[-- Attachment #2: chameneos_alex.ml --]
[-- Type: application/x-extension-ml, Size: 2065 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-09 17:39         ` Alessandro Baretta
@ 2006-02-09 22:26           ` Matthew Hannigan
  0 siblings, 0 replies; 10+ messages in thread
From: Matthew Hannigan @ 2006-02-09 22:26 UTC (permalink / raw)
  To: Alessandro Baretta; +Cc: Christophe TROESTLER, caml-list

On Thu, Feb 09, 2006 at 06:39:05PM +0100, Alessandro Baretta wrote:
> Yet, what conclusion should I draw? Is the GNU/Debian/Linux-2.6 threading 
> support creepingly slow, or does ocaml have an insurmountable aversion for 
> threads?

I believe Linux 2.6 threading 'NPTL' is one of the fastest of any OS around.
It is possible to use the back-compatible library though.  I presume
Debian have not forced this!

http://en.wikipedia.org/wiki/NPTL

Matt


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-07 23:15 How to write efficient threaded programs on OCaml Christophe TROESTLER
  2006-02-08  8:21 ` [Caml-list] " Erik de Castro Lopo
@ 2006-02-13 10:25 ` Florian Weimer
  2006-02-16 18:11   ` Christophe TROESTLER
  1 sibling, 1 reply; 10+ messages in thread
From: Florian Weimer @ 2006-02-13 10:25 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: OCaml Mailing List

* Christophe TROESTLER:

> http://shootout.alioth.debian.org/debian/benchmark.php?test=chameneos&lang=all
> A qprof profiling reveals that OCaml is spending 63-73% of its time on
> the function caml_process_pending_signals and 13-18% on
> pthread_cond_signal.

Could this be a measuring artifact?  It seems that all Caml code is
accounted for in caml_process_pending_signals.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] How to write efficient threaded programs on OCaml
  2006-02-13 10:25 ` Florian Weimer
@ 2006-02-16 18:11   ` Christophe TROESTLER
  0 siblings, 0 replies; 10+ messages in thread
From: Christophe TROESTLER @ 2006-02-16 18:11 UTC (permalink / raw)
  To: OCaml Mailing List

On Mon, 13 Feb 2006, Florian Weimer <fw@deneb.enyo.de> wrote:
> 
> * Christophe TROESTLER:
> 
> > http://shootout.alioth.debian.org/debian/benchmark.php?test=chameneos&lang=all
> > A qprof profiling reveals that OCaml is spending 63-73% of its time on
> > the function caml_process_pending_signals and 13-18% on
> > pthread_cond_signal.
> 
> Could this be a measuring artifact?  It seems that all Caml code is
> accounted for in caml_process_pending_signals.

Maybe.  This is why I asked on the maling list : to have the opinions
of people more knowledgeable than me on this subject.

Cheers,
ChriS


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2006-02-16 18:11 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-07 23:15 How to write efficient threaded programs on OCaml Christophe TROESTLER
2006-02-08  8:21 ` [Caml-list] " Erik de Castro Lopo
2006-02-08 14:38   ` Christophe TROESTLER
2006-02-08 15:17     ` Xavier Leroy
2006-02-08 16:58       ` Matthieu Dubuget
2006-02-08 22:13       ` Christophe TROESTLER
2006-02-09 17:39         ` Alessandro Baretta
2006-02-09 22:26           ` Matthew Hannigan
2006-02-13 10:25 ` Florian Weimer
2006-02-16 18:11   ` Christophe TROESTLER

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).