caml-list - the Caml user's mailing list
From: Sylvain Le Gall <sylvain@le-gall.net>
To: caml-list@inria.fr
Subject: Re: Threads performance issue.
Date: Mon, 16 Feb 2009 16:32:58 +0000 (UTC)
Message-ID: <slrngpj59q.e8q.sylvain@gallu.homelinux.org>
In-Reply-To: <2184b2340902160715y1f935b5ehc0e6195b3f75b66b@mail.gmail.com>

Hello,

On 16-02-2009, Rémi Dewitte <remi@gide.net> wrote:
> Hello,
>
> I would like to read two files in two different threads.
>
> I have made a first version reading the first then the second and it takes
> 2.8s (native).
>
> I decided to make a threaded version and before any use of threads I realized
> that just linking to the threads library, without even using it, makes my
> first version of the program run in 12s!
>

There is a small function call to handle threads
(caml_(enter|leave)_blocking_section). I don't know how much it costs in
terms of performance, but I am under the impression that it costs more time
than you gain. These function calls can be found in many files throughout
the OCaml source distribution.

> I use pcre, extlib, csv libraries as well.
>

Some of these libraries may have a high cost for thread synchronisation on
global variables. You need to investigate.

> I guess it might come from the GC slowing things down here, doesn't it? Where
> can it come from otherwise? Is there a workaround or something I should
> know?

Maybe... You need to look at the external libraries and benchmark your own
code. This is not an easy task.
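
As a starting point for such a benchmark, here is a minimal sketch that
times one step of a program with Sys.time. The names time_it and
count_lines are hypothetical, and count_lines only stands in for the real
parsing code. Sys.time reports processor time, so this is an assumption
that the slowdown is CPU-bound; for wall-clock time something like
Unix.gettimeofday would be needed instead.

```ocaml
(* Hypothetical helper: run [f ()] and return its result together with
   the processor time (in seconds) it consumed, as reported by Sys.time. *)
let time_it f =
  let t0 = Sys.time () in
  let result = f () in
  let t1 = Sys.time () in
  (result, t1 -. t0)

(* Hypothetical workload standing in for the real code: count the lines
   of a text file using the standard channel functions. *)
let count_lines path =
  let ic = open_in path in
  let rec loop n =
    match input_line ic with
    | _ -> loop (n + 1)
    | exception End_of_file -> n
  in
  (* Close the channel even if reading fails. *)
  let n = try loop 0 with e -> close_in_noerr ic; raise e in
  close_in ic;
  n
```

Calling time_it (fun () -> count_lines "some_file.txt") once in a binary
linked without the threads library and once in a binary linked with it
should show whether the channel functions alone account for the
difference.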

>
> Can ocaml use multiple cores ?
>

As advertised in the OCaml documentation:

http://caml.inria.fr/pub/docs/manual-ocaml/manual038.html

The threads library is implemented by time-sharing on a single
processor. It will not take advantage of multi-processor machines. Using
this library will therefore never make programs run faster. However,
many programs are easier to write when structured as several
communicating processes.

One of the points is that the GC doesn't take advantage of multiple cores.
Current GCs that support this feature are slower than OCaml's
single-threaded GC...

> Do you have few pointers on libraries to make parallel I/Os ?
>

Since you are running a fairly recent Linux kernel, I recommend:
https://forge.ocamlcore.org/projects/libaio-ocaml/

which should allow you to use AIO (asynchronous IO in the kernel, see
"man aio_read").

Now on a more "ask-yourself" tone:

I have tried using threads to speed up IO on multiple cores (in C code).
It is really tricky to get something that actually runs faster. In fact,
for reading you gain no performance at all from threaded IO. I am still
asking myself why. I think it has something to do with the fact that
when you generate too many read requests, the OS begins to do
inefficient I/O seeks all over the disk (almost no effect on Linux, a
factor of 4 on Windows). As a matter of fact (for now), using non-threaded
Unix.read with a 16k buffer and threaded Unix.write with a 4M buffer is
the most efficient I/O scheme.
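
To illustrate the reading half of that scheme, here is a minimal sketch of
a sequential, non-threaded read loop with a 16k buffer. To stay free of
dependencies it uses the standard library's channel function input rather
than Unix.read; that swap is my assumption, and a loop around Unix.read on
a file descriptor would have the same shape. The names chunk_size and
read_all are hypothetical.

```ocaml
(* 16 KiB read buffer, as in the scheme described above. *)
let chunk_size = 16 * 1024

(* Hypothetical helper: read [path] sequentially in chunks of at most
   [chunk_size] bytes and return the total number of bytes read.
   Assumption: uses the stdlib [input] instead of Unix.read. *)
let read_all path =
  let ic = open_in_bin path in
  let buf = Bytes.create chunk_size in
  let rec loop total =
    (* [input] returns 0 only at end of file. *)
    let n = input ic buf 0 chunk_size in
    if n = 0 then total else loop (total + n)
  in
  (* Close the channel even if reading fails. *)
  let total = try loop 0 with e -> close_in_noerr ic; raise e in
  close_in ic;
  total
```

In a real program the body of the loop would process the first n bytes of
buf before reading the next chunk.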

All in all, I think you should not try to use threads to improve your
software's performance in OCaml; rely on low-level asynchronous IO (aio)
instead.

Regards,
Sylvain Le Gall


