Hello!

Many thanks to all for your answers!

Getting almost the same performance whether or not the environment is
multithreaded (even when this particular task does not use threads) is
something I would like to have anyway.

I'll give big buffers with Unix.read a try. Is there any example code around?
And then why not try iao!
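
For reference, here is a minimal sketch of what I understand the Unix.read
suggestion to be. The function name count_newlines, the 64KB buffer size and
the newline-counting task are only assumptions for illustration (they are not
from my actual program), and it assumes a recent OCaml with Bytes and
Fun.protect, linked against the unix library:

```ocaml
(* Hypothetical example: count '\n' bytes in [path], refilling a 64KB
   buffer with Unix.read instead of calling input_char once per byte. *)
let count_newlines path =
  let buf_size = 65536 in
  let buf = Bytes.create buf_size in
  let fd = Unix.openfile path [Unix.O_RDONLY] 0 in
  let count = ref 0 in
  let rec loop () =
    (* Unix.read returns the number of bytes read, 0 at end of file. *)
    let n = Unix.read fd buf 0 buf_size in
    if n > 0 then begin
      for i = 0 to n - 1 do
        if Bytes.get buf i = '\n' then incr count
      done;
      loop ()
    end
  in
  (* Close the descriptor even if reading raises. *)
  Fun.protect ~finally:(fun () -> Unix.close fd) loop;
  !count
```

The per-byte work then happens on a plain buffer, so the cost of the system
call and of any locking is paid once per block rather than once per character.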

Can the file be memory-mapped using Bigarray, or do I have to write C
code?
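
As far as I can tell, no C code should be needed. Below is a minimal sketch,
assuming a recent OCaml where the mapping function is Unix.map_file (older
releases exposed the same functionality as Bigarray.Array1.map_file). The
function name count_newlines_mmap and the newline-counting task are
hypothetical, chosen only for illustration:

```ocaml
(* Hypothetical example: count '\n' bytes in [path] through a read-only
   memory mapping exposed as a one-dimensional Bigarray of chars. *)
let count_newlines_mmap path =
  let fd = Unix.openfile path [Unix.O_RDONLY] 0 in
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () ->
    let size = (Unix.fstat fd).Unix.st_size in
    if size = 0 then 0 (* nothing to map in an empty file *)
    else begin
      let data =
        Bigarray.array1_of_genarray
          (* [false] requests a private (non-shared) mapping. *)
          (Unix.map_file fd Bigarray.char Bigarray.c_layout false [| size |])
      in
      let count = ref 0 in
      for i = 0 to Bigarray.Array1.dim data - 1 do
        if Bigarray.Array1.get data i = '\n' then incr count
      done;
      !count
    end)
```

Whether this is actually faster than reading into a big buffer probably depends
on the file size and the access pattern, so it would need measuring.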

Rémi

On Tue, Feb 17, 2009 at 11:26, Mark Shinwell <mshinwell@janestcapital.com> wrote:

> On Tue, Feb 17, 2009 at 10:07:05AM +0000, Sylvain Le Gall wrote:
> > On 17-02-2009, Rémi Dewitte <remi@gide.net> wrote:
> > You are using input_char and a standard I/O channel. This is a good choice
> > for a non-threaded program. But in your case, I would use Unix.read with a
> > big buffer (32KB to 4MB) and change your program to use it. As
> > benchmarked by Jon Harrop, you are spending most of your time in
> > caml_enter|leave_blocking_section.
>
> This isn't quite right actually -- the profile is deceiving.  It is true
> that there are a lot of calls to enter/leave_blocking_section, but you're
> actually being killed by the overhead of an independent locking strategy
> in the channel-based I/O calls.  I've measured this using some hackery
> with a hex editor.  When you call input_char, you acquire and then release
> another lock which is specific to these calls (the global runtime lock is
> often not released here).  This process isn't especially cheap, so it would
> be better to use one of the other channel calls to read data in larger
> blocks.
>
> Mark
>
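
If I read Mark's point correctly, staying with channels but reading in blocks
would look roughly like the sketch below. It uses the standard input function
on an in_channel; the function name count_newlines_channel, the 64KB block
size and the newline-counting task are assumptions for illustration only:

```ocaml
(* Hypothetical example: count '\n' bytes in [path] with the channel API,
   making one [input] call per block instead of one [input_char] call
   per byte, so the per-call overhead is paid far less often. *)
let count_newlines_channel path =
  let buf_size = 65536 in
  let buf = Bytes.create buf_size in
  let ic = open_in_bin path in
  let count = ref 0 in
  let rec loop () =
    (* [input] may return fewer bytes than requested; 0 means end of file. *)
    let n = input ic buf 0 buf_size in
    if n > 0 then begin
      for i = 0 to n - 1 do
        if Bytes.get buf i = '\n' then incr count
      done;
      loop ()
    end
  in
  Fun.protect ~finally:(fun () -> close_in ic) loop;
  !count
```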