Avoiding channels and using either file descriptors or Bigarray instead works well in my case.

Good to know: when working with OCaml, be careful with channels ;)

Rémi

2009/2/17 Yaron Minsky <yminsky@gmail.com>
Interestingly, this probably has nothing to do with the size of the buffer.  input_char actually acquires and releases a lock for every single call, whether or not an underlying system call is required to fill the buffer.  This has always struck me as an odd aspect of the in/out channel implementation, and means that IO is a lot more expensive in a threaded context than it should be. 

At Jane Street, performance-sensitive code tends to use other libraries that we've built directly on top of file descriptors; they batch the IO and don't require constant lock acquisition.
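
To make the idea concrete, here is a minimal sketch of a character reader layered directly on a file descriptor. It is not Jane Street's library: the names `reader`, `make_reader` and `next_char` are hypothetical, and it assumes a modern OCaml with the `Bytes` module and the `unix` library linked. The point is only that the buffer is refilled by one `Unix.read` per chunk, so the per-character path touches no channel lock.

```ocaml
(* Hypothetical buffered reader over a raw file descriptor.
   Illustrative names only; not an existing library API. *)
type reader = {
  fd : Unix.file_descr;
  buf : Bytes.t;
  mutable pos : int;  (* index of the next unread byte in buf *)
  mutable len : int;  (* number of valid bytes currently in buf *)
}

let make_reader ?(size = 65536) fd =
  { fd; buf = Bytes.create size; pos = 0; len = 0 }

(* Return the next byte, or None at end of file.
   Unix.read is called only when the buffer has been fully consumed. *)
let next_char r =
  if r.pos >= r.len then begin
    r.len <- Unix.read r.fd r.buf 0 (Bytes.length r.buf);
    r.pos <- 0
  end;
  if r.len = 0 then None
  else begin
    let c = Bytes.get r.buf r.pos in
    r.pos <- r.pos + 1;
    Some c
  end
```

A parser written against `input_char` could call `next_char` instead and treat `None` as end of file. The record is not protected by any lock, so in a threaded program each reader should be used from one thread at a time.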

y


On Tue, Feb 17, 2009 at 5:07 AM, Sylvain Le Gall <sylvain@le-gall.net> wrote:
On 17-02-2009, Rémi Dewitte <remi@gide.net> wrote:
>
> test.csv is a 21 MB file with ~13k rows and about a thousand columns on a
> 15rpm disk.
>
> ocaml version : 3.11.0
>

You are using input_char and a standard IO channel. This is a good choice
for a non-threaded program. But in your case, I would use Unix.read with a
big buffer (32KB to 4MB) and change your program to use it. As
benchmarked by Jon Harrop, you are spending most of your time in
caml_enter_blocking_section and caml_leave_blocking_section. I think it
comes from reading through the std IO channel, which uses a 4k buffer.
Using a bigger buffer will mean fewer calls to these two functions (but
you won't win time in the end; I think you will just reduce the
difference between non-threaded and threaded code).
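
The suggestion above could look roughly like the following sketch. The helper names `fold_chunks` and `count_rows` and the 1 MB default are assumptions for illustration, not code from the original program; it also assumes OCaml 4.08 or later (for `Fun.protect`) with the `unix` library linked. Note that a single `Unix.read` may return fewer bytes than requested, so the loop relies only on the returned count.

```ocaml
(* Fold f over successive chunks of the file at path, read with Unix.read.
   f receives the accumulator, the buffer, and the number of valid bytes. *)
let fold_chunks ?(buf_size = 1 lsl 20) path f init =
  let fd = Unix.openfile path [Unix.O_RDONLY] 0 in
  let buf = Bytes.create buf_size in
  let rec loop acc =
    match Unix.read fd buf 0 buf_size with
    | 0 -> acc                      (* 0 bytes read means end of file *)
    | n -> loop (f acc buf n)
  in
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () -> loop init)

(* Example use: count newline bytes as a rough row count.
   Newlines inside quoted CSV fields would be counted too. *)
let count_rows path =
  fold_chunks path
    (fun acc buf n ->
      let c = ref acc in
      for i = 0 to n - 1 do
        if Bytes.get buf i = '\n' then incr c
      done;
      !c)
    0
```

Passing `~buf_size` makes it easy to compare timings across the 32KB to 4MB range mentioned above.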

Regards
Sylvain Le Gall

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

