caml-list - the Caml user's mailing list
* Threads performance issue.
@ 2009-02-16 15:15 Rémi Dewitte
  2009-02-16 15:28 ` [Caml-list] " Michał Maciejewski
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-16 15:15 UTC (permalink / raw)
  To: caml-list


Hello,

I would like to read two files in two different threads.

My first version reads the first file and then the second; it takes
2.8s (native).

I decided to make a threaded version, and before using any thread I realized
that merely linking against the threads library, without even using it, makes
my first version of the program run in 12s!

I use pcre, extlib, csv libraries as well.

I guess it might come from the GC slowing things down, doesn't it? Where
else could it come from? Is there a workaround or something I should
know?

Can OCaml use multiple cores?

Do you have a few pointers on libraries for doing parallel I/O?

Thanks,
Rémi


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Caml-list] Threads performance issue.
  2009-02-16 15:15 Threads performance issue Rémi Dewitte
@ 2009-02-16 15:28 ` Michał Maciejewski
  2009-02-16 15:32   ` Rémi Dewitte
  2009-02-16 16:32 ` Sylvain Le Gall
  2009-02-16 16:47 ` [Caml-list] " Yaron Minsky
  2 siblings, 1 reply; 21+ messages in thread
From: Michał Maciejewski @ 2009-02-16 15:28 UTC (permalink / raw)
  To: caml-list

Hi,

2009/2/16 Rémi Dewitte <remi@gide.net>:
> I guess it might come from GC slowing down thinks here, doesn't it ?
I don't think so. Why do you think it's GC?

> Can ocaml use multiple cores ?
No and as far as I know it's because of GC. ;-)

regards
Miichal


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-16 15:28 ` [Caml-list] " Michał Maciejewski
@ 2009-02-16 15:32   ` Rémi Dewitte
  2009-02-16 15:42     ` David Allsopp
  0 siblings, 1 reply; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-16 15:32 UTC (permalink / raw)
  To: Michał Maciejewski; +Cc: caml-list


On Mon, Feb 16, 2009 at 16:28, Michał Maciejewski <michal.m.pl@gmail.com> wrote:

> Hi,
>
> 2009/2/16 Rémi Dewitte <remi@gide.net>:
> > I guess it might come from GC slowing down thinks here, doesn't it ?
> I don't think so. Why do you think it's GC?
>
Bad guess :) !
Any hint why just linking makes things slow ?


>
> > Can ocaml use multiple cores ?
> No and as far as I know it's because of GC. ;-)
>
OK, that's a shame, but I will live with it :)


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [Caml-list] Threads performance issue.
  2009-02-16 15:32   ` Rémi Dewitte
@ 2009-02-16 15:42     ` David Allsopp
  2009-02-16 16:07       ` Rémi Dewitte
  0 siblings, 1 reply; 21+ messages in thread
From: David Allsopp @ 2009-02-16 15:42 UTC (permalink / raw)
  To: 'Rémi Dewitte', 'Michał Maciejewski'; +Cc: caml-list


Which OS (and port, if applicable) are you using?

 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-16 15:42     ` David Allsopp
@ 2009-02-16 16:07       ` Rémi Dewitte
  0 siblings, 0 replies; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-16 16:07 UTC (permalink / raw)
  To: David Allsopp; +Cc: Michał Maciejewski, caml-list


Ubuntu 8.10, kernel 2.6.27-11-generic
Ocaml 3.10.2 shipped with the distrib

Thanks a lot,
Rémi


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Threads performance issue.
  2009-02-16 15:15 Threads performance issue Rémi Dewitte
  2009-02-16 15:28 ` [Caml-list] " Michał Maciejewski
@ 2009-02-16 16:32 ` Sylvain Le Gall
  2009-02-17 13:52   ` [Caml-list] " Frédéric Gava
  2009-02-16 16:47 ` [Caml-list] " Yaron Minsky
  2 siblings, 1 reply; 21+ messages in thread
From: Sylvain Le Gall @ 2009-02-16 16:32 UTC (permalink / raw)
  To: caml-list

Hello,

On 16-02-2009, Rémi Dewitte <remi@gide.net> wrote:
> Hello,
>
> I would like to read two files in two different threads.
>
> I have made a first version reading the first then the second and it takes
> 2.8s (native).
>
> I decided to make a threaded version and before any use of thread I realized
> that just linking no even using it to the threads library makes my first
> version of the program to run in 12s !
>

The runtime makes small function calls to handle threads
(caml_(enter|leave)_blocking_section). I don't know how much they cost in
terms of performance, but I am under the impression that they cost more time
than you gain. These function calls can be found in many files all around
the OCaml source distribution...

> I use pcre, extlib, csv libraries as well.
>

Some of these libraries can have a high cost for thread synchronisation on
global variables. You need to investigate.

> I guess it might come from GC slowing down thinks here, doesn't it ? Where
> can it come from otherwise ? Is there a workaround or something I should
> know ?

Maybe... You need to look at the external libraries and benchmark your own
code... This is not an easy task.

>
> Can ocaml use multiple cores ?
>

As advertised in the OCaml documentation:

http://caml.inria.fr/pub/docs/manual-ocaml/manual038.html

The threads library is implemented by time-sharing on a single
processor. It will not take advantage of multi-processor machines. Using
this library will therefore never make programs run faster. However,
many programs are easier to write when structured as several
communicating processes.

One reason is that the GC doesn't take advantage of multiple cores.
Current GCs that support this feature are slower than OCaml's
single-threaded GC...

> Do you have few pointers on libraries to make parallel I/Os ?
>

Since you are running a fairly recent Linux kernel, I recommend you:
https://forge.ocamlcore.org/projects/libaio-ocaml/

which should allow you to use AIO (asynchronous IO in the kernel, see
"man aio_read").

Now, in a more "ask-yourself" tone:

I have tried using threads to speed up I/O on multiple cores (in C code).
It is really tricky to get something that actually works faster. In fact,
for reading you gain no performance at all from threaded I/O. I
am still asking myself why. I think it has something to do with the fact
that when you generate too many read requests, the OS begins to do
inefficient I/O seeks all around (almost no effect on Linux, a factor of 4 on
Windows). As a matter of fact (for now), using non-threaded Unix.read
with a 16k buffer and threaded Unix.write with a 4M buffer is the most
efficient I/O scheme.
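
As a rough illustration of the buffered-read idea, here is a minimal sketch
that reads a file in 16 KiB blocks through the standard library's `input`
rather than calling `input_char` once per byte. It is not code from this
thread: the name `count_newlines_blockwise` is hypothetical, it uses Stdlib
channels instead of Unix.read, and it assumes a modern OCaml (4.02 or later)
where `input` fills a `bytes` buffer. Whether this recovers the time lost in
the thread-linked build is an assumption that would need measuring.

```ocaml
(* Count newline characters in a file, reading it in 16 KiB blocks
   instead of calling input_char once per byte. *)
let count_newlines_blockwise path =
  let buf_size = 16 * 1024 in            (* block size: an assumption *)
  let buf = Bytes.create buf_size in
  let ic = open_in_bin path in
  let rec loop count =
    let n = input ic buf 0 buf_size in   (* returns 0 at end of file *)
    if n = 0 then count
    else begin
      let c = ref count in
      for i = 0 to n - 1 do
        if Bytes.get buf i = '\n' then incr c
      done;
      loop !c
    end
  in
  let result =
    try loop 0 with e -> close_in_noerr ic; raise e
  in
  close_in ic;
  result
```

The same loop shape could feed each block to a CSV state machine, so the
per-character work happens on an in-memory buffer rather than on the channel.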

All in all, I think you should not try to use threads to improve your
software's performance in OCaml, or else rely on low-level asynchronous I/O
(aio).

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-16 15:15 Threads performance issue Rémi Dewitte
  2009-02-16 15:28 ` [Caml-list] " Michał Maciejewski
  2009-02-16 16:32 ` Sylvain Le Gall
@ 2009-02-16 16:47 ` Yaron Minsky
  2009-02-16 17:37   ` Rémi Dewitte
  2 siblings, 1 reply; 21+ messages in thread
From: Yaron Minsky @ 2009-02-16 16:47 UTC (permalink / raw)
  To: Rémi Dewitte; +Cc: caml-list


2009/2/16 Rémi Dewitte <remi@gide.net>

> Hello,
>
> I would like to read two files in two different threads.
>
> I have made a first version reading the first then the second and it takes
> 2.8s (native).
>
> I decided to make a threaded version and before any use of thread I
> realized that just linking no even using it to the threads library makes my
> first version of the program to run in 12s !


Do you have a short benchmark you can post?  The idea that the
thread-overhead would make a difference like that, particularly for IO-bound
code (which I'm guessing this is) is pretty surprising.

y



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-16 16:47 ` [Caml-list] " Yaron Minsky
@ 2009-02-16 17:37   ` Rémi Dewitte
  2009-02-17  7:40     ` Rémi Dewitte
  0 siblings, 1 reply; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-16 17:37 UTC (permalink / raw)
  To: yminsky; +Cc: caml-list


Yaron,

I use a slightly modified version of the CSV library's load_rows. Here is
the main code, which is written in a highly imperative style. Should I perhaps
rewrite it in a purely functional style?

The main program is :

open Printf;;
open Sys;;
let timed_exec start_message f =
  print_string start_message;
  let st1 = time () in
  let r = f () in
  print_endline ("done in " ^ (string_of_float ((time ()) -. st1)) );
  r;;

(* This line enabled makes the program really slow ! *)
let run_threaded f = Thread.create (fun () -> f (); Thread.exit ()) ()

let () = timed_exec "Reading data " (fun () ->
  load_rows (fun _ -> ()) (open_in "file1.csv");
  load_rows (fun _ -> ()) (open_in "file2.csv");
  ()
)

The load_rows :
let load_rows ?(separator = ',') ?(nread = -1) f chan =
  let nr = ref 0 in
  let row = ref [] in            (* Current row. *)
  let field = ref [] in            (* Current field. *)
  let state = ref StartField in        (* Current state. *)
  let end_of_field () =
    let field_list = List.rev !field in
    let field_len = List.length field_list in
    let field_str = String.create field_len in
    let rec loop i = function
    [] -> ()
      | x :: xs ->
      field_str.[i] <- x;
      loop (i+1) xs
    in
    loop 0 field_list;
    row := (Some field_str) :: !row;
    field := [];
    state := StartField
  in
  let empty_field () =
    row := None :: !row;
    field := [];
    state := StartField
  in
  let end_of_row () =
    let row_list = List.rev !row in
    f row_list;
    row := [];
    state := StartField;
    nr := !nr + 1;
  in
  let rec loop () =
    let c = input_char chan in
    if c <> '\r' then (            (* Always ignore \r characters. *)
      match !state with
      StartField ->            (* Expecting quote or other char. *)
        if c = '"' then (
          state := InQuotedField;
          field := []
        ) else if c = separator then (* Empty field. *)
          empty_field ()
        else if c = '\n' then (    (* Empty field, end of row. *)
          empty_field ();
          end_of_row ()
        ) else (
          state := InUnquotedField;
          field := [c]
        )
    | InUnquotedField ->        (* Reading chars to end of field. *)
        if c = separator then    (* End of field. *)
          end_of_field ()
        else if c = '\n' then (    (* End of field and end of row. *)
          end_of_field ();
          end_of_row ()
        ) else
          field := c :: !field
    | InQuotedField ->        (* Reading chars to end of field. *)
        if c = '"' then
          state := InQuotedFieldAfterQuote
        else
          field := c :: !field
    | InQuotedFieldAfterQuote ->
        if c = '"' then (        (* Doubled quote. *)
          field := c :: !field;
          state := InQuotedField
        ) else if c = '0' then (    (* Quote-0 is ASCII NUL. *)
          field := '\000' :: !field;
          state := InQuotedField
        ) else if c = separator then (* End of field. *)
          end_of_field ()
        else if c = '\n' then (    (* End of field and end of row. *)
          end_of_field ();
          end_of_row ()
        ) else (            (* Bad single quote in field. *)
          field := c :: '"' :: !field;
          state := InQuotedField
        )
    ); (* end of match *)
  if nread < 0 || !nr < nread then loop () else ()
  in
  try
    loop ()
  with
      End_of_file ->
    (* Any part left to write out? *)
    (match !state with
         StartField ->
           if !row <> [] then
         ( empty_field (); end_of_row () )
       | InUnquotedField | InQuotedFieldAfterQuote ->
           end_of_field (); end_of_row ()
       | InQuotedField ->
           raise (Bad_CSV_file "Missing end quote after quoted field.")
    )


Thanks,
Rémi


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-16 17:37   ` Rémi Dewitte
@ 2009-02-17  7:40     ` Rémi Dewitte
  2009-02-17  8:59       ` Mark Shinwell
  2009-02-17 10:07       ` Sylvain Le Gall
  0 siblings, 2 replies; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-17  7:40 UTC (permalink / raw)
  To: yminsky; +Cc: caml-list



I have made some further experiments.
I now have a functional version of the reading algorithm alongside the original
imperative version.
Each is either linked with threads (T) or not (X), and either uses extlib (E) or
not (X).

The results are:

              XX     TX     XE     TE
Imperative | 3.37 | 7.80 | 3.56 | 8.40
Functional | 4.20 | 8.28 | 4.47 | 9.08

test.csv is a 21 MB file with ~13k rows and about a thousand columns on a 15rpm
disk.

ocaml version : 3.11.0

uname -a gives
Linux localhost 2.6.28.4-server-1mnb #1 SMP Mon Feb 9 09:05:19 EST 2009 i686
Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz GNU/Linux

While I think I still need to improve the functional version, I struggle to
find a rationale for such a large loss of performance when I am not even
using threads, just linking against the library...
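
To put numbers on that loss, here is a small sketch that computes the
slowdown factor for each configuration from the timings in the table. The
labels and the `slowdown` helper are only illustrative names, and the values
are presumably in seconds, which the table does not state.

```ocaml
(* Timings copied from the results table:
   (label, not linked with threads, linked with threads). *)
let timings = [
  ("Imperative, no extlib", 3.37, 7.80);
  ("Functional, no extlib", 4.20, 8.28);
  ("Imperative, extlib",    3.56, 8.40);
  ("Functional, extlib",    4.47, 9.08);
]

(* Ratio of the thread-linked time to the plain time. *)
let slowdown (_, plain, threaded) = threaded /. plain

let () =
  List.iter
    (fun ((label, _, _) as row) ->
       Printf.printf "%-22s x%.2f\n" label (slowdown row))
    timings
```

Every configuration comes out at roughly 2x to 2.4x slower once the threads
library is linked, and extlib makes little difference either way.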

Cheers,
Rémi


[-- Attachment #2: transf.ml --]
[-- Type: application/octet-stream, Size: 3457 bytes --]

(* open ExtLib *)

(** Slightly modified copy from module CSV *)
exception Bad_CSV_file of string

type state_t = StartField
	       | InUnquotedField
	       | InQuotedField
	       | InQuotedFieldAfterQuote


let load_rows ?(separator = ',') ?(nread = -1) f chan =
  let entry = (StartField,[],[],0) in
(*   let st = ref entry in *)
  let string_of_field field =
    let field_list = List.rev field in
    let field_len = List.length field_list in
    let field_str = String.create field_len in
    let rec loop i = function
	[] -> ()
      | x :: xs ->
	  field_str.[i] <- x;
	  loop (i+1) xs
    in
    loop 0 field_list;
    field_str
  in
  let end_of_field field row nr =
    let sf = (string_of_field field) in
    (StartField,[],(Some sf :: row),nr)
  in
  let empty_field row nr =
    (StartField,[],None :: row,nr)
  in
  let end_of_row row nr =
    f (List.rev row);
    (StartField,[],[],nr + 1)
  in
  let empty_field_and_end_row row nr =
    end_of_row (None :: row) nr
  in
  let end_of_field_and_row field row nr =
    let sf = (string_of_field field) in
    end_of_row (Some sf :: row) nr
  in
  let rec read_char chan = try
     let c = input_char chan in (if c <> '\r' then Some c else read_char chan)
    with End_of_file -> None
  in

  let rec loop st =
   let (state,field,row,nr) = st in
   if(nr = nread) then
     ()
   else (
   let co = read_char chan in
   match co with
   | None ->
	(match state with
	   | StartField -> if (row <> []) then (let _ = empty_field_and_end_row row nr in ()) else ()
	   | InUnquotedField | InQuotedFieldAfterQuote ->
	       (* Flush the pending field, as the imperative version does. *)
	       let _ = end_of_field_and_row field row nr in ()
	   | InQuotedField ->
	       raise (Bad_CSV_file "Missing end quote after quoted field.")
	)
   | Some c ->
      (let stn = (match state with
	  StartField ->			(* Expecting quote or other char. *)
	    if c = '"' then (
	      (InQuotedField,[],row,nr)
	    ) else if c = separator then (* Empty field. *)
	      empty_field row nr
	    else if c = '\n' then (	(* Empty field, end of row. *)
	      empty_field_and_end_row row nr
	    ) else (
	      (InUnquotedField,[c],row,nr)
	    )
	| InUnquotedField ->		(* Reading chars to end of field. *)
	    if c = separator then	(* End of field. *)
	      end_of_field field row nr
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field_and_row field row nr
	    ) else
              (state,c :: field,row,nr)
	| InQuotedField ->		(* Reading chars to end of field. *)
	    if c = '"' then
	      (InQuotedFieldAfterQuote,field,row,nr)
	    else
	      (state,c::field,row,nr)
	| InQuotedFieldAfterQuote ->
	    if c = '"' then (		(* Doubled quote. *)
	      (InQuotedField,c::field,row,nr)
	    ) else if c = '0' then (	(* Quote-0 is ASCII NUL. *)
	      (InQuotedField,'\000' :: field,row,nr)
	    ) else if c = separator then (* End of field. *)
	      end_of_field field row nr
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field_and_row field row nr
	    ) else (			(* Bad single quote in field. *)
	      (InQuotedField,c :: '"' :: field,row,nr)
	    )
        ) in loop stn )
   )
   in
   loop entry

(* let run_threaded f = Thread.create (fun () -> f (); Thread.exit ()) () *)

let () = let i = open_in "test.csv" in load_rows (fun _ -> ()) i; close_in i
let () = let i = open_in "test.csv" in load_rows (fun _ -> ()) i; close_in i
let () = let i = open_in "test.csv" in load_rows (fun _ -> ()) i; close_in i

[-- Attachment #3: transi.ml --]
[-- Type: application/octet-stream, Size: 3311 bytes --]

(* open ExtLib *)

(** Slightly modified copy from module CSV *)
exception Bad_CSV_file of string

type state_t = StartField
	       | InUnquotedField
	       | InQuotedField
	       | InQuotedFieldAfterQuote

let load_rows ?(separator = ',') ?(nread = -1) f chan =
  let nr = ref 0 in
  let row = ref [] in			(* Current row. *)
  let field = ref [] in			(* Current field. *)
  let state = ref StartField in		(* Current state. *)
  let end_of_field () =
    let field_list = List.rev !field in
    let field_len = List.length field_list in
    let field_str = String.create field_len in
    let rec loop i = function
	[] -> ()
      | x :: xs ->
	  field_str.[i] <- x;
	  loop (i+1) xs
    in
    loop 0 field_list;
    row := (Some field_str) :: !row;
    field := [];
    state := StartField
  in
  let empty_field () =
    row := None :: !row;
    field := [];
    state := StartField
  in
  let end_of_row () =
    let row_list = List.rev !row in
    f row_list;
    row := [];
    state := StartField;
    nr := !nr + 1;
  in
  let rec loop () =
    let c = input_char chan in
    if c <> '\r' then (			(* Always ignore \r characters. *)
      match !state with
	  StartField ->			(* Expecting quote or other char. *)
	    if c = '"' then (
	      state := InQuotedField;
	      field := []
	    ) else if c = separator then (* Empty field. *)
	      empty_field ()
	    else if c = '\n' then (	(* Empty field, end of row. *)
	      empty_field ();
	      end_of_row ()
	    ) else (
	      state := InUnquotedField;
	      field := [c]
	    )
	| InUnquotedField ->		(* Reading chars to end of field. *)
	    if c = separator then	(* End of field. *)
	      end_of_field ()
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field ();
	      end_of_row ()
	    ) else
	      field := c :: !field
	| InQuotedField ->		(* Reading chars to end of field. *)
	    if c = '"' then
	      state := InQuotedFieldAfterQuote
	    else
	      field := c :: !field
	| InQuotedFieldAfterQuote ->
	    if c = '"' then (		(* Doubled quote. *)
	      field := c :: !field;
	      state := InQuotedField
	    ) else if c = '0' then (	(* Quote-0 is ASCII NUL. *)
	      field := '\000' :: !field;
	      state := InQuotedField
	    ) else if c = separator then (* End of field. *)
	      end_of_field ()
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field ();
	      end_of_row ()
	    ) else (			(* Bad single quote in field. *)
	      field := c :: '"' :: !field;
	      state := InQuotedField
	    )
    ); (* end of match *)
  if nread < 0 || !nr < nread then loop () else ()
  in
  try
    loop ()
  with
      End_of_file ->
	(* Any part left to write out? *)
	(match !state with
	     StartField ->
	       if !row <> [] then
		 ( empty_field (); end_of_row () )
	   | InUnquotedField | InQuotedFieldAfterQuote ->
	       end_of_field (); end_of_row ()
	   | InQuotedField ->
	       raise (Bad_CSV_file "Missing end quote after quoted field.")
	)

(* let run_threaded f = Thread.create (fun () -> f (); Thread.exit ()) () *)

let () = let i = open_in "test.csv" in load_rows (fun _ -> ()) i; close_in i
let () = let i = open_in "test.csv" in load_rows (fun _ -> ()) i; close_in i
let () = let i = open_in "test.csv" in load_rows (fun _ -> ()) i; close_in i
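
Both attached versions build each field as a reversed char list and then copy
it into a string. As a hedged alternative, the sketch below accumulates
characters in a `Buffer` instead. It is a simplified illustration, not the
thread's load_rows: the names `split_line` and `state` are hypothetical, it
works on a single in-memory line rather than a channel, it represents empty
fields as "" rather than None, and it omits the quote-0 NUL escape.

```ocaml
(* Split one CSV line into fields, accumulating characters in a Buffer
   instead of a reversed char list. Handles quoted fields and doubled
   quotes. *)
type state = Start | Unquoted | Quoted | AfterQuote

let split_line ?(separator = ',') line =
  let buf = Buffer.create 64 in
  let fields = ref [] in
  let state = ref Start in
  let end_of_field () =
    fields := Buffer.contents buf :: !fields;
    Buffer.clear buf;
    state := Start
  in
  String.iter
    (fun c ->
       match !state with
       | Start ->
         if c = '"' then state := Quoted
         else if c = separator then end_of_field ()
         else begin Buffer.add_char buf c; state := Unquoted end
       | Unquoted ->
         if c = separator then end_of_field ()
         else Buffer.add_char buf c
       | Quoted ->
         if c = '"' then state := AfterQuote
         else Buffer.add_char buf c
       | AfterQuote ->
         if c = '"' then begin Buffer.add_char buf c; state := Quoted end
         else if c = separator then end_of_field ()
         else begin
           (* Stray quote inside a quoted field: keep it literally. *)
           Buffer.add_char buf '"';
           Buffer.add_char buf c;
           state := Quoted
         end)
    line;
  end_of_field ();                       (* flush the last field *)
  List.rev !fields
```

Reusing one buffer avoids allocating a list cell per character, which may
reduce GC pressure; whether it changes the timings discussed in this thread
has not been measured here.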

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-17  7:40     ` Rémi Dewitte
@ 2009-02-17  8:59       ` Mark Shinwell
  2009-02-17  9:09         ` Rémi Dewitte
  2009-02-17  9:53         ` Jon Harrop
  2009-02-17 10:07       ` Sylvain Le Gall
  1 sibling, 2 replies; 21+ messages in thread
From: Mark Shinwell @ 2009-02-17  8:59 UTC (permalink / raw)
  To: Rémi Dewitte; +Cc: caml-list

On Tue, Feb 17, 2009 at 08:40:11AM +0100, Rémi Dewitte wrote:
> I have made some further experiments.
> I have a functional version of the reading algorithm. I have the original
> imperative version of the algorithm.
> Either it is linked to thread (T) or not (X). Either it uses extlib (E) or
> not (X).

Using:

ocamlopt -o foo transf.ml
ocamlopt -thread -o foothread transf.ml unix.cmxa threads.cmxa

on local disk with a 24 MB, approx. 500,000-line CSV file, I only see a
minor slowdown with foothread as opposed to foo.  (A minor slowdown would
indeed be expected.)  So I'm confused as to why your results are so
different.

You could use ocamlopt -p and run gprof on the resulting gmon.out file.

Mark


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-17  8:59       ` Mark Shinwell
@ 2009-02-17  9:09         ` Rémi Dewitte
  2009-02-17  9:53         ` Jon Harrop
  1 sibling, 0 replies; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-17  9:09 UTC (permalink / raw)
  To: Mark Shinwell; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1025 bytes --]

I think you need to uncomment line 107, the one with the Thread calls, so
that the program is actually linked against the threads library; then you
should see the difference!

I will try profiling!

Rémi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Threads performance issue.
  2009-02-17  8:59       ` Mark Shinwell
  2009-02-17  9:09         ` Rémi Dewitte
@ 2009-02-17  9:53         ` Jon Harrop
  1 sibling, 0 replies; 21+ messages in thread
From: Jon Harrop @ 2009-02-17  9:53 UTC (permalink / raw)
  To: caml-list

On Tuesday 17 February 2009 08:59:44 Mark Shinwell wrote:
> On Tue, Feb 17, 2009 at 08:40:11AM +0100, Rémi Dewitte wrote:
> > I have made some further experiments.
> > I have a functional version of the reading algorithm. I have the original
> > imperative version of the algorithm.
> > Either it is linked to thread (T) or not (X). Either it uses extlib (E)
> > or not (X).
>
> Using:
>
> ocamlopt -o foo transf.ml
> ocamlopt -thread -o foothread transf.ml unix.cmxa threads.cmxa
>
> on local disk with a 24 MB, approx. 500,000-line CSV file, I only see a
> minor slowdown with foothread as opposed to foo.  (A minor slowdown would
> indeed be expected.)  So I'm confused as to why your results are so
> different.

If you uncomment Rémi's line of code that requires threading (or factor it out 
into a different file and compile that as well when using threading) then you 
should be able to reproduce his results as I have.

> You could use ocamlopt -p and run gprof on the resulting gmon.out file.

Indeed the profiling results are most enlightening. Without threads everything 
looks great:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 38.89      0.07     0.07        3    23.33    60.00  camlTransf__loop_109
 22.22      0.11     0.04  2873940     0.00     0.00  caml_ml_input_char
 11.11      0.13     0.02  2873940     0.00     0.00  camlTransf__read_char_106
 11.11      0.15     0.02   381696     0.00     0.00  camlTransf__code_begin
...

With threads, everything looks awful:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 22.22      0.08     0.08                             caml_thread_leave_blocking_section
 16.67      0.14     0.06                             caml_thread_enter_blocking_section
 13.89      0.19     0.05        3    16.67    50.00  camlTransf__loop_109
  9.72      0.23     0.04                             caml_io_mutex_unlock
  5.56      0.24     0.02  3255698     0.00     0.00  caml_c_call
...

So most of the time is spent locking and unlocking OCaml's good old giant 
global lock. Unfortunately the blocking calls are "spontaneous" so you cannot 
track them down using a profile but, given the huge rise in time spent in 
read_char, my guess is that OCaml has introduced an OS kernel lock around 
every single byte read! If so, I'm surprised it is running *this* fast...

Rémi: assuming I am correct, I'd recommend using memory mapping (and rewriting 
that code ;-).

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Threads performance issue.
  2009-02-17  7:40     ` Rémi Dewitte
  2009-02-17  8:59       ` Mark Shinwell
@ 2009-02-17 10:07       ` Sylvain Le Gall
  2009-02-17 10:26         ` [Caml-list] " Mark Shinwell
  2009-02-17 12:20         ` Yaron Minsky
  1 sibling, 2 replies; 21+ messages in thread
From: Sylvain Le Gall @ 2009-02-17 10:07 UTC (permalink / raw)
  To: caml-list

On 17-02-2009, Rémi Dewitte <remi@gide.net> wrote:
>
> test.csv is a 21 MB file with ~13k rows and about a thousand columns on a
> 15rpm disk.
>
> ocaml version : 3.11.0
>

You are using input_char and a standard I/O channel. This is a good choice
for a non-threaded program. But in your case, I would use Unix.read with a
big buffer (32KB to 4MB) and change your program to use it. As
benchmarked by Jon Harrop, you are spending most of your time in
caml_enter/leave_blocking_section. I think this comes from reading through
the standard I/O channel, which uses a 4k buffer. Using a bigger buffer will
result in fewer calls to these two functions (but you won't gain time in the
end; I think you will just reduce the difference between non-threaded and
threaded code).

Regards
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Re: Threads performance issue.
  2009-02-17 10:07       ` Sylvain Le Gall
@ 2009-02-17 10:26         ` Mark Shinwell
  2009-02-17 10:50           ` Rémi Dewitte
  2009-02-17 12:20         ` Yaron Minsky
  1 sibling, 1 reply; 21+ messages in thread
From: Mark Shinwell @ 2009-02-17 10:26 UTC (permalink / raw)
  To: remi; +Cc: caml-list

On Tue, Feb 17, 2009 at 10:07:05AM +0000, Sylvain Le Gall wrote:
> On 17-02-2009, Rémi Dewitte <remi@gide.net> wrote:
> You are using input_char and a standard I/O channel. This is a good choice
> for a non-threaded program. But in your case, I would use Unix.read with a
> big buffer (32KB to 4MB) and change your program to use it. As
> benchmarked by Jon Harrop, you are spending most of your time in
> caml_enter/leave_blocking_section.

This isn't quite right actually -- the profile is deceiving.  It is true
that there are a lot of calls to enter/leave_blocking_section, but you're
actually being killed by the overhead of an independent locking strategy
in the channel-based I/O calls.  I've measured this using some hackery
with a hex editor.  When you call input_char, you acquire and then release
another lock which is specific to these calls (the global runtime lock is
often not released here).  This process isn't especially cheap, so it would
be better to use one of the other channel calls to read data in larger blocks.
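
A minimal sketch of reading through a channel in larger blocks, as suggested
above. It assumes a recent OCaml (Bytes, Fun.protect); iter_chars, chunk_size
and count_newlines are illustrative names that do not appear in the thread.
Each call to input should pay the channel's per-call locking cost once per
block rather than once per character:

```ocaml
(* Feed every byte of [ic] to [process], one block at a time.
   [iter_chars] and [chunk_size] are hypothetical names for this sketch. *)
let iter_chars ?(chunk_size = 65536) (process : char -> unit)
    (ic : in_channel) : unit =
  let buf = Bytes.create chunk_size in
  let rec loop () =
    (* [input] returns the number of bytes read, and 0 only at end of file. *)
    let n = input ic buf 0 chunk_size in
    if n > 0 then begin
      (* Only indices 0 .. n-1 hold data from this call. *)
      for i = 0 to n - 1 do
        process (Bytes.get buf i)
      done;
      loop ()
    end
  in
  loop ()

(* Usage example: count the newlines in a file. *)
let count_newlines (path : string) : int =
  let ic = open_in_bin path in
  let count = ref 0 in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () -> iter_chars (fun c -> if c = '\n' then incr count) ic);
  !count
```

The per-character state machine of the parser can stay unchanged; only the
source of characters moves from input_char to the in-memory block.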

Mark


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Re: Threads performance issue.
  2009-02-17 10:26         ` [Caml-list] " Mark Shinwell
@ 2009-02-17 10:50           ` Rémi Dewitte
  2009-02-17 10:56             ` Mark Shinwell
  2009-02-17 11:33             ` Jon Harrop
  0 siblings, 2 replies; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-17 10:50 UTC (permalink / raw)
  To: Mark Shinwell; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1621 bytes --]

Hello !

Many thanks all for your answers !

Having almost the same performance whether or not the program runs in a
multithreaded environment (even when this particular task does not use
threads) is something I would like to have anyway.

I'll give big buffers with Unix.read a try. Is there any example code around?
And then why not try aio!

Can memory mapping of the file be done using Bigarray, or do I have to
write C code?

Rémi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Re: Threads performance issue.
  2009-02-17 10:50           ` Rémi Dewitte
@ 2009-02-17 10:56             ` Mark Shinwell
  2009-02-17 11:33             ` Jon Harrop
  1 sibling, 0 replies; 21+ messages in thread
From: Mark Shinwell @ 2009-02-17 10:56 UTC (permalink / raw)
  To: Rémi Dewitte; +Cc: caml-list

On Tue, Feb 17, 2009 at 11:50:44AM +0100, Rémi Dewitte wrote:
> Hello !
> 
> Many thanks all for your answers !
> 
> Having almost the same performance whether or not the program runs in a
> multithreaded environment (even when this particular task does not use
> threads) is something I would like to have anyway.
> 
> I'll give big buffers with Unix.read a try. Is there any example code around?
> And then why not try aio!

Try something straightforward to start with like reading the file just
one line at a time?  Especially if the lines are long this should give a
good improvement.
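
A minimal sketch of the line-at-a-time approach, assuming OCaml 4.02 or later
for the "exception" match case; iter_lines and iter_chars_by_line are
illustrative names, not code from the thread. Because input_line strips the
line terminator, the sketch feeds a '\n' back to the character-level parser
after each line, which also means a final line without a trailing newline
gets one added:

```ocaml
(* Call [f] on every line of [ic]; one channel call per line instead of
   one per character. *)
let iter_lines (f : string -> unit) (ic : in_channel) : unit =
  let rec loop () =
    match input_line ic with
    | line -> f line; loop ()
    | exception End_of_file -> ()
  in
  loop ()

(* Adapter for a per-character state machine such as the CSV parser:
   replay each line's characters, then the newline that input_line removed. *)
let iter_chars_by_line (process : char -> unit) (ic : in_channel) : unit =
  iter_lines (fun line -> String.iter process line; process '\n') ic
```

Since the state machine still sees every '\n', quoted fields that span
several lines should keep working with this adapter.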

Mark


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Re: Threads performance issue.
  2009-02-17 10:50           ` Rémi Dewitte
  2009-02-17 10:56             ` Mark Shinwell
@ 2009-02-17 11:33             ` Jon Harrop
  1 sibling, 0 replies; 21+ messages in thread
From: Jon Harrop @ 2009-02-17 11:33 UTC (permalink / raw)
  To: caml-list

On Tuesday 17 February 2009 10:50:44 Rémi Dewitte wrote:
> Can memory mapping of the file be done using Bigarray, or do I have to
> write C code?

You can memory map files very easily entirely from within OCaml.

This was actually covered in the OCaml Journal article about OpenGL 2, which 
used file mapping as an easy way to load texture maps. First, you open the 
file to create a file descriptor:

  try_finally (Unix.openfile file [Unix.O_RDONLY] 0)

Then you map the file to create a big array:

    (fun desc ->
       let source = Array1.map_file desc int8_unsigned c_layout false (-1) in

Then you can do something with the big array, like copy it into an ordinary 
string:

       String.init (Array1.dim source) (fun i -> Char.chr source.{i}))

Finally, you close the file:

    Unix.close

Note that I have used try_finally and String.init functions from my own stdlib 
but their purpose and use should be obvious.

You probably just want to replace read_char with a function that increments a 
counter and reads from the array, with the whole parser inside the 
try_finally.
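
A minimal sketch of such a counter-based replacement for read_char, assuming
a recent OCaml where Bigarray is part of the standard library. make_read_char
and byte_array are illustrative names. In practice source would come from
mapping the file (Array1.map_file in the OCaml of this thread; newer releases
expose the same operation as Unix.map_file); here it can be any char Bigarray:

```ocaml
open Bigarray

(* A one-dimensional array of bytes, e.g. a memory-mapped file. *)
type byte_array = (char, int8_unsigned_elt, c_layout) Array1.t

(* Build a [read_char]-like function that walks [source] with a counter.
   It raises End_of_file past the last byte, as input_char does, so the
   existing parser loop and its End_of_file handler can be reused. *)
let make_read_char (source : byte_array) : unit -> char =
  let pos = ref 0 in
  let len = Array1.dim source in
  fun () ->
    if !pos >= len then raise End_of_file
    else begin
      let c = Array1.get source !pos in
      incr pos;
      c
    end
```

No channel is involved once the array exists, so the per-character channel
locking discussed in this thread should not apply on this path.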

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Re: Threads performance issue.
  2009-02-17 10:07       ` Sylvain Le Gall
  2009-02-17 10:26         ` [Caml-list] " Mark Shinwell
@ 2009-02-17 12:20         ` Yaron Minsky
  2009-02-17 12:26           ` Rémi Dewitte
  2009-02-17 17:14           ` Sylvain Le Gall
  1 sibling, 2 replies; 21+ messages in thread
From: Yaron Minsky @ 2009-02-17 12:20 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1824 bytes --]

Interestingly, this probably has nothing to do with the size of the buffer.
input_char actually acquires and releases a lock for every single call,
whether or not an underlying system call is required to fill the buffer.
This has always struck me as an odd aspect of the in/out channel
implementation, and means that IO is a lot more expensive in a threaded
context than it should be.

At Jane Street, performance-sensitive code tends to use other libraries that
we've built directly on top of file descriptors, which batch the IO and
don't require constant lock acquisition.
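
A minimal sketch of batched reads directly on a file descriptor. This is not
the Jane Street library mentioned above; iter_fd_chars, buffer_size and
with_file are illustrative names. It assumes a recent OCaml (Bytes,
Fun.protect) and must be linked against the unix library:

```ocaml
(* Read [fd] in large blocks with Unix.read and pass each byte to [process]. *)
let iter_fd_chars ?(buffer_size = 1 lsl 20) (process : char -> unit)
    (fd : Unix.file_descr) : unit =
  let buf = Bytes.create buffer_size in
  let rec loop () =
    (* Unix.read returns the number of bytes actually read, 0 at end of file. *)
    let n = Unix.read fd buf 0 buffer_size in
    if n > 0 then begin
      (* Only indices 0 .. n-1 hold data from this call. *)
      for i = 0 to n - 1 do
        process (Bytes.get buf i)
      done;
      loop ()
    end
  in
  loop ()

(* Open [path] read-only, run [f] on the descriptor, and always close it. *)
let with_file (path : string) (f : Unix.file_descr -> unit) : unit =
  let fd = Unix.openfile path [Unix.O_RDONLY] 0 in
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () -> f fd)
```

A call such as with_file "test.csv" (iter_fd_chars process) would then drive
a per-character parser without going through an in_channel at all.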

y

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Re: Threads performance issue.
  2009-02-17 12:20         ` Yaron Minsky
@ 2009-02-17 12:26           ` Rémi Dewitte
  2009-02-17 17:14           ` Sylvain Le Gall
  1 sibling, 0 replies; 21+ messages in thread
From: Rémi Dewitte @ 2009-02-17 12:26 UTC (permalink / raw)
  To: yminsky; +Cc: Sylvain Le Gall, caml-list


[-- Attachment #1.1: Type: text/plain, Size: 2424 bytes --]

Avoiding channels, and using either file descriptors or Bigarray instead,
works well in my case.

Good to know that channels need some care when working with OCaml ;) !

Rémi

[-- Attachment #2: transi2.ml --]
[-- Type: text/x-ocaml, Size: 3737 bytes --]

(* open ExtLib *)

(** Slightly modified copy from module CSV *)
exception Bad_CSV_file of string

type state_t = StartField
	       | InUnquotedField
	       | InQuotedField
	       | InQuotedFieldAfterQuote

let load_rows ?(separator = ',') ?(nread = -1) f file =
  let nr = ref 0 in
  let row = ref [] in			(* Current row. *)
  let field = ref [] in			(* Current field. *)
  let state = ref StartField in		(* Current state. *)
  let end_of_field () =
    let field_list = List.rev !field in
    let field_len = List.length field_list in
    let field_str = String.create field_len in
    let rec loop i = function
	[] -> ()
      | x :: xs ->
	  field_str.[i] <- x;
	  loop (i+1) xs
    in
    loop 0 field_list;
    row := (Some field_str) :: !row;
    field := [];
    state := StartField
  in
  let empty_field () =
    row := None :: !row;
    field := [];
    state := StartField
  in
  let end_of_row () =
    let row_list = List.rev !row in
    f row_list;
    row := [];
    state := StartField;
    nr := !nr + 1;
  in
  let process c =
    if c <> '\r' then (			(* Always ignore \r characters. *)
      match !state with
	  StartField ->			(* Expecting quote or other char. *)
	    if c = '"' then (
	      state := InQuotedField;
	      field := []
	    ) else if c = separator then (* Empty field. *)
	      empty_field ()
	    else if c = '\n' then (	(* Empty field, end of row. *)
	      empty_field ();
	      end_of_row ()
	    ) else (
	      state := InUnquotedField;
	      field := [c]
	    )
	| InUnquotedField ->		(* Reading chars to end of field. *)
	    if c = separator then	(* End of field. *)
	      end_of_field ()
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field ();
	      end_of_row ()
	    ) else
	      field := c :: !field
	| InQuotedField ->		(* Reading chars to end of field. *)
	    if c = '"' then
	      state := InQuotedFieldAfterQuote
	    else
	      field := c :: !field
	| InQuotedFieldAfterQuote ->
	    if c = '"' then (		(* Doubled quote. *)
	      field := c :: !field;
	      state := InQuotedField
	    ) else if c = '0' then (	(* Quote-0 is ASCII NUL. *)
	      field := '\000' :: !field;
	      state := InQuotedField
	    ) else if c = separator then (* End of field. *)
	      end_of_field ()
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field ();
	      end_of_row ()
	    ) else (			(* Bad single quote in field. *)
	      field := c :: '"' :: !field;
	      state := InQuotedField
	    )
    ) (* end of match *)
  in
  let continue = ref true in
  let file_in = Unix.openfile file [Unix.O_RDONLY] 0o640 in
  let end_processing () =
    continue := false;
    (try Unix.close file_in with _ -> ());	(* Parenthesized so the flush below always runs. *)
    (match !state with
      | StartField ->
	 if !row <> [] then
	   ( empty_field (); end_of_row () )
      | InUnquotedField | InQuotedFieldAfterQuote ->
	    end_of_field (); end_of_row ()
      | InQuotedField ->
	   raise (Bad_CSV_file "Missing end quote after quoted field.")
    )
  in
  let buffer_length = 2 * 1024 * 1024 in
  let buffer = String.make buffer_length '\000' in
  let process_buffer l = 
(*     for i = 0 to l do *)
    let ii = ref 0 in
    (* Only indices 0 .. l-1 hold bytes from the last read. *)
    while (!continue) && (!ii) < l do
      let i = !ii in
      process buffer.[i];
      ii := i + 1;
      if( nread > 0 && !nr = nread ) then end_processing () else ()
    done
  in
  while !(continue)
  do 
    let n = Unix.read file_in buffer 0 buffer_length in
    if (n > 0 )
    then process_buffer n
    else end_processing ()
  done

let run_threaded f = Thread.create (fun () -> f (); Thread.exit ()) ()

let t1 = load_rows (fun _ -> ()) "test.csv"
let t2 = load_rows (fun _ -> ()) "test2.csv"
let t3 = load_rows (fun _ -> ()) "test3.csv"

[-- Attachment #3: transimm.ml --]
[-- Type: text/x-ocaml, Size: 3514 bytes --]

(* open ExtLib *)

open Bigarray

(** Slightly modified copy from module CSV *)
exception Bad_CSV_file of string

type state_t = StartField
	       | InUnquotedField
	       | InQuotedField
	       | InQuotedFieldAfterQuote

let load_rows ?(separator = ',') ?(nread = -1) f file =
  let nr = ref 0 in
  let row = ref [] in			(* Current row. *)
  let field = ref [] in			(* Current field. *)
  let state = ref StartField in		(* Current state. *)
  let end_of_field () =
    let field_list = List.rev !field in
    let field_len = List.length field_list in
    let field_str = String.create field_len in
    let rec loop i = function
	[] -> ()
      | x :: xs ->
	  field_str.[i] <- x;
	  loop (i+1) xs
    in
    loop 0 field_list;
    row := (Some field_str) :: !row;
    field := [];
    state := StartField
  in
  let empty_field () =
    row := None :: !row;
    field := [];
    state := StartField
  in
  let end_of_row () =
    let row_list = List.rev !row in
    f row_list;
    row := [];
    state := StartField;
    nr := !nr + 1;
  in
  let process c =
    if c <> '\r' then (			(* Always ignore \r characters. *)
      match !state with
	  StartField ->			(* Expecting quote or other char. *)
	    if c = '"' then (
	      state := InQuotedField;
	      field := []
	    ) else if c = separator then (* Empty field. *)
	      empty_field ()
	    else if c = '\n' then (	(* Empty field, end of row. *)
	      empty_field ();
	      end_of_row ()
	    ) else (
	      state := InUnquotedField;
	      field := [c]
	    )
	| InUnquotedField ->		(* Reading chars to end of field. *)
	    if c = separator then	(* End of field. *)
	      end_of_field ()
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field ();
	      end_of_row ()
	    ) else
	      field := c :: !field
	| InQuotedField ->		(* Reading chars to end of field. *)
	    if c = '"' then
	      state := InQuotedFieldAfterQuote
	    else
	      field := c :: !field
	| InQuotedFieldAfterQuote ->
	    if c = '"' then (		(* Doubled quote. *)
	      field := c :: !field;
	      state := InQuotedField
	    ) else if c = '0' then (	(* Quote-0 is ASCII NUL. *)
	      field := '\000' :: !field;
	      state := InQuotedField
	    ) else if c = separator then (* End of field. *)
	      end_of_field ()
	    else if c = '\n' then (	(* End of field and end of row. *)
	      end_of_field ();
	      end_of_row ()
	    ) else (			(* Bad single quote in field. *)
	      field := c :: '"' :: !field;
	      state := InQuotedField
	    )
    ) (* end of match *)
  in
  let file_in = Unix.openfile file [Unix.O_RDONLY] 0o640 in
  let end_processing () =
    (try Unix.close file_in with _ -> ());	(* Parenthesized so the flush below always runs. *)
    (match !state with
      | StartField ->
	 if !row <> [] then
	   ( empty_field (); end_of_row () )
      | InUnquotedField | InQuotedFieldAfterQuote ->
	    end_of_field (); end_of_row ()
      | InQuotedField ->
	   raise (Bad_CSV_file "Missing end quote after quoted field.")
    )
  in
  let mmap = Bigarray.Array1.map_file file_in Bigarray.char Bigarray.c_layout false (-1) in
  let l = (Bigarray.Array1.dim mmap) in
  let continue = ref (l > 0) in		(* Nothing to read from an empty mapping. *)
  let i = ref 0 in
  while !continue do
    process (Array1.(*unsafe_*)get mmap !i);
    i := !i + 1;
    continue := (nread < 0 || !nr < nread ) && !i < l
  done;
  end_processing ()
;;

let run_threaded f = Thread.create (fun () -> f (); Thread.exit ()) ();;

load_rows (fun _ -> ()) "test.csv" ;;
load_rows (fun _ -> ()) "test2.csv";;
load_rows (fun _ -> ()) "test3.csv";;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Caml-list] Re: Threads performance issue.
  2009-02-16 16:32 ` Sylvain Le Gall
@ 2009-02-17 13:52   ` Frédéric Gava
  0 siblings, 0 replies; 21+ messages in thread
From: Frédéric Gava @ 2009-02-17 13:52 UTC (permalink / raw)
  Cc: caml-list

>> I would like to read two files in two different threads.
>>
>> I have made a first version reading the first then the second and it takes
>> 2.8s (native).
>>
>> I decided to make a threaded version and, before making any use of threads,
>> I realized that just linking against the threads library, without even using
>> it, makes my first version of the program run in 12s!

This kind of trick can work when the files are on different disks, because
the two I/O calls can then proceed in parallel (even with blocking reads).
Non-blocking reads in a single thread should also work, but that is more
difficult to write.

FG


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Threads performance issue.
  2009-02-17 12:20         ` Yaron Minsky
  2009-02-17 12:26           ` Rémi Dewitte
@ 2009-02-17 17:14           ` Sylvain Le Gall
  1 sibling, 0 replies; 21+ messages in thread
From: Sylvain Le Gall @ 2009-02-17 17:14 UTC (permalink / raw)
  To: caml-list

Hello,

On 17-02-2009, Yaron Minsky <yminsky@gmail.com> wrote:
>
> Interestingly, this probably has nothing to do with the size of the buffer.
> input_char actually acquires and releases a lock for every single call,
> whether or not an underlying system call is required to fill the buffer.
> This has always struck me as an odd aspect of the in/out channel
> implementation, and means that IO is a lot more expensive in a threaded
> context than it should be.
>
>
> On Tue, Feb 17, 2009 at 5:07 AM, Sylvain Le Gall <sylvain@le-gall.net> wrote:
>>
>> You are using input_char and a standard I/O channel. This is a good choice
>> for a non-threaded program. But in your case, I would use Unix.read with a
>> big buffer (32KB to 4MB) and change your program to use it. As
>> benchmarked by Jon Harrop, you are spending most of your time in
>> caml_enter/leave_blocking_section. I think this comes from reading through
>> the standard I/O channel, which uses a 4k buffer. Using a bigger buffer will
>> result in fewer calls to these two functions (but you won't gain time in the
>> end; I think you will just reduce the difference between non-threaded and
>> threaded code).
>>

You are probably right that it has nothing to do with the size of the
buffer. I was mixing up two kinds of optimization. Anyway, even if the size
is not important, I think using Unix.read with a file descriptor should do
the trick.

Thanks for your detailed explanation.

Regards
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 21+ messages in thread
