caml-list - the Caml user's mailing list
* [Caml-list] Strange slowness of input_line on mingw
@ 2002-10-24 14:05 Yaron M. Minsky
  2002-10-28 15:26 ` Xavier Leroy
  0 siblings, 1 reply; 8+ messages in thread
From: Yaron M. Minsky @ 2002-10-24 14:05 UTC (permalink / raw)
  To: Caml List 

I've noticed some strangely slow behavior from input_line on mingw.  I
wrote a simple loop to scan through a file, and found that for a given
file, it took about 10 seconds to run, whereas wc -l took only a small
fraction of a second -- the difference was about a factor of 70.  This is
on a W2K machine using mingw.  On the other hand, using the same file on a
linux box, the difference between wc -l and my code was only about a
factor of 3.
Any ideas where the big difference might be coming from?   The code I
wrote is attached below.
y


open Printf

let fname =
  try Sys.argv.(1)
  with _ -> printf "Must provide filename\n"; exit (-1)

let file = open_in fname
(* let read = Newparse.create_tick_reader file*)

let () =
  try
    while true do
      ignore (input_line file);
    done
  with
      End_of_file -> printf "File ended.  Hah!\n"



-- 
|--------/            Yaron M. Minsky              \--------|
|--------\ http://www.cs.cornell.edu/home/yminsky/ /--------|

Open PGP --- KeyID B1FFD916 (new key as of Dec 4th)
Fingerprint: 5BF6 83E1 0CE3 1043 95D8 F8D5 9F12 B3A9 B1FF D916



-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] Strange slowness of input_line on mingw
  2002-10-24 14:05 [Caml-list] Strange slowness of input_line on mingw Yaron M. Minsky
@ 2002-10-28 15:26 ` Xavier Leroy
  2002-10-28 16:12   ` Yaron M. Minsky
  0 siblings, 1 reply; 8+ messages in thread
From: Xavier Leroy @ 2002-10-28 15:26 UTC (permalink / raw)
  To: Yaron M. Minsky; +Cc: Caml List

> I've noticed some strangely slow behavior from input_line on mingw.  I
> wrote a simple loop to scan through a file, and found that for a given
> file, it took about 10 seconds to run, whereas wc -l took only a small
> fraction of a second -- the difference was about a factor of 70.  This is
> on a W2K machine using mingw.  On the other hand, using the same file on a
> linux box, the difference between wc -l and my code was only about a
> factor of 3.
> Any ideas where the big difference might be coming from?   The code I
> wrote is attached below.

input_line has to work a bit harder than wc because it actually copies
the data to strings.  However, on my tests with your program (Linux,
OCaml 3.06), this makes essentially no difference: both your code and
wc run at about 50 Mb/s.  

Two possible explanations:

1- The file wasn't in the file cache when you timed your program; then you
timed wc, at which time the file was in the file cache.  In other
terms, you're measuring the difference between a "cold cache read" and a
"warm cache read".  Try measuring wc first :-)

2- Your file contains very long lines and you're using OCaml 3.04 or
earlier.  There was a performance bug in pre-3.06 versions causing
input_line to run slowly on very long lines (100000 characters or
more).

- Xavier Leroy
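
Both explanations above can be checked with a small measurement. What follows is a minimal sketch, not code from the thread: the names `scan`, `timed` and `report` are made up for illustration. It repeats the posted input_line loop several times over the same file, so that a cold first pass stands out from warm later passes, and it records the longest line seen, which bears on the long-line hypothesis. The default clock is `Sys.time`, which reports processor time; seeing cache effects requires a wall clock (for instance `Unix.gettimeofday`, assuming the unix library is linked) passed as `~clock`.

```ocaml
(* One pass over [fname] with input_line, as in the posted loop.
   Returns the number of lines and the length of the longest line. *)
let scan fname =
  let ic = open_in fname in
  let lines = ref 0 and longest = ref 0 in
  (try
     while true do
       let l = input_line ic in
       incr lines;
       if String.length l > !longest then longest := String.length l
     done
   with End_of_file -> ());
  close_in ic;
  (!lines, !longest)

(* Run [f ()] and return its result together with the elapsed time
   according to [clock] (processor time by default). *)
let timed ?(clock = Sys.time) f =
  let t0 = clock () in
  let r = f () in
  (r, clock () -. t0)

(* Scan the same file [passes] times and print one line per pass.
   The first pass may hit a cold file cache; later ones should be warm. *)
let report ?clock ~passes fname =
  for pass = 1 to passes do
    let (lines, longest), dt = timed ?clock (fun () -> scan fname) in
    Printf.printf "pass %d: %d lines, longest %d chars, %.3f s\n"
      pass lines longest dt
  done
```

A call such as `report ~passes:3 Sys.argv.(1)` would print three timings for the file named on the command line. If only the first pass is slow, caching is the likely culprit; if the longest line is short, the long-line issue can be ruled out.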

* Re: [Caml-list] Strange slowness of input_line on mingw
  2002-10-28 15:26 ` Xavier Leroy
@ 2002-10-28 16:12   ` Yaron M. Minsky
  2002-10-28 17:14     ` M E Leypold @ labnet
  2002-10-29  0:10     ` Oleg
  0 siblings, 2 replies; 8+ messages in thread
From: Yaron M. Minsky @ 2002-10-28 16:12 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Caml List

I don't think any of your explanations explain it:  the lines weren't
particularly long and I was using ocaml 3.06 on both linux and windows. 
And I did the test multiple times, in different orders, so the caching
is not the issue.

And, of course, the weirdest thing is that this appears to be a
windows-specific problem.  On Linux, the speed difference between wc
and my program was quite small.  The x70 difference was only found on
the mingw builds I tried.  (I didn't try a straight cygwin version,
since I don't have that build system set up at present.)

So, any other ideas, or suggestion as to how to narrow down the problem?

Thanks,
Yaron

On Mon, 2002-10-28 at 10:26, Xavier Leroy wrote:
> > I've noticed some strangely slow behavior from input_line on mingw.  I
> > wrote a simple loop to scan through a file, and found that for a given
> > file, it took about 10 seconds to run, whereas wc -l took only a small
> > fraction of a second -- the difference was about a factor of 70.  This is
> > on a W2K machine using mingw.  On the other hand, using the same file on a
> > linux box, the difference between wc -l and my code was only about a
> > factor of 3.
> > Any ideas where the big difference might be coming from?   The code I
> > wrote is attached below.
> 
> input_line has to work a bit harder than wc because it actually copies
> the data to strings.  However, on my tests with your program (Linux,
> OCaml 3.06), this makes essentially no difference: both your code and
> wc run at about 50 Mb/s.  
> 
> Two possible explanations:
> 
> 1- The file wasn't in the file cache when you timed your program; then you
> timed wc, at which time the file was in the file cache.  In other
> terms, you're measuring the difference between a "cold cache read" and a
> "warm cache read".  Try measuring wc first :-)
> 
> 2- Your file contains very long lines and you're using OCaml 3.04 or
> earlier.  There was a performance bug in pre-3.06 versions causing
> input_line to run slowly on very long lines (100000 characters or
> more).
> 
> - Xavier Leroy

* Re: [Caml-list] Strange slowness of input_line on mingw
  2002-10-28 16:12   ` Yaron M. Minsky
@ 2002-10-28 17:14     ` M E Leypold @ labnet
  2002-10-28 17:28       ` Sven Luther
  2002-10-29  0:10     ` Oleg
  1 sibling, 1 reply; 8+ messages in thread
From: M E Leypold @ labnet @ 2002-10-28 17:14 UTC (permalink / raw)
  To: Yaron M. Minsky; +Cc: Caml List



Yaron M. Minsky writes:

 > So, any other ideas, or suggestion as to how to narrow down the problem?

Rather generic: Some kind of system call tracing (is that possible
with mingw?), to find out whether the things happening between
userland and kernel are roughly equivalent in both cases. I mean: Same
number of reads, reading chunks around the same size and so on. If
not, then I'd look for a problem/difference in the C runtime against
which the OCaml interpreter (or the executable of your program) is
linked.
 
If things really happen in a different way deeper in userland, the GC
statistics might be different. Try printing that (I personally do not
believe it, but strange things happen now and then). 

Another Idea: Is your file large? If yes, the OCaml program might use
more memory (nothing is freed until the GC runs for the first time). And
another rather wild hypothesis is that the kernel might be somehow
unwilling to grant that amount of memory and takes its time. But that
would mean that the OCaml program's process is blocked during a kernel
call for some time. Can you take pure user cpu time in windows? If
yes: try that.

All this is rather generic. I'm no expert on how the OCaml system
works internally or interfaces with the host system. Taking these
profiles would be just an attempt to find more differences between
'wc' and the OCaml implementation to have some data for more educated
guesses.

Regards -- Markus
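
The GC statistics and CPU time mentioned above can be collected from inside the program with the standard library alone. This is a rough sketch under assumptions: the `report` record and the `measure` and `print_report` helpers are hypothetical names, and `Sys.time` reports processor time for the whole process rather than pure user time.

```ocaml
(* Differences in CPU time and GC counters across one call. *)
type report = {
  cpu_seconds : float;        (* processor time consumed by the call *)
  minor_collections : int;    (* minor collections during the call *)
  major_collections : int;    (* major collection cycles during the call *)
  minor_words : float;        (* words allocated in the minor heap *)
}

(* Run [f ()] and return its result with a [report] of what it cost. *)
let measure f =
  let g0 = Gc.quick_stat () in
  let t0 = Sys.time () in
  let result = f () in
  let t1 = Sys.time () in
  let g1 = Gc.quick_stat () in
  ( result,
    { cpu_seconds = t1 -. t0;
      minor_collections = g1.Gc.minor_collections - g0.Gc.minor_collections;
      major_collections = g1.Gc.major_collections - g0.Gc.major_collections;
      minor_words = g1.Gc.minor_words -. g0.Gc.minor_words } )

(* Print a report on one line, prefixed with [label]. *)
let print_report label r =
  Printf.printf "%s: cpu %.3f s, minor GCs %d, major GCs %d, minor words %.0f\n"
    label r.cpu_seconds r.minor_collections r.major_collections r.minor_words
```

Wrapping the input_line loop in `measure` on both platforms would show whether the collection counts differ. If CPU time is small on mingw while the wall-clock time is large, the process is mostly waiting rather than computing.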

* Re: [Caml-list] Strange slowness of input_line on mingw
  2002-10-28 17:14     ` M E Leypold @ labnet
@ 2002-10-28 17:28       ` Sven Luther
  2002-10-28 17:42         ` Yaron M. Minsky
  0 siblings, 1 reply; 8+ messages in thread
From: Sven Luther @ 2002-10-28 17:28 UTC (permalink / raw)
  To: M E Leypold @ labnet; +Cc: Yaron M. Minsky, Caml List

On Mon, Oct 28, 2002 at 06:14:37PM +0100, M E Leypold @ labnet wrote:
> 
> 
> Yaron M. Minsky writes:
> 
>  > So, any other ideas, or suggestion as to how to narrow down the problem?
> 
> Rather generic: Some kind of system call tracing (is that possible
> with mingw?), to find out whether the things happening between
> userland and kernel are roughly equivalent in both cases. I mean: Same
> number of reads, reading chunks around the same size and so on. If
> not, then I'd look for a problem/difference in the C runtime against
> which the OCaml interpreter (or the executable of your program) is
> linked.
>  
> If things really happen in a different way deeper in userland, the GC
> statistics might be different. Try printing that (I personally do not
> believe it, but strange things happen now and then). 
> 
> Another Idea: Is your file large? If yes, the OCaml program might use
> more memory (nothing is freed until the GC runs for the first time). And
> another rather wild hypothesis is that the kernel might be somehow
> unwilling to grant that amount of memory and takes its time. But that
> would mean that the OCaml program's process is blocked during a kernel
> call for some time. Can you take pure user cpu time in windows? If
> yes: try that.
> 
> All this is rather generic. I'm no expert on how the OCaml system
> works internally or interfaces with the host system. Taking these
> profiles would be just an attempt to find more differences between
> 'wc' and the OCaml implementation to have some data for more educated
> guesses.

BTW, what about re-implementing wc in ocaml using direct reading of the
file without creating strings, and measure the time taken by this ?

Friendly,

Sven Luther
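
The suggestion above could look roughly like the following. It is a sketch only, with an assumed function name (`count_newlines`) and an arbitrary 64 KiB buffer: it reads the file in fixed-size blocks with `input` and counts newline bytes in place, so no per-line string is allocated.

```ocaml
(* Count newline bytes in [fname] by reading fixed-size blocks into one
   reusable buffer.  The channel is opened in binary mode so that no
   end-of-line translation is applied. *)
let count_newlines fname =
  let ic = open_in_bin fname in
  let buf = Bytes.create 65536 in
  let count = ref 0 in
  let rec loop () =
    (* [input] returns 0 only at end of file. *)
    let n = input ic buf 0 (Bytes.length buf) in
    if n > 0 then begin
      for i = 0 to n - 1 do
        if Bytes.get buf i = '\n' then incr count
      done;
      loop ()
    end
  in
  loop ();
  close_in ic;
  !count
```

Like wc -l, this counts newline characters, so a final line without a trailing newline is not counted, whereas the input_line loop would return it. If this version is fast on mingw while the input_line loop is slow, the problem is likely in input_line rather than in channel I/O as a whole.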

* Re: [Caml-list] Strange slowness of input_line on mingw
  2002-10-28 17:28       ` Sven Luther
@ 2002-10-28 17:42         ` Yaron M. Minsky
  0 siblings, 0 replies; 8+ messages in thread
From: Yaron M. Minsky @ 2002-10-28 17:42 UTC (permalink / raw)
  To: luther; +Cc: leypold, yminsky, caml-list

I think the whole wc-connection is a bit of a distraction here.  It was
just a way of benchmarking the two implementations.  The basic issue is
that for some reason the same input_line based code is about 50-70 times
slower on mingw than it is on Linux.  I just used wc as a way of making it
clear that the difference was not due to hardware differences between the
platforms.
That said, it might be useful to do something like what you suggest, since
it would make it clear whether input_line is the problem, or whether it's
a generalized problem with the i/o system.
y

>
> BTW, what about re-implementing wc in ocaml using direct reading of the
> file without creating strings, and measure the time taken by this ?
>
> Friendly,
>
> Sven Luther






* Re: [Caml-list] Strange slowness of input_line on mingw
  2002-10-28 16:12   ` Yaron M. Minsky
  2002-10-28 17:14     ` M E Leypold @ labnet
@ 2002-10-29  0:10     ` Oleg
  2002-10-29  1:06       ` Yaron M. Minsky
  1 sibling, 1 reply; 8+ messages in thread
From: Oleg @ 2002-10-29  0:10 UTC (permalink / raw)
  To: Yaron M. Minsky; +Cc: Caml List

On Monday 28 October 2002 11:12 am, Yaron M. Minsky wrote:
> The x70 difference was only found on
> the mingw builds I tried.  (I didn't try a straight cygwin version,
> since I don't have that build system set up at present.)

So was wc from cygwin, or did you build it using mingw? (ie I understand that 
your O'Caml is using mingw, but what about wc - what was it compiled with?)

Oleg

* Re: [Caml-list] Strange slowness of input_line on mingw
  2002-10-29  0:10     ` Oleg
@ 2002-10-29  1:06       ` Yaron M. Minsky
  0 siblings, 0 replies; 8+ messages in thread
From: Yaron M. Minsky @ 2002-10-29  1:06 UTC (permalink / raw)
  To: oleg_inconnu; +Cc: yminsky, caml-list

wc was built with cygwin.

y

> On Monday 28 October 2002 11:12 am, Yaron M. Minsky wrote:
>> The x70 difference was only found on
>> the mingw builds I tried.  (I didn't try a straight cygwin version,
>> since I don't have that build system set up at present.)
>
> So was wc from cygwin, or did you build it using mingw? (ie I
> understand that  your O'Caml is using mingw, but what about wc - what
> was it compiled with?)
>
> Oleg







Thread overview: 8+ messages
2002-10-24 14:05 [Caml-list] Strange slowness of input_line on mingw Yaron M. Minsky
2002-10-28 15:26 ` Xavier Leroy
2002-10-28 16:12   ` Yaron M. Minsky
2002-10-28 17:14     ` M E Leypold @ labnet
2002-10-28 17:28       ` Sven Luther
2002-10-28 17:42         ` Yaron M. Minsky
2002-10-29  0:10     ` Oleg
2002-10-29  1:06       ` Yaron M. Minsky
