caml-list - the Caml user's mailing list
From: Brian Hurt <bhurt@spnz.org>
To: Michael Hoisie <mbh@OCF.Berkeley.EDU>
Cc: caml-list@pauillac.inria.fr
Subject: Re: [Caml-list] Arbitrarily throwing End_of_file
Date: Sun, 9 Nov 2003 20:47:56 -0600 (CST)
Message-ID: <Pine.LNX.4.44.0311092036210.5009-100000@localhost.localdomain>
In-Reply-To: <20031109171638.42cd6807.mbh@ocf.berkeley.edu>

On Sun, 9 Nov 2003, Michael Hoisie wrote:

> I have a file which is approximately 278,440 lines of text (more
> specifically, it is the result of doing 'ls -lAR /')

-l lists the file size in *bytes*, not lines.  Use 'wc -l longfile.dat' to
determine the number of lines.  If each line is ~10.6 bytes long
(including the newline) then a 278,000 byte file will be about 26,000 lines
long.  -A means "almost all" (everything except . and ..), and -R means
recursive (list subdirectories as well).
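
To check both figures from inside OCaml rather than with wc, here is a
minimal sketch.  The helper name bytes_and_lines is hypothetical, not
something from the original program; it only uses the standard
open_in_bin, in_channel_length and input_line functions.

```ocaml
(* Hypothetical helper: returns (size in bytes, number of lines) for a file.
   Note: input_line also returns a final line that has no trailing newline,
   so the count can be one higher than what 'wc -l' reports for such files. *)
let bytes_and_lines path =
  let ic = open_in_bin path in
  let size = in_channel_length ic in
  let lines = ref 0 in
  (* The loop is iterative, so the try/with around it costs no stack depth. *)
  (try
     while true do
       ignore (input_line ic);
       incr lines
     done
   with End_of_file -> ());
  close_in ic;
  (size, !lines)
```

Calling bytes_and_lines "longfile.dat" and dividing the first component by
the second gives the average line length used in the estimate above.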

> 
> I was trying to write this relatively simple program to analyze it but
> it seems that End_of_file was thrown very early.
> 
> To test it, I made a simple function:
> 
> let rec count_lines file n =
>     try let str = input_line file in
>     count_lines file (n + 1)
>     with End_of_file -> Printf.printf "The file is %d\n lines long" n

This function isn't tail recursive: the function's call to itself is
inside a try/with block, which breaks the tail recursion.  That isn't the
problem you're hitting, but you're not far from hitting it.  I generally
run out of stack at about 30,000 calls deep or so.  Try the following
instead:

let rec count_lines file n =
    (* Only the read sits inside try/with; eof records whether it failed. *)
    let line, eof = try (input_line file), false
                   with End_of_file -> "", true
    in
    if not eof then
        begin
            (* do something with line here *)
            count_lines file (n + 1)
        end
    else
        n

let () =
    let file = open_in "longfile.dat" in
    Printf.printf "The file is %d lines long.\n" (count_lines file 0);
    close_in file

Note that the recursive call is now outside of the try/with block, so it
is a true tail call and this function will work with a file of any length.
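
For readers on a newer compiler, the same idea can be written without the
eof flag.  This is only a sketch and assumes OCaml 4.02 or later, where
match accepts an exception case; that syntax is an assumption about your
toolchain and is not part of the code above.  The handler covers only the
input_line call, so the recursive call stays in tail position.

```ocaml
(* Assumes OCaml 4.02+ for the `exception` case in `match`. *)
let rec count_lines file n =
  match input_line file with
  | exception End_of_file -> n   (* end of input: return the running count *)
  | _line ->
      (* do something with _line here *)
      count_lines file (n + 1)   (* tail call: no handler is active here *)
```

It is called the same way as before, for example count_lines file 0 on a
channel returned by open_in.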

Brian



