Date: Sun, 9 Nov 2003 20:47:56 -0600 (CST)
From: Brian Hurt
To: Michael Hoisie
cc: caml-list@pauillac.inria.fr
Subject: Re: [Caml-list] Arbitrarily throwing End_of_file
In-Reply-To: <20031109171638.42cd6807.mbh@ocf.berkeley.edu>

On Sun, 9 Nov 2003, Michael Hoisie wrote:

> I have a file which is approximately 278,440 lines of text (more
> specifically, it is the result of doing 'ls -lAR /')

-l lists the file size in *bytes*, not lines. Use 'wc -l longfile.dat' to
determine the number of lines. If each line is ~10.6 bytes long (including
the EOLN), then a 278,000 byte file will be about 26,000 lines long. The -A
means "almost all" (everything except . and ..), and the -R means recursive
(list subdirectories as well).

>
> I was trying to write this relatively simple program to analyze it but
> it seems that End_of_file was thrown very early.
>
> To test, it, I made a simple function:
>
> let rec count_lines file n =
>   try let str = input_line file in
>     count_lines file (n + 1)
>   with End_of_file -> Printf.printf "The file is %d\n lines long" n

This function isn't tail recursive: the function's call to itself is within
a try/with block, which breaks the tail recursion, because the handler has
to stay in place until the inner call returns. That isn't the problem you're
hitting, but you're not far from hitting it. I generally run out of stack at
about 30,000 calls deep or so. Try the following instead:

    let rec count_lines file n =
        (* Only input_line is inside the try/with. *)
        let line, eof =
            try (input_line file), false
            with End_of_file -> "", true
        in
        if not eof then begin
            (* do something with line here *)
            count_lines file (n + 1)
        end else
            n

    let () =
        let file = open_in "longfile.dat" in
        Printf.printf "The file is %d lines long.\n" (count_lines file 0);
        close_in file

Note that the recursive call is now outside of the try/with block, so it is
a true tail call, and this function will work with a file of any length.

Brian
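
The same idea, keeping the recursive call out of the exception handler, can
also be written without the extra boolean. The following is only a sketch
and assumes OCaml 4.02 or later, which accepts an "exception" case inside
"match"; that syntax is not part of the message above. The names
count_lines_in and count_file are hypothetical, chosen for illustration.

```ocaml
(* Sketch only: assumes OCaml 4.02+ for `match ... with exception`.
   count_lines_in and count_file are hypothetical helper names. *)

(* Count the lines remaining on channel [ic], starting from [n].
   Only [input_line ic] is evaluated under the handler; the recursive
   call sits in an ordinary match branch, so it is a tail call. *)
let rec count_lines_in ic n =
  match input_line ic with
  | _line -> count_lines_in ic (n + 1)
  | exception End_of_file -> n

(* Open [path], count its lines, and close the channel on both the
   normal and the exceptional path. *)
let count_file path =
  let ic = open_in path in
  match count_lines_in ic 0 with
  | n -> close_in ic; n
  | exception e -> close_in_noerr ic; raise e

(* Usage: write a throwaway file with many lines, then count them. *)
let () =
  let path = Filename.temp_file "count_lines" ".txt" in
  let oc = open_out path in
  for i = 1 to 100_000 do
    Printf.fprintf oc "line %d\n" i
  done;
  close_out oc;
  Printf.printf "The file is %d lines long.\n" (count_file path);
  Sys.remove path
```

The design choice is the same as in the version with the eof flag: the
handler covers reading one line and nothing else, so the depth of the
recursion does not grow with the length of the file.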