From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL,HTML_MESSAGE autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id 140F3BC69 for ; Tue, 2 Oct 2007 14:56:36 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAADfdAUdCm3xrnWdsb2JhbACCcYtFAQEBAQsN X-IronPort-AV: E=Sophos;i="4.21,220,1188770400"; d="scan'208,217";a="2266116" Received: from janestcapital.com (HELO smtp.janestcapital.com) ([66.155.124.107]) by mail2-smtp-roc.national.inria.fr with ESMTP; 02 Oct 2007 14:56:35 +0200 Received: from [172.25.129.161] [38.96.172.125] by janestcapital.com with ESMTP (SMTPD-9.10) id A002035C; Tue, 02 Oct 2007 08:56:34 -0400 Message-ID: <47024002.2080206@janestcapital.com> Date: Tue, 02 Oct 2007 08:56:34 -0400 From: Brian Hurt User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax) X-Accept-Language: en-us, en MIME-Version: 1.0 To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] best and fastest way to read lines from a file? References: <779bf2730710011427g5983da4cw6ad8b715a9e38771@mail.gmail.com> <47016CEE.8010704@crans.org> <200710021239.l92CdwZ15641@virtutech.se> In-Reply-To: <200710021239.l92CdwZ15641@virtutech.se> Content-Type: multipart/alternative; boundary="------------020500040201030101030301" X-Spam: no; 0.00; mattias:01 readline:01 recursive:01 mattias:01 readline:01 recursive:01 wrote:01 wrote:01 faq:01 faq:01 rec:01 rec:01 avoids:01 avoids:01 exception:01 This is a multi-part message in MIME format. --------------020500040201030101030301 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Mattias Engdegård wrote: >>let line_count filename = >> let f = open_in filename in >> let rec loop count = >> match (readline f) with >> | Some(_) -> loop (count+1) >> | None -> count in >> loop 0;; >> >> > >Something like this should be even faster: > >exception Done of int > >let line_count file = > let rec loop count = > let _ = > try > input_line f > with End_of_file -> raise (Done count) > in > loop (count + 1) > in > try loop 0 with Done x -> x > >as it avoids unnecessary consing in the inner loop. > >__ > This should be a FAQ. let line_count file = let rec loop count = let test = try let _ = input_line f in true with | Not_found -> false in if test then loop (count + 1) else count in loop 0 ;; No consing, no unnecessary try/catch, tail recursive. Brian --------------020500040201030101030301 Content-Type: text/html; charset=us-ascii Content-Transfer-Encoding: 7bit Mattias Engdegård wrote:
let line_count filename =
  let f = open_in filename in
  let rec loop count =
    match (readline f) with
      | Some(_) -> loop (count+1)
      | None -> count in
  loop 0;;
    

Something like this should be even faster:

exception Done of int

let line_count file =
  let rec loop count =
    let _ =
      try
	input_line f
      with End_of_file -> raise (Done count)
    in
      loop (count + 1)
  in
    try loop 0 with Done x -> x

as it avoids unnecessary consing in the inner loop.

__

This should be a FAQ.

let line_count file =
    let rec loop count =
       let test =
          try
             let _ = input_line f in
             true
          with
          | Not_found -> false
       in
       if test then
          loop (count + 1)
       else
          count
    in
    loop 0
;;

No consing, no unnecessary try/catch, tail recursive.

Brian

--------------020500040201030101030301--