From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.0 required=5.0 tests=AWL,SPF_NEUTRAL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id C2102BC69 for ; Tue, 2 Oct 2007 22:24:04 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAOxDAkdA6aa1mGdsb2JhbACONgEBAQEHAgYRGA X-IronPort-AV: E=Sophos;i="4.21,221,1188770400"; d="scan'208";a="17208802" Received: from py-out-1112.google.com ([64.233.166.181]) by mail4-smtp-sop.national.inria.fr with ESMTP; 02 Oct 2007 22:24:03 +0200 Received: by py-out-1112.google.com with SMTP id u52so7757434pyb for ; Tue, 02 Oct 2007 13:24:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=c/rYNnq/YhSn3CZ2oH/1Aw05TUI9wvdhl6nWCzhgk2I=; b=qAWs8C1qKX5vcV4Nu1ba6y3PVXBLR8yg7rP9pIfNyU38F7Hco65d2yIdzHMSnl6iBE8nxDgGPEn9MFwavWjUcI9C9w7+VQ4EZ+IyPzbwt5S9fsqwDKJTINlHyjuwRb1FfQ3NlL9Ru4rPiEFIU6ZOY3hbJ/2/gICwtsvonF5Kh9E= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=gKK7MoQIjarkHtbt7MKn54TkAQUY16xvOw6SU2TwvIXU52OU9XOgqNaQhORyu02Wsi3qdiXf0jYVdo2XZIqupmy20xaMNj1vxipdLYxJz7XJnaIDAwvP5VDt/QQJ1il5t2lk96de4w4nUM9BFMVQspIvNdp5PMJZjA1iSuAyW5k= Received: by 10.65.151.6 with SMTP id d6mr19341364qbo.1191356639144; Tue, 02 Oct 2007 13:23:59 -0700 (PDT) Received: by 10.90.72.12 with HTTP; Tue, 2 Oct 2007 13:23:59 -0700 (PDT) Message-ID: <95513600710021323u762efd5k5ee6bdd03d7cc37@mail.gmail.com> Date: Tue, 2 Oct 2007 22:23:59 +0200 From: "Olivier Andrieu" To: kirillkh Subject: Re: [Caml-list] best and fastest way to read lines from a file? Cc: caml-list@yquem.inria.fr In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <779bf2730710011427g5983da4cw6ad8b715a9e38771@mail.gmail.com> <47016CEE.8010704@crans.org> <200710021239.l92CdwZ15641@virtutech.se> <47024002.2080206@janestcapital.com> X-Spam: no; 0.00; andrieu:01 oandrieu:01 pitfalls:01 mattias:01 channel-:01 'a-:01 'b-:01 prev:01 val:01 prev:01 val:01 readline:01 tolerate:01 recursion:01 ocaml:01 On 10/2/07, kirillkh wrote: > OK, so I'll give up the parsing/buffering part and only leave efficient > exception handling. This should leave the user free to do anything with it, > but prevent performance pitfalls. The following is based on Mattias > Engdegard's code: > > (* I couldn't figure out, how to declare a polymorphic exception properly *) > exception Done of 'a > > let fold_file (file: in_channel) > (read_func: in_channel->'a) > (elem_func: 'a->'b->'b) > (seed: 'b) = > let rec loop prev_val = > let input = > try read_func file > with End_of_file -> raise (Done prev_val) > in > let combined_val = elem_func input prev_val in > loop combined_val > in > try loop seed with Done x -> x > > And the usage for line counting: > > let line_count filename = > let f = open_in filename in > let counter _ count = count + 1 in > fold_file f readline counter 0 > > Since it's library code, we can tolerate the little annoyance of the second > try-catch. As far as I can tell, this code has the same performance > characteristics as yours: no consing + tail recursion. Any other problems > with it? well apart from the fact that you cannot have "polymorphic exceptions" in OCaml, this kind of code is IMHO much more natural with an imperative loop instead of a functional one: let fold_file read chan f init = let acc = ref init in begin try while true do let d = read chan in acc := f d !acc done with End_of_file -> () end ; !acc -- Olivier