From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id MAA23476; Sun, 30 Mar 2003 12:20:57 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id MAA23531 for ; Sun, 30 Mar 2003 12:20:56 +0200 (MET DST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h2UAKm503672; Sun, 30 Mar 2003 12:20:48 +0200 (MET DST) Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id MAA23674; Sun, 30 Mar 2003 12:20:48 +0200 (MET DST) Date: Sun, 30 Mar 2003 12:20:48 +0200 From: Xavier Leroy To: Christian Lindig , oliver@first.in-berlin.de, Caml Mailing List Subject: Re: [Caml-list] Performance-cost of ^ Message-ID: <20030330122048.C22539@pauillac.inria.fr> References: <20030326095209.GA579@first.in-berlin.de> <6E2CBEC7-5F86-11D7-B838-0003930FCE12@inria.fr> <20030328162110.GB548@eecs.harvard.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0i In-Reply-To: <20030328162110.GB548@eecs.harvard.edu>; from lindig@eecs.harvard.edu on Fri, Mar 28, 2003 at 05:21:10PM +0100 X-Spam: no; 0.00; caml-list:01 oliver:01 bandel:01 quadratic:01 damien:01 buffer:01 bin:01 inefficient:01 rec:01 doligez:01 skin:97 binary:02 module:03 string:03 wrote:03 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk > > Oliver Bandel wrote: > > >I'm reading in a file linewise. For some operations I need it as one > > >long string. > > > > > >How to acchieve this performant? Is it ok to use ^ for a list of > > >lines (with List.fold_right? or List.fold_left?) It's inefficient for large files (quadratic time). Christian Linding wrote: > Damien Doligez wrote: > > You should use String.concat. > > What about using the Buffer module? It sounds like it was especially > designed to build up long strings. Yes, Buffer will work fine here, with about the same efficiency as String.concat. If the file is a regular file and isn't expected to change during reading, the simplest and most efficient solution is: let ic = open_in_bin filename in let len = in_channel_length ic in let s = String.create len in really_input ic s 0 len; close_in ic; s In other circumstances (i.e. reading from a pipe or socket; desire to read the file as a text file and not a binary file), consider the following solution: let b = Buffer.create 1024 in let s = Buffer.create 1024 in let rec read_channel ic = let n = input ic s 0 1024 in if n > 0 then begin Buffer.add_substring b s 0 n; read_channel ic end in read_channel ic; Buffer.contents b Many ways to skin a cat, I'm afraid. - Xavier Leroy ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners