From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id JAA04639; Wed, 17 Mar 2004 09:41:44 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id JAA03906 for ; Wed, 17 Mar 2004 09:41:43 +0100 (MET) Received: from lri.lri.fr (lri.lri.fr [129.175.15.1]) by concorde.inria.fr (8.12.10/8.12.10) with ESMTP id i2H8fgHd023919 for ; Wed, 17 Mar 2004 09:41:42 +0100 Received: from pc8-123 (pc8-123 [129.175.8.123]) by lri.lri.fr (8.12.10/jtpda-5.4) with ESMTP id i2H8MJCr004973 ; Wed, 17 Mar 2004 09:22:19 +0100 (MET) Received: from filliatr by pc8-123 with local (Exim 3.35 #1 (Debian)) id 1B3WJf-0002TY-00; Wed, 17 Mar 2004 09:22:19 +0100 From: Jean-Christophe Filliatre MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <16472.2747.9259.155336@gargle.gargle.HOWL> Date: Wed, 17 Mar 2004 09:22:19 +0100 To: =?ISO-8859-1?Q?Agust=EDn_Valverde?= CC: caml-list@inria.fr Subject: Re: [Caml-list] Better option to read a file In-Reply-To: References: X-Mailer: VM 7.03 under Emacs 20.7.2 Reply-To: Jean-Christophe.Filliatre@lri.fr (Jean-Christophe Filliatre) X-MailScanner: Found to be clean X-Miltered: at concorde by Joe's j-chkmail ("http://j-chkmail.ensmp.fr")! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; filliatre:01 filliatre:01 lri:01 caml-list:01 agust:01 char:01 char:01 quadratic:01 occupies:01 buffer:01 buffer:01 cin:99 cin:99 lri:01 filliatr:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk X-Status: X-Keywords: X-UID: 150 Agustín Valverde writes: > Second: > > let rec unir c ac = unir ac^(Char.escaped c);; > > let leer2 fl = > let form = ref "" in > let c = ref '-' in > let arch = open_in fl in > (try > (while true do (c := input_char arch); (if !c != '\n' then (form > := unir !c !form) else ()) done) > with End_of_file -> close_in arch); > !form;; Note that this function is very inefficient: you are indeed building a lot of intermediate strings with "unir", resulting in a quadratic space (even if the final string occupies linear space). Using a buffer as suggested by Christoph is clearly better (the module Buffer from ocaml standard library is doubling its internal string buffer as needed, without you to worry about it). > I also have a parser to convert the string, could I to improve these > functions merging them with the parser in some way? The use of ocamllex in combination with a buffer is both easy and efficient. I sketch it: ====================================================================== { open Lexing let buf = Buffer.create 1024 } rule read = parse | "\\n" { Buffer.add_char buf '\n'; read lexbuf } | "\\t" { Buffer.add_char buf '\t'; read lexbuf } | "\\\\" { Buffer.add_char buf '\\'; read lexbuf } | _ { Buffer.add_string buf (lexeme lexbuf); read lexbuf } | eof { Buffer.contents buf } { let read_file f = let cin = open_in f in let lb = from_channel cin in let s = read lb in close_in cin; s } ====================================================================== -- Jean-Christophe Filliâtre (http://www.lri.fr/~filliatr) ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners