From: Martin Jambon
To: Yan Jun Daisy Chen
Cc: caml-list@inria.fr
Date: Wed, 26 Jan 2005 11:14:45 -0800 (PST)
Subject: Re: [Caml-list] extract info from file

On Wed, 26 Jan 2005, Yan Jun Daisy Chen wrote:

> I am trying to extract comments in a text file, i.e. text between (* and
> *). I declared a global string variable, buff, to store them, and want to
> store all the comments extracted in a list of strings. Is the string
> variable size fixed once I initialise it?
>
> The code I came up with is:
>
> open Unix;;
>
> let fileReader = openfile "student.cd" [O_RDONLY] 0o640;;
> let buff = ref "file: ";;
> let fileSize = (fstat fileReader).st_size;;
> (*let fileSize = 50;;*)
> let noOfChar = ref 0;;
>
> let extract_comment () =
>   let openIndex = 0 in
>   noOfChar := read fileReader !buff openIndex fileSize;
>   (*print_string !buff; print_newline();
>   print_int !noOfChar; print_newline();;*)
>
>
> let main () =
>   (*let fileContent = read fileReader !buff 0 5 in
>   print_int fileContent;*)
>   extract_comment();;
>
> main ();;
>
> Is there a simpler way to do this?

Sure:

1) Install my favorite library, Micmatch :-)
2) Use the following code (which does what you are asking for):

(* micmatch foo.ml < some_file *)

open Micmatch

let _ =
  let parse_contents =
    MAP "(*" (_* Lazy as s) "*)" -> `Comment s in
  let comments =
    Text.map
      (function `Text _ -> raise Text.Skip | `Comment s -> s)
      (parse_contents (Text.channel_contents stdin)) in
  List.iter print_endline comments

(***)

Martin

--
Martin Jambon, PhD
Researcher in Structural Bioinformatics since the 20th Century
The Burnham Institute http://www.burnham.org
San Diego, California
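
For readers who cannot install Micmatch, the same task can be sketched with the standard library alone. This is not the code from the message above, only a minimal illustration under stated assumptions: the names extract_comments and read_file are hypothetical, nested comments are kept as part of the enclosing comment, and an unterminated comment is silently dropped. It also touches on the original question about string size: an OCaml string has a fixed length once created, so the sketch accumulates text in a Buffer, which grows on demand.

```ocaml
(* Hypothetical helper: read a whole file into a string. *)
let read_file (path : string) : string =
  let ic = open_in_bin path in
  let len = in_channel_length ic in
  let contents = really_input_string ic len in
  close_in ic;
  contents

(* Hypothetical helper: collect the text of each top-level comment.
   Nested comments stay inside the text of the enclosing one. *)
let extract_comments (s : string) : string list =
  let n = String.length s in
  (* Buffer grows as needed, unlike a fixed-length string. *)
  let buf = Buffer.create 64 in
  (* True when the two characters a, b start at index i. *)
  let at i a b = i + 1 < n && s.[i] = a && s.[i + 1] = b in
  let rec scan i depth acc =
    if i >= n then List.rev acc
    else if at i '(' '*' then begin
      (* An opener inside a comment is part of that comment. *)
      if depth > 0 then Buffer.add_string buf "(*";
      scan (i + 2) (depth + 1) acc
    end
    else if depth = 1 && at i '*' ')' then begin
      (* Closing the outermost comment: emit what was gathered. *)
      let comment = Buffer.contents buf in
      Buffer.clear buf;
      scan (i + 2) 0 (comment :: acc)
    end
    else if depth > 1 && at i '*' ')' then begin
      Buffer.add_string buf "*)";
      scan (i + 2) (depth - 1) acc
    end
    else begin
      (* Ordinary character: keep it only while inside a comment. *)
      if depth > 0 then Buffer.add_char buf s.[i];
      scan (i + 1) depth acc
    end
  in
  scan 0 0 []
```

Assuming the file name from the question, the comments could then be printed with List.iter print_endline (extract_comments (read_file "student.cd")). Unlike the lazy regular expression in the Micmatch version, this sketch tracks nesting depth, so an inner comment does not end the outer one early.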