Message-ID: <16889.64814.187592.903001@gargle.gargle.HOWL>
Date: Fri, 28 Jan 2005 09:51:58 +0100
From: Jean-Christophe Filliatre
To: skaller@users.sourceforge.net
Cc: Radu Grigore, "yjc01@doc.ic.ac.uk", caml-list
Subject: Re: [Caml-list] index of substring
In-Reply-To: <1106844935.12114.62.camel@pelican.wigram>

skaller writes:
> On Thu, 2005-01-27 at 21:20, Jean-Christophe Filliatre wrote:
> >
> > What's wrong with this way of handling nested comments with ocamllex:
> >
> > ======================================================================
> >
> >   | "(*" { comment lexbuf; ... }
> >   ...
> >
> > and comment = parse
> >   | "(*" { comment lexbuf; comment lexbuf }
> >   | "*)" { () }
> >   | _    { comment lexbuf }
> >
>
> Well it doesn't handle (* or *) in strings ..

I was only focusing on nested comments, but handling strings is rather
trivial. See the ocaml lexer for instance (where you can see that
introducing a new lexing function "string" is definitely better than
trying to write a regular expression for string literals).

> However, whilst this code (given a fix
> to handle strings) would probably work,
> it isn't a plain lexer, but a mix of lexing
> and client code.

Sure, it is clearly more powerful than an automaton (since nested
comments are not regular), if this is what you mean by "isn't a plain
lexer". But that's precisely why ocamllex is so powerful a tool.
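
A minimal sketch of the "separate string function" idea, hand-written in
plain OCaml rather than ocamllex so that it runs on its own. It is not
the ocaml lexer's code and not ocamllex output: the names buf,
looking_at, Unterminated and strip_comments are hypothetical, the record
buf merely stands in for a lexbuf, and the top-level loop ignores string
literals outside comments for brevity.

```ocaml
(* Hypothetical stand-in for a lexbuf: the input text plus a cursor. *)
type buf = { text : string; mutable pos : int }

(* Raised when the input ends inside a comment or a string literal. *)
exception Unterminated of string

(* True if the text at the cursor starts with [tok]. *)
let looking_at buf tok =
  let n = String.length tok in
  buf.pos + n <= String.length buf.text
  && String.sub buf.text buf.pos n = tok

(* Skip a comment body; the opener has already been consumed.
   A nested opener recurses, a double quote hands over to [string]. *)
let rec comment buf =
  if buf.pos >= String.length buf.text then raise (Unterminated "comment")
  else if looking_at buf "*)" then buf.pos <- buf.pos + 2
  else if looking_at buf "(*" then begin
    buf.pos <- buf.pos + 2;
    comment buf;   (* the inner comment *)
    comment buf    (* the rest of the enclosing comment *)
  end
  else if buf.text.[buf.pos] = '"' then begin
    buf.pos <- buf.pos + 1;
    string buf;    (* comment delimiters in here are plain characters *)
    comment buf
  end
  else begin
    buf.pos <- buf.pos + 1;
    comment buf
  end

(* Skip a string literal body; the opening quote has been consumed.
   A backslash skips the following character as well. *)
and string buf =
  if buf.pos >= String.length buf.text then raise (Unterminated "string")
  else match buf.text.[buf.pos] with
    | '"' -> buf.pos <- buf.pos + 1
    | '\\' -> buf.pos <- buf.pos + 2; string buf
    | _ -> buf.pos <- buf.pos + 1; string buf

(* Entry point: return [text] with all comments removed. *)
let strip_comments text =
  let buf = { text; pos = 0 } in
  let out = Buffer.create (String.length text) in
  while buf.pos < String.length text do
    if looking_at buf "(*" then begin
      buf.pos <- buf.pos + 2;
      comment buf
    end else begin
      Buffer.add_char out text.[buf.pos];
      buf.pos <- buf.pos + 1
    end
  done;
  Buffer.contents out
```

With these definitions, strip_comments "a (* b (* c *) d *) e" returns
"a  e", and a closing delimiter inside a string inside a comment no
longer ends the comment. In an ocamllex file the same shape would be two
rules, comment and string, each calling the other from its actions.
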
You only need to know that ocamllex builds a set of mutually recursive
functions taking the lexbuf as argument; from there, you are not limited
in what you can do in the actions. You can even pass additional
arguments to the lexing functions.

I like to think of ocamllex as a general-purpose tool for inserting a
bit of regular expressions into ocaml programs (lexers, but also
filters, file format readers, line counters, code indenters, etc.) and
not only as a tool for writing lexers. With the header and trailer, it
is even easy to build a whole ocaml program within a single ocamllex
file. For instance, two programs of mine that I use intensively are a
program to count lines of code and comments in my ocaml programs (a
173-line ocamllex file) and a preprocessor for my web pages (a 129-line
ocamllex file).

-- 
Jean-Christophe Filliâtre (http://www.lri.fr/~filliatr)