From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA26785; Sun, 23 Sep 2001 19:44:32 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA26790 for ; Sun, 23 Sep 2001 19:44:32 +0200 (MET DST) Received: from lakeland.eecs.harvard.edu (lakeland.eecs.harvard.edu [140.247.62.173]) by nez-perce.inria.fr (8.11.1/8.10.0) with ESMTP id f8NHiQD04579 for ; Sun, 23 Sep 2001 19:44:26 +0200 (MET DST) Received: by lakeland.eecs.harvard.edu (Postfix, from userid 32148) id CCFD318D47; Sun, 23 Sep 2001 13:44:25 -0400 (EDT) Date: Sun, 23 Sep 2001 13:44:25 -0400 From: Christian Lindig To: Vesa Karvonen Cc: caml-list@inria.fr Subject: Re: [Caml-list] On ocamlyacc and ocamllex Message-ID: <20010923134425.A28129@lakeland.eecs.harvard.edu> Mail-Followup-To: Christian Lindig , Vesa Karvonen , caml-list@inria.fr References: <000b01c143aa$d5218690$422aa8c0@housemarque.fi> <20010922211013.A580@eecs.harvard.edu> <001501c1444c$a7b0a720$422aa8c0@housemarque.fi> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <001501c1444c$a7b0a720$422aa8c0@housemarque.fi>; from vesa.karvonen@housemarque.fi on Sun, Sep 23, 2001 at 07:27:36PM +0300 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Sun, Sep 23, 2001 at 07:27:36PM +0300, Vesa Karvonen wrote: > From: "Christian Lindig" > > The particular problem can be solved outside of Lex and Yacc: in the > > Quick C-- compiler we have a mutable Sourcemap.map data type that > > records the connection between character positions and > > (file,line,column) triples. > > This is basically the same technique that I have been using. The problem is > that the map has to be global, because the only context passed to the lexer > actions is the lexbuf. You can pass the map to the lexer such that it does not has to be global: rule token = parse eof { fun map -> P.EOF } | ws+ { fun map -> token lexbuf map } | tab { fun map -> tab lexbuf map; token lexbuf map } | nl { fun map -> nl lexbuf map ; token lexbuf map } | nl '#' { fun map -> line lexbuf map 0; token lexbuf map } .... The lexer built from the above specification takes a lexbuf and map as arguments. > Furthermore, the records need to be manually removed (in order to save > memory) after a file has been processed completely and the recorded > connections for the file are no longer needed. I assume that in a functional programming style without a global mutable value the garbage collector will remove the map once I cannot access it any longer. > The basic idea was to put the token type definition into a separate > module. Instead of two source files, you would have three source > files: > > lexer.mll token.ml parser.mly > In parser.mly there would be code that would tell ocamlyacc to look at > token.ml for the token type. Now you would have to keep the token type and the grammar up to dateup to date manually. The parser generator also needs more informations than just the token types: precedences, associativity, and return types are tied to a token - where do you keep them?. I still think that generating the token type from the grammar is the easiest way. -- Christian -- Christian Lindig Harvard University - DEAS lindig@eecs.harvard.edu 33 Oxford St, MD 242, Cambridge MA 02138 phone: +1 (617) 496-7157 http://www.eecs.harvard.edu/~lindig/ ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr