From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 3520DBB81; Wed, 14 Dec 2005 22:05:05 +0100 (CET) Received: from smtp1.adl2.internode.on.net (smtp1.adl2.internode.on.net [203.16.214.181]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jBEL520P007061; Wed, 14 Dec 2005 22:05:04 +0100 Received: from rosella (ppp33-4.lns1.syd2.internode.on.net [59.167.33.4] (may be forged)) by smtp1.adl2.internode.on.net (8.12.9/8.12.6) with ESMTP id jBEL4sXm097549; Thu, 15 Dec 2005 07:34:55 +1030 (CST) (envelope-from skaller@users.sourceforge.net) Subject: Re: [Caml-list] [ANNOUNCE] Alpha release of Menhir, an LR(1) parser generator for ocaml From: skaller To: Alessandro Baretta Cc: Francois.Pottier@inria.fr, Caml Mailing List In-Reply-To: <439FF395.3090503@barettadeit.com> References: <20051212175838.GA8502@yquem.inria.fr> <1134540495.8980.63.camel@rosella> <20051214090427.GB1330@yquem.inria.fr> <439FF395.3090503@barettadeit.com> Content-Type: text/plain Date: Thu, 15 Dec 2005 08:04:54 +1100 Message-Id: <1134594294.8942.82.camel@rosella> Mime-Version: 1.0 X-Mailer: Evolution 2.4.1 Content-Transfer-Encoding: 7bit X-Miltered: at nez-perce with ID 43A088FE.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 parser:01 ocaml:01 baretta:01 recursive:01 parser:01 late-binding:01 recursion:01 recursive:01 ocamlyacc:01 ocamlyacc:01 recursion:01 semicolon:01 token:01 tokens:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 On Wed, 2005-12-14 at 11:27 +0100, Alessandro Baretta wrote: > Francois Pottier wrote: > > > However, Menhir does not produce tables; it > > compiles the automaton down to a bunch of mutually recursive > > functions. We have not yet scientifically assessed the > > difference in size between tables and code, but a few > > quick experiments indicate that it is reasonable (the > > code is larger than the tables by a factor of two or > > three). > > > > In general, I like the approach, as it can easily scale to an extensible parser > generator by late-binding the recursion through a record/array/table of > functions. Have you thought about this at all? Actually the Felix parser does this right now -- and that is why I'm asking for the parser functions to allow a parameter: to pass in the data structure. I actually have simplistic recursive descent interpreter which is called by ocamlyacc, and which calls back into ocamlyacc: at present this allows user defined statements to be added to the Felix language. It is very crude, since there is no way to properly type user non-terminals other than statements, and it only allows extension of statements. Interestingly the 'end of stream' issue discussed at length in the Menhir manual then applies also to the points of recursion. For a statement in Felix, there is no lookahead at the end (they end with a semicolon). For expressions, I have to read one token too many, keep a list of all possible tokens that can terminate an expression, introduce an 'augmented expression' which is an expression followed by any of these tokens, and then in the action code 'put back' the token into the token stream. It would be really nice if I could write the production as: exprx: expr ~expr where ~expr means 'any token which forces a reduction of the whole expression' if you know what I mean. I have to assemble this list by hand at the moment. I did it by starting with a list of all the tokens, and removing the ones which seemed to cause a conflict. I will certainly forget to upgrade this list when I add new tokens -- hence a desire for something like ~expr which tells the parser to do it :) -- John Skaller Felix, successor to C++: http://felix.sf.net