From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id 9E9DBBC0B for ; Wed, 24 Jan 2007 00:56:20 +0100 (CET) Received: from mail.rsise.anu.edu.au (mail.rsise.anu.edu.au [150.203.208.4]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l0NNuGLX017110 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 24 Jan 2007 00:56:19 +0100 Received: from localhost (localhost [127.0.0.1]) by mail.rsise.anu.edu.au (Postfix) with ESMTP id 3328254129 for ; Wed, 24 Jan 2007 10:55:58 +1100 (EST) Received: from mail.rsise.anu.edu.au ([150.203.208.4]) by localhost (mail [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 28361-06 for ; Wed, 24 Jan 2007 10:55:58 +1100 (EST) Received: from pulp.rsise.anu.edu.au (pulp.rsise.anu.edu.au [150.203.208.49]) by mail.rsise.anu.edu.au (Postfix) with ESMTP id 14DE0540F7 for ; Wed, 24 Jan 2007 10:55:53 +1100 (EST) Received: by pulp.rsise.anu.edu.au (Postfix, from userid 1560) id 105BEB03C74; Wed, 24 Jan 2007 10:59:41 +1100 (EST) Date: Wed, 24 Jan 2007 10:59:40 +1100 From: Pietro Abate To: ocaml ml Subject: parsing problem. Message-ID: <20070123235940.GA32163@pulp.rsise.anu.edu.au> Mail-Followup-To: Pietro Abate , ocaml ml MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Operating-System: GNU/Linux X-Organization: Research School of Information Science and Engineering (Australian National University) User-Agent: Mutt/1.5.13 (2006-08-11) X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at mail.rsise.anu.edu.au X-Miltered: at discorde with ID 45B6A0A0.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; parsing:01 parser:01 ocaml:01 camlp:01 parser:01 lexer:01 combinators:01 camlp:01 ocaml:01 contrib:01 contrib:01 blog:98 blog:98 parsed:01 compile:01 hi all, I've a file that is composed of two parts: a language specification and a program that is written using the language specified in the first part. My goal is to parse the language spec, create a parser that uses this language, parse the rest of the file and output ocaml code (this should be a preprocessor). At the moment I'm using camlp4 to read the language specification, extend a simple fixed grammar on the fly and to parse the rest of the file using the grammar that I've extended with the language specification. For example if my fix grammar is a simple generic calculator language and I want to add more operators to the language I'll write something along these lines: _+_ , 2 _*_ , 1 -_ , 0 and then then the second part of the file will look like 1 + 2 * -3 and I expect that this line will be parsed accordingly to the operators declared at the beginning. Obviously this approach is not going to take me very far way. In the moment I need to specify a different grammar (say to write equations) I've to change the parser in the preprocessor. What I really want is to write the grammar and lexer specification in the first part of my file and then use a parser generator to create a parser to interpret the second part of the file. The catch is that I'd like to do it in one pass. Using Lex/Yacc technology this is not possible (I think) as you need to write two input files in order to generate a parser library. I'm wondering if this is possible using a generic parser combinators library. The idea would be to link a generic parser library and bootstrap my custom parser by feeding it with the grammar specification and then use this parser within or together with camlp4 to parse the rest of the file. The result of this pre-processing step should be ocaml code that I'll just need to compile. input -> campl4 extension + generic parser library -> ocaml code Is there a (small !) library to achieve this ? I had a quick look at Ocfgc (http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=492) can it be used in this way ? Ostap (http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=513) seems also promising, but I'm not sure if it is possible to plug it in this way. ulex ? menhir ? dypgen ? thanks :) pietro -- ++ Blog: http://blog.rsise.anu.edu.au/?q=pietro ++ ++ "All great truths begin as blasphemies." -George Bernard Shaw ++ Please avoid sending me Word or PowerPoint attachments. See http://www.fsf.org/philosophy/no-word-attachments.html