From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id 566B8BC0B for ; Wed, 24 Jan 2007 22:49:59 +0100 (CET) Received: from mail.rsise.anu.edu.au (mail.rsise.anu.edu.au [150.203.208.4]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l0OLntQs003030 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 24 Jan 2007 22:49:58 +0100 Received: from localhost (localhost [127.0.0.1]) by mail.rsise.anu.edu.au (Postfix) with ESMTP id D19EE5410C; Thu, 25 Jan 2007 08:49:45 +1100 (EST) Received: from mail.rsise.anu.edu.au ([150.203.208.4]) by localhost (mail [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 20691-03; Thu, 25 Jan 2007 08:49:45 +1100 (EST) Received: from pulp.rsise.anu.edu.au (pulp.rsise.anu.edu.au [150.203.208.49]) by mail.rsise.anu.edu.au (Postfix) with ESMTP id A4D2553F7E; Thu, 25 Jan 2007 08:49:41 +1100 (EST) Received: by pulp.rsise.anu.edu.au (Postfix, from userid 1560) id 1FF5EA062C5; Thu, 25 Jan 2007 08:53:30 +1100 (EST) Date: Thu, 25 Jan 2007 08:53:30 +1100 From: Pietro Abate To: caml-list@yquem.inria.fr, ocaml ml Subject: using camlp4 api [WAS: Re: [Caml-list] parsing problem.] Message-ID: <20070124215329.GA17676@pulp.rsise.anu.edu.au> Mail-Followup-To: Pietro Abate , caml-list@yquem.inria.fr, ocaml ml References: <20070123235940.GA32163@pulp.rsise.anu.edu.au> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070123235940.GA32163@pulp.rsise.anu.edu.au> X-Operating-System: GNU/Linux X-Organization: Research School of Information Science and Engineering (Australian National University) User-Agent: Mutt/1.5.13 (2006-08-11) X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at mail.rsise.anu.edu.au X-Miltered: at discorde with ID 45B7D483.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; camlp:01 parsing:01 ens-lyon:01 parser:01 expr:01 expr:01 camlp:01 parser:01 grammars:01 verbose:01 ocaml:01 lexer:01 combinators:01 ocaml:01 contrib:01 ok, it seems my question was a bit too general or impenetrable... dypgen (http://perso.ens-lyon.fr/emmanuel.onzon/) has an example of something very similar to what I want to achieve. By defining a small parser for this language, then it is possible to interpret the code that follows: define list_contents := expr(x) = List(x,Nil) and list_contents := expr(x);list_contents(y) = List(x,y) and expr := [] = Nil and expr := [list_contents(x)] = x and expr := expr(x)::expr(y) = List(x,y) in match [1;2;3] with | a::[b;c] -> b::[] | _ -> [4] where in the first part there is a grammar definition and in the second part the grammar is used to parse the argument of the match statement. my question: Is it possible to do something similar with camlp4 ? In the library documentation I can see that this might be possible, but I don't quite understand where to start. The first step should be to write a small parser to interpret the language definition and then to generate, by using the Grammar (extensible grammars) module, other production that extend the current grammar to parse the rest of the file. The trick part is that I cannot write (I think) and EXTEND statement on the fly (as it should be parsed itself), but I have to use the quotation library directly, and this is very verbose... Did anybody do anything similar ? One small example ? thanks for your help... :) p On Wed, Jan 24, 2007 at 10:59:40AM +1100, Pietro Abate wrote: > hi all, > I've a file that is composed of two parts: a language specification and > a program that is written using the language specified in the first > part. My goal is to parse the language spec, create a parser that uses > this language, parse the rest of the file and output ocaml code (this > should be a preprocessor). > > At the moment I'm using camlp4 to read the language specification, > extend a simple fixed grammar on the fly and to parse the rest of the > file using the grammar that I've extended with the language > specification. For example if my fix grammar is a simple generic > calculator language and I want to add more operators to the language > I'll write something along these lines: > > _+_ , 2 > _*_ , 1 > -_ , 0 > > and then then the second part of the file will look like > > 1 + 2 * -3 > > and I expect that this line will be parsed accordingly to the operators > declared at the beginning. > > Obviously this approach is not going to take me very far way. In the > moment I need to specify a different grammar (say to write equations) > I've to change the parser in the preprocessor. > > What I really want is to write the grammar and lexer specification in > the first part of my file and then use a parser generator to create a > parser to interpret the second part of the file. The catch is that I'd > like to do it in one pass. Using Lex/Yacc technology this is not > possible (I think) as you need to write two input files in order to > generate a parser library. I'm wondering if this is possible using a > generic parser combinators library. > > The idea would be to link a generic parser library and bootstrap my > custom parser by feeding it with the grammar specification and then use > this parser within or together with camlp4 to parse the rest of the > file. > > The result of this pre-processing step should be ocaml code that I'll > just need to compile. > input -> campl4 extension + generic parser library -> ocaml code > > Is there a (small !) library to achieve this ? > > I had a quick look at > > Ocfgc (http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=492) can it be > used in this way ? > > Ostap > (http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=513) seems also promising, > but I'm not sure if it is possible to plug it in this way. > > ulex ? menhir ? dypgen ? > > thanks :) > pietro > > -- > ++ Blog: http://blog.rsise.anu.edu.au/?q=pietro > ++ > ++ "All great truths begin as blasphemies." -George Bernard Shaw > ++ Please avoid sending me Word or PowerPoint attachments. > See http://www.fsf.org/philosophy/no-word-attachments.html > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs -- ++ Blog: http://blog.rsise.anu.edu.au/?q=pietro ++ ++ "All great truths begin as blasphemies." -George Bernard Shaw ++ Please avoid sending me Word or PowerPoint attachments. See http://www.fsf.org/philosophy/no-word-attachments.html