Date: Fri, 2 Feb 2007 17:26:06 +1100
From: Pietro Abate
To: caml-list@yquem.inria.fr, ocaml ml
Subject: Re: [Caml-list] mixing lexers with camlp4
Message-ID: <20070202062606.GA28900@pulp.rsise.anu.edu.au>
In-Reply-To: <20070202014011.GA26699@pulp.rsise.anu.edu.au>

In the best traditions, I'll partially answer my own questions (below),
but I also have a new one:

> - Does camlp4 allow me to mix lexers for different productions in the same
>   extension ?

Well, it seems it doesn't. Now I get this error:

Error: entries "psymbol" and "symbol" do not belong to the same grammar.
Fatal error: exception Failure("Grammar.extend error")

- Is there a deep reason why I cannot mix different grammars ?
- Is there a way of forcing this behaviour ?

On Fri, Feb 02, 2007 at 12:40:11PM +1100, Pietro Abate wrote:
> Hi all,
> I want to parse a language like this one:
> l := l & l | l % l | Id
[...]
> of course the Genlex module is not immediately compatible with the Plexer
> interface so I'm a bit lost...
>
> - Is this the best way of doing it ?

I don't know, maybe not.

> - How can I make the Genlex module compatible with the Plexer
>   interface (example ?) ?

This should do the job (I think), even though it ignores the location...

open Genlex

(* Genlex lexer that treats each of these symbols as a keyword *)
let lexer = Genlex.make_lexer
  [ "+";"-";"*";"/";"="; "[";"]";"<";">"; "%";"&";"?";"~" ];;

let getkwd = function
  | Kwd s -> s
  | _ -> failwith "aa" ;;

(* map Genlex tokens to the (constructor, text) pairs used by Plexer *)
let glexer = parser
    [< 'Kwd ("+" | "-" | "*" | "/" | "=" | "[" | "]" | "<" | ">" | "%" | "&" | "?"
            | "~" ) as s >] -> ("", getkwd s)
  | [< 'Ident s >] -> ("LIDENT",s)
  | [< >] -> ("EOI","") ;;

(* every token gets Token.dummy_loc, so locations are lost *)
let lexer_gmake () = {
  Token.tok_func = Token.lexer_func_of_parser
    (fun s -> (glexer (lexer s), Token.dummy_loc));
  Token.tok_using = (fun _ -> ());
  Token.tok_removing = (fun _ -> ());
  Token.tok_match = Token.default_match;
  Token.tok_text = Token.lexer_text;
  Token.tok_comm = None
} ;;

The full code of my example:

to compile:
#> camlp4o pa_extend.cmo pr_o.cmo pa_test.ml >> test.ml
#> ocamlfind ocamlc -package camlp4 camlp4.cma str.cma test.ml

------------ pa_test.ml ------------
open Genlex

type stype =
  | Lid
  | Symbol of string ;;

(* Genlex lexer that treats each of these symbols as a keyword *)
let lexer = Genlex.make_lexer
  [ "+";"-";"*";"/";"="; "[";"]";"<";">"; "%";"&";"?";"~" ];;

let getkwd = function
  | Kwd s -> s
  | _ -> failwith "fail getkwd" ;;

(* map Genlex tokens to the (constructor, text) pairs used by Plexer *)
let glexer = parser
    [< 'Kwd ("+" | "-" | "*" | "/" | "=" | "[" | "]" | "<" | ">" | "%" | "&" | "?"
            | "~" ) as s >] -> ("", getkwd s)
  | [< 'Ident s >] -> ("LIDENT",s)
  | [< >] -> ("EOI","") ;;

(* every token gets Token.dummy_loc, so locations are lost *)
let lexer_gmake () = {
  Token.tok_func = Token.lexer_func_of_parser
    (fun s -> (glexer (lexer s), Token.dummy_loc));
  Token.tok_using = (fun _ -> ());
  Token.tok_removing = (fun _ -> ());
  Token.tok_match = Token.default_match;
  Token.tok_text = Token.lexer_text;
  Token.tok_comm = None
} ;;

(* first grammar: built on the Genlex-based lexer *)
let symbgrammar = Grammar.gcreate (lexer_gmake ());;

(* accept either a keyword token ("") or an identifier token ("LIDENT") *)
let symbol strm =
  match Stream.peek strm with
  | Some("",s) -> Stream.junk strm; s
  | Some("LIDENT",s) -> Stream.junk strm; s
  | _ -> raise Stream.Failure ;;

let symbol = Grammar.Entry.of_parser symbgrammar "symbol" symbol ;;

(* second grammar: built on the standard Plexer lexer *)
let grammar = Grammar.gcreate (Plexer.gmake ());;
let gram_list = Grammar.Entry.create grammar "gram_list";;

EXTEND
  GLOBAL: gram_list;
  gram_list: [[ grams = LIST1 gram; EOI -> grams ]];
  gram: [[ p = LIDENT; ":="; rules = LIST1 rule SEP "|" -> (p,rules) ]];
  rule: [[ psl = LIST1 psymbol -> psl ]];
  (* "symbol" belongs to symbgrammar, which is what triggers the error above *)
  psymbol: [[ "VAR" -> Lid | e = symbol -> Symbol(e) ]];
END ;;

let apply s = Grammar.Entry.parse gram_list (Stream.of_string s);;

(apply "l := VAR");;
(apply "l := VAR & VAR");;
(apply "l := VAR U VAR");;

Thank you very much for your help.

:)
p

-- 
++ Blog: http://blog.rsise.anu.edu.au/?q=pietro
++
++ "All great truths begin as blasphemies." -George Bernard Shaw
++ Please avoid sending me Word or PowerPoint attachments.
   See http://www.fsf.org/philosophy/no-word-attachments.html
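
For reference, the token shape that glexer produces above can be illustrated
without camlp4 at all. The following is only a minimal sketch using the OCaml
standard library: the names tokenize, is_symbol and is_ident_char are
hypothetical, the symbol set is copied from the keyword list in pa_test.ml,
and nothing here is wired into Grammar or Token. It just shows a single
function emitting both "" (symbol) and "LIDENT" tokens, followed by "EOI".

```ocaml
(* Hypothetical stand-alone sketch: plain OCaml, not camlp4 code. *)

(* Symbol characters, copied from the keyword list in pa_test.ml. *)
let is_symbol c = String.contains "+-*/=[]<>%&?~" c

(* Assumption: identifiers are runs of letters, digits and underscores. *)
let is_ident_char c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
  || (c >= '0' && c <= '9') || c = '_'

(* Split [s] into (constructor, text) pairs, ending with ("EOI", "").
   Symbols get the empty constructor, identifiers get "LIDENT",
   mirroring the pairs returned by glexer. *)
let tokenize (s : string) : (string * string) list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev (("EOI", "") :: acc)
    else
      let c = s.[i] in
      if c = ' ' || c = '\t' || c = '\n' then go (i + 1) acc
      else if is_symbol c then go (i + 1) (("", String.make 1 c) :: acc)
      else if is_ident_char c then begin
        (* scan to the end of the identifier *)
        let j = ref i in
        while !j < n && is_ident_char s.[!j] do incr j done;
        go !j (("LIDENT", String.sub s i (!j - i)) :: acc)
      end
      else failwith (Printf.sprintf "unexpected character %C at offset %d" c i)
  in
  go 0 []
```

With these definitions, tokenize "l & VAR" returns
[("LIDENT","l"); ("","&"); ("LIDENT","VAR"); ("EOI","")]. Whether a lexer of
this shape can replace Plexer for the whole grammar is an open question here,
not something this sketch demonstrates.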