From: Bruno De Fraine
To: Jon Harrop
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Parsing
Date: Fri, 22 Jun 2007 12:14:56 +0200

On 22 Jun 2007, at 01:14, Jon Harrop wrote:

> However, you must factor heads in the pattern matches. So instead of:
>
>   and parse_expr = parser
>     | [< e1=parse_factor; 'Kwd "+"; e2=parse_expr >] -> e1 +: e2
>     | [< e1=parse_factor; 'Kwd "-"; e2=parse_expr >] -> e1 -: e2
>     | [< e1=parse_factor >] -> e1
>
> you must write:
>
>   and parse_expr = parser
>     | [< e1=parse_factor; stream >] ->
>       (parser
>         | [< 'Kwd "+"; e2=parse_expr >] -> e1 +: e2
>         | [< 'Kwd "-"; e2=parse_expr >] -> e1 -: e2
>         | [< >] -> e1) stream

Actually you must write something like:

  and parse_expr = parser
    | [< e1=parse_factor; stream >] ->
      let rec aux e1 = parser
        | [< 'Kwd "+"; e2=parse_factor; stream >] -> aux (e1 +: e2) stream
        | [< 'Kwd "-"; e2=parse_factor; stream >] -> aux (e1 -: e2) stream
        | [< >] -> e1
      in aux e1 stream

Although you get the precedences right by organizing different levels
for expr/factor/atom, you've made the operations right associative by
greedily parsing "e2" as an expression. "+" and "-" should be left
associative, unless you want 1-2+3 to evaluate to -4.

> Why is this? Is it because the streams are destructive? Could the
> camlp4 macro not factor the pattern match for me?

camlp4's extensible grammar system can handle levels and
associativities for you. Read
http://caml.inria.fr/pub/docs/tutorial-camlp4/tutorial003.html for
more information.

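To make the associativity point above concrete, here is a minimal sketch in plain OCaml, with no camlp4 involved. The token type and the functions parse_right, parse_left and eval are names made up for this illustration and are not part of the parser in this thread; they work on a token list rather than a stream, and only handle "+" and "-" on integers. parse_right mirrors the version that parses "e2" as a whole expression, and parse_left mirrors the "aux" accumulator loop.

```ocaml
(* Illustrative stand-ins only: a cut-down expr type and a token list
   instead of a camlp4 stream. *)
type expr = Num of int | Add of expr * expr | Sub of expr * expr
type token = Int of int | Plus | Minus

(* Right-recursive: the right operand is parsed as a whole expression,
   so the tree nests to the right. *)
let rec parse_right = function
  | Int n :: Plus :: rest ->
      let e2, rest' = parse_right rest in
      (Add (Num n, e2), rest')
  | Int n :: Minus :: rest ->
      let e2, rest' = parse_right rest in
      (Sub (Num n, e2), rest')
  | Int n :: rest -> (Num n, rest)
  | _ -> failwith "integer expected"

(* Accumulator loop: the right operand is a single number, which is
   folded into the expression built so far, so the tree nests to the left. *)
let parse_left tokens =
  let rec aux e1 = function
    | Plus :: Int n :: rest -> aux (Add (e1, Num n)) rest
    | Minus :: Int n :: rest -> aux (Sub (e1, Num n)) rest
    | rest -> (e1, rest)
  in
  match tokens with
  | Int n :: rest -> aux (Num n) rest
  | _ -> failwith "integer expected"

let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Sub (a, b) -> eval a - eval b

(* The tokens of "1-2+3". *)
let tokens = [Int 1; Minus; Int 2; Plus; Int 3]
let right_tree, _ = parse_right tokens
let left_tree, _ = parse_left tokens

let () =
  Printf.printf "right: %d, left: %d\n" (eval right_tree) (eval left_tree)
```

The right-recursive version builds Sub (Num 1, Add (Num 2, Num 3)), which evaluates to -4, while the accumulator version builds Add (Sub (Num 1, Num 2), Num 3), which evaluates to 2.
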
This is your example in the new camlp4:

$ ocaml camlp4of.cma
        Objective Caml version 3.10.0

        Camlp4 Parsing version 3.10.0

# type expr = Num of int | Add of expr * expr | Sub of expr * expr
            | Mult of expr * expr ;;
type expr =
    Num of int
  | Add of expr * expr
  | Sub of expr * expr
  | Mult of expr * expr
# let (+:) e1 e2 = Add(e1,e2) ;;
val ( +: ) : expr -> expr -> expr = <fun>
# let (-:) e1 e2 = Sub(e1,e2) ;;
val ( -: ) : expr -> expr -> expr = <fun>
# let ( *: ) e1 e2 = Mult(e1,e2) ;;
val ( *: ) : expr -> expr -> expr = <fun>
# open Camlp4.PreCast ;;
# let expr = Gram.Entry.mk "expr" ;;
val expr : '_a Camlp4.PreCast.Gram.Entry.t = <abstr>
# EXTEND Gram
  expr:
  [ "add" LEFTA
    [ e1 = expr; "+"; e2 = expr -> e1 +: e2
    | e1 = expr; "-"; e2 = expr -> e1 -: e2 ]
  | "mult" LEFTA
    [ e1 = expr; "*"; e2 = expr -> e1 *: e2 ]
  | "simple" NONA
    [ n = INT -> Num(int_of_string n)
    | "("; e = expr; ")" -> e ] ] ;
  END ;;
- : unit = ()
# Gram.parse expr Loc.ghost (Stream.of_string "1-2+3*4") ;;
- : expr = Add (Sub (Num 1, Num 2), Mult (Num 3, Num 4))

Another reason to use camlp4 Grammars is that you get automatic syntax
error descriptions. However, this seems to contain bugs in the current
camlp4:

# Gram.parse expr Loc.ghost (Stream.of_string "1+2*(3+)-4") ;;
- : expr = Sub (Add (Num 1, Mult (Num 2, Num 3)), Num 4)

(This should not be accepted. It used to give: Stream.Error "[expr]
expected after '+' (in [expr])")

# Gram.parse expr Loc.ghost (Stream.of_string "1+2*(3+!)-4") ;;
Exception:
Loc.Exc_located (<abstr>,
 Stream.Error "[expr] expected after \"(\" (in [expr])").

(This is wrong, because there is a legal expression after "(", namely
"3".)

Regards,
Bruno

--
Bruno De Fraine
Vrije Universiteit Brussel
Faculty of Applied Sciences, DINF - SSEL
Room 4K208, Pleinlaan 2, B-1050 Brussels
tel: +32 (0)2 629 29 75
fax: +32 (0)2 629 28 70
e-mail: Bruno.De.Fraine@vub.ac.be