From: Gabriel Scherer
Date: Sat, 23 Jun 2012 19:39:06 +0200
To: Aaron Bohannon
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Compiling with camlp4 extensions

If you want to implement your own lexer, you
have to provide the MakeGram functor with your own module satisfying the
Lexer signature.
  http://bluestorm.info/camlp4/camlp4-doc/Sig.Lexer.html

If you can reuse Camlp4's predefined lexer, however, you should not
hesitate to do so. There is little use in being original on the
lexing part, and users who already know OCaml will appreciate the
consistency in the lexical conventions. Camlp4's token type for OCaml
is rich enough to integrate comments and whitespace information, so
you can even define an indentation-dependent language on top of the
pre-existing lexer, using a filtering function on the token stream:
  http://bluestorm.info/camlp4/camlp4-doc/Sig.Token.Filter.html

On Sat, Jun 23, 2012 at 3:41 PM, Aaron Bohannon wrote:
> Ah, yes.  That is helpful.  I had thought of trying to "extend" OCaml
> by replacing the grammar with a different one, although I didn't know
> exactly how to do it.
>
> Of course, it seemed obvious to me that I wouldn't be able to use my
> own lexer if I did that.  I'm not sure if I will want to do that or
> not yet, but I was thinking I would just learn to do it that way so
> I'd have that flexibility if I need it.  Unfortunately, the page stops
> short of explaining how to pursue that approach. :(
>
>  - Aaron
>
> On Sat, Jun 23, 2012 at 3:42 AM, Gabriel Scherer wrote:
>> See the "full parser tutorial" in the Camlp4 wiki, it has information
>> for what, if I have correctly understood, is your use case, including
>> location handling.
>>  http://brion.inria.fr/gallium/index.php/Full_parser_tutorial
>>
>> On Sat, Jun 23, 2012 at 2:43 AM, Aaron Bohannon wrote:
>>> Thanks for the reply.  The example is helpful.  However, I should have
>>> been more clear: I don't exactly want to write a syntax extension, per
>>> se.  Rather, I am trying to use camlp4 to parse a non-OCaml grammar
>>> and to generate an OCaml AST.  So the "Register.OCamlSyntaxExtension"
>>> functor doesn't seem like it will work for me.
>>> Instead, I tried using
>>> "Printers.Ocaml.print_implem" in my "extension" code and everything
>>> works fine, except for error locations.  Of course, I realize this is
>>> because the AST is being printed and then re-parsed, but I don't know
>>> how to prevent it from being reparsed.  I looked through all the
>>> Camlp4 interfaces and thought that perhaps I need to use the function
>>> "Register.register_str_item_parser".  But I couldn't make that work.
>>> Either that's not the function I need or else I don't know how to use
>>> it -- I can't tell which.
>>>
>>>  - Aaron
>>>
>>> On Fri, Jun 22, 2012 at 10:36 AM, Gabriel Scherer wrote:
>>>> All nodes in a Camlp4 AST are annotated with location information; the
>>>> locations you get from the parser are correct, and it is your
>>>> responsibility, as an extension writer, to ensure that any new nodes
>>>> you generate also have (approximately) correct location information.
>>>>
>>>> If you build AST nodes "by hand", you have to provide this location
>>>> explicitly. If you use the concrete syntax quotations, the location
>>>> used is the value _loc present in the environment, whatever it may be.
>>>> So to have correct locations, you have to make sure that, at every AST
>>>> you produce through a quotation, there is a "_loc" variable in scope
>>>> with the correct value. If you match AST pieces with quotation
>>>> patterns (match e with <:expr< $a$ + $b$ >> -> ...), you may bind the
>>>> location variable through the syntax "<:expr@foo<", for example:
>>>> (match e with <:expr@_loc< $a$ + $b$ >> -> ...). Finally, if you're
>>>> inside an EXTEND block defining a parsing rule, the identifier _loc is
>>>> implicitly bound to a location corresponding to what was parsed by
>>>> this rule.
>>>>
>>>> See for example the toy extension pa_refutable, which has examples of
>>>> these various things:
>>>>  http://bluestorm.info/camlp4/pa_refutable.ml.html
>>>>
>>>> In some very rare cases (or if you are a perfectionist), you may want to
>>>> give a new node a location that is not quite the location of any of
>>>> the parsed nodes you're working on. You may use various functions of
>>>> the Loc submodule of your syntax definition to forge new locations; in
>>>> particular, Loc.merge merges two (supposedly contiguous) locations.
>>>>  http://bluestorm.info/camlp4/camlp4-doc/Sig.Loc.html
>>>>
>>>> On Fri, Jun 22, 2012 at 5:53 PM, Aaron Bohannon wrote:
>>>>> Hi,
>>>>>
>>>>> I have been trying to use the new camlp4 to write an OCaml syntax
>>>>> extension.  All the examples I have seen so far suggest that I use the
>>>>> extension by passing ocamlc the "-pp" option.  But it seems that all the
>>>>> location info for error messages gets lost when I do this unless I catch and
>>>>> report the parse error myself within the extension.  Is there some way to
>>>>> get ocamlc to report the parse error at the correct location automatically?
>>>>>
>>>>> - Aaron
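
As a rough illustration of the location discipline discussed in this
thread, here is a minimal sketch in plain OCaml. It deliberately does
not use Camlp4: the types loc and expr and the functions loc_of, merge,
swap_plus and add are hypothetical stand-ins, and the min/max behaviour
of merge is an assumption of this toy, not a description of Camlp4's
Loc.merge. It shows a rewrite that reuses the location of the node it
replaces (the analogue of binding _loc in a quotation pattern), and a
synthesized node whose location is forged from those of its children.

```ocaml
(* Toy model only: [loc], [expr], [loc_of], [merge], [swap_plus] and [add]
   are hypothetical names for illustration, not part of Camlp4. *)

type loc = { file : string; start_pos : int; stop_pos : int }

(* Every node carries a location, as in the AST described in the thread. *)
type expr =
  | Int of loc * int
  | Add of loc * expr * expr

(* Location stored on a node. *)
let loc_of = function
  | Int (l, _) | Add (l, _, _) -> l

(* Span covering two locations of the same file. Taking min/max is an
   assumption of this toy, not a description of Camlp4's Loc.merge. *)
let merge a b =
  if a.file <> b.file then invalid_arg "merge: different files";
  { file = a.file;
    start_pos = min a.start_pos b.start_pos;
    stop_pos = max a.stop_pos b.stop_pos }

(* Rewrite [a + b] into [b + a]. The new node reuses the location of the
   node it replaces, so an error reported on it points at the source. *)
let rec swap_plus = function
  | Int _ as e -> e
  | Add (_loc, a, b) -> Add (_loc, swap_plus b, swap_plus a)

(* Build a node that corresponds to no single parsed node: forge its
   location from the locations of its two children. *)
let add a b = Add (merge (loc_of a) (loc_of b), a, b)

let () =
  let at s e = { file = "demo.ml"; start_pos = s; stop_pos = e } in
  let one = Int (at 0 1, 1) and two = Int (at 4 5, 2) in
  let sum = add one two in
  (* The synthesized node spans both children. *)
  assert (loc_of sum = at 0 5);
  (* The rewritten node keeps the location of the original. *)
  assert (loc_of (swap_plus sum) = loc_of sum);
  let l = loc_of (swap_plus sum) in
  Printf.printf "%s: characters %d-%d\n" l.file l.start_pos l.stop_pos
```

In a real extension the same two habits apply: keep a correct location
in scope whenever a node is produced, and derive a new location from
existing ones when no parsed node matches exactly.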