Re: [Caml-list] Compiling with camlp4 extensions

caml-list - the Caml user's mailing list

From: Gabriel Scherer <gabriel.scherer@gmail.com>
To: Aaron Bohannon <bohannon@seas.upenn.edu>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Compiling with camlp4 extensions
Date: Sat, 23 Jun 2012 19:39:06 +0200
Message-ID: <CAPFanBHbOK371YaqHbSowTmgLxo+76YRPz69BBAxE7686hn_vQ@mail.gmail.com>
In-Reply-To: <CANghceYE7kuV4SjT+wsVVzE_i6K9COt+WdznwnO4FH2LFf5u_Q@mail.gmail.com>

If you want to implement your own lexer, you have to pass the
MakeGram functor your own module satisfying the Lexer signature:
  http://bluestorm.info/camlp4/camlp4-doc/Sig.Lexer.html

If you can reuse Camlp4's predefined lexer, however, you should not
hesitate to do that. There is little use in being original on the
lexing part, and users that already know OCaml will appreciate the
consistency in the lexical conventions. Camlp4's token type for OCaml
is rich enough to integrate comments and whitespace information, so
you can even define an indentation-dependent language on top of the
pre-existing lexer, using a filtering function on the token stream:
  http://bluestorm.info/camlp4/camlp4-doc/Sig.Token.Filter.html
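
To make the filtering idea concrete, here is a minimal sketch in plain
OCaml. It does not use Camlp4 at all: the token type, the Indent and
Dedent constructors and the function layout_filter are all hypothetical
names invented for this illustration. A real Camlp4 filter works on a
stream of (token, location) pairs rather than on a list, and Camlp4's
token type is much richer. The sketch only shows how layout tokens that
the lexer preserves can be rewritten into explicit block markers before
the grammar sees them.

```ocaml
(* Hypothetical, simplified token type (NOT Camlp4's real token type). *)
type token =
  | Ident of string
  | Blanks of string   (* a run of spaces, as kept by the lexer *)
  | Newline
  | Indent             (* synthesized by the filter: a block opens *)
  | Dedent             (* synthesized by the filter: a block closes *)

(* Rewrite Newline/Blanks tokens into Indent/Dedent markers.
   [stack] holds the enclosing indentation widths, innermost first.
   Assumption: blank lines are not special-cased, and indentation is
   measured as the length of the Blanks token following a Newline. *)
let layout_filter (tokens : token list) : token list =
  (* Pop every level deeper than [width], emitting one Dedent each. *)
  let rec close stack width acc =
    match stack with
    | top :: rest when width < top -> close rest width (Dedent :: acc)
    | _ -> (stack, acc)
  in
  let rec go stack acc = function
    | [] ->
        (* End of input: close every block that is still open. *)
        List.rev (snd (close stack 0 acc))
    | Newline :: rest ->
        let width, rest =
          match rest with
          | Blanks s :: rest' -> (String.length s, rest')
          | _ -> (0, rest)
        in
        let stack, acc =
          match stack with
          | top :: _ when width > top -> (width :: stack, Indent :: acc)
          | _ -> close stack width acc
        in
        go stack acc rest
    | Blanks _ :: rest -> go stack acc rest  (* mid-line spaces: drop *)
    | tok :: rest -> go stack (tok :: acc) rest
  in
  go [0] [] tokens
```

With this sketch, the tokens for "a", a newline, two spaces, "b", a
newline, then "d" come out as a, Indent, b, Dedent, d. A grammar can
then treat Indent and Dedent like ordinary opening and closing
delimiters.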

On Sat, Jun 23, 2012 at 3:41 PM, Aaron Bohannon <bohannon@seas.upenn.edu> wrote:
> Ah, yes.  That is helpful.  I had thought of trying to "extend" OCaml
> by replacing the grammar with a different one, although I didn't know
> exactly how to do it.
>
> Of course, it seemed obvious to me that I wouldn't be able to use my
> own lexer if I did that.  I'm not sure if I will want to do that or
> not yet, but I was thinking I would just learn to do it that way so
> I'd have that flexibility if I need it.  Unfortunately, the page stops
> short of explaining how to pursue that approach. :(
>
>  - Aaron
>
> On Sat, Jun 23, 2012 at 3:42 AM, Gabriel Scherer
> <gabriel.scherer@gmail.com> wrote:
>> See the "full parser tutorial" in the Camlp4 wiki; if I have
>> understood correctly, it covers your use case, including
>> location handling.
>>  http://brion.inria.fr/gallium/index.php/Full_parser_tutorial
>>
>> On Sat, Jun 23, 2012 at 2:43 AM, Aaron Bohannon <bohannon@seas.upenn.edu> wrote:
>>> Thanks for the reply.  The example is helpful.  However, I should have
>>> been more clear: I don't exactly want to write a syntax extension, per
>>> se.  Rather, I am trying to use camlp4 to parse a non-OCaml grammar
>>> and to generate an OCaml AST.  So the "Register.OCamlSyntaxExtension"
>>> functor doesn't seem like it will work for me.  Instead, I tried using
>>> "Printers.Ocaml.print_implem" in my "extension" code and everything
>>> works fine, except for error locations.  Of course, I realize this is
>>> because the AST is being printed and then re-parsed, but I don't know
>>> how to prevent it from being reparsed.  I looked through all the
>>> Camlp4 interfaces and thought that perhaps I need to use the function
>>> "Register.register_str_item_parser".  But I couldn't make that work.
>>> Either that's not the function I need or else I don't know how to use
>>> it -- I can't tell which.
>>>
>>>  - Aaron
>>>
>>> On Fri, Jun 22, 2012 at 10:36 AM, Gabriel Scherer
>>> <gabriel.scherer@gmail.com> wrote:
>>>> All nodes in a Camlp4 AST are annotated with location information; the
>>>> locations you get from the parser are correct, and it is your
>>>> responsibility, as an extension writer, to ensure that any new nodes
>>>> you generate also have (approximately) correct location information.
>>>>
>>>> If you build AST nodes "by hand", you have to provide this location
>>>> explicitly. If you use the concrete syntax quotations, the location
>>>> used is the value _loc present in the environment, whatever it may be.
>>>> So to have correct locations, you have to make sure that, at every AST
>>>> you produce through a quotation, there is a "_loc" variable in scope
>>>> with the correct value. If you match AST pieces with quotation
>>>> patterns (match e with <:expr< $a$ + $b$ >> -> ...), you may bind the
>>>> location variable through the syntax "<:expr@foo<", for example:
>>>> (match e with <:expr@_loc< $a$ + $b$ >> -> ...). Finally, if you're
>>>> inside an EXTEND block defining a parsing rule, the identifier _loc is
>>>> implicitly bound to a location corresponding to what was parsed by
>>>> this rule.
>>>>
>>>> See for example the toy extension pa_refutable, which has examples of
>>>> these various things:
>>>>  http://bluestorm.info/camlp4/pa_refutable.ml.html
>>>>
>>>> In some very rare cases (or if you are a perfectionist), you may want
>>>> to give a new node a location that is not quite the location of any of
>>>> the parsed nodes you're working on. You may use various functions of
>>>> the Loc submodule of your syntax definition to forge new locations; in
>>>> particular, Loc.merge merges two (supposedly contiguous) locations.
>>>>  http://bluestorm.info/camlp4/camlp4-doc/Sig.Loc.html
>>>>
>>>> On Fri, Jun 22, 2012 at 5:53 PM, Aaron Bohannon <bohannon@seas.upenn.edu> wrote:
>>>>> Hi,
>>>>>
>>>>> I have been trying to use the new camlp4 to write an OCaml syntax
>>>>> extension.  All the examples I have seen so far suggest that I use the
>>>>> extension by passing ocamlc the "-pp" option.  But it seems that all the
>>>>> location info for error messages gets lost when I do this unless I catch and
>>>>> report the parse error myself within the extension.  Is there some way to
>>>>> get ocamlc to report the parse error at the correct location automatically?
>>>>>
>>>>> - Aaron

Thread overview: 7+ messages
2012-06-22 15:53 Aaron Bohannon
2012-06-22 16:36 ` Gabriel Scherer
2012-06-23  0:43   ` Aaron Bohannon
2012-06-23  9:42     ` Gabriel Scherer
2012-06-23 13:41       ` Aaron Bohannon
2012-06-23 17:39         ` Gabriel Scherer [this message]
2012-06-22 18:01 ` [Caml-list] " Hongbo Zhang
