caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: "Frédéric Bour" <frederic.bour@lakaban.net>
To: caml-list@inria.fr
Subject: Re: [Caml-list] ocamllex and polymorphic recursion
Date: Sat, 14 May 2016 23:29:07 +0200	[thread overview]
Message-ID: <8f56349e-0e8d-4e3f-ca8b-3ff0733f88d7@lakaban.net> (raw)
In-Reply-To: <CADK7aFMqvvLmO6jza3Sh_c_bn64tTtjbDQzvLjmFLs2EKA0zAw@mail.gmail.com>

On 05/14/2016 09:50 PM, Nicolas Ojeda Bar wrote:

> Hi list,
>
> I am trying to write a parser using ocamllex for a language for which 
> the usual character -> token -> ast does not make sense and instead 
> one needs to produce the ast directly from the character level.
>
> Even though this is not its intended use case, it actually works quite 
> well! Except that, while I can define higher-order rules, such as
>
>   rule sp item = parse
>   | ' ' { item lexbuf }
>
> I cannot actually use such a rule for 'item's which return different 
> types.  As far as I can see this is because the set of all rules is 
> translated into a set of recursively defined functions and using 'sp' 
> with 'items' of different types would require the function 
> corresponding to 'sp' to have some explicit polymorphism annotation.
>
> Is there a way to twist ocamllex a little more to make this work ?  Or 
> if not, would it make sense to expose a way for the user to provide 
> these annotations herself ?

Here is how I would do it:

1) Polymorphic recursion:

- define a record type with one field for each rule, making types explicit:

   type rules = { sp : 'a. (Lexing.lexbuf -> 'a) -> Lexing.lexbuf -> 'a; 
...  }

- make each rule take an argument of this type, and refer to rules by 
projecting from the record:

   rule foo rules = parse
     | 'int' { rules.sp rules.int_item lexbuf }
     | 'string' { rules.sp rules.string_item lexbuf }

- tie the knot outside of the lexer:

let rec rules = { sp = (fun lexbuf -> sp rules lexbuf); ... }

When invoking the lexer, pass the rules argument too.

2) Variation with open recursion:

If you assume each rule take the "rules" argument, you can use type 
recursion and avoid some annoyances when tying the knot.
You will have to pass the "rules" argument manually each time you call a 
rule.

   type rules = { sp : 'a. (rules -> Lexing.lexbuf -> 'a) -> rules -> 
Lexing.lexbuf -> 'a; ...  }

   rule foo rules = parse
     | 'int' { rules.sp rules.int_item rules lexbuf }
     | 'string' { rules.sp rules.string_item rules lexbuf }

3) Perhaps there is no recursion in "sp" definition.
You can simply define 'sp' rule in a different lexer (and then call 
"FirstLexer.sp").

I am not 100% sure this 3) will work, but I think there is no shared 
state in the lexbuf after a rule succeed. Calling a different lexer 
should be fine.

> Thanks!
>
> - Nicolas


  reply	other threads:[~2016-05-14 21:29 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-14 19:50 Nicolas Ojeda Bar
2016-05-14 21:29 ` Frédéric Bour [this message]
2016-05-15 10:32   ` Nicolas Ojeda Bar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8f56349e-0e8d-4e3f-ca8b-3ff0733f88d7@lakaban.net \
    --to=frederic.bour@lakaban.net \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).