Thanks Frédéric! Works nicely.

- Nicolas


On Sat, May 14, 2016 at 11:29 PM, Frédéric Bour <frederic.bour@lakaban.net> wrote:
On 05/14/2016 09:50 PM, Nicolas Ojeda Bar wrote:

Hi list,

I am trying to write a parser using ocamllex for a language for which the usual character -> token -> ast does not make sense and instead one needs to produce the ast directly from the character level.

Even though this is not its intended use case, it actually works quite well! Except that, while I can define higher-order rules, such as

  rule sp item = parse
  | ' ' { item lexbuf }

I cannot actually use such a rule for 'item's which return different types.  As far as I can see this is because the set of all rules is translated into a set of recursively defined functions and using 'sp' with 'items' of different types would require the function corresponding to 'sp' to have some explicit polymorphism annotation.

Is there a way to twist ocamllex a little more to make this work ?  Or if not, would it make sense to expose a way for the user to provide these annotations herself ?

Here is how I would do it:

1) Polymorphic recursion:

- define a record type with one field for each rule, making types explicit:

  type rules = { sp : 'a. (Lexing.lexbuf -> 'a) -> Lexing.lexbuf -> 'a; ...  }

- make each rule take an argument of this type, and refer to rules by projecting from the record:

  rule foo rules = parse
    | 'int' { rules.sp rules.int_item lexbuf }
    | 'string' { rules.sp rules.string_item lexbuf }

- tie the knot outside of the lexer:

let rec rules = { sp = (fun lexbuf -> sp rules lexbuf); ... }

When invoking the lexer, pass the rules argument too.

2) Variation with open recursion:

If you assume each rule take the "rules" argument, you can use type recursion and avoid some annoyances when tying the knot.
You will have to pass the "rules" argument manually each time you call a rule.

  type rules = { sp : 'a. (rules -> Lexing.lexbuf -> 'a) -> rules -> Lexing.lexbuf -> 'a; ...  }

  rule foo rules = parse
    | 'int' { rules.sp rules.int_item rules lexbuf }
    | 'string' { rules.sp rules.string_item rules lexbuf }

3) Perhaps there is no recursion in "sp" definition.
You can simply define 'sp' rule in a different lexer (and then call "FirstLexer.sp").

I am not 100% sure this 3) will work, but I think there is no shared state in the lexbuf after a rule succeed. Calling a different lexer should be fine.

Thanks!

- Nicolas


--
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs