caml-list - the Caml user's mailing list
* camlp4 and lexers
@ 2008-05-15 15:00 Pietro Abate
  2008-05-16 15:24 ` [Caml-list] " Pietro Abate
From: Pietro Abate @ 2008-05-15 15:00 UTC (permalink / raw)
  To: caml-list

Hi all,
This question was asked a few weeks ago, and again last week. However, I
still don't really get how to proceed. I hope we can boil this down to a
small example that sheds more light on the camlp4 internals.

Say I want to write a small parser for regexps (or an arithmetic
calculator), but I don't want to extend the OCaml grammar to do that. I
just want to create a minimal lexer and a minimal grammar to parse
expressions like (aaa*|b?);c

The parser part is easy (below). The part I don't understand is how to
create a lexer. I had a look at the ocsigen xmlcaml lexer and the camlp4
lexer, but I still haven't found a minimal example I can use without
getting confused. 

In particular, the problem below is that I want my lexer to give me back
CHAR tokens (different from the CHAR of char * string of camlp4) and not
strings. I could do the same with the camlp4 lexer, but then every regexp
would have to be written as ('a''a''a' *) etc., which doesn't look good.
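
To make the requirement concrete, here is a minimal sketch of the tokenization step alone, in plain OCaml with no camlp4 dependency. The token type, `is_operator`, `is_blank` and `tokenize` are hypothetical names chosen for illustration; this is not the camlp4 token type, and it is not yet a lexer module that a grammar functor would accept.

```ocaml
(* Hypothetical token type for illustration; not the Camlp4 token type. *)
type token =
  | KEYWORD of string   (* operators and punctuation: | ; ? * + . ( ) *)
  | CHAR of char        (* any other non-blank character *)
  | EOI                 (* end of input *)

(* Characters the grammar treats as operators or punctuation. *)
let is_operator = function
  | '|' | ';' | '?' | '*' | '+' | '.' | '(' | ')' -> true
  | _ -> false

let is_blank = function
  | ' ' | '\t' | '\n' | '\r' -> true
  | _ -> false

(* Turn a string into a token list: blanks are skipped, every other
   character becomes exactly one token, and EOI is appended at the end. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev (EOI :: acc)
    else
      let c = s.[i] in
      if is_blank c then go (i + 1) acc
      else if is_operator c then
        go (i + 1) (KEYWORD (String.make 1 c) :: acc)
      else go (i + 1) (CHAR c :: acc)
  in
  go 0 []
```

With this, `tokenize "aaa*"` yields three `CHAR 'a'` tokens followed by `KEYWORD "*"` and `EOI`, so the input can be written without quoting each character. To use it with camlp4, the function would still have to be wrapped in a module with the signature the grammar functor expects (token type, locations, and a stream-producing function); that wrapping is not shown here.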

A while ago I did something similar with the old camlp4 [1] using
plexer, but this is not possible anymore...

Nicolas suggested a while ago copying the Camlp4.PreCast module and the
lexer module and customizing them. I think it should be possible just
to use Struct.Grammar.Static.Make with a new lexer instead... but, as I
said, I'm not able to write a very minimal lexer for this example...
Maybe I'm confused about this.

I think a minimal example will help more than one person here.

thanks :)
p


-------------------------- This is my parser...

module RegExGram =  Struct.Grammar.Static.Make(RegExpLexer)

let regex = RegExGram.Entry.mk "regex"

EXTEND RegExGram
  GLOBAL: regex;

  regex: [[ e1 = SELF ; "|" ; e2 = concat -> Alt(e1,e2)
          | e1 = concat -> e1 ]
  ];

  concat:[[ e1 = SELF ; ";"; e2 = seq -> Seq(e1,e2)
          | e1 = SELF ; e2 = seq -> Seq(e1,e2)
          | e1 = seq -> e1 ]
  ];

  seq:   [[ e1 = simple ; "?" -> Opt e1
          | e1 = simple ; "*" -> Star e1
          | e1 = simple ; "+" -> Plus e1
          | e1 = simple -> e1 ]
  ];

  simple:[[ "." -> Dot
          | "("; e1 = regex; ")" -> e1
          | `CHAR(s) -> Sym s ]
  ];

END

----------------------
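
The grammar actions above build values with the constructors Alt, Seq, Opt, Star, Plus, Dot and Sym, but the type itself is not shown in the message. A plausible definition, assuming Sym carries a single char (as it would if the custom lexer returned CHAR of char), might look like the sketch below. The type name `regex` and the helper `to_string` are hypothetical.

```ocaml
(* Assumed AST for the grammar above; the type name is hypothetical. *)
type regex =
  | Alt of regex * regex   (* e1 | e2 *)
  | Seq of regex * regex   (* e1 ; e2, or plain juxtaposition *)
  | Opt of regex           (* e? *)
  | Star of regex          (* e* *)
  | Plus of regex          (* e+ *)
  | Dot                    (* . *)
  | Sym of char            (* a literal character *)

(* Fully parenthesized printer, useful for checking what a parse produced. *)
let rec to_string = function
  | Alt (a, b) -> "(" ^ to_string a ^ "|" ^ to_string b ^ ")"
  | Seq (a, b) -> "(" ^ to_string a ^ to_string b ^ ")"
  | Opt e -> to_string e ^ "?"
  | Star e -> to_string e ^ "*"
  | Plus e -> to_string e ^ "+"
  | Dot -> "."
  | Sym c -> String.make 1 c

(* One possible tree for the example input (aaa*|b?);c *)
let example =
  Seq (Alt (Seq (Seq (Sym 'a', Sym 'a'), Star (Sym 'a')), Opt (Sym 'b')),
       Sym 'c')
```

The printer parenthesizes every Alt and Seq node, which makes the shape of the tree visible at the cost of extra parentheses.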

[1] http://groups.google.com/group/fa.caml/browse_thread/thread/e26569427cc8879d

