caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Pietro Abate <Pietro.Abate@pps.jussieu.fr>
To: caml-list@yquem.inria.fr
Subject: camlp4 and lexers
Date: Thu, 15 May 2008 17:00:33 +0200	[thread overview]
Message-ID: <20080515150033.GA31934@uranium.pps.jussieu.fr> (raw)

Hi all,
This question was asked a few weeks ago, and again last week.  However I
still don't really get how to proceed. I hope we can cook down a small
example to understand a bit more the camlp4 internals.

Say I want to write a small parser for regexp (or an aritmetic
calculator), but I don't want to extend the ocaml grammar to do that. I
just want to create a minimal lexer and a minimal grammar to parse
expressions like (aaa*|b?);c

The parser part is easy (below). The part I don't understand is how to
create a lexer. I had a look at the ocsigen xmlcaml lexer and the camlp4
lexer, but I still haven't found a minimal example I can use without
getting confused. 

In particular, the problem below is that I want my lexer to give me back
CHAR tokens (different from the CHAR of char * string of camlp4) and not
strings. I could do the same with the camlp4 lexer, but all my regexp
should be then written as ('a''a''a' *) etc ... that it's not good
looking.

A while ago I did something similar with the old camlp4 [1] using
plexer, but this is not possible anymore...

Nicolas a while ago suggested to copy the Camlp4.PreCast module and the 
lexer module and customize them. I think it should be possible just
to use Struct.Grammar.Static.Make with a new lexer instead... but, as I
said, I'm not able to write a very minimal lexer for this example...
Maybe I'm confused about this.

I think a minimal example will help more then one person here.

thanks :)
p


-------------------------- This is my parser...

module RegExGram =  Struct.Grammar.Static.Make(RegExpLexer)

let regex = RegExGram.Entry.mk "regex"

EXTEND RegExGram
  GLOBAL: regex;

  regex: [[ e1 = SELF ; "|" ; e2 = concat -> Alt(e1,e2)
          | e1 = seq -> e1 ]
  ];

  concat:[[ e1 = SELF ; ";"; e2 = seq -> Seq(e1,e2)
          | e1 = SELF ; e2 = seq -> Seq(e1,e2)
          | e1 = seq -> e1 ]
  ];

  seq:   [[ e1 = simple ; "?" -> Opt e1
          | e1 = simple ; "*" -> Star e1
          | e1 = simple ; "+" -> Plus e1
          | e1 = simple -> e1 ]
  ];

  simple:[[ "." -> Dot
          | "("; e1 = regex; ")" -> e1
          | `CHAR(s) -> Sym s ]
  ];

END

----------------------

[1] http://groups.google.com/group/fa.caml/browse_thread/thread/e26569427cc8879d


             reply	other threads:[~2008-05-15 15:01 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-05-15 15:00 Pietro Abate [this message]
2008-05-16 15:24 ` [Caml-list] " Pietro Abate

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080515150033.GA31934@uranium.pps.jussieu.fr \
    --to=pietro.abate@pps.jussieu.fr \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).