caml-list - the Caml user's mailing list
* another question on lexer function
@ 1996-11-07 10:07 Olivier Pons
  1996-11-08 13:21 ` Pierre Weis
  0 siblings, 1 reply; 2+ messages in thread
From: Olivier Pons @ 1996-11-07 10:07 UTC (permalink / raw)
  To: caml-list

Hello!

Is it forbidden to use the ";;" token in a make_lexer function?

The silly example below suggests that it is:

#let lexer = make_lexer [ "titi"; "::";";;"; "toto"];;
lexer : char stream -> token stream = <fun>
#let token_stream = lexer(stream_of_string "toto ;; titi");;
token_stream : token stream = <abstr>
#stream_next token_stream;; 
- : token = Kwd "toto"
#stream_next token_stream;; 
Uncaught exception: Parse_error


Thanks in advance,

Olivier

* Re: another question on lexer function
  1996-11-07 10:07 another question on lexer function Olivier Pons
@ 1996-11-08 13:21 ` Pierre Weis
  0 siblings, 0 replies; 2+ messages in thread
From: Pierre Weis @ 1996-11-08 13:21 UTC (permalink / raw)
  To: Olivier Pons; +Cc: caml-list

> is it forbidden  to use the ";;" token  in a make_lexer function ?

You cannot use an arbitrary sequence of characters as a declared
keyword: each sequence you write must be recognized as an identifier
by the ``next_token'' function inside make_lexer (or be a single
non-alphanumeric character). The next_token function recognizes two
kinds of identifiers: regular ones (roughly speaking, sequences of
alphanumeric characters), and special ones, or symbols (sequences of
certain non-alphanumeric characters, such as ++). Once an identifier
is recognized, it is compared with the list of declared keywords: if
it is found in the list, a Kwd token is emitted; otherwise an Ident
token is returned.

The sequence ;; is not an identifier of either kind: next_token reads
it as two single characters (the same treatment as for parens,
brackets or commas). The rule for single non-alphanumeric characters
is that they must have been declared as keywords; otherwise it is a
lexical error. Since ; has not been declared as a keyword of the
lexer, an error occurs.
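
The rules described above can be modelled in a few lines of OCaml. This
is only a simplified sketch, not the genlex source: the names classify,
tokenize and Lexical_error are hypothetical, the character classes are
an approximation, and numbers, strings and comments are ignored.

```ocaml
(* Simplified model of the lexing rules; NOT the actual genlex code. *)
type token = Kwd of string | Ident of string
type char_class = Alnum | Symbol | Single

(* Hypothetical exception standing in for the lexer's error. *)
exception Lexical_error of string

(* Assumed character classes: ';' is deliberately not a Symbol character. *)
let classify = function
  | 'A'..'Z' | 'a'..'z' | '0'..'9' | '_' | '\'' -> Alnum
  | '!' | '%' | '&' | '$' | '#' | '+' | '-' | '/' | ':' | '<' | '=' | '>'
  | '?' | '@' | '\\' | '~' | '^' | '|' | '*' -> Symbol
  | _ -> Single

let tokenize keywords s =
  let n = String.length s in
  (* An identifier becomes Kwd if declared, Ident otherwise. *)
  let ident_or_keyword id =
    if List.mem id keywords then Kwd id else Ident id in
  (* Index just past the run of characters of class [cls] starting at [i]. *)
  let rec run_end cls i =
    if i < n && classify s.[i] = cls then run_end cls (i + 1) else i in
  let rec loop i acc =
    if i >= n then List.rev acc
    else if s.[i] = ' ' then loop (i + 1) acc
    else
      match classify s.[i] with
      | Single ->
          (* A single character must itself be a declared keyword. *)
          let c = String.make 1 s.[i] in
          if List.mem c keywords then loop (i + 1) (Kwd c :: acc)
          else raise (Lexical_error c)
      | cls ->
          let j = run_end cls i in
          loop j (ident_or_keyword (String.sub s i (j - i)) :: acc)
  in
  loop 0 []
```

In this model, tokenize ["titi"; "::"; ";;"; "toto"] "toto ;; titi"
raises Lexical_error ";" right after the first token, because the
declared keyword ";;" is never compared against anything: the lexer
only ever sees the single character ';'.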

Now, if you want to deal with a token ``;;'', you may:
 1) Declare ";" as a keyword, then interpret two successive `;' tokens
    as a ";;" in your grammar rule.
 2) Adapt the genlex module to your specific needs. I strongly
    recommend this solution, in particular if you want to understand
    stream parsing, or if you will use the lexer in an intensive way.
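
Solution 1 can be sketched as a post-processing pass over the tokens.
The following OCaml fragment is an illustration only: it works on a
plain list rather than on a token stream, the token type is a local
stand-in that keeps just the Kwd and Ident constructors, and the
function name merge_double_semi is hypothetical.

```ocaml
(* Local stand-in for the lexer's token type (assumption: only these
   two constructors matter here). *)
type token = Kwd of string | Ident of string

(* Rewrite each pair of adjacent Kwd ";" tokens into one Kwd ";;".
   Pairs are formed left to right, so three semicolons yield
   Kwd ";;" followed by Kwd ";". *)
let rec merge_double_semi = function
  | Kwd ";" :: Kwd ";" :: rest -> Kwd ";;" :: merge_double_semi rest
  | tok :: rest -> tok :: merge_double_semi rest
  | [] -> []
```

Note that the lexer has already discarded whitespace at this point, so
a pass like this cannot tell ";;" from "; ;". If that distinction
matters, solution 2 is the better choice.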

Best regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis