caml-list - the Caml user's mailing list
* I don't get the lexer
From: jean-david hsu @ 2006-02-12 17:33 UTC (permalink / raw)
  To: caml-list

Hello everyone,
why does my lexer not split "?!" into two tokens, even though "?" and "!"
are both defined as keywords, while it does split off "."?


# let lexer = make_lexer [".";"!";"?"];;
val lexer : char Stream.t -> Genlex.token Stream.t = <fun>
# let token_stream = lexer(Stream.of_string "hello! but ?! but!?. . jhg.");;
val token_stream : Genlex.token Stream.t = <abstr>
# Stream.next token_stream;;
- : Genlex.token = Ident "hello"
# Stream.next token_stream;;
- : Genlex.token = Kwd "!"
# Stream.next token_stream;;
- : Genlex.token = Ident "but"
# Stream.next token_stream;;
- : Genlex.token = Ident "?!"
# Stream.next token_stream;;
- : Genlex.token = Ident "but"
# Stream.next token_stream;;
- : Genlex.token = Ident "!?"
# Stream.next token_stream;;
- : Genlex.token = Kwd "."
# Stream.next token_stream;;
- : Genlex.token = Kwd "."
# Stream.next token_stream;;
- : Genlex.token = Ident "jhg"
# Stream.next token_stream;;
- : Genlex.token = Kwd "."

JD

	

	
		

* Re: [Caml-list] I don't get the lexer
From: Nicolas Pouillard @ 2006-02-14  7:55 UTC (permalink / raw)
  To: jean-david hsu; +Cc: caml-list

On 2/12/06, jean-david hsu <jhsu1@email.sjsu.edu> wrote:
> Hello everyone
Hi,

> how come my lexer does not break "?!" both defined as keywords but puts
> "." aside?

Because the Genlex lexer is not really generated from your keyword list:
it is a fixed lexer that is merely parametrized by a keyword table.

The relevant rules are:
    (* First character of a symbolic token: start collecting in ident2. *)
    | Some
        ('!' | '%' | '&' | '$' | '#' | '+' | '/' | ':' | '<' | '=' | '>' |
         '?' | '@' | '\\' | '~' | '^' | '|' | '*' as c) ->
        Stream.junk strm__;
        let s = strm__ in reset_buffer (); store c; ident2 s
[...]
    (* Default case, where '.' ends up: a single-character keyword. *)
    | Some c -> Stream.junk strm__; Some (keyword_or_error c)
[...]
  (* Following characters: keep consuming while they are in the same class. *)
  and ident2 (strm__ : _ Stream.t) =
    match Stream.peek strm__ with
      Some
        ('!' | '%' | '&' | '$' | '#' | '+' | '-' | '/' | ':' | '<' | '=' |
         '>' | '?' | '@' | '\\' | '~' | '^' | '|' | '*' as c) ->
        Stream.junk strm__; let s = strm__ in store c; ident2 s

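In short, '?' and '!' are in the same character class, whereas '.' is
handled by the default case. Since the lexer looks for the longest token
that matches, "?!" is read as a single token, and a '.' is never merged
into it. The string "?!" as a whole is not in your keyword table, so it
comes back as Ident "?!" rather than as two keywords.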
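
To make the mechanism concrete, here is a minimal standalone sketch of that
behaviour. It is a simplified model, not the actual Genlex code: the names
tokenize, is_symbol_char, is_ident_char and run_end are made up for this
example, it works on a plain string instead of a Stream, and it leaves out
numbers, strings, characters and comments.

```ocaml
(* Simplified model of the Genlex behaviour described above.
   All names here are hypothetical; this is not the real implementation. *)
type token = Kwd of string | Ident of string

(* Characters that get packed together into one symbolic token. *)
let is_symbol_char = function
  | '!' | '%' | '&' | '$' | '#' | '+' | '-' | '/' | ':' | '<' | '=' | '>'
  | '?' | '@' | '\\' | '~' | '^' | '|' | '*' -> true
  | _ -> false

(* Characters of an alphanumeric identifier (numbers are not modelled). *)
let is_ident_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' | '\'' -> true
  | _ -> false

let is_blank = function ' ' | '\t' | '\n' | '\r' -> true | _ -> false

let tokenize keywords input =
  let n = String.length input in
  (* Index just past the longest run of characters satisfying [pred]. *)
  let rec run_end pred i =
    if i < n && pred input.[i] then run_end pred (i + 1) else i
  in
  (* The whole collected string is looked up in the keyword table. *)
  let ident_or_keyword s = if List.mem s keywords then Kwd s else Ident s in
  let rec loop i acc =
    if i >= n then List.rev acc
    else
      let c = input.[i] in
      if is_blank c then loop (i + 1) acc
      else if is_symbol_char c then
        let j = run_end is_symbol_char i in
        loop j (ident_or_keyword (String.sub input i (j - i)) :: acc)
      else if is_ident_char c then
        let j = run_end is_ident_char i in
        loop j (ident_or_keyword (String.sub input i (j - i)) :: acc)
      else
        (* Default case: a lone character must itself be a keyword. *)
        let s = String.make 1 c in
        if List.mem s keywords then loop (i + 1) (Kwd s :: acc)
        else failwith ("Illegal character " ^ s)
  in
  loop 0 []
```

With the keyword list [".";"!";"?"], this model turns "?!" into
[Ident "?!"] but "? !" into [Kwd "?"; Kwd "!"], so separating the two
characters with whitespace is one way to get them as individual keywords.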

--
Nicolas Pouillard
