caml-list - the Caml user's mailing list
* Ocamllex question
@ 2005-10-23 18:02 Matt Gushee
  2005-10-23 20:58 ` [Caml-list] " Michael Wohlwend
  2005-10-23 21:12 ` Another problem (was Re: [Caml-list] Ocamllex question) Matt Gushee
  0 siblings, 2 replies; 8+ messages in thread
From: Matt Gushee @ 2005-10-23 18:02 UTC (permalink / raw)
  To: caml-list

Hello, people--

In a lexer definition with two or more entry points, is there a way to
emit a lexeme and pass control to another entrypoint in one action?

The specific problem I am trying to deal with is a configuration file
format that includes comments denoted with an initial '#' character. I
would like to support the typical usage of '#', where a comment may
begin either at the beginning of the line, or after a declaration that I
want to capture, and in either case it extends to the end of the line.

So in general, anything after '#' up to the end of a line should be
ignored, which I think requires a separate 'comment' entrypoint. At the
end of the line, control returns to the main entry point. So my first
cut looks like this:

  rule dict = parse
      [' ']                             { dict lexbuf }
    | '#'                               { comment lexbuf }
    | word                              { WORD (Lexing.lexeme lexbuf) }
    | ':'                               { COLON }
    | '{'                               { DS }
    | '}'                               { DE }
    | ',' | '\n'                        { SEP }
    | eof                               { EOF }
  and comment = parse
      [ ^ '\n' ]                        { comment lexbuf }
    | '\n'                              { dict lexbuf }

So far so good. BUT, for the sake of simplicity (for users, not for me
;-)), my syntax has line endings as separators, and in order to support
comments following non-comments on the same line, a line ending after a
comment should be interpreted as a separator. So what I want to do is
something like:

  and comment = parse
      [ ^ '\n' ]                        { comment lexbuf }
    | '\n'                              { SEP; dict lexbuf }

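But that doesn't work, of course: an action is an ordinary OCaml
expression that returns a single token, so the SEP before the semicolon
is simply discarded. Maybe the solution is to push SEP back onto the
head of the buffer, but I don't see a way to do that.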

Or would it be better to simply tag the comment text with, say, a
COMMENT symbol and pass it through to the parser?

--
Matt Gushee
Englewood, CO, USA
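
One possible approach, sketched here as an illustration rather than as the answer given in the thread: since the parser calls `dict` afresh for every token, the `comment` rule can simply return SEP at the end of the line, and the next call starts in `dict` again. The inline token type, the definition of `word`, and the `eof` case in `comment` are assumptions; none of them appear in the original post.

```ocaml
(* Sketch of a lexer.mll. The token type is declared inline as an
   assumption; normally it would come from the ocamlyacc-generated parser. *)
{
  type token = WORD of string | COLON | DS | DE | SEP | EOF
}

(* Assumed definition: the original post does not show what `word` is. *)
let word = ['A'-'Z' 'a'-'z' '0'-'9' '_']+

rule dict = parse
    [' ']                             { dict lexbuf }
  | '#'                               { comment lexbuf }  (* comment's result is the token returned *)
  | word                              { WORD (Lexing.lexeme lexbuf) }
  | ':'                               { COLON }
  | '{'                               { DS }
  | '}'                               { DE }
  | ',' | '\n'                        { SEP }
  | eof                               { EOF }
and comment = parse
    [^ '\n']                          { comment lexbuf }
  | '\n'                              { SEP }  (* emit the separator; the next call re-enters dict *)
  | eof                               { EOF }  (* assumed: comment on a last line with no newline *)
```

No push-back is needed in this sketch because control never has to "return" to `dict` explicitly: each entry point is just a function from the lexbuf to one token, and the parser decides which one to call next.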


* ocamllex question
@ 2009-03-10 22:44 Robert Muller
  0 siblings, 0 replies; 8+ messages in thread
From: Robert Muller @ 2009-03-10 22:44 UTC (permalink / raw)
  To: O'Caml Mailing List

I am attempting to use ocamllex together with ocamlyacc to parse a
subset of Python. Python uses indentation to denote statement blocks,
so a lexer is sometimes required to return a sequence of tokens without
advancing the input pointer. In particular, a lexer for Python should
return a sequence of so-called DEDENT tokens when indented fragments
end. E.g.,

def f(x):
	statement1;
	statement2;
		statement3;
		statement4;
A

The lexer should return two consecutive DEDENT tokens between the '\n'
at the end of statement4 and the token for A.

Looking at the documentation and examples, it isn't clear how to
convince the generated lexer not to advance the input pointer, so that
two consecutive DEDENT tokens can be returned before the token for A is
returned.

Any ocamllex experts out there?

Thanks,
Bob Muller
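
A common way around this, sketched below under stated assumptions, is not to stop the input pointer at all: let the raw lexer produce a list of tokens per call and put a small queue between it and the parser. The token type, `indent_tokens`, `buffered`, `demo_raw`, and `tokens_of` are all hypothetical names made up for this illustration, and `demo_raw` is only a stand-in for an ocamllex rule so the example runs without generated code.

```ocaml
(* Hypothetical token type; a real one would come from the ocamlyacc grammar. *)
type token = INDENT | DEDENT | NAME of string | EOF

(* Compare a line's indentation width with the stack of open block levels
   and return the INDENT/DEDENT tokens it implies, updating the stack.
   Simplification: a width matching no open level is not reported as an error. *)
let indent_tokens (stack : int list ref) (width : int) : token list =
  let rec pop acc =
    match !stack with
    | top :: rest when width < top -> stack := rest; pop (DEDENT :: acc)
    | _ -> acc
  in
  match !stack with
  | top :: _ when width > top -> stack := width :: !stack; [INDENT]
  | _ -> pop []

(* Wrap a raw lexer that yields several tokens per call. Surplus tokens
   wait in a queue and are handed out before the source is read again. *)
let buffered raw =
  let pending = Queue.create () in
  let rec next src =
    if Queue.is_empty pending then begin
      List.iter (fun t -> Queue.add t pending) (raw src);
      next src
    end else Queue.pop pending
  in
  next

(* Stand-in for an ocamllex rule: each "line" is (indent width, name). *)
let demo_raw stack (lines : (int * string) list ref) : token list =
  match !lines with
  | [] -> indent_tokens stack 0 @ [EOF]   (* close every open block at end of input *)
  | (width, name) :: rest ->
      lines := rest;
      indent_tokens stack width @ [NAME name]

(* Drain the buffered lexer into a list, up to and including EOF. *)
let tokens_of lines =
  let stack = ref [0] and src = ref lines in
  let next = buffered (demo_raw stack) in
  let rec loop acc =
    match next src with
    | EOF -> List.rev (EOF :: acc)
    | t -> loop (t :: acc)
  in
  loop []
```

With a real ocamllex rule, the source argument would be the `Lexing.lexbuf`, and the wrapped function would be what gets passed to the ocamlyacc entry point, which only expects a function from lexbuf to token.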


* ocamllex question
@ 2005-09-21 18:34 skaller
  0 siblings, 0 replies; 8+ messages in thread
From: skaller @ 2005-09-21 18:34 UTC (permalink / raw)
  To: ocaml

Can eof be read from a lexbuf more than once by an ocamllex lexer?
In particular, if a recursive lexer matches an eof and
returns to its caller, can the parent caller still read
another eof?

In other words, is the character stream followed by one eof
or an infinite stream of them?

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
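
One way to check this empirically is a tiny probe lexer that is called twice on an exhausted buffer. This is only a sketch: the rule name `probe` and the file name are made up for the illustration, and it exercises a string-backed lexbuf only, so it says nothing definite about interactive channels.

```ocaml
(* probe.mll: build and run with `ocamllex probe.mll && ocaml probe.ml` *)
rule probe = parse
    eof   { true }    (* end of input was matched *)
  | _     { false }   (* some ordinary character was matched *)

{
  let () =
    (* Empty input, so the very first call is already at end of input. *)
    let lexbuf = Lexing.from_string "" in
    let first = probe lexbuf in
    let second = probe lexbuf in
    Printf.printf "first call saw eof: %b, second call saw eof: %b\n"
      first second
}
```

If the second call also reports eof, then a nested rule that consumes eof does not prevent its caller from seeing it again, at least for this kind of lexbuf.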




Thread overview: 8+ messages
2005-10-23 18:02 Ocamllex question Matt Gushee
2005-10-23 20:58 ` [Caml-list] " Michael Wohlwend
2005-10-23 21:12 ` Another problem (was Re: [Caml-list] Ocamllex question) Matt Gushee
2005-10-23 21:37   ` Michael Wohlwend
2005-10-24 19:50     ` Matt Gushee
2005-10-24 20:18       ` Michael Wohlwend
  -- strict thread matches above, loose matches on Subject: below --
2009-03-10 22:44 ocamllex question Robert Muller
2005-09-21 18:34 skaller
