caml-list - the Caml user's mailing list
* Parameterised lexer
@ 2008-09-14 19:53 Dario Teixeira
  2008-09-16  9:38 ` [Caml-list] " David Allsopp
  0 siblings, 1 reply; 2+ messages in thread
From: Dario Teixeira @ 2008-09-14 19:53 UTC (permalink / raw)
  To: caml-list

Hi,

Is it possible to write an ocamllex/ulex scanner where a regexp is a parameter
to the lexer function?  I'm looking for something like what the (invalid) ulex
code below demonstrates ("param" is the parameter):

let regexp alpha = ['a'-'z' 'A'-'Z']
let regexp whitespace = [' ' '\t' '\n']
let regexp param1 = 'x'
let regexp param2 = 'y'
let regexp param3 = 'z'

let rec token param = lexer
        | param         ->      Printf.printf "*";
                                token param lexbuf
        | alpha+        ->      Printf.printf "%s" (Ulexing.utf8_lexeme lexbuf);
                                token param lexbuf
        | whitespace+   ->      Printf.printf " ";
                                token param lexbuf
        | eof           ->      Printf.printf "EOF\n"

let main () =
        let lexbuf = Ulexing.from_utf8_channel stdin
        in token param1 lexbuf

let _ = Printexc.print main ()


Thanks in advance for your help!
Kind regards,
Dario Teixeira







* RE: [Caml-list] Parameterised lexer
  2008-09-14 19:53 Parameterised lexer Dario Teixeira
@ 2008-09-16  9:38 ` David Allsopp
  0 siblings, 0 replies; 2+ messages in thread
From: David Allsopp @ 2008-09-16  9:38 UTC (permalink / raw)
  To: 'Dario Teixeira', caml-list

Definitely not possible (directly) with ocamllex - what you're suggesting
would involve recompiling the automaton on each call, which isn't how
ocamllex works: the automaton is built once, when the lexer is generated.
I don't know about ulex.

But: do you know enough about the kind of expressions that param could be to
use one regexp that would cover them all (e.g. ['x' 'y' 'z'] for your
example)? You could then have a lexer action of the form:

rule token param = parse
  reg-exp-for-params {if Str.string_match param (Lexing.lexeme lexbuf) 0
                      then () (* Code *)
                      else failwith "lexing: empty token"}
| rest-of-the-lexer
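
To make the run-time check concrete, here is a minimal sketch of what the
action body could look like, factored into a plain function so it can be
tried without ocamllex. The names is_param and param1 are hypothetical and
only illustrate the idea; the sketch assumes the Str library is linked
(e.g. ocamlfind ocamlopt -package str -linkpkg) and that the static pattern
in the rule is the broad class ['x' 'y' 'z'].

```ocaml
(* Requires the Str library. *)

(* Hypothetical helper standing in for the lexer action's test.
   [param] is the regexp chosen at run time; [lexeme] is the text that
   the broad static pattern has already matched. *)
let is_param (param : Str.regexp) (lexeme : string) : bool =
  (* string_match only anchors at the start, so also require that the
     match reaches the end of the lexeme. *)
  Str.string_match param lexeme 0
  && Str.match_end () = String.length lexeme

(* Assumed run-time parameter: match the literal string "x". *)
let param1 = Str.regexp_string "x"

let () =
  List.iter
    (fun lexeme ->
      Printf.printf "%s -> %s\n" lexeme
        (if is_param param1 lexeme then "param" else "not param"))
    [ "x"; "y"; "z" ]
```

Inside the .mll file the action would then call something like
is_param param (Lexing.lexeme lexbuf). The automaton stays fixed and only
the cheap Str test varies per call, which is why this avoids regenerating
the lexer.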


David

