caml-list - the Caml user's mailing list
* Handling include files using ocamllex
@ 2007-08-02 10:09 Erik de Castro Lopo
  2007-08-02 10:29 ` [SPAM?][Caml-list] " Christoph Bauer
  0 siblings, 1 reply; 10+ messages in thread
From: Erik de Castro Lopo @ 2007-08-02 10:09 UTC (permalink / raw)
  To: caml-list

Hi all,

I'm doing some simple parsing with ocamllex and ocamlyacc and I need
to be able to handle C-style include files.

I know how to do this in C with flex and bison, but I can't figure
out how to do it with ocamllex and ocamlyacc.

Anyone know how to do this?

Cheers,
Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
J. Headley: "God, root, what is difference ?"
G. Haverland: "God can change the byte order on the CPU, root can't."



* RE: [SPAM?][Caml-list] Handling include files using ocamllex
  2007-08-02 10:09 Handling include files using ocamllex Erik de Castro Lopo
@ 2007-08-02 10:29 ` Christoph Bauer
  2007-08-02 10:42   ` [Caml-list] " Erik de Castro Lopo
  2007-08-02 13:19   ` [SPAM?][Caml-list] " skaller
  0 siblings, 2 replies; 10+ messages in thread
From: Christoph Bauer @ 2007-08-02 10:29 UTC (permalink / raw)
  To: Erik de Castro Lopo, caml-list

> Hi all,
> 
> I'm doing some simple parsing with ocamllex and ocamlyacc and I
> need to be able to handle C-style include files.
> 
> I know how to do this in C with flex and bison, but I can't 
> figure out how to do it with ocamllex and ocamlyacc.
> 
> Anyone know how to do this?

A solution could be to create a lexbuf from a function with
Lexing.from_function. This function has to manage a stack of open
channels and positions. It has to scan for "#include" statements and
copy the contents of the corresponding files into the buffer in place
of those statements.

Just an idea, I haven't done it yet.
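
A minimal sketch of this idea, assuming includes are written one per line
as #include "name" and that file names can be passed to open_in as given.
The names include_target, include_reader and lexbuf_with_includes are
made up for illustration. Note that positions in the resulting lexbuf
refer to the spliced stream, not to the individual files.

```ocaml
(* Recognise a line of the form: #include "name". Returns the file name. *)
let include_target line =
  let line = String.trim line in
  let prefix = "#include" in
  let plen = String.length prefix in
  if String.length line > plen && String.sub line 0 plen = prefix then
    let rest = String.trim (String.sub line plen (String.length line - plen)) in
    let n = String.length rest in
    if n >= 2 && rest.[0] = '"' && rest.[n - 1] = '"'
    then Some (String.sub rest 1 (n - 2))
    else None
  else None

(* Build a refill function that splices included files into one stream. *)
let include_reader top_filename =
  let stack = ref [ open_in top_filename ] in   (* innermost file first *)
  let pending = ref "" in       (* text read but not yet given to the lexer *)
  let rec fill () =
    match !stack with
    | [] -> ()                                  (* all input consumed *)
    | chan :: rest ->
      (match input_line chan with
       | exception End_of_file ->
         close_in chan; stack := rest; fill ()  (* resume the includer *)
       | line ->
         (match include_target line with
          | Some fname -> stack := open_in fname :: !stack; fill ()
          | None -> pending := line ^ "\n"))
  in
  fun buf n ->
    if !pending = "" then fill ();
    let len = min n (String.length !pending) in
    Bytes.blit_string !pending 0 buf 0 len;
    pending := String.sub !pending len (String.length !pending - len);
    len                                         (* 0 signals end of input *)

let lexbuf_with_includes top_filename =
  Lexing.from_function (include_reader top_filename)
```

The lexbuf returned by lexbuf_with_includes can be handed to an ocamllex
rule unchanged, since the lexer never sees the #include lines.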

Christoph Bauer



* Re: [Caml-list] Handling include files using ocamllex
  2007-08-02 10:29 ` [SPAM?][Caml-list] " Christoph Bauer
@ 2007-08-02 10:42   ` Erik de Castro Lopo
  2007-08-02 13:19   ` [SPAM?][Caml-list] " skaller
  1 sibling, 0 replies; 10+ messages in thread
From: Erik de Castro Lopo @ 2007-08-02 10:42 UTC (permalink / raw)
  To: Christoph Bauer; +Cc: caml-list

Christoph Bauer wrote:

> A solution could be to create an lexbuf from a function with
> Lexing.from_function.
>
> This function has to manage a stack of open channels and positions. It
> has to scan for "#include"-statements and copies instead of these 
> statements the contents of the corresponding files into the buffer.

Nice idea Christoph. I'll give it a try.

Cheers,
Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
The government everybody loves to abuse sues the company everybody loves
to hate. Throw in a bunch of faceless lawyers cross-examining techies
[with] all the charisma of a video driver and you've got a spectacle of
thoroughly miniscule proportions.



* RE: [SPAM?][Caml-list] Handling include files using ocamllex
  2007-08-02 10:29 ` [SPAM?][Caml-list] " Christoph Bauer
  2007-08-02 10:42   ` [Caml-list] " Erik de Castro Lopo
@ 2007-08-02 13:19   ` skaller
  2007-08-05  4:52     ` [Caml-list] " Erik de Castro Lopo
  1 sibling, 1 reply; 10+ messages in thread
From: skaller @ 2007-08-02 13:19 UTC (permalink / raw)
  To: Christoph Bauer; +Cc: Erik de Castro Lopo, caml-list

On Thu, 2007-08-02 at 12:29 +0200, Christoph Bauer wrote:
> > Hi all,
> > 
> > I'm doing some simple parsing with ocamllex and ocamlyacc and I
> > need to be able to handle C-style include files.
> > 
> > I know how to do this in C with flex and bison, but I can't 
> > figure out how to do it with ocamllex and ocamlyacc.
> > 
> > Anyone know how to do this?
> 
> A solution could be to create an lexbuf from a function with
> Lexing.from_function.
> This function has to manage a stack of open channels and positions. It
> has to scan for "#include"-statements and copies instead of these
> statements the contents of the corresponding files into the buffer.
> 
> Just an idea, I haven't done it yet.


I recommend abandoning the idea of passing a 
lexbuf to a parser: make a dummy lexbuf and pass that to
keep Ocamlyacc happy, but make sure you never use it.

Instead, create an Ocaml class with a get_token method,
and use the closure of that method over the class PLUS
a dummy lexbuf.

The class then manages the lexer state. A stack
of Ocamllex lexers and lexbufs can be used. If you want
to do conditional compilation, you also need a stack
of booleans -- one stack per include file (to ensure
conditions don't span file boundaries).
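
A rough sketch of the bookkeeping described above, with one stack of
booleans per open file. The directive names (#if, #endif) and all
identifiers here are assumptions chosen for illustration, not taken
from Felix or any existing lexer.

```ocaml
type frame = {
  name : string;               (* file this frame is lexing *)
  mutable conds : bool list;   (* open conditions, innermost first *)
}

type state = { mutable frames : frame list }

let create top = { frames = [ { name = top; conds = [] } ] }

let current st =
  match st.frames with
  | f :: _ -> f
  | [] -> failwith "no open file"

(* Tokens are passed on only while every open condition of the current
   file holds. An include inside a false branch is never entered, so
   the including file's conditions need not be consulted here. *)
let active st = List.for_all (fun b -> b) (current st).conds

let push_if st cond =
  let f = current st in
  f.conds <- cond :: f.conds

let pop_endif st =
  let f = current st in
  match f.conds with
  | _ :: rest -> f.conds <- rest
  | [] -> failwith (f.name ^ ": #endif without #if")

let enter_file st name = st.frames <- { name; conds = [] } :: st.frames

(* Leaving a file with conditions still open is an error, which is what
   keeps a condition from spanning a file boundary. *)
let leave_file st =
  match st.frames with
  | f :: rest ->
    if f.conds <> [] then failwith (f.name ^ ": unterminated #if");
    st.frames <- rest
  | [] -> failwith "no open file"
```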


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net



* Re: [Caml-list] Handling include files using ocamllex
  2007-08-02 13:19   ` [SPAM?][Caml-list] " skaller
@ 2007-08-05  4:52     ` Erik de Castro Lopo
  2007-08-05  5:35       ` Erik de Castro Lopo
  2007-08-05 10:16       ` skaller
  0 siblings, 2 replies; 10+ messages in thread
From: Erik de Castro Lopo @ 2007-08-05  4:52 UTC (permalink / raw)
  To: caml-list

skaller wrote:

> I recommend abandoning the idea of passing a 
> lexbuf to a parser: make a dummy lexbuf and pass that to
> keep Ocamlyacc happy, but make sure you never use it.
> 
> Instead, create an Ocaml class with a get_token method,
> and use the closure of that method over the class PLUS
> a dummy lexbuf.

I tried that with a class and ran into all sorts of problems
related to trying to use instance data in the constructor.
In the end, I ditched the class/object but kept your idea
and approached it from a more functional direction which
resulted in this (filename lexstack.ml):

------------------8<------------------8<------------------
(* The Lexstack type. *)
type 'a t =
{   mutable stack : (string * in_channel * Lexing.lexbuf) list ;
    mutable filename : string ;
    mutable chan : in_channel ;
    mutable lexbuf : Lexing.lexbuf ;
    lexfunc : Lexing.lexbuf -> 'a ;
    }

(*
** Create a lexstack with an initial top level filename and the
** lexer function.
*)
let create top_filename lexer_function =
    let chan = open_in top_filename in
    {   stack = [] ; filename = top_filename ; chan = chan ;
        lexbuf = Lexing.from_channel chan ;
        lexfunc = lexer_function
        }

(*
** Get the next token. Need to accept an unused dummy lexbuf so that
** a closure consisting of the function and a lexstack can be passed
** to the ocamlyacc generated parser.
*)
let rec get_token ls dummy_lexbuf =
    match ls.lexfunc ls.lexbuf with
    |    Parser.TOK_INCLUDE fname ->
            ls.stack <- (ls.filename, ls.chan, ls.lexbuf) :: ls.stack ;
            ls.filename <- fname ;
            ls.chan <- open_in fname ;
            ls.lexbuf <- Lexing.from_channel ls.chan ;
            get_token ls dummy_lexbuf

    |    Parser.TOK_EOF ->
            (   match ls.stack with
                |    [] -> Parser.TOK_EOF
                |    (fn, ch, lb) :: tail ->
                        ls.filename <- fn ;
                        ls.chan <- ch ;
                        ls.stack <- tail ;
                        get_token ls dummy_lexbuf
                )

    |    anything -> anything


(* Get the current lexeme. *)
let lexeme ls =
    Lexing.lexeme ls.lexbuf

(* Get filename, line number and column number of current lexeme. *)
let current_pos ls =
    let pos = Lexing.lexeme_end_p ls.lexbuf in
    let linepos = pos.Lexing.pos_cnum - pos.Lexing.pos_bol -
        String.length (Lexing.lexeme ls.lexbuf)
        in
    ls.filename, pos.Lexing.pos_lnum, linepos

------------------8<------------------8<------------------
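
One caveat about current_pos: the lexing engine only maintains pos_cnum,
so pos_lnum and pos_bol stay at their initial values unless the lexer
updates them, for example by calling Lexing.new_line in the rule that
matches a newline. The sketch below also stores the file name in the
lexbuf itself; set_filename and describe are illustrative names, and
this is one possible arrangement rather than a required one.

```ocaml
(* Record the file name in the lexbuf so every position carries it. *)
let set_filename lexbuf fname =
  lexbuf.Lexing.lex_curr_p <-
    { lexbuf.Lexing.lex_curr_p with Lexing.pos_fname = fname }

(* File name, line number and zero-based column of a position. *)
let describe (p : Lexing.position) =
  (p.Lexing.pos_fname, p.Lexing.pos_lnum, p.Lexing.pos_cnum - p.Lexing.pos_bol)

(* In the .mll file the newline rule would then look roughly like:
     | '\n'  { Lexing.new_line lexbuf; tokenizer lexbuf }
   where tokenizer stands for the name of the lexer rule. *)
```

With this in place, describe (Lexing.lexeme_start_p lexbuf) gives the
start of the current token directly, without subtracting the lexeme
length from the end position.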

This can then be used like this:

    let lexstack = Lexstack.create filename Scanner.tokenizer in
    let dummy_lexbuf = Lexing.from_string "" in
    try
        Parser.parse (Lexstack.get_token lexstack) dummy_lexbuf
    with
        |    Scanner.Lexical_error s -> raise (E s)
        |    Parsing.Parse_error ->
                let fname, lnum, lpos = Lexstack.current_pos lexstack in
                let errstr = Printf.sprintf
                    "\n\nFile '%s' line %d,  column %d : current token is '%s'.\n"
                    fname lnum lpos (Lexstack.lexeme lexstack) in
                raise (E errstr)

I haven't tested it as thoroughly as I should have, but the
general idea seems to work.

Hopefully this will get hooked up into the Ocaml Weekly News and
then indexed by Google so other people who run into this problem
can find this solution.

Cheers,
Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Copyrighting allows people to benefit from their labours,
but software patents allow the companies with the largest
legal departments to benefit from everyone else's work."
-- Andrew Brown
(http://www.guardian.co.uk/online/comment/story/0,12449,1387575,00.html)



* Re: [Caml-list] Handling include files using ocamllex
  2007-08-05  4:52     ` [Caml-list] " Erik de Castro Lopo
@ 2007-08-05  5:35       ` Erik de Castro Lopo
  2007-08-05 10:16       ` skaller
  1 sibling, 0 replies; 10+ messages in thread
From: Erik de Castro Lopo @ 2007-08-05  5:35 UTC (permalink / raw)
  To: caml-list

Erik de Castro Lopo wrote:

>     |    Parser.TOK_EOF ->
>             (   match ls.stack with
>                 |    [] -> Parser.TOK_EOF
>                 |    (fn, ch, lb) :: tail ->
>                         ls.filename <- fn ;
>                         ls.chan <- ch ;
>                         ls.stack <- tail ;
>                         get_token ls dummy_lexbuf
>                 )

Ooops, in the above section, there is a missing

	ls.lexbuf <- lb ;

just before the recursive call to get_token.
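
For reference, here is get_token with that assignment in place. This is
a self-contained sketch: the token type below is a stand-in for the
ocamlyacc-generated Parser.token, and TOK_WORD is a made-up constructor.
Closing the finished channel and opening the included file before
touching the stack are additions of this sketch, not part of the
original post.

```ocaml
(* Stand-in for the ocamlyacc-generated Parser.token type. *)
type token = TOK_INCLUDE of string | TOK_WORD of string | TOK_EOF

type t = {
  mutable stack : (string * in_channel * Lexing.lexbuf) list;
  mutable filename : string;
  mutable chan : in_channel;
  mutable lexbuf : Lexing.lexbuf;
  lexfunc : Lexing.lexbuf -> token;
}

let create top_filename lexfunc =
  let chan = open_in top_filename in
  { stack = []; filename = top_filename; chan;
    lexbuf = Lexing.from_channel chan; lexfunc }

let rec get_token ls dummy_lexbuf =
  match ls.lexfunc ls.lexbuf with
  | TOK_INCLUDE fname ->
    let chan = open_in fname in   (* open first: on failure ls is unchanged *)
    ls.stack <- (ls.filename, ls.chan, ls.lexbuf) :: ls.stack;
    ls.filename <- fname;
    ls.chan <- chan;
    ls.lexbuf <- Lexing.from_channel chan;
    get_token ls dummy_lexbuf
  | TOK_EOF ->
    (match ls.stack with
     | [] -> TOK_EOF              (* end of the top-level file *)
     | (fn, ch, lb) :: tail ->
       close_in ls.chan;          (* finished with the included file *)
       ls.filename <- fn;
       ls.chan <- ch;
       ls.lexbuf <- lb;           (* the assignment missing from the first post *)
       ls.stack <- tail;
       get_token ls dummy_lexbuf)
  | tok -> tok
```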

Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"I'm too fucking busy, or vice versa" -- Dorothy Parker



* Re: [Caml-list] Handling include files using ocamllex
  2007-08-05  4:52     ` [Caml-list] " Erik de Castro Lopo
  2007-08-05  5:35       ` Erik de Castro Lopo
@ 2007-08-05 10:16       ` skaller
  2007-08-05 10:33         ` Erik de Castro Lopo
  1 sibling, 1 reply; 10+ messages in thread
From: skaller @ 2007-08-05 10:16 UTC (permalink / raw)
  To: Erik de Castro Lopo; +Cc: caml-list

On Sun, 2007-08-05 at 14:52 +1000, Erik de Castro Lopo wrote:

> I tried that with a class and ran into all sorts of problems
> related to trying to use instance data in the constructor.

You should share that experience here. It's hard to know
when to go for a class and when a simpler algebraic data
structure is better, so user experience reports (use cases)
are valuable data.

> In the end, I ditched the class/object but kept your idea
> and approached it from a more functional direction which
> resulted in this (filename lexstack.ml):

> I haven't tested it as thoroughly as I should have, but the
> general idea seems to work.

Yep, it should. But you probably should generalise to 
support conditional compilation as well. Felix provides
that facility. Preprocessor 'macro' symbols are never
expanded in the source code, only in preprocessor
directives.

Even a very basic facility is fairly general and can be
used to solve porting problems as a fallback if other
more well disciplined techniques fail.


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net



* Re: [Caml-list] Handling include files using ocamllex
  2007-08-05 10:16       ` skaller
@ 2007-08-05 10:33         ` Erik de Castro Lopo
  2007-08-05 11:55           ` Jacques GARRIGUE
  0 siblings, 1 reply; 10+ messages in thread
From: Erik de Castro Lopo @ 2007-08-05 10:33 UTC (permalink / raw)
  To: caml-list

skaller wrote:

> You should share that experience here. It's hard to know
> when to go for a class and when a simpler algebraic data
> structure is better, so user experience reports (use cases)
> are valuable data.

Well the problem was that I wanted to do this:

    class lexstack top_filename =
        object
            val mutable filename = top_filename
            val mutable chan = open_in top_filename
            val lexbuf = Lexing.from_channel chan

Oops, error message right there ^^^^^^ trying to use instance
variable chan.

> Yep, it should. But you probably should generalise to 
> support conditional compilation as well.

Not part of my requirements so I'll skip that :-).

Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Only wimps use tape backup: *real* men just upload their
important stuff on FTP, and let the rest of the world
mirror it ;)" -- Linus Torvalds



* Re: [Caml-list] Handling include files using ocamllex
  2007-08-05 10:33         ` Erik de Castro Lopo
@ 2007-08-05 11:55           ` Jacques GARRIGUE
  2007-08-05 12:17             ` Erik de Castro Lopo
  0 siblings, 1 reply; 10+ messages in thread
From: Jacques GARRIGUE @ 2007-08-05 11:55 UTC (permalink / raw)
  To: mle+ocaml; +Cc: caml-list

From: Erik de Castro Lopo <mle+ocaml@mega-nerd.com>
> Well the problem was that I wanted to do this:
> 
>     class lexstack top_filename =
>         object
>             val mutable filename = top_filename
>             val mutable chan = open_in top_filename
>             val lexbuf = Lexing.from_channel chan
> 
> Oops, error message right there ^^^^^^ trying to use instance
> variable chan.

Interesting, because the example you describe here is precisely the
reason it is not allowed (at least as Jerome Vouillon explained to
me.)
That is, you intend the instance variable lexbuf to be the one
associated to the current (mutable) chan, but if you change chan this
will no longer be true.
So, in order to avoid this kind of ambiguity, you have to use let
defined variables. For instance:

class lexstack top_filename =
  let init_chan = open_in top_filename in
  object
    val mutable filename = top_filename
    val mutable chan = init_chan
    val mutable lexbuf_chan = init_chan
    val mutable lexbuf = Lexing.from_channel init_chan
    method lexbuf =
      if chan == lexbuf_chan then lexbuf else
      (lexbuf <- Lexing.from_channel chan; lexbuf_chan <- chan; lexbuf)
    ...

Note that this restriction applies also to immutable instance
variables, because you can modify them through functional update
(the {< chan = ... >} notation.)
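
As a complete, compilable variant of the class above, here is a sketch
that sidesteps the staleness check by changing chan and lexbuf together
in a single method. The method name switch_to is hypothetical, and
closing the old channel on a switch is an assumption that only suits
callers who never return to the previous file.

```ocaml
class lexstack top_filename =
  (* Computed once, before the object exists, so it may be used to
     initialise several instance variables. *)
  let init_chan = open_in top_filename in
  object
    val mutable filename = top_filename
    val mutable chan = init_chan
    val mutable lexbuf = Lexing.from_channel init_chan

    method filename = filename
    method lexbuf = lexbuf

    (* Switch to another file, keeping chan and lexbuf in step. *)
    method switch_to fname =
      let c = open_in fname in
      close_in chan;
      filename <- fname;
      chan <- c;
      lexbuf <- Lexing.from_channel c
  end
```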

Jacques Garrigue



* Re: [Caml-list] Handling include files using ocamllex
  2007-08-05 11:55           ` Jacques GARRIGUE
@ 2007-08-05 12:17             ` Erik de Castro Lopo
  0 siblings, 0 replies; 10+ messages in thread
From: Erik de Castro Lopo @ 2007-08-05 12:17 UTC (permalink / raw)
  To: caml-list

Jacques GARRIGUE wrote:

> Interesting, because the example you describe here is precisely the
> reason it is not allowed (at least as Jerome Vouillon explained to
> me.)
> That is, you intend the instance variable lexbuf to be the one
> associated to the current (mutable) chan, but if you change chan this
> will no longer be true.
> So, in order to avoid this kind of ambiguity, you have to use let
> defined variables. For instance:
> 
> class lexstack top_filename =
>   let init_chan = open_in top_filename in
>   object
>     val mutable filename = top_filename
>     val mutable chan = init_chan
>     val mutable lexbuf_chan = init_chan
>     val mutable lexbuf = Lexing.from_channel init_chan
>     method lexbuf =
>       if chan == lexbuf_chan then lexbuf else
>       (lexbuf <- Lexing.from_channel chan; lexbuf_chan <- chan; lexbuf)
>     ...
> 
> Note that this restriction applies also to immutable instance
> variables, because you can modify them through functional update
> (the {< chan = ... >} notation.)

Thanks for the excellent explanation Jacques.

I must admit, this is one of my first real forays into the OO side of
OCaml, but I quickly found that my aims could be achieved just as
easily using the technique I posted.

Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"A programming language is low level when its programs require
attention to the irrelevant." -- Alan Perlis


