Re: lexer, parser - Markus Mottl

caml-list - the Caml user's mailing list

From: Markus Mottl <mottl@miss.wu-wien.ac.at>
To: skaller@maxtal.com.au (John Skaller)
Cc: caml-list@inria.fr (OCAML)
Subject: Re: lexer, parser
Date: Sat, 12 Jun 1999 13:02:58 +0100 (MET DST)
Message-ID: <199906121102.NAA30678@miss.wu-wien.ac.at>
In-Reply-To: <3.0.6.32.19990603215554.0097b100@triode.net.au> from "John Skaller" at Jun 3, 99 09:55:54 pm

> Is there an 'object' version of the lexer and parser,
> or can I use or adapt the existing code?
> 
> I need to maintain state, but the component also needs
> to be re-entrant.

I guess you want to keep only the syntactic parts in the scanner
and parser files, while the semantics live in objects (very
convenient).

My solution to this is to have the parser return streams of functions.
These functions accept as final argument an object which implements the
appropriate semantics.

E.g.:

module (file) "Foo":

  class semantics = object
    method do_something param = ...
    ...
  end

  (* a stream element takes the semantics object and yields the action's result *)
  type 'a trans_fun  = semantics -> 'a
  and  'a trans_strm = 'a trans_fun Stream.t

  let do_something param obj = obj#do_something param
  ...

The parser (e.g.):

  %start main
  %type <'a Foo.trans_strm> main

  main
    : left_part A_TOKEN { [< $1; 'Foo.do_something $2 >] }
    | A_TOKEN           { [< $1 >] }
    | ...

  left_part
    : ANOTHER_TOKEN     { [< $1 >] }
    | ...

The main program (e.g.):

  let main () =
    let lexbuf     = Lexing.from_channel stdin
    and sm         = new semantics in
    let msg_stream = Parser.main Scanner.start lexbuf in
    Stream.iter (fun f -> f sm) msg_stream (* this executes semantics *)
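
As a minimal, self-contained sketch of the scheme above, the following
replaces the generated parser with a hand-built stream, so it can be run
on its own. Everything in it is an assumption made for illustration: the
stack-machine methods (push, add, top), the value "parsed" standing in
for the result of Parser.main, and the use of Seq in place of Stream
(Stream left the standard library in OCaml 5.0).

```ocaml
(* Hypothetical semantics: a tiny stack machine. All names are illustrative. *)
class semantics = object
  val mutable stack : int list = []
  method push (n : int) = stack <- n :: stack
  method add =
    match stack with
    | a :: b :: rest -> stack <- (a + b) :: rest
    | _ -> failwith "add: stack underflow"
  method top =
    match stack with
    | x :: _ -> x
    | [] -> failwith "top: empty stack"
end

(* A stream element takes the semantics object as its final argument. *)
type trans_fun = semantics -> unit
type trans_strm = trans_fun Seq.t

(* Wrappers that grammar actions would call; once partially applied,
   each one is a trans_fun. *)
let push n (obj : semantics) = obj#push n
let add (obj : semantics) = obj#add

(* Stand-in for what the parser might return for the input "1 2 +". *)
let parsed : trans_strm = List.to_seq [ push 1; push 2; add ]

(* Running the stream executes the semantics on a fresh object, so all
   state lives in that object and nothing is shared between runs. *)
let run (strm : trans_strm) : int =
  let sm = new semantics in
  Seq.iter (fun f -> f sm) strm;
  sm#top
```

Because each call to run creates its own semantics object, the same
stream can be executed several times, or by several callers, without
interference.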

Streams can be "rearranged" very quickly, because substreams can be
combined arbitrarily (e.g. concatenated) without loss of efficiency.
This matters a lot in parsers, which is why I prefer streams over
lists (of functions).
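
To illustrate combining substreams, here is a small sketch that uses
Seq.append as a stand-in for the [< s1; s2 >] stream syntax (an
assumption: it needs a reasonably recent OCaml, and Seq is not the
Stream module used above). The "action" type and the helpers emit and
render are hypothetical names chosen for this example.

```ocaml
(* Hypothetical action type: each stream element writes into a buffer. *)
type action = Buffer.t -> unit

let emit (s : string) : action = fun buf -> Buffer.add_string buf s

(* Two substreams, as two grammar rules might produce them. *)
let left : action Seq.t = List.to_seq [ emit "a"; emit "b" ]
let right : action Seq.t = Seq.return (emit "c")

(* Seq.append does not copy its arguments; elements are produced on
   demand, so substreams can be combined in either order. *)
let both = Seq.append left right
let reordered = Seq.append right left

(* Execute every action of a stream against a fresh buffer. *)
let render (s : action Seq.t) : string =
  let buf = Buffer.create 16 in
  Seq.iter (fun f -> f buf) s;
  Buffer.contents buf
```

With lists, the equivalent of "both" would copy the left operand on
every concatenation, which adds up when a grammar nests deeply.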

Execution of "semantics" is also much faster than versions that use
algebraic datatypes and pattern matching, because here we only have to
call methods (wrapped in the functions of the stream) on objects instead
of matching on abstract syntax trees or the like.

Another advantage: the streams can be directed to any kind of object that
matches the interface of "semantics". Thus, we could have "different"
semantics and have the parsed program evaluated over them.  This adds
greatly to the modularity of the semantics implementation.
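
The idea of running one parsed program over "different" semantics might
look like the sketch below. The class type, the two classes evaluator
and printer, and the hard-coded "program" are all hypothetical; Seq
again stands in for Stream so the example runs without extra packages.

```ocaml
(* Hypothetical interface that every semantics object must match. *)
class type semantics = object
  method push : int -> unit
  method add : unit
end

(* First semantics: evaluate the program on a stack. *)
class evaluator = object
  val mutable stack : int list = []
  method push (n : int) = stack <- n :: stack
  method add =
    match stack with
    | a :: b :: rest -> stack <- (a + b) :: rest
    | _ -> failwith "add: stack underflow"
  method result = List.hd stack
end

(* Second semantics: record a textual listing instead of evaluating. *)
class printer = object
  val mutable out : string list = []
  method push (n : int) = out <- (Printf.sprintf "push %d" n) :: out
  method add = out <- "add" :: out
  method text = String.concat "; " (List.rev out)
end

(* Stand-in for the stream a parser might return for "1 2 +". *)
let program : (semantics -> unit) Seq.t =
  List.to_seq [ (fun o -> o#push 1); (fun o -> o#push 2); (fun o -> o#add) ]

(* Any object with at least the methods of "semantics" can be used. *)
let run (sm : #semantics) =
  Seq.iter (fun f -> f (sm :> semantics)) program

let evaluate () = let ev = new evaluator in run ev; ev#result
let describe () = let pr = new printer in run pr; pr#text
```

The stream itself never changes; only the object passed to run decides
whether the program is evaluated or printed.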

I have found this approach very expressive, efficient and easy to maintain
- I hope it will also help you.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl

Thread overview: 5+ messages
1999-06-03 11:55   ` John Skaller
1999-06-12 12:02     ` Markus Mottl [this message]
1999-06-15  2:01       ` Jacques GARRIGUE
1999-06-15 10:20         ` Markus Mottl
1999-06-14  7:15     ` Christian Lindig
