caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: skaller <skaller@maxtal.com.au>
To: Hendrik Tews <tews@tcs.inf.tu-dresden.de>
Cc: "caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: "nested" parsers
Date: Sat, 04 Dec 1999 05:18:56 +1100	[thread overview]
Message-ID: <38480990.4223BF76@maxtal.com.au> (raw)
In-Reply-To: <199912031331.OAA13008@ithif20.inf.tu-dresden.de>

Hendrik Tews wrote:
> 
> Georges Mariano writes:
>    Date: Thu, 02 Dec 1999 13:25:04 +0000
>    Subject: "nested" parsers
> 
>    It appears that we can divide the language L in a few "sub"-languages.
>    Let's say that  L3 <: L2 <: L1 = L   ('<:' included in )
> 
>    And we would like to have one entry point for each language
>    in our parser.
> 
> ocamlyacc generates a parsing function for echa start symbol that
> you declare in the grammar file. Therefore the easiest way (if
> possible) is that you rewrite your grammar, such that for each of
> the languages Ln you have one metasymbol, which generates this
> language. Then you include several start directives in the header
> of the grammar file.

	But it isn't clear how to invoke these sublanguages
recursively, from with an action associated with a reduce
operation for the very sublanguage which we wanted a separate
parser for.

	For example, the Python language can conveniently
be divided into two distinct languages: a statement language
and an expression language. Certainly, I can have a non-terminal
for statements, and one for expressions, and make both entry
points --- and in my parser for Python I do just that.

	But that isn't what I want to do. 
What I actually want is a recursive transition machine corresponding
to a meta-grammar (RHS of productions can be regular expressions;
for example, BNF]

	This can be organised by a finite state automaton in which
a non-terminal transition pushes down the whole automaton,
and starts a new one corresponding to the RHS of the production
defining the non-terminal labelling the arc being followed.

	Chosing the right arc is the hard bit. If the first sets
of all the arcs out of a node are known, the number of trial
parses can be reduced -- to one, if the first sets are disjoint.


-- 
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850




  reply	other threads:[~1999-12-03 18:29 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
1999-12-02 13:25 Georges Mariano
1999-12-03 13:31 ` Hendrik Tews
1999-12-03 18:18   ` skaller [this message]
1999-12-03 14:32 ` Remi VANICAT

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=38480990.4223BF76@maxtal.com.au \
    --to=skaller@maxtal.com.au \
    --cc=caml-list@inria.fr \
    --cc=tews@tcs.inf.tu-dresden.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).