caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Diego Olivier Fernandez Pons <Diego.FERNANDEZ_PONS@etu.upmc.fr>
To: skaller <skaller@users.sourceforge.net>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] Need NFA/DFA conversion help
Date: Thu, 16 Sep 2004 13:50:48 +0200 (DST)	[thread overview]
Message-ID: <Pine.A41.4.44.0409161246300.602286-100000@ibm1> (raw)
In-Reply-To: <1095291341.27775.1164.camel@pelican.wigram>

    Bonjour,


It seems there are two different problems addressed in your request :
the first one is purely technical and the second one is more
theoretical.

> like Ocamllex, except you don't need any external tool or library,
> you can define everything 'in-caml' and dynamically, as shown.
> (The core of the matcher engine is also purely functional)

Tools like Lex or Yacc build an 'interpreter' (NFA, FDA, push-down
automata or transducers) which interprets your input language and
'does' something.

What you are asking is this set of tools to be done in a dynamic and
pure-caml way instead of a 2-stages mixed C/Caml. This is a technical
issue : just take the same principles that were used to build this
tools and replace the code generation by data structures construction
(sounds simple, doesn'it ?)

> Then you can either process some data, or you can save the compiled
> automaton (possibly as Ocaml or C sourcecode which can be compiled
> directly to executable code like (Ocaml)lex output).

Yes, one can imagine a generic framework that would allow you to
choose how your intepreter will be generated (source-code/data
structure, C/Caml, etc.)

> What doesn't work is sub-group matching, look ahead, etc -- things
> requiring actions in the middle of a regexp (rather than at the
> end).

Here is the second problem. You did exactly what was expected (taking
a textbook on formal languages and implement what is said). The
problem is that tools like PCRE, Lex, Yacc, etc. do not work as
textbook say. They do not accept rational or context-free languages,
they do not use DFA, NFA or LR(k) automata as claimed.

Yacc input language is not a context-free language but some kind of
restricted attribute grammar that lies somewhere between algebraic
transducers and contextual grammars. Hence tools like Bison will
contain a lot of strange hacks to handle all these extensions while
keeping the push-down automata machinery.

In the same way a grammar that admits 'actions in the middle of a
regexp [] rather than at the end' could hardly qualify as a 'rational
language' which is what regexp theoretically describe.

As far as I know, there is no simple way to handle all these useful
'extensions' and theoretically satisfactory ways tend to be very hard.

- general textbook

Parsing techniques (Dick Grune, Ceriel J.H. Jacobs) (electronic
version on free download)
http://www.cs.vu.nl/~dick/PTAPG.html

Dick Grune 'compilers' book has a chapter on attribute grammars

- attribute grammars

FNC2 (strongly non circular attribute grammars)
http://www-rocq.inria.fr/oscar/www/fnc2/

An attribute grammar system in Haskell
http://www.cs.uu.nl/groups/ST/twiki/bin/view/Center/AttributeGrammarSystem

- Yacc like tools

Jacke, Yacc-like parser generator with SML-consistent sytax
http://www.ps.uni-sb.de/~jan/jacke/jacke.html


        Diego Olivier

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


  reply	other threads:[~2004-09-16 11:53 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-09-15 15:05 skaller
2004-09-15 15:41 ` Luc Maranget
2004-09-15 19:15   ` Alain Frisch
2004-09-15 17:50 ` Jason Hickey
2004-09-15 20:50 ` Jason Hickey
2004-09-15 23:35 ` skaller
2004-09-16 11:50   ` Diego Olivier Fernandez Pons [this message]
2004-09-16 12:07     ` Luc Maranget
2004-09-16 12:37       ` james woodyatt
2004-09-16 13:29       ` skaller
2004-09-16 12:57     ` skaller
2004-09-16 13:04       ` Diego Olivier Fernandez Pons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.A41.4.44.0409161246300.602286-100000@ibm1 \
    --to=diego.fernandez_pons@etu.upmc.fr \
    --cc=caml-list@inria.fr \
    --cc=skaller@users.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).