From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA03298; Wed, 15 Sep 2004 17:41:48 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA03230 for ; Wed, 15 Sep 2004 17:41:47 +0200 (MET DST) Received: from yquem.inria.fr (yquem.inria.fr [128.93.8.37]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id i8FFfbVq013933; Wed, 15 Sep 2004 17:41:37 +0200 Received: by yquem.inria.fr (Postfix, from userid 18041) id 5C4FFBC84; Wed, 15 Sep 2004 17:41:37 +0200 (CEST) Date: Wed, 15 Sep 2004 17:41:37 +0200 To: skaller Cc: caml-list Subject: Re: [Caml-list] Need NFA/DFA conversion help Message-ID: <20040915154137.GB24717@yquem.inria.fr> References: <1095260758.27775.1047.camel@pelican.wigram> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1095260758.27775.1047.camel@pelican.wigram> User-Agent: Mutt/1.3.28i From: luc.maranget@inria.fr (Luc Maranget) X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 pointers:01 pcre:01 rtn:99 regexp-:01 regexps:01 9660:01 glebe:01 ville:99 ville:99 transitions:01 retrieval:99 --luc:01 regexp:01 tagged:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk > I need some help understanding how to convert > an NFA with *labelled* arcs into an equivalent DFA. > I'm not even sure it is possible. Any advice, > pointers to papers, or code appreciated. The main > use will be in a set of regexp related tools intended > to replace Ocamllex, Str, and PCRE with pure caml code, > and then extend that with an RTN based parsing system. > > Ocamllex functionality is already available, > however group extraction and friends seem > to require labelled arcs (and RTNs definitely do). > I'm using the naive algorithm (regexp->DFA) from > the Dragon book with the pattern-recognition > modification (which is enough for a tokeniser). > I can't see how to modify it to keep track of > the arcs (special marks in regexps like group > brackets, lookahead operator, etc). > > -- > John Skaller, mailto:skaller@users.sf.net > voice: 061-2-9660-0850, > snail: PO BOX 401 Glebe NSW 2037 Australia > Checkout the Felix programming language http://felix.sf.net As regards ocamllex, I worked from various recent publications by Ville Larikari. Here is the relevent comment from ocamllex sources (lex/lexgen.ml) (* To generate directly a NFA from a regular expression. Confer Aho-Sethi-Ullman, dragon book, chap. 3 Extension to tagged automata. Confer Ville Larikari ``NFAs with Tagged Transitions, their Conversion to Deterministic Automata and Application to Regular Expressions''. Symposium on String Processing and Information Retrieval (SPIRE 2000), (See also) *) To me, all that is far from being obvious... Bon courage ! --Luc ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners