From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA02332; Wed, 15 Sep 2004 17:06:03 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA01065 for ; Wed, 15 Sep 2004 17:06:02 +0200 (MET DST) Received: from smtp3.adl2.internode.on.net (smtp3.adl2.internode.on.net [203.16.214.203]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id i8FF603i031975 for ; Wed, 15 Sep 2004 17:06:01 +0200 Received: from [192.168.1.200] (ppp202-133.lns1.syd3.internode.on.net [203.122.202.133]) by smtp3.adl2.internode.on.net (8.12.9/8.12.9) with ESMTP id i8FF5wOU049733 for ; Thu, 16 Sep 2004 00:35:58 +0930 (CST) Subject: [Caml-list] Need NFA/DFA conversion help From: skaller Reply-To: skaller@users.sourceforge.net To: caml-list Content-Type: text/plain Message-Id: <1095260758.27775.1047.camel@pelican.wigram> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-4) Date: 16 Sep 2004 01:05:58 +1000 Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 41485A58.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; sourceforge:01 pointers:01 pcre:01 rtn:99 regexp-:01 regexps:01 9660:01 glebe:01 regexp:01 caml:01 labelled:01 labelled:01 nsw:01 dragon:98 snail:02 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk I need some help understanding how to convert an NFA with *labelled* arcs into an equivalent DFA. I'm not even sure it is possible. Any advice, pointers to papers, or code appreciated. The main use will be in a set of regexp related tools intended to replace Ocamllex, Str, and PCRE with pure caml code, and then extend that with an RTN based parsing system. Ocamllex functionality is already available, however group extraction and friends seem to require labelled arcs (and RTNs definitely do). I'm using the naive algorithm (regexp->DFA) from the Dragon book with the pattern-recognition modification (which is enough for a tokeniser). I can't see how to modify it to keep track of the arcs (special marks in regexps like group brackets, lookahead operator, etc). -- John Skaller, mailto:skaller@users.sf.net voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners