From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id CAA26861; Mon, 20 Sep 2004 02:38:41 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id CAA25895 for ; Mon, 20 Sep 2004 02:38:39 +0200 (MET DST) Received: from smtp1.adl2.internode.on.net (smtp1.adl2.internode.on.net [203.16.214.181]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id i8K0ca4M021996 for ; Mon, 20 Sep 2004 02:38:38 +0200 Received: from [192.168.1.200] (ppp202-133.lns1.syd3.internode.on.net [203.122.202.133]) by smtp1.adl2.internode.on.net (8.12.9/8.12.9) with ESMTP id i8K0cX4Y003663; Mon, 20 Sep 2004 10:08:34 +0930 (CST) Subject: Re: [Caml-list] [ann] Regexp library supporting binding for * and +'s From: skaller Reply-To: skaller@users.sourceforge.net To: Yutaka OIWA Cc: caml-list In-Reply-To: References: Content-Type: text/plain Message-Id: <1095640712.2580.292.camel@pelican.wigram> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-4) Date: 20 Sep 2004 10:38:33 +1000 Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 414E268C.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 sourceforge:01 2004:99 yutaka:01 oiwa:01 2004:99 pcre:01 soundness:01 unacceptable:01 xduce:01 9660:01 glebe:01 regexp:01 regexp:01 encode:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Mon, 2004-09-20 at 06:41, Yutaka OIWA wrote: > >From the computer room at ICFP2004 in Snowbird Resort, > I announce a beta version of my combinator-based > regular-expression match library which supports > list (Kleene-*) binding. > I plan to construct a neat syntax sugar over this library > and build a next-generation version of Regexp/OCaml library. > Any comments are welcome. Can you explain why/how Pcre is being used? I'm currently looking at providing the same kind of facility, however I need: (a) all pure Ocaml -- reason: maintenance, soundness (b) able to generate fairly simple automata Reason-- the execution target may be C, so it must be possible to both encode the data fairly simply, and also to provide C routines to execute various automata based on that data, without building complex data structures. (c) must process at least a stream of integer inputs Reason: 8 bit inputs are unacceptable for i18n reasons. In addition, there are uses of state machines other than processing 'strings'. I'd like to combine at least (i) tokenisation and (ii) substring extraction however a more general facility such as parsing as in C/XDuce is also appealing. Alternatively, or as well, processing tagged NFA's readily yields RTNs and hence CFG parsing support. -- John Skaller, mailto:skaller@users.sf.net voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners