From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 1A244BB91 for ; Wed, 12 Jan 2005 21:13:28 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j0CKDR3w007646 for ; Wed, 12 Jan 2005 21:13:27 +0100 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id VAA29890 for ; Wed, 12 Jan 2005 21:13:27 +0100 (MET) Received: from out2.smtp.messagingengine.com (out2.smtp.messagingengine.com [66.111.4.26]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j0CKDQKD023044 for ; Wed, 12 Jan 2005 21:13:26 +0100 Received: from frontend2.messagingengine.com (frontend2.internal [10.202.2.151]) by frontend1.messagingengine.com (Postfix) with ESMTP id 7DBDAC4C94B; Wed, 12 Jan 2005 15:12:51 -0500 (EST) X-Sasl-enc: SMjEWNxHkfbW5IwvuiHynQ 1105560770 Received: from [172.16.112.115] (burnham.ljcrf.edu [192.231.106.2]) by frontend2.messagingengine.com (Postfix) with ESMTP id 9B86875E; Wed, 12 Jan 2005 15:12:49 -0500 (EST) Date: Wed, 12 Jan 2005 12:12:43 -0800 (PST) From: Martin Jambon X-X-Sender: martin@localhost To: skaller Cc: "O'Caml Mailing List" Subject: Re: [Caml-list] Thread safe Str In-Reply-To: <1105516763.2574.231.camel@pelican.wigram> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Miltered: at nez-perce with ID 41E584E7.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 41E584E6.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 wrote:01 wrote:01 model:01 pcre:01 endmatch:01 backtracking:01 dfa:01 dfa:01 token:01 rhs:01 nfa:01 syntax:01 ...:98 ...:98 X-Spam-Checker-Version: SpamAssassin 3.0.0 (2004-09-13) on yquem.inria.fr X-Spam-Status: No, score=0.1 required=5.0 tests=FORGED_RCVD_HELO autolearn=disabled version=3.0.0 X-Spam-Level: On 12 Jan 2005, skaller wrote: > On Wed, 2005-01-12 at 07:53, Martin Jambon wrote: > > On 11 Jan 2005, skaller wrote: > > > > See the ulex package for a model. The problem with micmatch > > > is precisely that it does use Str. > > > > It uses either Str or PCRE. > > > You can also integrate ulex by yourself if it is possible. > > I'm not sure .. ulex doesn't use NFA's does it? > AFIAK it doesn't support captures. > > The problem with micmatch is that it probably doesn't do what > can be done with a system supported in the core language, > linear matching over all cases. Felix does do it: in this code > > regmatch .. with > | re1 => .. > | re2 => .. > | re3 =>.. > endmatch > > the input characters (from ...) are read exactly once, > not only is there no backtracking, since a DFA is used, > but it isn't a sequence of comparisons (it's a DFA based > tokeniser, the token selecting the RHS expression). > > I would guess micmatch does not support that, although > it probably could. Sorry, I don't know the details of what can be done with DFA or NFA. The idea of micmatch is to be compatible with the behavior of the regular match...with, which allows us to mix both forms of pattern matching (returning the first match, not the longest match). The following returns `Case1: match "abcde", 1 with RE "abc", 1 -> `Case1 | RE "abcd", _ -> `Case2 And the following simplified code still returns `Case1: match "abcde" with RE "abc" -> `Case1 | RE "abcd" -> `Case2 And similarly this returns "abc": match "abcde" with RE ("abc" | "abcd" as x) -> x I could implement the following syntax: match subj with RE (re1 as x1 => `Case1 x1 as y) | (re2 as x2 => `Case2 x2 as y) -> y And the simplified forms: match subj with RE (re1 => `Case1 as y) | (re2 => `Case2 as y) -> y match subj with RE (re1 => f1 ()) | (re2 => f2 ()) -> () We could then do sophisticated things like that: match subj with RE ... ((re1 => f1 ()) | (re2 => raise Try_next_please)) ... -> ... | ... (* next *) -> ... but I don't know if these additions would be a good thing. Martin -- Martin Jambon, PhD Researcher in Structural Bioinformatics since the 20th Century The Burnham Institute http://www.burnham.org San Diego, California