From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA31993; Fri, 16 Jan 2004 19:03:00 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA31979 for ; Fri, 16 Jan 2004 19:02:59 +0100 (MET) Received: from herd.plethora.net (herd.plethora.net [205.166.146.1]) by nez-perce.inria.fr (8.11.1/8.11.1) with ESMTP id i0GI2w505178 for ; Fri, 16 Jan 2004 19:02:58 +0100 (MET) Received: from bhurt.plethora.net (bhurt.plethora.net [205.166.146.49]) by herd.plethora.net (8.11.6/8.10.1) with ESMTP id i0GI2lZ27345; Fri, 16 Jan 2004 12:02:48 -0600 (CST) Date: Fri, 16 Jan 2004 13:05:15 -0600 (CST) From: Brian Hurt X-X-Sender: bhurt@localhost.localdomain To: Richard Jones cc: Ocaml Mailing List Subject: Re: [Caml-list] ANNOUNCE: mod_caml 1.0.6 - includes security patch In-Reply-To: <20040116093454.GA23909@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 foo:01 matchings:01 regex:01 regex:01 foo:01 transitions:01 val:01 foobar:01 foobar:01 val:01 parens:01 compiler:01 compiler:01 caml:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Fri, 16 Jan 2004, Richard Jones wrote: > Being able to write: > > var ~ /ab+/ > > and similar certainly makes string handling and simple parsing a lot > easier. > That (or something close to that) could be done via a library. What I'd like to see is to be able to pattern match on regexs, like: match str with | /ab+/ -> ... | /foo(bar)*/ -> ... etc. The compiler could then combine all the matchings into a single DFA, improving performance over code like: if (regex_match str "ab+") then ... else if (regex_match str "foo(bar)*") then ... else ... The regex matching would also let the compiler know if there were possible unmatched strings (these would should up as transitions to the error state in the DFA). Hmm. Actually, you could get close to this. You simply write a function with the signature: val multiway_regex: (string * 'a) list -> string -> 'a The assumption here is that 'a would be a variant type. This would allow you to do: type my_regex_matching = Abb | Foobar | ... ;; let regex = multiway_regex [ ("ab+", Abb); ("foo(bar)*", Foobar); ... ];; match (regex string) with | Abb -> (* matched /ab+/ *) | FooBar -> (* matched /foo(bar)*/ *) ... No- you'd want to be able to grab the substrings. So the type should be: val multiway_regex: (string * (string list -> 'a)) list -> string -> 'a Where the string list passed in to the generator function would be the list of substrings matched inside parens. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners