Date: Fri, 23 May 2003 10:27:55 +0200
From: Stefan Heimann
To: caml-list@inria.fr
Subject: Re: Re: [Caml-list] ocamllex, regular expression syntax
Message-ID: <20030523082755.GA1307@kunz.ratzer>

Hi,

On Fri, May 23, 2003 at 08:31:39AM +0200, Pierre Weis wrote:
> Hi,
>
> [...]
> > But the regular expression syntax in the Str module looks "normal" to
> > me.
> >
> > Regular expressions like this
> >
> > "[^"\\]*(\\.[^"\\]*)*"
> >
> > are not easy to read,
>
> I suppose you did not try this, since it is not a legal regular
> expression. I guess you mean
>
> "[^\\"\\]*(\\.[^\\"\\]*)*"
>
> (Hence, the ``normal looking'' of those reg-exps does not imply simple,
> clear, and natural syntax !)

No, the regular expression I mentioned is correct. The double quotes
do not enclose the regex but are part of it (maybe that's what
confused you):

stefan@kunz:~$ echo '"Hello \"World\"!"' | egrep '"[^"\\]*(\\.[^"\\]*)*"'
"Hello \"World\"!"
stefan@kunz:~$ echo 'x' | egrep '"[^"\\]*(\\.[^"\\]*)*"'
stefan@kunz:~$

The single quotes just protect the string from shell expansion...

> > but with the ocamllex syntax it is even more
> > difficult:
> >
> > '"'[^'"''\\']*('\\'_[^'"''\\']*)*'"'
> >
> > (and harder to write).
>
> It is not so clear to me: the ' conventions are exactly those of the
> language (hence there is no need to \\ the " symbols), the _ gets its
> "normal" meaning of ``whatever'' or ``catch all case'' pattern...

Normally you don't have to escape the " symbol with a backslash. You
only have to do that when you put the regular expression inside double
quotes, that is, when you write it as an ordinary string literal. But
then it becomes really messy... In OCaml's Str module syntax it looks
like this:

let re = regexp "\"[^\"\\\\]*\\(\\\\.[^\"\\\\]*\\)*\""

> [...]

Bye,
  Stefan

-- 
Stefan Heimann
http://www.stefanheimann.net :: personal website.
http://cvsshell.sf.net :: CvsShell, a console based cvs client.
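
P.S. For anyone who wants to try the pattern on more than two inputs,
here is a small sketch. It assumes a POSIX shell and a grep that
supports -E (used here in place of egrep). The helper name
matches_string and the extra test strings are made up for
illustration, and -x (match the whole line) is my addition; the
transcript above matches anywhere in the line.

```shell
#!/bin/sh
# The ERE from the message: a double-quoted string in which a backslash
# escapes the following character. The outer single quotes are shell
# quoting only; the double quotes belong to the pattern itself.
re='"[^"\\]*(\\.[^"\\]*)*"'

# matches_string is a hypothetical helper, not something from the thread.
# -E: extended regexps, -q: no output, -x: the whole line must match.
matches_string() {
    printf '%s\n' "$1" | grep -Eqx -- "$re"
}

# Two well-formed quoted strings, then two inputs that should be rejected.
for s in '"Hello \"World\"!"' '""' 'x' '"unterminated \"'; do
    if matches_string "$s"; then
        printf 'match:    %s\n' "$s"
    else
        printf 'no match: %s\n' "$s"
    fi
done
```

The first two inputs should be reported as matches and the last two as
non-matches. Note that the pattern sits in single quotes exactly as in
the egrep call above, so no character in it needs extra escaping for
the shell.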