From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id A8DD4BB81 for ; Fri, 24 Dec 2004 18:40:40 +0100 (CET) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.206]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id iBOHed8W004955 for ; Fri, 24 Dec 2004 18:40:40 +0100 Received: by wproxy.gmail.com with SMTP id 68so450306wri for ; Fri, 24 Dec 2004 09:40:39 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=grPJ+qf6I0YEa6PZVe2bpNRB1pxfoKTeC9vj84F8b3QqrTzz9Hjtk/XRn+5igguyPrSp4NMJVjlVV+4vb9YEdLATUR8RK9TNOucuugEin04inj8yn3TayL2yrwjKgx3CyZpiNec5zJl71ZHONe5Yg+qYXyYWJt7IdKM2FOrPEQw= Received: by 10.54.7.2 with SMTP id 2mr505176wrg; Fri, 24 Dec 2004 09:40:39 -0800 (PST) Received: by 10.54.31.60 with HTTP; Fri, 24 Dec 2004 09:40:39 -0800 (PST) Message-ID: <8008871f04122409405d1b9679@mail.gmail.com> Date: Fri, 24 Dec 2004 12:40:39 -0500 From: "Christopher A. Watford" Reply-To: "Christopher A. Watford" To: skaller@users.sourceforge.net Subject: Re: [Caml-list] Str.string_match incorrect Cc: David Brown , William Lovas , caml-list@yquem.inria.fr In-Reply-To: <1103769192.3443.51.camel@pelican.wigram> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <1103687369.6979.50.camel@pelican.wigram> <20041222074455.GA81342@trout> <20041222080009.GA4501@force.stwing.upenn.edu> <1103731044.6979.109.camel@pelican.wigram> <20041222165846.GA30503@old.davidb.org> <1103769192.3443.51.camel@pelican.wigram> X-Miltered: at nez-perce with ID 41CC5497.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 behaves:01 regexp:01 denotes:01 ocaml's:01 val:01 regexp:01 bool:01 pcre:01 pcre:01 ocaml's:01 expression:01 regex:01 strings:01 strings:01 X-Spam-Checker-Version: SpamAssassin 3.0.0 (2004-09-13) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=RCVD_BY_IP autolearn=disabled version=3.0.0 X-Spam-Level: > > > Then they're simply wrong. The fundamental operation is > > > to check if a string is in a regular set of strings. > > > Plainly 'aa' is not in the set { 'a' }. > > > > This is a strange notion of right and wrong. The function behaves exactly > > as it is specified in the documentation. > > Huh? I'm really confused. Two people think the documentation > is correct for the behaviour. > > I do not -- Saying > > string x is matched by regexp r > > means to me the same as > > x is a member of the regular set of strings r denotes > > and with that interpretation there is no possible doubt that > the documentation is wrong and should say > > matches a prefix of the argument string > > in order to describe the behaviour. > Going by PCRE's documentation the regex /a/ matches: 'a', 'aa', 'baaaaa', 'nota bene', and any other string with 'a' anywhere inside. >>From OCaml's description of string_match: val string_match: regexp -> string -> int -> bool [string_match r s start] tests whether the characters in s starting at position start match the regular expression r. The first character of a string has position 0, as usual. This sounds to me like you're doing the PCRE: /a/ if you say: Str.string_match (Str.regexp "a") "ab" 0 ;; And in PCRE /a/ matches "ab" or "aa" or "ba". /^a/ matches "aa", "ab", but not "ba", and /a$/ matches "aa", "ba", but not "ab". So as far as I can tell, OCaml's string_match is correct as the manual describes. -- Christopher A. Watford christopher.watford@gmail.com http://dorm.tunkeymicket.com