Date: Sat, 15 Sep 2012 19:52:19 -0400
From: Phil Pennock
To: Peter Stephenson
Cc: zsh-users@zsh.org
Subject: Re: regex matching regression in 5.0.0 vs. 4.3.17
Message-ID: <20120915235219.GA97656@redoubt.spodhuis.org>
References: <20120915204246.0bb41f96@pws-pc.ntlworld.com>
In-Reply-To: <20120915204246.0bb41f96@pws-pc.ntlworld.com>
List-Id: Zsh Users List
X-Seq: 17270

On 2012-09-15 at 20:42 +0100, Peter Stephenson wrote:
> So unless anyone can think of a smart solution, I think the only answer
> is to remove NULL characters from the body of the regular expression and
> document that this happens.

The situation sucks, clearly.

So: is it better to change the NUL to something else, to strip it out
(shortening the pattern), or to just document that NULs are bad?

For the POSIX system library regex module, a NUL will always be bad.

For PCRE, pcre_exec() takes a length parameter for the haystack string,
so one option might be to change the NUL in the _pattern_ to be \x00
instead?  It seems that for PCRE, supplying a length-receiving parameter
to unmetafy() and comparing that length to strlen() should be the right
test for an embedded NUL, with the NULs then rewritten as \x00 if the two
differ.  If I do this, then zsh/pcre should be able to handle NULs fine
in both needle/pattern and haystack.

For regex ... generally, I'm not in favour of hidden mutations of strings
which might change whether they match or not.  I can just document it as
a limitation of non-PCRE?

-Phil
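
A minimal sketch of the PCRE-side idea described above, assuming the caller already has the true byte length of the pattern (for instance from a length out-parameter on unmetafy()). The function names has_embedded_nul() and escape_nuls() are hypothetical and are not part of zsh or PCRE. The sketch only shows the length-versus-strlen() check and the NUL to \x00 rewrite; it does not compile or run a regex.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: nonzero if the len-byte buffer s holds a NUL
 * before its end.  Assumes s[len] is a terminating NUL, so strlen()
 * stops early exactly when an embedded NUL exists. */
int has_embedded_nul(const char *s, size_t len)
{
    assert(s != NULL);
    return strlen(s) != len;
}

/* Hypothetical helper: return a newly malloc'd C string in which every
 * NUL byte among the first len bytes of pat is replaced by the four
 * characters \x00.  The caller frees the result; NULL on allocation
 * failure.
 * Limitation of this sketch: a NUL that directly follows a backslash in
 * the pattern would come out as an escaped backslash followed by the
 * literal text x00, so a real implementation would need to handle it. */
char *escape_nuls(const char *pat, size_t len)
{
    size_t i, nuls = 0;
    char *out, *p;

    assert(pat != NULL);

    /* Count NULs first so the output can be sized in one allocation. */
    for (i = 0; i < len; i++)
        if (pat[i] == '\0')
            nuls++;

    /* Each NUL grows from 1 byte to 4, plus one byte for the terminator. */
    out = malloc(len + 3 * nuls + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < len; i++) {
        if (pat[i] == '\0') {
            memcpy(p, "\\x00", 4);
            p += 4;
        } else {
            *p++ = pat[i];
        }
    }
    *p = '\0';
    return out;
}
```

Under these assumptions the rewritten pattern is an ordinary C string that could be handed to pcre_compile(), while the haystack would be passed to pcre_exec() unmodified together with its real length. The POSIX regcomp()/regexec() interface takes plain NUL-terminated strings, so the same trick does not carry over to the haystack there.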