From: Phil Pennock <zsh-workers+phil.pennock@spodhuis.org>
To: Zsh hackers list <zsh-workers@zsh.org>
Subject: Re: =~ doesn't work with NUL characters
Date: Wed, 14 Jun 2017 16:49:38 -0400
Message-ID: <20170614204938.GA76510@tower.spodhuis.org>
In-Reply-To: <20170613100217.GA9529@chaz.gmail.com>

On 2017-06-13 at 11:02 +0100, Stephane Chazelas wrote:
> [[ $'a\0b' =~ 'a$' ]]
> 
> returns true both with and without rematchpcre

Let's break this down, non-PCRE and PCRE, and consider appropriate
behaviour for each separately.

Without rematchpcre, this is ERE per POSIX APIs, which don't portably
support size-supplied strings, relying instead upon C-string
null-termination.
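
To make the C-string limitation concrete, here is a minimal sketch,
assuming only a POSIX <regex.h>.  The helper name ere_matches() is made
up for illustration and is not anything in the zsh source.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the ERE `pattern` matches `subject`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * regexec() is given only a pointer, so it stops at the first NUL. */
static int ere_matches(const char *pattern, const char *subject)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && subject != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Calling ere_matches("a$", "a\0b") should report a match, because the
subject the library sees is just "a"; strlen() on the same buffer gives
1 even though three payload bytes are present.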

Current macOS has regnexec() but this is not in the system regexp
library I see on Ubuntu Trusty or FreeBSD 10.3.  It appears to be an
extension from when they switched to the TRE implementation in macOS
10.8.  <https://laurikari.net/tre/>

Trying to support this would result in variations in behaviour across
systems in a way which I think might be undesirable.  The whole point of
adding the non-PCRE implementation was to match Bash behaviour by
default, and Bash does the same thing.

So for non-PCRE, I think this current behaviour is the only sane choice.

For PCRE, I'm inclined to agree that we should be able to portably
supply the length and there would not be any cross-platform behavioural
variances.  I think it's also reasonable that PCRE matching could
diverge from ERE matching even more.  Others might disagree?

We've "always" used strlen here; the most recent change was to handle
meta/unmeta (by me), but the strlen usage has been present since the
pcre module was introduced in commit bff61cf9e1 in 2001.

Thus: do we want to change behaviour, after 16 years, to allow embedded
NUL for the PCRE case, being different from the ERE case?

There's enough room for disagreement here that I'm not rushing to write
a patch, but instead deferring to those with commit-bit.  My personal
inclination is to handle NUL in the PCRE case.  It should just be a
case of passing an int* instead of NULL as the second parameter to
unmetafy().
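
As a rough sketch of what the length out-parameter buys us, here is a
toy stand-in, not the real function.  demo_unmetafy(), the META marker
value and the XOR-32 encoding are all assumptions made for the sake of
a self-contained example; the real definitions live in the zsh source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define META 0x83   /* assumed marker byte, for illustration only */

/* Hypothetical stand-in for unmetafy(): decodes `s` in place,
 * NUL-terminates the result, and, when `len` is non-NULL, stores the
 * decoded byte count there.  With len == NULL the caller has only
 * strlen() to go on and loses everything after an embedded NUL. */
static char *demo_unmetafy(char *s, int *len)
{
    unsigned char *in = (unsigned char *)s;
    unsigned char *out = (unsigned char *)s;

    assert(s != NULL);
    while (*in) {
        if (*in == META && in[1] != '\0') {
            *out++ = in[1] ^ 32;    /* assumed encoding: marker, then byte XOR 32 */
            in += 2;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    if (len)
        *len = (int)(out - (unsigned char *)s);
    return s;
}
```

Given the encoded bytes 'a', META, ' ', 'b', this decodes to 'a', NUL,
'b' and reports a length of 3, whereas strlen() on the result reports
1.  The reported length is what would then be handed to the PCRE
matching call in place of the strlen() result.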

-Phil



