From: Stephane Chazelas <stephane.chazelas@gmail.com>
To: Phil Pennock <zsh-workers+phil.pennock@spodhuis.org>
Cc: Bart Schaefer <schaefer@brasslantern.com>, zsh-workers@zsh.org
Subject: Re: please consider using PCRE_DOLLAR_ENDONLY (and PCRE_DOTALL) for rematchpcre
Date: Tue, 23 Jan 2018 06:57:35 +0000
Message-ID: <20180123065735.GA16678@chaz.gmail.com>
In-Reply-To: <20180122052829.GA83799@tower.spodhuis.org>
2018-01-22 00:28:29 -0500, Phil Pennock:
[...]
> Changing the default behavior of valid semantics risks hard-to-debug
> breakage of existing scripts and I am erring on the side of being
> against this change. It's not hard opposition, but I'd like to see
> stronger justification before risking breaking changes.
>
> I know that I myself have scripts which rely upon PCRE matching against
> multiline data behaving as per the defaults of pcrepattern(3).
>
> In addition, while the DOTALL change can be turned off in-regex, the
> dollar-endonly one can't, AFAIK, so that becomes a breaking change which
> can't be worked around.
[...]
dollar-endonly is not really about multiline matching:
    [[ $'a\nb' =~ 'a$' ]]
will not match with or without it, and
    [[ $'a\nb' =~ '(?m)a$' ]]
will match with or without it.
It's more about single-line input where the line delimiter
happens to be included (and you want $ to match at the end of
that line, before the delimiter, as opposed to only at the very
end of the string).
$ matches before a trailing newline in a string in perl because
of how its <> operator works. perl is a text processing utility;
its regexps are primarily matched against single lines where the
newline is included (contrary to traditional text processing
utilities like sed/grep/awk, where the record separator is not
included).
In:
    perl -pe 's/.$//'
(which calls <>), you want to remove the last character of the
line, not the newline character.
That $ behaviour makes a lot of sense there. Even if you use:
    perl -lpe 's/.$//'
where -l causes the delimiter to be removed on input and
added back on output like in sed/awk, that behaviour does no
harm because the record does *not* contain any newline
delimiter.
But zsh is not a text processing utility, and its "read" builtin
(the closest equivalent to perl's <>) does not include the
delimiter. It's actually hard to end up with a trailing newline
when processing text in shells, given that $(...) strips them.
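A short sketch of that point, written in plain POSIX sh so it should behave the same in zsh (the helper name line_length is hypothetical): neither command substitution nor read leaves a trailing newline in the variable.

```shell
# $(...) removes every trailing newline from the captured output.
captured=$(printf 'foo\n\n\n')
echo "${#captured}"              # 3: only "foo" is left

# Hypothetical helper: read one line from stdin and print its length.
# read stores the line without its newline delimiter.
line_length() {
  IFS= read -r line
  echo "${#line}"
}

printf 'bar\n' | line_length     # 3: only "bar" is stored
```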
On the other hand, having
    [[ $file =~ '\.txt$' ]]
match on file names that don't end in .txt (such as $'foo.txt\n')
is a concern (and in my experience, file names, as opposed to
text lines with delimiters, are the kind of thing I deal with
most often in zsh).
And again, note that it only happens with PCRE matching
(rematchpcre); it works as expected with EREs.
--
Stephane