From: Stephane Chazelas <stephane.chazelas@gmail.com>
To: Phil Pennock <zsh-workers+phil.pennock@spodhuis.org>
Cc: Bart Schaefer <schaefer@brasslantern.com>, zsh-workers@zsh.org
Subject: Re: please consider using PCRE_DOLLAR_ENDONLY (and PCRE_DOTALL) for rematchpcre
Date: Tue, 23 Jan 2018 06:57:35 +0000
Message-ID: <20180123065735.GA16678@chaz.gmail.com>
In-Reply-To: <20180122052829.GA83799@tower.spodhuis.org>

2018-01-22 00:28:29 -0500, Phil Pennock:
[...]
> Changing the default behavior of valid semantics risks hard-to-debug
> breakage of existing scripts and I am erring on the side of being
> against this change.  It's not hard opposition, but I'd like to see
> stronger justification before risking breaking changes.
> 
> I know that I myself have scripts which rely upon PCRE matching against
> multiline data behaving as per the defaults of pcrepattern(3).
> 
> In addition, while the DOTALL change can be turned off in-regex, the
> dollar-endonly one can't, AFAIK, so that becomes a breaking change which
> can't be worked around.
[...]

dollar-endonly is not really about multiline matching:

[[ $'a\nb' =~ 'a$' ]]

will not match with or without it and

[[ $'a\nb' =~ '(?m)a$' ]]

will match with or without it.

It's more about single-line where the line delimiter happens to
be included (and you want the $ to match on the end of that line
as opposed to the end of the string).

$ matches before a trailing newline in a string in perl because
of how its <> operator works. perl is a text processing utility;
its regexps are primarily matched against single lines where the
newline is included (contrary to traditional text processing
utilities like sed/grep/awk, where the record separator is not
included).

In:

    perl -pe 's/.$//'

(which calls <> implicitly), you want to remove the last character
of the line, not the newline character.

That $ behaviour makes a lot of sense there. Even if you use:

   perl -lpe 's/.$//'

where that -l causes the delimiter to be removed on input and
added back on output like in sed/awk, that behaviour does no
harm because the record does *not* contain any newline
delimiter.

But zsh is not a text processing utility, and its "read" builtin
(the closest equivalent to perl's <>) does not include the
delimiter. It's actually hard to have a trailing newline when
processing text in shells given that $(...) strips them.
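
A small sketch of both points, which should behave the same in zsh,
bash and other POSIX-like shells. The sentinel trick at the end is one
common workaround, shown only to illustrate how deliberate you have to
be to keep a trailing newline:

```shell
# read: the newline delimiter is consumed, not stored in the variable
printf 'abc\n' | { IFS= read -r line; printf '%s\n' "${#line}"; }   # 3

# $(...): all trailing newlines are stripped
out=$(printf 'abc\n\n\n')
printf '%s\n' "${#out}"                                              # 3

# keeping one takes extra work, e.g. appending a sentinel and removing it
out=$(printf 'abc\n'; printf x)
out=${out%x}
printf '%s\n' "${#out}"                                              # 4
```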

On the other hand, having

[[ $file =~ '\.txt$' ]]

match on files that don't end in .txt is a concern (and in my
experience, file names (as opposed to text lines with
delimiters) are the kind of thing I deal with most often in zsh).

And again, note that it only happens with rematchpcre; it works as
expected with EREs.


-- 
Stephane

