From: Peter Stephenson <p.stephenson@samsung.com>
To: Zsh hackers list <zsh-workers@zsh.org>
Subject: Re: Issue with ${var#(*_)(#cN,M)}
Date: Tue, 27 Oct 2015 10:46:33 +0000
Message-ID: <20151027104633.2479414f@pwslap01u.europe.root.pri>
In-Reply-To: <20151027100034.45f487f0@pwslap01u.europe.root.pri>
On Tue, 27 Oct 2015 10:00:34 +0000
Peter Stephenson <p.stephenson@samsung.com> wrote:
> Original problem
> > } ~$ a='1_2_3_4_5_6'
> > } ~$ echo ${a#(*_)(#c2)}
> > } 2_3_4_5_6
>
> On Tue, 20 Oct 2015 16:04:22 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > What's messing it up is the "*" operator and the backtracking that is
> > implied because * can match anything.
>
> Exactly. What's backtracking over what in what order here is a bit of
> a nightmare, and I'm not sure I'm likely to get my mind round it.
>
> Unless someone does, you'll be better off sticking to
>
> % a='1_2_3_4_5_6'
> % echo ${a#([^_]#_)(#c2)}
> 3_4_5_6
>
> and then we don't have the "*" within the group to worry about.
Indeed, I've just noticed that with
% egrep --version
egrep (GNU grep) 2.8
the following:
% egrep '^(*_){2}$' <<<'1_2_'
fails to match completely, i.e. the backtracking is too complicated
to handle, whereas
% egrep '^([^_]+_){2}$' <<<'1_2_'
succeeds. At this point, I'm going to document the difficulty and
slowly retreat backwards from the dark corner.
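
For anyone wanting to reproduce the comparison without zsh, here is a
minimal sketch using a POSIX shell and grep -E. The helper name
"matches" is hypothetical, made up for this illustration. It checks the
restricted group against two and three fields, and also tries ".*",
which is the ERE spelling of the glob "*". A bare "*" directly after
"(" has no defined meaning in POSIX ERE, so it is possible that the
failing egrep case above is not exercising backtracking at all.

```shell
#!/bin/sh
# Hypothetical helper: report whether an ERE matches the subject string.
matches() {
    # $1 = extended regular expression, $2 = subject string
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo match
    else
        echo no-match
    fi
}

# Restricted group: each repetition is a run of non-underscores plus "_".
restricted=$(matches '^([^_]+_){2}$' '1_2_')

# Same pattern against three fields: exactly two repetitions cannot
# cover the whole anchored string.
too_many=$(matches '^([^_]+_){2}$' '1_2_3_')

# ".*" is the ERE counterpart of the glob "*" inside the group.
dotstar=$(matches '^(.*_){2}$' '1_2_')

echo "restricted: $restricted"
echo "too_many:   $too_many"
echo "dotstar:    $dotstar"
```

This only says something about grep's regular expressions; it is not a
statement about how zsh's own pattern matcher handles (*_)(#c2).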
pws
diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 5ea8610..49a0f0d 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2192,6 +2192,16 @@ inclusive. The form tt(LPAR()#c)var(N)tt(RPAR()) requires exactly tt(N)
 matches; tt(LPAR()#c,)var(M)tt(RPAR()) is equivalent to specifying var(N)
 as 0; tt(LPAR()#c)var(N)tt(,RPAR()) specifies that there is no maximum
 limit on the number of matches.
+
+Note that if the previous group of characters contains wildcards,
+results can be unpredictable to the point of being logically incorrect.
+It is recommended that the pattern be trimmed to match the minimum
+possible. For example, to match a string of the form `tt(1_2_3_)', use
+a pattern of the form `tt(LPAR()[[:digit:]]##_+RPAR()LPAR()#c3+RPAR())', not
+`tt(LPAR()*_+RPAR()LPAR()#c3+RPAR())'. This arises from the
+complicated interaction between attempts to match a number of
+repetitions of the whole pattern and attempts to match the wildcard
+`tt(*)'.
 )
 vindex(MATCH)
 vindex(MBEGIN)