zsh-workers
From: Peter Stephenson <p.stephenson@samsung.com>
To: Zsh hackers list <zsh-workers@zsh.org>
Subject: Re: Issue with ${var#(*_)(#cN,M)}
Date: Tue, 27 Oct 2015 10:46:33 +0000
Message-ID: <20151027104633.2479414f@pwslap01u.europe.root.pri>
In-Reply-To: <20151027100034.45f487f0@pwslap01u.europe.root.pri>

On Tue, 27 Oct 2015 10:00:34 +0000
Peter Stephenson <p.stephenson@samsung.com> wrote:
> Original problem
> > } ~$ a='1_2_3_4_5_6'
> > } ~$ echo ${a#(*_)(#c2)}
> > } 2_3_4_5_6
> 
> On Tue, 20 Oct 2015 16:04:22 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > What's messing it up is the "*" operator and the backtracking that is
> > implied because * can match anything.
> 
> Exactly.  What's backtracking over what in what order here is a bit of
> a nightmare, and I'm not sure I'm likely to get my mind round it.
> 
> Unless someone does, you'll be better off sticking to
> 
> % a='1_2_3_4_5_6'
> % echo ${a#([^_]#_)(#c2)}
> 3_4_5_6
> 
> and then we don't have the "*" within the group to worry about.
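
An alternative that sidesteps the repetition count altogether is to strip
one field at a time. The sketch below is not taken from the thread; the
helper name strip_fields is hypothetical. It relies only on the standard
shortest-prefix removal ${var#pattern}, so it behaves the same in zsh and
in any POSIX shell.

```shell
# Hypothetical helper (not from the thread): remove the first N
# underscore-terminated fields from a string.
# Each ${s#*_} strips the shortest prefix ending in "_", so no
# repetition count is ever applied to a group containing "*".
strip_fields() {
    s=$1
    n=$2
    while [ "$n" -gt 0 ]; do
        s=${s#*_}
        n=$((n - 1))
    done
    printf '%s\n' "$s"
}

strip_fields '1_2_3_4_5_6' 2    # prints 3_4_5_6
```

If the string has fewer than N fields, the remaining iterations leave it
unchanged, because ${s#*_} is a no-op when nothing matches.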

Indeed, I've just noticed that with
% egrep --version
egrep (GNU grep) 2.8

the following:

% egrep '^(*_){2}$' <<<'1_2_'

fails to match completely, i.e. the backtracking is too complicated
to handle, whereas

% egrep '^([^_]+_){2}$' <<<'1_2_'

succeeds.  At this point, I'm going to document the difficulty and
slowly retreat backwards from the dark corner.
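
For anyone wanting to repeat the comparison, here is a minimal sketch
using grep -E. One assumption to flag: in POSIX extended regular
expressions the counterpart of the zsh wildcard `*' is `.*', and a bare
`*' directly after `(' has unspecified behaviour, so the sketch spells
the wildcard form with `.*'. The helper name matches is made up for
illustration.

```shell
# Hypothetical helper: report whether a string matches an ERE.
matches() {
    # $1 = extended regular expression, $2 = subject string
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo yes
    else
        echo no
    fi
}

matches '^([^_]+_){2}$' '1_2_'     # restricted group, as in the message
matches '^(.*_){2}$' '1_2_'        # wildcard group spelled with .*
matches '^([^_]+_){2}$' '1_2_3_'   # three fields against a count of two
```

Restricting the group to `[^_]+_' leaves only one way to split the
string into fields, which is the same idea as the `[^_]#_' workaround
for the zsh pattern.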

pws

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 5ea8610..49a0f0d 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2192,6 +2192,16 @@ inclusive.  The form tt(LPAR()#c)var(N)tt(RPAR()) requires exactly tt(N)
 matches; tt(LPAR()#c,)var(M)tt(RPAR()) is equivalent to specifying var(N)
 as 0; tt(LPAR()#c)var(N)tt(,RPAR()) specifies that there is no maximum
 limit on the number of matches.
+
+Note that if the previous group of characters contains wildcards,
+results can be unpredictable to the point of being logically incorrect.
+It is recommended that the pattern be trimmed to match the minimum
+possible.  For example, to match a string of the form `tt(1_2_3_)', use
+a pattern of the form `tt(LPAR()[[:digit:]]##_+RPAR()LPAR()#c3+RPAR())', not
+`tt(LPAR()*_+RPAR()LPAR()#c3+RPAR())'.  This arises from the
+complicated interaction between attempts to match a number of
+repetitions of the whole pattern and attempts to match the wildcard
+`tt(*)'.
 )
 vindex(MATCH)
 vindex(MBEGIN)



Thread overview: 10+ messages
2015-10-19  9:33 Stephane Chazelas
2015-10-19 19:17 ` Bart Schaefer
2015-10-20 19:09   ` Stephane Chazelas
2015-10-20 23:04     ` Bart Schaefer
2015-10-27 10:00       ` Peter Stephenson
2015-10-27 10:46         ` Peter Stephenson [this message]
2015-10-27 11:03           ` Stephane Chazelas
2015-10-27 11:11             ` Peter Stephenson
2015-10-27 11:11             ` Stephane Chazelas
2015-10-27 11:37               ` Peter Stephenson
