zsh-workers
* [Bug] S-flag imposes non-greedy match where it shouldn't
@ 2019-12-18 20:41 Sebastian Gniazdowski
  2019-12-18 20:44 ` Sebastian Gniazdowski
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-18 20:41 UTC (permalink / raw)
  To: Zsh hackers list

Hi,
str="aXXXXXbXXXXc"; print ${(S)str##X##}
Output: abXXXXc

As can be seen, the flag worked correctly. However, when %% is used
instead of ##:

str="aXXXXXbXXXXc"; print ${(S)str%%X##}
Output: aXXXXXbXXXc

Then the (S) flag also seems to impose non-greedy matching, not only
substring searching.
-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-18 20:41 [Bug] S-flag imposes non-greedy match where it shouldn't Sebastian Gniazdowski
@ 2019-12-18 20:44 ` Sebastian Gniazdowski
  2019-12-19 15:29   ` Daniel Shahaf
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-18 20:44 UTC (permalink / raw)
  To: Zsh hackers list

Or rather, not a bug… It seems to be the result of how % searches for
substrings from the end: it stops at the first match, i.e. after
finding the first X from the right.


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-18 20:44 ` Sebastian Gniazdowski
@ 2019-12-19 15:29   ` Daniel Shahaf
  2019-12-26 18:35     ` Sebastian Gniazdowski
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Shahaf @ 2019-12-19 15:29 UTC (permalink / raw)
  To: zsh-workers

Sebastian Gniazdowski wrote on Wed, 18 Dec 2019 20:44 +00:00:
> Or rather not a bug… It seems that it's the result of how % searches
> the substrings from the end – it stops at the first match, i.e.: after
> finding a first X from the right.

Could we improve the documentation of (S), then?


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-19 15:29   ` Daniel Shahaf
@ 2019-12-26 18:35     ` Sebastian Gniazdowski
  2019-12-27  4:54       ` Sebastian Gniazdowski
  2019-12-27  5:29       ` Daniel Shahaf
  0 siblings, 2 replies; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-26 18:35 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list


I've attached the extended description. It includes a trick to work
around the unintuitive behavior of S. It looks as follows:

http://psprint.blinkenshell.org/S_flag.png

I think that the way the S flag works is a bit inconsistent, because
${str%%X##**} would not stop at the first match from the right; it
would try other matches starting from the right and go on up to the
final, leftmost X. I think that (S) shouldn't change this, but on the
other hand, should ${(S)str%%X##} match the first three X? Rather not,
as it would then resemble ##... Intuitively, however, it should match
all three of the rightmost X.


[-- Attachment #2: 0001-Extend-description-of-S-flag.patch.txt --]
[-- Type: text/plain, Size: 1363 bytes --]

From 6a2d5a6f0b69cbfb8c76176e55e233a3c710a42b Mon Sep 17 00:00:00 2001
From: Sebastian Gniazdowski <sgniazdowski@gmail.com>
Date: Thu, 26 Dec 2019 19:22:41 +0100
Subject: [PATCH] Extend description of S flag

---
 Doc/Zsh/expn.yo | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index d7147dbd7..90d03a91b 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -1399,6 +1399,20 @@ from the beginning and with tt(%) start from the end of the string.
 With substitution via tt(${)...tt(/)...tt(}) or
 tt(${)...tt(//)...tt(}), specifies non-greedy matching, i.e. that the
 shortest instead of the longest match should be replaced.
+The substring search means that the pattern is matched skipping the
+parts of the input string starting from the direction set by the use
+of tt(#) or tt(%). For example, to match a pattern starting from the
+end, one could use:
+
+example(str="abcXXXdefXXXghi"
+out=${(S)str%%(#b)([^X])X##}
+out=$out${match[1]}
+)
+
+The result is tt(abcXXXdefghi). It would have been tt(abcXXXdefXXghif)
+if not the tt([^X]) part, as despite the tt(%%) specifies a greedy
+match, the substring matching works by trying matches from right to
+left and stops at a first valid match.
 )
 item(tt(I:)var(expr)tt(:))(
 Search the var(expr)th match (where var(expr) evaluates to a number).
-- 
2.21.0



* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-26 18:35     ` Sebastian Gniazdowski
@ 2019-12-27  4:54       ` Sebastian Gniazdowski
  2019-12-27  5:09         ` Sebastian Gniazdowski
  2019-12-27  5:29       ` Daniel Shahaf
  1 sibling, 1 reply; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-27  4:54 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list

The trick is not right. I'll try to update it or otherwise change the
description.


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-27  4:54       ` Sebastian Gniazdowski
@ 2019-12-27  5:09         ` Sebastian Gniazdowski
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-27  5:09 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list


I've updated the trick. It's now:

str="abcXXXdefXXXghi"
out=${(S)str%%(#b)([^X])X##(*)}
out=$out$match[1]$match[2]

The man page looks like http://psprint.blinkenshell.org/S_flag-2.png

I think that the trick is the minimum needed to get the job done, i.e.
to actually match a substring greedily starting from the right.


[-- Attachment #2: 0001-Extend-description-of-S-flag.patch.2.txt --]
[-- Type: text/plain, Size: 1399 bytes --]

From 8176ef315181dc38e41e80c6509d591bddf86db1 Mon Sep 17 00:00:00 2001
From: Sebastian Gniazdowski <sgniazdowski@gmail.com>
Date: Thu, 26 Dec 2019 19:22:41 +0100
Subject: [PATCH] Extend description of S flag

---
 Doc/Zsh/expn.yo | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index d7147dbd7..36813dc20 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -1399,6 +1399,19 @@ from the beginning and with tt(%) start from the end of the string.
 With substitution via tt(${)...tt(/)...tt(}) or
 tt(${)...tt(//)...tt(}), specifies non-greedy matching, i.e. that the
 shortest instead of the longest match should be replaced.
+The substring search means that the pattern is matched skipping the
+parts of the input string starting from the direction set by the use
+of tt(#) or tt(%). For example, to match a pattern starting from the
+end, one could use:
+
+example(str="abcXXXdefXXXghi"
+out=${(S)str%%(#b)([^X])X##(*)}
+out=$out$match[1]$match[2])
+
+The result is tt(abcXXXdefghi). It would have been tt(abcXXXdefXXghi)
+if the substitution would have been tt(${(S)str%%X##}), as despite the
+tt(%%) specifies a greedy match, the substring matching works by
+trying matches from right to left and stops at a first valid match.
 )
 item(tt(I:)var(expr)tt(:))(
 Search the var(expr)th match (where var(expr) evaluates to a number).
-- 
2.21.0



* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-26 18:35     ` Sebastian Gniazdowski
  2019-12-27  4:54       ` Sebastian Gniazdowski
@ 2019-12-27  5:29       ` Daniel Shahaf
  2019-12-28 19:04         ` Sebastian Gniazdowski
  1 sibling, 1 reply; 22+ messages in thread
From: Daniel Shahaf @ 2019-12-27  5:29 UTC (permalink / raw)
  To: Zsh hackers list

Sebastian Gniazdowski wrote on Thu, Dec 26, 2019 at 19:35:05 +0100:
> I've attached the extended description.

Thanks; review below.

> It includes a trick to
> work-around the unintuitive behavior of S. It looks as follows:
> 
> http://psprint.blinkenshell.org/S_flag.png

Please just copy-paste terminal transcripts into the email.  (Static
transcripts, as in the documentation.)

> I think that the way the S flag works is a bit of an inconsistency,
> Because ${str%%X##**} would not stop at the first from the right
> match, it would try other matches starting from the right and go on up
> to the final first from the left X. I think that (S) shouldn't change
> this, but on the other hand should ${(S)str%%X##} match the first
> three X? Rather not, as it would resemble ## then... Intuitively,
> however, it should match all the three right X.

Yes, I don't find the following very intuitive:

% set -- aXbXc
% p ${1%%X*}
a
% p ${(S)1%%X*}
aXb
% p ${(S)1%X*}
aXbc
% 

I expected ${(S)%%} to mean: 'Look for the longest match that ends on the last
character; if you don't find any, then look for the longest match that ends on
the penultimate character; etc, until you finally consider whether $str[1] is a
match and whether ${str[1,0]} is a match'.  However, that's clearly not what it
does here, or ${(S)1%%X*} would have printed «a».  Rather, it seems that
${(S)%} and ${(S)%%} mean 'Find the match whose _start_ is closest to the end
of the string; of all matches that start at a particular index, ${(S)%} picks
the shortest and ${(S)%%} the longest'.

> +++ b/Doc/Zsh/expn.yo
> @@ -1399,6 +1399,20 @@ from the beginning and with tt(%) start from the end of the string.
>  With substitution via tt(${)...tt(/)...tt(}) or
>  tt(${)...tt(//)...tt(}), specifies non-greedy matching, i.e. that the
>  shortest instead of the longest match should be replaced.
> +The substring search means that the pattern is matched skipping the
> +parts of the input string starting from the direction set by the use
> +of tt(#) or tt(%).

I don't understand this sentence.  What does "skipping" mean?

Documentation should be clear and specific enough to allow acceptance
tests to be based on it.

> +For example, to match a pattern starting from the
> +end, one could use:
> +
> +example(str="abcXXXdefXXXghi"
> +out=${(S)str%%(#b)([^X])X##}
> +out=$out${match[1]}
> +)
> +
> +The result is tt(abcXXXdefghi).

That's not correct.  The output is abcXXXdefXXXghi (in 'zsh -f') or
abcXXXdeghif (with extendedglob set), but not abcXXXdefghi.

I doubt this example would clarify the meaning of ${(S)} to people who
encounter it for the first time.  Please use a more minimal example.
Specific issues:

- Assigning to $out a concatenation of two different values muddies the water.
  It forces readers to reverse engineer which parts of the resultant value come
  from ${match[1]} and which from the ${(S)%%}.  This is documentation, not
  a homework problem; the answer should be obvious.  Something like
  «out="${out}+${match[1]}"» would address this — but…

- … the use of advanced pattern matching features needlessly raises the
  learning curve.  For example, the use of «##» doesn't affect the behaviour
  of the example in any meaningful way, but it has two downsides: it means the
  example won't work out of the box when people paste it into their shell, and
  it means people who RTFM about (S) won't be able to understand it until they
  also look up what «##» does [which in turn means they'll have to open
  zshoptions(1) to RTFM about EXTENDED_GLOB].  This mostly applies to the use
  of (#b) and capture groups too: it would be better not to assume knowledge
  of that.

> It would have been tt(abcXXXdefXXghif)
> +if not the tt([^X]) part, as despite the tt(%%) specifies a greedy
> +match, the substring matching works by trying matches from right to
> +left and stops at a first valid match.

There are some grammatical errors here (e.g., s/(?<=specif)ies/ying/), but
let's not worry about them until the rest of the patch isn't a moving target.

Thanks for the patch.  I look forward to a v2.

Daniel

P.S. Obviously, I meant to write «s/specifies/specifying/» — but I wanted to
illustrate the point that no more knowledge of pattern-matching syntax should
be assumed than necessary.  [It was a positive lookbehind assertion in Perl
syntax.]


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-27  5:29       ` Daniel Shahaf
@ 2019-12-28 19:04         ` Sebastian Gniazdowski
  2019-12-28 20:34           ` Bart Schaefer
  2019-12-28 21:00           ` Daniel Shahaf
  0 siblings, 2 replies; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-28 19:04 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list

On Fri, 27 Dec 2019 at 06:30, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Sebastian Gniazdowski wrote on Thu, Dec 26, 2019 at 19:35:05 +0100:
> > +++ b/Doc/Zsh/expn.yo
> > @@ -1399,6 +1399,20 @@ from the beginning and with tt(%) start from the end of the string.
> >  With substitution via tt(${)...tt(/)...tt(}) or
> >  tt(${)...tt(//)...tt(}), specifies non-greedy matching, i.e. that the
> >  shortest instead of the longest match should be replaced.
> > +The substring search means that the pattern is matched skipping the
> > +parts of the input string starting from the direction set by the use
> > +of tt(#) or tt(%).
>
> I don't understand this sentence.  What does "skipping" mean?

It means that, when moving towards the other end, the parts of the
string that don't match are skipped. Does the sentence need an update?

> > +For example, to match a pattern starting from the
> > +end, one could use:
> > +
> > +example(str="abcXXXdefXXXghi"
> > +out=${(S)str%%(#b)([^X])X##}
> > +out=$out${match[1]}
> > +)
> > +
> > +The result is tt(abcXXXdefghi).
>
> That's not correct.  The output is abcXXXdefXXXghi (in 'zsh -f') or
> abcXXXdeghif (with extendedglob set), but not abcXXXdefghi.

I've sent an updated patch half an hour before your email. It contains
the correct example.

> I doubt this example would clarify the meaning of ${(S)} to people who
> encounter it for the first time.  Please use a more minimal example.
> Specific issues:
>   - (...) This is documentation, not
>   a homework problem; the answer should be obvious.  Something like
>   «out="${out}+${match[1]}"» would address this — but…

I think that many examples in the man pages are like that: they don't
take the obvious path of just demonstrating the usage; instead, they
cover some edge case that, after (sometimes quite long) thought,
reveals something very peculiar about the feature. There are better
examples of this; however, the best that I've found so far is the
one used for the #b glob flag:

             foo="a string with a message"
             if [[ $foo = (a|an)' '(#b)(*)' '* ]]; then
               print ${foo[$mbegin[1],$mend[1]]}
             fi

The example prints `string with a', and the user has the "homework" of
untangling a few points:
- why it isn't "string with a message" (it's because of the final ' '*
part, which requires a space after the final word of the (*) part),
- why the answer isn't "message" (the same as above, plus the fact that
there's no * before (a|an), and the greediness).

If not for the homework attitude of the examples in the man page, the
example would have been

             if [[ "a string with a message" = (#b)a' '(*) ]]; then

and would give the answer "string with a message". This would have
been the obvious-demonstration attitude that I've referred to.

> - … the use of advanced pattern matching features needlessly raises the
>   learning curve.

I can add a mention that the example needs EXTENDED_GLOB. Overall I
think that the example:
- is nice because it shows how to make the (S)...%% substitution
behave as intuition would suggest,
- is the only place in the documentation that uses the (#b) flag
with #/% substitution, showing that it's possible to use it in that
place,
- isn't that complex for someone who knows the #b flag and the $match parameter.

> > It would have been tt(abcXXXdefXXghif)
> > +if not the tt([^X]) part, as despite the tt(%%) specifies a greedy
> > +match, the substring matching works by trying matches from right to
> > +left and stops at a first valid match.
>
> There are some grammatical errors here (e.g., s/(?<=specif)ies/ying/), but
> let's not worry about them until the rest of the patch isn't a moving target.

I think that the grammar is correct here. Did you maybe misread the sentence?


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-28 19:04         ` Sebastian Gniazdowski
@ 2019-12-28 20:34           ` Bart Schaefer
  2019-12-28 21:00           ` Daniel Shahaf
  1 sibling, 0 replies; 22+ messages in thread
From: Bart Schaefer @ 2019-12-28 20:34 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Daniel Shahaf, Zsh hackers list

On Sat, Dec 28, 2019 at 11:05 AM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> I think that many examples in the man pages are like that – they don't
> go the obvious path of just demonstrating the usage but instead, they
> cover some edge case

Usually the examples in the manual page are derived from some usage in
a completion function or the like, or from a "how do you accomplish
X?" question on one of the mailing lists that led to the addition of
the feature in question.  Consequently they tend to have been drawn
from unusual situations and never updated once a more obvious/common
usage develops.

> > > +if not the tt([^X]) part, as despite the tt(%%) specifies a greedy
> > > +match, the substring matching works by trying matches from right to
> > > +left and stops at a first valid match.
> >
> > There are some grammatical errors here (e.g., s/(?<=specif)ies/ying/)
>
> I think that grammar is correct here. Did you maybe misread the sentence?

In American English it would be more common to say "despite X
specifying Y" or "despite that X specifies Y" -- it's not exactly
wrong to omit the word "that" but it's unusual.  Sometimes one would
even go so far as "despite the fact that X specifies Y".   In the
first case "despite" modifies "specifying", in the second case
"despite" modifies the entire condition "X specifies Y" and "that" is
needed to distinguish that "despite" isn't directly modifying "X"
(exactly the way I put "that" after "distinguish" in this sentence).


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-28 19:04         ` Sebastian Gniazdowski
  2019-12-28 20:34           ` Bart Schaefer
@ 2019-12-28 21:00           ` Daniel Shahaf
  2019-12-29  0:56             ` Sebastian Gniazdowski
  1 sibling, 1 reply; 22+ messages in thread
From: Daniel Shahaf @ 2019-12-28 21:00 UTC (permalink / raw)
  To: Zsh hackers list

Sebastian Gniazdowski wrote on Sat, Dec 28, 2019 at 20:04:21 +0100:
> On Fri, 27 Dec 2019 at 06:30, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Sebastian Gniazdowski wrote on Thu, Dec 26, 2019 at 19:35:05 +0100:
> > > +++ b/Doc/Zsh/expn.yo
> > > @@ -1399,6 +1399,20 @@ from the beginning and with tt(%) start from the end of the string.
> > >  With substitution via tt(${)...tt(/)...tt(}) or
> > >  tt(${)...tt(//)...tt(}), specifies non-greedy matching, i.e. that the
> > >  shortest instead of the longest match should be replaced.
> > > +The substring search means that the pattern is matched skipping the
> > > +parts of the input string starting from the direction set by the use
> > > +of tt(#) or tt(%).
> >
> > I don't understand this sentence.  What does "skipping" mean?
> 
> It means that parts of the string are being skipped when they don't
> match when moving to the other end. Does the sentence need an update?

Yes.  Feel free to also add a paragraph break, and/or to change the incumbent
text, too.

> > > +For example, to match a pattern starting from the
> > > +end, one could use:
> > > +
> > > +example(str="abcXXXdefXXXghi"
> > > +out=${(S)str%%(#b)([^X])X##}
> > > +out=$out${match[1]}
> > > +)
> > > +
> > > +The result is tt(abcXXXdefghi).
> >
> > That's not correct.  The output is abcXXXdefXXXghi (in 'zsh -f') or
> > abcXXXdeghif (with extendedglob set), but not abcXXXdefghi.
> 
> I've sent an updated patch half hour before your email. It contains
> the correct example.
> 

I saw it, but most of my feedback applied to it too.

I think the last sentence of that patch is the most important one, since it's
the only one that actually gives the general rule.  I'd put it nearer the top.

> > I doubt this example would clarify the meaning of ${(S)} to people who
> > encounter it for the first time.  Please use a more minimal example.
> > Specific issues:
> >   - (...) This is documentation, not
> >   a homework problem; the answer should be obvious.  Something like
> >   «out="${out}+${match[1]}"» would address this — but…
> 
> I think that many examples in the man pages are like that – they don't
> go the obvious path of just demonstrating the usage but instead, they
> cover some edge case that, after (sometimes quite long) thinking
> reveal something very peculiar about the feature.

So what?  We're not going to accept a patch that adds an unclear explanation
simply because other explanations are unclear.

New documentation should be clear.  If any of the existing documentation is
unclear, we should fix that, too.

> There are better examples of this, however, the best that I've found
> currently is the one used for the #b glob flag:
> 
>              foo="a string with a message"
>              if [[ $foo = (a|an)' '(#b)(*)' '* ]]; then
>                print ${foo[$mbegin[1],$mend[1]]}
>             fi
> 
> The example prints `string with a', and the user has a "homework" of
> untangling a few points:
> - why it isn't "string with a message" (it's because the final ' '*
> part that requires a space after the final word of the (*) part),
> - why the answer isn't "message" (the same as above plus the fact that
> there's no * before (a|an) and the greediness).
> 
> If not the homework-attitude of the examples in the man page, the
> example would have been
> 
>              if [[ "a string with a message" = (#b)a' '(*) ]]; then
> 
> and would give the answer "string with a message". This would have
> been the obvious-demonstration attitude that I've referred to.

You can't actually get rid of the variable $foo; it's needed for the «print»
call on the next line.  Otherwise, I agree.  I'll go ahead and make the change,
and also change the spaces to underscores.  Thanks for pointing this out.  Do
you know any other examples that have room for improvement?

> > - … the use of advanced pattern matching features needlessly raises the
> >   learning curve.
> 
> I can add the mention that the example needs EXTENDED_GLOB. Overall I
> think that the example:
> - is nice because it shows how to make the (S)...%% substitution
> behave as the intuition would suggest,

Let's not lose sight of the wood for the trees.  The purpose of the
documentation is first and foremost to describe what a feature _does_, be it
intuitive or not.  Describing how to coerce it into doing other things is
secondary.  Your (revised) patch puts the cart before the horse: it describes
your "trick" before describing what ${(S)%%} actually does.  Please change
that.

If you then want to recommend left-anchoring the pattern in order to force
a match that starts farther from the end to be used, that would be fine.
And if the left-anchoring example requires capturing groups, so be it — but you
could probably give an example that doesn't.  (passwd(5) lines come to mind.)

I wonder if there's anything else the documentation could recommend.  Your
trick boils down to using captured negated character classes as a poor man's
negative lookbehind assertion, but we have the zsh/pcre module which supports
real lookaround assertions (as well as resetting the start of the
match, \K), so perhaps that could be used?  Or perhaps there's a way to get the
"intuitive" behaviour by reversing the string, using ${(S)##}, and reversing it
again.

> - it's the only place in the documentation that uses the (#b) flag
> with #/% substitution, showing that it's possible to use it in that
> place,

We can add a separate example for that under (#b), which is the more advanced
of these topics, and subject the explanation of (S) to KISS.

> - it isn't that complex for someone that knows #b flag and the $match parameter.

The documentation is aimed at everyone, including people who don't already know (#b).

> > > It would have been tt(abcXXXdefXXghif)
> > > +if not the tt([^X]) part, as despite the tt(%%) specifies a greedy
> > > +match, the substring matching works by trying matches from right to
> > > +left and stops at a first valid match.
> >
> > There are some grammatical errors here (e.g., s/(?<=specif)ies/ying/), but
> > let's not worry about them until the rest of the patch isn't a moving target.
> 
> I think that grammar is correct here. Did you maybe misread the sentence?

No, I didn't.  I was taught that "despite" should always be followed by a noun
phrase, never by a sentence.

Cheers,

Daniel


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-28 21:00           ` Daniel Shahaf
@ 2019-12-29  0:56             ` Sebastian Gniazdowski
  2019-12-29  2:05               ` Daniel Shahaf
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-29  0:56 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list


On Sat, 28 Dec 2019 at 22:01, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:

> Sebastian Gniazdowski wrote on Sat, Dec 28, 2019 at 20:04:21 +0100:
> > I think that many examples in the man pages are like that – they don't
> > go the obvious path of just demonstrating the usage but instead, they
> > cover some edge case that, after (sometimes quite long) thinking
> > reveal something very peculiar about the feature.
>
> So what?  We're not going to accept a patch that adds an unclear
> explanation simply because other explanations are unclear.
>

I think that the style of the docs has value. At first one can get a little
angry ("why doesn't the example just confirm what I already suspect?");
however, after untangling it, one will most probably feel satisfaction and
gratitude. It's a way to share advanced, expert knowledge.

> You can't actually get rid of the variable $foo; it's needed for the «print»
> call on the next line.


Just noticing that this shows another non-trivial aspect of the example -
that it uses mbegin and mend instead of match.

> Otherwise, I agree.  I'll go ahead and make the change,
> and also change the spaces to underscores.  Thanks for pointing this out.
> Do you know any other examples that have room for improvement?

As I said I see much value in the current style of the docs and I've
learned much from it, so I don't think it should be changed.

-- 
Best regards,
Sebastian Gniazdowski


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-29  0:56             ` Sebastian Gniazdowski
@ 2019-12-29  2:05               ` Daniel Shahaf
  2019-12-29  3:14                 ` Sebastian Gniazdowski
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Shahaf @ 2019-12-29  2:05 UTC (permalink / raw)
  To: zsh-workers

Sebastian Gniazdowski wrote on Sun, Dec 29, 2019 at 01:56:12 +0100:
> On Sat, 28 Dec 2019 at 22:01, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> 
> > Sebastian Gniazdowski wrote on Sat, Dec 28, 2019 at 20:04:21 +0100:
> > > I think that many examples in the man pages are like that – they don't
> > > go the obvious path of just demonstrating the usage but instead, they
> > > cover some edge case that, after (sometimes quite long) thinking
> > > reveal something very peculiar about the feature.
> >
> > So what?  We're not going to accept a patch that adds an unclear
> > explanation simply because other explanations are unclear.
> >
> 
> I think that the style of the docs has value. At first one can get a little
> angry ("why doesn't the example just confirm what I already suspect?");
> however, after untangling it one will most probably feel satisfaction and
> gratitude. It's a way to share advanced, expert knowledge.

There should be no untangling.  Documentation is about conveying knowledge, not
about presenting riddles.  The documentation of (S) should explain what (S)
does assuming as little knowledge as practicable.  Likewise, the documentation
of (#b) should, if possible, explain what (#b) does without requiring the
reader to know — or, worse, reverse engineer — details such as the greediness
of the * operator.  That detail should, of course, be documented in the section
about that operator.

> > You can't actually get rid of the variable $foo; it's needed for the «print»
> > call on the next line.
> 
> 
> Just noticing that this shows another non-trivial aspect of the example -
> that it uses mbegin and mend instead of match.

Using mbegin and mend is not non-trivial in the context of that example.

> > Otherwise, I agree.  I'll go ahead and make the change,
> > and also change the spaces to underscores.  Thanks for pointing this out.
> > Do you know any other examples that have room for improvement?
> >
> 
> As I said I see much value in the current style of the docs and I've
> learned much from it, so I don't think it should be changed.

Well, I'm sorry, but I expect consensus will not side with you on this.

Do you intend to send another revision of the (S) docs patch upthread?


* Re: [Bug] S-flag imposes non-greedy match where it shouldn't
  2019-12-29  2:05               ` Daniel Shahaf
@ 2019-12-29  3:14                 ` Sebastian Gniazdowski
  2019-12-30 18:00                   ` [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't) Daniel Shahaf
  0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-29  3:14 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list

On Sun, 29 Dec 2019 at 03:06, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Sebastian Gniazdowski wrote on Sun, Dec 29, 2019 at 01:56:12 +0100:
> > I think that the style of the docs has value. At first one can get a little
> > angry ("why doesn't the example just confirm what I already suspect?");
> > however, after untangling it one will most probably feel satisfaction and
> > gratitude. It's a way to share advanced, expert knowledge.
>
> There should be no untangling.  Documentation is about conveying knowledge, not
> about presenting riddles.

I'm just noting that the riddle style is a way to convey peculiar
things about the described features, and it does work, although it
demands more effort from the reader. Otherwise, there must basically
be a long paragraph of text that states the subtleties openly, which
is hard to do clearly. I can opt for the riddle style because it has
its advantages: reading such a long, delicate text takes effort too
and is open to misreading, while the riddle condenses the knowledge
into a form that isn't.

> > As I said I see much value in the current style of the docs and I've
> > learned much from it, so I don't think it should be changed.
>
> Well, I'm sorry, but I expect consensus will not side with you on this.
>
> Do you intend to send another revision of the (S) docs patch upthread?

Rather not; I couldn't replace the example with anything simpler
and the patch that would describe the issue with (S) would be either
superficial (for a short version) or too long and convoluted.

-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)
  2019-12-29  3:14                 ` Sebastian Gniazdowski
@ 2019-12-30 18:00                   ` Daniel Shahaf
  2019-12-30 18:11                     ` Roman Perepelitsa
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Shahaf @ 2019-12-30 18:00 UTC (permalink / raw)
  To: Zsh hackers list

---
Sebastian Gniazdowski wrote on Sun, Dec 29, 2019 at 04:14:11 +0100:
> Rather not; I couldn't replace the example with anything simpler
> and the patch that would describe the issue with (S) would be either
> superficial (for a short version) or too long and convoluted.

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index d7147dbd7..92687bcfe 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -1394,11 +1394,40 @@ used with the tt(${)...tt(/)...tt(}) forms.
 
 startitem()
 item(tt(S))(
-Search substrings as well as beginnings or ends; with tt(#) start
-from the beginning and with tt(%) start from the end of the string.
+With tt(#) or tt(##), search for the match that starts closest to the start of
+the string (a `substring match'). Of all matches at a particular position,
+tt(#) selects the shortest and tt(##) the longest:
+
+example(% str="aXbXc"
+% echo ${+LPAR()S+RPAR()str#X*}
+abXc
+% echo ${+LPAR()S+RPAR()str##X*}
+a
+% )
+
+With tt(%) or tt(%%), search for the match that starts closest to the end of
+the string:
+
+example(% str="aXbXc"
+% echo ${+LPAR()S+RPAR()str%X*}
+aXbc
+% echo ${+LPAR()S+RPAR()str%%X*}
+aXb
+% )
+
+(Note that tt(%) and tt(%%) don't search for the match that ends closest to the
+end of the string, as one might expect.)
+
 With substitution via tt(${)...tt(/)...tt(}) or
 tt(${)...tt(//)...tt(}), specifies non-greedy matching, i.e. that the
-shortest instead of the longest match should be replaced.
+shortest instead of the longest match should be replaced:
+
+example(% str="abab"
+% echo ${str/*b/_}
+_
+% echo ${+LPAR()S+RPAR()str/*b/_}
+_ab
+% )
 )
 item(tt(I:)var(expr)tt(:))(
 Search the var(expr)th match (where var(expr) evaluates to a number).

Cheers,

Daniel
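
The four cases in the patch can be emulated with explicit loops. The
following is only a minimal sketch, not zsh's implementation: strip_s is
a hypothetical helper, it handles only the pattern X* for a literal
one-character X, and it sticks to plain substring expansion and
arithmetic so that it also runs in shells without the (S) flag.

```shell
# Hypothetical helper (an assumption for illustration, not part of zsh):
# emulate ${(S)str OP X*} where X is a single literal character.
# Usage: strip_s OP STRING CHAR, with OP one of '#', '##', '%', '%%'.
strip_s() {
  local op=$1 s=$2 ch=$3
  local n=${#s}
  local i start=-1 len rest

  case $op in
    '#'|'##')
      # Search for the match that starts closest to the START of the string.
      i=0
      while (( i < n )); do
        if [ "${s:$i:1}" = "$ch" ]; then start=$i; break; fi
        i=$(( i + 1 ))
      done ;;
    '%'|'%%')
      # Search for the match that starts closest to the END of the string.
      i=$(( n - 1 ))
      while (( i >= 0 )); do
        if [ "${s:$i:1}" = "$ch" ]; then start=$i; break; fi
        i=$(( i - 1 ))
      done ;;
  esac

  # No match anywhere: the string is returned unchanged.
  if (( start < 0 )); then printf '%s\n' "$s"; return 0; fi

  case $op in
    '#'|'%') len=1 ;;                # shortest match of X* is just X
    *)       len=$(( n - start )) ;; # longest match runs to the end
  esac

  rest=$(( start + len ))
  printf '%s%s\n' "${s:0:$start}" "${s:$rest}"
}

strip_s '#'  aXbXc X   # prints abXc
strip_s '##' aXbXc X   # prints a
strip_s '%'  aXbXc X   # prints aXbc
strip_s '%%' aXbXc X   # prints aXb
```

The only difference between the '#' and '%' branches is the direction in
which start positions are tried; the choice between shortest and longest
is made afterwards, at the position that was found.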


* Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)
  2019-12-30 18:00                   ` [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't) Daniel Shahaf
@ 2019-12-30 18:11                     ` Roman Perepelitsa
       [not found]                       ` <CAKc7PVAXLpKqZvmbazZK=mvcz8T-AHJXKusut6aEjkkSLzgdbw@mail.gmail.com>
  0 siblings, 1 reply; 22+ messages in thread
From: Roman Perepelitsa @ 2019-12-30 18:11 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list

From the peanut gallery: Thank you! I finally understand what (S) does.

Roman.


* Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)
       [not found]                       ` <CAKc7PVAXLpKqZvmbazZK=mvcz8T-AHJXKusut6aEjkkSLzgdbw@mail.gmail.com>
@ 2019-12-30 20:01                         ` Roman Perepelitsa
  2019-12-30 20:20                           ` Sebastian Gniazdowski
  0 siblings, 1 reply; 22+ messages in thread
From: Roman Perepelitsa @ 2019-12-30 20:01 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list, Daniel Shahaf

On Mon, Dec 30, 2019 at 8:32 PM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> Do you also understand why:
>
> str=aXXXb
> print ${str%%X##}
>
> Outputs aXXb?

I suppose you mean ${(S)str%%X##}. Yes, I understand why this prints aXXb.

> Outputs aXXb? The docs don't cover this.

They do with the latest patch for (S) from Daniel:

> With tt(%) or tt(%%), search for the match that starts closest to the end of
> the string

This means that ${(S)str%%X##} is going to find a match that starts
closest to the end of the string and remove it. X## matches one or
more X characters. We go backwards one character at a time until X##
matches. The first match starts at str[-2] and the match is X, so X
gets removed. This seems clear from the docs.
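
The backward scan described above can be written out as a loop. This is
a sketch under stated assumptions rather than what zsh does internally:
strip_last_run is a hypothetical helper that emulates ${(S)str%%X##} for
a single literal character only, using portable substring expansion.

```shell
# Hypothetical helper (illustration only): emulate ${(S)str%%X##} for a
# single literal character X.
strip_last_run() {
  local s=$1 ch=$2
  local n=${#s}
  local i=$(( n - 1 )) end

  # Try start positions from the end of the string toward the beginning.
  while (( i >= 0 )); do
    if [ "${s:$i:1}" = "$ch" ]; then
      # First position where X## matches. Extend the match to the right
      # as far as it goes; it cannot grow to the left, because positions
      # further left are never reached.
      end=$(( i + 1 ))
      while (( end < n )) && [ "${s:$end:1}" = "$ch" ]; do
        end=$(( end + 1 ))
      done
      printf '%s%s\n' "${s:0:$i}" "${s:$end}"
      return 0
    fi
    i=$(( i - 1 ))
  done

  printf '%s\n' "$s"   # no match: the string is unchanged
}

strip_last_run aXXXb X          # prints aXXb
strip_last_run aXXXXXbXXXXc X   # prints aXXXXXbXXXc
```

Because the scan stops at the first start position that matches, the
character to the right of that position is never another X, which is why
only one X is removed.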

I think it would be beneficial to specify that with ${(S)str%%pattern}
the first attempted match starts at str[-1] and that no attempt is
made to check if the empty string (the ultimate shortest suffix)
matches. I think you or someone else has recently raised this point as
this seems inconsistent. It's surprising to me that both ${str#*} and
${(S)str%*} expand to $str while ${(S)str%%*} doesn't.

Roman.


* Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)
  2019-12-30 20:01                         ` Roman Perepelitsa
@ 2019-12-30 20:20                           ` Sebastian Gniazdowski
  2019-12-30 21:24                             ` ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)) Daniel Shahaf
  2019-12-30 21:40                             ` [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't) Roman Perepelitsa
  0 siblings, 2 replies; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-30 20:20 UTC (permalink / raw)
  To: Roman Perepelitsa; +Cc: Zsh hackers list, Daniel Shahaf

On Mon, 30 Dec 2019 at 21:01, Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
> They do with the latest patch for (S) from Daniel:
>
> > With tt(%) or tt(%%), search for the match that starts closest to the end of
> > the string
>
> This means that ${(S)str%%X##} is going to find a match that starts
> closest to the end of the string and remove it. X## matches one or
> more X characters. We go backwards one character at a time until X##
> matches. The first match starts at str[-2] and the match is X, so X
> gets removed. This seems clear from the docs.

Ok, this does seem to capture the issue.

> I think it would be beneficial to specify that with ${(S)str%%pattern}
> the first attempted match starts at str[-1] and that no attempt is
> made to check if the empty string (the ultimate shortest suffix)
> matches. I think you or someone else has recently raised this point as
> this seems inconsistent. It's surprising to me that both ${str#*} and
> ${(S)str%*} expand to $str while ${(S)str%%*} doesn't.

Also, ${str%*} doesn't expand to $str, which seems to be a bug? Is it
a different manifestation of the one from users/22600?


--
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't))
  2019-12-30 20:20                           ` Sebastian Gniazdowski
@ 2019-12-30 21:24                             ` Daniel Shahaf
  2019-12-30 21:44                               ` Roman Perepelitsa
  2019-12-30 22:34                               ` Peter Stephenson
  2019-12-30 21:40                             ` [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't) Roman Perepelitsa
  1 sibling, 2 replies; 22+ messages in thread
From: Daniel Shahaf @ 2019-12-30 21:24 UTC (permalink / raw)
  To: zsh-workers

Sebastian Gniazdowski wrote on Mon, Dec 30, 2019 at 21:20:34 +0100:
> On Mon, 30 Dec 2019 at 21:01, Roman Perepelitsa
> <roman.perepelitsa@gmail.com> wrote:
> > I think it would be beneficial to specify that with ${(S)str%%pattern}
> > the first attempted match starts at str[-1] and that no attempt is
> > made to check if the empty string (the ultimate shortest suffix)
> > matches. I think you or someone else has recently raised this point as
> > this seems inconsistent. It's surprising to me that both ${str#*} and
> > ${(S)str%*} expand to $str while ${(S)str%%*} doesn't.
> 

Let's see:

    % set -- foo
    % p ${1#*}
    foo
    % p ${1%*}
    foo
    % p ${(S)1#*}
    foo
    % p ${(S)1%*}
    foo
    % p ${1##*}

    % p ${1%%*}

    % p ${(S)1##*}

    % p ${(S)1%%*}
    fo
    %

Isn't this an implementation bug?  It certainly would be if the docs patch
I posted is accepted.  The existing documentation doesn't promise this behaviour
either.

> Also, ${str%*} doesn't expand to $str,

It does for me; see above.

> which seems to be a bug? Is it a different manifestation of the one from
> users/22600?


* Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)
  2019-12-30 20:20                           ` Sebastian Gniazdowski
  2019-12-30 21:24                             ` ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)) Daniel Shahaf
@ 2019-12-30 21:40                             ` Roman Perepelitsa
  1 sibling, 0 replies; 22+ messages in thread
From: Roman Perepelitsa @ 2019-12-30 21:40 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list, Daniel Shahaf

On Mon, Dec 30, 2019 at 9:20 PM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
> Also, ${str%*} doesn't expand to $str, which seems to be a bug?

I think this has been fixed. ${(S)str%%*} is still broken though. (Or
maybe I'm missing something.)

Roman.


* Re: ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't))
  2019-12-30 21:24                             ` ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)) Daniel Shahaf
@ 2019-12-30 21:44                               ` Roman Perepelitsa
  2019-12-30 22:11                                 ` Sebastian Gniazdowski
  2019-12-30 22:34                               ` Peter Stephenson
  1 sibling, 1 reply; 22+ messages in thread
From: Roman Perepelitsa @ 2019-12-30 21:44 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Zsh hackers list

On Mon, Dec 30, 2019 at 10:25 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > Also, ${str%*} doesn't expand to $str,
>
> It does for me; see above.

I believe Sebastian is correct in that ${str%*} used to chop off the
last character from str. It's been fixed at some point. Perhaps
${(S)str%%*} was supposed to get fixed at the same time but somehow
eluded it?

Roman.


* Re: ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't))
  2019-12-30 21:44                               ` Roman Perepelitsa
@ 2019-12-30 22:11                                 ` Sebastian Gniazdowski
  0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Gniazdowski @ 2019-12-30 22:11 UTC (permalink / raw)
  To: Roman Perepelitsa; +Cc: Daniel Shahaf, Zsh hackers list

On Mon, 30 Dec 2019 at 22:46, Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> On Mon, Dec 30, 2019 at 10:25 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > > Also, ${str%*} doesn't expand to $str,
> >
> > It does for me; see above.
>
> I believe Sebastian is correct in that ${str%*} used to chop off the
> last character from str. It's been fixed at some point. Perhaps
> ${(S)str%%*} was supposed to get fixed at the same time but somehow
> eluded it?

Yes, running the latest zsh reveals it has been fixed (I was running a
few commits behind HEAD).


-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't))
  2019-12-30 21:24                             ` ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)) Daniel Shahaf
  2019-12-30 21:44                               ` Roman Perepelitsa
@ 2019-12-30 22:34                               ` Peter Stephenson
  1 sibling, 0 replies; 22+ messages in thread
From: Peter Stephenson @ 2019-12-30 22:34 UTC (permalink / raw)
  To: zsh-workers

On Mon, 2019-12-30 at 21:24 +0000, Daniel Shahaf wrote:
>     % set -- foo
>     % p ${1#*}
>     foo
>     % p ${1%*}
>     foo
>     % p ${(S)1#*}
>     foo
>     % p ${(S)1%*}
>     foo
>     % p ${1##*}
>
>     % p ${1%%*}
>
>     % p ${(S)1##*}
>
>     % p ${(S)1%%*}
>     fo
>     %
> 
> Isn't this an implementation bug?

The last one certainly doesn't look right.

The top-and-tail operators are already complicated without the substring
matching, which was bolted on later without a particularly good set of
ground rules about how the loops looking for the longest or shortest
match and for a given substring interacted in the case of pattern
matches where the match itself can have a variable length.  Furthermore,
as you'll see, a lot of the various cases start / end, longest /
shortest, full / substring are implemented separately (though in return
that makes it a bit easier to fix a problem case without disturbing
others).  So it's actually quite easy for something like this to lie
around for a long time.

pws



end of thread, other threads:[~2019-12-30 22:34 UTC | newest]

Thread overview: 22+ messages
2019-12-18 20:41 [Bug] S-flag imposes non-greedy match where it shouldn't Sebastian Gniazdowski
2019-12-18 20:44 ` Sebastian Gniazdowski
2019-12-19 15:29   ` Daniel Shahaf
2019-12-26 18:35     ` Sebastian Gniazdowski
2019-12-27  4:54       ` Sebastian Gniazdowski
2019-12-27  5:09         ` Sebastian Gniazdowski
2019-12-27  5:29       ` Daniel Shahaf
2019-12-28 19:04         ` Sebastian Gniazdowski
2019-12-28 20:34           ` Bart Schaefer
2019-12-28 21:00           ` Daniel Shahaf
2019-12-29  0:56             ` Sebastian Gniazdowski
2019-12-29  2:05               ` Daniel Shahaf
2019-12-29  3:14                 ` Sebastian Gniazdowski
2019-12-30 18:00                   ` [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't) Daniel Shahaf
2019-12-30 18:11                     ` Roman Perepelitsa
     [not found]                       ` <CAKc7PVAXLpKqZvmbazZK=mvcz8T-AHJXKusut6aEjkkSLzgdbw@mail.gmail.com>
2019-12-30 20:01                         ` Roman Perepelitsa
2019-12-30 20:20                           ` Sebastian Gniazdowski
2019-12-30 21:24                             ` ${(S)%%*} doesn't match the empty string (was: Re: [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't)) Daniel Shahaf
2019-12-30 21:44                               ` Roman Perepelitsa
2019-12-30 22:11                                 ` Sebastian Gniazdowski
2019-12-30 22:34                               ` Peter Stephenson
2019-12-30 21:40                             ` [PATCH] zshexpn: Expand documentation of (S) (was: Re: [Bug] S-flag imposes non-greedy match where it shouldn't) Roman Perepelitsa
