zsh-workers
* #% anchoring doesn't work with (S)
@ 2023-01-30 12:32 Sebastian Gniazdowski
  2023-02-02  8:31 ` Sebastian Gniazdowski
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Gniazdowski @ 2023-01-30 12:32 UTC (permalink / raw)
  To: Zsh hackers list

INPUT=ABC; INPUT=${(S)INPUT//#%((#b)(*))/°match°}; print $match
#no output

# Try to quote #
INPUT=ABC; INPUT=${(S)INPUT//\#\%((#b)(*))/°match°}; print $match
#no output

# Anchor with (#s)/(#e) instead:
INPUT=ABC; INPUT=${(S)INPUT//(#s)((#b)(*))(#e)/°match°}; print $match
#output correct:
ABC

# No S-flag
INPUT=ABC; INPUT=${INPUT//#%((#b)(*))/°match°}; print $match
#output correct
ABC

BTW, what are the rules for how long the (#b) flag (or (#B)) stays active?
I recall it was something about "till the end of the parens".



-- 
Best regards,
Sebastian Gniazdowski


* Re: #% anchoring doesn't work with (S)
  2023-01-30 12:32 #% anchoring doesn't work with (S) Sebastian Gniazdowski
@ 2023-02-02  8:31 ` Sebastian Gniazdowski
  2023-02-02 10:32   ` Mikael Magnusson
  2023-02-02 10:49   ` Peter Stephenson
  0 siblings, 2 replies; 8+ messages in thread
From: Sebastian Gniazdowski @ 2023-02-02  8:31 UTC (permalink / raw)
  To: Zsh hackers list

Could the bug be fixed? It already makes #% pretty much unusable for
backward-compatible software, and even if it were fixed today it would take,
say, 4 years before that changed.


-- 
Best regards,
Sebastian Gniazdowski


* Re: #% anchoring doesn't work with (S)
  2023-02-02  8:31 ` Sebastian Gniazdowski
@ 2023-02-02 10:32   ` Mikael Magnusson
  2023-02-02 10:44     ` Mikael Magnusson
  2023-02-02 12:47     ` Sebastian Gniazdowski
  2023-02-02 10:49   ` Peter Stephenson
  1 sibling, 2 replies; 8+ messages in thread
From: Mikael Magnusson @ 2023-02-02 10:32 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On 2/2/23, Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> Could the bug be fixed? It already makes #% pretty much unusable for a
> backward compatible software, yet in say 4 years this would be changed, if
> the bug would be fixed today

Why would you use (S) (shortest possible match) with #% (match the
entire string)? It will obviously never have a useful effect other
than doing nothing.

That said, compgetmatch() does this, which is probably your problem
(the comment gives no real motivation for why it does this):
    /*
     * Search is anchored to the end of the string if we want to match
     * it all, or if we are matching at the end of the string and not
     * using substrings.
     */
    if ((*flp & SUB_ALL) || ((*flp & SUB_END) && !(*flp & SUB_SUBSTR)))
	patflags &= ~PAT_NOANCH;


-- 
Mikael Magnusson



* Re: #% anchoring doesn't work with (S)
  2023-02-02 10:32   ` Mikael Magnusson
@ 2023-02-02 10:44     ` Mikael Magnusson
  2023-02-02 12:47     ` Sebastian Gniazdowski
  1 sibling, 0 replies; 8+ messages in thread
From: Mikael Magnusson @ 2023-02-02 10:44 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On 2/2/23, Mikael Magnusson <mikachu@gmail.com> wrote:
> On 2/2/23, Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
>> Could the bug be fixed? It already makes #% pretty much unusable for a
>> backward compatible software, yet in say 4 years this would be changed,
>> if
>> the bug would be fixed today
>
> Why would you use (S) (shortest possible match) with #% (match the
> entire string)? It will obviously never have a useful effect other
> than doing nothing.
>
> That said, compgetmatch() does this, which is probably your problem
> (it gives no real motivation for why it does this)
>     /*
>      * Search is anchored to the end of the string if we want to match
>      * it all, or if we are matching at the end of the string and not
>      * using substrings.
>      */
>     if ((*flp & SUB_ALL) || ((*flp & SUB_END) && !(*flp & SUB_SUBSTR)))
> 	patflags &= ~PAT_NOANCH;

Actually this is probably not it, it works if you don't use just a *
as the pattern:

% INPUT=ABCABCABC; INPUT=${(S)INPUT//#%((#b)(A*C))/°match°}; print $INPUT $match
°match° ABCABCABC
% INPUT=ABCABCABC; INPUT=${INPUT//#%((#b)(A*C))/°match°}; print $INPUT $match
°match° ABCABCABC
% INPUT=ABCABCABC; INPUT=${(S)INPUT//((#b)(A*C))/°match°}; print $INPUT $match
°match°°match°°match° ABC

so it feels more like the * itself remembers the S flag but not the #%
flags. (But still, specifying the (S) flag in this case is useless in
the first place, so just don't specify it and your code will be
compatible with every version).

-- 
Mikael Magnusson



* Re: #% anchoring doesn't work with (S)
  2023-02-02  8:31 ` Sebastian Gniazdowski
  2023-02-02 10:32   ` Mikael Magnusson
@ 2023-02-02 10:49   ` Peter Stephenson
  1 sibling, 0 replies; 8+ messages in thread
From: Peter Stephenson @ 2023-02-02 10:49 UTC (permalink / raw)
  To: Zsh hackers list

> On 02/02/2023 08:31 Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> Could the bug be fixed? It already makes #% pretty much unusable for a backward compatible software, yet in say 4 years this would be changed, if the bug would be fixed today
> 
> On Mon, 30 Jan 2023 at 12:32, Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> > INPUT=ABC; INPUT=${(S)INPUT//#%((#b)(*))/°match°}; print $match
> > #no output

It's a confusing combination of options but it looks like it's trying to do a shortest
match as if with a ${param#head} or ${param%tail} and so bailing out early when it's
found a substring.  This obviously doesn't work when we're anchoring at both the
start and the end, so tell it to do a longest match in that case.
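
For reference, the shortest-versus-longest distinction behind ${param#head} and
${param%tail} can be seen with the ordinary # / ## and % / %% forms. This is only
a minimal sketch in plain POSIX-style shell syntax, not the (S) with #% case
itself; the variable names are made up for illustration:

```sh
# Illustrative only: 'sample' and the result names are hypothetical.
sample=ABCABCABC

short_head=${sample#A*C}    # '#'  strips the shortest head matching A*C
long_head=${sample##A*C}    # '##' strips the longest head matching A*C
short_tail=${sample%A*C}    # '%'  strips the shortest tail matching A*C
long_tail=${sample%%A*C}    # '%%' strips the longest tail matching A*C

# One result per line, bracketed so that empty results stay visible.
printf '<%s>\n' "$short_head" "$long_head" "$short_tail" "$long_tail"
```

The single-character forms stop at the first (shortest) match they find, which
is the kind of early bail-out described above; the doubled forms carry on to the
longest match and here consume the whole string.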

We could probably do with some more tests with the #% combination, there aren't all
that many, so other oddities might be sneaking through.

pws

diff --git a/Src/subst.c b/Src/subst.c
index 4ad9fee1a..3dd920e87 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -2926,6 +2926,9 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
 	 */
 	if (!(flags & (SUB_MATCH|SUB_REST|SUB_BIND|SUB_EIND|SUB_LEN)))
 	    flags |= SUB_REST;
+	/* If matching at start and end, don't stop early */
+	if ((flags & (SUB_START|SUB_END)) == (SUB_START|SUB_END))
+	    flags |= SUB_LONG;
 
 	/*
 	 * With ":" treat a value as unset if the variable is set but
diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index a11652d1e..7990c2958 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -2307,6 +2307,13 @@ F:behavior, see http://austingroupbugs.net/view.php?id=888
 >x
 >y
 
+  a="string"
+  print ${(S)a//#%((#b)(*))/different}
+  print $match[1]
+0:Fully anchored string must be fully searched
+>different
+>string
+
   my_width=6
   my_index=1
   my_options=Option1



* Re: #% anchoring doesn't work with (S)
  2023-02-02 10:32   ` Mikael Magnusson
  2023-02-02 10:44     ` Mikael Magnusson
@ 2023-02-02 12:47     ` Sebastian Gniazdowski
  2023-02-06 17:17       ` Bart Schaefer
  1 sibling, 1 reply; 8+ messages in thread
From: Sebastian Gniazdowski @ 2023-02-02 12:47 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: Zsh hackers list

It does make sense to match shortest and anchor with #%: it simply moves the
"weight" to the right side of the pattern, i.e. "ABC" == (?)* vs "ABC" ==
*(?) (not strictly correct, but it shows the idea).


-- 
Best regards,
Sebastian Gniazdowski


* Re: #% anchoring doesn't work with (S)
  2023-02-02 12:47     ` Sebastian Gniazdowski
@ 2023-02-06 17:17       ` Bart Schaefer
  2023-02-06 17:31         ` Peter Stephenson
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Schaefer @ 2023-02-06 17:17 UTC (permalink / raw)
  To: Zsh hackers list

On Thu, Feb 2, 2023 at 4:49 AM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
> On Thu, 2 Feb 2023 at 10:32, Mikael Magnusson <mikachu@gmail.com> wrote:
>>
>> Why would you use (S) (shortest possible match) with #% (match the
>> entire string)? It will obviously never have a useful effect other
>> than doing nothing.
>
> It does have sense to match shortest and anchor to #%, it simply moves the "weight" on the right side of the pattern, i.e. "ABC" == (?)* vs ABC == *(?) (not very correct, but shows the thought).

Arguably then this is wrong:

% sample=match
% : ${(S)sample/(#b)(#s)(m*)(*)(#e)}; printf "<%s>" $match ; echo
<match>

I expected <m><atch>.  Compare without the end anchor:

% : ${(S)sample/(#b)(#s)(m*)(*)}; printf "<%s>" $match ; echo
<m>

In any case the behavior of PWS's patch appears to be the same.

Also

% : ${(S)sample/(#b)(#s)(m*)(*h)}; printf "<%s>" $match ; echo
<matc><h>

I guess it's ambiguous, but in other patterns like (this|that) we
prefer the left before the right.



* Re: #% anchoring doesn't work with (S)
  2023-02-06 17:17       ` Bart Schaefer
@ 2023-02-06 17:31         ` Peter Stephenson
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Stephenson @ 2023-02-06 17:31 UTC (permalink / raw)
  To: Zsh hackers list

> On 06/02/2023 17:17 Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Thu, Feb 2, 2023 at 4:49 AM Sebastian Gniazdowski
> <sgniazdowski@gmail.com> wrote:
> > On Thu, 2 Feb 2023 at 10:32, Mikael Magnusson <mikachu@gmail.com> wrote:
> >>
> >> Why would you use (S) (shortest possible match) with #% (match the
> >> entire string)? It will obviously never have a useful effect other
> >> than doing nothing.
> >
> > It does have sense to match shortest and anchor to #%, it simply moves the "weight" on the right side of the pattern, i.e. "ABC" == (?)* vs ABC == *(?) (not very correct, but shows the thought).
> 
> Arguably then this is wrong:
> 
> % sample=match
> % : ${(S)sample/(#b)(#s)(m*)(*)(#e)}; printf "<%s>" $match ; echo
> <match>
> 
> I expected <m><atch>.  Compare without the end anchor:
> 
> % : ${(S)sample/(#b)(#s)(m*)(*)}; printf "<%s>" $match ; echo
> <m>
> 
> In any case the behavior of PWS's patch appears to be the same.

I haven't followed this through in detail, but I believe we just don't have
enough state in the matcher to deal with longest vs. shortest substring
at the same time as everything else.

pws



Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/
