zsh-users
* Non-greedy matching (S-flag) behaving weird
From: Sebastian Gniazdowski @ 2018-06-08  6:48 UTC
  To: Zsh Users

Hello,
The substitution below is a really easy one. First parenthesis: anything
preceding; second parenthesis: print|END|BEGIN; third parenthesis:
anything that follows print|END etc.

~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"

~ __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";

~ echo "__wrd2: $__wrd2, match[1]: ${match[1]}, match[2]: ${match[2]},
match[3]: ${match[3]}"; echo $?

__wrd2:  'Finished' }', match[1]: echo abc | awk '{ print $1 } END { ,
match[2]: print, match[3]:  'Finished' }'

As can be seen, match[1] captures almost the whole string. The matching
is non-greedy, so why isn't the first `print' matched? Why does matching
continue past the first "print" and past "END" to the last keyword?
-- 
Best regards,
Sebastian Gniazdowski



* Re: Non-greedy matching (S-flag) behaving weird
From: Peter Stephenson @ 2018-06-08  8:15 UTC
  To: Zsh Users

On Fri, 8 Jun 2018 08:48:05 +0200
Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> below substitution is a really easy one. First parenthesis: anything
> preceding, second parenthesis: print|END|BEGIN, third parenthesis:
> anything that follows print|END etc.
> 
> ~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"
> 
> ~
> __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";
> 
> ~ echo "__wrd2: $__wrd2, match[1]: ${match[1]}, match[2]: ${match[2]},
> match[3]: ${match[3]}"; echo $?
> 
> __wrd2:  'Finished' }', match[1]: echo abc | awk '{ print $1 } END { ,
> match[2]: print, match[3]:  'Finished' }'
> 
> As it can be seen, match[1] obtains almost whole string. The matching
> is ungreedy, why `print' isn't matched? Why matching continues to last
> keyword, "END", skipping "print"

You've got a "*" at the beginning and at the end.  They're both doing
matching: there is no single "matching" to which a rule applies,
there are just separate patterns all attempting to match.  You're going
to have to work out some way of forcing one of them to match more than
the other.

pws



* Re: Non-greedy matching (S-flag) behaving weird
From: Sebastian Gniazdowski @ 2018-06-08 12:42 UTC
  To: Peter Stephenson; +Cc: Zsh Users

On 8 June 2018 at 10:15, Peter Stephenson <p.stephenson@samsung.com> wrote:
> You've got a "*" at the beginning and at the end.  They're both doing
> matching: there is no single "matching" to which a rule applies,
> there are just separate patterns all attempting to match.  You're going
> to have to work out some way of forcing one of them to match more than
> the other.

You are apparently right, but it is a big surprise to me. A * that
matches past what (a|b) should match, on the string xxxaxxxb? Well, this
test works as I would expect:

~ buf='xxxaxxxbxxx'; print "${(S)buf/(#b)(*)(a|b)(*)/R}"
Rxxxbxxx

With greedy search (no (S)-flag):

~ buf='xxxaxxxbxxx'; print "${buf/(#b)(*)(a|b)(*)/R}"
R

However, I also tested vim, entering text:

abcd BEGIN efgh END ijkl

And then searching with the regex: .\{-}\(BEGIN\|END\).\{-}

\{-} is a non-greedy match, yet this matched up to END, not up to BEGIN.
Very weird.

-- 
Best regards,
Sebastian Gniazdowski



* Re: Non-greedy matching (S-flag) behaving weird
From: Sebastian Gniazdowski @ 2018-06-08 12:48 UTC
  To: Peter Stephenson; +Cc: Zsh Users

Turns out I was just confused. Vim finds two adjacent matches, which is
why it looked as if the match ran up to END. Adding the /e offset to the
search showed that the first match ends at BEGIN.

So those are two examples showing how non-greedy matching should behave.





* Re: Non-greedy matching (S-flag) behaving weird
From: Peter Stephenson @ 2018-06-08 13:30 UTC
  To: Zsh Users

On Fri, 8 Jun 2018 08:48:05 +0200
Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> below substitution is a really easy one. First parenthesis: anything
> preceding, second parenthesis: print|END|BEGIN, third parenthesis:
> anything that follows print|END etc.
> 
> ~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"
> 
> ~
> __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";

I think what you're trying to do is:

print ${__wrd2#*(BEGIN|END|print)}

(or some variation thereon).

This is now well defined: there's only one expression matching
an arbitrary number of characters, and with # you've explicitly asked
for the shortest match at the head of the string to resolve multiple
matches.

You are onto a loser with multiple *'s with the greedy match rule
relaxed; it's poorly defined, so the fact it's not doing what you
expect isn't saying anything.  (That's why the greedy match rule
is there in the first place.)  But you'll be better off consulting
a more authoritative source than me if you want more, so I'll
sign off now.

pws



* Re: Non-greedy matching (S-flag) behaving weird
From: Peter Stephenson @ 2018-06-08 13:54 UTC
  To: Zsh Users

On Fri, 8 Jun 2018 14:30:12 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> On Fri, 8 Jun 2018 08:48:05 +0200
> Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> >
> > ~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"
> > 
> > ~
> > __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";  
>
> You are onto a loser with multiple *'s with the greedy match rule
> relaxed; it's poorly defined, so the fact it's not doing what you
> expect isn't saying anything.  (That's why the greedy match rule
> is there in the first place.)  But you'll be better off consulting
> a more authoritative source than me if you want more, so I'll
> sign off now.

I should probably point out, though, that the (S) flag in any
case only guarantees to make the match for the *whole* left part of
the /.../... expression as short as possible.  Given you're forcing it
to match the entire string in any case, it has no effect.
It is not documented to match *individual parts* of the match
expression in any particular way.  So actually my remarks on non-greedy
matching aren't relevant.  Sorry I didn't check the doc earlier
instead of spreading unnecessary confusion.

pws

