* Non-greedy matching (S-flag) behaving weird
From: Sebastian Gniazdowski @ 2018-06-08 6:48 UTC
To: Zsh Users

Hello,
the substitution below is a really easy one. First parenthesis: anything preceding; second parenthesis: print|END|BEGIN; third parenthesis: anything that follows print|END etc.

~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"
~ __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";
~ echo "__wrd2: $__wrd2, match[1]: ${match[1]}, match[2]: ${match[2]}, match[3]: ${match[3]}"; echo $?
__wrd2: 'Finished' }', match[1]: echo abc | awk '{ print $1 } END { , match[2]: print, match[3]: 'Finished' }'

As can be seen, match[1] obtains almost the whole string. The matching is non-greedy, so why isn't `print' matched? Why does matching continue to the last keyword, "END", skipping "print"?

--
Best regards,
Sebastian Gniazdowski
* Re: Non-greedy matching (S-flag) behaving weird
From: Peter Stephenson @ 2018-06-08 8:15 UTC
To: Zsh Users

On Fri, 8 Jun 2018 08:48:05 +0200
Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> below substitution is a really easy one. First parenthesis: anything
> preceding, second parenthesis: print|END|BEGIN, third parenthesis:
> anything that follows print|END etc.
>
> ~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"
>
> ~ __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";
>
> ~ echo "__wrd2: $__wrd2, match[1]: ${match[1]}, match[2]: ${match[2]},
> match[3]: ${match[3]}"; echo $?
>
> __wrd2: 'Finished' }', match[1]: echo abc | awk '{ print $1 } END { ,
> match[2]: print, match[3]: 'Finished' }'
>
> As it can be seen, match[1] obtains almost whole string. The matching
> is ungreedy, why `print' isn't matched? Why matching continues to last
> keyword, "END", skipping "print"

You've got a "*" at the beginning and at the end. They're both doing matching: there is no single "matching" to which a rule applies, there are just separate patterns all attempting to match. You're going to have to work out some way of forcing one of them to match more than the other.

pws
* Re: Non-greedy matching (S-flag) behaving weird
From: Sebastian Gniazdowski @ 2018-06-08 12:42 UTC
To: Peter Stephenson; +Cc: Zsh Users

On 8 June 2018 at 10:15, Peter Stephenson <p.stephenson@samsung.com> wrote:
> You've got a "*" at the beginning and the end. They're both doing
> matching --- there is no single "matching" to which a rule applies,
> there are just separate patterns all attempting to match. You're going
> to have to work out some way of forcing one of them to match more than
> the other.

You are apparently right, but it is a big surprise to me. A * matching over what (a|b) should match, on the string xxxaxxxb?? Well, this test works like I would expect:

~ buf='xxxaxxxbxxx'; print "${(S)buf/(#b)(*)(a|b)(*)/R}"
Rxxxbxxx

With greedy search (no (S)-flag):

~ buf='xxxaxxxbxxx'; print "${buf/(#b)(*)(a|b)(*)/R}"
R

However, I also tested vim, entering the text:

abcd BEGIN efgh END ijkl

And then running a match with the regex: .\{-}\(BEGIN\|END\).\{-}

\{-} is the non-greedy match. Yet this matched up to END, not up to BEGIN. Very weird.

--
Best regards,
Sebastian Gniazdowski
* Re: Non-greedy matching (S-flag) behaving weird
From: Sebastian Gniazdowski @ 2018-06-08 12:48 UTC
To: Peter Stephenson; +Cc: Zsh Users

Turns out I was just confused. Vim finds 2 adjacent matches, which is why it looked as if the match ran up to END. Adding the /e switch to the search regex showed that the first match ends at BEGIN. So those are 2 examples showing how non-greedy matching should behave.

On 8 June 2018 at 14:42, Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> On 8 June 2018 at 10:15, Peter Stephenson <p.stephenson@samsung.com> wrote:
>> You've got a "*" at the beginning and the end. They're both doing
>> matching --- there is no single "matching" to which a rule applies,
>> there are just separate patterns all attempting to match. You're going
>> to have to work out some way of forcing one of them to match more than
>> the other.
>
> You are apparently right, but it is a big surprise to me. * matching
> over what (a|b) should match, on string xxxaxxxb?? Well, this test
> works like I would expect:
>
> ~ buf='xxxaxxxbxxx'; print "${(S)buf/(#b)(*)(a|b)(*)/R}"
> Rxxxbxxx
>
> With greedy search (no (S)-flag):
>
> ~ buf='xxxaxxxbxxx'; print "${buf/(#b)(*)(a|b)(*)/R}"
> R
>
> However, I also tested vim, entering text:
>
> abcd BEGIN efgh END ijkl
>
> And then running matching with regex: .\{-}\(BEGIN\|END\).\{-}
>
> \{-} is non-greedy match. YET, this matched till END, not till BEGIN.
> Very weird.
>
> --
> Best regards,
> Sebastian Gniazdowski
* Re: Non-greedy matching (S-flag) behaving weird
From: Peter Stephenson @ 2018-06-08 13:30 UTC
To: Zsh Users

On Fri, 8 Jun 2018 08:48:05 +0200
Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> below substitution is a really easy one. First parenthesis: anything
> preceding, second parenthesis: print|END|BEGIN, third parenthesis:
> anything that follows print|END etc.
>
> ~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"
>
> ~ __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";

I think what you're trying to do is:

print ${__wrd2#*(BEGIN|END|print)}

(or some variation thereon). This is now well defined: there's only one expression matching an arbitrary number of characters, and you've explicitly told it to shorten from the end of the string to resolve multiple matches.

You are onto a loser with multiple *'s with the greedy match rule relaxed; it's poorly defined, so the fact it's not doing what you expect isn't saying anything. (That's why the greedy match rule is there in the first place.) But you'll be better off consulting a more authoritative source than me if you want more, so I'll sign off now.

pws
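The difference between shortest-prefix and longest-prefix removal that the suggestion above relies on can be sketched as follows. This is only an illustration, not part of the original message: a single keyword stands in for the (BEGIN|END|print) alternation so that the snippet should behave the same in zsh and in other POSIX-style shells, and the variable names s, shortest and longest are hypothetical.

```shell
# Sketch only: one keyword replaces the (BEGIN|END|print) alternation,
# so no zsh-specific grouping syntax is needed here.
s="echo abc | awk '{ print \$1 } END { print 'Finished' }'"

# '#' strips the shortest prefix matching *print,
# i.e. everything up to and including the first "print".
shortest="${s#*print}"

# '##' strips the longest prefix matching *print,
# i.e. everything up to and including the last "print".
longest="${s##*print}"

printf 'shortest: [%s]\n' "$shortest"
printf 'longest:  [%s]\n' "$longest"
```

With a single * and an explicit choice between # and ##, there is only one way for the pattern to resolve, which is the property the reply describes as "well defined".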
* Re: Non-greedy matching (S-flag) behaving weird
From: Peter Stephenson @ 2018-06-08 13:54 UTC
To: Zsh Users

On Fri, 8 Jun 2018 14:30:12 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> On Fri, 8 Jun 2018 08:48:05 +0200
> Sebastian Gniazdowski <sgniazdowski@gmail.com> wrote:
> >
> > ~ __wrd2="echo abc | awk '{ print \$1 } END { print 'Finished' }'"
> >
> > ~ __wrd2="${(S)__wrd2/(#b)(#s)(*)(BEGIN|END|print)(*)(#e)/${match[3]}}";
>
> You are onto a loser with multiple *'s with the greedy match rule
> relaxed; it's poorly defined, so the fact it's not doing what you
> expect isn't saying anything. (That's why the greedy match rule
> is there in the first place.) But you'll be better off consulting
> a more authoritative source than me if you want more, so I'll
> sign off now.

I should probably point out, though, that the (S) flag in any case only guarantees to make the match for the *whole* left part of the /.../... expression as short as possible. Given that you're forcing it to match the entire string in any case, it has no effect. It is not documented to match *individual parts* of the match expression in any particular way.

So actually my remarks on non-greedy matching aren't relevant. Sorry I didn't check the doc earlier instead of spreading unnecessary confusion.

pws