* [bug] shwordsplit not working on $@ when $# > 1 @ 2016-08-08 11:16 Stephane Chazelas 2016-08-08 17:28 ` Bart Schaefer 2016-08-08 18:27 ` Peter Stephenson 0 siblings, 2 replies; 12+ messages in thread From: Stephane Chazelas @ 2016-08-08 11:16 UTC (permalink / raw) To: zsh-workers Bug found by BinaryZebra (https://unix.stackexchange.com/questions/301277/zsh-fail-to-keep-unquoted-and-equal) It looks like the elements of $@ only undergo splitting when $# is 1: ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b a b ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b c a:b c Also reproduced with current git head. (note that it affects shwordsplit, but not globsubst) -- Stephane ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-08 11:16 [bug] shwordsplit not working on $@ when $# > 1 Stephane Chazelas @ 2016-08-08 17:28 ` Bart Schaefer 2016-08-08 18:02 ` Jérémie Roquet 2016-08-08 18:27 ` Peter Stephenson 1 sibling, 1 reply; 12+ messages in thread From: Bart Schaefer @ 2016-08-08 17:28 UTC (permalink / raw) To: zsh-workers On Aug 8, 12:16pm, Stephane Chazelas wrote: } } ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b } a b } ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b c } a:b c Very odd. Works correctly in zsh-3.0.8, does not in zsh 4.3.17 (I don't have anything in between with which to test). ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-08 17:28 ` Bart Schaefer @ 2016-08-08 18:02 ` Jérémie Roquet 0 siblings, 0 replies; 12+ messages in thread From: Jérémie Roquet @ 2016-08-08 18:02 UTC (permalink / raw) To: Bart Schaefer; +Cc: Zsh Hackers' List 2016-08-08 19:28 GMT+02:00 Bart Schaefer <schaefer@brasslantern.com>: > On Aug 8, 12:16pm, Stephane Chazelas wrote: > } > } ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b > } a b > } ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b c > } a:b c > > Very odd. Works correctly in zsh-3.0.8, does not in zsh 4.3.17 (I don't > have anything in between with which to test). The behavior changed in 2011 with this changeset: https://sourceforge.net/p/zsh/code/ci/6699851bcb535f3b2eb707a91cdc1a27f65a05a2 Best regards, -- Jérémie ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-08 11:16 [bug] shwordsplit not working on $@ when $# > 1 Stephane Chazelas 2016-08-08 17:28 ` Bart Schaefer @ 2016-08-08 18:27 ` Peter Stephenson 2016-08-09 1:21 ` Bart Schaefer 1 sibling, 1 reply; 12+ messages in thread From: Peter Stephenson @ 2016-08-08 18:27 UTC (permalink / raw) To: zsh-workers On Mon, 8 Aug 2016 12:16:26 +0100 Stephaneo Chazelas <stephane.chazelas@gmail.com> wrote: > It looks like the elements of $@ only undergo splitting when $# > is 1: > > ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b > a b > ~$ zsh -o shwordsplit -c 'IFS=:; echo $@' zsh a:b c > a:b c The variable isarr from the value entry v->isarr is negative, so we don't go into the loop that does joining and splitting as that explicitly asks for (isarr >= 0). (It was 0 for a scalar, hence the first case above.) Setting this negative is apparently deliberate: /* "$foo[@]"-style substitution * Only sign bit is significant */ #define SCANPM_ISVAR_AT ((int)(((unsigned int)-1)<<15)) This seems to conflict with the use of isarr in paramsubst(); that's documented but I think only because I tried to decode the code. I don't know what this SCANPM_ISVAR_AT is doing. Perhaps it needs converting to something else at some point? pws ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-08 18:27 ` Peter Stephenson @ 2016-08-09 1:21 ` Bart Schaefer 2016-08-09 8:40 ` Peter Stephenson 2016-08-10 12:56 ` Peter Stephenson 0 siblings, 2 replies; 12+ messages in thread From: Bart Schaefer @ 2016-08-09 1:21 UTC (permalink / raw) To: zsh-workers On Aug 8, 7:27pm, Peter Stephenson wrote: } Subject: Re: [bug] shwordsplit not working on $@ when $# > 1 } } The variable isarr from the value entry v->isarr is negative, so we } don't go into the loop that does joining and splitting as that } explicitly asks for (isarr >= 0). (It was 0 for a scalar, hence the } first case above.) It's not really a loop; it's just the "if" block at subst.c:3457. This block first joins all the array elements on the first character of $IFS, and then splits the resulting string on spaces. The problem is that in this case we want split but in the case of users/15439 we do NOT want join. The patch called out by Jeremie prevents the joining, but does so by bypassing the splitting; it first sets nojoin nonzero by examining IFS, and then sets isarr = -1 when nojoin is true (i.e., the negative isarr you saw is not coming from v->isarr, so the value of SCANPM_ISVAR_AT is a red herring). The underlying assumption that it works to first join on $IFS and then split again to get the array leads to the error, because when $IFS is empty there won't be anything to split on, but when there is no join the new array must be built by splitting every element of the value array (and there is currently no code that does that). I *think* we can untangle this as follows, but then again I thought I had untangled in in workers/29313, too. This relies on the idea that if we already have an array when nojoin, then we're not going to split it again, which seems dubious somehow if there is explicit use of the (s:-:) flag. It may be that we really do need to write a loop over the existing elements when isarr is true there. Existing tests still pass, but then they always did, this needs a new one. Holding off until we think of other edge cases. diff --git a/Src/subst.c b/Src/subst.c index e3af156..6dd4608 100644 --- a/Src/subst.c +++ b/Src/subst.c @@ -3454,13 +3454,13 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, * exception is that ${name:-word} and ${name:+word} will have already * done any requested splitting of the word value with quoting preserved. */ - if (ssub || (spbreak && isarr >= 0) || spsep || sep) { - if (isarr) { + if (ssub || spbreak || spsep || sep) { + if (isarr && !nojoin) { val = sepjoin(aval, sep, 1); isarr = 0; ms_flags = 0; } - if (!ssub && (spbreak || spsep)) { + if (!ssub && (spbreak || spsep) && !isarr) { aval = sepsplit(val, spsep, 0, 1); if (!aval || !aval[0]) val = dupstring(""); @@ -3527,7 +3527,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, } /* * TODO: It would be really quite nice to abstract the - * isarr and !issarr code into a function which gets + * isarr and !isarr code into a function which gets * passed a pointer to a function with the effect of * the promptexpand bit. Then we could use this for * a lot of stuff and bury val/aval/isarr inside a structure ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-09 1:21 ` Bart Schaefer @ 2016-08-09 8:40 ` Peter Stephenson 2016-08-10 17:28 ` Bart Schaefer 2016-08-10 12:56 ` Peter Stephenson 1 sibling, 1 reply; 12+ messages in thread From: Peter Stephenson @ 2016-08-09 8:40 UTC (permalink / raw) To: zsh-workers On Mon, 08 Aug 2016 18:21:24 -0700 Bart Schaefer <schaefer@brasslantern.com> wrote: > I *think* we can untangle this as follows, but then again I thought > I had untangled in in workers/29313, too. This relies on the idea > that if we already have an array when nojoin, then we're not going > to split it again, which seems dubious somehow if there is explicit > use of the (s:-:) flag. That was my first worry, looking at the change, but simple examples of this still work, so I can't see an obvious case where this isn't better than before. > Existing tests still pass, but then they always did, this needs a > new one. Holding off until we think of other edge cases. I suppose adding tests is key to this. There are already a number involving (s...), however. Perhaps we need some more with both joining and splitting on various types of object (and then work out what the result actually should be...) pws ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-09 8:40 ` Peter Stephenson @ 2016-08-10 17:28 ` Bart Schaefer 2016-08-11 10:05 ` Peter Stephenson 0 siblings, 1 reply; 12+ messages in thread From: Bart Schaefer @ 2016-08-10 17:28 UTC (permalink / raw) To: zsh-workers On Aug 9, 9:40am, Peter Stephenson wrote: } } On Mon, 08 Aug 2016 18:21:24 -0700 } Bart Schaefer <schaefer@brasslantern.com> wrote: } > I *think* we can untangle this as follows, but then again I thought } > I had untangled in in workers/29313, too. This relies on the idea } > that if we already have an array when nojoin, then we're not going } > to split it again, which seems dubious somehow if there is explicit } > use of the (s:-:) flag. } } That was my first worry, looking at the change, but simple examples of } this still work, so I can't see an obvious case where this isn't better } than before. It's better than before in that the two specific shwordsplit issues are addressed. However, I'm not sure that either of these cases is correct: torch% x=(a:b "c d" ef) torch% print -l ${(s.:.)x} a b c d ef torch% print -l ${(@s.:.)x} a:b c d ef The first one agrees with zsh-3.0.8, so I guess it's no more wrong than it ever was. The second one used to work exactly like the first one, and is the one that worries me the most. I can fix that by testing (nojoin % 2) instead of (!nojoin) but that seems like piling a hack on top of a hack. Then there's this weird edge case, where an empty $IFS acts like you have specified the (@) flag when shwordsplit is set: torch% IFS= torch% setopt shwordsplit torch% print -l ${(s.:.)x} a:b c d ef ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-10 17:28 ` Bart Schaefer @ 2016-08-11 10:05 ` Peter Stephenson 2016-08-11 17:28 ` Bart Schaefer 0 siblings, 1 reply; 12+ messages in thread From: Peter Stephenson @ 2016-08-11 10:05 UTC (permalink / raw) To: zsh-workers On Wed, 10 Aug 2016 10:28:36 -0700 Bart Schaefer <schaefer@brasslantern.com> wrote: > torch% x=(a:b "c d" ef) > torch% print -l ${(s.:.)x} > a > b c d ef That's correct, that's forced joining, which is documented. As I said in my previous email, it's not clear it's actually a lot of use. > torch% print -l ${(@s.:.)x} > a:b > c d > ef > > The second one used to work exactly like the first one, > and is the one that worries me the most. I think it should only cause a visible effect in double quotes as that's its real point --- though I wouldn't be surprised if there were already exceptions. It's hard to see how it could be interpreted to mean ignore the (s.:.), even if there are double quotes. You might also hope that logically this would have the same effect as ${(s.:.)@} when the contents of $@ were the same as the contents of $x, regardless of context (i.e. whether or not in double quotes). Possibly that conflicts with the principle that it has no effect outside double quotes, but I can't think of a case. To be clear: it is not a conflict that SHWORDSPLIT behaviour and (s...) behaviour differ from one another, e.g. with respect to forced joinging, only if expressions involving the same modifications to ${(@)x} and $@ differ when the contents of the arrays and the contexts are the same. Note the documented oddity of the behaviour of (s...) in double quotes when (@) does *not* also appear. But as far as I know the combination of the two has always behaved rationally by zsh standards. > Then there's this weird edge case, where an empty $IFS acts like you > have specified the (@) flag when shwordsplit is set: > > torch% IFS= > torch% setopt shwordsplit > torch% print -l ${(s.:.)x} > a:b > c d > ef Hmm... I would guess that what's happened is without an IFS forced joining with a default separator fails, and because it didn't get joined we refuse to split it (I think there was a sort of vague assumption at one time that it only made sense to split a scalar into an array, rather than an array into multiple arrays, though there are obviously exceptions so this isn't much use as a rule). That's probably a bug --- I would think the most logical answer here is it should have been joined with no separator and then split, but I doubt this has ever been thought about before. I wouldn't expect SHWORDSPLIT to make a difference if forced splitting is in use, but that's another cavalier statement. pws ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-11 10:05 ` Peter Stephenson @ 2016-08-11 17:28 ` Bart Schaefer 2016-08-12 7:50 ` Bart Schaefer 0 siblings, 1 reply; 12+ messages in thread From: Bart Schaefer @ 2016-08-11 17:28 UTC (permalink / raw) To: zsh-workers On Aug 11, 11:05am, Peter Stephenson wrote: } Subject: Re: [bug] shwordsplit not working on $@ when $# > 1 } } On Wed, 10 Aug 2016 10:28:36 -0700 } Bart Schaefer <schaefer@brasslantern.com> wrote: } > torch% print -l ${(@s.:.)x} } > a:b } > c d } > ef } } I think it should only cause a visible effect in double quotes as that's } its real point --- though I wouldn't be surprised if there were already } exceptions. It's hard to see how it could be interpreted to mean ignore } the (s.:.), even if there are double quotes. I think this is fixed by workers/39019, but I did not add any tests that use double-quoting. Should there be some? } To be clear: it is not a conflict that SHWORDSPLIT behaviour and } (s...) behaviour differ from one another, e.g. with respect to forced } [joining], only if expressions involving the same modifications to } ${(@)x} and $@ differ when the contents of the arrays and the contexts } are the same. Agreed, but I'm still not sure the test suite covers that ... } > Then there's this weird edge case, where an empty $IFS acts like you } > have specified the (@) flag when shwordsplit is set: } > } > torch% IFS= } > torch% setopt shwordsplit } > torch% print -l ${(s.:.)x} } > a:b } > c d } > ef } } Hmm... I would guess that what's happened is without an IFS forced } joining with a default separator fails, and because it didn't get joined } we refuse to split it To clarify, the above edge case was introduced by the (never-committed) patch in 39009, and should have been avoided by the patch in 39019. } I would think the most logical answer here is it should have been joined } with no separator and then split, but I doubt this has ever been thought } about before. This is still broken after 39019 (sigh) because IFS= plus shwordsplit implies that joining should not happen in other circumstances, and we only split after joining, so that unintentionally bleeds into this. I thought I'd got it worked out and that the tests I added covered it, but trying this specific example again still fails. Back to prodding at it, I guess. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-11 17:28 ` Bart Schaefer @ 2016-08-12 7:50 ` Bart Schaefer 0 siblings, 0 replies; 12+ messages in thread From: Bart Schaefer @ 2016-08-12 7:50 UTC (permalink / raw) To: zsh-workers On Aug 11, 10:28am, Bart Schaefer wrote: } } This is still broken after 39019 (sigh) because IFS= plus shwordsplit } implies that joining should not happen in other circumstances, and we } only split after joining, so that unintentionally bleeds into this. I } thought I'd got it worked out and that the tests I added covered it, } but trying this specific example again still fails. Back to prodding } at it, I guess. OK, this finally seems to both test and solve all the weird combos. Note the only difference from nojoin == 0 to nojoin == 1 in the case of force_split is that ms_flags is not cleared. I really don't follow what that's doing. diff --git a/Src/subst.c b/Src/subst.c index ae3e4c4..99e1650 100644 --- a/Src/subst.c +++ b/Src/subst.c @@ -3461,11 +3461,12 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, val = sepjoin(aval, sep, 1); isarr = 0; ms_flags = 0; - } else if (force_split && nojoin == 2) { + } else if (force_split && (spsep || nojoin == 2)) { /* Hack to simulate splitting individual elements: - * first join on what we later use to split + * forced joining as previously determined, or + * join on what we later use to forcibly split */ - val = sepjoin(aval, spsep, 1); + val = sepjoin(aval, (nojoin == 1 ? sep : spsep), 1); isarr = 0; } } diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst index 35630c5..460a841 100644 --- a/Test/D04parameter.ztst +++ b/Test/D04parameter.ztst @@ -1970,8 +1970,14 @@ set -- one:two bucklemy:shoe IFS= setopt shwordsplit - print -l ${@} + print -l ${@} ${(s.:.)*} ${(s.:.j.-.)*} ) -0:Joining of $@ does not happen when IFS is empty +0:Joining of $@ does not happen when IFS is empty, but splitting $* does >one:two >bucklemy:shoe +>one +>twobucklemy +>shoe +>one +>two-bucklemy +>shoe ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-09 1:21 ` Bart Schaefer 2016-08-09 8:40 ` Peter Stephenson @ 2016-08-10 12:56 ` Peter Stephenson 2016-08-11 0:39 ` Bart Schaefer 1 sibling, 1 reply; 12+ messages in thread From: Peter Stephenson @ 2016-08-10 12:56 UTC (permalink / raw) To: zsh-workers > Existing tests still pass, but then they always did, this needs a > new one. Holding off until we think of other edge cases. This tests the significant cases with $@. There's already quite a lot with splitting, joining, IFS and SH_WORD_SPLIT in that file, so I'm not sure how much more there is to do. Not quite sure what forced joining gets us apart from extra complexity but too late to worry now... pws diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst index 7cb297e..6767eb2 100644 --- a/Test/D04parameter.ztst +++ b/Test/D04parameter.ztst @@ -1920,3 +1920,38 @@ print $array 0:"-" works after "[" in same expression (Dash problem) >foo two three + + ( + setopt shwordsplit + IFS=: + set -- whim:wham:whom + print -l $@ + ) +0:Splitting of $@ on IFS: single element +>whim +>wham +>whom + + ( + setopt shwordsplit + IFS=: + set -- one:two bucklemy:shoe + print -l $@ + ) +0:Splitting of $@ on IFS: multiple elements +# No forced joining in this case +>one +>two +>bucklemy +>shoe + + ( + setopt shwordsplit + set -- one:two bucklemy:shoe + print -l ${(s.:.)@} + ) +0:Splitting of $@ on (s): multiple elements +# Forced joining in this case +>one +>two bucklemy +>shoe ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [bug] shwordsplit not working on $@ when $# > 1 2016-08-10 12:56 ` Peter Stephenson @ 2016-08-11 0:39 ` Bart Schaefer 0 siblings, 0 replies; 12+ messages in thread From: Bart Schaefer @ 2016-08-11 0:39 UTC (permalink / raw) To: zsh-workers On Aug 10, 1:56pm, Peter Stephenson wrote: } Subject: Re: [bug] shwordsplit not working on $@ when $# > 1 } } > Existing tests still pass, but then they always did, this needs a } > new one. Holding off until we think of other edge cases. } } This tests the significant cases with $@. OK, this replaces both workers/39009 and workers/39013 -- to the latter it adds three more tests based on workers/39018. I also re-ordered the statements in the 39013 tests to make it obvious that IFS doesn't have an effect on the "set -- ..." statements. diff --git a/Src/subst.c b/Src/subst.c index e3af156..ae3e4c4 100644 --- a/Src/subst.c +++ b/Src/subst.c @@ -3454,13 +3454,22 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, * exception is that ${name:-word} and ${name:+word} will have already * done any requested splitting of the word value with quoting preserved. */ - if (ssub || (spbreak && isarr >= 0) || spsep || sep) { + if (ssub || spbreak || spsep || sep) { + int force_split = !ssub && (spbreak || spsep); if (isarr) { - val = sepjoin(aval, sep, 1); - isarr = 0; - ms_flags = 0; + if (nojoin == 0) { + val = sepjoin(aval, sep, 1); + isarr = 0; + ms_flags = 0; + } else if (force_split && nojoin == 2) { + /* Hack to simulate splitting individual elements: + * first join on what we later use to split + */ + val = sepjoin(aval, spsep, 1); + isarr = 0; + } } - if (!ssub && (spbreak || spsep)) { + if (force_split && !isarr) { aval = sepsplit(val, spsep, 0, 1); if (!aval || !aval[0]) val = dupstring(""); @@ -3527,7 +3536,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, } /* * TODO: It would be really quite nice to abstract the - * isarr and !issarr code into a function which gets + * isarr and !isarr code into a function which gets * passed a pointer to a function with the effect of * the promptexpand bit. Then we could use this for * a lot of stuff and bury val/aval/isarr inside a structure diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst index 7cb297e..35630c5 100644 --- a/Test/D04parameter.ztst +++ b/Test/D04parameter.ztst @@ -1920,3 +1920,58 @@ print $array 0:"-" works after "[" in same expression (Dash problem) >foo two three + + ( + setopt shwordsplit + set -- whim:wham:whom + IFS=: + print -l $@ + ) +0:Splitting of $@ on IFS: single element +>whim +>wham +>whom + + ( + setopt shwordsplit + set -- one:two bucklemy:shoe + IFS=: + print -l $@ + ) +0:Splitting of $@ on IFS: multiple elements +# No forced joining in this case +>one +>two +>bucklemy +>shoe + + ( + set -- one:two bucklemy:shoe + print -l ${(s.:.)@} + ) +0:Splitting of $@ on (s): multiple elements +# Forced joining in this case +>one +>two bucklemy +>shoe + + ( + set -- one:two bucklemy:shoe + print -l ${(@s.:.)@} + ) +0:Splitting of $@ on (@s): multiple elements +# Forced non-joining in this case +>one +>two +>bucklemy +>shoe + + ( + set -- one:two bucklemy:shoe + IFS= + setopt shwordsplit + print -l ${@} + ) +0:Joining of $@ does not happen when IFS is empty +>one:two +>bucklemy:shoe ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-08-12 7:50 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-08-08 11:16 [bug] shwordsplit not working on $@ when $# > 1 Stephane Chazelas 2016-08-08 17:28 ` Bart Schaefer 2016-08-08 18:02 ` Jérémie Roquet 2016-08-08 18:27 ` Peter Stephenson 2016-08-09 1:21 ` Bart Schaefer 2016-08-09 8:40 ` Peter Stephenson 2016-08-10 17:28 ` Bart Schaefer 2016-08-11 10:05 ` Peter Stephenson 2016-08-11 17:28 ` Bart Schaefer 2016-08-12 7:50 ` Bart Schaefer 2016-08-10 12:56 ` Peter Stephenson 2016-08-11 0:39 ` Bart Schaefer
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).