* ${^var} and word splitting
From: Stephane Chazelas @ 2014-11-24  9:56 UTC
To: Zsh hackers list

$ a=' 1 2 3 '
$ print -l $=a
1
2
3
$ print -l x$^=a
x
x1
x2
x3
x
$ print -l x${^${=a}}
x1
x2
x3

Why the extra "x" lines with x$^=a ?

Same for the (s:sep:) or (f) expansion flags.

-- 
Stephane

* Re: ${^var} and word splitting
From: Peter Stephenson @ 2014-11-24 11:12 UTC
To: Zsh hackers list

On Mon, 24 Nov 2014 09:56:37 +0000
Stephane Chazelas <stephane.chazelas@gmail.com> wrote:
> $ a=' 1 2 3 '
> $ print -l $=a
> 1
> 2
> 3
> $ print -l x$^=a
> x
> x1
> x2
> x3
> x
> $ print -l x${^${=a}}
> x1
> x2
> x3
>
>
> Why the extra "x" lines with x$^=a ?

In the case of $^=a, the steps are

- split a.  There's whitespace start and end so you get null elements
  corresponding to those.
- add the x's in front
- remove remaining null elements, but there aren't any.

With nested expansion, you get

- split a: same result
- remove null elements (before the x's get added).
- add the x's
- remove null elements for this level (but there aren't any more)

C.f.

$ print -l x${^"${(@)=a}"}
x
x1
x2
x3
x

which has been told explicitly to keep the null elements despite the
nesting.

-- 
Peter Stephenson <p.stephenson@samsung.com>  Principal Software Engineer
Tel: +44 (0)1223 434724  Samsung Cambridge Solution Centre
St John's House, St John's Innovation Park, Cowley Road,
Cambridge, CB4 0DS, UK

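The two orderings described in the message above can be modelled outside
zsh. What follows is a minimal POSIX sh sketch, not zsh code: it assumes
the split of ' 1 2 3 ' has already produced the five fields '', 1, 2, 3,
'' and only compares the order in which the prefix is added and empty
fields are dropped. The function names prefix_then_prune and
prune_then_prefix are hypothetical, chosen for this illustration.

```sh
#!/bin/sh
# Model only: the arguments stand in for the fields that the split
# step is described as producing, including the empty first and last.

# Order described for x$^=a: add the prefix, then drop empty results.
# After prefixing nothing is empty any more, so nothing is dropped.
prefix_then_prune() {
    for f in "$@"; do
        r="x$f"
        if [ -n "$r" ]; then
            printf '%s\n' "$r"
        fi
    done
}

# Order described for x${^${=a}}: drop empty fields at the inner
# level, then add the prefix to whatever is left.
prune_then_prefix() {
    for f in "$@"; do
        if [ -n "$f" ]; then
            printf '%s\n' "x$f"
        fi
    done
}

prefix_then_prune '' 1 2 3 ''   # prints x, x1, x2, x3, x (one per line)
prune_then_prefix '' 1 2 3 ''   # prints x1, x2, x3 (one per line)
```

The only difference between the two functions is when the emptiness
test happens, which is enough to reproduce the extra "x" lines.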
* Re: ${^var} and word splitting
From: Stephane Chazelas @ 2014-11-24 15:26 UTC
To: Peter Stephenson; +Cc: Zsh hackers list

2014-11-24 11:12:01 +0000, Peter Stephenson:
> On Mon, 24 Nov 2014 09:56:37 +0000
> Stephane Chazelas <stephane.chazelas@gmail.com> wrote:
> > $ a=' 1 2 3 '
> > $ print -l $=a
> > 1
> > 2
> > 3
> > $ print -l x$^=a
> > x
> > x1
> > x2
> > x3
> > x
> > $ print -l x${^${=a}}
> > x1
> > x2
> > x3
> >
> >
> > Why the extra "x" lines with x$^=a ?
>
> In the case of $^=a, the steps are
>
> - split a.  There's whitespace start and end so you get null elements
>   corresponding to those.
> - add the x's in front
> - remove remaining null elements, but there aren't any.

OK, thanks. That's a difference from other shells I was not aware
of, and it does seem to be as documented.

The source of my confusion can be simplified to:

~$ a=' 1 2 3 '
~$ printf '%s\n' "${=a}"

1
2
3

~$

In other shells, leading/trailing _IFS white space_ characters
are ignored as part of word splitting, not in zsh.

If I understand correctly, in zsh the removing of those are
accounted to null-removal in things like:

$ print -l $=a
1
2
3

But then it's not clear why they are removed there and not in:

a=':a::b:'
IFS=:
print -l $=a

?

-- 
Stephane

* Re: ${^var} and word splitting
From: Peter Stephenson @ 2014-11-24 15:55 UTC
To: Zsh hackers list

On Mon, 24 Nov 2014 15:26:28 +0000
Stephane Chazelas <stephane.chazelas@gmail.com> wrote:
> If I understand correctly, in zsh the removing of those are
> accounted to null-removal in things like:
>
> $ print -l $=a
> 1
> 2
> 3
>
> But then it's not clear why they are removed there and not in:
>
> a=':a::b:'
> IFS=:
> print -l $=a

I looked at the code and you're exactly right: it's not clear.

There's a parameter determining how the split function behaves and
there's an argument allownull that I already noted I didn't understand
in the comment to sepsplit(), determining whether the argument being set
will be empty or will be set to something that has no effect except
preventing the argument being removed later.  In the case in question
this is zero.

Consequently it's easy to change the behaviour in the second case...
This doesn't cause any test failures.  Unless anyone has any ideas why
we do this, maybe we should simplify it like this?  If anyone does have
ideas, we should write a test for that case.

There's one other case in parameter substitution to do with assignment
within the substitution that presumably ought to be consistent.

diff --git a/Src/subst.c b/Src/subst.c
index 61aa1c1..17f35be 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -3322,7 +3322,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags)
                     isarr = 0;
                 }
                 if (!ssub && (spbreak || spsep)) {
-                    aval = sepsplit(val, spsep, 0, 1);
+                    aval = sepsplit(val, spsep, 1, 1);
                     if (!aval || !aval[0])
                         val = dupstring("");
                     else if (!aval[1])

pws

* Re: ${^var} and word splitting
From: Bart Schaefer @ 2014-11-24 16:55 UTC
To: Zsh hackers list

On Nov 24,  3:55pm, Peter Stephenson wrote:
} Subject: Re: ${^var} and word splitting
}
} On Mon, 24 Nov 2014 15:26:28 +0000
} Stephane Chazelas <stephane.chazelas@gmail.com> wrote:
} > If I understand correctly, in zsh the removing of those are
} > accounted to null-removal in things like:
} >
} > $ print -l $=a
} > 1
} > 2
} > 3
} >
} > But then it's not clear why they are removed there and not in:
} >
} > a=':a::b:'
} > IFS=:
} > print -l $=a
}
} I looked at the code and you're exactly right: it's not clear.

Isn't it always the case that *whitespace* in IFS is treated differently
than non-whitespace?  E.g. consecutive whitespace is treated as a single
character, so (IFS=" " "a  b") is two words but (IFS=: "a::b") is three?

I'm not actually able to try examples at the moment so maybe I'm just not
following something.

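The whitespace versus non-whitespace rule raised above can be checked in
a POSIX-style shell such as bash or dash, where unquoted expansions are
split implicitly. This is a small sketch of the POSIX rule only, not of
zsh's $=a; the helper name show_fields is hypothetical.

```sh
#!/bin/sh
# show_fields STRING IFS_VALUE
# Print every field produced by splitting STRING, each wrapped in [ ].
# Runs in a subshell so the IFS change does not leak to the caller.
show_fields() {
    (
        IFS=$2
        set -f                      # keep globbing out of the picture
        for field in $1; do         # deliberately unquoted: split here
            printf '[%s]' "$field"
        done
        printf '\n'
    )
}

show_fields 'a  b' ' '    # prints [a][b]
show_fields 'a::b' ':'    # prints [a][][b]
```

Two spaces collapse into one delimiter, while two colons delimit an
empty field between them.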
* Re: ${^var} and word splitting
From: Peter Stephenson @ 2014-11-24 17:22 UTC
To: Zsh hackers list

On Mon, 24 Nov 2014 08:55:08 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Nov 24,  3:55pm, Peter Stephenson wrote:
> } Subject: Re: ${^var} and word splitting
> }
> } On Mon, 24 Nov 2014 15:26:28 +0000
> } Stephane Chazelas <stephane.chazelas@gmail.com> wrote:
> } > If I understand correctly, in zsh the removing of those are
> } > accounted to null-removal in things like:
> } >
> } > $ print -l $=a
> } > 1
> } > 2
> } > 3
> } >
> } > But then it's not clear why they are removed there and not in:
> } >
> } > a=':a::b:'
> } > IFS=:
> } > print -l $=a
> }
> } I looked at the code and you're exactly right: it's not clear.
>
> Isn't it always the case that *whitespace* in IFS is treated differently
> than non-whitespace?  E.g. consecutive whitespace is treated as a single
> character, so (IFS=" " "a  b") is two words but (IFS=: "a::b") is three?

Yes, that's right, but what zsh is doing is a bit funny: as Stephane
notes, in other shells you don't get the null arguments in the first
place if the special whitespace rule is being followed, it's not a
question of whether they get removed later or not.

At least I think so --- splitting is implicit in other shells so I may
just not be quite doing the equivalent.  Here's what I did in bash:

$ fn() { local arg; for arg in "$@"; do echo $arg; done; }
$ fn2() { fn $1; }
$ fn2 ' a b c '
a
b
c
$

So that $1 argument to fn2 gets split in the way we're talking about,
while within fn we make sure we pick up every piece that's been split
from it.  This definitely looks different from zsh.

However, I think what you mention is indeed the source of the difference
Stephane noticed, because doubling a whitespace character in IFS does
have the documented effect of making it work the other way (see the
zshparam manual; this is without the patch which is obviously not a
correct fix):

% a=' a b c '
% print $=a
a b c
% print -l $=a
a
b
c
% IFS='  ' # two spaces
% print -l $=a

a
b
c

%

So that explains the mysterious allownull whatever the explanation for
the implementation.

pws

* Re: ${^var} and word splitting
From: Stephane Chazelas @ 2014-11-24 21:18 UTC
To: Peter Stephenson; +Cc: Zsh hackers list

2014-11-24 15:55:24 +0000, Peter Stephenson:
[...]
> Consequently it's easy to change the behaviour in the second case...
> This doesn't cause any test failures.  Unless anyone has any ideas why
> we do this, maybe we should simplify it like this?  If anyone does have
> ideas, we should write a test for that case.
[...]

I'd say no. People expect:

IFS=:
PATH=/bin::/usr/bin
setopt shwordsplit
set -- $PATH

to split $PATH into /bin, "" and /usr/bin. That's how all the
shells (except the Bourne shell) behave (and POSIX requires) and
is the whole point of having _IFS white space_ in the first
place.

(BTW, POSIX also requires :/bin::/usr/bin: to be split into "",
"/bin", "" and "/usr/bin" (not another "") as IFS is the
internal field _delimiter_ there, not _separator_. I tend to
prefer the zsh way (also yash's and older versions of pdksh)
though.)

What I don't like much is IFS white spaces (or x's with (s:x:))
to be collapsed *but not removed from head and tail*.

The whole point of having /IFS white spaces/ was to split
strings the /natural/ way (like words in a text, like awk's
fields or like the Bourne shell did for any character of $IFS,
not just the whitespace ones). That means considering sequences
of blanks as one *and* leading and trailing blanks not to create
fields. A string like " : foo : bar : : baz " would be split
into "", foo, bar, "" and baz.

I don't see the point in doing one and not the other.
IOW in:

~$ a=' a  b ' zsh -c 'print -l ${(s, ,)a}'
a
b
~$ a=' a  b ' zsh -c 'print -l "${(s, ,)a}"'

a
b

~$ a=' a  b ' zsh -c 'print -l "${(s, ,@)a}"'

a

b

~$

I'd rather 2 above behave either like 1 or 2. I'm fine with 1 and
3 behave like they do now.

It may be too late to change the behaviour now, though I'd find
it hard to imagine people relying on "$=var" to make empty
arguments at the beginning and end but not in the middle.

-- 
Stephane

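The POSIX splitting results quoted in this message can be reproduced in
a POSIX-style shell such as bash or dash. This is a hedged sketch, not
zsh code: count_fields is a hypothetical helper, the strings are passed
as arguments rather than assigned to PATH, and IFS=' :' for the last
case is an assumption, since the message does not name the IFS value
used there.

```sh
#!/bin/sh
# count_fields STRING IFS_VALUE
# Print how many fields an unquoted expansion of STRING yields under
# the given IFS. Runs in a subshell so IFS is restored afterwards.
count_fields() {
    (
        IFS=$2
        set -f          # no pathname expansion on the split words
        set -- $1       # deliberately unquoted: field splitting here
        echo "$#"
    )
}

count_fields '/bin::/usr/bin' ':'           # prints 3
count_fields ':/bin::/usr/bin:' ':'         # prints 4
count_fields ' : foo : bar : : baz ' ' :'   # prints 5
```

The second call shows the delimiter behaviour: the trailing colon
terminates the last field rather than starting a new empty one. The
third shows leading and trailing blanks creating no fields while the
colons still delimit empty ones.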
* Re: ${^var} and word splitting
From: Bart Schaefer @ 2014-11-25  7:49 UTC
To: Zsh hackers list

On Nov 24,  9:18pm, Stephane Chazelas wrote:
}
} I don't see the point in doing one and not the other. IOW in:
}
} ~$ a=' a  b ' zsh -c 'print -l ${(s, ,)a}'
} a
} b
} ~$ a=' a  b ' zsh -c 'print -l "${(s, ,)a}"'
}
} a
} b
}
} ~$ a=' a  b ' zsh -c 'print -l "${(s, ,@)a}"'
}
} a
}
} b
}
} ~$
}
} I'd rather 2 above behave either like 1 or 2.

(Working with the presumption you mean "like 1 or 3".)

This may go back to a misinterpretation of documentation -- there are a
lot of little things about zsh that got that way because e.g. examples
in the ksh88 documentation were implemented without completely knowing
what was BNF-style markup and what was actual syntax.

Nevertheless I think the intention was that #2 is "collapse consecutive
whitespace to a single space and then act like #3".

In any case it all depends on where you apply the (@):

% a=' a  b ' zsh -c 'print -l "${(@)${(s, ,)a}}"'
a
b
%

} It may be too late to change the behaviour now, though I'd find
} it hard to imagine people relying on "$=var" to make empty
} arguments at the beginning and end but not in the middle.

I have the nagging suspicion there may be cases in the completion code
that expect exactly that ... or that have been programmed to work around
it and would need to be fixed if it changes.

-- 
Barton E. Schaefer

* Re: ${^var} and word splitting
From: Stephane Chazelas @ 2014-11-25 12:12 UTC
To: Bart Schaefer; +Cc: Zsh hackers list

2014-11-24 23:49:31 -0800, Bart Schaefer:
[...]
> } ~$ a=' a  b ' zsh -c 'print -l ${(s, ,)a}'
> } a
> } b
> } ~$ a=' a  b ' zsh -c 'print -l "${(s, ,)a}"'
> }
> } a
> } b
> }
> } ~$ a=' a  b ' zsh -c 'print -l "${(s, ,@)a}"'
> }
> } a
> }
> } b
> }
> } ~$
> }
> } I'd rather 2 above behave either like 1 or [3].
[...]
> } It may be too late to change the behaviour now, though I'd find
> } it hard to imagine people relying on "$=var" to make empty
> } arguments at the beginning and end but not in the middle.
>
> I have the nagging suspicion there may be cases in the completion code
> that expect exactly that ... or that have been programmed to work around
> it and would need to be fixed if it changes.
[...]

I'd be surprised if it were the case.

Anyway, if one wants 1, he can write it as 1, and if one wants
3, he can write it as 3. So, that's no big deal if 2 stays the
way it is. It's just that I don't find it intuitive or
/consistent/.

2 is specific to zsh anyway. Other shells don't split inside
double quotes (except for ${array[@]}) and $^a is zsh-specific.
So it's not a question of compatibility with other shells.
zsh -o shwordsplit works like other shells where the behaviour
is defined in other shells.

-- 
Stephane