Word splitting in zsh

* Word splitting in zsh
@ 2001-02-09  1:23 Deborah Ariel Pickett
  2001-02-09  3:09 ` Bart Schaefer
  0 siblings, 1 reply; 3+ messages in thread
From: Deborah Ariel Pickett @ 2001-02-09  1:23 UTC
  To: zsh-users

Hiya,

I've come across this little problem in zsh when I run it under
setopt SHWORDSPLIT (not that this is something I normally do).
What it boils down to is that in constructs like ${variable-word}
(where "-" is any of the characters that can go in that place - e.g.,
"-", "+", "=") the shell doesn't seem to be honouring double quotes the
way the manpage says.

Here are the relevant sections of the manpage:
[...] (Parameters)
      @      In double quotes, array elements are put into sepa-
              rate  words.   E.g.,  "${(@)foo}"  is equivalent to
              "${foo[@]}" and "${(@)foo[1,2]}"  is  the  same  as
              "$foo[1]" "$foo[2]".
[...] (Parameter expansion)
      ${name:+word}
              If name is set  and  is  non-null  then  substitute
              word; otherwise substitute nothing.
[...]
       Note that double quotes may appear around nested substitu-
       tions, in which case only the part inside  is  treated  as
       quoted;  for  example, ${(f)"$(foo)"} quotes the result of
       $(foo), but the flag `(f)' (see below)  is  applied  using
       the  rules  for unquoted substitutions.  Note further that
       quotes are themselves nested in this context; for example,
       in  "${(@f)"$(foo)"}",  there  are two sets of quotes, one
       surrounding the whole expression,  the  other  (redundant)
       surrounding the $(foo) as before.

With these two bits together, the POSIX and Bourne shells can express a
generic "any number of arguments, including zero" argument list by using
the form ${1+"$@"} (i.e., if $1 is set, substitute "$@", otherwise
substitute nothing).
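
As a minimal sketch of that idiom, assuming a POSIX-style sh such as bash or
dash (the names count_args and wrapper are made up for this illustration):

```shell
# count_args and wrapper are hypothetical helper names, not part of any shell.
count_args() {
    # Report how many words this function received.
    echo "$#"
}

wrapper() {
    # If $1 is set, substitute "$@" (one word per original argument);
    # otherwise substitute nothing at all.
    count_args ${1+"$@"}
}

wrapper "a1 a2 a3" b c   # three arguments are passed through intact
wrapper                  # no arguments are passed through
```

In such a shell the first call should report 3 and the second 0, because the
quotes around $@ keep "a1 a2 a3" together as one word.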

Trying that in zsh, I get this:

bruce ~ % echo $ZSH_VERSION
3.1.6
# (But this also applies in 3.1.9)
bruce ~ % setopt
alwaystoend
noappendhistory
autocd
noautomenu
autonamedirs
autopushd
autoresume
braceccl
noclobber
completeinword
correct
extendedglob
noflowcontrol
globdots
histexpiredupsfirst
histignoredups
histnostore
ignoreeof
interactive
nolistbeep
longlistjobs
monitor
numericglobsort
promptsubst
pushdignoredups
pushdminus
pushdsilent
pushdtohome
shinstdin
zle
bruce ~ % setopt shwordsplit
# Now splitting is done like in other shells.
bruce ~ % args()
function >{
function >for x
function for >do
function for >echo "'$x'"
function for >done
function >}
bruce ~ % args "a1 a2 a3" b c
'a1 a2 a3'
'b'
'c'
# Ok, that's what we expected.
bruce ~ % args "$@"
# Acceptable, but Bourne sh would print a single blank entry here, since
# there's a pair of quotes.
bruce ~ % args ${1+"$@"}
# This is the Proper Bourne Shell way to do it.
bruce ~ % set "a1 a2 a3" b c
bruce ~ % args "$@"
'a1 a2 a3'
'b'
'c'
# Fine . . .
bruce ~ % args ${1+"$@"}
'a1'
'a2'
'a3'
'b'
'c'
# What??  The $@ was in quotes, why was "a1 a2 a3" split?
bruce ~ % bash --posix
# Let's try the same under a POSIX shell.
bash-2.03$ args()
> {
> for x
> do
> echo "'$x'"
> done
> }
bash-2.03$ set "a1 a2 a3" b c
bash-2.03$ args ${1+"$@"}
'a1 a2 a3'
'b'
'c'
# Here it does it the correct way.

So . . am I misunderstanding how double quotes are propagated through
${} constructs in zsh?  Or is this a bona fide bug?  Whatever the
answer, this is something that doesn't work the same as in the Bourne
shell.  Should it?

The only hint of an answer comes in this section of the manpage:
       1. Nested Substitution
              If multiple nested ${...} forms are  present,  sub-
              stitution  is  performed  from the inside outwards.
              At each level, the substitution  takes  account  of
              whether  the current value is a scalar or an array,
              whether the whole substitution is in double quotes,
              and what flags are supplied to the current level of
              substitution, just as if  the  nested  substitution
              were  the  outermost.  The flags are not propagated
              up to enclosing substitutions; the nested substitu-
              tion  will  return  either  a scalar or an array as
              determined by  the  flags,  possibly  adjusted  for
              quoting.   All the following steps take place where
              applicable at all  levels  of  substitution.   Note
              that,  unless  the `(P)' flag is present, the flags
              and any subscripts apply directly to the  value  of
              the nested substitution; for example, the expansion
              ${${foo}} behaves exactly the same as ${foo}.

This makes some sense, since these produce different results:
bruce ~ % args "${@}"
'a1 a2 a3'
'b'
'c'
bruce ~ % args "${${@}}"
'a1 a2 a3 b c'

though I wonder if they ought to, since this disagrees with the last
sentence about ${${foo}} and ${foo}.

-- 
Debbie Pickett http://www.csse.monash.edu.au/~debbiep debbiep@csse.monash.edu.au
"Look at me; I will never pass for a perfect bride, or a perfect daughter.  Can
  it be I'm not meant to play this part? Now I see that if I were truly to be
        myself, I would break my family's heart." - Reflection, _Mulan_



* Re: Word splitting in zsh
  2001-02-09  1:23 Word splitting in zsh Deborah Ariel Pickett
@ 2001-02-09  3:09 ` Bart Schaefer
  2001-02-09  7:34   ` Andrej Borsenkow
  0 siblings, 1 reply; 3+ messages in thread
From: Bart Schaefer @ 2001-02-09  3:09 UTC
  To: Deborah Ariel Pickett, zsh-users

On Feb 9, 12:23pm, Deborah Ariel Pickett wrote:
}
} I've come across this little problem in zsh when I run it under
} setopt SHWORDSPLIT (not that this is something I normally do).

There's definitely some kind of bug here.

zagzig% echo $ZSH_VERSION
3.0.8
zagzig% set "a1 a2 a3" b c
zagzig% print -l ${1+"$@"}
a1 a2 a3
b
c
zagzig% setopt shwordsplit
zagzig% print -l ${1+"$@"}
a1
a2
a3
b
c
zagzig% 

Well, that's not quite right, but 3.1.9-dev-8 is even worse:

zagzig% echo $ZSH_VERSION 
3.1.9-dev-8
zagzig% set "a1 a2 a3" b c
zagzig% print -l ${1+"$@"}
a1 a2 a3 b c						<-- Yipes!
zagzig% setopt shwordsplit
zagzig% print -l ${1+"$@"}
a1
a2
a3
b
c
zagzig% 

It wasn't always so:

zagzig% echo $VERSION
zsh 2.4.306 beta
zagzig% setopt shwordsplit
zagzig% set "a1 a2 a3" b c
zagzig% print -l ${1+"$@"}
a1 a2 a3
b
c
zagzig% 

I don't know exactly when this bug was introduced, though.

} bruce ~ % args "$@"
} # Acceptable, but Bourne sh would print a single blank entry here, since
} # there's a pair of quotes.

Actually, that's not quite true.  Some versions of Bourne sh expand "$@"
to the empty string, and some expand it to no string at all.  The reason
for the ${1+"$@"} hack in many shell scripts is so that you don't have
to care which flavor of Bourne shell you have.  Zsh has always tried to
be in the latter camp, e.g.,

zagzig% print -l X "" X
X

X
zagzig% print -l X "$@" X
X
X
zagzig% 

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net   


* RE: Word splitting in zsh
  2001-02-09  3:09 ` Bart Schaefer
@ 2001-02-09  7:34   ` Andrej Borsenkow
  0 siblings, 0 replies; 3+ messages in thread
From: Andrej Borsenkow @ 2001-02-09  7:34 UTC
  To: Deborah Ariel Pickett, zsh-users

Some general notes. The POSIX shell uses one-level textual substitution.
It does *not* know anything about the internal structure of variables. It does
*not* split anything in the middle of substitutions. It behaves damn simply -
it replaces the values *once* and then splits the whole line. That is the only
context where the term "word" makes sense - meaning exactly "positional
parameter passed to command".
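
A small sketch of that "substitute once, then split" model, assuming a POSIX
sh such as bash or dash (count_args, value and nested are names invented for
this example):

```shell
# count_args is a hypothetical helper that reports how many words it received.
count_args() {
    echo "$#"
}

value='a1 a2 a3'
nested='$HOME x'

count_args $value      # unquoted: the substituted text is split into 3 words
count_args "$value"    # quoted: no splitting, so 1 word
count_args $nested     # 2 words; the literal text $HOME is not expanded again
```

Note that plain zsh without SH_WORD_SPLIT does not split the unquoted
expansions here, which is the difference this thread is about.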

> }
> } I've come across this little problem in zsh when I run it under
> } setopt SHWORDSPLIT (not that this is something I normally do).
>
> There's definitely some kind of bug here.
>
> zagzig% echo $ZSH_VERSION
> 3.0.8
> zagzig% set "a1 a2 a3" b c
> zagzig% print -l ${1+"$@"}
> a1 a2 a3
> b
> c

That is correct and is how sh behaves.

> zagzig% setopt shwordsplit
> zagzig% print -l ${1+"$@"}
> a1
> a2
> a3
> b
> c
> zagzig%
>

> Well, that's not quite right,

It is simply wrong.

> but 3.1.9-dev-8 is even worse:
>
> zagzig% echo $ZSH_VERSION
> 3.1.9-dev-8
> zagzig% set "a1 a2 a3" b c
> zagzig% print -l ${1+"$@"}
> a1 a2 a3 b c						<-- Yipes!

Well, this is "correct" *zsh* behaviour. The part after `+' is a word, not an
array. It is taken as a single word and is never split. What happens here
is:

- zsh evaluates "$@", which gives you an array with three elements
- but because of the "scalar context" in this case (the best I can call it),
the array is concatenated, forming the above value. Even worse, it is
inconsistent with everything else - array joining is supposed to use IFS ...
but it does not in this case. We get (quoting doc): "If NAME is an array
parameter, and the KSH_ARRAYS option is not set, then the value of each
element of NAME is substituted, one element per word."; these elements are
then joined together with a space, ignoring the actual IFS value.

> zagzig% setopt shwordsplit
> zagzig% print -l ${1+"$@"}
> a1
> a2
> a3
> b
> c
> zagzig%
>

That is just because of the above. The structure of WORD in ${name+WORD} is
not remembered. But note the same bug again:

bor@itsrm2% set 'a b c' 1 2
bor@itsrm2% IFS=: print -l ${1+"$@"}
a b c 1 2
bor@itsrm2% setopt shwordsplit
bor@itsrm2% IFS=: print -l ${1+"$@"}
a
b
c
1
2

IFS value is silently ignored.
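
For comparison, a minimal sketch of a join that does honour IFS, assuming a
POSIX sh such as bash or dash, where "$*" joins the positional parameters
with the first character of IFS (join_params is a name made up for this
example):

```shell
# join_params is a hypothetical helper. The body runs in a subshell, written
# with ( ... ), so the IFS change does not leak into the calling shell.
join_params() (
    IFS=:
    # "$*" expands to a single word: all parameters joined by the first
    # character of IFS.
    echo "$*"
)

join_params 'a b c' 1 2
```

In such a shell this should print a b c:1:2, i.e. the colon from IFS is the
separator rather than a hard-wired space.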

>
> I don't know exactly when this bug was introduced, though.
>

That is almost inevitable in the current implementation. I repeat - sh word
splitting is done exactly once on the line after all substitutions have been
done. In zsh word splitting happens at every level as part of every ${...}
substitution. I never liked it but could not find a good example. Thank you
for finding it :)

> } bruce ~ % args "$@"
> } # Acceptable, but Bourne sh would print a single blank entry here, since
> } # there's a pair of quotes.
>
> Actually, that's not quite true.  Some versions of Bourne sh expand "$@"
> to the empty string, and some expand it to no string at all.  The reason
> for the ${1+"$@"} hack in many shell scripts is so that you don't have
> to care which flavor of Bourne shell you have.  Zsh has always tried to
> be in the latter camp, e.g.,
>

Here saith SUS V2:

Expands to the positional parameters, starting from one. When the expansion
occurs within double-quotes, and where field splitting (see Field Splitting )
is performed, each positional parameter expands as a separate field, with the
provision that the expansion of the first parameter is still joined with the
beginning part of the original word (assuming that the expanded parameter was
embedded within a word), and the expansion of the last parameter is still
joined with the last part of the original word. If there are no positional
parameters, the expansion of "@" generates zero fields, even when "@" is
double-quoted.
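
A quick way to check the last sentence in a given shell; this is a sketch
that assumes the shell follows the rule quoted above (bash and dash do), and
count_args is a helper name invented for this example:

```shell
# count_args is a hypothetical helper that reports how many words it received.
count_args() {
    echo "$#"
}

set --             # clear the positional parameters
count_args "$@"    # "$@" with no parameters generates zero fields
count_args ""      # an explicit empty string is still one field
```

A shell that follows the rule should report 0 for the first call and 1 for
the second.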

-andrej
