zsh-workers
* Order of field splitting in zsh
@ 1997-01-16 14:07 Andrej Borsenkow
  1997-01-16 15:55 ` Zoltan Hidvegi
  0 siblings, 1 reply; 4+ messages in thread
From: Andrej Borsenkow @ 1997-01-16 14:07 UTC (permalink / raw)
  To: Zsh workers mailing list


POSIX.2 defines the following order of expansions in sh:

1. tilde expansion, parameter expansion, command substitution, arithmetic
expansion
2. field splitting (_after_ the above)
3. pathname expansion (globbing)
4. quote removal.
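
A minimal sketch of step 2 following step 1, assuming a POSIX-style sh such as bash or dash (not zsh in its native mode, which does not split unquoted parameters by default). The helper `args` here is hypothetical and only prints the argument count followed by each argument in brackets:

```shell
# Hypothetical helper (not from the thread): print the argument count,
# then each argument in brackets on its own line.
args() { echo "$#"; for a in "$@"; do echo "[$a]"; done; }

v='a b'

# Unquoted: the parameter is expanded first, then the result is field-split,
# so the surrounding literal text attaches to the first and last fields.
args x${v}y      # 2 fields: xa and by

# Quoted: the expansion still happens, but no field splitting is applied.
args "x${v}y"    # 1 field: xa by
```

The first call shows that splitting operates on the text produced by the expansion, not on the word as typed.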

It seems that zsh (even when invoked as sh) performs field splitting on the
result of a command substitution _immediately_ after obtaining the value. For
example:

% sh #where sh is linked to zsh
% args a$(echo a b)b${IFS::=:}
3
aa
bb
%

(the third being the null string). If I understand the POSIX spec correctly, it
should give _two_ arguments ('aa bb' and the empty string).

This example is obviously artificial; I cannot currently say whether it can
be a problem in real life. (Note that ${var::=val} is illegal in
POSIX; I use it just to demonstrate the order of substitutions.)

greetings

-------------------------------------------------------------------------
Andrej Borsenkow 		Fax:   +7 (095) 252 01 05
SNI ITS Moscow			Tel:   +7 (095) 252 13 88

NERV:  borsenkow.msk		E-Mail: borsenkow.msk@sni.de
-------------------------------------------------------------------------




* Re: Order of field splitting in zsh
  1997-01-16 14:07 Order of field splitting in zsh Andrej Borsenkow
@ 1997-01-16 15:55 ` Zoltan Hidvegi
  1997-01-22 15:58   ` Andrej Borsenkow
  0 siblings, 1 reply; 4+ messages in thread
From: Zoltan Hidvegi @ 1997-01-16 15:55 UTC (permalink / raw)
  To: borsenkow.msk; +Cc: zsh-workers

Andrej Borsenkow wrote:
> 
> POSIX.2 defines the following order of expansions in sh:
> 
> 1. tilde expansion, parameter expansion, command substitution, arithmetic
> expansion
> 2. field splitting (_after_ the above)
> 3. pathname expansion (globbing)
> 4. quote removal.
> 
> It seems that zsh (even when invoked as sh) performs field splitting on the
> result of a command substitution _immediately_ after obtaining the value. For
> example:
> 
> % sh #where sh is linked to zsh
> % args a$(echo a b)b${IFS::=:}
> 3
> aa
> bb
> %
> 
> (the third being the null string). If I understand the POSIX spec correctly, it
> should give _two_ arguments ('aa bb' and the empty string).
> 
> This example is obviously artificial; I cannot currently say whether it can
> be a problem in real life. (Note that ${var::=val} is illegal in
> POSIX; I use it just to demonstrate the order of substitutions.)

You are right, but that can only cause problems when IFS changes in step
one, and under POSIX that can only happen when IFS was previously set to the
empty string.  I checked AT&T ksh and pdksh:

% ksh
$ args () { for i; do echo $i; done ; }
$ IFS=
$ args $(echo a b c)${IFS:=' '}
a b c
$ args $(echo a b c)${IFS:=' '}
a
b
c

As you can see, ksh behaves like zsh.  Bash behaves as POSIX requires.  But I
do not think it is a real problem, and the fix would just complicate the code
unnecessarily.  Note that both versions of ksh I tested claim POSIX compliance.
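
The check above can be sketched as a small probe that reports which order a given shell uses. This assumes a POSIX-style sh; the helper name `count` is hypothetical, and the result depends on the shell it is run under, so no single output is claimed here:

```shell
# Hypothetical helper: print only the number of fields received.
count() { echo "$#"; }

# Run in a subshell so the IFS change does not leak into the caller.
n=$(
    IFS=
    # Step one expands the command substitution and then ${IFS:=' '},
    # which assigns a space to the previously empty IFS.
    count $(echo a b c)${IFS:=' '}
)

echo "fields: $n"
# 3 -> splitting was done after all step-one expansions (the POSIX order)
# 1 -> the command substitution was split immediately, with the old empty IFS
```

Using a count rather than printing the fields avoids the second, unquoted expansion in `echo $i`, which would itself be affected by the changed IFS.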

Zoltan



* Re: Order of field splitting in zsh
  1997-01-16 15:55 ` Zoltan Hidvegi
@ 1997-01-22 15:58   ` Andrej Borsenkow
  1997-01-22 21:59     ` Zoltan Hidvegi
  0 siblings, 1 reply; 4+ messages in thread
From: Andrej Borsenkow @ 1997-01-22 15:58 UTC (permalink / raw)
  To: Zoltan Hidvegi; +Cc: zsh-workers

On Thu, 16 Jan 1997, Zoltan Hidvegi wrote:

> Andrej Borsenkow wrote:
> > 
> > POSIX.2 defines the following order of expansions in sh:
> > 
> > 1. tilde expansion, parameter expansion, command substitution, arithmetic
> > expansion
> > 2. field splitting (_after_ the above)
> > 3. pathname expansion (globbing)
> > 4. quote removal.
> > 
> > It seems that zsh (even when invoked as sh) performs field splitting on the
> > result of a command substitution _immediately_ after obtaining the value. For
> > example:
> > 
...
> 
> You are right but that can only cause problems when IFS changes in step
> one, and under POSIX it can only happen when it was set to the empty string
> previously.  I checked AT&T ksh and pdksh:
> 
...
> 
> As you can see, ksh behaves like zsh.  Bash behaves as POSIX requires.  But I
> do not think it is a real problem, and the fix would just complicate the code
> unnecessarily.  Note that both versions of ksh I tested claim POSIX compliance.
> 

There is a simpler case:

% ./sh (where sh -> /bin/zsh)
% args $(echo 'a ')$(echo 'b')
                ^ note blank here (or any IFS white space)
1
ab
% /bin/ksh
% args $(echo 'a ')$(echo 'b')
2
a
b
% 

The same happens with /bin/sh (well, my /bin/sh doesn't understand $(...), but
with `...` it behaves like ksh).
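
A minimal sketch of the same case, assuming a POSIX-style sh such as bash or dash and a hypothetical `args` helper that prints the argument count and then each argument. The parameter form is added only to show that the trailing blank acts as a field separator regardless of which expansion produced it:

```shell
# Hypothetical helper (not from the thread): print the argument count,
# then each argument in brackets on its own line.
args() { echo "$#"; for a in "$@"; do echo "[$a]"; done; }

x='a '   # note the trailing blank (IFS white space)
y='b'

# The results of both expansions are joined to "a b" and only then split,
# so the trailing blank of the first expansion separates two fields.
args $x$y
args $(echo 'a ')$(echo 'b')
```

In a shell that splits after all expansions, both calls report 2 fields, `a` and `b`, which matches the ksh transcript above.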

greetings







* Re: Order of field splitting in zsh
  1997-01-22 15:58   ` Andrej Borsenkow
@ 1997-01-22 21:59     ` Zoltan Hidvegi
  0 siblings, 0 replies; 4+ messages in thread
From: Zoltan Hidvegi @ 1997-01-22 21:59 UTC (permalink / raw)
  To: borsenkow.msk; +Cc: zsh-workers

Andrej Borsenkow wrote:
> There is a simpler case:
> 
> % ./sh (where sh -> /bin/zsh)
> % args $(echo 'a ')$(echo 'b')
>                 ^ note blank here (or any IFS white space)
> 1
> ab

That's a real bug.  Here is a fix.  It causes another minor change:

% args () { for i; do print -r -- "/$i/"; done ; }
% args a${^=:- }b                                  
/ab/
/ab/
% args a${^=:- } 
/a/
/a/

Previously zsh expanded these to one word.

Zoltan


*** Src/utils.c	1997/01/20 01:04:43	3.1.1.22
--- Src/utils.c	1997/01/22 21:10:46
***************
*** 1429,1439 ****
  {
      char *t, **ret, **ptr;
  
!     skipwsep(&s);
!     ptr = ret = (char **) ncalloc(sizeof(*ret) * (wordcount(s, NULL, 0) + 1));
  
      if (*s && isep(*s == Meta ? s[1] ^ 32 : *s))
  	*ptr++ = dupstring(allownull ? "" : nulstring);
      while (*s) {
  	if (isep(*s == Meta ? s[1] ^ 32 : *s)) {
  	    if (*s == Meta)
--- 1429,1441 ----
  {
      char *t, **ret, **ptr;
  
!     ptr = ret = (char **) ncalloc(sizeof(*ret) * (wordcount(s, NULL, -!allownull) + 1));
  
+     skipwsep(&s);
      if (*s && isep(*s == Meta ? s[1] ^ 32 : *s))
  	*ptr++ = dupstring(allownull ? "" : nulstring);
+     else if (!allownull && t != s)
+ 	*ptr++ = dupstring("");
      while (*s) {
  	if (isep(*s == Meta ? s[1] ^ 32 : *s)) {
  	    if (*s == Meta)
***************
*** 1448,1455 ****
--- 1450,1460 ----
  	    ztrncpy(*ptr++, t, s - t);
  	} else
  	    *ptr++ = dupstring(nulstring);
+ 	t = s;
  	skipwsep(&s);
      }
+     if (!allownull && t != s)
+ 	*ptr++ = dupstring("");
      *ptr = NULL;
      return ret;
  }
***************
*** 1536,1560 ****
  	    if ((c && *(s + sl)) || mul)
  		r++;
      } else {
! 	char *t;
  
  	r = 0;
! 	if (!mul)
  	    skipwsep(&s);
! 	if (*s && isep(*s == Meta ? s[1] ^ 32 : *s))
  	    r++;
! 	for (t = s; *t; r++) {
! 	    if (isep(*t == Meta ? t[1] ^ 32 : *t)) {
! 		if (*t == Meta)
! 		    t++;
! 		t++;
! 		if (!mul)
! 		    skipwsep(&t);
  	    }
! 	    findsep(&t, NULL);
! 	    if (!mul)
! 		skipwsep(&t);
  	}
      }
      return r;
  }
--- 1541,1569 ----
  	    if ((c && *(s + sl)) || mul)
  		r++;
      } else {
! 	char *t = s;
  
  	r = 0;
! 	if (mul <= 0)
  	    skipwsep(&s);
! 	if ((*s && isep(*s == Meta ? s[1] ^ 32 : *s)) ||
! 	    (mul < 0 && t != s))
  	    r++;
! 	for (; *s; r++) {
! 	    if (isep(*s == Meta ? s[1] ^ 32 : *s)) {
! 		if (*s == Meta)
! 		    s++;
! 		s++;
! 		if (mul <= 0)
! 		    skipwsep(&s);
  	    }
! 	    findsep(&s, NULL);
! 	    t = s;
! 	    if (mul <= 0)
! 		skipwsep(&s);
  	}
+ 	if (mul < 0 && t != s)
+ 	    r++;
      }
      return r;
  }
*** Src/subst.c	1997/01/18 22:04:39	3.1.1.4
--- Src/subst.c	1997/01/22 21:45:16
***************
*** 1134,1139 ****
--- 1134,1140 ----
  	case Equals:
  	    if (vunset) {
  		char sav = *idend;
+ 		int l;
  
  		*idend = '\0';
  		val = dupstring(u);
***************
*** 1146,1161 ****
  		    char *arr[2], **t, **a, **p;
  		    if (spsep || spbreak) {
  			aval = sepsplit(val, spsep, 0);
! 			isarr = 1;
  			sep = spsep = NULL;
  			spbreak = 0;
  		    } else if (!isarr) {
  			arr[0] = val;
  			arr[1] = NULL;
! 			aval = arr;
  		    }
! 		    p = a = zcalloc(sizeof(char *) * (arrlen(aval) + 1));
! 		    for (t = aval; *t; untokenize(*t), *p++ = ztrdup(*t++));
  		    setaparam(idbeg, a);
  		} else {
  		    untokenize(val);
--- 1147,1174 ----
  		    char *arr[2], **t, **a, **p;
  		    if (spsep || spbreak) {
  			aval = sepsplit(val, spsep, 0);
! 			isarr = 2;
  			sep = spsep = NULL;
  			spbreak = 0;
+ 			l = arrlen(aval);
+ 			if (l && !*(aval[l-1]))
+ 			    l--;
+ 			if (l && !**aval)
+ 			    l--, t = aval + 1;
+ 			else
+ 			    t = aval;
  		    } else if (!isarr) {
  			arr[0] = val;
  			arr[1] = NULL;
! 			t = aval = arr;
! 			l = 1;
  		    }
! 		    p = a = zalloc(sizeof(char *) * (l + 1));
! 		    while (l--) {
! 			untokenize(*t);
! 			*p++ = ztrdup(*t++);
! 		    }
! 		    *p++ = NULL;
  		    setaparam(idbeg, a);
  		} else {
  		    untokenize(val);
***************
*** 1309,1315 ****
  	    else if (!aval[1])
  		val = aval[0];
  	    else
! 		isarr = 1;
  	}
      }
      if (casmod) {
--- 1322,1328 ----
  	    else if (!aval[1])
  		val = aval[0];
  	    else
! 		isarr = 2;
  	}
      }
      if (casmod) {
***************
*** 1403,1409 ****
  		for (tn = firstnode(tl); tn; incnode(tn)) {
  		    strcatsub(&y, ostr, aptr, x, xlen,
  			      (char *) getdata(tn), globsubst);
! 		    if (qt && !*y)
  			y = dupstring(nulstring);
  		    if (i == 1)
  			setdata(n, (void *) y);
--- 1416,1422 ----
  		for (tn = firstnode(tl); tn; incnode(tn)) {
  		    strcatsub(&y, ostr, aptr, x, xlen,
  			      (char *) getdata(tn), globsubst);
! 		    if (qt && !*y && isarr != 2)
  			y = dupstring(nulstring);
  		    if (i == 1)
  			setdata(n, (void *) y);
***************
*** 1420,1426 ****
  		return NULL;
  	    xlen = strlen(x);
  	    strcatsub(&y, ostr, aptr, x, xlen, NULL, globsubst);
! 	    if (qt && !*y)
  		y = dupstring(nulstring);
  	    setdata(n, (void *) y);
  
--- 1433,1439 ----
  		return NULL;
  	    xlen = strlen(x);
  	    strcatsub(&y, ostr, aptr, x, xlen, NULL, globsubst);
! 	    if (qt && !*y && isarr != 2)
  		y = dupstring(nulstring);
  	    setdata(n, (void *) y);
  
***************
*** 1433,1439 ****
  				  premul, postmul);
  		if (eval && parsestr(x))
  		    return NULL;
! 		if (qt && !*x)
  		    y = dupstring(nulstring);
  		else if (globsubst)
  		    tokenize(y = dupstring(x));
--- 1446,1452 ----
  				  premul, postmul);
  		if (eval && parsestr(x))
  		    return NULL;
! 		if (qt && !*x && isarr != 2)
  		    y = dupstring(nulstring);
  		else if (globsubst)
  		    tokenize(y = dupstring(x));
***************
*** 1450,1456 ****
  		return NULL;
  	    xlen = strlen(x);
  	    *str = strcatsub(&y, aptr, aptr, x, xlen, s, globsubst);
! 	    if (qt && !*y)
  		y = dupstring(nulstring);
  	    insertlinknode(l, n, (void *) y), incnode(n);
  	}
--- 1463,1469 ----
  		return NULL;
  	    xlen = strlen(x);
  	    *str = strcatsub(&y, aptr, aptr, x, xlen, s, globsubst);
! 	    if (qt && !*y && isarr != 2)
  		y = dupstring(nulstring);
  	    insertlinknode(l, n, (void *) y), incnode(n);
  	}
***************
*** 1469,1475 ****
  	    return NULL;
  	xlen = strlen(x);
  	*str = strcatsub(&y, ostr, aptr, x, xlen, s, globsubst);
! 	if (qt && !*y)
  	    y = dupstring(nulstring);
  	setdata(n, (void *) y);
      }
--- 1482,1488 ----
  	    return NULL;
  	xlen = strlen(x);
  	*str = strcatsub(&y, ostr, aptr, x, xlen, s, globsubst);
! 	if (qt && !*y && isarr != 2)
  	    y = dupstring(nulstring);
  	setdata(n, (void *) y);
      }


