zsh-workers
* Arith parsing bug with minus after $#
@ 2015-05-28 19:42 Martijn Dekker
  2015-05-29 15:02 ` Peter Stephenson
  0 siblings, 1 reply; 12+ messages in thread
From: Martijn Dekker @ 2015-05-28 19:42 UTC (permalink / raw)
  To: zsh-workers

After the getopts patch (which appeared within an hour after my report
-- impressive!), I ran into another hurdle with the same shell function.
Doing arithmetic calculations with the shell parameter containing the
number of positional parameters ($#) triggers a parsing bug in
arithmetic evaluation.

In current zsh git code:

$ zsh
% set --
% echo $#
0
% echo $(($#-1))
41

Expected output: -1, of course. Sometimes 81 is produced instead of 41!
I haven't figured out a pattern yet.

% echo $(($#-(1+1)))
zsh: bad math expression: operator expected at `(1+1)'

(expected output: -2)

In both cases, it does work correctly if a space is inserted before the
'-', but that space should be optional.

Enabling 'emulate sh' makes no difference.

This must be a long-standing bug, because the zsh 4.3.11 that came with my
Mac has it too.

(According to my testing, other shells that support arith all do this
correctly.)
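
A minimal sketch of the workaround described above, assuming a POSIX-style shell. The function name count_minus is made up for illustration, and the braces around # are an extra precaution that this report does not itself test; the space before the '-' is the part confirmed above.

```shell
# count_minus N [ARGS...]: print the number of ARGS minus N.
# The space before '-' (plus explicit braces) keeps "$#" from
# being read together with the following '-'.
count_minus() {
    n=$1
    shift
    echo $(( ${#} - $n ))
}

count_minus 1          # no remaining arguments: prints -1
count_minus 2 a b c    # three remaining arguments: prints 1
```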

- Martijn


* Re: Arith parsing bug with minus after $#
  2015-05-28 19:42 Arith parsing bug with minus after $# Martijn Dekker
@ 2015-05-29 15:02 ` Peter Stephenson
  2015-05-29 15:43   ` Martijn Dekker
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Peter Stephenson @ 2015-05-29 15:02 UTC (permalink / raw)
  To: zsh-workers

On Thu, 28 May 2015 21:42:41 +0200
Martijn Dekker <martijn@inlv.org> wrote:
> % set --
> % echo $#
> 0
> % echo $(($#-1))
> 41

That's not a compatibility issue, that's just plain weird.  I don't know
the POSIX terminology.

The problem is the overloading of "#" --- the test to establish what to
do with it is trying too hard to resolve to ${#-}, which is a valid
substitution, because it hasn't taken into account that there are no
braces.  So what you're seeing is ${#-}1.

"-" is overloaded, too, so there could be other cases involving those
two characters where they're misinterpreted, even with braces.
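
The different roles of "#" and "-" can be seen side by side in a small sketch. It sticks to forms that POSIX-style shells agree on; the function and variable names are invented here for illustration.

```shell
# show_overloads: print one line per role of '-' and '#'.
show_overloads() {
    unset maybe_unset                         # hypothetical variable name
    printf '%s\n' "${maybe_unset-fallback}"   # '-' as the "use default" operator
    printf '%s\n' "${#}"                      # '#' as the parameter-count parameter
    opts=$-                                   # '-' as the special parameter $-
    printf '%s\n' "${#opts}"                  # '#' as the length operator
}

show_overloads x y
```

With two arguments the first two lines are "fallback" and "2"; the third depends on which options the running shell has set.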

diff --git a/Src/subst.c b/Src/subst.c
index d4a04b8..168f7f1 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -2170,7 +2170,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags)
 		     */
 		    || ((cc == '#' || cc == Pound) &&
 			s[2] == Outbrace)
-		    || cc == '-' || (cc == ':' && s[2] == '-')
+		    || (inbrace && (cc == '-' || (cc == ':' && s[2] == '-')))
 		    || (isstring(cc) && (s[2] == Inbrace || s[2] == Inpar)))) {
 	    getlen = 1 + whichlen, s++;
 	    /*
diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index d96ffb6..c41e05e 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -1703,3 +1703,8 @@
   funnychars='The qu*nk br!wan f@x j/mps o[]r \(e la~# ^"&;'
   [[ $funnychars = ${~${(b)funnychars}} ]]
 0:${(b)...} quoting protects from GLOB_SUBST
+
+  set --
+  print $#-1
+0:Avoid confusion after overloaded characters in braceless substitution
+>0-1


* Re: Arith parsing bug with minus after $#
  2015-05-29 15:02 ` Peter Stephenson
@ 2015-05-29 15:43   ` Martijn Dekker
  2015-05-29 18:09   ` Bart Schaefer
  2015-06-07  1:08   ` Bart Schaefer
  2 siblings, 0 replies; 12+ messages in thread
From: Martijn Dekker @ 2015-05-29 15:43 UTC (permalink / raw)
  To: zsh-workers

Peter Stephenson wrote on 29-05-15 at 17:02:
> On Thu, 28 May 2015 21:42:41 +0200
> Martijn Dekker <martijn@inlv.org> wrote:
>> % set --
>> % echo $#
>> 0
>> % echo $(($#-1))
>> 41
> 
> That's not a compatibility issue, that's just plain weird.  I don't know
> the POSIX terminology.
> 
> The problem is the overloading of "#" --- the test to establish what to
> do with it is trying too hard to resolve to ${#-}, which is a valid
> substitution, because it hasn't taken into account that there are no
> braces.  So what you're seeing is ${#-}1.

Ah, yes... ${#-} is the length in characters of $-, the shell options
that are set. Before 'emulate sh', the length of $- is 8, and after,
it's 4. That explains the results I got.
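
The same result can be rebuilt by hand. This is only a sketch of the explanation above, assuming a POSIX-style shell; misparse_result and opts are names invented here.

```shell
# misparse_result DIGITS: reproduce what the misparsed "$#-DIGITS" yields,
# namely the length of $- followed by the literal DIGITS.
misparse_result() {
    opts=$-                        # option letters currently set
    printf '%s\n' "${#opts}$1"     # length of $-, then the trailing digits
}

misparse_result 1
```

With the lengths mentioned above (4 under 'emulate sh', 8 otherwise) this gives 41 and 81; other shells or option sets will show a different leading number.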

The fix is working; thanks yet again.

- M.


* Re: Arith parsing bug with minus after $#
  2015-05-29 15:02 ` Peter Stephenson
  2015-05-29 15:43   ` Martijn Dekker
@ 2015-05-29 18:09   ` Bart Schaefer
  2015-05-29 19:33     ` Martijn Dekker
  2015-06-07  1:08   ` Bart Schaefer
  2 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2015-05-29 18:09 UTC (permalink / raw)
  To: Zsh hackers list

On May 29, 2015 8:02 AM, "Peter Stephenson" <p.stephenson@samsung.com>
wrote:
>
>
> The problem is the overloading of "#" --- the test to establish what to
> do with it is trying too hard to resolve to ${#-}, which is a valid
> substitution, because it hasn't taken into account that there are no
> braces.  So what you're seeing is ${#-}1.

I think this actually was discussed on austin-group a few months back.  My
recollection is that zsh's behavior was deemed permissible and I therefore
thought no more about it at the time.

* Re: Arith parsing bug with minus after $#
  2015-05-29 18:09   ` Bart Schaefer
@ 2015-05-29 19:33     ` Martijn Dekker
  2015-05-29 20:48       ` ZyX
  0 siblings, 1 reply; 12+ messages in thread
From: Martijn Dekker @ 2015-05-29 19:33 UTC (permalink / raw)
  To: zsh-workers

Bart Schaefer wrote on 29-05-15 at 20:09:
> On May 29, 2015 8:02 AM, "Peter Stephenson" <p.stephenson@samsung.com>
> wrote:
>> The problem is the overloading of "#" --- the test to establish what to
>> do with it is trying too hard to resolve to ${#-}, which is a valid
>> substitution, because it hasn't taken into account that there are no
>> braces.  So what you're seeing is ${#-}1.
> 
> I think this actually was discussed on austin-group a few months back.  My
> recollection is that zsh's behavior was deemed permissible and I therefore
> thought no more about it at the time.

It is incompatible with every other shell and with the POSIX spec.
Parameter expansion is only supposed to be done if braces are present. See:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

- M.


* Re: Arith parsing bug with minus after $#
  2015-05-29 19:33     ` Martijn Dekker
@ 2015-05-29 20:48       ` ZyX
  2015-05-29 21:24         ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: ZyX @ 2015-05-29 20:48 UTC (permalink / raw)
  To: Martijn Dekker, zsh-workers



29.05.2015, 22:34, "Martijn Dekker" <martijn@inlv.org>:
> Bart Schaefer wrote on 29-05-15 at 20:09:
>>  On May 29, 2015 8:02 AM, "Peter Stephenson" <p.stephenson@samsung.com>
>>  wrote:
>>>  The problem is the overloading of "#" --- the test to establish what to
>>>  do with it is trying too hard to resolve to ${#-}, which is a valid
>>>  substitution, because it hasn't taken into account that there are no
>>>  braces.  So what you're seeing is ${#-}1.
>>  I think this actually was discussed on austin-group a few months back.  My
>>  recollection is that zsh's behavior was deemed permissible and I therefore
>>  thought no more about it at the time.
>
> It is incompatible with every other shell and with the POSIX spec.
> Parameter expansion is only supposed to be done if braces are present. See:
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

It says that they are optional:

> The parameter name or symbol can be enclosed in braces, which are optional except for positional parameters with more than one digit or when parameter is a name and is followed by a character that could be interpreted as part of the name. The matching closing brace shall be determined by counting brace levels, skipping over enclosed quoted strings, and command substitutions.

This is the fourth paragraph.

But later it explicitly says that what is not enclosed in braces may only be a name or a single-character parameter. I.e. $#- is ${#}-, $10 is ${1}0, …

Besides $#-, zsh has things like $array[index], which need not be enclosed in braces (this depends on an option: under `emulate sh` or `emulate ksh` it is parsed as ${array}[index], the same as in ksh), or $file:h. I guess there are more that I do not know about.
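
The POSIX reading of $10 can be checked in a shell such as bash or dash. A small sketch follows; the function name show_tenth is invented here, and zsh outside sh emulation may treat $10 differently.

```shell
# show_tenth: contrast the braceless and braced forms with ten parameters set.
show_tenth() {
    set -- a b c d e f g h i j
    echo "$10"      # POSIX reading: ${1} followed by a literal 0
    echo "${10}"    # braces are needed to reach the tenth parameter
}

show_tenth
```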

>
> - M.


* Re: Arith parsing bug with minus after $#
  2015-05-29 20:48       ` ZyX
@ 2015-05-29 21:24         ` Bart Schaefer
  2015-05-30 19:24           ` Peter Stephenson
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2015-05-29 21:24 UTC (permalink / raw)
  To: ZyX; +Cc: Zsh hackers list, Martijn Dekker

On May 29, 2015 1:54 PM, "ZyX" <kp-pav@yandex.ru> wrote:
>
> But later it explicitly says that what is not enclosed in braces may only
> be a name or a single-character parameter. I.e. $#- is ${#}-, $10 is ${1}0, …

Yes, but $#name to return the length of the value of $name is already a zsh
extension, so unless we're in some emulation mode, treating $#- as the
length of $- is perfectly reasonable.

I vaguely recall having a similar discussion about the meaning of $## ...
Related, is ${##foo} parsed like ${#name#foo} with an empty name, or ...?

* Re: Arith parsing bug with minus after $#
  2015-05-29 21:24         ` Bart Schaefer
@ 2015-05-30 19:24           ` Peter Stephenson
  2015-05-30 19:40             ` Peter Stephenson
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Stephenson @ 2015-05-30 19:24 UTC (permalink / raw)
  To: Bart Schaefer, Zsh hackers list

On Fri, 29 May 2015 14:24:45 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On May 29, 2015 1:54 PM, "ZyX" <kp-pav@yandex.ru> wrote:
> >
> > But later it explicitly says that what is not enclosed in braces may only
> > be a name or a single-character parameter. I.e. $#- is ${#}-, $10 is ${1}0, …
> 
> Yes, but $#name to return the length of the value of $name is already a zsh
> extension, so unless we're in some emulation mode, treating $#- as the
> length of $- is perfectly reasonable.

Perhaps more interesting is $#*, since "*" is a much more common
special parameter that's also an operator, and isn't quite so horrifically
overloaded in parameter substitution.

% emulate sh -c 'fn() { echo $(( $#*3 )); }'
% fn one
13

That looks like it needs some emulation, though it can wait since nobody's
tripped over it.
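
For comparison, a sketch of the unambiguous spelling, assuming a POSIX-style shell (triple_count is a name made up here). The 13 above comes from $#* being taken as the number of elements in $* (1) followed by the literal 3; with braces and spaces the "*" can only be multiplication.

```shell
# triple_count [ARGS...]: print three times the number of ARGS.
triple_count() {
    echo $(( ${#} * 3 ))
}

triple_count one    # one argument: prints 3
```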

I'm not agonizing much over the extremely rare and confusing $#-, to
be honest.

pws


* Re: Arith parsing bug with minus after $#
  2015-05-30 19:24           ` Peter Stephenson
@ 2015-05-30 19:40             ` Peter Stephenson
  2015-05-30 22:28               ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Stephenson @ 2015-05-30 19:40 UTC (permalink / raw)
  To: Zsh hackers list

On Sat, 30 May 2015 20:24:12 +0100
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On Fri, 29 May 2015 14:24:45 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > On May 29, 2015 1:54 PM, "ZyX" <kp-pav@yandex.ru> wrote:
> > >
> > > But later it explicitly says that what is not enclosed in braces may only
> > > be a name or a single-character parameter. I.e. $#- is ${#}-, $10 is ${1}0, …
> > 
> > Yes, but $#name to return the length of the value of $name is already a zsh
> > extension, so unless we're in some emulation mode, treating $#- as the
> > length of $- is perfectly reasonable.
> 
> Perhaps more interesting is $#*, since "*" is a much more common
> special parameter that's also an operator, and isn't quite so horrifically
> overloaded in parameter substitution.
> 
> % emulate sh -c 'fn() { echo $(( $#*3 )); }'
> % fn one
> 13
> 
> That looks like it needs some emulation, though it can wait since nobody's
> tripped over it.

It's easy, though, so no time like the present...  and this ought to be
consistent across all parameters, so make "-" behave the same.

What I'm not sure about is how to decide.  SH_WORD_SPLIT isn't the
same thing, though there's an obvious mnemonic for why it might have
this effect. We have the option of basing it on emulation alone, but
that always strikes me as something of a counsel of despair.  I'll add
documentation when this gets decided.

diff --git a/Src/subst.c b/Src/subst.c
index 168f7f1..67bd088 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -2156,6 +2156,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags)
 		    nojoin = !(ifs && *ifs);
 	    }
 	} else if ((c == '#' || c == Pound) &&
+		   (inbrace || !isset(SHWORDSPLIT)) &&
 		   (itype_end(s+1, IIDENT, 0) != s + 1
 		    || (cc = s[1]) == '*' || cc == Star || cc == '@'
 		    || cc == '?' || cc == Quest
@@ -2170,7 +2171,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags)
 		     */
 		    || ((cc == '#' || cc == Pound) &&
 			s[2] == Outbrace)
-		    || (inbrace && (cc == '-' || (cc == ':' && s[2] == '-')))
+		    || cc == '-' || (cc == ':' && s[2] == '-')
 		    || (isstring(cc) && (s[2] == Inbrace || s[2] == Inpar)))) {
 	    getlen = 1 + whichlen, s++;
 	    /*
diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index c41e05e..d06a73a 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -1704,7 +1704,10 @@
   [[ $funnychars = ${~${(b)funnychars}} ]]
 0:${(b)...} quoting protects from GLOB_SUBST
 
-  set --
-  print $#-1
-0:Avoid confusion after overloaded characters in braceless substitution
+  set -- foo
+  echo $(( $#*3 ))
+  emulate sh -c 'nolenwithoutbrace() { echo $#-1; }'
+  nolenwithoutbrace
+0:Avoid confusion after overloaded characters in braceless substitution in sh
+>13
 >0-1


* Re: Arith parsing bug with minus after $#
  2015-05-30 19:40             ` Peter Stephenson
@ 2015-05-30 22:28               ` Bart Schaefer
  2015-05-30 23:30                 ` Peter Stephenson
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2015-05-30 22:28 UTC (permalink / raw)
  To: Zsh hackers list

On May 30,  8:40pm, Peter Stephenson wrote:
}
} What I'm not sure about is how to decide.  SH_WORD_SPLIT isn't the
} same thing, though there's an obvious mnemonic for why it might have
} this effect. We have the option of basing it on emulation alone, but
} that always strikes me as something of a counsel of despair.

POSIX_IDENTIFIERS, perhaps?


* Re: Arith parsing bug with minus after $#
  2015-05-30 22:28               ` Bart Schaefer
@ 2015-05-30 23:30                 ` Peter Stephenson
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Stephenson @ 2015-05-30 23:30 UTC (permalink / raw)
  To: Zsh hackers list

On Sat, 30 May 2015 15:28:31 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On May 30,  8:40pm, Peter Stephenson wrote:
> }
> } What I'm not sure about is how to decide.  SH_WORD_SPLIT isn't the
> } same thing, though there's an obvious mnemonic for why it might have
> } this effect. We have the option of basing it on emulation alone, but
> } that always strikes me as something of a counsel of despair.
> 
> POSIX_IDENTIFIERS, perhaps?

That'll probably do.

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index afd6b1f..5a4be6b 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -777,6 +777,13 @@ This has the side-effect that joining is skipped even in quoted
 forms, which may affect other sub-expressions in var(spec).
 Note that `tt(^)', `tt(=)', and `tt(~)', below, must appear
 to the left of `tt(#)' when these forms are combined.
+
+If the option tt(POSIX_IDENTIFIERS) is not set, and var(spec) is a
+simple name, then the braces are optional; this is true even
+for special parameters so e.g. tt($#-) and tt($#*) take the length
+of the string tt($-) and the array tt($*) respectively.  If
+tt(POSIX_IDENTIFIERS) is set, then braces are required for
+the tt(#) to be treated in this fashion.
 )
 item(tt(${^)var(spec)tt(}))(
 pindex(RC_EXPAND_PARAM, toggle)
diff --git a/Doc/Zsh/options.yo b/Doc/Zsh/options.yo
index 4c0ae12..4dd68c9 100644
--- a/Doc/Zsh/options.yo
+++ b/Doc/Zsh/options.yo
@@ -2054,6 +2054,13 @@ When this option is set, only the ASCII characters tt(a) to tt(z), tt(A) to
 tt(Z), tt(0) to tt(9) and tt(_) may be used in identifiers (names
 of shell parameters and modules).
 
+In addition, setting this option limits the effect of parameter
+substitution with no braces, so that the expression tt($#) is treated as
+the parameter tt($#) even if followed by a valid parameter name.
+When it is unset, zsh allows expressions of the form tt($#)var(name)
+to refer to the length of tt($)var(name), even for special variables,
+for example in expressions such as tt($#-) and tt($#*).
+
 When the option is unset and multibyte character support is enabled (i.e. it
 is compiled in and the option tt(MULTIBYTE) is set), then additionally any
 alphanumeric characters in the local character set may be used in
diff --git a/Src/subst.c b/Src/subst.c
index 168f7f1..81d34d2 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -2156,6 +2156,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags)
 		    nojoin = !(ifs && *ifs);
 	    }
 	} else if ((c == '#' || c == Pound) &&
+		   (inbrace || !isset(POSIXIDENTIFIERS)) &&
 		   (itype_end(s+1, IIDENT, 0) != s + 1
 		    || (cc = s[1]) == '*' || cc == Star || cc == '@'
 		    || cc == '?' || cc == Quest
@@ -2170,7 +2171,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags)
 		     */
 		    || ((cc == '#' || cc == Pound) &&
 			s[2] == Outbrace)
-		    || (inbrace && (cc == '-' || (cc == ':' && s[2] == '-')))
+		    || cc == '-' || (cc == ':' && s[2] == '-')
 		    || (isstring(cc) && (s[2] == Inbrace || s[2] == Inpar)))) {
 	    getlen = 1 + whichlen, s++;
 	    /*
diff --git a/Test/D04parameter.ztst b/Test/D04parameter.ztst
index c41e05e..d06a73a 100644
--- a/Test/D04parameter.ztst
+++ b/Test/D04parameter.ztst
@@ -1704,7 +1704,10 @@
   [[ $funnychars = ${~${(b)funnychars}} ]]
 0:${(b)...} quoting protects from GLOB_SUBST
 
-  set --
-  print $#-1
-0:Avoid confusion after overloaded characters in braceless substitution
+  set -- foo
+  echo $(( $#*3 ))
+  emulate sh -c 'nolenwithoutbrace() { echo $#-1; }'
+  nolenwithoutbrace
+0:Avoid confusion after overloaded characters in braceless substitution in sh
+>13
 >0-1
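
A hedged usage sketch for scripts that have to run under several shells, assuming a zsh build that includes the patch above. The helper names are hypothetical; the guard only touches the option when running under zsh, since per the original report other shells already read $#-1 as ${#}-1.

```shell
# use_posix_count_parsing: under zsh, request the POSIX reading of "$#"
# when no braces are used; a no-op in other shells.
use_posix_count_parsing() {
    if [ -n "${ZSH_VERSION-}" ]; then
        setopt POSIX_IDENTIFIERS
    fi
}

# count_minus_one [ARGS...]: print the number of ARGS minus one,
# deliberately written without a space or braces.
count_minus_one() {
    echo $(($#-1))
}

use_posix_count_parsing
count_minus_one        # no arguments: expected to print -1
```

Since setopt inside a zsh function stays in effect after the function returns unless LOCAL_OPTIONS is set, the option remains set for the rest of the script.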


* Re: Arith parsing bug with minus after $#
  2015-05-29 15:02 ` Peter Stephenson
  2015-05-29 15:43   ` Martijn Dekker
  2015-05-29 18:09   ` Bart Schaefer
@ 2015-06-07  1:08   ` Bart Schaefer
  2 siblings, 0 replies; 12+ messages in thread
From: Bart Schaefer @ 2015-06-07  1:08 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, May 29, 2015 at 8:02 AM, Peter Stephenson
<p.stephenson@samsung.com> wrote:
> On Thu, 28 May 2015 21:42:41 +0200
> Martijn Dekker <martijn@inlv.org> wrote:
>> % set --
>> % echo $#
>> 0
>> % echo $(($#-1))
>> 41
>
> That's not a compatibility issue, that's just plain weird.  I don't know
> the POSIX terminology.
>
> The problem is the overloading of "#" --- the test to establish what to
> do with it is trying too hard to resolve to ${#-}, which is a valid
> substitution, because it hasn't taken into account that there are no
> braces.  So what you're seeing is ${#-}1.

Incidentally, here is a few-years-old austin-group thread that is related:

http://thread.gmane.org/gmane.comp.standards.posix.austin.general/3784

