zsh-workers
* 'case' pattern matching bug with bracket expressions
@ 2015-05-14 13:14 Martijn Dekker
  2015-05-14 13:50 ` [BUG] " Martijn Dekker
  2015-05-14 14:42 ` Peter Stephenson
  0 siblings, 2 replies; 18+ messages in thread
From: Martijn Dekker @ 2015-05-14 13:14 UTC (permalink / raw)
  To: zsh-workers

While writing a cross-platform shell library I've come across a bug in
the way zsh (in POSIX mode) matches patterns in 'case' statements, which
is at variance with other POSIX shells.

Normally, zsh considers an empty bracket expression [] a bad pattern
while other shells ([d]ash, bash, ksh) consider it a negative:

case abc in ( [] ) echo yes ;; ( * ) echo no ;; esac

Expected output: no
Got output: zsh: bad pattern: []

This is inconvenient if you want to pass such a bracket expression in a
parameter or variable, e.g. ["$param"]. If the parameter or variable is
empty, a 'bad pattern: []' error is produced.

I'm not sure whether the above is a bug or a variance in behaviour
permitted by POSIX, though of course as a writer of cross-platform shell
programs I'd prefer it if zsh acted like the majority.

However, I'm quite sure the following is a serious bug.

The same thing does NOT produce an error, but a false positive (!), if
an extra non-matching pattern with | is added:

case abc in ( [] | *[!a-z]*) echo yes ;; ( * ) echo no ;; esac

Expected output: no
Got output: yes

The above needs to be tested in a non-interactive shell (i.e. a script)
due to the "!". Other shells I've tested (bash, dash, ksh, pdksh, mksh)
behave as expected.

I confirmed the bug in zsh 4.3.11, zsh 5.0.2 and zsh 5.0.7-dev-2.

Thanks,

- Martijn



* Re: [BUG] 'case' pattern matching bug with bracket expressions
  2015-05-14 13:14 'case' pattern matching bug with bracket expressions Martijn Dekker
@ 2015-05-14 13:50 ` Martijn Dekker
  2015-05-14 14:42 ` Peter Stephenson
  1 sibling, 0 replies; 18+ messages in thread
From: Martijn Dekker @ 2015-05-14 13:50 UTC (permalink / raw)
  To: zsh-workers

Martijn Dekker wrote on 14-05-15 at 14:14:
> The same thing does NOT produce an error, but a false positive (!), if
> an extra non-matching pattern with | is added:
> 
> case abc in ( [] | *[!a-z]*) echo yes ;; ( * ) echo no ;; esac
> 
> Expected output: no
> Got output: yes

I tested some more and found the bug is very specific, occurring only if
the second bracket expression (after the |) both starts with the "!"
negator *and* is followed (not necessarily preceded) by a '*' wildcard.

Test cases:

case abc in ( [] | nonmatching ) echo yes ;; ( * ) echo no ;; esac
Expected output: no
Got output: no

case abc in ( [] | *[A-Z]* ) echo yes ;; ( * ) echo no ;; esac
Expected output: no
Got output: no

case abc in ( [] | *[!a-z] ) echo yes ;; ( * ) echo no ;; esac
Expected output: no
Got output: no

case abc in ( [] | [!a-z]* ) echo yes ;; ( * ) echo no ;; esac
Expected output: no
Got output: yes    <-- !!!!

Thanks,

- Martijn



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 13:14 'case' pattern matching bug with bracket expressions Martijn Dekker
  2015-05-14 13:50 ` [BUG] " Martijn Dekker
@ 2015-05-14 14:42 ` Peter Stephenson
  2015-05-14 15:47   ` Martijn Dekker
                     ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Peter Stephenson @ 2015-05-14 14:42 UTC (permalink / raw)
  To: Martijn Dekker, zsh-workers

On Thu, 14 May 2015 14:14:26 +0100
Martijn Dekker <martijn@inlv.org> wrote:
> While writing a cross-platform shell library I've come across a bug in
> the way zsh (in POSIX mode) matches patterns in 'case' statements, which
> is at variance with other POSIX shells.
> 
> Normally, zsh considers an empty bracket expression [] a bad pattern
> while other shells ([d]ash, bash, ksh) consider it a negative:
> 
> case abc in ( [] ) echo yes ;; ( * ) echo no ;; esac
> 
> Expected output: no
> Got output: zsh: bad pattern: []

This is the shell language being typically duplicitous and unhelpful.
"]" after a "[" indicates that the "]" is part of the set.  This is
normal; in bash as well as zsh:

  [[ ']' = []] ]] && echo yes

outputs 'yes'.
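
The same rule can be checked with a plain 'case' statement, which
(unlike '[[') is in the standard.  A minimal sketch; the function name
is_in_set is invented for illustration:

```shell
# is_in_set CHAR: succeed if CHAR is "]" or "a".
# The "]" directly after "[" is a member of the set, not the terminator.
is_in_set() {
    case $1 in
    ( []a] ) return 0 ;;
    ( * )    return 1 ;;
    esac
}

is_in_set ']' && echo yes    # "]" is in the set
is_in_set b   || echo no     # "b" is not
```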

However, as you've found out, other shells handle the case where there
isn't another ']' later.  Generally there's no harm in this, and in most
cases we could do this (the case below is harder).

Nonetheless, there's a real ambiguity here, so given this and the
following I'd definitely suggest not relying on it if you can avoid
doing so --- use something else to signify an empty string.
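
One way to follow that advice is to test for the empty string before
the value ever reaches a bracket expression.  A rough sketch, assuming
a helper named contains_any (hypothetical, not taken from any library):

```shell
# contains_any STRING SET: succeed if STRING contains at least one
# character from SET.  An empty SET never matches, and is handled
# before a pattern like ["$2"] could turn into [].
contains_any() {
    case $2 in
    ( '' ) return 1 ;;
    esac
    case $1 in
    ( *["$2"]* ) return 0 ;;
    ( * )        return 1 ;;
    esac
}

contains_any abc 'xyc' && echo yes
contains_any abc ''    || echo no
```

The early return keeps the empty case out of the pattern entirely, so
the result does not depend on how a given shell reads [].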

> The same thing does NOT produce an error, but a false positive (!), if
> an extra non-matching pattern with | is added:
> 
> case abc in ( [] | *[!a-z]*) echo yes ;; ( * ) echo no ;; esac

This is the pattern:
 '['                   introducing bracketed expression
   '] | *[!a-z'        characters inside
 ']'                   end of bracketed expression
 '*'                   wildcard.

so it's a set including the character a followed by anything, and hence
matches.

I'm not really sure we *can* resolve this unambiguously the way you
want.  Is there something that forbids us from interpreting the pattern
that way?  The handling of ']' at the start is mandated, if I've
followed all the logic correctly --- POSIX 2007 Shell and Utilities
2.13.1 says:

[
    If an open bracket introduces a bracket expression as in XBD RE
    Bracket Expression, except that the <exclamation-mark> character (
    '!' ) shall replace the <circumflex> character ( '^' ) in its role
    in a non-matching list in the regular expression notation, it shall
    introduce a pattern bracket expression. A bracket expression
    starting with an unquoted <circumflex> character produces
    unspecified results. Otherwise, '[' shall match the character
    itself.

The language is a little turgid, but I think it's saying "unless
you have ^ or [ just go with the RE rules in [section 9.3.5]".

9.3.5 (in regular expressions) says, amongst a lot of other things:

   The <right-square-bracket> ( ']' ) shall lose its special meaning and
   represent itself in a bracket expression if it occurs first in the
   list (after an initial <circumflex> ( '^' ), if any)

That's a "shall".

I haven't read through the "case" doc so there may be some killer reason
why that " | " has to be a case separator and not part of a
square-bracketed expression.  But that would seem to imply some form of
hierarchical parsing in which those characters couldn't occur within a
pattern.

By the way, we don't handle all forms in 9.3.5, e.g. equivalence sets,
so saying "it works like REs" isn't a perfect answer for zsh, either.

pws



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 14:42 ` Peter Stephenson
@ 2015-05-14 15:47   ` Martijn Dekker
  2015-05-14 15:55     ` Peter Stephenson
  2015-05-14 16:17     ` Peter Stephenson
  2015-05-14 21:23   ` Chet Ramey
  2015-05-14 23:45   ` Chet Ramey
  2 siblings, 2 replies; 18+ messages in thread
From: Martijn Dekker @ 2015-05-14 15:47 UTC (permalink / raw)
  To: zsh-workers

Peter Stephenson wrote on 14-05-15 at 15:42:
> This is the pattern:
>  '['                   introducing bracketed expression
>    '] | *[!a-z'        characters inside
>  ']'                   end of bracketed expression
>  '*'                   wildcard.

Ah ha, that makes sense of the whole thing. Thank you.

> I'm not really sure we *can* resolve this unambiguously the way you
> want.  Is there something that forbids us from interpreting the pattern
> that way?

Apparently not, as far as I can tell.

What confounded me is that this applies even if the pattern is [$var] or
even ["$var"], where $var is empty. This causes the bracket pattern to
unexpectedly swallow anything that follows it if $var is empty. That
seems very unexpected and undesirable to me, particularly since, as far
as I know, zsh appears to be the only shell that chooses to handle it
this way.

An unfortunate ambiguity does appear to exist in the standard that
allows this to happen. But perhaps it would be advisable for zsh to
follow the practice most (all?) other shells have chosen to take, where
[] unambiguously is a simple non-match. Alternatively, unambiguously
consider [] an error. This would at least block a ["$var"] where $var is
empty from unexpectedly swallowing arbitrary grammar that follows it.

Meanwhile, I've just become aware of another cross-platform shell
programming snag to add to my list. Thanks for the reply.

- Martijn



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 15:47   ` Martijn Dekker
@ 2015-05-14 15:55     ` Peter Stephenson
  2015-05-14 17:30       ` Bart Schaefer
                         ` (2 more replies)
  2015-05-14 16:17     ` Peter Stephenson
  1 sibling, 3 replies; 18+ messages in thread
From: Peter Stephenson @ 2015-05-14 15:55 UTC (permalink / raw)
  To: Martijn Dekker, zsh-workers

On Thu, 14 May 2015 16:47:49 +0100
Martijn Dekker <martijn@inlv.org> wrote:
> Peter Stephenson wrote on 14-05-15 at 15:42:
> > This is the pattern:
> >  '['                   introducing bracketed expression
> >    '] | *[!a-z'        characters inside
> >  ']'                   end of bracketed expression
> >  '*'                   wildcard.
> 
> Ah ha, that makes sense of the whole thing. Thank you.
> 
> > I'm not really sure we *can* resolve this unambiguously the way you
> > want.  Is there something that forbids us from interpreting the pattern
> > that way?
> 
> Apparently not, as far as I can tell.

It occurs to me that other shells will treat whitespace as ending a
pattern for syntactic reasons, even if logically it can't:

[[ ' ' = [ ] ]]

works in zsh, but is a parse error in bash.

Usually zsh's behaviour is the more useful, because if the expression is
meaningful at all there has to be a closing bracket(*).  But it's not
impossible it does break some rule about whitespace as separators ---
and you've come up with the first example I've ever seen of where it's
significant even if the alternative isn't an error.  So it might
be more useful to change this in POSIX compatibility mode, in which what
zsh users expect has never been a significant factor.

pws

(*) Relegated to a footnote because this is more of a rant than
a useful comment:  furthermore, the whole []] thing is designed
so that you *don't* need to quote characters in the [...].  So
do you, or don't you?  Bleugh.



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 15:47   ` Martijn Dekker
  2015-05-14 15:55     ` Peter Stephenson
@ 2015-05-14 16:17     ` Peter Stephenson
  2015-05-14 17:07       ` Martijn Dekker
  2015-05-14 23:59       ` Chet Ramey
  1 sibling, 2 replies; 18+ messages in thread
From: Peter Stephenson @ 2015-05-14 16:17 UTC (permalink / raw)
  To: Martijn Dekker, zsh-workers

On Thu, 14 May 2015 16:47:49 +0100
Martijn Dekker <martijn@inlv.org> wrote:
> What confounded me is that this applies even if the pattern is [$var] or
> even ["$var"], where $var is empty. This causes the bracket pattern to
> unexpectedly swallow anything that follows it if $var is empty. That
> seems very unexpected and undesirable to me, particularly since, as far
> as I know, zsh appears to be the only shell that chooses to handle it
> this way.

Sorry about the multiple emails...

I don't *think* the following patch makes anything worse, but notice
you're in any case on fairly soggy ground here, even in bash:

$ var=
$ [[ ']' = [$var]*[$var] ]] && echo matches
matches

and presumably Chet would agree with me that's required by the
standard.

pws

diff --git a/Src/pattern.c b/Src/pattern.c
index 05dcb29..4e5e8a1 100644
--- a/Src/pattern.c
+++ b/Src/pattern.c
@@ -1405,7 +1405,16 @@ patcomppiece(int *flagp, int paren)
 		starter = patnode(P_ANYBUT);
 	    } else
 		starter = patnode(P_ANYOF);
-	    if (*patparse == Outbrack) {
+	    /*
+	     * []...] means match a "]" or other included characters.
+	     * However, to be a bit helpful and for compatibility
+	     * with other shells, don't take it in that sense if
+	     * there's no further active "]".  That's still imperfect,
+	     * but it's all we can do --- we're required to
+	     * treat [$var]*[$var] with empty var as [ ... ]
+	     * containing "]*[".
+	     */
+	    if (*patparse == Outbrack && strchr(patparse+1, Outbrack)) {
 		patparse++;
 		patadd(NULL, ']', 1, PA_NOALIGN);
 	    }



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 16:17     ` Peter Stephenson
@ 2015-05-14 17:07       ` Martijn Dekker
  2015-05-14 17:43         ` Peter Stephenson
  2015-05-14 23:59       ` Chet Ramey
  1 sibling, 1 reply; 18+ messages in thread
From: Martijn Dekker @ 2015-05-14 17:07 UTC (permalink / raw)
  To: Peter Stephenson, zsh-workers

Peter Stephenson wrote on 14-05-15 at 17:17:
> I don't *think* the following patch makes anything worse,

Thanks for the patch. I just tested it against zsh 5.0.7-dev-2.

It does solve the simple cases of:

case abc in ( [] ) echo yes ;; ( * ) echo no ;; esac
empty=''
case abc in ( ["$empty"] ) echo yes ;; ( * ) echo no ;; esac

which are now a non-match instead of an error, as I would expect.

However, the more insidious case that bit me:

case abc in ( [] | [!a-z]* ) echo yes ;; ( * ) echo no ;; esac
empty=''
case abc in ( ["$empty"] | [!a-z]* ) echo yes ;; ( * ) echo no ;; esac

still produces a false positive even with the patch.

- Martijn



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 15:55     ` Peter Stephenson
@ 2015-05-14 17:30       ` Bart Schaefer
  2015-05-15  0:05         ` Chet Ramey
  2015-05-14 23:51       ` Chet Ramey
  2015-05-15  8:38       ` Peter Stephenson
  2 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2015-05-14 17:30 UTC (permalink / raw)
  To: zsh-workers

On May 14,  4:55pm, Peter Stephenson wrote:
}
} It occurs to me that other shells will treat whitespace as ending a
} pattern for syntactic reasons, even if logically it can't:
} 
} [[ ' ' = [ ] ]]
} 
} works in zsh, but is a parse error in bash.

Right, and your patch in 35131 does not change that.  Arguably (in POSIX
mode, at least) the space should need to be escaped?

Even in bash the space can be left unescaped in some contexts; e.g.

schaefer@burner:~$ var=' '
schaefer@burner:~$ echo ${var//[ ]/foo}
foo

I guess it's a quoting thing:

schaefer@burner:~$ case " " in ( [" "] ) echo OK;; esac
OK

In ${...} the space is already implicitly quoted, but quoting it again
doesn't change anything:

schaefer@burner:~$ echo ${var//[" "]/foo}
foo
schaefer@burner:~$ var='"'
schaefer@burner:~$ echo ${var//[" "]/foo}
"

Anyway, this is one of the rare cases where I don't think it would be
terrible if this changed in native zsh mode too, as long as the quoted
examples, like the above, don't break.

-- 
Barton E. Schaefer



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 17:07       ` Martijn Dekker
@ 2015-05-14 17:43         ` Peter Stephenson
  2015-05-15  0:09           ` Chet Ramey
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Stephenson @ 2015-05-14 17:43 UTC (permalink / raw)
  To: Martijn Dekker, zsh-workers

On Thu, 14 May 2015 18:07:39 +0100
Martijn Dekker <martijn@inlv.org> wrote:
> However, the more insidious case that bit me:
> 
> case abc in ( [] | [!a-z]* ) echo yes ;; ( * ) echo no ;; esac
> empty=''
> case abc in ( ["$empty"] | [!a-z]* ) echo yes ;; ( * ) echo no ;; esac
> 
> still produces a false positive even with the patch.

This is hairy because you're relying on whitespace performing word
splitting in a case where you don't need whitespace according to the
grammar anyway, i.e. if you have the valid expression

case abc in (["$empty"]|[!a-z]*) echo yes ;; (*) echo no ;; esac

it can only be parsed the way zsh parses it --- you'll find bash does
the same.

I'll have a look at changing the way we handle whitespace in
bracketed expressions in POSIX mode (POSIX_STRINGS option?  That's
about the closest I can see and underworked at the moment).

The above still needs some work, though, since we don't currently word
split case statements.  I think that probably is required by the
standard since it specifies word (though not word list) handling,
i.e. implying the same as command line words but with an additional "|"
token between words.  That's probably going to have to wait till after
5.0.8.
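
To make the interaction with the earlier patch concrete: the patched
test only asks whether another "]" appears later in the pattern text,
and while the case pattern is not split into words, the "]" that
closes [!a-z] satisfies it.  The check can be mimicked in shell as a
rough sketch (the function name is hypothetical, and this models only
the strchr() condition on plain text, not the tokenized pattern the C
code sees):

```shell
# leading_bracket_is_literal REST: REST is the pattern text that follows
# an opening "[".  Succeed if REST starts with "]" and contains a further
# "]", the condition under which the first "]" is taken as a set member.
leading_bracket_is_literal() {
    case $1 in
    ( ']'*']'* ) return 0 ;;
    ( * )        return 1 ;;
    esac
}

leading_bracket_is_literal ']'            || echo "[] alone: not a set member"
leading_bracket_is_literal '] | [!a-z]* ' && echo "[] | [!a-z]*: set member"
```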

pws



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 14:42 ` Peter Stephenson
  2015-05-14 15:47   ` Martijn Dekker
@ 2015-05-14 21:23   ` Chet Ramey
  2015-05-14 23:45   ` Chet Ramey
  2 siblings, 0 replies; 18+ messages in thread
From: Chet Ramey @ 2015-05-14 21:23 UTC (permalink / raw)
  To: Peter Stephenson, Martijn Dekker, zsh-workers; +Cc: chet.ramey

On 5/14/15 10:42 AM, Peter Stephenson wrote:

> This is the pattern:
>  '['                   introducing bracketed expression
>    '] | *[!a-z'        characters inside
>  ']'                   end of bracketed expression
>  '*'                   wildcard.
> 
> so it's a set including the character a followed by anything, and hence
> matches.
> 
> I'm not really sure we *can* resolve this unambiguously the way you
> want.  Is there something that forbids us from interpreting the pattern
> that way?  

Look at the Posix grammar.  A case pattern list is a sequence of one
or more WORDs separated by (zero or more) `|'s.  WORDs are delimited by
metacharacters and cannot contain unquoted blanks.  The `[]' is a
separate WORD delimited by a space.

The rules governing how the pattern is treated only come into play
after the command is parsed.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 14:42 ` Peter Stephenson
  2015-05-14 15:47   ` Martijn Dekker
  2015-05-14 21:23   ` Chet Ramey
@ 2015-05-14 23:45   ` Chet Ramey
  2 siblings, 0 replies; 18+ messages in thread
From: Chet Ramey @ 2015-05-14 23:45 UTC (permalink / raw)
  To: Peter Stephenson, Martijn Dekker, zsh-workers; +Cc: chet.ramey

On 5/14/15 10:42 AM, Peter Stephenson wrote:

> The handling of ']' at the start is mandated, if I've
> followed all the logic correctly --- POSIX 2007 Shell and Utilities
> 2.13.1 says:
> 
> [
>     If an open bracket introduces a bracket expression as in XBD RE
>     Bracket Expression, except that the <exclamation-mark> character (
>     '!' ) shall replace the <circumflex> character ( '^' ) in its role
>     in a non-matching list in the regular expression notation, it shall
>     introduce a pattern bracket expression. A bracket expression
>     starting with an unquoted <circumflex> character produces
>     unspecified results. Otherwise, '[' shall match the character
>     itself.
> 
> The language is a little turgid, but I think it's saying "unless
> you have ^ or [ just go with the RE rules in [section 9.3.5]".

I think it means that improperly-formed pattern bracket expressions have
to be matched by a literal `[' followed by whatever the following
characters mean.

> I haven't read through the "case" doc so there may be some killer reason
> why that " | " has to be a case separator and not part of a
> square-bracketed expression.  But that would seem to imply some form of
> hierarchical parsing in which those characters couldn't occur within a
> pattern.

It's the grammar.  If you want `|' to be in a pattern you have to quote it.
Otherwise it's a metacharacter and a token delimiter (section 2.2).

The basic idea is that you tokenize case patterns as words and analyze them
as patterns after doing so.
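
A small sketch of what that means in practice.  The function name
classify and the sample sets are made up for illustration; the point
is only that an unquoted `|' separates patterns, while a quoted one
(or a quoted blank) stays inside a single WORD:

```shell
# classify STRING: report which alternative matched.
classify() {
    case $1 in
    ( [abc] | [xyz] ) echo "letter" ;;    # two WORDs, hence two patterns
    ( ['|'] )         echo "bar" ;;       # quoted "|" is a set member
    ( 'a | b' )       echo "literal" ;;   # quoted blanks and "|" form one WORD
    ( * )             echo "other" ;;
    esac
}

classify y
classify '|'
classify 'a | b'
```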

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 15:55     ` Peter Stephenson
  2015-05-14 17:30       ` Bart Schaefer
@ 2015-05-14 23:51       ` Chet Ramey
  2015-05-15  8:38       ` Peter Stephenson
  2 siblings, 0 replies; 18+ messages in thread
From: Chet Ramey @ 2015-05-14 23:51 UTC (permalink / raw)
  To: Peter Stephenson, Martijn Dekker, zsh-workers; +Cc: chet.ramey

On 5/14/15 11:55 AM, Peter Stephenson wrote:

> It occurs to me that other shells will treat whitespace as ending a
> pattern for syntactic reasons, even if logically it can't:
> 
> [[ ' ' = [ ] ]]
> 
> works in zsh, but is a parse error in bash.

Because shell metacharacters delimit words.  The definition of a WORD
(and the use of WORDs as patterns) doesn't really change depending on
whether you're parsing a case command or a conditional command.


> (*) Relegated to a footnote because this is more of a rant than
> a useful comment:  furthermore, the whole []] thing is designed
> so that you *don't* need to quote characters in the [...].  So
> do you, or don't you?  Bleugh.

You do.  Conditional commands use the same definition of WORD tokens
as everything else.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 16:17     ` Peter Stephenson
  2015-05-14 17:07       ` Martijn Dekker
@ 2015-05-14 23:59       ` Chet Ramey
  1 sibling, 0 replies; 18+ messages in thread
From: Chet Ramey @ 2015-05-14 23:59 UTC (permalink / raw)
  To: Peter Stephenson, Martijn Dekker, zsh-workers; +Cc: chet.ramey

On 5/14/15 12:17 PM, Peter Stephenson wrote:

> I don't *think* the following patch makes anything worse, but notice
> you're in any case on fairly soggy ground here, even in bash:
> 
> $ var=
> $ [[ ']' = [$var]*[$var] ]] && echo matches
> matches
> 
> and presumably Chet would agree with me that's required by the
> standard.

Yes, everybody matches `]' with that pattern.  Posix requires that the
`case' equivalent match (`[[' is not in the standard), since it specifies
the expansions that take place on the pattern before you attempt
matching using the pattern matching rules.
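
For reference, the `case' equivalent might be written as follows.  This
is only a sketch; the wrapper function name is invented.  After
expansion the pattern is `[]*[]', a single bracket expression whose
members are `]', `*' and `[':

```shell
# Hypothetical wrapper around the 'case' form of the '[[' test above.
matches_with_empty_var() {
    var=
    case $1 in
    ( [$var]*[$var] ) echo matches ;;
    ( * )             echo "no match" ;;
    esac
}

matches_with_empty_var ']'
matches_with_empty_var 'a'
```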

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 17:30       ` Bart Schaefer
@ 2015-05-15  0:05         ` Chet Ramey
  0 siblings, 0 replies; 18+ messages in thread
From: Chet Ramey @ 2015-05-15  0:05 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers; +Cc: chet.ramey

On 5/14/15 1:30 PM, Bart Schaefer wrote:

> Even in bash the space can be left unescaped in some contexts; e.g.
> 
> schaefer@burner:~$ var=' '
> schaefer@burner:~$ echo ${var//[ ]/foo}
> foo

Yes.  Inside ${...}, the space doesn't delimit a token.  The Posix
tokenizing rules specify that.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 17:43         ` Peter Stephenson
@ 2015-05-15  0:09           ` Chet Ramey
  0 siblings, 0 replies; 18+ messages in thread
From: Chet Ramey @ 2015-05-15  0:09 UTC (permalink / raw)
  To: Peter Stephenson, Martijn Dekker, zsh-workers; +Cc: chet.ramey

On 5/14/15 1:43 PM, Peter Stephenson wrote:

> This is hairy because you're relying on whitespace performing word
> splitting in a case where you don't need whitespace according to the
> grammar anyway, i.e. if you have the valid expression
> 
> case abc in (["$empty"]|[!a-z]*) echo yes ;; (*) echo no ;; esac
> 
> it can only be parsed the way zsh parses it --- you'll find bash does
> the same.

The `|' delimits the token; `["$empty"]' is a single WORD.  Both shells
effectively parse this the same way.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-14 15:55     ` Peter Stephenson
  2015-05-14 17:30       ` Bart Schaefer
  2015-05-14 23:51       ` Chet Ramey
@ 2015-05-15  8:38       ` Peter Stephenson
  2015-05-15 20:48         ` Peter Stephenson
  2 siblings, 1 reply; 18+ messages in thread
From: Peter Stephenson @ 2015-05-15  8:38 UTC (permalink / raw)
  To: Peter Stephenson, zsh-workers

On Thu, 14 May 2015 16:55:57 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> It occurs to me that other shells will treat whitespace as ending a
> pattern for syntactic reasons, even if logically it can't:
> 
> [[ ' ' = [ ] ]]
> 
> works in zsh, but is a parse error in bash.

I'm talking nonsense here --- I'm thinking of parentheses (which have
always been subtle because of zsh's pattern syntax and work differently
in POSIX mode anyway).  A space after a square bracket *does* split the
word.

Otherwise we wouldn't know whether "[ " at the start of the line was a
test command or the start of a group.

I don't think changing the case parsing to do proper words is *that*
difficult --- it's always been a bit of a hack, so could do with
being fixed.

pws



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-15  8:38       ` Peter Stephenson
@ 2015-05-15 20:48         ` Peter Stephenson
  2015-05-15 21:48           ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Stephenson @ 2015-05-15 20:48 UTC (permalink / raw)
  To: zsh-workers

On Fri, 15 May 2015 09:38:51 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> I don't think changing the case parsing to do proper words is *that*
> difficult --- it's always been a bit of a hack, so could do with
> being fixed.

I'm on the case, but just to note this isn't completely benign.  There
are (few) places in functions where we rely on the fact that anything in
parentheses is parsed as a single quoted string, which won't happen any
more.  In particular, this one from _path_commands running in the
completion tests had me stumped for half an hour:

_call_whatis() { 
  case "$(whatis --version)" in
  (whatis from *)
    local -A args
    zparseopts -D -A args s: r:
    apropos "${args[-r]:-"$@"}" | fgrep "($args[-s]"
    ;;
  (*) whatis "$@";;
  esac
}

Quoting "whatis from " fixes it.
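
A minimal sketch of what the quoted form looks like, reduced to the
pattern alone.  The function name and its output strings are made up
for illustration and are not the actual _path_commands code:

```shell
# describe_whatis VERSION_TEXT: quoting the fixed part keeps
# "whatis from " a single word, while the unquoted "*" still acts
# as a wildcard.
describe_whatis() {
    case $1 in
    ( "whatis from "* ) echo "apropos-style" ;;
    ( * )               echo "plain" ;;
    esac
}

describe_whatis "whatis from somewhere"
describe_whatis "something else"
```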

No doubt we can track these down in functions in the distribution but it
could also crop up elsewhere.  I don't want to have to propagate the
existing hack any more, however, so this isn't really possible to work
around.

The better news is it gives a clear parse error (except in completion
tests where errors are obscurely hidden).  So perhaps it isn't so bad if
flagged up as a change.

pws



* Re: 'case' pattern matching bug with bracket expressions
  2015-05-15 20:48         ` Peter Stephenson
@ 2015-05-15 21:48           ` Bart Schaefer
  0 siblings, 0 replies; 18+ messages in thread
From: Bart Schaefer @ 2015-05-15 21:48 UTC (permalink / raw)
  To: zsh-workers

On May 15,  9:48pm, Peter Stephenson wrote:
} Subject: Re: 'case' pattern matching bug with bracket expressions
}
} On Fri, 15 May 2015 09:38:51 +0100
} Peter Stephenson <p.stephenson@samsung.com> wrote:
} > I don't think changing the case parsing to do proper words is *that*
} > difficult --- it's always been a bit of a hack, so could do with
} > being fixed.
} 
} I'm on the case, but just to note this isn't completely benign.  There
} are (few) places in functions where we rely on the fact that anything in
} parentheses is parsed as a single quoted string, which won't happen any
} more.

Hm.  It appears we always parsed this "correctly" when using the old style
case statement:

torch% case "foo bar baz" in
case> foo bar *) echo here;;
zsh: parse error near `bar'

So perhaps the most benign thing is to apply the word parsing inside the
matched parens only in some sort of POSIX mode?

} I don't want to have to propagate the
} existing hack any more, however, so this isn't really possible to work
} around.

Urm.  OK.

