zsh-workers
From: Peter Stephenson <p.stephenson@samsung.com>
To: Martijn Dekker <martijn@inlv.org>, zsh-workers@zsh.org
Subject: Re: 'case' pattern matching bug with bracket expressions
Date: Thu, 14 May 2015 15:42:38 +0100
Message-ID: <20150514154238.0e547ff0@pwslap01u.europe.root.pri>
In-Reply-To: <55549FB2.80705@inlv.org>

On Thu, 14 May 2015 14:14:26 +0100
Martijn Dekker <martijn@inlv.org> wrote:
> While writing a cross-platform shell library I've come across a bug in
> the way zsh (in POSIX mode) matches patterns in 'case' statements that
> are at variance with other POSIX shells.
> 
> Normally, zsh considers an empty bracket expression [] a bad pattern
> while other shells ([d]ash, bash, ksh) consider it a negative:
> 
> case abc in ( [] ) echo yes ;; ( * ) echo no ;; esac
> 
> Expected output: no
> Got output: zsh: bad pattern: []

This is the shell language being typically duplicitous and unhelpful.
"]" after a "[" indicates that the "]" is part of the set.  This is
normal; in bash as well as zsh:

  [[ ']' = []] ]] && echo yes

outputs 'yes'.
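
The same rule can be checked with a plain 'case' statement, which avoids
the non-POSIX [[ ... ]] syntax.  This is only a minimal sketch; the
function name is_close_bracket is made up for illustration:

```shell
# Hypothetical helper: prints "yes" if $1 is exactly the character ']'.
is_close_bracket() {
  case $1 in
    []]) echo yes ;;   # ']' directly after '[' is a set member, not the terminator
    *)   echo no ;;
  esac
}

is_close_bracket ']'
is_close_bracket 'a'
```

The first call should report a match and the second should not, since the
set contains only ']'.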

However, as you've found out, other shells handle the case where there
isn't another ']' later.  Generally there's no harm in this, and in most
cases we could do this (the case below is harder).

Nonetheless, there's a real ambiguity here, so given this and the
following I'd definitely suggest not relying on it if you can avoid
doing so --- use something else to signify an empty string.
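
One unambiguous way to match the empty string, for example, is a quoted
empty word as the pattern.  A sketch, with the function name classify
invented for illustration:

```shell
# Hypothetical helper: distinguishes an empty argument from a non-empty one
# without relying on how the shell treats an empty bracket expression.
classify() {
  case $1 in
    '') echo empty ;;      # quoted empty word matches only the empty string
    *)  echo nonempty ;;
  esac
}

classify ''
classify abc
```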

> The same thing does NOT produce an error, but a false positive (!), if
> an extra non-matching pattern with | is added:
> 
> case abc in ( [] | *[!a-z]*) echo yes ;; ( * ) echo no ;; esac

This is the pattern:
 '['                   introducing bracketed expression
   '] | *[!a-z'        characters inside
 ']'                   end of bracketed expression
 '*'                   wildcard.

so it's a set including the character a followed by anything, and hence
matches.
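
To see this reading in a shell that would otherwise split the text at the
'|', the whole string can be passed as a single pattern through a
variable.  This is a sketch assuming a POSIX-style shell, where an
unquoted expansion in a case pattern is treated as a pattern (native-mode
zsh would need $~pat instead); the names pat and leading_set_match are
made up:

```shell
# Hypothetical helper: match $1 against the pattern text taken as ONE word.
# Read that way, the set contains ']', ' ', '|', '*', '[', '!' and the
# range a-z, and is followed by a '*' wildcard.
leading_set_match() {
  pat='[] | *[!a-z]*'
  case $1 in
    $pat) echo yes ;;   # first character is in the set, rest is arbitrary
    *)    echo no ;;
  esac
}

leading_set_match abc
leading_set_match 123
```

Here abc matches because 'a' falls in the range a-z, while 123 does not
because '1' is not a member of the set.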

I'm not really sure we *can* resolve this unambiguously the way you
want.  Is there something that forbids us from interpreting the pattern
that way?  The handling of ']' at the start is mandated, if I've
followed all the logic correctly --- POSIX 2007 Shell and Utilities
2.13.1 says:

[
    If an open bracket introduces a bracket expression as in XBD RE
    Bracket Expression, except that the <exclamation-mark> character (
    '!' ) shall replace the <circumflex> character ( '^' ) in its role
    in a non-matching list in the regular expression notation, it shall
    introduce a pattern bracket expression. A bracket expression
    starting with an unquoted <circumflex> character produces
    unspecified results. Otherwise, '[' shall match the character
    itself.

The language is a little turgid, but I think it's saying "unless
you have ^ or [ just go with the RE rules in [section 9.3.5]".

9.3.5 (in regular expressions) says, amongst a lot of other things:

   The <right-square-bracket> ( ']' ) shall lose its special meaning and
   represent itself in a bracket expression if it occurs first in the
   list (after an initial <circumflex> ( '^' ), if any)

That's a "shall".
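
In shell patterns the '!' takes the place of the '^', so the same rule
should make a ']' right after "[!" a member of the non-matching list.  A
small sketch of that reading, with the function name not_close_bracket
invented for illustration:

```shell
# Hypothetical helper: prints "yes" if $1 is a single character other than ']'.
not_close_bracket() {
  case $1 in
    [!]]) echo yes ;;   # '!' negates; the ']' after it is a literal member
    *)    echo no ;;
  esac
}

not_close_bracket a
not_close_bracket ']'
```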

I haven't read through the "case" doc so there may be some killer reason
why that " | " has to be a case separator and not part of a
square-bracketed expression.  But that would seem to imply some form of
hierarchical parsing in which those characters couldn't occur within a
pattern.

By the way, we don't handle all forms in 9.3.5, e.g. equivalence sets,
so saying "it works like REs" isn't a perfect answer for zsh, either.

pws

