zsh-workers
* bracket expressions and POSIX
@ 2001-06-29 18:22 Clint Adams
  2001-06-30  2:14 ` Bart Schaefer
  2001-06-30 19:00 ` Peter Stephenson
  0 siblings, 2 replies; 12+ messages in thread
From: Clint Adams @ 2001-06-29 18:22 UTC
  To: zsh-workers

POSIX says that \ loses its special meaning within a bracket
expression for pattern matching and also that ! is the
^ character in that context.

So this strikes me as non-compliant:

% emulate sh
% touch \\test abc
% echo [!a]*
zsh: event not found: a]
% echo [\!a]*
\test
% echo [\]*
[]*
% echo [\\]*
\test

More specifically, the first three echoes appear to be
non-POSIX-compliant.
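
The negation half of this can be checked without involving history
expansion or the files in the current directory. A minimal sketch,
assuming a POSIX-style shell running a script (history expansion is an
interactive feature); the function name first_not_a is made up for
illustration:

```shell
# Hypothetical helper: report whether $1 matches the pattern [!a]*,
# i.e. whether it starts with some character other than "a".
first_not_a() {
    case $1 in
        [!a]*) echo yes ;;
        *)     echo no ;;
    esac
}

first_not_a abc        # prints "no": the first character is "a"
first_not_a '\test'    # prints "yes": the first character is a backslash
```

Using case rather than a glob keeps the result independent of whatever
files happen to exist.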



* Re: bracket expressions and POSIX
  2001-06-29 18:22 bracket expressions and POSIX Clint Adams
@ 2001-06-30  2:14 ` Bart Schaefer
  2001-06-30 19:00 ` Peter Stephenson
  1 sibling, 0 replies; 12+ messages in thread
From: Bart Schaefer @ 2001-06-30  2:14 UTC
  To: Clint Adams, zsh-workers

On Jun 29,  2:22pm, Clint Adams wrote:
} Subject: bracket expressions and POSIX
}
} POSIX says that \ loses its special meaning within a bracket
} expression for pattern matching and also that ! is the
} ^ character in that context.
} 
} So this strikes me as non-compliant:
} 
} % emulate sh
} % touch \\test abc
} % echo [!a]*
} zsh: event not found: a]

Try it with `emulate -R sh'.  Just `emulate sh' does not turn on all the
POSIX shell emulation options -- in particular `banghist' is still set,
and history references take precedence over glob patterns.

} % echo [\!a]*
} \test
} % echo [\]*
} []*
} % echo [\\]*
} \test

With `emulate -R sh' I get:

$ ls
\test  abc
$ echo [!a]*
\test
$ echo [\!a]*			<- That one is especially odd.
\test
$ echo [\]*
[]*
$ echo [\\]*
\test
$ setopt badpattern
$ echo [\]*
zsh: bad pattern: []*

So it appears that zsh is in fact not POSIX-compliant with respect to
backslashes inside brackets, but is OK with respect to `!'.

$ echo [\\!a]*     
\test abc


-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net   



* Re: bracket expressions and POSIX
  2001-06-30 19:00 ` Peter Stephenson
@ 2001-06-30 18:25   ` Bart Schaefer
  2001-07-02  6:13   ` Andrej Borsenkow
  1 sibling, 0 replies; 12+ messages in thread
From: Bart Schaefer @ 2001-06-30 18:25 UTC
  To: Zsh hackers list

On Jun 30,  8:00pm, Peter Stephenson wrote:
} Subject: Re: bracket expressions and POSIX
}
} I hope it will be possible to turn this into an option (part of
} SH_GLOB?) by un-nulling the backslashes.

Making it an effect of SH_GLOB seems like the right thing to me.




* Re: bracket expressions and POSIX
  2001-06-29 18:22 bracket expressions and POSIX Clint Adams
  2001-06-30  2:14 ` Bart Schaefer
@ 2001-06-30 19:00 ` Peter Stephenson
  2001-06-30 18:25   ` Bart Schaefer
  2001-07-02  6:13   ` Andrej Borsenkow
  1 sibling, 2 replies; 12+ messages in thread
From: Peter Stephenson @ 2001-06-30 19:00 UTC
  To: Zsh hackers list

Clint Adams wrote:
> POSIX says that \ loses its special meaning within a bracket
> expression for pattern matching and also that ! is the
> ^ character in that context.
> 
> So this strikes me as non-compliant:
> 
> % emulate sh
> % touch \\test abc
> % echo [!a]*
> zsh: event not found: a]

We can't do much directly about this (i.e. with BANG_HISTORY), as Bart
pointed out.

> % echo [\!a]*
> \test

Ooh err.  I don't like that one at all.

> % echo [\]*
> []*
> % echo [\\]*
> \test

These are non-compliant, but it's actually useful being able to quote
things in a range with a backslash, particularly square brackets, rather
than rely on the rather tricky rules of positioning (anyone remember off
the top of their head how to match both a `]' and a `-'?).  I hope
it will be possible to turn this into an option (part of SH_GLOB?) by
un-nulling the backslashes.  It's possible the extra test could have a
noticeable effect on the speed of pattern compilation, but again I hope
not.  I haven't looked at the code.

-- 
Peter Stephenson <pws@pwstephenson.fsnet.co.uk>
Work: pws@csr.com
Web: http://www.pwstephenson.fsnet.co.uk



* RE: bracket expressions and POSIX
  2001-06-30 19:00 ` Peter Stephenson
  2001-06-30 18:25   ` Bart Schaefer
@ 2001-07-02  6:13   ` Andrej Borsenkow
  2001-07-02  7:10     ` Bart Schaefer
                       ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Andrej Borsenkow @ 2001-07-02  6:13 UTC
  To: Peter Stephenson, Zsh hackers list

>
> Clint Adams wrote:
> > POSIX says that \ loses its special meaning within a bracket
> > expression for pattern matching

What gave you that idea? I would be highly surprised if POSIX had different
rules from XPG and SUS; according to SUS, the following still matches abc:

a["\b"]c

Actually, SUS explicitly speaks about both quoting (when parsed by shell)
and escaping (when pattern is interpreted).

<http://www.opengroup.org/onlinepubs/007908799/xcu/chap2.html#tag_001_013>
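
A sketch of the quoting case, assuming a POSIX-style shell running a
script. The double quotes are handled by the shell, so the bracket
expression ends up with two literal members, a backslash and "b". The
function name in_quoted_set is hypothetical:

```shell
# Hypothetical helper: match $1 against a["\b"]c, where the quoted
# part contributes a literal backslash and a literal "b" to the set.
in_quoted_set() {
    case $1 in
        a["\b"]c) echo match ;;
        *)        echo "no match" ;;
    esac
}

in_quoted_set abc      # prints "match"
in_quoted_set 'a\c'    # prints "match"
in_quoted_set axc      # prints "no match"
```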

Note, that patterns in shell are *based* on regular expressions but not
identical (that probably has confused you).

> (anyone remember off
> the top of their head how to match both a `]' and a `-'?).

If a bracket expression must specify both - and ], the ] must be placed
first (after the ^, if any) and the - last within the bracket expression.
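
That positioning rule can be sketched as follows, assuming a POSIX-style
shell; the function name bracket_or_dash is made up for illustration:

```shell
# Hypothetical helper: classify a single character.
# In []-] the "]" comes first and the "-" last, so both are literal.
# In [!]-] the "]" directly follows the negating "!".
bracket_or_dash() {
    case $1 in
        []-])  echo "bracket-or-dash" ;;
        [!]-]) echo "other" ;;
        *)     echo "not a single character" ;;
    esac
}

bracket_or_dash ']'    # prints "bracket-or-dash"
bracket_or_dash -      # prints "bracket-or-dash"
bracket_or_dash a      # prints "other"
```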

Where zsh really violates POSIX:

==
Since each asterisk matches zero or more occurrences, the patterns a*b and
a**b have identical functionality
==

Another point is using collating elements, ranges etc ... anything that has
to deal with locale.

-andrej



* Re: bracket expressions and POSIX
  2001-07-02  6:13   ` Andrej Borsenkow
@ 2001-07-02  7:10     ` Bart Schaefer
  2001-07-02  7:33       ` Andrej Borsenkow
  2001-07-02  7:44     ` Andrej Borsenkow
  2001-07-02 13:31     ` Clint Adams
  2 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2001-07-02  7:10 UTC
  To: Zsh hackers list

On Jul 2, 10:13am, Andrej Borsenkow wrote:
} Subject: RE: bracket expressions and POSIX
}
} Where zsh really violates POSIX:
} 
} ==
} Since each asterisk matches zero or more occurrences, the patterns a*b and
} a**b have identical functionality
} ==

Of course, a*b and a**b do have identical functionality.  ** only becomes
special when it has a slash after it.  I suppose SH_GLOB could disable it.
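
A quick check of that equivalence for plain string matching, assuming a
POSIX-style shell. This deliberately avoids pathname expansion, where the
slash case mentioned above applies. The function name star_match is
hypothetical:

```shell
# Hypothetical helper: report whether $1 matches a*b and a**b.
star_match() {
    one=no
    two=no
    case $1 in a*b)  one=yes ;; esac
    case $1 in a**b) two=yes ;; esac
    echo "a*b=$one a**b=$two"
}

star_match ab       # prints "a*b=yes a**b=yes"
star_match axyzb    # prints "a*b=yes a**b=yes"
star_match abc      # prints "a*b=no a**b=no"
```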

} Another point is using collating elements, ranges etc ... anything that
} has to deal with locale.

I've lost track of what happened to the strcoll() situation since PWS's
regex-like implementation of globbing was put in.  There used to be #if 0
code in glob.c (as of workers/7185) but it has completely disappeared.




* RE: bracket expressions and POSIX
  2001-07-02  7:10     ` Bart Schaefer
@ 2001-07-02  7:33       ` Andrej Borsenkow
  0 siblings, 0 replies; 12+ messages in thread
From: Andrej Borsenkow @ 2001-07-02  7:33 UTC
  To: Zsh hackers list

>
> } Another point is using collating elements, ranges etc ... anything that
> } has to deal with locale.
>
> I've lost track of what happened to the strcoll() situation since PWS's
> regex-like implementation of globbing was put in.  There used to be #if 0
> code in glob.c (as of workers/7185) but it has completely disappeared.
>

It is more than just strcoll. A bracket expression is supposed to recognize
collating elements and equivalence classes, and I still have no idea how to
do it portably. Probably, when we see something like [[.ch.]] (valid for
Spanish locale), we could try native regcompile to see if it succeeds. It
seems the only possibility to check it portably. The same goes for [[=a=]],
which may match any accented character depending on locale.

But that means that a bracket expression may match more than one character;
I do not know if our code is prepared to handle that.

Of course, [[.ch.]-z] is impossible without strcoll.

-andrej



* RE: bracket expressions and POSIX
  2001-07-02  6:13   ` Andrej Borsenkow
  2001-07-02  7:10     ` Bart Schaefer
@ 2001-07-02  7:44     ` Andrej Borsenkow
  2001-07-02 13:41       ` Clint Adams
  2001-07-02 13:31     ` Clint Adams
  2 siblings, 1 reply; 12+ messages in thread
From: Andrej Borsenkow @ 2001-07-02  7:44 UTC
  To: Zsh hackers list

>
> Another point is using collating elements, ranges etc ... anything that
> has to deal with locale.
>

IIRC POSIX does not deal with locale, so it is a bit offtopic w.r.t. POSIX
compatibility. But XPG/SUS do. Does anybody have access to pure POSIX (even
drafts would do)? I have access only to XPG.

-andrej



* Re: bracket expressions and POSIX
  2001-07-02  6:13   ` Andrej Borsenkow
  2001-07-02  7:10     ` Bart Schaefer
  2001-07-02  7:44     ` Andrej Borsenkow
@ 2001-07-02 13:31     ` Clint Adams
  2 siblings, 0 replies; 12+ messages in thread
From: Clint Adams @ 2001-07-02 13:31 UTC
  To: Andrej Borsenkow; +Cc: Peter Stephenson, Zsh hackers list

> Actually, SUS explicitly speaks about both quoting (when parsed by shell)
> and escaping (when pattern is interpreted).
> 
> <http://www.opengroup.org/onlinepubs/007908799/xcu/chap2.html#tag_001_013>

This is similar to POSIX, as is the reference on Bracket Expressions.
http://www.opengroup.org/onlinepubs/007908799/xbd/re.html#tag_007_003_005

> Note, that patterns in shell are *based* on regular expressions but not
> identical (that probably has confused you).

What confuses me is where it says that the backslash loses its special
meaning within a bracket expression.  I see that backslash-escapes
take precedence over bracket expressions in BREs, but, as you noted,
shell patterns are not BREs.



* Re: bracket expressions and POSIX
  2001-07-02  7:44     ` Andrej Borsenkow
@ 2001-07-02 13:41       ` Clint Adams
  2001-07-02 14:47         ` Andrej Borsenkow
  0 siblings, 1 reply; 12+ messages in thread
From: Clint Adams @ 2001-07-02 13:41 UTC
  To: Andrej Borsenkow; +Cc: Zsh hackers list

> IIRC POSIX does not deal with locale so it is a bit offtopic w.r.t. POSIX
> compatibility. But XPG/SUS do. Anybody with access to pure POSIX (even
> drafts would do; I have access only to XPG).

The standard utilities in the Shell and Utilities volume of IEEE Std 1003.1-200x shall base their
behavior on the current locale, as defined in the ENVIRONMENT VARIABLES section for each
utility. The behavior of some of the C-language functions defined in the System Interfaces
volume of IEEE Std 1003.1-200x shall also be modified based on the current locale, as defined by
the last call to setlocale ().

Locales other than those supplied by the implementation can be created via the localedef utility,
provided that the _POSIX2_LOCALEDEF symbol is defined on the system. Even if localedef is not
provided, all implementations conforming to the System Interfaces volume of
IEEE Std 1003.1-200x shall provide one or more locales that behave as described in this chapter.
The input to the utility is described in Section 7.3 (on page 122). The value that is used to specify
a locale when using environment variables shall be the string specified as the name operand to
the localedef utility when the locale was created. The strings "C" and "POSIX" are reserved as
identifiers for the POSIX locale (see Section 7.2 (on page 122)). When the value of a locale
environment variable begins with a slash ( / ), it shall be interpreted as the pathname of the
locale definition; the type of file (regular, directory, and so on) used to store the locale definition
is implementation-defined. If the value does not begin with a slash, the mechanism used to
locate the locale is implementation-defined.



* RE: bracket expressions and POSIX
  2001-07-02 13:41       ` Clint Adams
@ 2001-07-02 14:47         ` Andrej Borsenkow
  2001-07-02 14:57           ` Clint Adams
  0 siblings, 1 reply; 12+ messages in thread
From: Andrej Borsenkow @ 2001-07-02 14:47 UTC
  To: Zsh hackers list

> 
> > IIRC POSIX does not deal with locale so it is a bit offtopic w.r.t. POSIX
> > compatibility. But XPG/SUS do. Anybody with access to pure POSIX (even
> > drafts would do; I have access only to XPG).
>
> The standard utilities in the Shell and Utilities volume of IEEE
> Std 1003.1-200x shall base their behavior on the current locale,
> [... other stuff omitted ...]

Does pure POSIX define equivalence classes/collating symbols?

-andrej



* Re: bracket expressions and POSIX
  2001-07-02 14:47         ` Andrej Borsenkow
@ 2001-07-02 14:57           ` Clint Adams
  0 siblings, 0 replies; 12+ messages in thread
From: Clint Adams @ 2001-07-02 14:57 UTC
  To: Andrej Borsenkow; +Cc: Zsh hackers list

> Does pure POSIX define equivalence classes/collating symbols?

It does.

