zsh-workers
* behavior of test true -a \( ! -a \)
@ 2024-03-21 10:07 Vincent Lefevre
  2024-03-21 10:28 ` Peter Stephenson
  2024-03-21 17:39 ` Bart Schaefer
  0 siblings, 2 replies; 22+ messages in thread
From: Vincent Lefevre @ 2024-03-21 10:07 UTC (permalink / raw)
  To: zsh-workers

I know that the "test" utility (builtin in zsh) is ambiguous,
is not completely specified by POSIX and should not be used,
but IMHO, it should behave in a sensible and consistent way.

The following with zsh 5.9 is inconsistent:

qaa% test \( ! -a \) ; echo $?
1
qaa% test true -a \( ! -a \) ; echo $?
test: argument expected
2

In the second case, one should just get 1, like in the first case.

In 2008, I had noted at

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=421591#25

that zsh was returning 1 as expected. So it seems that this has
changed. This bug is about bash giving an error (like zsh now),
but note that with ash (BusyBox v1.36.1 sh), dash 0.5.12 and the
"test" command from the GNU coreutils 9.4, one gets 1 in these
two cases.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: behavior of test true -a \( ! -a \)
  2024-03-21 10:07 behavior of test true -a \( ! -a \) Vincent Lefevre
@ 2024-03-21 10:28 ` Peter Stephenson
  2024-03-21 11:04   ` Vincent Lefevre
  2024-03-21 17:39 ` Bart Schaefer
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Stephenson @ 2024-03-21 10:28 UTC (permalink / raw)
  To: zsh-workers

> On 21/03/2024 10:07 GMT Vincent Lefevre <vincent@vinc17.net> wrote:
> I know that the "test" utility (builtin in zsh) is ambiguous,
> is not completely specified by POSIX and should not be used,
> but IMHO, it should behave in a sensible and consistent way.
> 
> The following with zsh 5.9 is inconsistent:
> 
> qaa% test \( ! -a \) ; echo $?
> 1
> qaa% test true -a \( ! -a \) ; echo $?
> test: argument expected
> 2

As you can imagine, trying to put some order on the ill-defined mess
here tends to mean moving the problems around rather than fixing them.

I haven't had time to go through this completely but I think somewhere
near the root of the issue is this chunk in par_cond_2(), encountered at
the point we get to the "!":

    if (tok == BANG) {
	/*
	 * In "test" compatibility mode, "! -a ..." and "! -o ..."
	 * are treated as "[string] [and] ..." and "[string] [or] ...".
	 */
	if (!(n_testargs > 2 && (check_cond(*testargs, "a") ||
				 check_cond(*testargs, "o"))))
	{
	    condlex();
	    ecadd(WCB_COND(COND_NOT, 0));
	    return par_cond_2();
	}
    }

in which case it needs yet more logic to decide why we shouldn't treat !
-a as a string followed by a logical "and" in this case.  To be clear,
obviously *I* can see why you want that, the question is teaching the
code without confusing it further.

pws



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 10:28 ` Peter Stephenson
@ 2024-03-21 11:04   ` Vincent Lefevre
  2024-03-21 11:29     ` Peter Stephenson
  0 siblings, 1 reply; 22+ messages in thread
From: Vincent Lefevre @ 2024-03-21 11:04 UTC (permalink / raw)
  To: zsh-workers

On 2024-03-21 10:28:16 +0000, Peter Stephenson wrote:
> I haven't had time to go through this completely but I think somewhere
> near the root of the issue is this chunk in par_cond_2(), encountered at
> the point we get to the "!":
> 
>     if (tok == BANG) {
> 	/*
> 	 * In "test" compatibility mode, "! -a ..." and "! -o ..."
> 	 * are treated as "[string] [and] ..." and "[string] [or] ...".
> 	 */
> 	if (!(n_testargs > 2 && (check_cond(*testargs, "a") ||
> 				 check_cond(*testargs, "o"))))
> 	{
> 	    condlex();
> 	    ecadd(WCB_COND(COND_NOT, 0));
> 	    return par_cond_2();
> 	}
>     }
> 
> in which case it needs yet more logic to decide why we shouldn't treat !
> -a as a string followed by a logical "and" in this case.  To be clear,
> obviously *I* can see why you want that, the question is teaching the
> code without confusing it further.

Perhaps follow the coreutils logic. What matters is that if there is
a "(" argument, it tries to look at a matching ")" argument among the
following 3 arguments. So, for instance, if it can see

  ( arg2 arg3 )

(possibly with other arguments after the closing parenthesis[*]), it
will apply the POSIX test on 4 arguments.

[*] which can make sense if the 5th argument is -a or -o.
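
A minimal sketch of the lookahead described above, for illustration only.
It is not taken from the coreutils source; the helper name
short_group_len and its return convention are assumptions made for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not coreutils code.
 * args[0] is expected to be "(".  Only args[1..3] are inspected for ")".
 * Returns the number of words between the parentheses (0, 1 or 2),
 * or -1 if args[0] is not "(" or no ")" is found in that window.
 */
int short_group_len(const char *const args[])
{
    assert(args != NULL);
    if (args[0] == NULL || strcmp(args[0], "(") != 0)
        return -1;
    for (int i = 1; i <= 3 && args[i] != NULL; i++) {
        if (strcmp(args[i], ")") == 0)
            return i - 1;
    }
    return -1;
}
```

For the words ( ! -a ) -a true this returns 2, so the first four words
could be handed to the 4-argument rule and the trailing -a true parsed
afterwards.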

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 11:04   ` Vincent Lefevre
@ 2024-03-21 11:29     ` Peter Stephenson
  2024-03-21 12:18       ` Vincent Lefevre
  2024-03-25 16:38       ` Peter Stephenson
  0 siblings, 2 replies; 22+ messages in thread
From: Peter Stephenson @ 2024-03-21 11:29 UTC (permalink / raw)
  To: zsh-workers

> On 21/03/2024 11:04 GMT Vincent Lefevre <vincent@vinc17.net> wrote:
> On 2024-03-21 10:28:16 +0000, Peter Stephenson wrote:
> > I haven't had time to go through this completely but I think somewhere
> > near the root of the issue is this chunk in par_cond_2(), encountered at
> > the point we get to the "!":
> > 
> >     if (tok == BANG) {
> > 	/*
> > 	 * In "test" compatibility mode, "! -a ..." and "! -o ..."
> > 	 * are treated as "[string] [and] ..." and "[string] [or] ...".
> > 	 */
> > 	if (!(n_testargs > 2 && (check_cond(*testargs, "a") ||
> > 				 check_cond(*testargs, "o"))))
> > 	{
> > 	    condlex();
> > 	    ecadd(WCB_COND(COND_NOT, 0));
> > 	    return par_cond_2();
> > 	}
> >     }
> > 
> > in which case it needs yet more logic to decide why we shouldn't treat !
> > -a as a string followed by a logical "and" in this case.  To be clear,
> > obviously *I* can see why you want that, the question is teaching the
> > code without confusing it further.
> 
> Perhaps follow the coreutils logic. What matters is that if there is
> a "(" argument, it tries to look at a matching ")" argument among the
> following 3 arguments. So, for instance, if it can see
> 
>   ( arg2 arg3 )
> 
> (possibly with other arguments after the closing parenthesis[*]), it
> will apply the POSIX test on 4 arguments.
> 
> [*] which can make sense if the 5th argument is -a or -o.

I suppose as long as we only look for ")" when we know there's one to
match we can probably get away with it without being too clever.  If
there's a ")" that logically needs to be treated as a string following a
"(" we're stuck but I think that's fair game.

Something simple like: if we find a (, look for a matching ), so blindly
count intervening ('s and )'s regardless of where they occur, and then
NULL out the matching ) temporarily until we've parsed the expression
inside.  If we don't find a matching one, treat the ( as a string.
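
The scan just described can be sketched as a standalone function.  This
is only an illustration under stated assumptions: the name
find_matching_close is hypothetical, the array is assumed to start at
the first word after the opening "(", and nothing here is zsh source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not zsh code.
 * args[0] is the first word after an opening "(" and the array is
 * NULL-terminated.  Parentheses are counted blindly, wherever they occur.
 * Returns the index of the ")" that balances the opening "(",
 * or -1 if there is none (the caller would then treat "(" as a string).
 */
int find_matching_close(const char *const args[])
{
    int depth = 1;  /* the "(" already consumed by the caller */

    assert(args != NULL);
    for (int i = 0; args[i] != NULL; i++) {
        if (strcmp(args[i], ")") == 0) {
            if (--depth == 0)
                return i;
        } else if (strcmp(args[i], "(") == 0) {
            depth++;
        }
    }
    return -1;
}
```

For test true -a \( ! -a \), the words after the "(" are !, -a and ),
so the function returns 2 and only ! -a would be parsed as the group
contents.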

pws



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 11:29     ` Peter Stephenson
@ 2024-03-21 12:18       ` Vincent Lefevre
  2024-03-21 12:25         ` Peter Stephenson
  2024-03-25 16:38       ` Peter Stephenson
  1 sibling, 1 reply; 22+ messages in thread
From: Vincent Lefevre @ 2024-03-21 12:18 UTC (permalink / raw)
  To: zsh-workers

On 2024-03-21 11:29:39 +0000, Peter Stephenson wrote:
> I suppose as long as we only look for ")" when we know there's one to
> match we can probably get away with it without being too clever.  If
> there's a ")" that logically needs to be treated as a string following a
> "(" we're stuck but I think that's fair game.
> 
> Something simple like: if we find a (, look for a matching ), so blindly
> count intervening ('s and )'s regardless of where they occur, and then
> NULL out the matching ) temporarily until we've parsed the expression
> inside.  If we don't find a matching one, treat the ( as a string.

But be careful with the simple

  test \( = \)

which you may not want to change. This is currently the equality test,
thus returning false (1).

In POSIX, this special case is unspecified, but if the obsolescent
XSI option is supported, this should be the unary test of "=", thus
true (0) instead of false (1).

In practice, all implementations (including zsh) do

$ test \( = \) ; echo $?
1

i.e. seeing it as an equality test. This is rather surprising (but
perhaps more useful), as the goal of the XSI option was to support
parentheses, which implementations do; thus one could expect 0.

However, zsh differs for

  test true -a \( = \) ; echo $?

zsh still sees \( = \) as an equality test (it outputs 1), while
the other implementations output 0, probably because they perform
the unary test on "=" (which returns true, i.e. 0).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 12:18       ` Vincent Lefevre
@ 2024-03-21 12:25         ` Peter Stephenson
  2024-03-21 19:06           ` Bart Schaefer
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Stephenson @ 2024-03-21 12:25 UTC (permalink / raw)
  To: zsh-workers

> On 21/03/2024 12:18 GMT Vincent Lefevre <vincent@vinc17.net> wrote:
> On 2024-03-21 11:29:39 +0000, Peter Stephenson wrote:
> > I suppose as long as we only look for ")" when we know there's one to
> > match we can probably get away with it without being too clever.  If
> > there's a ")" that logically needs to be treated as a string following a
> > "(" we're stuck but I think that's fair game.
> > 
> > Something simple like: if we find a (, look for a matching ), so blindly
> > count intervening ('s and )'s regardless of where they occur, and then
> > NULL out the matching ) temporarily until we've parsed the expression
> > inside.  If we don't find a matching one, treat the ( as a string.
> 
> But be careful with the simple
> 
>   test \( = \)
> 
> which you may not want to change. This is currently the equality test,
> thus returning false (1).

Meh.

I think we factor out simple cases with two or three arguments and
assume they aren't doing grouping or logical combinations, so that might
be OK.  Obviously once we're into more complicated expressions it's
going to get fraught, but keeping the simple cases that test was
originally designed for (and ideally should only ever be used with)
working is a good goal.

pws



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 10:07 behavior of test true -a \( ! -a \) Vincent Lefevre
  2024-03-21 10:28 ` Peter Stephenson
@ 2024-03-21 17:39 ` Bart Schaefer
  2024-03-23 21:48   ` Bart Schaefer
  1 sibling, 1 reply; 22+ messages in thread
From: Bart Schaefer @ 2024-03-21 17:39 UTC (permalink / raw)
  To: zsh-workers

On Thu, Mar 21, 2024 at 3:07 AM Vincent Lefevre <vincent@vinc17.net> wrote:
>
> In 2008, I had noted at
>
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=421591#25
>
> that zsh was returning 1 as expected. So it seems that this has
> changed.

I'm guessing the change is from commit 2afa556d8 (workers/31696).



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 12:25         ` Peter Stephenson
@ 2024-03-21 19:06           ` Bart Schaefer
  2024-03-22  5:02             ` Bart Schaefer
  0 siblings, 1 reply; 22+ messages in thread
From: Bart Schaefer @ 2024-03-21 19:06 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers

On Thu, Mar 21, 2024 at 5:25 AM Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
>
> I think we factor out simple cases with two or three arguments and
> assume they aren't doing grouping or logical combinations, so that might
> be OK.  Obviously once we're into more complicated expressions it's
> going to get fraught

Isn't the problem really here?

        if (n_testargs > 2) {
            /* three arguments: if the second argument is a binary operator, *
             * perform that binary test on the first and the third argument  */
            if (!strcmp(*testargs, "=")  ||
                !strcmp(*testargs, "==") ||
                !strcmp(*testargs, "!=") ||
                (IS_DASH(**testargs) && get_cond_num(*testargs + 1) >= 0)) {
                s1 = tokstr;
                condlex();
                s2 = tokstr;
                condlex();
                s3 = tokstr;
                condlex();
                return par_cond_triple(s1, s2, s3);
            }
        }

Shouldn't this be doing an expression parse for s3 rather than just
lexing a token?  Admittedly I'm not sure how that factors in s1 given
that s2 is the actual binary operator here.



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 19:06           ` Bart Schaefer
@ 2024-03-22  5:02             ` Bart Schaefer
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Schaefer @ 2024-03-22  5:02 UTC (permalink / raw)
  To: zsh-workers

On Thu, Mar 21, 2024 at 12:06 PM Bart Schaefer
<schaefer@brasslantern.com> wrote:
>
> Isn't the problem really here?

OK, no, it isn't.  That would be for e.g. this:

test 1 -lt \( 2 \)

Operators like -a / -o fall through to later code.



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 17:39 ` Bart Schaefer
@ 2024-03-23 21:48   ` Bart Schaefer
  2024-03-23 22:20     ` Vincent Lefevre
  0 siblings, 1 reply; 22+ messages in thread
From: Bart Schaefer @ 2024-03-23 21:48 UTC (permalink / raw)
  To: zsh-workers

On Thu, Mar 21, 2024 at 10:39 AM Bart Schaefer
<schaefer@brasslantern.com> wrote:
>
> I'm guessing the change is from commit 2afa556d8 (workers/31696).

That commit includes this comment:

+       /*
+        * In "test" compatibility mode, "! -a ..." and "! -o ..."
+        * are treated as "[string] [and] ..." and "[string] [or] ...".
+        */

The way this is tested was changed first at commit cb596a55
(workers/35306) and again at commit daa208e9 (workers/49269) to end up
with this:

        if (!(n_testargs > 2 && (check_cond(*testargs, "a") ||
                                 check_cond(*testargs, "o"))))

So this means that in the original example:

% test true -a \( ! -a \) ; echo $?
test: argument expected

The closing \) is being treated as the string argument of [ ! -a \) ]
rather than as a match for the opening \(.

Which you can see by:

% test \( ! -a \) \)
% echo $?
0
% test \( ! -a \) \) \)
test: too many arguments

I'd therefore argue that it's actually

% test \( ! -a \)

that is wrong: It should be complaining of a missing close paren,
rather than magically reverting to treating "!" as "not" (and also
"-a" as a plain string).  It's entirely dependent on the count of
arguments rather than on treating parens as tokens.



* Re: behavior of test true -a \( ! -a \)
  2024-03-23 21:48   ` Bart Schaefer
@ 2024-03-23 22:20     ` Vincent Lefevre
  2024-03-23 22:41       ` Bart Schaefer
  0 siblings, 1 reply; 22+ messages in thread
From: Vincent Lefevre @ 2024-03-23 22:20 UTC (permalink / raw)
  To: zsh-workers

On 2024-03-23 14:48:36 -0700, Bart Schaefer wrote:
> I'd therefore argue that it's actually
> 
> % test \( ! -a \)
> 
> that is wrong: It should be complaining of a missing close paren,
> rather than magically reverting to treating "!" as "not" (and also
> "-a" as a plain string).  It's entirely dependent on the count of
> arguments rather than on treating parens as tokens.

POSIX specifies what happens with up to 4 arguments. The idea is to
interpret the operators in a way so that the expression is meaningful
rather than having an incorrect syntax.
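
As an illustration, the argument-count rules of the current POSIX text
can be sketched as below.  This is a simplified model, not any shell's
implementation: the function names are hypothetical, only the -n, -z,
= and != primaries are modelled, and status 2 is used here as a
placeholder for "unspecified or not modelled".

```c
#include <assert.h>
#include <string.h>

enum { ST_TRUE = 0, ST_FALSE = 1, ST_UNSPEC = 2 };

/* Invert true/false; leave "unspecified" untouched. */
int negate_status(int st)
{
    if (st == ST_UNSPEC)
        return ST_UNSPEC;
    return st == ST_TRUE ? ST_FALSE : ST_TRUE;
}

/* 1 argument: true if the string is not null. */
int posix_test1(const char *a)
{
    assert(a != NULL);
    return a[0] != '\0' ? ST_TRUE : ST_FALSE;
}

/* 2 arguments: "!" first, then unary primaries (only -n and -z here). */
int posix_test2(const char *a, const char *b)
{
    if (strcmp(a, "!") == 0)
        return negate_status(posix_test1(b));
    if (strcmp(a, "-n") == 0)
        return b[0] != '\0' ? ST_TRUE : ST_FALSE;
    if (strcmp(a, "-z") == 0)
        return b[0] == '\0' ? ST_TRUE : ST_FALSE;
    return ST_UNSPEC;
}

/* 3 arguments: a binary primary in $2 wins over "!" and over "( x )". */
int posix_test3(const char *a, const char *b, const char *c)
{
    if (strcmp(b, "=") == 0)
        return strcmp(a, c) == 0 ? ST_TRUE : ST_FALSE;
    if (strcmp(b, "!=") == 0)
        return strcmp(a, c) != 0 ? ST_TRUE : ST_FALSE;
    if (strcmp(a, "!") == 0)
        return negate_status(posix_test2(b, c));
    if (strcmp(a, "(") == 0 && strcmp(c, ")") == 0)
        return posix_test1(b);
    return ST_UNSPEC;
}

/* 4 arguments: "!" negates the 3-argument test; "( x y )" is the 2-argument test. */
int posix_test4(const char *a, const char *b, const char *c, const char *d)
{
    if (strcmp(a, "!") == 0)
        return negate_status(posix_test3(b, c, d));
    if (strcmp(a, "(") == 0 && strcmp(d, ")") == 0)
        return posix_test2(b, c);
    return ST_UNSPEC;
}
```

Under this model, test \( ! -a \) reduces to the two-argument test
! -a, which is false (status 1) because "-a" is a non-empty string, and
test \( = \) is the string comparison of "(" with ")", also status 1.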

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



* Re: behavior of test true -a \( ! -a \)
  2024-03-23 22:20     ` Vincent Lefevre
@ 2024-03-23 22:41       ` Bart Schaefer
  2024-03-23 23:33         ` Lawrence Velázquez
  2024-03-25 10:23         ` Vincent Lefevre
  0 siblings, 2 replies; 22+ messages in thread
From: Bart Schaefer @ 2024-03-23 22:41 UTC (permalink / raw)
  To: zsh-workers

On Sat, Mar 23, 2024 at 3:20 PM Vincent Lefevre <vincent@vinc17.net> wrote:
>
> On 2024-03-23 14:48:36 -0700, Bart Schaefer wrote:
> > I'd therefore argue that it's actually
> >
> > % test \( ! -a \)
> >
> > that is wrong
>
> POSIX specifies what happens with up to 4 arguments.

Ok, but

% test \( ! -a \) \)

has five and

% test \( ! -a \) -a true

has six, and in neither case are the "first four" interpreted as you
would have the "last four" interpreted in

% test true -a \( ! -a \)

> The idea is to
> interpret the operators in a way so that the expression is meaningful

The only way to do that is to (in effect) start counting arguments
again when \( is encountered.  That changes the meaning of everything
with an open paren and more than four words.  At what point do we
stop?



* Re: behavior of test true -a \( ! -a \)
  2024-03-23 22:41       ` Bart Schaefer
@ 2024-03-23 23:33         ` Lawrence Velázquez
  2024-03-24  0:14           ` Bart Schaefer
  2024-03-25 10:23         ` Vincent Lefevre
  1 sibling, 1 reply; 22+ messages in thread
From: Lawrence Velázquez @ 2024-03-23 23:33 UTC (permalink / raw)
  To: zsh-workers

On Sat, Mar 23, 2024, at 6:41 PM, Bart Schaefer wrote:
> On Sat, Mar 23, 2024 at 3:20 PM Vincent Lefevre <vincent@vinc17.net> wrote:
>>
>> On 2024-03-23 14:48:36 -0700, Bart Schaefer wrote:
>> > I'd therefore argue that it's actually
>> >
>> > % test \( ! -a \)
>> >
>> > that is wrong
>>
>> POSIX specifies what happens with up to 4 arguments.

The next version of POSIX will only specify what happens with four
arguments if the very first one is "!".  The "-a" and "-o" primaries
and the "(" and ")" operators have been removed.

> Ok, but
>
> % test \( ! -a \) \)
>
> has five and
>
> % test \( ! -a \) -a true
>
> has six, and in neither case are the "first four" interpreted as you
> would have the "last four" interpreted in
>
> % test true -a \( ! -a \)
>
>> The idea is to
>> interpret the operators in a way so that the expression is meaningful
>
> The only way to do that is to (in effect) start counting arguments
> again when \( is encountered.  That changes the meaning of everything
> with an open paren and more than four words.  At what point do we
> stop?

The current version of POSIX leaves test(1) behavior with more than
4 arguments unspecified but says that:

	On XSI-conformant systems, combinations of primaries and
	operators shall be evaluated using the precedence and
	associativity rules described previously.  In addition, the
	string comparison binary primaries '=' and "!=" shall have
	a higher precedence than any unary primary.

I suspect that the Austin Group gave up on the whole thing once
they realized its general intractability.

-- 
vq



* Re: behavior of test true -a \( ! -a \)
  2024-03-23 23:33         ` Lawrence Velázquez
@ 2024-03-24  0:14           ` Bart Schaefer
  2024-03-24  2:52             ` Lawrence Velázquez
  0 siblings, 1 reply; 22+ messages in thread
From: Bart Schaefer @ 2024-03-24  0:14 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: zsh-workers

On Sat, Mar 23, 2024 at 4:33 PM Lawrence Velázquez <larryv@zsh.org> wrote:
>
> The next version of POSIX will only specify what happens with four
> arguments if the very first one is "!".  The "-a" and "-o" primaries
> and the "(" and ")" operators have been removed.

So, no "and"/"or" at all, and no grouping?  Or are they just demoted
from "primary" and "operator" in some way?



* Re: behavior of test true -a \( ! -a \)
  2024-03-24  0:14           ` Bart Schaefer
@ 2024-03-24  2:52             ` Lawrence Velázquez
  0 siblings, 0 replies; 22+ messages in thread
From: Lawrence Velázquez @ 2024-03-24  2:52 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

On Sat, Mar 23, 2024, at 8:14 PM, Bart Schaefer wrote:
> On Sat, Mar 23, 2024 at 4:33 PM Lawrence Velázquez <larryv@zsh.org> wrote:
>>
>> The next version of POSIX will only specify what happens with four
>> arguments if the very first one is "!".  The "-a" and "-o" primaries
>> and the "(" and ")" operators have been removed.
>
> So, no "and"/"or" at all, and no grouping?  Or are they just demoted
> from "primary" and "operator" in some way?

The former -- everything marked "OB" in the current "test" spec [1]
has been completely excised from the upcoming one (although shells
can retain them as extensions to the standard, to avoid breaking
scripts).  The current "Application Usage" section [2] already
recommends that portable scripts stick to simpler "test" commands
connected with general shell syntax, like so:

	test -f "$foo" || { test ! "$bar" && test -d "$baz"; }

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
[2]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html#tag_20_128_16

-- 
vq



* Re: behavior of test true -a \( ! -a \)
  2024-03-23 22:41       ` Bart Schaefer
  2024-03-23 23:33         ` Lawrence Velázquez
@ 2024-03-25 10:23         ` Vincent Lefevre
  2024-03-25 15:21           ` Bart Schaefer
  1 sibling, 1 reply; 22+ messages in thread
From: Vincent Lefevre @ 2024-03-25 10:23 UTC (permalink / raw)
  To: zsh-workers

On 2024-03-23 15:41:33 -0700, Bart Schaefer wrote:
> On Sat, Mar 23, 2024 at 3:20 PM Vincent Lefevre <vincent@vinc17.net> wrote:
> >
> > On 2024-03-23 14:48:36 -0700, Bart Schaefer wrote:
> > > I'd therefore argue that it's actually
> > >
> > > % test \( ! -a \)
> > >
> > > that is wrong
> >
> > POSIX specifies what happens with up to 4 arguments.
> 
> Ok, but
> 
> % test \( ! -a \) \)
> 
> has five and
> 
> % test \( ! -a \) -a true
> 
> has six, and in neither case are the "first four" interpreted as you
> would have the "last four" interpreted in
> 
> % test true -a \( ! -a \)

I meant that

  test \( ! -a \)

has four, thus fully specified and not wrong.

Concerning

  test true -a \( ! -a \)

I would say that if you decide that the first "-a" is an "and",
then after this "-a", there remain exactly 4 arguments, so that
for *consistency*, I think that the remaining 4 arguments should
be interpreted exactly as in

  test \( ! -a \)

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



* Re: behavior of test true -a \( ! -a \)
  2024-03-25 10:23         ` Vincent Lefevre
@ 2024-03-25 15:21           ` Bart Schaefer
  2024-03-25 17:33             ` Vincent Lefevre
  0 siblings, 1 reply; 22+ messages in thread
From: Bart Schaefer @ 2024-03-25 15:21 UTC (permalink / raw)
  To: zsh-workers

On Mon, Mar 25, 2024 at 3:24 AM Vincent Lefevre <vincent@vinc17.net> wrote:
>
> On 2024-03-23 15:41:33 -0700, Bart Schaefer wrote:
> > On Sat, Mar 23, 2024 at 3:20 PM Vincent Lefevre <vincent@vinc17.net> wrote:
> > >
> > > On 2024-03-23 14:48:36 -0700, Bart Schaefer wrote:
> > > > I'd therefore argue that it's actually
> > > >
> > > > % test \( ! -a \)
> > > >
> > > > that is wrong
> > >
> > > POSIX specifies what happens with up to 4 arguments.
> >
> > Ok, but
> >
> > % test \( ! -a \) \)
> >
> > has five [...]
>
> I meant that
>
>   test \( ! -a \)
>
> has four, thus fully specified and not wrong.

And I meant (in the follow-up, being unaware of the exact spec at
first) that I think it was wrong to specify it that way in the first
place.

I suspect that we're re-hashing an argument that's already been had on
austin-group and led to these constructs being first declared obsolete
and soon dropped.

> Concerning
>
>   test true -a \( ! -a \)
>
> I would say that if you decide that the first "-a" is an "and",
> then after this "-a", there remain exactly 4 arguments, so that
> for *consistency*, I think that the remaining 4 arguments should
> be interpreted exactly as in
>
>   test \( ! -a \)

So what about
  test true -a \( ! -a \) \)
??

Counting arguments just doesn't work.



* Re: behavior of test true -a \( ! -a \)
  2024-03-21 11:29     ` Peter Stephenson
  2024-03-21 12:18       ` Vincent Lefevre
@ 2024-03-25 16:38       ` Peter Stephenson
  2024-03-25 17:36         ` Bart Schaefer
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Stephenson @ 2024-03-25 16:38 UTC (permalink / raw)
  To: zsh-workers

> On 21/03/2024 11:29 GMT Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> > On 21/03/2024 11:04 GMT Vincent Lefevre <vincent@vinc17.net> wrote:
> > On 2024-03-21 10:28:16 +0000, Peter Stephenson wrote:
> > > I haven't had time to go through this completely but I think somewhere
> > > near the root of the issue is this chunk in par_cond_2(), encountered at
> > > the point we get to the "!":
> > > 
> > >     if (tok == BANG) {
> > > 	/*
> > > 	 * In "test" compatibility mode, "! -a ..." and "! -o ..."
> > > 	 * are treated as "[string] [and] ..." and "[string] [or] ...".
> > > 	 */
> > > 	if (!(n_testargs > 2 && (check_cond(*testargs, "a") ||
> > > 				 check_cond(*testargs, "o"))))
> > > 	{
> > > 	    condlex();
> > > 	    ecadd(WCB_COND(COND_NOT, 0));
> > > 	    return par_cond_2();
> > > 	}
> > >     }
> > > 
> > > in which case it needs yet more logic to decide why we shouldn't treat !
> > > -a as a string followed by a logical "and" in this case.  To be clear,
> > > obviously *I* can see why you want that, the question is teaching the
> > > code without confusing it further.
> > 
> > Perhaps follow the coreutils logic. What matters is that if there is
> > a "(" argument, it tries to look at a matching ")" argument among the
> > following 3 arguments. So, for instance, if it can see
> > 
> >   ( arg2 arg3 )
> > 
> > (possibly with other arguments after the closing parenthesis[*]), it
> > will apply the POSIX test on 4 arguments.
> > 
> > [*] which can make sense if the 5th argument is -a or -o.
> 
> I suppose as long as we only look for ")" when we know there's one to
> match we can probably get away with it without being too clever.  If
> there's a ")" that logically needs to be treated as a string following a
> "(" we're stuck but I think that's fair game.
> 
> Something simple like: if we find a (, look for a matching ), so blindly
> count intervening ('s and )'s regardless of where they occur, and then
> NULL out the matching ) temporarily until we've parsed the expression
> > inside.  If we don't find a matching one, treat the ( as a string.

This implements the above.  I don't think any of the subsequent
discussion has any impact on the effectiveness of this.  Because this is
only done when we've already identified a "(" as starting grouping,
looking for a ")" is benign --- failing to find it would mean the pattern
was invalid.  As Vincent already pointed out, assuming the pattern is
valid is a sensible strategy.  The remaining question is which ")" to
match if there are multiple.

I've added a check that "test \( = \)" returns 1.  This isn't affected
because as I said before three arguments are treated specially.
Possibly some more tests for non-pathological cases where parentheses do
grouping in test compatibility mode would be sensible.

There is inevitably a trade off: "test \( \) = \) \)" used to "work"
(test that the two inner closing parentheses were the same string) but
now doesn't (the first closing parenthesis ends the group started by the
first opening parenthesis).  That strikes me as OK as we're making the
less pathological case (the one where parentheses mean just one thing)
work.  However, it is a sign we're right on the edge of sanity, so I'm
not proposing to "fix" this any further.

Feel free to argue that the current behaviour of simply parsing
parentheses in order and blindly trusting the result is actually a
better bet, though I can't frankly see how, myself.  There is an
alternative strategy which is to assume the rightmost closing parenthesis
ends the outermost group.
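
For comparison, that alternative strategy might look like the sketch
below.  It is not part of the patch; the name find_rightmost_close is
hypothetical and the array is again assumed to start at the first word
after the opening "(".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not zsh code.
 * args[0] is the first word after the outermost "(" and the array is
 * NULL-terminated.  Returns the index of the last ")" in the list,
 * or -1 if there is none.
 */
int find_rightmost_close(const char *const args[])
{
    int last = -1;

    assert(args != NULL);
    for (int i = 0; args[i] != NULL; i++) {
        if (strcmp(args[i], ")") == 0)
            last = i;
    }
    return last;
}
```

For test \( \) = \) \), the words after the first "(" are ), =, ) and ).
This function returns 3, leaving ) = ) inside the group as a string
comparison, whereas a depth-counting scan stops at index 0.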

Also feel free to come up with other pathologies.

pws

diff --git a/Src/parse.c b/Src/parse.c
index 3343656..1505b49 100644
--- a/Src/parse.c
+++ b/Src/parse.c
@@ -2528,10 +2528,39 @@ par_cond_2(void)
     if (tok == INPAR) {
 	int r;
 
+	/*
+	 * Owing to ambiguities in "test" compatibility mode, it's
+	 * safest to assume the INPAR has a corresponding OUTPAR
+	 * before trying to guess what intervening strings mean.
+	 */
+	char **endargptr = NULL, *endarg = NULL;
+	if (condlex == testlex) {
+	    char **argptr;
+	    int n_inpar = 1;
+
+	    for (argptr = testargs; *argptr; argptr++) {
+		if (!strcmp(*argptr, ")")) {
+		    if (!--n_inpar) {
+			endargptr = argptr;
+			endarg = *argptr;
+			*argptr = NULL;
+			break;
+		    }
+		} else if (!strcmp(*argptr, "(")) {
+		    ++n_inpar;
+		}
+	    }
+	}
+
 	condlex();
 	while (COND_SEP())
 	    condlex();
 	r = par_cond();
+	if (endargptr) {
+	    *endargptr = endarg;
+	    if (testargs == endargptr)
+		condlex();
+	}
 	while (COND_SEP())
 	    condlex();
 	if (tok != OUTPAR)
diff --git a/Test/C02cond.ztst b/Test/C02cond.ztst
index daea5b4..453fa1c 100644
--- a/Test/C02cond.ztst
+++ b/Test/C02cond.ztst
@@ -442,6 +442,14 @@ F:scenario if you encounter it.
 >in conjunction: 3
 ?(eval):6: no such option: invalidoption
 
+  test \( = \)
+1: test compatibility mode doesn't do grouping with three arguments
+
+# This becomes [[ -n true && ( -n -a ) ]]
+# The test is to ensure the ! -a is analysed as two arguments.
+  test true -a \( ! -a \)
+1: test compatibility mode is based on arguments inside parentheses
+
 %clean
   # This works around a bug in rm -f in some versions of Cygwin
   chmod 644 unmodish



* Re: behavior of test true -a \( ! -a \)
  2024-03-25 15:21           ` Bart Schaefer
@ 2024-03-25 17:33             ` Vincent Lefevre
  2024-03-25 17:43               ` Bart Schaefer
  0 siblings, 1 reply; 22+ messages in thread
From: Vincent Lefevre @ 2024-03-25 17:33 UTC (permalink / raw)
  To: zsh-workers

On 2024-03-25 08:21:53 -0700, Bart Schaefer wrote:
> On Mon, Mar 25, 2024 at 3:24 AM Vincent Lefevre <vincent@vinc17.net> wrote:
> > I meant that
> >
> >   test \( ! -a \)
> >
> > has four, thus fully specified and not wrong.
> 
> And I meant (in the follow-up, being unaware of the exact spec at
> first) that I think it was wrong to specify it that way in the first
> place.

Well, "test" with its ambiguities between strings and operators should
have never existed in the first place.

> I suspect that we're re-hashing an argument that's already been had on
> austin-group and led to these constructs being first declared obsolete
> and soon dropped.

Dropped from the standard, but not from existing scripts.

> > Concerning
> >
> >   test true -a \( ! -a \)
> >
> > I would say that if you decide that the first "-a" is an "and",
> > then after this "-a", there remain exactly 4 arguments, so that
> > for *consistency*, I think that the remaining 4 arguments should
> > be interpreted exactly as in
> >
> >   test \( ! -a \)
> 
> So what about
>   test true -a \( ! -a \) \)
> ??

With zsh, this is consistent:

cventin% test \( ! -a \) \) ; echo $?
0
cventin% test true -a \( ! -a \) \) ; echo $?
0

The other implementations seem consistent too.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



* Re: behavior of test true -a \( ! -a \)
  2024-03-25 16:38       ` Peter Stephenson
@ 2024-03-25 17:36         ` Bart Schaefer
  2024-04-03 13:59           ` Vincent Lefevre
  0 siblings, 1 reply; 22+ messages in thread
From: Bart Schaefer @ 2024-03-25 17:36 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers

On Mon, Mar 25, 2024 at 9:38 AM Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
>
> > I suppose as long as we only look for ")" when we know there's one to
> > match we can probably get away with it without being too clever.  If
> > there's a ")" that logically needs to be treated as a string following a
> > "(" we're stuck but I think that's fair game.

So in other words you're intentionally breaking this:

% test \( ! -a \) \)
test: too many arguments

In the name of "fixing" this:

% test \) -a \( ! -a \)

If we're arguing here based on spec, POSIX says the below should
return 1 because $2 is a "binary primary" which takes precedence over
parens, but it's broken with or without this patch:

% test \( -a \(
test: parse error

To be fair, /bin/test on MacOS and /usr/bin/test on Ubuntu both choke
(or not) in exactly those same cases.

> Feel free to argue that the current behaviour of simply parsing
> parentheses in order and blindly trusting the result is actually a
> better bet

That's not how I'd describe the current behavior, but I'm arguing that
it's no worse and anything with more than 4 arguments is unspecified
anyway.



* Re: behavior of test true -a \( ! -a \)
  2024-03-25 17:33             ` Vincent Lefevre
@ 2024-03-25 17:43               ` Bart Schaefer
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Schaefer @ 2024-03-25 17:43 UTC (permalink / raw)
  To: zsh-workers

On Mon, Mar 25, 2024 at 10:33 AM Vincent Lefevre <vincent@vinc17.net> wrote:
>
> With zsh, this is consistent:
>
> cventin% test \( ! -a \) \) ; echo $?
> 0
> cventin% test true -a \( ! -a \) \) ; echo $?
> 0
>
> The other implementations seem consistent too.

Those will both fail if PWS's patch is applied.  I'm not sure which
other implementations you mean, but

Mac% /bin/test \( ! -a \) \) ; echo $?
test: closing paren expected
2



* Re: behavior of test true -a \( ! -a \)
  2024-03-25 17:36         ` Bart Schaefer
@ 2024-04-03 13:59           ` Vincent Lefevre
  0 siblings, 0 replies; 22+ messages in thread
From: Vincent Lefevre @ 2024-04-03 13:59 UTC (permalink / raw)
  To: zsh-workers

On 2024-03-25 10:36:13 -0700, Bart Schaefer wrote:
> So in other words you're intentionally breaking this:
> 
> % test \( ! -a \) \)
> test: too many arguments

I suppose that zsh 5.9 sees that -a is a binary primary. This does
not contradict POSIX[*]: deciding which parenthesis is the matching
one requires considering at least 5 arguments, and this is where
the results are unspecified.

[*] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html

> In the name of "fixing" this:
> 
> % test \) -a \( ! -a \)
> 
> If we're arguing here based on spec, POSIX says the below should
> return 1 because $2 is a "binary primary" which takes precedence over
> parens, but it's broken with or without this patch:
> 
> % test \( -a \(
> test: parse error
> 
> To be fair, /bin/test on MacOS and /usr/bin/test on Ubuntu both choke
> (or not) in exactly those same cases.

In Debian:

With dash 0.5.12-6:

$ test \( -a \( ; echo $?
sh: 1: test: closing paren expected
2

With ksh93u+m 1.0.8-1:

$ test \( -a \( ; echo $?
ksh93: test: argument expected
2

With mksh 59c-35:

$ test \( -a \( ; echo $?
0

With bash 5.2.21-2:

vinc17@qaa:~$ test \( -a \( ; echo $?
0

With coreutils 9.4-3.1:

$ /usr/bin/test \( -a \( ; echo $?
/usr/bin/test: ‘-a’: unary operator expected
2
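
The per-shell transcripts above can be reproduced in one go with a
small loop. This is only a sketch: the helper name try_shell and the
list of shell names are assumptions (the binaries may be named
differently on a given system, e.g. ksh instead of ksh93), and a shell
that is not installed is reported rather than run. The standalone
/usr/bin/test from coreutils is not a shell and has to be run
separately.

```shell
# Hypothetical helper: run the three-argument case in the named shell
# and print its diagnostic (if any) and exit status on one line.
try_shell() {
    if ! command -v "$1" >/dev/null 2>&1; then
        printf '%s: not installed\n' "$1"
        return 0
    fi
    # 2>&1 keeps any "test: ..." diagnostic next to the status;
    # tr joins the diagnostic and the status onto a single line.
    printf '%s: %s\n' "$1" \
        "$("$1" -c 'test \( -a \( ; echo $?' 2>&1 | tr '\n' ' ')"
}

# Shell names are assumptions; adjust to what is installed locally.
for s in dash ksh93 mksh bash zsh; do
    try_shell "$s"
done
```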

Note that POSIX says that the results are unspecified if
$1 is '(' and $3 is ')', but here both $1 and $3 are '('.
So the rule is:
  If $2 is a binary primary, perform the binary test of $1 and $3.

so that 0 is expected (only mksh and bash are correct).

IMHO, once you have reached a subsequence with at most 4 arguments
(like here), you should apply the POSIX rules. Doing otherwise is
surprising.

In other words, I suppose that the following should work:
  * If there are at most 4 arguments, apply the POSIX rules.
  * If the first argument is an opening parenthesis, choose a rule
    to determine the matching closing parenthesis (say, arg n), or
    possibly regard this first argument as a string.
  * In the case of a real opening parenthesis, arg n needs to be the
    last argument or be followed by -a or -o. Apply the test algorithm
    on arg 2 to n-1, and in case of -a or -o, also on arg n+2 to the
    last arg (possibly with early termination in the parsing).
  * Otherwise:
    Choose a rule to determine the operator. Note that the obtained
    expression associated with this operator is necessarily followed
    by a -a or -o binary primary (say, arg n). Evaluate the expression
    and apply the test algorithm on arg n+1 to the last arg (possibly
    with early termination in the parsing).
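
A rough sketch of the first rule above, plus one possible choice for
the last one, as a POSIX sh function. Everything here is hypothetical:
the names ptest and negate are made up, only the primaries -n, -z, -a,
-o, = and != are handled, and the rule used for more than four
arguments (a single string operand followed by -a or -o) is just an
assumption that ignores the usual precedence of -a over -o and does no
parenthesis matching. Status 2 stands for "unspecified or not handled".

```shell
# Hypothetical helper: invert a 0/1 status, pass anything else through.
negate() {
    case "$1" in
    0) return 1 ;;
    1) return 0 ;;
    *) return "$1" ;;
    esac
}

# Hypothetical sketch of the POSIX argument-count rules for "test".
ptest() {
    case "$#" in
    0)  return 1 ;;
    1)  # True if the single argument is not the null string.
        case "$1" in '') return 1 ;; *) return 0 ;; esac ;;
    2)  case "$1" in
        '!') ptest "$2"; negate "$?"; return ;;
        -n)  ptest "$2"; return ;;
        -z)  ptest "$2"; negate "$?"; return ;;
        esac
        return 2 ;;
    3)  # A binary primary in $2 takes precedence over '!' and parens.
        case "$2" in
        -a)   ptest "$1" && ptest "$3"; return ;;
        -o)   ptest "$1" || ptest "$3"; return ;;
        '=')  [ "x$1" = "x$3" ]; return ;;
        '!=') [ "x$1" != "x$3" ]; return ;;
        esac
        if [ "x$1" = 'x!' ]; then
            ptest "$2" "$3"; negate "$?"; return
        fi
        if [ "x$1" = 'x(' ] && [ "x$3" = 'x)' ]; then
            ptest "$2"; return
        fi
        return 2 ;;
    4)  if [ "x$1" = 'x!' ]; then
            ptest "$2" "$3" "$4"; negate "$?"; return
        fi
        if [ "x$1" = 'x(' ] && [ "x$4" = 'x)' ]; then
            ptest "$2" "$3"; return
        fi
        return 2 ;;
    *)  # More than four arguments: unspecified by POSIX. Assumed rule:
        # one string operand, then -a or -o, then recurse on the rest.
        first=$1 op=$2
        shift 2
        case "$op" in
        -a) ptest "$first" && ptest "$@"; return ;;
        -o) ptest "$first" || ptest "$@"; return ;;
        esac
        return 2 ;;
    esac
}

# The case from this thread: a string, -a, then a four-argument group.
rc=0; ptest true -a '(' '!' -a ')' || rc=$?
echo "$rc"    # prints 1
```

With this choice the four arguments after the first -a are handled
exactly as in the standalone four-argument case, which is the
consistency argued for above.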

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)




Thread overview: 22+ messages
2024-03-21 10:07 behavior of test true -a \( ! -a \) Vincent Lefevre
2024-03-21 10:28 ` Peter Stephenson
2024-03-21 11:04   ` Vincent Lefevre
2024-03-21 11:29     ` Peter Stephenson
2024-03-21 12:18       ` Vincent Lefevre
2024-03-21 12:25         ` Peter Stephenson
2024-03-21 19:06           ` Bart Schaefer
2024-03-22  5:02             ` Bart Schaefer
2024-03-25 16:38       ` Peter Stephenson
2024-03-25 17:36         ` Bart Schaefer
2024-04-03 13:59           ` Vincent Lefevre
2024-03-21 17:39 ` Bart Schaefer
2024-03-23 21:48   ` Bart Schaefer
2024-03-23 22:20     ` Vincent Lefevre
2024-03-23 22:41       ` Bart Schaefer
2024-03-23 23:33         ` Lawrence Velázquez
2024-03-24  0:14           ` Bart Schaefer
2024-03-24  2:52             ` Lawrence Velázquez
2024-03-25 10:23         ` Vincent Lefevre
2024-03-25 15:21           ` Bart Schaefer
2024-03-25 17:33             ` Vincent Lefevre
2024-03-25 17:43               ` Bart Schaefer
