Date: Mon, 25 Mar 2024 16:38:34 +0000 (GMT)
From: Peter Stephenson
To: zsh-workers@zsh.org
Message-ID: <1255066524.6153675.1711384714319@mail.virginmedia.com>
In-Reply-To: <1507569659.5899391.1711020579178@mail.virginmedia.com>
References: <20240321100710.GA164665@qaa.vinc17.org>
 <1443395979.5911218.1711016896863@mail.virginmedia.com>
 <20240321110444.GC164665@qaa.vinc17.org>
 <1507569659.5899391.1711020579178@mail.virginmedia.com>
Subject: Re: behavior of test true -a \( !
 -a \)
X-Seq: 52813

> On 21/03/2024 11:29 GMT Peter Stephenson wrote:
> > On 21/03/2024 11:04 GMT Vincent Lefevre wrote:
> > On 2024-03-21 10:28:16 +0000, Peter Stephenson wrote:
> > > I haven't had time to go through this completely but I think somewhere
> > > near the root of the issue is this chunk in par_cond_2(), encountered at
> > > the point we get to the "!":
> > >
> > >     if (tok == BANG) {
> > >         /*
> > >          * In "test" compatibility mode, "! -a ..." and "! -o ..."
> > >          * are treated as "[string] [and] ..." and "[string] [or] ...".
> > >          */
> > >         if (!(n_testargs > 2 && (check_cond(*testargs, "a") ||
> > >                                  check_cond(*testargs, "o"))))
> > >         {
> > >             condlex();
> > >             ecadd(WCB_COND(COND_NOT, 0));
> > >             return par_cond_2();
> > >         }
> > >     }
> > >
> > > in which case it needs yet more logic to decide why we shouldn't treat !
> > > -a as a string followed by a logical "and" in this case.  To be clear,
> > > obviously *I* can see why you want that, the question is teaching the
> > > code without confusing it further.
> >
> > Perhaps follow the coreutils logic. What matters is that if there is
> > a "(" argument, it tries to look at a matching ")" argument among the
> > following 3 arguments.
> > So, for instance, if it can see
> >
> >   ( arg2 arg3 )
> >
> > (possibly with other arguments after the closing parenthesis[*]), it
> > will apply the POSIX test on 4 arguments.
> >
> > [*] which can make sense if the 5th argument is -a or -o.
>
> I suppose as long as we only look for ")" when we know there's one to
> match we can probably get away with it without being too clever.  If
> there's a ")" that logically needs to be treated as a string following a
> "(" we're stuck but I think that's fair game.
>
> Something simple like: if we find a (, look for a matching ), so blindly
> count intervening ('s and )'s regardless of where they occur, and then
> NULL out the matching ) temporarily until we've parsed the expression
> inside.  If we don't find a matching one treat the ( as a string.

This implements the above.  I don't think any of the subsequent
discussion has any impact on the effectiveness of this.

Because this is only done when we've already identified a "(" as
starting grouping, looking for a ")" is benign --- failing to find it
would mean the pattern was invalid.  As Vincent already pointed out,
assuming the pattern is valid is a sensible strategy.  The remaining
question is which ")" to match if there are multiple.

I've added a check that "test \( = \)" returns 1.  This isn't affected
because, as I said before, three arguments are treated specially.
Possibly some more tests for non-pathological cases where parentheses do
grouping in test compatibility mode would be sensible.

There is inevitably a trade-off: "test \( \) = \) \)" used to "work"
(test that the two inner closing parentheses were the same string) but
now doesn't (the first closing parenthesis ends the group started by the
first opening parenthesis).  That strikes me as OK as we're making the
less pathological case (the one where parentheses mean just one thing)
work.  However, it is a sign we're right on the edge of sanity, so I'm
not proposing to "fix" this any further.
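
For illustration only, here is a minimal standalone sketch of the
matching rule described above; it is not the code in the patch.
find_group_end is a hypothetical name, and the array stands in for the
arguments that remain after the opening parenthesis has been consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of zsh: given the NULL-terminated
 * arguments that follow an opening "(", return the index of the ")"
 * that balances it, or -1 if there is none.  Every "(" and ")" is
 * counted, whether or not it would later be parsed as a plain string.
 */
static int
find_group_end(char *const *args)
{
    int depth = 1;  /* the "(" that has already been consumed */
    int i;

    assert(args != NULL);
    for (i = 0; args[i] != NULL; i++) {
        if (strcmp(args[i], ")") == 0) {
            if (--depth == 0)
                return i;   /* first ")" that balances the group */
        } else if (strcmp(args[i], "(") == 0) {
            depth++;        /* nested group: needs its own ")" */
        }
    }
    return -1;              /* unbalanced: caller treats "(" as a string */
}
```

With the remaining arguments of "test \( \) = \) \)", that is ")" "="
")" ")", the sketch returns index 0, so the group is empty, which is the
trade-off described above.  With "!" "-a" ")" it returns 2, leaving
"! -a" as the two arguments inside the group.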

Feel free to argue that the current behaviour of simply parsing
parentheses in order and blindly trusting the result is actually a
better bet, though frankly I can't see how myself.  There is an
alternative strategy, which is to assume the rightmost closing
parenthesis ends the outermost group.

Also feel free to come up with other pathologies.

pws

diff --git a/Src/parse.c b/Src/parse.c
index 3343656..1505b49 100644
--- a/Src/parse.c
+++ b/Src/parse.c
@@ -2528,10 +2528,39 @@ par_cond_2(void)
     if (tok == INPAR) {
         int r;
 
+        /*
+         * Owing to ambiguities in "test" compatibility mode, it's
+         * safest to assume the INPAR has a corresponding OUTPAR
+         * before trying to guess what intervening strings mean.
+         */
+        char **endargptr = NULL, *endarg = NULL;
+        if (condlex == testlex) {
+            char **argptr;
+            int n_inpar = 1;
+
+            for (argptr = testargs; *argptr; argptr++) {
+                if (!strcmp(*argptr, ")")) {
+                    if (!--n_inpar) {
+                        endargptr = argptr;
+                        endarg = *argptr;
+                        *argptr = NULL;
+                        break;
+                    }
+                } else if (!strcmp(*argptr, "(")) {
+                    ++n_inpar;
+                }
+            }
+        }
+
         condlex();
         while (COND_SEP())
             condlex();
         r = par_cond();
+        if (endargptr) {
+            *endargptr = endarg;
+            if (testargs == endargptr)
+                condlex();
+        }
         while (COND_SEP())
             condlex();
         if (tok != OUTPAR)
diff --git a/Test/C02cond.ztst b/Test/C02cond.ztst
index daea5b4..453fa1c 100644
--- a/Test/C02cond.ztst
+++ b/Test/C02cond.ztst
@@ -442,6 +442,14 @@ F:scenario if you encounter it.
 >in conjunction: 3
 ?(eval):6: no such option: invalidoption
 
+  test \( = \)
+1: test compatibility mode doesn't do grouping with three arguments
+
+# This becomes [[ -n true && ( -n -a ) ]]
+# The test is to ensure the ! -a is analysed as two arguments.
+  test true -a \( ! -a \)
+1: test compatibility mode is based on arguments inside parentheses
+
 %clean
 # This works around a bug in rm -f in some versions of Cygwin
 chmod 644 unmodish
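
For comparison, a naive sketch of the alternative strategy mentioned
before the patch, in which the rightmost ")" is taken to close the
outermost group.  The patch does not implement this;
find_rightmost_close is a hypothetical name, and this is only one
possible reading of that strategy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of zsh: given the NULL-terminated
 * arguments that follow an opening "(", return the index of the last
 * ")" among them, or -1 if there is none.  No nesting is tracked.
 */
static int
find_rightmost_close(char *const *args)
{
    int last = -1;
    int i;

    assert(args != NULL);
    for (i = 0; args[i] != NULL; i++) {
        if (strcmp(args[i], ")") == 0)
            last = i;   /* remember the most recent ")" seen */
    }
    return last;
}
```

For the remaining arguments ")" "=" ")" ")" this returns 3, so ") = )"
is left inside the group as a three-argument string comparison.  For
"a" ")" "-a" "(" "b" ")" it returns 5, which would pair the first "("
with the last ")" and so merge two adjacent groups; a real
implementation would presumably need more care than this.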