From: Marlon Richert
Date: Sun, 10 Oct 2021 23:14:22 +0300
Subject: Re: Questions about completion matchers
To: Oliver Kiddle
Cc: Zsh Users
X-Seq: 27228

I have to say, after having processed both of your explanations, it
appears that r:lanchor||ranchor=tpat and l:lanchor||ranchor=tpat are not
working as intended. It intuitively feels like they should cover this
very common case: If lanchor and ranchor are present and adjacent in the
command line string, then apply m:=tpat to the empty string between
them.
That is to say: Enable completion between lanchor and ranchor, just like
we can enable completion to the left or right of an anchor. In terms of
syntax, this treats the void between || as an empty lpat, just like it
is in :|ranchor= or :lanchor|=. The || form (and indeed, the | form) is
essentially a conditional version of one of the other matchers.

This actually extrapolates to a consistent interpretation of the symbols
in the matching syntax:

* lpat is always the substring whose meaning is "transformed": That is
  to say, it (and only it) is made to be considered equal to any trial
  substring matching tpat. It is permitted for lpat to be equal to the
  empty string or the beginning/end of the command line string.
* Each |ranchor or lanchor| adds a constraint: A substring matching them
  needs to be directly to the right or left of lpat -- or lpat's meaning
  won't be "transformed". The meaning of the anchors themselves is never
  "transformed": Any substring matching the anchor on the command line
  needs to be matched literally in the trial string.
* For the first anchor in a matcher, the substring matching lpat will
  not be considered equivalent to a trial substring that matches the
  anchor. This clause is essentially there to prevent the matcher from
  becoming too "greedy".
* For the second anchor, there is no such restriction. (Otherwise, the
  matcher could easily become too constrained and unable to match any
  trial string at all.)

From this follows the meaning of each matcher:

* m:lpat=tpat - Treat each substring matching lpat on the command line
  as being equal to any substring matching tpat in the trial string.
* r:lpat|ranchor=** - The same as m:lpat=*, but only if the substring
  matching lpat has directly to its right a substring matching ranchor.
* r:lpat|ranchor=tpat - The same as m:lpat=tpat~ranchor, but only if the
  substring matching lpat has directly to its right a substring matching
  ranchor.
* r:lanchor||ranchor=tpat - The same as r:|ranchor=tpat, but only if the
  substring matching ranchor is immediately preceded by a substring
  matching lanchor.

One could even continue this pattern, as || is nothing more than |lpat|
with lpat equal to the empty string:

* r:lanchor|lpat|ranchor=tpat - The same as r:lpat|ranchor=tpat, but
  only if the substring matching lpat is immediately preceded by a
  substring matching lanchor.

However, in practice, the more constraints a matcher has, the more
likely it is to break consistency with this pattern. As a result, the ||
matchers no longer support the case for which they appear to have been
intended - to complete the missing substring between lanchor and ranchor
- which is now, unfortunately, a missing feature. I would hope the
implementation of the || matchers could be modified to restore this
feature -- which I assume must (or was intended to) have been there at
some point.

> On Sun, Sep 26, 2021 at 4:09 PM Oliver Kiddle wrote:
> >
> > With matching control, it is often easiest if you view it as converting
> > what is on the command-line into a regular expression. I haven't probed
> > the source code to get a precise view of how these are mapped. For my
> > own purposes, I keep a list but don't trust it in all cases because I've
> > found contradictory examples and tweaked it more than once, perhaps
> > making it less accurate in the process. So with the caveat that this
> > may contain errors, my current list is as follows:
> >
> > Note that the starting point is:
> >   [cursor position] → .*
> > Then:
> >   'm:a=b'    – a → b  (* doesn't work on rhs)
> >   'r:|b=*'   – b → [^b]*b
>
> The appearance of [^a] and [^b] in your patterns was a complete
> surprise to me. I would've expected * to work as * in a glob
> expression. This is not clear from the docs.
> Now that I know that the matcher syntax was based on regex, it makes
> more sense, but I still wouldn't have figured this out intuitively. A
> clearer explanation about this in the docs would be helpful. Yes, it's
> mentioned somewhere in the examples, but it should be explained more
> clearly earlier on.
>
> >   'r:a|b=*'  – ab → [^b]*a?b
>
> This one looks incorrect to me as it does not match the example in the
> docs. From that example, it appears to me that it is supposed to work
> like this:
>   'r:a|b=*'  – b → [^b]*ab
>
> >   'r:a|b=c'  – ab → cb
> >   'l:a|=*'   – a → [^a]*a
> >   'l:a|b=*'  – ab → [^a]*ab?
> Shouldn't these last two result in a[^a]* and ab[^a]*, respectively,
> since the anchor goes to the left?
>
> >   'l:a|b=c'  – ab → ac
> >   'b:a=*'    – ^a → .*
>
> Oh, but here * does work like a * glob? So, I guess * behaves
> differently only when anchors are involved?
>
> >   'b:a=c'    – ^a → ^c
> >   'e:a=*'    – a$ → .*
> >   'r:a||b=*' – b → [^a]*ab  (only * works on rhs, empty a or b has no use)
> >   'l:a||b=*' – ^a → a.*  (only * on rhs, empty a no use, b ignored?!)
>
> The comments on the last two items sound like bugs to me. Also,
> 'l:a||b=*' should work on just 'a' and not require '^a'.
>
>
> On Sun, Oct 10, 2021 at 12:59 AM Oliver Kiddle wrote:
> >
> > The difference between b: and l: with an empty anchor (or e/r) is not
> > encapsulated by my regular expressions. They only differ in how strict
> > the anchoring to the start of the match is where another matching
> > control allowed extra characters to be inserted at the beginning.
>
> So, does that mean then that matchers are not evaluated strictly
> left-to-right?
>
> > The example given when this was added was zsh option completion where
> > underscores are ignored and a prefix of NO is allowed.
>
> About that example, what exactly is the difference between L: and B:
> that lets B: complete '_NO_f' to '_NO_foo' and 'NONO_f' to 'NONO_f'
> but not L:? It's not clear from the example, let alone from the
> description of the matchers.
>
> > I took a look at the source code and dug out original -workers posts and
> > it does seem that the intention for the two anchor || forms was as I
> > thought. Even as designed I don't think either is ideal for camel case -
> > the l: form excludes characters from the wrong anchor for that.
> > The matching code looks a lot like regular expression matching with a
> > backtracking algorithm.
>
> Y02compmatch.ztst contains a lot of examples that could be added to
> the docs to better explain how the different matchers are intended to
> be used. It would help to better understand their workings.
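
For anyone who wants to experiment with the regex view quoted above,
here is a minimal Python sketch of the 'r:|b=*' entry (b → [^b]*b) plus
the cursor rule ([cursor position] → .*). It is not zsh code and says
nothing about how zsh actually implements matching; it only models the
informal list, which its author already flagged as possibly inaccurate.
The names typed_to_regex and matches are made up for this illustration,
and treating '**' as a plain .* in front of the anchor is my assumption.

```python
import re


def typed_to_regex(typed: str, anchors: str, double_star: bool = False) -> str:
    """Model 'r:|b=*' (or 'r:|b=**') for every character b in `anchors`.

    Assumption: this follows the informal list quoted in the thread,
    not zsh's real matching code.
    """
    parts = []
    for ch in typed:
        lit = re.escape(ch)
        if ch in anchors:
            # '*'  : b -> [^b]*b  (the inserted text may not contain b)
            # '**' : b -> .*b     (assumed: the inserted text may contain b)
            filler = ".*" if double_star else f"[^{lit}]*"
            parts.append(filler + lit)
        else:
            parts.append(lit)
    # The cursor is assumed to sit at the end of the typed word.
    parts.append(".*")
    return "".join(parts)


def matches(typed: str, anchors: str, trial: str, double_star: bool = False) -> bool:
    """Return True if the trial completion fits the modelled regex."""
    pattern = typed_to_regex(typed, anchors, double_star)
    return re.fullmatch(pattern, trial) is not None
```

Under this model, typing 'f.b' with '.' as the anchor accepts the trial
string 'foo.bar' but rejects 'foo.x.bar', because [^.]* cannot cross the
extra dot. Calling matches() with double_star=True accepts both, which
is the behaviour I would have expected from a glob-style *.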