From: Oliver Kiddle
To: Marlon Richert
cc: Zsh Users
Subject: Re: Questions about completion matchers
Date: Sun, 26 Sep 2021 15:09:13 +0200
Message-ID: <20296-1632661753.678317@ipjb.25sX.Whnd>
X-Seq: 27137

Marlon Richert wrote:
> How can I make a matcher that completes the right-most part (and only
> the right-most part) of each subword? That is, given a target
> completion 'abcDefGhi', how do I make a match specification that
> completes inputs

If you're trying to do camel-case matching, one option is:

  'r:|[A-Z]=* r:|=*'

The following was used by the original creator of matching control. It
works and breaks for the same cases as the one above in your example:

  'r:[^ A-Z0-9]||[ A-Z0-9]=* r:|=*'

These allow extra characters at the beginning. So in your example, D and
DG match the target. There are also oddities with consecutive runs of
upper case characters. Consider, e.g., completion after ssh -o, where
"TCPKeepAlive" is one of the options: TKA won't match but ideally would.

With matching control, it is often easiest to view it as converting
what is on the command-line into a regular expression. I haven't probed
the source code to get a precise view of how these are mapped. For my
own purposes, I keep a list but don't trust it in all cases because I've
found contradictory examples and tweaked it more than once, perhaps
making it less accurate in the process. So with the caveat that this may
contain errors, my current list is as follows.

Note that the starting point is:

  [cursor position] → .*

Then:

  'm:a=b'     – a  → b          (* doesn't work on rhs)
  'r:|b=*'    – b  → [^b]*b
  'r:a|b=*'   – ab → [^b]*a?b
  'r:a|b=c'   – ab → cb
  'l:a|=*'    – a  → [^a]*a
  'l:a|b=*'   – ab → [^a]*ab?
  'l:a|b=c'   – ab → ac
  'b:a=*'     – ^a → .*
  'b:a=c'     – ^a → ^c
  'e:a=*'     – a$ → .*
  'r:a||b=*'  – b  → [^a]*ab    (only * works on rhs, empty a or b has no use)
  'l:a||b=*'  – ^a → a.*        (only * on rhs, empty a no use, b ignored?!)

Something like [A-Z] becomes its concrete form from the command-line in
the regex.
For correspondence classes, the corresponding form goes in the regex.
They only work with the m:/M: forms.
** is like * but with .* instead of [^x]*

In all cases, the original unchanged form also passes: a matching
control does not have to be used. I've excluded those in the regular
expressions above. But including them, note the following potentially
useful effects with an empty lpat:

  'r:|b=c' – b → c?b
  'l:a|=c' – a → ac?

When composing multiple matching controls, each one is not applied over
the results of the previous one. You can consider it an alternation of
the effect of each matching control. So 'r:a|b=* l:a|b=*' would be:

  ab → (ab|[^b]*a?b|[^a]*ab?)

For the most part there are certain common forms and if you stick to
those, you find fewer bugs than when being creative.

The || forms seem buggy to me. From the documentation, my assumption
would be that one means a[^a]*b and the other a[^b]*b. That could be
more helpful for camel-case but I would need to generate tests to say
for sure. b even seems to be ignored for the l form.

> Additionally, the following are unclear to me from the manual:
> * What is the exact difference between l:lanchor||ranchor=tpat and
> r:lanchor||ranchor=tpat ?

From the documentation and assuming some actual symmetry, I would assume
the difference to be that lanchor needs to match the completion
candidate but not the command-line, while a tpat of * will not match
ranchor – swap l and r anchors for l and r forms in the description. If
that's what it did do, it might possibly bring us closer to a good
solution for camel-case matching. But as the regex above indicates, that
isn't the case.
I don't really see the logic of the l:lanchor||ranchor=tpat form seeming
to be anchored to the beginning. I think those forms came about as an
attempt to get camel-case to work.

> * Why do the examples in the manual add r:|=* to the end of each
> matcher? This appears to make no difference at all.

It is for the case where the cursor is in the middle rather than at the
end. For the example from the manual with Usenet group names like
comp.sources.unix, try c.s.u with the cursor after the s. There are
three components. Two have a dot anchor at the end. The final one has an
end-of-string anchor.

> * It appears that the order of "match descriptions" in a matchers
> matters, but it is unclear to me in what way and it isn't mentioned in
> the manual. For example, the pairs of matchers below differ only in
> the order of their match descriptions, yet each produces a different
> behavior. How are the match descriptions inside a matcher evaluated
> and what causes the difference between these?

Order shouldn't really matter (apart from the x: matcher). As I
mentioned earlier, you can consider it as being the alternation of all
of them, at every point in the command-line where one of them can do
something. So a single match may rely on more than one matching control
to be matched. I can imagine that order might matter where you have
mixed up anchors. An example would be interesting.

> * 'r:|[[:punct:]]=** l:?|=[[:punct:]]' completes 'cd a/b' to 'cd
> a/bc', but 'l:?|=[[:punct:]] r:|[[:punct:]]=**' does not.

In my testing, neither does. Where is the cursor? You can think of the
matching as adding .* at the cursor position, so a/b completes to a/bc
with no matching control if the cursor is at the end. The lack of other
candidate completions can also confuse testing of this because with
prefix completion, a/bc can be the only unambiguous match. Are you sure
you don't have other customisations that are allowing the first case to
match?
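
As a rough illustration of the ".* at the cursor position" view, here is
a small Python sketch. It is only a model of the regex analogy, not how
zsh implements matching. The names typed_to_regex and matches are made
up for this example, the cursor is assumed to be at the end of the word,
and Python's string.punctuation stands in for [[:punct:]]. The
punct_matcher flag models only 'r:|[[:punct:]]=**', using the rule from
my list that ** puts .* in front of the anchor.

```python
import re
import string


def typed_to_regex(typed, punct_matcher=False):
    """Turn the word on the command-line into a regex (cursor at the end).

    Hypothetical helper: a simplified model, not zsh's actual algorithm.
    """
    parts = []
    for ch in typed:
        if punct_matcher and ch in string.punctuation:
            # Model of 'r:|[[:punct:]]=**': anything may precede the anchor.
            parts.append(".*")
        parts.append(re.escape(ch))
    parts.append(".*")  # the cursor position itself becomes .*
    return "".join(parts)


def matches(typed, candidate, punct_matcher=False):
    """Return True if the candidate fits the modelled pattern."""
    pattern = typed_to_regex(typed, punct_matcher)
    return re.fullmatch(pattern, candidate) is not None


# Plain prefix completion: a/b already matches a/bc with no matcher.
print(matches("a/b", "a/bc"))
# A candidate with extra characters before the slash needs the matcher.
print(matches("a/b", "abc/bcd"))
print(matches("a/b", "abc/bcd", punct_matcher=True))
```

In this model the first candidate matches with or without the matcher,
which is why a/bc alone is a poor test case for telling two matchers
apart.
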

The l: pattern allows punctuation after any character, so a/b becomes
the pattern a(|[[:punct:]])/(|[[:punct:]])b(|[[:punct:]])
The r: pattern allows anything before the punctuation, so a/b becomes
the pattern a*/b

> * Given two target completions 'a-b' and 'a_b', both 'l:?|=[-_]
> m:{-}={_}' and 'm:{-}={_} l:?|=[-_]' will insert 'a-b' as the
> unambiguous substring on the first try, but on the second try, only
> the former will then list both completions, whereas the latter will
> complete only 'a-b'.

I'm not sure I follow what you mean by the first and second try. If you
mean a second press of Tab, matching is done completely anew with the
new command-line contents.

With just
  compadd -M 'l:?|=[_-]' - a-b a_b
ab offers both candidates as matches. Adding 'm:-=_' in just means that
completion after a-b will also match a_b. Single element correspondence
classes are pointless, by the way.

Especially with the uppercase forms (L: etc) it is easy to create
situations where an unambiguous substring is inserted and the set of
candidate matches is quite different with the new command-line contents.
The effect can be somewhat jarring and has the appearance of a bug.

Oliver