Thanks, Oliver, for your long and thoughtful response. I'm afraid I don't quite understand all of it, though. Let me try to explain how I've understood things, but in a way that I find easier to process, and do please correct me where I'm wrong.

The way I've understood it, is that, if $word contains the command line string for which completion is attempted, then each matcher should transform $word as follows:

* m:$lpat=$tpat -> ${word//$~lpat/$~tpat}
* b:$lpat=$tpat -> ${word/#$~lpat/$~tpat}
* l:|$lpat=$tpat -> ${word/#$~lpat/$~tpat}
* l:||$ranchor=$tpat -> ${word/#(#b)($~ranchor)/$~tpat$match[1]}
* l:$lanchor|$lpat=$tpat -> ${word//(#b)($~lanchor)$~lpat/$match[1]$~tpat}
* l:$lanchor||$ranchor=$tpat -> ${word//(#b)($~lanchor)($~ranchor)/$match[1]$~tpat$match[2]}
* e:$lpat=$tpat -> ${word/%$~lpat/$~tpat}
* r:$lpat|=$tpat -> ${word/%$~lpat/$~tpat}
* r:$lanchor||=$tpat -> ${word/%(#b)($~lanchor)/$match[1]$~tpat}
* r:$lpat|$ranchor=$tpat -> ${word//(#b)$~lpat($~ranchor)/$~tpat$match[1]}
* r:$lanchor||$ranchor=$tpat -> ${word//(#b)($~lanchor)($~ranchor)/$match[1]$~tpat$match[2]}

However, this leaves several transformations identical, which makes me believe I've misunderstood something. What did I miss?

On Sun, Sep 26, 2021 at 4:09 PM Oliver Kiddle wrote:
>
> Marlon Richert wrote:
> > How can I make a matcher that completes the right-most part (and only
> > the right-most part) of each subword? That is, given a target
> > completion 'abcDefGhi', how do I make a match specification that
> > completes inputs
>
> If you're trying to do camel-case matching, one option is:
>   'r:|[A-Z]=* r:|=*'
>
> The following was used by the original creator of matching control, it
> works and breaks for the same cases as above in your example:
>   'r:[^ A-Z0-9]||[ A-Z0-9]=* r:|=*'
>
> These allow extra characters at the beginning. So in your example, D
> and DG match the target. There are also oddities with consecutive runs
> of upper case characters, consider e.g.
> completion after ssh -o where
> there is, e.g. "TCPKeepAlive" as an option. TKA won't match but ideally
> would.
>
> With matching control, it is often easiest if you view it as converting
> what is on the command-line into a regular expression. I haven't probed
> the source code to get a precise view of how these are mapped. For my
> own purposes, I keep a list but don't trust it in all cases because I've
> found contradictory examples and tweaked it more than once, perhaps
> making it less accurate in the process. So with the caveat that this
> may contain errors, my current list is as follows:
>
> Note that the starting point is:
>   [cursor position] → .*
> Then:
>   'm:a=b' – a → b (* doesn't work on rhs)
>   'r:|b=*' – b → [^b]*b
>   'r:a|b=*' – ab → [^b]*a?b
>   'r:a|b=c' – ab → cb
>   'l:a|=*' – a → [^a]*a
>   'l:a|b=*' – ab → [^a]*ab?
>   'l:a|b=c' – ab → ac
>   'b:a=*' – ^a → .*
>   'b:a=c' – ^a → ^c
>   'e:a=*' – a$ → .*
>   'r:a||b=*' – b → [^a]*ab (only * works on rhs, empty a or b has no use)
>   'l:a||b=*' – ^a → a.* (only * on rhs, empty a no use, b ignored?!)
>
> Something like [A-Z] becomes its concrete form from the command-line in the regex.
> For correspondence classes, the corresponding form goes in the regex; they only work with m:/M: forms.
> ** is like * but with .* instead of [^x]*
>
> In all cases, the original unchanged form also passes - a matching
> control does not have to be used. I've excluded those in the regular
> expressions above. But including them, note the following potentially
> useful effects with an empty lpat:
>
>   'r:|b=c' – b → c?b
>   'l:a|=c' – a → ac?
>
> When composing multiple matching controls, it doesn't try to apply over
> the results of the previous. You can consider it an alternation of the
> effect of each matching control.
>
> So 'r:a|b=* l:a|b=*' would be: ab → (ab|[^b]*a?b|[^a]*ab?)
>
> For the most part there are certain common forms and if you stick to
> those, you find fewer bugs than when being creative.
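
A minimal sketch of the alternation reading above, written in Python only because the regular expressions can be checked there without a completion widget. It models the regex view from the quoted list, not zsh's actual matching code. The three fragments are copied from the list for the command-line text "ab"; the names `combined_pattern`, `matches` and `PATTERN` are hypothetical, and the trailing `.*` assumes the cursor is at the end of the word.

```python
import re

# Fragments copied from the quoted list for the command-line text "ab".
UNCHANGED = r"ab"        # no matching control used
R_FORM = r"[^b]*a?b"     # 'r:a|b=*'
L_FORM = r"[^a]*ab?"     # 'l:a|b=*'


def combined_pattern(*alternatives):
    """Alternate the fragments; the trailing .* models the cursor at the end."""
    return re.compile("(?:" + "|".join(alternatives) + ").*")


def matches(pattern, candidate):
    """True if the whole candidate fits the modelled pattern."""
    return pattern.fullmatch(candidate) is not None


PATTERN = combined_pattern(UNCHANGED, R_FORM, L_FORM)

if __name__ == "__main__":
    for candidate in ["abc", "xxab", "cd"]:
        print(candidate, matches(PATTERN, candidate))
```

Under this model "abc" passes through the unchanged form and "xxab" through the r: form, while "cd" fits none of the alternatives. Whether zsh agrees in every case is exactly what the caveat in the quoted text leaves open.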
>
> The || forms seem buggy to me. From the documentation, my assumption
> would be that one means a[^a]*b and the other a[^b]*b
> That could be more helpful for camel-case but I would need to generate
> tests to say for sure.
> b seems to even be ignored for the l form.
>
> > Additionally, the following are unclear to me from the manual:
> > * What is the exact difference between l:lanchor||ranchor=tpat and
> > r:lanchor||ranchor=tpat ?
>
> From the documentation and assuming some actual symmetry I would assume
> the difference to be that lanchor needs to match the completion
> candidate but not the command-line, while a tpat of * will not match
> ranchor – swap l and r anchors for l and r forms in the description.
> If that's what it did do, it might possibly bring us closer to a good
> solution for camel-case matching.
>
> But as the regex above indicates, that isn't the case. I don't really
> see the logic of the l:lanchor||ranchor=tpat seeming to be anchored to
> the beginning. I think those forms came about as an attempt to get
> camel-case to work.
>
> > * Why do the examples in the manual add r:|=* to the end of each
> > matcher? This appears to make no difference at all.
>
> For the case where the cursor is in the middle rather than the end. For
> the example from the manual with Usenet group names like
> comp.sources.unix, try c.s.u with the cursor after the s.
>
> There are three components. Two have a dot anchor at the end. The final
> has an end-of-string anchor.
>
> > * It appears that the order of "match descriptions" in a matcher
> > matters, but it is unclear to me in what way and it isn't mentioned in
> > the manual. For example, the pairs of matchers below differ only in
> > the order of their match descriptions, yet each produces a different
> > behavior. How are the match descriptions inside a matcher evaluated
> > and what causes the difference between these?
>
> Order shouldn't really matter (apart from the x: matcher).
>
> As I mentioned earlier, you can consider it as being the alternation of all
> of them - at every point in the command-line where one of them can do
> something. So a single match may rely on more than one matching control
> to be matched. I can imagine that order might matter where you have mixed
> up anchors. An example would be interesting.
>
> > * 'r:|[[:punct:]]=** l:?|=[[:punct:]]' completes 'cd a/b' to 'cd
> > a/bc', but 'l:?|=[[:punct:]] r:|[[:punct:]]=**' does not.
>
> In my testing, neither does. Where is the cursor? You can think of the
> matching as adding .* at the cursor position so a/b completes to a/bc
> with no matching control if the cursor is at the end. The lack of other
> candidate completions can also confuse testing of this because with
> prefix completion, a/bc can be the only unambiguous match. Are you sure
> you don't have other customisations that are allowing the first case to
> match?
>
> The l: pattern allows punctuation after any character so a/b becomes the
> pattern a(|[[:punct:]])/(|[[:punct:]])b(|[[:punct:]])
>
> The r: pattern allows anything before the punctuation so a/b becomes the
> pattern a*/b
>
> > * Given two target completions 'a-b' and 'a_b', both 'l:?|=[-_]
> > m:{-}={_}' and 'm:{-}={_} l:?|=[-_]' will insert 'a-b' as the
> > unambiguous substring on the first try, but on the second try, only
> > the former will then list both completions, whereas the latter will
> > complete only 'a-b'.
>
> I'm not sure I follow what you mean by the first and second try. If you
> mean a second press of the completion key, matching is done completely
> anew with the new command-line contents.
>
> With just compadd -M 'l:?|=[_-]' - a-b a_b
> ab offers both candidates as matches.
> Adding 'm:-=_' in just means that completion after a-b will also match
> a_b
> Single element correspondence classes are pointless by the way.
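
The claim that "ab" offers both a-b and a_b under 'l:?|=[_-]' can be sketched with the same regex reading. This is Python used purely as a regex checker and a model of the description above, not of zsh itself. The helper names `l_any_char_allows` and `surviving` are hypothetical, and the cursor is assumed to be at the end of the word.

```python
import re


def l_any_char_allows(word, extra_class):
    """Model 'l:?|=CLASS': after each typed character, allow one optional
    character from CLASS, mirroring a(|CLASS)b(|CLASS) in the quoted text."""
    parts = [re.escape(ch) + "(?:" + extra_class + ")?" for ch in word]
    # Trailing .* models the cursor sitting at the end of the word.
    return re.compile("".join(parts) + ".*")


def surviving(word, candidates, extra_class="[-_]"):
    """Return the candidates that the modelled pattern accepts, in order."""
    pattern = l_any_char_allows(word, extra_class)
    return [c for c in candidates if pattern.fullmatch(c)]


if __name__ == "__main__":
    print(surviving("ab", ["a-b", "a_b", "a.b"]))
```

In this model both a-b and a_b survive for the input "ab", while a.b is rejected because "." is not in the allowed class.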
>
> Especially with the uppercase forms (L: etc) it is easy to create
> situations where an unambiguous substring is inserted and the set of
> candidate matches is quite different with the new command-line contents.
> The effect can be somewhat jarring and has the appearance of a bug.
>
> Oliver