zsh-workers
* [RFC] Add xfail tests for || form of completion matchers
@ 2021-10-11 14:34 Marlon Richert
2021-10-12 12:08  Marlon Richert
From: Marlon Richert @ 2021-10-11 14:34 UTC (permalink / raw)
To: Zsh hackers list; +Cc: Oliver Kiddle, Bart Schaefer

[-- Attachment #1: Type: text/plain, Size: 170 bytes --]

The tests show how :||= matchers should behave in order to provide
completion features that cannot be implemented with :|= matchers.

This is a follow-up to users/27228.
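
For readers unfamiliar with the two forms, the difference can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `_tst_single` and `_tst_double` and the shortened word list are hypothetical, the single-anchor spec is assumed to be `r:|.=* r:|=*`, and the double-anchor spec is the `example4b_matcher` from the patch. The behaviour noted for the `||` form is what the expected-failure tests ask for, not necessarily what zsh does today.

```shell
# Illustrative sketch only: function names and word list are hypothetical.

# Single-anchor form: the (possibly empty) text before each '.' on the
# command line may be completed, so ".s.u" can reach comp.sources.unix.
_tst_single() {
  compadd -M 'r:|.=* r:|=*' - comp.sources.unix comp.sources.misc
}

# Double-anchor form, as the xfail tests expect it to behave: a '.' only
# counts when a non-dot character stands directly to its left, so ".s.u"
# should stay untouched while "c.s.u" should still be completed.
_tst_double() {
  compadd -M 'r:[^.]||.=* r:|=*' - comp.sources.unix comp.sources.misc
}
```

In an interactive zsh with the completion system loaded, either function could be attached to a command with something like `compdef _tst_double tst`.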

[-- Type: text/plain, Size: 5113 bytes --]

From 8640592169e90c89fd879baf39274f4a6a5822ee Mon Sep 17 00:00:00 2001
From: Marlon Richert <marlon.richert@gmail.com>
Date: Mon, 11 Oct 2021 17:30:07 +0300
Subject: [PATCH] Add xfail tests for || form of completion matchers

The tests show how :||= matchers should behave in order to provide
completion features that cannot be implemented with :|= matchers.
---
Test/Y02compmatch.ztst | 108 +++++++++++++++++++++++++++++++++++++----
1 file changed, 98 insertions(+), 10 deletions(-)

diff --git a/Test/Y02compmatch.ztst b/Test/Y02compmatch.ztst
index 621707482..ee7e422c1 100644
--- a/Test/Y02compmatch.ztst
+++ b/Test/Y02compmatch.ztst
@@ -378,15 +378,26 @@
comp.graphics.rendering.misc comp.graphics.rendering.raytracing
comp.graphics.rendering.renderman)
test_code $example4_matcher example4_list
- comptest $'tst c.s.u\t'
-0:Documentation example using input c.s.u
+ comptest $'tst .s.u\t'
+0:Documentation example using input .s.u
+>line: {tst comp.sources.unix }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{21}
+
+ example4b_matcher='r:[^.]||.=* r:|=*'
+ test_code $example4b_matcher example4_list
+ comptest $'tst .s.u\t^[bc\t'
+0f:Documentation example using input .s.u but with double anchor
+>line: {tst .s.u}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
>line: {tst comp.sources.unix }{}
>COMPADD:{}
>INSERT_POSITIONS:{21}

test_code $example4_matcher example4_list
- comptest $'tst c.g.\ta\t.\tp\ta\tg\t'
-0:Documentation example using input c.g.\ta\t.\tp\ta\tg\t
+ comptest $'tst .g.\ta\t.\tp\ta\tg\t'
+0:Documentation example using input .g.\ta\t.\tp\ta\tg\t
>line: {tst comp.graphics.}{}
>INSERT_POSITIONS:{18}
@@ -424,9 +435,32 @@
>INSERT_POSITIONS:{32}

+ test_code $example4b_matcher example4_list
+ comptest $'tst .g.\t^[bc\t'
+0f:Documentation example using input .g. with double anchor
+>line: {tst .g.}{}
+>INSERT_POSITIONS:{}
+>line: {tst comp.graphics.}{}
+>INSERT_POSITIONS:{18}
+
test_code $example4_matcher example4_list
- comptest $'tst c...pag\t'
-0:Documentation example using input c...pag\t
+ comptest $'tst ...pag\t'
+0:Documentation example using input ...pag
+>line: {tst comp.graphics.apps.pagemaker }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{32}
+
+ test_code $example4b_matcher example4_list
+ comptest $'tst ...pag\t^[bc\t^Fg^F^Fa\t'
+0f:Documentation example using input ...pag with double anchor
+>line: {tst .g.}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst c...pag}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
>line: {tst comp.graphics.apps.pagemaker }{}
>COMPADD:{}
>INSERT_POSITIONS:{32}
@@ -444,8 +478,8 @@
example5_matcher='r:|[.,_-]=* r:|=*'
example5_list=(veryverylongfile.c veryverylongheader.h)
test_code $example5_matcher example5_list
- comptest $'tst v.c\tv.h\t'
-0:Documentation example using input v.c\t
+ comptest $'tst  .c\t.h\t'
+0:Documentation example using input .c
>line: {tst  veryverylongfile.c }{}
>INSERT_POSITIONS:{23}
@@ -453,6 +487,23 @@
>INSERT_POSITIONS:{44}

+ example5b_matcher='r:[^.,_-]||[.,_-]=* r:|=*'
+ test_code $example5b_matcher example5_list
+ comptest $'tst  .c\t^[bv\t.h\t^[bv'
+0f:Documentation example using input .c but with double anchor
+>line: {tst  .c}{}
+>INSERT_POSITIONS:{}
+>line: {tst  veryverylongfile.c }{}
+>INSERT_POSITIONS:{23}
+>line: {tst  veryverylongfile.c .h}{}
+>INSERT_POSITIONS:{}
+>INSERT_POSITIONS:{44}
+

example6_list=(LikeTHIS FooHoo 5foo123 5bar234)
test_code 'r:|[A-Z0-9]=* r:|=*' example6_list
@@ -493,15 +544,52 @@
example7_matcher="r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
example7_list=($example6_list)
test_code $example7_matcher example7_list
- comptest $'tst H\t2\t'
-0:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+ comptest $'tst H\t^[bF\to2\t^[b5\tb\t'
+0f:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+>line: {tst H}{}
+>INSERT_POSITIONS:{}
+>line: {tst F}{H}
+>INSERT_POSITIONS:{}
>line: {tst FooHoo }{}
>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 2}{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo 5}{2}
+>INSERT_POSITIONS:{}
>line: {tst FooHoo 5bar234 }{}
>INSERT_POSITIONS:{18}

+ example7b_matcher="r:?||[A-Z0-9]=* r:|=*"
+ test_code $example7b_matcher example7_list
+ comptest $'tst H\t^[bF2\t^[b5\t'
+0f:Documentation example using "r:?||[A-Z0-9]=* r:|=*"
+>line: {tst H}{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo }{}
+>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 5bar234 }{}
+>INSERT_POSITIONS:{18}
+
+ example8_list=(passwd.byname)
+ test_code 'r:[^.]||.=* l:.||[^.]=*'
+ comptest $'tst .^B\tpass^Fname\t'
+0f:Symmetry between r and l
+>line: {tst }{.}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst passwd.byname }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{17}
+
workers_7311_matcher="m:{a-z}={A-Z} r:|[.,_-]=* r:|=*"
workers_7311_list=(Abc-Def-Ghij.txt Abc-def.ghi.jkl_mno.pqr.txt Abc_def_ghi_jkl_mno_pqr.txt)
--
2.33.0

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-11 14:34 [RFC] Add xfail tests for || form of completion matchers Marlon Richert
@ 2021-10-12 12:08  Marlon Richert
2021-10-12 15:25  Daniel Shahaf  (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Marlon Richert @ 2021-10-12 12:08 UTC (permalink / raw)
To: Zsh hackers list; +Cc: Oliver Kiddle, Bart Schaefer

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
>
> The tests show how :||= matchers should behave in order to provide
> completion features that cannot be implemented with :|= matchers.
>
> This is a follow-up to users/27228.

I've now added an accompanying documentation update to the patch.

[-- Attachment #2: 0001-Add-xfail-tests-for-form-of-completion-matchers.txt --]
[-- Type: text/plain, Size: 29640 bytes --]

From 3ec2fceced1f327eb2ac7484772bd1d3756bf8d2 Mon Sep 17 00:00:00 2001
From: Marlon Richert <marlon.richert@gmail.com>
Date: Tue, 12 Oct 2021 15:02:31 +0300
Subject: [PATCH] Add xfail tests for || form of completion matchers

The tests show how :||= matchers should behave in order to provide
completion features that cannot be implemented with :|= matchers.
---
Doc/Zsh/compwid.yo | 446 ++++++++++++++++++-----------------------
Test/Y02compmatch.ztst | 108 +++++++++-
2 files changed, 293 insertions(+), 261 deletions(-)

diff --git a/Doc/Zsh/compwid.yo b/Doc/Zsh/compwid.yo
index 3e86d3b42..5dd2127df 100644
--- a/Doc/Zsh/compwid.yo
+++ b/Doc/Zsh/compwid.yo
@@ -896,72 +896,210 @@ enditem()
texinode(Completion Matching Control)(Completion Widget Example)(Completion Condition Codes)(Completion Widgets)
sect(Completion Matching Control)

-It is possible by use of the
-tt(-M) option of the tt(compadd) builtin command to specify how the
-characters in the string to be completed (referred to here as the
-command line) map onto the characters in the list of matches produced by
-the completion code (referred to here as the trial completions). Note
-that this is not used if the command line contains a glob pattern and
-the tt(GLOB_COMPLETE) option is set or the tt(pattern_match) of the
-tt(compstate) special association is set to a non-empty string.
-
-The var(match-spec) given as the argument to the tt(-M) option (see
+By default, characters in the string to be completed (referred to here as the
+command line) map only onto identical characters in the list of matches
+produced by the completion code (referred to here as the trial completions) and
+missing characters are inserted only at the cursor position, if the shell
+option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
+otherwise. However, it is possible to modify this behavior by use of the
+tt(-M) option of the tt(compadd) builtin command. Note that this is not used
+if the command line contains a glob pattern and the shell
+option tt(GLOB_COMPLETE) is set or the tt(pattern_match) of the tt(compstate)
+special association is set to a non-empty string.
+
+The tt(-M) option (see
ifzman(Completion Builtin Commands' above)\
-ifnzman(noderef(Completion Builtin Commands))\
-) consists of one or more matching descriptions separated by
-whitespace.  Each description consists of a letter followed by a colon
-and then the patterns describing which character sequences on the line match
-which character sequences in the trial completion. Any sequence of
-characters not handled in this fashion must match exactly, as usual.
-
-The forms of var(match-spec) understood are as follows. In each case, the
-form with an upper case initial character retains the string already
-typed on the command line as the final result of completion, while with
-a lower case initial character the string on the command line is changed
-into the corresponding part of the trial completion.
+ifnzman(noderef(Completion Builtin
+Commands))\
+) requires a var(match-spec) as its argument, consisting of one or more matching
+descriptions separated by whitespace. Each description consists of a letter,
+followed by a colon, and then patterns describing which substrings on the
+command line map onto which substrings in the trial completion. Descriptions
+are evaluated from left to right and are cumulative. An earlier mapping can
+thus potentially change the outcome of a later mapping. Finally, any unmapped
+substrings will be mapped using the default mapping of identical substrings.
+
+When using the completion system (see
+ifzman(zmanref(zshcompsys))\
+ifnzman(noderef(Completion System))\
+), users can define match specifications that are to be used for specific
+contexts by using the tt(matcher) and tt(matcher-list) styles. The values for
+the latter will be used everywhere.
+
+Each pattern in a var(match-spec) is either an empty string or consists of a
+sequence of literal characters (which may be quoted with a backslash), question
+marks, character classes, and correspondence classes (see next paragraph).
+Ordinary shell patterns are not used. Literal characters match only
+themselves, question marks match any character, and character classes are
+formed as for globbing and match any character in the given set.
+
+Correspondence classes are defined like character classes, but with two
+differences: They are delimited by a pair of braces, and negated classes are
+not allowed, so the characters tt(!) and tt(^) have no special meaning directly
+after the opening brace. They indicate that a range of characters on the line
+match a range of characters in the trial completion, but (unlike ordinary
+character classes) paired according to the corresponding position in the
+sequence. More than one pair of classes can occur, in which case the first
+class before the tt(=) corresponds to the first after it, and so on. If one
+side has more such classes than the other side, the superfluous classes behave
+like normal character classes.
+
+The standard tt([:)var(name)tt(:])' forms described for standard shell
+patterns (see
+ifnzman(noderef(Filename Generation))\
+ifzman(the section
+FILENAME GENERATION in zmanref(zshexpn))\
+) may appear in correspondence classes as well as normal character classes.
+The only special behaviour in correspondence classes is if the form on the left
+and the form on the right are each one of tt([:upper:]), tt([:lower:]). In
+these cases the character in the word and the character on the line must be the
+same up to a difference in case. Although the matching system does not yet
+handle multibyte characters, this is likely to be a future extension, at which
+point this syntax will handle arbitrary alphabets; hence this form, rather than
+the use of explicit ranges, is the recommended form. In other cases
+tt([:)var(name)tt(:])' forms are allowed. If the two forms on the left and
+right are the same, the characters must match exactly. In remaining cases, the
+corresponding tests are applied to both characters, but they are not otherwise
+constrained; any matching character in one set goes with any matching character
+in the other set: this is equivalent to the behaviour of ordinary character
+classes.
+
+The forms of var(match-spec) understood are listed below. For each of these,
+the form with an upper case initial character replaces mapped substrings in the
+trial completions with their counterparts from the command line, whereas with a
+lower case initial character, once a trial completion has been accepted,
+matched substrings on the command line are replaced with their counterparts
+from the accepted completion.

startitem()
xitem(tt(m:)var(lpat)tt(=)var(tpat))
item(tt(M:)var(lpat)tt(=)var(tpat))(
-Here, var(lpat) is a pattern that matches on the command line,
-corresponding to var(tpat) which matches in the trial completion.
+Let any substring matching var(lpat) be completed to any substring matching
+var(tpat).
+
+Examples:
+
+tt(m:{[:lower:]}={[:upper:]}) lets any lower case character be completed to its
+uppercase counterpart.
+
+tt(M:_=) inserts every underscore on the command line into each trial
+completion, in the same relative position, determined by matching the
+substrings around it. Note that the definition of what is matching can be
+modified by applying other matchers first.
+
+If these two matchers are combined to tt('m:{[:lower:]}={[:upper:]} M:_='),
+then given a trial completion tt(NO)', it lets tt(_n_o_)' be completed to
+tt(_N_O_)', even though tt(_N_O_)' itself is not present as a trial
+completion. tt(m:{[:lower:]}={[:upper:]}) is evaluated first and makes tt(n)
+match tt(N)' and tt(o) match tt(O)', after which tt(M:_=) is then able to
+insert underscores into the correct positions.
+)
+xitem(tt(l:)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(L:)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(r:)var(lpat)tt(|)tt(=)var(tpat))
+item(tt(R:)var(lpat)tt(|)tt(=)var(tpat))(
+Let any substring matching var(lpat) at the left (for tt(l:) and tt(L:)) or
+right (for tt(r:) and tt(R:)) edge of the command line be completed to any
+substring matching var(tpat) in the same position in the trial completion.
+
+With these matchers, the pattern var(tpat) may also be a star, tt(*)'. This
+lets a matching command line substring be completed to any trial completion
+substring in the same relative position.
+
+Examples:
+
+tt(L:|[nN][oO]=) makes it so that, if there is a single tt(no)', tt(nO)',
+tt(No)' or tt(NO)' at the left end of the command line, then it is added to
+the left of each trial completion.
+
+tt(r:|=*) lets (the empty substring at) the right edge of the command line
+string be completed to any number of characters at the edge of each trial
+completion.
+
+If these two matchers are combined to tt('L:|[nN][oO]= r:|=*'), then given a
+trial completion tt(foo)', it lets tt(NOf)' be completed to tt(NOfoo)'.
+First, tt(L:|[nN][oO]=) prefixes the trial completion with tt(NO), after which
+tt(r:|=*) is able to match the command line to the trial completion and
+complete the missing characters at the end.
)
-xitem(tt(l:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
-xitem(tt(L:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
-xitem(tt(l:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(L:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
xitem(tt(b:)var(lpat)tt(=)var(tpat))
-item(tt(B:)var(lpat)tt(=)var(tpat))(
-These letters are for patterns that are anchored by another pattern on
-the left side. Matching for var(lpat) and var(tpat) is as for tt(m) and
-tt(M), but the pattern var(lpat) matched on the command line must be
-preceded by the pattern var(lanchor). The var(lanchor) can be blank to
-anchor the match to the start of the command line string; otherwise the
-anchor can occur anywhere, but must match in both the command line and
-trial completion strings.
-
-If no var(lpat) is given but a var(ranchor) is, this matches the gap
-between substrings matched by var(lanchor) and var(ranchor). Unlike
-var(lanchor), the var(ranchor) only needs to match the trial
-completion string.
-
-The tt(b) and tt(B) forms are similar to tt(l) and tt(L) with an empty
-anchor, but need to match only the beginning of the word on the command line
-or trial completion, respectively.
-)
-xitem(tt(r:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
-xitem(tt(R:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
-xitem(tt(r:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(R:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
+xitem(tt(B:)var(lpat)tt(=)var(tpat))
xitem(tt(e:)var(lpat)tt(=)var(tpat))
item(tt(E:)var(lpat)tt(=)var(tpat))(
-As tt(l), tt(L), tt(b) and tt(B), with the difference that the command
-line and trial completion patterns are anchored on the right side.
-Here an empty var(ranchor) and the tt(e) and tt(E) forms force the
-match to the end of the command line or trial completion string.
-
-In the form where var(lanchor) is given, the var(lanchor) only needs
-to match the trial completion string.
+Let all substrings matching var(lpat) at the beginning (for tt(b:) and tt(B:))
+or end (for tt(e:) and tt(E:)) of the command line be completed to the same
+number of substrings matching var(tpat) in each trial completion in the same
+relative position.
+
+Example:
+
+tt(B:[nN][oO]=) adds all occurrences of tt(no)', tt(nO)', tt(No)' and
+tt(NO)' at the beginning of the command line to the beginning of each trial
+completion. If tt(r:|=*) is added to this, then given a trial completion
+tt(foo)', it lets tt(noNOf)' be completed to tt(noNOfoo)'.
+)
+xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
+item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
+Let any command line substring, which is left/right-adjacent (respectively) to
+a substring matching var(anchor) and which matches var(lpat), be completed to
+any trial completion substring, which
+startitemize()
+itemiz(\
+is adjacent to the same substring and which
+)
+itemiz(\
+matches var(tpat), but which
+)
+itemiz(\
+does not contain any substrings matching var(anchor).
+)
+enditemize()
+
+When a matcher includes at least one anchor (which also applies to the forms
+with two anchors, below), the pattern var(tpat) may also be one or two stars,
+tt(*)' or tt(**)'. The first star can match any number of characters, within
+the constraints outlined above, whereas a second star removes the last
+constraint and can match substrings matching var(anchor).
+
+Example:
+
+tt(r:|.=*) lets each dot be completed to any substring that ends at the right
+in a dot, but does not otherwise contain any dots, in the trial string. Thus,
+given a trial string tt(comp.sources.unix)', tt(..unix)' can be completed to
+it, but tt(.unix)' cannot, since the matcher will refuse to map any dots other
+than the one matched by the var(anchor).
+)
+xitem(tt(l:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
+xitem(tt(L:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
+xitem(tt(r:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))
+item(tt(R:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))(
+Let the empty string between each two adjacent command line substrings
+matching var(anchor) and var(coanchor), in the order given, be completed to any
+trial completion substring, which
+startitemize()
+itemiz(\
+is adjacent to the same two substrings and which
+)
+itemiz(\
+matches var(tpat), but which
+)
+itemiz(\
+does not contain any substrings matching var(anchor).
+)
+enditemize()
+
+Note there is no restriction on substrings matching var(coanchor).
+
+Example:
+
+tt(r:?||[[:upper:]]=*) will complete tt(fHoo)' to tt(fooHoo)', but not
+tt(Hoo)' to tt(fooHoo)', because there is no character to the left of tt(H)'
+on the command line. Likewise, it will not complete tt(lHIS)' to
+tt(likeTHIS)', because, other than the one substring it maps to var(anchor),
+it cannot map any substring containing uppercase letters in the trial
+completion.
)
item(tt(x:))(
This form is used to mark the end of matching specifications:
@@ -972,200 +1110,6 @@ function to override another.
)
enditem()

-Each var(lpat), var(tpat) or var(anchor) is either an empty string or
-consists of a sequence of literal characters (which may be quoted with a
-backslash), question marks, character classes, and correspondence
-classes; ordinary shell patterns are not used. Literal characters match
-only themselves, question marks match any character, and character
-classes are formed as for globbing and match any character in the given
-set.
-
-Correspondence classes are defined like character classes, but with two
-differences: they are delimited by a pair of braces, and negated classes
-are not allowed, so the characters tt(!) and tt(^) have no special
-meaning directly after the opening brace. They indicate that a range of
-characters on the line match a range of characters in the trial
-completion, but (unlike ordinary character classes) paired according to
-the corresponding position in the sequence. For example, to make any
-ASCII lower case letter on the line match the corresponding upper case
-letter in the trial completion, you can use tt(m:{a-z}={A-Z})'
-(however, see below for the recommended form for this). More
-than one pair of classes can occur, in which case the first class before
-the tt(=) corresponds to the first after it, and so on.  If one side has
-more such classes than the other side, the superfluous classes behave
-like normal character classes. In anchor patterns correspondence classes
-also behave like normal character classes.
-
-The standard tt([:)var(name)tt(:])' forms described for standard shell
-patterns (see
-ifnzman(noderef(Filename Generation))\
-ifzman(the section FILENAME GENERATION in zmanref(zshexpn)))
-may appear in correspondence classes as well as normal character
-classes. The only special behaviour in correspondence classes is if
-the form on the left and the form on the right are each one of
-tt([:upper:]), tt([:lower:]). In these cases the
-character in the word and the character on the line must be the same up
-to a difference in case. Hence to make any lower case character on the
-line match the corresponding upper case character in the trial
-completion you can use tt(m:{[:lower:]}={[:upper:]})'. Although the
-matching system does not yet handle multibyte characters, this is likely
-to be a future extension, at which point this syntax will handle
-arbitrary alphabets; hence this form, rather than the use of explicit
-ranges, is the recommended form. In other cases
-tt([:)var(name)tt(:])' forms are allowed. If the two forms on the left
-and right are the same, the characters must match exactly. In remaining
-cases, the corresponding tests are applied to both characters, but they
-are not otherwise constrained; any matching character in one set goes
-with any matching character in the other set: this is equivalent to the
-behaviour of ordinary character classes.
-
-The pattern var(tpat) may also be one or two stars, tt(*)' or
-tt(**)'. This means that the pattern on the command line can match
-any number of characters in the trial completion.  In this case the
-pattern must be anchored (on either side); in the case of a single
-star, the var(anchor) then determines how much of the trial completion
-is to be included DASH()- only the characters up to the next appearance of
-the anchor will be matched. With two stars, substrings matched by
-the anchor can be matched, too. In the forms that include two
-anchors, tt(*)' can match characters from the additional anchor
-DASH()- var(lanchor) with tt(r) or var(ranchor) with tt(l).
-
-Examples:
-
-The keys of the tt(options) association defined by the tt(parameter)
-module are the option names in all-lower-case form, without
-underscores, and without the optional tt(no) at the beginning even
-though the builtins tt(setopt) and tt(unsetopt) understand option names
-with upper case letters, underscores, and the optional tt(no). The
-following alters the matching rules so that the prefix tt(no) and any
-underscore are ignored when trying to match the trial completions
-generated and upper case letters on the line match the corresponding
-lower case letters in the words:
-
-example(compadd -M 'L:|[nN][oO]= M:_= M:{[:upper:]}={[:lower:]}' - \
-${(k)options} )
-
-The first part says that the pattern tt([nN][oO])' at the beginning
-(the empty anchor before the pipe symbol) of the string on the
-line matches the empty string in the list of words generated by
-completion, so it will be ignored if present. The second part does the
-same for an underscore anywhere in the command line string, and the
-third part uses correspondence classes so that any
-upper case letter on the line matches the corresponding lower case
-letter in the word. The use of the upper case forms of the
-specification characters (tt(L) and tt(M)) guarantees that what has
-already been typed on the command line (in particular the prefix
-tt(no)) will not be deleted.
-
-Note that the use of tt(L) in the first part means that it matches
-only when at the beginning of both the command line string and the
-trial completion. I.e., the string tt(_NO_f)' would not be
-completed to tt(_NO_foo)', nor would tt(NONO_f)' be completed to
-tt(NONO_foo)' because of the leading underscore or the second
-tt(NO)' on the line which makes the pattern fail even though they are
-otherwise ignored. To fix this, one would use tt(B:[nN][oO]=)'
-instead of the first part. As described above, this matches at the
-beginning of the trial completion, independent of other characters or
-substrings at the beginning of the command line word which are ignored
-by the same or other var(match-spec)s.
-
-The second example makes completion case insensitive.  This is just
-the same as in the option example, except here we wish to retain the
-characters in the list of completions:
-
-
-This makes lower case letters match their upper case counterparts.
-To make upper case letters match the lower case forms as well:
-
-
-A nice example for the use of tt(*) patterns is partial word
-completion. Sometimes you would like to make strings like tt(c.s.u)'
-complete to strings like tt(comp.source.unix)', i.e. the word on the
-command line consists of multiple parts, separated by a dot in this
-example, where each part should be completed separately DASH()- note,
-however, that the case where each part of the word, i.e. tt(comp)',
-tt(source)' and tt(unix)' in this example, is to be completed from
-separate sets of matches
-is a different problem to be solved by the implementation of the
-completion widget.  The example can be handled by:
-
-  - comp.sources.unix comp.sources.misc ...)
-
-The first specification says that var(lpat) is the empty string, while
-var(anchor) is a dot; var(tpat) is tt(*), so this can match anything
-except for the tt(.)' from the anchor in
-the trial completion word.  So in tt(c.s.u)', the matcher sees tt(c)',
-followed by the empty string, followed by the anchor tt(.)', and
-likewise for the second dot, and replaces the empty strings before the
-anchors, giving tt(c)[tt(omp)]tt(.s)[tt(ources)]tt(.u)[tt(nix)]', where
-the last part of the completion is just as normal.
-
-With the pattern shown above, the string tt(c.u)' could not be
-completed to tt(comp.sources.unix)' because the single star means
-that no dot (matched by the anchor) can be skipped. By using two stars
-as in tt(r:|.=**)', however, tt(c.u)' could be completed to
-tt(comp.sources.unix)'. This also shows that in some cases,
-especially if the anchor is a real pattern, like a character class,
-the form with two stars may result in more matches than one would like.
-
-The second specification is needed to make this work when the cursor is
-in the middle of the string on the command line and the option
-tt(COMPLETE_IN_WORD) is set. In this case the completion code would
-normally try to match trial completions that end with the string as
-typed so far, i.e. it will only insert new characters at the cursor
-position rather than at the end.  However in our example we would like
-the code to recognise matches which contain extra characters after the
-string on the line (the tt(nix)' in the example).  Hence we say that the
-empty string at the end of the string on the line matches any characters
-at the end of the trial completion.
-
-More generally, the specification
-
-example(compadd -M 'r:|[.,_-]=* r:|=*' ... )
-
-allows one to complete words with abbreviations before any of the
-characters in the square brackets.  For example, to
-with the above in effect, you can just type tt(very.c) before attempting
-completion.
-
-The specifications with both a left and a right anchor are useful to
-complete partial words whose parts are not separated by some
-special character. For example, in some places strings have to be
-completed that are formed tt(LikeThis)' (i.e. the separate parts are
-determined by a leading upper case letter) or maybe one has to
-complete strings with trailing numbers. Here one could use the simple
-form with only one anchor as in:
-
-example(compadd -M 'r:|[[:upper:]0-9]=* r:|=*' LikeTHIS FooHoo 5foo123 5bar234)
-
-But with this, the string tt(H)' would neither complete to tt(FooHoo)'
-nor to tt(LikeTHIS)' because in each case there is an upper case
-letter before the tt(H)' and that is matched by the anchor. Likewise,
-a tt(2)' would not be completed. In both cases this could be changed
-by using tt(r:|[[:upper:]0-9]=**)', but then tt(H)' completes to both
-tt(LikeTHIS)' and tt(FooHoo)' and a tt(2)' matches the other
-strings because characters can be inserted before every upper case
-letter and digit. To avoid this one would use:
-
-    LikeTHIS FooHoo foo123 bar234)
-
-By using these two anchors, a tt(H)' matches only upper case tt(H)'s that
-are immediately preceded by something matching the left anchor
-tt([^[:upper:]0-9])'. The effect is, of course, that tt(H)' matches only
-the string tt(FooHoo)', a tt(2)' matches only tt(bar234)' and so on.
-
-When using the completion system (see
-ifzman(zmanref(zshcompsys))\
-ifnzman(noderef(Completion System))\
-), users can define match specifications that are to be used for
-specific contexts by using the tt(matcher) and tt(matcher-list)
-styles. The values for the latter will be used everywhere.
-
texinode(Completion Widget Example)()(Completion Matching Control)(Completion Widgets)
sect(Completion Widget Example)
cindex(completion widgets, example)
diff --git a/Test/Y02compmatch.ztst b/Test/Y02compmatch.ztst
index 621707482..ee7e422c1 100644
--- a/Test/Y02compmatch.ztst
+++ b/Test/Y02compmatch.ztst
@@ -378,15 +378,26 @@
comp.graphics.rendering.misc comp.graphics.rendering.raytracing
comp.graphics.rendering.renderman)
test_code $example4_matcher example4_list
- comptest $'tst c.s.u\t'
-0:Documentation example using input c.s.u
+ comptest $'tst .s.u\t'
+0:Documentation example using input .s.u
+>line: {tst comp.sources.unix }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{21}
+
+ example4b_matcher='r:[^.]||.=* r:|=*'
+ test_code $example4b_matcher example4_list
+ comptest $'tst .s.u\t^[bc\t'
+0f:Documentation example using input .s.u but with double anchor
+>line: {tst .s.u}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
>line: {tst comp.sources.unix }{}
>COMPADD:{}
>INSERT_POSITIONS:{21}

test_code $example4_matcher example4_list
- comptest $'tst c.g.\ta\t.\tp\ta\tg\t'
-0:Documentation example using input c.g.\ta\t.\tp\ta\tg\t
+ comptest $'tst .g.\ta\t.\tp\ta\tg\t'
+0:Documentation example using input .g.\ta\t.\tp\ta\tg\t
>line: {tst comp.graphics.}{}
>INSERT_POSITIONS:{18}
@@ -424,9 +435,32 @@
>INSERT_POSITIONS:{32}

+ test_code $example4b_matcher example4_list
+ comptest $'tst .g.\t^[bc\t'
+0f:Documentation example using input .g. with double anchor
+>line: {tst .g.}{}
+>INSERT_POSITIONS:{}
+>line: {tst comp.graphics.}{}
+>INSERT_POSITIONS:{18}
+
test_code $example4_matcher example4_list
- comptest $'tst c...pag\t'
-0:Documentation example using input c...pag\t
+ comptest $'tst ...pag\t'
+0:Documentation example using input ...pag
+>line: {tst comp.graphics.apps.pagemaker }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{32}
+
+ test_code $example4b_matcher example4_list
+ comptest $'tst ...pag\t^[bc\t^Fg^F^Fa\t'
+0f:Documentation example using input ...pag with double anchor
+>line: {tst .g.}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst c...pag}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
>line: {tst comp.graphics.apps.pagemaker }{}
>COMPADD:{}
>INSERT_POSITIONS:{32}
@@ -444,8 +478,8 @@
example5_matcher='r:|[.,_-]=* r:|=*'
example5_list=(veryverylongfile.c veryverylongheader.h)
test_code $example5_matcher example5_list
- comptest $'tst v.c\tv.h\t'
-0:Documentation example using input v.c\t
+ comptest $'tst  .c\t.h\t'
+0:Documentation example using input .c
>line: {tst  veryverylongfile.c }{}
>INSERT_POSITIONS:{23}
@@ -453,6 +487,23 @@
>INSERT_POSITIONS:{44}

+ example5b_matcher='r:[^.,_-]||[.,_-]=* r:|=*'
+ test_code $example5b_matcher example5_list
+ comptest $'tst  .c\t^[bv\t.h\t^[bv'
+0f:Documentation example using input .c but with double anchor
+>line: {tst  .c}{}
+>INSERT_POSITIONS:{}
+>line: {tst  veryverylongfile.c }{}
+>INSERT_POSITIONS:{23}
+>line: {tst  veryverylongfile.c .h}{}
+>INSERT_POSITIONS:{}
+>INSERT_POSITIONS:{44}
+

example6_list=(LikeTHIS FooHoo 5foo123 5bar234)
test_code 'r:|[A-Z0-9]=* r:|=*' example6_list
@@ -493,15 +544,52 @@
example7_matcher="r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
example7_list=($example6_list)
test_code $example7_matcher example7_list
- comptest $'tst H\t2\t'
-0:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+ comptest $'tst H\t^[bF\to2\t^[b5\tb\t'
+0f:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+>line: {tst H}{}
+>INSERT_POSITIONS:{}
+>line: {tst F}{H}
+>INSERT_POSITIONS:{}
>line: {tst FooHoo }{}
>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 2}{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo 5}{2}
+>INSERT_POSITIONS:{}
>line: {tst FooHoo 5bar234 }{}
>INSERT_POSITIONS:{18}

+ example7b_matcher="r:?||[A-Z0-9]=* r:|=*"
+ test_code $example7b_matcher example7_list
+ comptest $'tst H\t^[bF2\t^[b5\t'
+0f:Documentation example using "r:?||[A-Z0-9]=* r:|=*"
+>line: {tst H}{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo }{}
+>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 5bar234 }{}
+>INSERT_POSITIONS:{18}
+
+ example8_list=(passwd.byname)
+ test_code 'r:[^.]||.=* l:.||[^.]=*'
+ comptest 'tst .^B\tpass^Fname\t'
+0f:Symmetry between r and l
+>line: {tst }{.}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst passwd.byname }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{17}
+
workers_7311_matcher="m:{a-z}={A-Z} r:|[.,_-]=* r:|=*"
workers_7311_list=(Abc-Def-Ghij.txt Abc-def.ghi.jkl_mno.pqr.txt Abc_def_ghi_jkl_mno_pqr.txt)
--
2.33.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-12 12:08  Marlon Richert
@ 2021-10-12 15:25  Daniel Shahaf
2021-10-13 4:57  Bart Schaefer
2021-10-13 5:08  Bart Schaefer
2021-10-14 20:43  Oliver Kiddle
2021-10-22 13:02  Oliver Kiddle
2 siblings, 2 replies; 14+ messages in thread
From: Daniel Shahaf @ 2021-10-12 15:25 UTC (permalink / raw)
To: Marlon Richert; +Cc: Zsh hackers list

Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> >
> > The tests show how :||= matchers should behave in order to provide
> > completion features that cannot be implemented with :|= matchers.

Would this be backwards compatible?

> > This is a follow-up to users/27228.
>
> I've now added an accompanying documentation update to the patch.

Thanks. I have never found that section easy to follow.

Could you confirm that the text which the docs patch deletes or changes
was all confirmed correct (even if perhaps unclear)? I'm concerned
about us possibly changing dense, accurate docs into clear,
less-accurate docs.

Case in point: The incumbent docs say that the coanchor is matched only
against the trial completion, but the new docs say something else. If
that's an intentional change, it needs to be called out explicitly in
the log message.

Haven't looked for other differences.

In the man page rendering on my system, itemiz()'s bullet is vertically
aligned with the parent item()'s text.

Cheers,

Daniel

> From 3ec2fceced1f327eb2ac7484772bd1d3756bf8d2 Mon Sep 17 00:00:00 2001
> From: Marlon Richert <marlon.richert@gmail.com>
> Date: Tue, 12 Oct 2021 15:02:31 +0300
> Subject: [PATCH] Add xfail tests for || form of completion matchers
>
> The tests show how :||= matchers should behave in order to provide
> completion features that cannot be implemented with :|= matchers.
> ---
>  Doc/Zsh/compwid.yo | 446 ++++++++++++++++++-----------------------
>  Test/Y02compmatch.ztst | 108 +++++++++-
>  2 files changed, 293 insertions(+), 261 deletions(-)
>
> diff --git a/Doc/Zsh/compwid.yo b/Doc/Zsh/compwid.yo
> index 3e86d3b42..5dd2127df 100644
> --- a/Doc/Zsh/compwid.yo
> +++ b/Doc/Zsh/compwid.yo
> @@ -896,72 +896,210 @@ enditem()
>  texinode(Completion Matching Control)(Completion Widget Example)(Completion Condition Codes)(Completion Widgets)
>  sect(Completion Matching Control)
>
> -It is possible by use of the
> -tt(-M) option of the tt(compadd) builtin command to specify how the
> -characters in the string to be completed (referred to here as the
> -command line) map onto the characters in the list of matches produced by
> -the completion code (referred to here as the trial completions). Note
> -that this is not used if the command line contains a glob pattern and
> -the tt(GLOB_COMPLETE) option is set or the tt(pattern_match) of the
> -tt(compstate) special association is set to a non-empty string.
> -
> -The var(match-spec) given as the argument to the tt(-M) option (see
> +By default, characters in the string to be completed (referred to here as the
> +command line) map only onto identical characters in the list of matches
> +produced by the completion code (referred to here as the trial completions) and
> +missing characters are inserted only at the cursor position, if the shell
> +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
> +otherwise.  However, it is possible to modify this behavior by use of the
> +tt(-M) option of the tt(compadd) builtin command. Note that this is not used
> +if the command line contains a glob pattern and the shell
> +optiontt(GLOB_COMPLETE) is set or the tt(pattern_match) of the tt(compstate)
> +special association is set to a non-empty string.
> +
> +The tt(-M) option (see
>  ifzman(Completion Builtin Commands' above)\
> -ifnzman(noderef(Completion Builtin Commands))\
> -) consists of one or more matching descriptions separated by
> -whitespace. Each description consists of a letter followed by a colon
> -and then the patterns describing which character sequences on the line match
> -which character sequences in the trial completion. Any sequence of
> -characters not handled in this fashion must match exactly, as usual.
> -
> -The forms of var(match-spec) understood are as follows. In each case, the
> -form with an upper case initial character retains the string already
> -typed on the command line as the final result of completion, while with
> -a lower case initial character the string on the command line is changed
> -into the corresponding part of the trial completion.
> +ifnzman(noderef(Completion Builtin
> +Commands))\
> +) requires a var(match-spec) as it argument, consisting of one or more matching
> +descriptions separated by whitespace. Each description consists of a letter,
> +followed by a colon, and then patterns describing which substrings on the
> +command line map onto which substrings in the trial completion. Descriptions
> +are evaluated from left to right and are cumulative. An earlier mapping can
> +thus potentially change the outcome of a later mapping. Finally, any unmapped
> +substrings will be mapped using the default mapping of identical substrings.
> +
> +When using the completion system (see
> +ifzman(zmanref(zshcompsys))\
> +ifnzman(noderef(Completion System))\
> +), users can define match specifications that are to be used for specific
> +contexts by using the tt(matcher) and tt(matcher-list) styles. The values for
> +the latter will be used everywhere.
> +
> +Each pattern in a var(match-spec) is either an empty string or consists of a
> +sequence of literal characters (which may be quoted with a backslash), question
> +marks, character classes, and correspondence classes (see next paragraph).
> +Ordinary shell patterns are not used. Literal characters match only
> +themselves, question marks match any character, and character classes are
> +formed as for globbing and match any character in the given set.
> +
> +Correspondence classes are defined like character classes, but with two
> +differences: They are delimited by a pair of braces, and negated classes are
> +not allowed, so the characters tt(!) and tt(^) have no special meaning directly
> +after the opening brace. They indicate that a range of characters on the line
> +match a range of characters in the trial completion, but (unlike ordinary
> +character classes) paired according to the corresponding position in the
> +sequence. More than one pair of classes can occur, in which case the first
> +class before the tt(=) corresponds to the first after it, and so on. If one
> +side has more such classes than the other side, the superfluous classes behave
> +like normal character classes.
> +
> +The standard tt([:)var(name)tt(:])' forms described for standard shell
> +patterns (see
> +ifnzman(noderef(Filename Generation))\
> +ifzman(the section
> +FILENAME GENERATION in zmanref(zshexpn))\
> +) may appear in correspondence classes as well as normal character classes.
> +The only special behaviour in correspondence classes is if the form on the left
> +and the form on the right are each one of tt([:upper:]), tt([:lower:]).  In
> +these cases the character in the word and the character on the line must be the
> +same up to a difference in case. Although the matching system does not yet
> +handle multibyte characters, this is likely to be a future extension, at which
> +point this syntax will handle arbitrary alphabets; hence this form, rather than
> +the use of explicit ranges, is the recommended form. In other cases
> +tt([:)var(name)tt(:])' forms are allowed. If the two forms on the left and
> +right are the same, the characters must match exactly. In remaining cases, the
> +corresponding tests are applied to both characters, but they are not otherwise
> +constrained; any matching character in one set goes with any matching character
> +in the other set: this is equivalent to the behaviour of ordinary character
> +classes.
> +
> +The forms of var(match-spec) understood are listed below. For each of these,
> +the form with an upper case initial character replaces mapped substrings in the
> +trial completions with their counterparts from the command line, whereas with a
> +lower case initial character, once a trial completion has been accepted,
> +matched substrings on the command line are replaced with their counterparts
> +from the accepted completion.
>
>  startitem()
>  xitem(tt(m:)var(lpat)tt(=)var(tpat))
>  item(tt(M:)var(lpat)tt(=)var(tpat))(
> -Here, var(lpat) is a pattern that matches on the command line,
> -corresponding to var(tpat) which matches in the trial completion.
> +Let any substring matching var(lpat) be completed to any substring matching
> +var(tpat).
> +
> +Examples:
> +
> +tt(m:{[:lower:]}={[:upper:]}) lets any lower case character be completed to its
> +uppercase counterpart.
> +
> +tt(M:_=) inserts every underscore on the command line into each trial
> +completion, in the same relative position, determined by matching the
> +substrings around it. Note that the definition of what is matching can be
> +modified by applying other matchers first.
> +
> +If these two matchers are combined to tt('m:{[:lower:]}={[:upper:]} M:_='),
> +then given a trial completion tt(NO)', it lets tt(_n_o_)' be completed to
> +tt(_N_O_)', even though tt(_N_O_)' itself is not present as a trial
> +completion. tt(m:{[:lower:]}={[:upper:]}) is evaluated first and makes tt(n)
> +match tt(N)' and tt(o) match tt(O)', after which tt(M:_=) is then able to
> +insert underscores into the correct positions.
> +)
> +xitem(tt(l:)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(L:)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(r:)var(lpat)tt(|)tt(=)var(tpat))
> +item(tt(R:)var(lpat)tt(|)tt(=)var(tpat))(
> +Let any substring matching var(lpat) at the left (for tt(l:) and tt(L:)) or
> +right (for tt(r:) and tt(R:)) edge of the command line be completed to any
> +substring matching var(tpat) in the same position in the trial completion.
> +
> +With these matchers, the pattern var(tpat) may also be a star, tt(*)'. This
> +lets a matching command line substring be completed to any trial completion
> +substring in the same relative position.
> +
> +Examples:
> +
> +tt(L:|[nN][oO]=) makes it so that, if there is a single tt(no)', tt(nO)',
> +tt(No)' or tt(no)' at the left end of the command line, then it is added to
> +the left of each trial completion.
> +
> +tt(r:|=*) lets (the empty substring at) the right edge of the command line
> +string be completed to any number of characters at the edge of each trial
> +completion.
> +
> +If these two matchers are combined to tt('L:[nN][oO]= r:|=*'), then given a
> +trial completion tt(foo)', it lets tt(NOf)' be completed to tt(NOfoo)'.
> +First, tt(L:[nN][oO]=) prefixes the trial completion with tt(NO), after which
> +tt(r:|=*) is able to match the command line to the trial completion and
> +complete the missing characters at the end.
>  )
> -xitem(tt(l:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
> -xitem(tt(L:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
> -xitem(tt(l:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
> -xitem(tt(L:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
>  xitem(tt(b:)var(lpat)tt(=)var(tpat))
> -item(tt(B:)var(lpat)tt(=)var(tpat))(
> -These letters are for patterns that are anchored by another pattern on
> -the left side. Matching for var(lpat) and var(tpat) is as for tt(m) and
> -tt(M), but the pattern var(lpat) matched on the command line must be
> -preceded by the pattern var(lanchor). The var(lanchor) can be blank to
> -anchor the match to the start of the command line string; otherwise the
> -anchor can occur anywhere, but must match in both the command line and
> -trial completion strings.
> -
> -If no var(lpat) is given but a var(ranchor) is, this matches the gap
> -between substrings matched by var(lanchor) and var(ranchor). Unlike
> -var(lanchor), the var(ranchor) only needs to match the trial
> -completion string.
> -
> -The tt(b) and tt(B) forms are similar to tt(l) and tt(L) with an empty
> -anchor, but need to match only the beginning of the word on the command line
> -or trial completion, respectively.
> -)
> -xitem(tt(r:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
> -xitem(tt(R:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
> -xitem(tt(r:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
> -xitem(tt(R:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
> +xitem(tt(B:)var(lpat)tt(=)var(tpat))
>  xitem(tt(e:)var(lpat)tt(=)var(tpat))
>  item(tt(E:)var(lpat)tt(=)var(tpat))(
> -As tt(l), tt(L), tt(b) and tt(B), with the difference that the command
> -line and trial completion patterns are anchored on the right side.
> -Here an empty var(ranchor) and the tt(e) and tt(E) forms force the
> -match to the end of the command line or trial completion string.
> -
> -In the form where var(lanchor) is given, the var(lanchor) only needs
> -to match the trial completion string.
> +Let all substrings matching var(lpat) at the beginning (for tt(b:) and tt(B:))
> +or end (for tt(e:) and tt(E:)) of the command line be completed to the same
> +number of substrings matching var(tpat) in each trial completion in the same
> +relative position.
> +
> +Example:
> +
> +tt(B:[nN][oO]=) adds all occurences of tt(no)', tt(nO)', tt(No)' and
> +tt(NO)' at the beginning of the command line to the beginning of each trial
> +completion. If tt(r:|=*) is added to this, then given a trial completion
> +tt(foo)', it lets tt(noNOf)' be completed to tt(noNOfoo)'.
> +)
> +xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
> +item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
> +Let any command line substring, which is left/right-adjacent (respectively) to
> +a substring matching var(anchor) and which matches var(lpat), be completed to
> +any trial completion substring, which
> +startitemize()
> +itemiz(\
> +is adjacent to the same substring and which
> +)
> +itemiz(\
> +matches var(tpat), but which
> +)
> +itemiz(\
> +does not contain any substrings matching var(anchor).
> +)
> +enditemize()
> +
> +When a matcher includes at least one anchor (which also applies to the forms
> +with two anchors, below), the pattern var(tpat) may also be one or two stars,
> +tt(*)' or tt(**)'. The first star can match any number of characters, within
> +the constraints outlined above, whereas a second star removes the last
> +constraint and can match substrings matching var(anchor).
> +
> +Example:
> +
> +tt(r:|.=*) lets each dot be completed to any substring that ends at the right
> +in a dot, but does not otherwise contain any dots, in the trial string. Thus,
> +given a trial string tt(comp.sources.unix)', tt(..unix)' can be completed to
> +it, but tt(.unix)' cannot, since the matcher will refuse to map any dots other
> +than the one matched by the var(anchor).
> +)
> +xitem(tt(l:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
> +xitem(tt(L:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
> +xitem(tt(r:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))
> +item(tt(R:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))(
> +Lets the empty string between each two adjacent command line substrings
> +matching var(anchor) and var(coanchor), in the order given, be completed to any
> +trial completion substring, which
> +startitemize()
> +itemiz(\
> +is adjacent to the same two substrings and which
> +)
> +itemiz(\
> +matches var(tpat), but which
> +)
> +itemiz(\
> +does not contain any substrings matching var(anchor).
> +)
> +enditemize()
> +
> +Note there is no restriction on substrings matching var(coanchor).
> +
> +Example:
> +
> +tt(r:?||[[:upper:]]=*) will complete tt(fHoo)' to tt(fooHoo)', but not
> +tt(Hoo)' to tt(fooHoo)', because there is no character to the left of tt(H)'
> +on the command line˙. Likewise, it will not complete tt(lHIS)' to
> +tt(likeTHIS)', because, other than the one substring it maps to var(anchor),
> +it cannot map any substring containing uppercase letters in the trial
> +completion.
>  )
>  item(tt(x:))(
>  This form is used to mark the end of matching specifications:
> @@ -972,200 +1110,6 @@ function to override another.
>  )
>  enditem()
>
> -Each var(lpat), var(tpat) or var(anchor) is either an empty string or
> -consists of a sequence of literal characters (which may be quoted with a
> -backslash), question marks, character classes, and correspondence
> -classes; ordinary shell patterns are not used. Literal characters match
> -only themselves, question marks match any character, and character
> -classes are formed as for globbing and match any character in the given
> -set.
> -
> -Correspondence classes are defined like character classes, but with two
> -differences: they are delimited by a pair of braces, and negated classes
> -are not allowed, so the characters tt(!)  and tt(^) have no special
> -meaning directly after the opening brace. They indicate that a range of
> -characters on the line match a range of characters in the trial
> -completion, but (unlike ordinary character classes) paired according to
> -the corresponding position in the sequence. For example, to make any
> -ASCII lower case letter on the line match the corresponding upper case
> -letter in the trial completion, you can use tt(m:{a-z}={A-Z})'
> -(however, see below for the recommended form for this). More
> -than one pair of classes can occur, in which case the first class before
> -the tt(=) corresponds to the first after it, and so on. If one side has
> -more such classes than the other side, the superfluous classes behave
> -like normal character classes. In anchor patterns correspondence classes
> -also behave like normal character classes.
> -
> -The standard tt([:)var(name)tt(:])' forms described for standard shell
> -patterns (see
> -ifnzman(noderef(Filename Generation))\
> -ifzman(the section FILENAME GENERATION in zmanref(zshexpn)))
> -may appear in correspondence classes as well as normal character
> -classes. The only special behaviour in correspondence classes is if
> -the form on the left and the form on the right are each one of
> -tt([:upper:]), tt([:lower:]). In these cases the
> -character in the word and the character on the line must be the same up
> -to a difference in case. Hence to make any lower case character on the
> -line match the corresponding upper case character in the trial
> -completion you can use tt(m:{[:lower:]}={[:upper:]})'. Although the
> -matching system does not yet handle multibyte characters, this is likely
> -to be a future extension, at which point this syntax will handle
> -arbitrary alphabets; hence this form, rather than the use of explicit
> -ranges, is the recommended form. In other cases
> -tt([:)var(name)tt(:])' forms are allowed.  If the two forms on the left
> -and right are the same, the characters must match exactly. In remaining
> -cases, the corresponding tests are applied to both characters, but they
> -are not otherwise constrained; any matching character in one set goes
> -with any matching character in the other set: this is equivalent to the
> -behaviour of ordinary character classes.
> -
> -The pattern var(tpat) may also be one or two stars, tt(*)' or
> -tt(**)'. This means that the pattern on the command line can match
> -any number of characters in the trial completion. In this case the
> -pattern must be anchored (on either side); in the case of a single
> -star, the var(anchor) then determines how much of the trial completion
> -is to be included DASH()- only the characters up to the next appearance of
> -the anchor will be matched. With two stars, substrings matched by
> -the anchor can be matched, too. In the forms that include two
> -anchors, tt(*)' can match characters from the additional anchor
> -DASH()- var(lanchor) with tt(r) or var(ranchor) with tt(l).
> -
> -Examples:
> -
> -The keys of the tt(options) association defined by the tt(parameter)
> -module are the option names in all-lower-case form, without
> -underscores, and without the optional tt(no) at the beginning even
> -though the builtins tt(setopt) and tt(unsetopt) understand option names
> -with upper case letters, underscores, and the optional tt(no). The
> -following alters the matching rules so that the prefix tt(no) and any
> -underscore are ignored when trying to match the trial completions
> -generated and upper case letters on the line match the corresponding
> -lower case letters in the words:
> -
> -example(compadd -M 'L:|[nN][oO]= M:_= M:{[:upper:]}={[:lower:]}' - \
> -  ${(k)options} )
> -
> -The first part says that the pattern tt([nN][oO])' at the beginning
> -(the empty anchor before the pipe symbol) of the string on the
> -line matches the empty string in the list of words generated by
> -completion, so it will be ignored if present. The second part does the
> -same for an underscore anywhere in the command line string, and the
> -third part uses correspondence classes so that any
> -upper case letter on the line matches the corresponding lower case
> -letter in the word. The use of the upper case forms of the
> -specification characters (tt(L) and tt(M)) guarantees that what has
> -already been typed on the command line (in particular the prefix
> -tt(no)) will not be deleted.
> -
> -Note that the use of tt(L) in the first part means that it matches
> -only when at the beginning of both the command line string and the
> -trial completion. I.e., the string tt(_NO_f)' would not be
> -completed to tt(_NO_foo)', nor would tt(NONO_f)' be completed to
> -tt(NONO_foo)' because of the leading underscore or the second
> -tt(NO)' on the line which makes the pattern fail even though they are
> -otherwise ignored. To fix this, one would use tt(B:[nN][oO]=)'
> -instead of the first part. As described above, this matches at the
> -beginning of the trial completion, independent of other characters or
> -substrings at the beginning of the command line word which are ignored
> -by the same or other var(match-spec)s.
> -
> -The second example makes completion case insensitive.  This is just
> -the same as in the option example, except here we wish to retain the
> -characters in the list of completions:
> -
> -example(compadd -M 'm:{[:lower:]}={[:upper:]}' ... )
> -
> -This makes lower case letters match their upper case counterparts.
> -To make upper case letters match the lower case forms as well:
> -
> -example(compadd -M 'm:{[:lower:][:upper:]}={[:upper:][:lower:]}' ... )
> -
> -A nice example for the use of tt(*) patterns is partial word
> -completion. Sometimes you would like to make strings like tt(c.s.u)'
> -complete to strings like tt(comp.source.unix)', i.e. the word on the
> -command line consists of multiple parts, separated by a dot in this
> -example, where each part should be completed separately DASH()- note,
> -however, that the case where each part of the word, i.e. tt(comp)',
> -tt(source)' and tt(unix)' in this example, is to be completed from
> -separate sets of matches
> -is a different problem to be solved by the implementation of the
> -completion widget.  The example can be handled by:
> -
> -example(compadd -M 'r:|.=* r:|=*' \
> -  - comp.sources.unix comp.sources.misc ...)
> -
> -The first specification says that var(lpat) is the empty string, while
> -var(anchor) is a dot; var(tpat) is tt(*), so this can match anything
> -except for the tt(.)' from the anchor in
> -the trial completion word.  So in tt(c.s.u)', the matcher sees tt(c)',
> -followed by the empty string, followed by the anchor tt(.)', and
> -likewise for the second dot, and replaces the empty strings before the
> -anchors, giving tt(c)[tt(omp)]tt(.s)[tt(ources)]tt(.u)[tt(nix)]', where
> -the last part of the completion is just as normal.
> -
> -With the pattern shown above, the string tt(c.u)' could not be
> -completed to tt(comp.sources.unix)' because the single star means
> -that no dot (matched by the anchor) can be skipped. By using two stars
> -as in tt(r:|.=**)', however, tt(c.u)' could be completed to
> -tt(comp.sources.unix)'. This also shows that in some cases,
> -especially if the anchor is a real pattern, like a character class,
> -the form with two stars may result in more matches than one would like.
> -
> -The second specification is needed to make this work when the cursor is
> -in the middle of the string on the command line and the option
> -tt(COMPLETE_IN_WORD) is set. In this case the completion code would
> -normally try to match trial completions that end with the string as
> -typed so far, i.e. it will only insert new characters at the cursor
> -position rather than at the end.  However in our example we would like
> -the code to recognise matches which contain extra characters after the
> -string on the line (the tt(nix)' in the example).  Hence we say that the
> -empty string at the end of the string on the line matches any characters
> -at the end of the trial completion.
> -
> -More generally, the specification
> -
> -example(compadd -M 'r:|[.,_-]=* r:|=*' ... )
> -
> -allows one to complete words with abbreviations before any of the
> -characters in the square brackets.  For example, to
> -complete tt(veryverylongfile.c) rather than tt(veryverylongheader.h)
> -with the above in effect, you can just type tt(very.c) before attempting
> -completion.
> -
> -The specifications with both a left and a right anchor are useful to
> -complete partial words whose parts are not separated by some
> -special character. For example, in some places strings have to be
> -completed that are formed tt(LikeThis)' (i.e. the separate parts are
> -determined by a leading upper case letter) or maybe one has to
> -complete strings with trailing numbers. Here one could use the simple
> -form with only one anchor as in:
> -
> -example(compadd -M 'r:|[[:upper:]0-9]=* r:|=*' LikeTHIS FooHoo 5foo123 5bar234)
> -
> -But with this, the string tt(H)' would neither complete to tt(FooHoo)'
> -nor to tt(LikeTHIS)' because in each case there is an upper case
> -letter before the tt(H)' and that is matched by the anchor. Likewise,
> -a tt(2)' would not be completed. In both cases this could be changed
> -by using tt(r:|[[:upper:]0-9]=**)', but then tt(H)' completes to both
> -tt(LikeTHIS)' and tt(FooHoo)' and a tt(2)' matches the other
> -strings because characters can be inserted before every upper case
> -letter and digit. To avoid this one would use:
> -
> -example(compadd -M 'r:[^[:upper:]0-9]||[[:upper:]0-9]=** r:|=*' \
> -    LikeTHIS FooHoo foo123 bar234)
> -
> -By using these two anchors, a tt(H)' matches only upper case tt(H)'s that
> -are immediately preceded by something matching the left anchor
> -tt([^[:upper:]0-9])'. The effect is, of course, that tt(H)' matches only
> -the string tt(FooHoo)', a tt(2)' matches only tt(bar234)' and so on.
> -
> -When using the completion system (see
> -ifzman(zmanref(zshcompsys))\
> -ifnzman(noderef(Completion System))\
> -), users can define match specifications that are to be used for
> -specific contexts by using the tt(matcher) and tt(matcher-list)
> -styles. The values for the latter will be used everywhere.
> -
>  texinode(Completion Widget Example)()(Completion Matching Control)(Completion Widgets)
>  sect(Completion Widget Example)
>  cindex(completion widgets, example)
> diff --git a/Test/Y02compmatch.ztst b/Test/Y02compmatch.ztst
> index 621707482..ee7e422c1 100644
> --- a/Test/Y02compmatch.ztst
> +++ b/Test/Y02compmatch.ztst
> @@ -378,15 +378,26 @@

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-12 15:25    Daniel Shahaf
@ 2021-10-13  4:57      Bart Schaefer
2021-10-13  5:08      Bart Schaefer
1 sibling, 0 replies; 14+ messages in thread
From: Bart Schaefer @ 2021-10-13  4:57 UTC (permalink / raw)
To: Marlon Richert; +Cc: Zsh hackers list

On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> >
> > I've now added an accompanying documentation update to the patch.
>
> Could you confirm that the text which the docs patch deletes or changes
> was all confirmed correct (even if perhaps unclear)?

For example, this part is misleading:

> > +By default, characters in the string to be completed (referred to here as the
> > +command line) map only onto identical characters in the list of matches
[...]
> > +missing characters are inserted only at the cursor position, if the shell
> > +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,

It's at the end of the current word, not the end of the command line.
The old wording nearly always says "string on the command line" which
is only somewhat better; if it's going to be completely rewritten to
drop "string on the", the phrase "command line" should become more
precise.  "Incomplete word" perhaps?

> > +) requires a var(match-spec) as it argument, consisting of one or more matching

"its"

> > +corresponding tests are applied to both characters, but they are not otherwise
> > +constrained; any matching character in one set goes with any matching character
> > +in the other set: this is equivalent to the behaviour of ordinary character
> > +classes.

What's an "ordinary" character class?  That is, what ordinary context
is implied?

> > +xitem(tt(l:)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(L:)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(r:)var(lpat)tt(|)tt(=)var(tpat))
> > +item(tt(R:)var(lpat)tt(|)tt(=)var(tpat))(
> > +Let any substring matching var(lpat) at the left (for tt(l:) and tt(L:)) or
> > +right (for tt(r:) and tt(R:)) edge of the command line be completed to any

Again, not the command line, just the current word under (or to the
left of) the cursor, but I'll stop mentioning this because it's a
problem with definition of terms.

> > +xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
> > +item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
> > +Let any command line substring, which is left/right-adjacent (respectively) to
> > +a substring matching var(anchor) and which matches var(lpat), be completed to
> > +any trial completion substring, which
> > +startitemize()
> > +itemiz(\
> > +is adjacent to the same substring and which

Unclear whether "the same substring" refers to "any command line
substring" or to "a substring matching anchor".  I believe you mean
the former (or perhaps the larger substring composed of both of the
former)?  Best to specify.

I believe the rest of the explanations are correct, but it would be
good if Oliver confirms.

Did you remove the assorted other examples because there is a problem with them?

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-12 15:25    Daniel Shahaf
2021-10-13  4:57      Bart Schaefer
@ 2021-10-13  5:08      Bart Schaefer
2021-10-13 14:20        Marlon Richert
From: Bart Schaefer @ 2021-10-13  5:08 UTC (permalink / raw)
To: Daniel Shahaf; +Cc: Marlon Richert, Zsh hackers list

On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> > >
> > > The tests show how :||= matchers should behave in order to provide
> > > completion features that cannot be implemented with :|= matchers.
>
> Would this be backwards compatible?

In particular, with the exception of specific bug regression tests,
all the tests using || matchers have been converted to xfails.
Shouldn't there still be some generic tests of the (functionally
correct subset of) the current behavior of || ?  And, do you think any
of the regression tests would begin to fail if the xfail tests begin
to succeed?

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-13  5:08      Bart Schaefer
@ 2021-10-13 14:20        Marlon Richert
2021-10-13 19:37          Daniel Shahaf
From: Marlon Richert @ 2021-10-13 14:20 UTC (permalink / raw)
To: Bart Schaefer; +Cc: Daniel Shahaf, Zsh hackers list

On Wed, Oct 13, 2021 at 8:08 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> > > >
> > > > The tests show how :||= matchers should behave in order to provide
> > > > completion features that cannot be implemented with :|= matchers.
> >
> > Would this be backwards compatible?

No, it would not, but that's unavoidable, since at present, the :||=
matchers don't work correctly.

On the plus side, there are only two lines in the Zsh codebase where
:||= matchers are used, in _ssh and _x_color. It won't require much
work to convert those.

> In particular, with the exception of specific bug regression tests,
> all the tests using || matchers have been converted to xfails.
> Shouldn't there still be some generic tests of the (functionally
> correct subset of) the current behavior of || ?

There was exactly one non-regression test using :||= matchers,
«Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"», and
unfortunately, that one will no longer pass as written. However, I
will see if I can find some cases for which the current implementation
works correctly and add tests for them.

> And, do you think any
> of the regression tests would begin to fail if the xfail tests begin
> to succeed?

There are four regression tests that incorporate one or more :||=
matchers. I investigated them and this is what I found:
* Bug from workers 11081 is about the cursor jumping back and forth. I
was able to remove all three :||= matchers from the test without
breaking it.
* Bug from workers 11586 is about characters getting deleted while
inserting the "unambiguous" substring. Here, I was not able to remove
or replace the :||= matcher and still get the same output. This one
might break, but most of the output it expects is actually not
relevant to the bug it is testing.
* Test from workers 13320 is about cursor positioning. I was able to
remove the :||= matcher from the test without breaking it.
* Second test from workers 13345 is about a character getting deleted.
I was able to replace the :||= matcher with a :|= matcher without
breaking the test.

On Tue, Oct 12, 2021 at 6:25 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Thanks.  I have never found that section easy to follow.

You're not the only one. ;)

> Could you confirm that the text which the docs patch deletes or changes
> was all confirmed correct (even if perhaps unclear)?  I'm concerned
> about us possibly changing dense, accurate docs into clear, less-accurate
> docs.

The original docs were vague and ambiguous on many points, and even

> Case in point: The incumbent docs say that the coanchor is matched only
> against the trial completion, but the new docs say something else.  If
> that's an intentional change, it needs to be called out explicitly in
> the log message.

Yes, that's intentional. I'll add it to the commit message.

> In the man page rendering on my system, itemiz()'s bullet is vertically
> aligned with the parent item()'s text.

I can confirm that this indeed looks wrong. Do you know what I should
use to get them indented properly? Or if that's not possible, I can
add a ifzman check to format the text without bullets in the man page.
However, I would at least like to keep them in the html page, because
it helps make the text clearer.

On Wed, Oct 13, 2021 at 7:57 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Could you confirm that the text which the docs patch deletes or changes
> > was all confirmed correct (even if perhaps unclear)?
>
> For example, this part is misleading:
>
> > > +By default, characters in the string to be completed (referred to here as the
> > > +command line) map only onto identical characters in the list of matches
> [...]
> > > +missing characters are inserted only at the cursor position, if the shell
> > > +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
>
> It's at the end of the current word, not the end of the command line.
> The old wording nearly always says "string on the command line" which
> is only somewhat better; if it's going to be completely rewritten to
> drop "string on the", the phrase "command line" should become more
> precise.  "Incomplete word" perhaps?

The use of "command line" in this fashion is from the original text;
it is used there about half of the time without the addition of
"string". However, I agree that it's ambiguous. I'm fine replacing it
with "incomplete word" (unless we come up with a better term).

> > > +corresponding tests are applied to both characters, but they are not otherwise
> > > +constrained; any matching character in one set goes with any matching character
> > > +in the other set: this is equivalent to the behaviour of ordinary character
> > > +classes.
>
> What's an "ordinary" character class?  That is, what ordinary context
> is implied?

I didn't write that paragraph; it was already present in the original
doc. I just moved it around.

> > > +xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > > +xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > > +xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
> > > +item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
> > > +Let any command line substring, which is left/right-adjacent (respectively) to
> > > +a substring matching var(anchor) and which matches var(lpat), be completed to
> > > +any trial completion substring, which
> > > +startitemize()
> > > +itemiz(\
> > > +is adjacent to the same substring and which
>
> Unclear whether "the same substring" refers to "any command line
> substring" or to "a substring matching anchor".  I believe you mean
> the former (or perhaps the larger substring composed of both of the
> former)?  Best to specify.

Will do.

> I believe the rest of the explanations are correct, but it would be
> good if Oliver confirms.
>
> Did you remove the assorted other examples because there is a problem with them?

I split them into parts and moved each part directly beneath the
matcher(s) it uses. This makes the matchers easier to understand and
allows the examples to be explained with less text.

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-13 14:20        Marlon Richert
@ 2021-10-13 19:37          Daniel Shahaf
2021-10-13 20:02            Bart Schaefer
2021-10-14 20:25            Oliver Kiddle
0 siblings, 2 replies; 14+ messages in thread
From: Daniel Shahaf @ 2021-10-13 19:37 UTC (permalink / raw)
To: Marlon Richert; +Cc: Zsh hackers list

Marlon Richert wrote on Wed, Oct 13, 2021 at 17:20:09 +0300:
> On Wed, Oct 13, 2021 at 8:08 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > >
> > > Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > > > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> > > > >
> > > > > The tests show how :||= matchers should behave in order to provide
> > > > > completion features that cannot be implemented with :|= matchers.
> > >
> > > Would this be backwards compatible?
>
> No, it would not, but that's unavoidable, since at present, the :||=
> matchers don't work correctly. Please see my and Oliver's comments in
>

Care to give a more specific pointer?  As in, specific cases where the
incumbent documentation doesn't match the implementation?  users/27228
itself reads rather along the lines of "let's re-design the feature
retroactively".

If you want to clarify the documentation of the feature as designed,
kudos.  If you want to increase test coverage, more kudos.  If you want
to throw out the existing documentation and implementation and
reimplement things differently… that's not to be done lightly,
notwithstanding that you went the right way about proposing it (first
clarifying the status quo, then posting docs and XFail tests).

> On the plus side, there are only two lines in the Zsh codebase where
> :||= matchers are used, in _ssh and _x_color. It won't require much
> work to convert those.

That's not how it works.  Documented semantics are API promises that
should be presumed to be used in the wild.  Any change that may break
anybody's proverbial spacebar heating is an incompatible change and
should be treated accordingly (avoided if possible, and failing that,
clearly documented for upgraders, designed with a reasonable failure
mode for old code on new zsh, etc.).

When you give your house key to a trusted friend, you can always change the
lock and give the friend a new key.  However, user code isn't like
a house key.  User code is more closely analogous to public roads in
that old story about how the width of a car was basically determined by
the Romans (because cars had to be compatible with existing roads): it
exists, it can't be changed, it must be compatible with; it's a design
constraint.

> > In particular, with the exception of specific bug regression tests,
> > all the tests using || matchers have been converted to xfails.
> > Shouldn't there still be some generic tests of the (functionally
> > correct subset of) the current behavior of || ?
>
> There was exactly one non-regression test using :||= matchers,
> «Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"», and
> unfortunately, that one will no longer pass as written.

A patch that breaks a documentation example is the archetype of an
incompatible change.  Is there no alternative to that?  Can't you add
a new syntax?  It can be as simple as «-M 'v2: …'» (that's pretty common
in standards that retrofit themselves into DNS TXT records, for instance).

> However, I will see if I can find some cases for which the current
> implementation works correctly and add tests for them.

Thanks.

> > And, do you think any of the regression tests would begin to fail if
> > the xfail tests begin to succeed?
>
> There are four regression tests that incorporate one or more :||=
> matchers. I investigated them and this is what I found:
> * Bug from workers 11081 is about the cursor jumping back and forth. I
> was able to remove all three :||= matchers from the test without
> breaking it.

The test isn't a unit test of :||=, where one would expect the output to
change when :||= is removed.  The test is a regression test that's
supposed to catch instances of a bug that could be reproduced only by
one person and only intermittently.  For such tests, changing their code
in any way might make them no longer test for the bug they claim to.

> * Bug from workers 11586 is about characters getting deleted while
> inserting the "unambiguous" substring. Here, I was not able to remove
> or replace the :||= matcher and still get the same output. This one
> might break,

Ack.

> but most of the output it expects is actually not relevant to the bug
> it is testing.

FWIW, the reply to 11586, 11634, mentions CamelCase briefly.

> * Test from workers 13320 is about cursor positioning. I was able to
> remove the :||= matcher from the test without breaking it.

So what?  The question isn't whether users who have used :||= could have
written their code without it.  The question is whether users who have
used :||= would see a behaviour change if :||='s semantics were changed
in the manner proposed.

> * Second test from workers 13345 is about a character getting deleted.
> I was able to replace the :||= matcher with a :|= matcher without
> breaking the test.
>

Ditto.

> On Tue, Oct 12, 2021 at 6:25 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > Could you confirm that the text which the docs patch deletes or changes
> > was all confirmed correct (even if perhaps unclear)?  I'm concerned
> > about us possibly changing dense, accurate docs into clear, less-accurate
> > docs.
>
> The original docs were vague and ambiguous on many points, and even

itself is 500 80-character lines long.

> > Case in point: The incumbent docs say that the coanchor is matched only
> > against the trial completion, but the new docs say something else.  If
> > that's an intentional change, it needs to be called out explicitly in
> > the log message.
>
> Yes, that's intentional. I'll add it to the commit message.

Thanks, but again, that was just a case in point.  You need to identify
all such cases, or better yet, split the patch into a series of small,
reviewable changes.  That's always a good idea, and more so for changes
that are _a priori_ controversial (in this case, due to being backwards
incompatible).

> > In the man page rendering on my system, itemiz()'s bullet is vertically
> > aligned with the parent item()'s text.
>
> I can confirm that this indeed looks wrong. Do you know what I should
> use to get them indented properly?

No, sorry.  You might want to look in zman.yo and ztexi.yo to see if we
define or redefine startitem(), item(), startitemize(), or itemiz().  If
not, then it might be some issue that can be reproduced with yodl/nroff/man
alone, i.e., not an issue specific to zsh's yodl code.

> Or if that's not possible, I can add a ifzman check to format the text
> without bullets in the man page.

Or with ASCII bullets.

> However, I would at least like to keep them in the html page, because
> it helps make the text clearer.

Sure.

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-13 19:37          Daniel Shahaf
@ 2021-10-13 20:02            Bart Schaefer
2021-10-14 20:25            Oliver Kiddle
1 sibling, 0 replies; 14+ messages in thread
From: Bart Schaefer @ 2021-10-13 20:02 UTC (permalink / raw)
To: Daniel Shahaf; +Cc: Marlon Richert, Zsh hackers list

On Wed, Oct 13, 2021 at 12:38 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> If you want to clarify the documentation of the feature as designed,

Just for the record, I believe that's what Marlon has done in the doc
patch.  I would not apply the xfail patch in its current state (where
it removes existing tests and replaces them with xfails).  Adding that
set of xfails as new tests and documenting that they represent
proposed behavior changes (without removing the existing tests) would
be preferred.

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-13 19:37          Daniel Shahaf
2021-10-13 20:02            Bart Schaefer
@ 2021-10-14 20:25            Oliver Kiddle
1 sibling, 0 replies; 14+ messages in thread
From: Oliver Kiddle @ 2021-10-14 20:25 UTC (permalink / raw)
To: Daniel Shahaf; +Cc: Marlon Richert, Zsh hackers list

Daniel Shahaf wrote:
> That's not how it works.  Documented semantics are API promises that
> should be presumed to be used in the wild.  Any change that may break
> anybody's proverbial spacebar heating is an incompatible change and
> should be treated accordingly (avoided if possible, and failing that,
> clearly documented for upgraders, designed with a reasonable failure
> mode for old code on new zsh, etc.).

The existing documentation and implementation don't entirely match.
Following the history of the feature including original list posts, I
think it is fairly clear what the intended behaviour is supposed to be.
The behaviour has inconsistencies, quirks and bugs and even determining
what the behaviour is in a form that could be documented is not easy.
started this.

I'm happy to see better documentation, and test cases will make it easier
for anyone brave enough to attempt to fix up the issues. I've not had a
chance to review the patch properly but will do.

Oliver

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-12 12:08  Marlon Richert
2021-10-12 15:25    Daniel Shahaf
@ 2021-10-14 20:43    Oliver Kiddle
2021-10-14 21:16      Bart Schaefer
2021-10-22 13:02    Oliver Kiddle
From: Oliver Kiddle @ 2021-10-14 20:43 UTC (permalink / raw)
To: Marlon Richert; +Cc: Zsh hackers list

This is just some initial comments. I'll delve into this in more detail
at a later date.

On 12 Oct, Marlon Richert wrote:

> +By default, characters in the string to be completed (referred to here as the
> +command line) map only onto identical characters in the list of matches
> +produced by the completion code (referred to here as the trial completions) and
> +missing characters are inserted only at the cursor position, if the shell
> +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
> +otherwise.  However, it is possible to modify this behavior by use of the

I'm fairly sure that if complete_in_word is unset, missing characters
are still allowed at the cursor position. It has the effect of treating
the rest of the word after the cursor as being a separate following
word. I think the code even adds a fake space in for some paths to
achieve this.

Is "trial completions" the best term we can come up with? Where it
occurs in singular form it isn't obvious that it doesn't refer to what
is on the command-line. I tend to use "candidate matches". With the term
"matches" for those that remain following matching. Or does anyone have
also used for completion definitions for commands.

Oliver

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-14 20:43    Oliver Kiddle
@ 2021-10-14 21:16      Bart Schaefer
0 siblings, 0 replies; 14+ messages in thread
From: Bart Schaefer @ 2021-10-14 21:16 UTC (permalink / raw)
To: Oliver Kiddle; +Cc: Marlon Richert, Zsh hackers list

On Thu, Oct 14, 2021 at 1:58 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> Is "trial completions" the best term we can come up with?

That's merely what Sven W. used.

> I tend to use "candidate matches". With the term
> "matches" for those that remain following matching.

I'm fine with that.

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-12 12:08  Marlon Richert
2021-10-12 15:25    Daniel Shahaf
2021-10-14 20:43    Oliver Kiddle
@ 2021-10-22 13:02    Oliver Kiddle
2021-10-25 18:41      Marlon Richert
From: Oliver Kiddle @ 2021-10-22 13:02 UTC (permalink / raw)
To: Marlon Richert; +Cc: Zsh hackers list

On 12 Oct, Marlon Richert wrote:
> On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> > The tests show how :||= matchers should behave in order to provide
> > completion features that cannot be implemented with :|= matchers.

Thanks. These confirm that we both have the same understanding of the intended
behaviour. That's of more importance than anything in the comments I've
added below. The change should at least clarify things for anyone else
while the || form remains buggy.

> I've now added an accompanying documentation update to the patch.

The documentation is also definitely an improvement. The separation out
of l:/r: with an empty anchor and use of the term coanchor definitely helps.

be happy to push the work to git. Regarding my comment about the term "trial
completions" in that mail, it occurs to me that the t in tpat stands for
"trial" so that may need changing if the term is changed.

I also attach a small patch here for the matching code: it bails out
early if the command line is too short for the anchors and patterns.
This doesn't include the coanchor but there's no reason why it
shouldn't. It's early enough in the matching code for me to still be
fairly confident of what's going on in the code at that stage and to
target the condition with a debugger.

> +tt(-M) option of the tt(compadd) builtin command.  Note that this is not used
> +if the command line contains a glob pattern and the shell
> +optiontt(GLOB_COMPLETE) is set or the tt(pattern_match) of the tt(compstate)

You're missing a space here, which prevents tt() from working for GLOB_COMPLETE.
And the wording isn't right for pattern_match. Perhaps remove "the" before it
and change "of" to "in". Or perhaps use the word "key".

> +ifnzman(noderef(Completion Builtin
> +Commands))\
> +) requires a var(match-spec) as it argument, consisting of one or more matching

as "its" argument or as "an" argument.

> +descriptions separated by whitespace.  Each description consists of a letter,
> +followed by a colon, and then patterns describing which substrings on the
> +command line map onto which substrings in the trial completion.  Descriptions

> +are evaluated from left to right and are cumulative.  An earlier mapping can
> +thus potentially change the outcome of a later mapping.  Finally, any unmapped
> +substrings will be mapped using the default mapping of identical substrings.

Identical strings will always match. Matching control only defines additional
ways. This last sentence might imply otherwise.

> +When using the completion system (see
> +ifzman(zmanref(zshcompsys))\
> +ifnzman(noderef(Completion System))\
> +), users can define match specifications that are to be used for specific
> +contexts by using the tt(matcher) and tt(matcher-list) styles.  The values for
> +the latter will be used everywhere.

matcher-list is not used "everywhere". It is looked up early but you can have
different values for different completers. Perhaps just remove that last
sentence, people can refer to zshcompsys for details.
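
As an illustration of different values for different completers, here is a
minimal sketch modelled on the example given under matcher-list in
zshcompsys. It assumes a .zshrc in which compinit has already been run, and
it is worth checking against the manual before relying on it:

```shell
# Sketch for a .zshrc, to be read after compinit has been loaded.
# Use the _complete and _prefix completers ...
zstyle ':completion:*' completer _complete _prefix
# ... but allow case-insensitive matching only with _complete, by putting
# the completer name in the third field of the context.
zstyle ':completion:*:complete:*:*:*' matcher-list \
       '' 'm:{a-zA-Z}={A-Za-z}'
```

With settings along these lines, the matcher-list value depends on which
completer is running, which is why "used everywhere" overstates it.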

> +Correspondence classes are defined like character classes, but with two
> +differences: They are delimited by a pair of braces, and negated classes are
> +not allowed, so the characters tt(!) and tt(^) have no special meaning directly
> +after the opening brace.  They indicate that a range of characters on the line

"They" here could be understood to refer to ! and ^ so we probably need to
spell out "Correspondence classes" again. This text was not your addition, it
was there before.

> +tt(r:|=*) lets (the empty substring at) the right edge of the command line
> +string be completed to any number of characters at the edge of each trial
> +completion.

Could add a note here that this would only have any effect if the cursor is not
already at the end of the command line.

> -preceded by the pattern var(lanchor).  The var(lanchor) can be blank to
> -anchor the match to the start of the command line string; otherwise the

Where "command line" is used as a compound adjective, I'd hyphenate it
("command-line"). There are other cases. Mostly, it occurs as a noun which I'd
leave with a space.

> +Let all substrings matching var(lpat) at the beginning (for tt(b:) and tt(B:))
> +or end (for tt(e:) and tt(E:)) of the command line be completed to the same
> +number of substrings matching var(tpat) in each trial completion in the same
> +relative position.

I'd restore some of the old sentence where we acknowledge that b/e are very
similar to l/r with an empty anchor just with scope for multiple applications
of the same or different matching controls.

> +
> +Example:
> +
> +tt(B:[nN][oO]=) adds all occurences of tt(no)', tt(nO)', tt(No)' and

A more useful example is B:0= for initial zeros. A fairly good demonstration of
the differences is:
compadd -M 'B:0= L:|-=' 1 2 3
A single minus sign that is really first is allowed. Multiple initial zeros,
including after the minus are allowed. So this allows -1 002 -03 but not
0-1 or --2.
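
The accepted and rejected inputs listed above can be restated as a small
model. This is only a sketch: the function name accepts_number is made up,
the regular expression paraphrases the description rather than zsh's
matching code, and it treats each input as a complete word.

```shell
# Model only: an optional leading minus, then any number of zeros, then one
# of the candidates 1, 2 or 3. It does not call compadd.
accepts_number() {
  printf '%s\n' "$1" | grep -Eq '^-?0*[123]$'
}

for word in -1 002 -03 0-1 --2; do
  if accepts_number "$word"; then
    echo "$word: accepted"
  else
    echo "$word: rejected"
  fi
done
```

The point of the model is the ordering: the minus is only tolerated at the
very start, while the zeros may repeat and may follow the minus.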

> +tt(NO)' at the beginning of the command line to the beginning of each trial
> +completion.  If tt(r:|=*) is added to this, then given a trial completion
> +tt(foo)', it lets tt(noNOf)' be completed to tt(noNOfoo)'.

Not sure the r:|=* really helps understanding here. The cursor at the end of
"foo" would be just as good and it isn't a useful example to begin with.

> +xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
> +item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
> +Let any command line substring, which is left/right-adjacent (respectively) to
> +a substring matching var(anchor) and which matches var(lpat), be completed to
> +any trial completion substring, which

I'd consider it to be the anchor which is left(/right)-adjacent to the
substring not the other way around. How about:

Let any command-line substring matching var(lpat) complete to a trial
completion substring matching var(tpat) where both are adjacent to an
identical substring matching var(anchor). The l: and r: forms allow for
anchors appearing to the left or right, respectively.

If you specify something like [.-] as the anchor, it can't be . on the line and
- in the candidate so noting that the anchors need to be identical is useful.
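
A minimal sketch of how one might try this out, assuming an interactive zsh
with compinit loaded. The function name _tst_anchor, the command name tst and
the two candidates are made up for illustration:

```shell
# Hypothetical completion function for experimenting with an anchor class.
_tst_anchor() {
  # r:|[.-]=*  lets the (empty) text before a "." or "-" on the line stand
  #            for any characters in the candidate before that anchor.
  # r:|=*      lets the right edge of the word be completed freely.
  compadd -M 'r:|[.-]=* r:|=*' foo.bar foo-baz
}
# To attach it to a dummy command in an interactive zsh:
#   compdef _tst_anchor tst
```

If the anchors have to be identical as described, "tst f.b" followed by Tab
should offer foo.bar but not foo-baz, and "tst f-b" the reverse.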

> +startitemize()
> +itemiz(\
> +is adjacent to the same substring and which
> +)
> +itemiz(\
> +matches var(tpat), but which
> +)
> +itemiz(\
> +does not contain any substrings matching var(anchor).

That is only applicable to *. Even ? can match the anchor.

> --- a/Test/Y02compmatch.ztst
> +++ b/Test/Y02compmatch.ztst

> - comptest $'tst c...pag\t'
> -0:Documentation example using input c...pag\t
> + comptest $'tst ...pag\t'
> +0:Documentation example using input ...pag

It is good to have tests matching exactly the examples in the documentation but
in some cases there could be value in preserving the essence of the old test
too. To get good test coverage, we want empty and partial components in both
the middle and beginning of the command line to be tested.

> + test_code $example4b_matcher example4_list
> + comptest $'tst ...pag\t^[bc\t^Fg^F^Fa\t'
> +0f:Documentation example using input ...pag with double anchor
> +>line: {tst .g.}{}

Don't think I follow how ...pag would be transformed to .g.
I assume this is copied from two tests before and should be unchanged.

> + example5b_matcher='r:[^.,_-]||[.,_-]=* r:|=*'
> + test_code $example5b_matcher example5_list
> + comptest $'tst  .c\t^[bv\t.h\t^[bv'
> +0f:Documentation example using input .c but with double anchor

The second tab doesn't really test the matcher because the cursor is located
where characters need to be added. And the final tab at the end seems to be
missing - and the same issue would apply if it is added.

>   example7_matcher="r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
>   example7_list=($example6_list)
>   test_code $example7_matcher example7_list
> - comptest $'tst H\t2\t'
> -0:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
> + comptest $'tst H\t^[bF\to2\t^[b5\tb\t'
                            ^
Is there meant to be a tab after that o?

Thanks again for this. You've also helped me to get a clearer picture of
matching control.

Oliver

diff --git a/Src/Zle/compmatch.c b/Src/Zle/compmatch.c
index cc4c3eca9..95eff1e92 100644
--- a/Src/Zle/compmatch.c
+++ b/Src/Zle/compmatch.c
@@ -693,8 +693,9 @@ match_str(char *l, char *w, Brinfo *bpp, int bc, int *rwlp,
alen = mp->ralen; aol = mp->lalen;
}
/* Give up if we don't have enough characters for the
-		     * line-string and the anchor. */
-		    if (ll < llen + alen || lw < alen)
+		     * line-string and the anchor, or for both anchors in
+		     * the case of the trial completion word. */
+		    if (ll < llen + alen || lw < alen + aol)
continue;

if (mp->flags & CMF_LEFT) {
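
To make the effect of the changed condition concrete, here is a small model
of the patched check alone. It is a sketch, not the C code: the function
name too_short is made up, and the meanings given to the arguments are
assumptions read off the variable names and the comment in the hunk.

```shell
# Arguments, in order, mirroring the C variables (meanings assumed):
#   $1 ll    remaining length of the line string
#   $2 lw    remaining length of the candidate word
#   $3 llen  length of the line pattern
#   $4 alen  length of the anchor
#   $5 aol   length of the other anchor (the coanchor)
# Returns success when the matcher should be skipped.
too_short() {
  [ "$1" -lt $(( $3 + $4 )) ] || [ "$2" -lt $(( $4 + $5 )) ]
}

# One-character anchor and coanchor, empty line pattern, one character
# left on both the line and the word.
if too_short 1 1 0 1 1; then
  echo "skip this matcher"
fi
```

With these numbers the old test (lw < alen, that is 1 < 1) would not have
bailed out, whereas the patched test does, since the word cannot hold both
the anchor and the coanchor.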

* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-22 13:02    Oliver Kiddle
@ 2021-10-25 18:41      Marlon Richert
2021-10-25 19:41        Bart Schaefer
From: Marlon Richert @ 2021-10-25 18:41 UTC (permalink / raw)
To: Oliver Kiddle; +Cc: Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 3694 bytes --]

Attached is a new version of the patch, which should address most of
matching logic. Below are my replies to points not addressed by the
patch.

On Wed, Oct 13, 2021 at 7:57 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> For example, this part is misleading:
>
> > > +By default, characters in the string to be completed (referred to here as the
> > > +command line) map only onto identical characters in the list of matches
> [...]
> > > +missing characters are inserted only at the cursor position, if the shell
> > > +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
>
> It's at the end of the current word, not the end of the command line.
> The old wording nearly always says "string on the command line" which
> is only somewhat better; if it's going to be completely rewritten to
> drop "string on the", the phrase "command line" should become more
> precise.  "Incomplete word" perhaps?

On Thu, Oct 14, 2021 at 11:43 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> Is "trial completions" the best term we can come up with? Where it
> occurs in singular form it isn't obvious that it doesn't refer to what
> is on the command-line. I tend to use "candidate matches". With the term
> "matches" for those that remain following matching. Or does anyone have
> other ideas?

I iterated over several permutations of terms in the docs and found
that this one seems to work best:

"When the user invokes completion, the _current word_ on the command
line is used to generate a _match pattern_ defining which
_completions_ are considered _matches._"

This also matches the use of these terms in the majority of the
manual. Where it doesn't match in compwid.yo, I've changed the text so
that it does.
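
A minimal sketch of that terminology, written in POSIX shell rather than taken from zsh's matching code: the current word becomes a default match pattern by appending a `*`, and only the completions that fit the pattern are printed as matches. The function name default_matches and the sample completions are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: print the completions that match the default pattern.
# $1 is the current word; the remaining arguments are the completions.
default_matches() {
  word=$1
  shift
  for completion in "$@"; do
    case $completion in
      # The quoted word is taken literally; the trailing * is the pattern part.
      "$word"*) printf '%s\n' "$completion" ;;
    esac
  done
}

# Current word "comp.so", three completions offered by some completer.
default_matches comp.so comp.sources.unix comp.sys.mac news.answers
```

A matcher given through -M would broaden this pattern before the filtering step; the sketch covers only the default case.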

On Thu, Oct 14, 2021 at 11:43 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> also used for completion definitions for commands.

I used "completions" anyway, because it feels like the most natural
and concise word to use for this purpose. It doesn't appear to cause
any ambiguities in the text.

On Fri, Oct 22, 2021 at 4:02 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> > --- a/Test/Y02compmatch.ztst
> > +++ b/Test/Y02compmatch.ztst
>
> > - comptest $'tst c...pag\t'
> > -0:Documentation example using input c...pag\t
> > + comptest $'tst ...pag\t'
> > +0:Documentation example using input ...pag
>
> It is good to have tests matching exactly the examples in the documentation but
> in some cases there could be value in preserving the essence of the old test
> too. To get good test coverage, we want empty and partial components in both
> the middle and beginning of the command line to be tested.

I'm not sure what these tests should look like. So, I didn't add them.

On Fri, Oct 22, 2021 at 4:02 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> > + example5b_matcher='r:[^.,_-]||[.,_-]=* r:|=*'
> > + test_code $example5b_matcher example5_list
> > + comptest $'tst  .c\t^[bv\t.h\t^[bv'
> > +0f:Documentation example using input .c but with double anchor
>
> The second tab doesn't really test the matcher because the cursor is located
> where characters need to be added.

On Thu, Oct 14, 2021 at 11:43 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> I'm fairly sure that if complete_in_word is unset, missing characters
> are still allowed at the cursor position.

No, it does not appear to work like that in vanilla zsh:

% PS1='%# ' zsh -f
% _tst() { compadd veryverylongfile.c }
% tst v.c^B^B\t

The word 'v.c' in this example will not get completed, no matter where
you put the cursor.
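
This outcome is consistent with how the revised documentation in the attached patch describes the default match pattern. A rough POSIX shell sketch, assuming that description is accurate (the helper names are hypothetical and this is not zsh's implementation): with COMPLETE_IN_WORD unset, the pattern is the word plus a trailing `*`; with it set, a `*` goes at the cursor position instead.

```shell
# Hypothetical helpers modelling the two default match patterns.

# COMPLETE_IN_WORD unset: the word must be a literal prefix of the completion.
matches_at_end() {      # $1 = current word, $2 = completion
  case $2 in
    "$1"*) return 0 ;;
  esac
  return 1
}

# COMPLETE_IN_WORD set: a '*' is inserted at the cursor position.
matches_at_cursor() {   # $1 = text before cursor, $2 = text after cursor, $3 = completion
  case $3 in
    "$1"*"$2") return 0 ;;
  esac
  return 1
}

matches_at_end v.c veryverylongfile.c || echo "v.c* does not match"
matches_at_cursor v .c veryverylongfile.c && echo "v*.c matches"
```

Under this model, the word 'v.c' can only reach 'veryverylongfile.c' when a `*` lands between the 'v' and the '.c', either through COMPLETE_IN_WORD or through a matcher.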

[-- Attachment #2: 0001-Define-correct-behavior-of-completion-matchers.txt --]
[-- Type: text/plain, Size: 51681 bytes --]

From 94bab5e505aecd4b437187ab938c3c08948233db Mon Sep 17 00:00:00 2001
From: Marlon Richert <marlon.richert@gmail.com>
Date: Mon, 25 Oct 2021 21:33:29 +0300
Subject: [PATCH] Define correct behavior of || completion matchers

* Add xfail tests to show how :||= matchers should behave in order to
provide completion features that cannot be implemented with :|=
matchers.
* Update compwid.yo to further describe the correct behavior.
* Update compwid.yo to use completion terminology more consistently.
---
Doc/Zsh/compwid.yo     | 671 +++++++++++++++++++++--------------------
Test/Y02compmatch.ztst | 126 ++++++--
2 files changed, 450 insertions(+), 347 deletions(-)

diff --git a/Doc/Zsh/compwid.yo b/Doc/Zsh/compwid.yo
index 3e86d3b42..1a606f19c 100644
--- a/Doc/Zsh/compwid.yo
+++ b/Doc/Zsh/compwid.yo
@@ -39,7 +39,7 @@ using the tt(bindkey) builtin command defined in the tt(zsh/zle) module
ifzman(see zmanref(zshzle))\
ifnzman(noderef(Zsh Line Editor))\
), typing that key will call the shell function tt(completer)'. This
-function is responsible for generating the possible matches using the
+function is responsible for generating completion matches using the
builtins described below.  As with other ZLE widgets, the function is
called with its standard input closed.

@@ -213,7 +213,7 @@ The string of an exact match if one was found, otherwise unset.
)
vindex(ignored, compstate)
item(tt(ignored))(
-The number of words that were ignored because they matched one of the
+The number of completions that were ignored because they matched one of the
patterns given with the tt(-F) option to the tt(compadd) builtin
command.
)
@@ -309,8 +309,7 @@ will be used in the same way as the value of tt(LISTMAX).
)
vindex(nmatches, compstate)
item(tt(nmatches))(
-The number of matches generated and accepted by the completion code so
-far.
+The number of matches added by the completion code so far.
)
vindex(old_insert, compstate)
item(tt(old_insert))(
@@ -346,7 +345,8 @@ value of a parameter assignment.
vindex(pattern_insert, compstate)
item(tt(pattern_insert))(
Normally this is set to tt(menu), which specifies that menu completion will
-be used whenever a set of matches was generated using pattern matching.  If
+be used whenever a set of matches was generated using
+tt(pattern_match) (see below).  If
it is set to any other non-empty string by the user and menu completion is
not selected by other option settings, the code will instead insert any
common prefix for the generated matches as with normal completion.
@@ -362,7 +362,7 @@ command line will be treated as patterns; if it is tt(*)', then
additionally a wildcard tt(*)' is assumed at the cursor position; if
it is empty or unset, metacharacters will be treated literally.

-Note that the matcher specifications given to the tt(compadd) builtin
+Note that the match specifications given to the tt(compadd) builtin
command are not used if this is set to a non-empty string.
)
vindex(quote, compstate)
@@ -456,17 +456,16 @@ xitem(SPACES()[tt(-V) var(group-name) ] [ tt(-o) [ var(order) ] ])
xitem(SPACES()[tt(-r) var(remove-chars) ] [ tt(-R) var(remove-func) ])
xitem(SPACES()[tt(-D) var(array) ] [ tt(-O) var(array) ] [ tt(-A) var(array) ])
xitem(SPACES()[tt(-E) var(number) ])
-item(SPACES()[tt(-M) var(match-spec) ] [ tt(-)tt(-) ] [ var(words) ... ])(
+item(SPACES()[tt(-M) var(match-spec) ] [ tt(-)tt(-) ] [ var(completions) ... ])(

This builtin command can be used to add matches directly and control
all the information the completion code stores with each possible
-match. The return status is zero if at least one match was added and
+completion. The return status is zero if at least one match was added and
non-zero if no matches were added.

-The completion code breaks the string to complete into seven fields in
-the order:
+The completion code breaks each match into seven fields in the order:

-indent(var(<ipre><apre><hpre><word><hsuf><asuf><isuf>))
+indent(var(<ipre><apre><hpre><body><hsuf><asuf><isuf>))

The first field
is an ignored prefix taken from the command line, the contents of the
@@ -474,12 +473,12 @@ tt(IPREFIX) parameter plus the string given with the tt(-i)
option. With the tt(-U) option, only the string from the tt(-i)
option is used. The field var(<apre>) is an optional prefix string
given with the tt(-P) option.  The var(<hpre>) field is a string
-that is considered part of the match but that should not be shown when
+that is considered part of the match but that should not be shown when
listing completions, given with the tt(-p) option; for example,
functions that do filename generation might specify
-a common path prefix this way.  var(<word>) is the part of the match that
-should appear in the list of completions, i.e. one of the var(words) given
-at the end of the tt(compadd) command line. The suffixes var(<hsuf>),
+a common path prefix this way.  var(<body>) is the part of the match that
+should appear in the list of matches shown to the user.
+The suffixes var(<hsuf>),
var(<asuf>) and var(<isuf>) correspond to the prefixes var(<hpre>),
var(<apre>) and var(<ipre>) and are given by the options tt(-s), tt(-S) and
tt(-I), respectively.
@@ -488,52 +487,52 @@ The supported flags are:

startitem()
item(tt(-P) var(prefix))(
-This gives a string to be inserted before the given var(words).  The
+This gives a string to be inserted before each match.  The
string given is not considered as part of the match and any shell
metacharacters in it will not be quoted when the string is inserted.
)
item(tt(-S) var(suffix))(
-Like tt(-P), but gives a string to be inserted after the match.
+Like tt(-P), but gives a string to be inserted after each match.
)
item(tt(-p) var(hidden-prefix))(
-This gives a string that should be inserted into the command line before the
+This gives a string that should be inserted before each
match but that should not appear in the list of matches. Unless the
tt(-U) option is given, this string must be matched as part of the string
on the command line.
)
item(tt(-s) var(hidden-suffix))(
-Like tt(-p)', but gives a string to insert after the match.
+Like tt(-p)', but gives a string to insert after each match.
)
item(tt(-i) var(ignored-prefix))(
-This gives a string to insert into the command line just before any
+This gives a string to insert just before any
string given with the tt(-P)' option.  Without tt(-P)' the string is
-inserted before the string given with tt(-p)' or directly before the
+inserted before the string given with tt(-p)' or directly before each
match.
)
item(tt(-I) var(ignored-suffix))(
Like tt(-i), but gives an ignored suffix.
)
item(tt(-a))(
-With this flag the var(words) are taken as names of arrays and the
-possible matches are their values.  If only some elements of the
-arrays are needed, the var(words) may also contain subscripts, as in
+With this flag the var(completions) are taken as names of arrays and the
+actual completions are their values.  If only some elements of the
+arrays are needed, the var(completions) may also contain subscripts, as in
tt(foo[2,-1])'.
)
item(tt(-k))(
-With this flag the var(words) are taken as names of associative arrays
-and the possible matches are their keys.  As for tt(-a), the
+With this flag the var(completions) are taken as names of associative arrays
+and the actual completions are their keys.  As for tt(-a), the
var(words) may also contain subscripts, as in tt(foo[(R)*bar*])'.
)
item(tt(-d) var(array))(
-This adds per-match display strings. The var(array) should contain one
-element per var(word) given. The completion code will then display the
-first element instead of the first var(word), and so on. The
+This adds per-completion display strings. The var(array) should contain one
+element per var(completion) given. The completion code will then display the
+first element instead of the first var(completion), and so on. The
var(array) may be given as the name of an array parameter or directly
as a space-separated list of words in parentheses.

-If there are fewer display strings than var(words), the leftover
-var(words) will be displayed unchanged and if there are more display
-strings than var(words), the leftover display strings will be silently
+If there are fewer display strings than var(completions), the leftover
+var(completions) will be displayed unchanged and if there are more display
+strings than var(completions), the leftover display strings will be silently
ignored.
)
item(tt(-l))(
@@ -556,7 +555,8 @@ by the tt(-d) option). This is the default if tt(-o)' is specified but
the var(order) argument is omitted.
)
item(tt(nosort))(
-This specifies that the matches are pre-sorted and their order should be
+This specifies that the var(completions)
+are pre-sorted and their order should be
preserved.  This value only makes sense alone and cannot be combined with any
others.
)
@@ -570,7 +570,7 @@ Arrange the matches backwards by reversing the sort ordering.
enditem()
)
item(tt(-J) var(group-name))(
-Gives the name of the group of matches the words should be stored in.
+Gives the name of the group that the matches should be stored in.
)
item(tt(-V) var(group-name))(
Like tt(-J) but naming an unsorted group. This option is identical to
@@ -616,13 +616,13 @@ produce unexpected results. If arbitrary text is to be passed in a
description, it can be escaped using e.g. tt(${my_str//\%/%%}).
)
item(tt(-x) var(message))(
-Like tt(-X), but the var(message) will be printed even if there are no
+Like tt(-X), but the var(message) will be printed even if there are no
matches in the group.
)
item(tt(-q))(
-The suffix given with tt(-S) will be automatically removed if
+The suffix given with tt(-S) will be automatically removed if
the next character typed is a blank or does not insert anything, or if
-the suffix consists of only one character and the next character typed
+the suffix consists of only one character and the next character typed
is the same character.
)
item(tt(-r) var(remove-chars))(
@@ -644,8 +644,8 @@ automatically added space will be removed when one of the characters
in the list is typed.
)
item(tt(-R) var(remove-func))(
-This is another form of the tt(-r) option. When a suffix
-has been inserted and the completion accepted, the function
+This is another form of the tt(-r) option. When a match
+has been accepted and a suffix has been inserted, the function
var(remove-func) will be called after the next character typed. It is
passed the length of the suffix as an argument and can use the special
parameters available in ordinary (non-completion) zle widgets (see
@@ -654,7 +654,7 @@ ifnzman(noderef(Zsh Line Editor))\
) to analyse and modify the command line.
)
item(tt(-f))(
-If this flag is given, all of the matches built from var(words) are
+If this flag is given, all of the matches built from the var(completions) are
marked as being the names of files. They are not required to be actual
filenames, but if they are, and the option tt(LIST_TYPES) is set, the
characters describing the types of the files in the completion lists will
@@ -668,15 +668,14 @@ the tt(AUTO_PARAM_SLASH) and tt(AUTO_PARAM_KEYS) options be used for the
matches.
)
item(tt(-W) var(file-prefix))(
-This string is a pathname that will be
-prepended to each of the matches formed by the given var(words) together
+This string is a pathname that will be prepended to each match together
with any prefix specified by the tt(-p) option to form a complete filename
for testing. Hence it is only useful if combined with the tt(-f) flag, as
the tests will not otherwise be performed.
)
item(tt(-F) var(array))(
-Specifies an array containing patterns. Words matching one of these
-patterns are ignored, i.e. not considered to be possible matches.
+Specifies an array containing patterns. var(completions) that match one of
+these patterns are ignored, that is, not considered to be matches.

The var(array) may be the name of an array parameter or a list of
literal patterns enclosed in parentheses and quoted, as in tt(-F "(*?.o
@@ -684,8 +683,8 @@ literal patterns enclosed in parentheses and quoted, as in tt(-F "(*?.o
taken as the patterns.
)
item(tt(-Q))(
-This flag instructs the completion
-code not to quote any metacharacters in the words when inserting them
+This flag instructs the completion
+code not to quote any metacharacters in the matches when inserting them
into the command line.
)
item(tt(-M) var(match-spec))(
@@ -696,47 +695,48 @@ between them to form the specification string to use.
Note that they will only be used if the tt(-U) option is not given.
)
item(tt(-n))(
-Specifies that the words added are to be used as possible
-matches, but are not to appear in the completion listing.
+Specifies that matching var(completions) are to be added to the set of
+matches, but are not to be listed to the user.
)
item(tt(-U))(
-If this flag is given, all words given will be accepted and no matching
+If this flag is given, all var(completions) are added
+to the set of matches and no matching
will be done by the completion code. Normally this is used in
functions that do the matching themselves.
)
item(tt(-O) var(array))(
-If this option is given, the var(words) are em(not) added to the set of
-possible completions. Instead, matching is done as usual and all of the
-var(words) given as arguments that match the string on the command line
+If this option is given, the var(completions) are em(not) added to the set of
+matches. Instead, matching is done as usual and all of the
+var(completions) that match
will be stored in the array parameter whose name is given as var(array).
)
item(tt(-A) var(array))(
-As the tt(-O) option, except that instead of those of the var(words) which
+As the tt(-O) option, except that instead of those of the var(completions)
+which
match being stored in var(array), the strings generated internally by the
-completion code are stored. For example,
-with a matching specification of tt(-M "L:|no=")', the string tt(nof)'
-on the command line and the string tt(foo)' as one of the var(words), this
+completion code are stored. For example,
+with a match specification of tt(-M "L:|no=")', a current word of tt(nof)'
+and var(completions) of tt(foo)', this
option stores the string tt(nofoo)' in the array, whereas the tt(-O)
option stores the tt(foo)' originally given.
)
item(tt(-D) var(array))(
-As with tt(-O), the var(words) are not added to the set of possible
-completions. Instead, the completion code tests whether each var(word)
-in turn matches what is on the line. If the var(n)th var(word) does not
+As with tt(-O), the var(completions) are not added to the set of matches.
+Instead, whenever the var(n)th var(completion) does not
match, the var(n)th element of the var(array) is removed. Elements
-for which the corresponding var(word) is matched are retained.
+for which the corresponding var(completion) matches are retained.
)
item(tt(-C))(
This option adds a special match which expands to all other matches
when inserted into the line, even those that are added after this
option is used. Together with the tt(-d) option it is possible to
-specify a string that should be displayed in the list for this special
-match. If no string is given, it will be shown as a string containing
-the strings that would be inserted for the other matches, truncated to
+specify a string that should be displayed in the list for this special
+match. If no string is given, it will be shown as a string containing
+the strings that would be inserted for the other matches, truncated to
the width of the screen.
)
item(tt(-E) var(number))(
-This option adds var(number) empty matches after the var(words) have
+This option adds var(number) empty matches after matching var(completions) have
been added. An empty match takes up space in completion listings but
will never be inserted in the line and can't be selected with menu
completion or menu selection. This makes empty matches only useful to
@@ -751,7 +751,7 @@ added.
xitem(tt(-))
item(tt(-)tt(-))(
This flag ends the list of flags and options. All arguments after it
-will be taken as the words to use as matches even if they begin with
+will be taken as the var(completions) even if they begin with
hyphens.
)
enditem()
@@ -788,7 +788,7 @@ Without the optional var(number), the longest match is taken, but
if var(number) is given, anything up to the var(number)th match is
moved. If the var(number) is negative, the var(number)th longest
match is moved. For example, if tt(PREFIX) contains the string
-tt(a=b=c)', then tt(compset -P '*\=') will move the string tt(a=b=)'
+tt(a=b=c)', then tt(compset -P '*\=') will move the string tt(a=b=)'
into the tt(IPREFIX) parameter, but tt(compset -P 1 '*\=') will move
only the string tt(a=)'.
)
@@ -801,7 +801,7 @@ As tt(-P), but match the last portion of tt(SUFFIX) and transfer the
matched portion to the front of the value of tt(ISUFFIX).
)
item(tt(-n) var(begin) [ var(end) ])(
-If the current word position as specified by the parameter tt(CURRENT)
+If the current word position as specified by the parameter tt(CURRENT)
is greater than or equal to var(begin), anything up to the
var(begin)th word is removed from the tt(words) array and the value
of the parameter tt(CURRENT) is decremented by var(begin).
@@ -824,7 +824,7 @@ point to the same word in the changed array.
If the optional pattern var(end-pat) is also given, and there is an
element in the tt(words) array matching this pattern, the parameters
are modified only if the index of this word is higher than the one
-given by the tt(CURRENT) parameter (so that the matching word has
+given by the tt(CURRENT) parameter (so that the matching word has
to be after the cursor). In this case, the words starting with the one
matching tt(end-pat) are also removed from the tt(words)
array. If tt(words) contains no word matching var(end-pat), the
@@ -833,7 +833,7 @@ testing and modification is performed as if it were not given.
item(tt(-q))(
The word
currently being completed is split on spaces into separate words,
-respecting the usual shell quoting conventions. The
+respecting the usual shell quoting conventions. The
resulting words are stored in the tt(words) array, and tt(CURRENT),
tt(PREFIX), tt(SUFFIX), tt(QIPREFIX), and tt(QISUFFIX) are modified to
reflect the word part that is completed.
@@ -885,7 +885,7 @@ item(tt(-suffix) [ var(number) ] var(pattern))(
true if the test for the tt(-S) option of tt(compset) would succeed.
)
item(tt(-after) var(beg-pat))(
-true if the test of the tt(-N) option with only the var(beg-pat) given
+true if the test of the tt(-N) option with only the var(beg-pat) given
would succeed.
)
item(tt(-between) var(beg-pat end-pat))(
@@ -896,275 +896,286 @@ enditem()
texinode(Completion Matching Control)(Completion Widget Example)(Completion Condition Codes)(Completion Widgets)
sect(Completion Matching Control)

-It is possible by use of the
-tt(-M) option of the tt(compadd) builtin command to specify how the
-characters in the string to be completed (referred to here as the
-command line) map onto the characters in the list of matches produced by
-the completion code (referred to here as the trial completions). Note
-that this is not used if the command line contains a glob pattern and
-the tt(GLOB_COMPLETE) option is set or the tt(pattern_match) of the
-tt(compstate) special association is set to a non-empty string.
+When the user invokes completion, the current em(word) on the command line
+(that is, the word the cursor is currently on) is used to generate a em(match
+pattern). Only those em(completions) that match the pattern are offered to the
+user as em(matches).

-The var(match-spec) given as the argument to the tt(-M) option (see
+The default match pattern is generated from the current word by either
+
+startitemize()
+itemiz(\
+appending a tt(*)' (matching any number of characters in a completion)
+em(or,)\
+)
+itemiz(\
+if the shell option tt(COMPLETE_IN_WORD) is set, inserting a tt(*)' at the
+cursor position.\
+)
+enditemize()
+
+This narrow pattern can be broadened selectively by passing a em(match
+specification) to the tt(compadd) builtin command through its tt(-M) option
+(see
ifzman(Completion Builtin Commands' above)\
ifnzman(noderef(Completion Builtin Commands))\
-) consists of one or more matching descriptions separated by
-whitespace. Each description consists of a letter followed by a colon
-and then the patterns describing which character sequences on the line match
-which character sequences in the trial completion. Any sequence of
-characters not handled in this fashion must match exactly, as usual.
-
-The forms of var(match-spec) understood are as follows. In each case, the
-form with an upper case initial character retains the string already
-typed on the command line as the final result of completion, while with
-a lower case initial character the string on the command line is changed
-into the corresponding part of the trial completion.
+). A match specification consists of one or more var(matchers) separated by
+whitespace. Matchers in a match specification are applied one at a time, from
+left to right. Once all matchers have been applied, completions are compared
+to the final match pattern and non-matching ones are discarded.
+
+startitemize()
+itemiz(\
+Note that the tt(-M) option is ignored if the current word contains a glob
+pattern and the shell option tt(GLOB_COMPLETE) is set or if the
+tt(pattern_match) key of the special associative array tt(compstate) is set to
+a non-empty value (see
+ifzman(Completion Special Parameters' above)\
+ifnzman(noderef(Completion Special Parameters))\
+).\
+)
+itemiz(\
+Users of the \
+ifzman(completion system (see zmanref(zshcompsys))) \
+ifnzman(noderef(Completion System)) \
+should generally not use the tt(-M) option directly, but rather use the
+tt(matcher-list) and tt(matcher) styles (see the subsection em(Standard Styles)
+in
+ifzman(\
+the documentation for COMPLETION SYSTEM CONFIGURATION in zmanref(zshcompsys))\
+ifnzman(noderef(Completion System Configuration))\
+).\
+)
+enditemize()
+
+Each matcher consists of
+
+startitemize()
+itemiz(a case-sensitive letter)
+itemiz(a tt(:)',)
+itemiz(one or more patterns separated by pipes (tt(|)'),)
+itemiz(an equals sign (tt(=)'), and)
+itemiz(another pattern.)
+enditemize()
+
+The patterns before the tt(=)' are used to match substrings of the current
+word. For each matched substring, the corresponding part of the match pattern
+is broadened with the pattern after the tt(=)', by means of a logical tt(OR).
+
+Each pattern in a matcher cosists of either
+
+startitemize()
+itemiz(the empty string or)
+itemiz(a sequence of
+
+startitemize()
+itemiz(literal characters (which may be quoted with a tt(\)'),)
+itemiz(question marks (tt(?)'),)
+itemiz(\
+bracket expressions (tt([...])'; see the subsection em(Glob Operators) in
+ifnzman(noderef(Filename Generation))\
+ifzman(the documentation for GLOB OPERATORS in zmanref(zshexpn))\
+), and/or\
+)
+itemiz(brace expressions (see below).)
+enditemize()
+)
+enditemize()
+
+Other shell patterns are not allowed.
+
+A brace expression, like a bracket expression, consists of a list of
+
+startitemize()
+itemiz(literal characters,)
+itemiz(ranges (tt(0-9)'), and/or)
+itemiz(character classes (tt([:)var(name)tt(:])').)
+enditemize()
+
+However, they differ from each other as follows:
+
+startitemize()
+itemiz(\
+A brace expression is delimited by a pair of braces (tt({...})').\
+)
+itemiz(\
+Brace expressions do not support negations. That is, an initial
+tt(!)' or tt(^)' has no special meaning and will be interpreted as a literal
+character.\
+)
+itemiz(\
+When a character in the current word matches the var(n)th pattern in a brace
+expression, the corresponding part of the match pattern is broadened only with
+the var(n)th pattern of the brace expression on the other side of the tt(=)',
+if there is one; if there is no brace expression on the other side, then this
+pattern is the empty string. However, if either brace expression has more
+elements than the other, then the excess entries are simply ignored. When
+comparing indexes, each literal character or character class counts as one
+element, but each range is instead expanded to the full list of literal
+characters it represents. Additionally, if on em(both) sides of the
+tt(=)', the var(n)th pattern is tt([:upper:])' or tt([:lower:])', then these
+are expanded as ranges, too.\
+)
+enditemize()
+
+Note that, although the matching system does not yet handle multibyte
+characters, this is likely to be a future extension. Hence, using
+tt([:upper:])' and tt([:lower:])' is recommended over
+tt(A-Z)' and tt(a-z)'.
+
+Below are the different forms of matchers supported. Each em(uppercase) form
+behaves exactly like its lowercase counterpart, but adds an additional step
+em(after) the match pattern has filtered out non-matching completions: Each of
+a match's substrings that was matched by a subpattern from an uppercase matcher
+is replaced with the corresponding substring of the current word. However,
+patterns from em(lowercase) matchers have higher weight: If a substring of the
+current word was matched by patterns from both a lowercase and an uppercase
+matcher, then the lowercase matcher's pattern wins and the corresponding part
+of the match is not modified.
+
+Unless indicated otherwise, each example listed assumes tt(COMPLETE_IN_WORD) to
+be unset (as it is by default).

startitem()
-xitem(tt(m:)var(lpat)tt(=)var(tpat))
-item(tt(M:)var(lpat)tt(=)var(tpat))(
-Here, var(lpat) is a pattern that matches on the command line,
-corresponding to var(tpat) which matches in the trial completion.
-)
-xitem(tt(l:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
-xitem(tt(L:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
-xitem(tt(l:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(L:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(b:)var(lpat)tt(=)var(tpat))
-item(tt(B:)var(lpat)tt(=)var(tpat))(
-These letters are for patterns that are anchored by another pattern on
-the left side. Matching for var(lpat) and var(tpat) is as for tt(m) and
-tt(M), but the pattern var(lpat) matched on the command line must be
-preceded by the pattern var(lanchor). The var(lanchor) can be blank to
-anchor the match to the start of the command line string; otherwise the
-anchor can occur anywhere, but must match in both the command line and
-trial completion strings.
-
-If no var(lpat) is given but a var(ranchor) is, this matches the gap
-between substrings matched by var(lanchor) and var(ranchor). Unlike
-var(lanchor), the var(ranchor) only needs to match the trial
-completion string.
-
-The tt(b) and tt(B) forms are similar to tt(l) and tt(L) with an empty
-anchor, but need to match only the beginning of the word on the command line
-or trial completion, respectively.
-)
-xitem(tt(r:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
-xitem(tt(R:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
-xitem(tt(r:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(R:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(e:)var(lpat)tt(=)var(tpat))
-item(tt(E:)var(lpat)tt(=)var(tpat))(
-As tt(l), tt(L), tt(b) and tt(B), with the difference that the command
-line and trial completion patterns are anchored on the right side.
-Here an empty var(ranchor) and the tt(e) and tt(E) forms force the
-match to the end of the command line or trial completion string.
-
-In the form where var(lanchor) is given, the var(lanchor) only needs
-to match the trial completion string.
+xitem(tt(m:)var(word-pat)tt(=)var(match-pat))
+item(tt(M:)var(word-pat)tt(=)var(match-pat))(
+
+For each substring of the current word that matches var(word-pat), broaden the
+corresponding part of the match pattern to additionally match var(match-pat).
+
+startitem()
+item(Examples:)(
+
+tt(m:{[:lower:]}={[:upper:]}) lets any lower case character in the current word
+be completed to itself or its uppercase counterpart. So, the completions
+tt(foo)', tt(FOO)' and tt(Foo)' will are be considered matches for the word
+tt(fo)'.
+
+tt(M:_=) inserts every underscore from the current word into each match, in the
+same relative position, determined by matching the substrings around it. So,
+given a completion tt(foo)', the word tt(f_o)' will be completed to the match
+tt(f_oo)', even though the latter was not present as a completion.
)
-item(tt(x:))(
-This form is used to mark the end of matching specifications:
-subsequent specifications are ignored. In a single standalone list
-of specifications this has no use but where matching specifications
-are accumulated, such as from nested function calls, it can allow one
-function to override another.
+enditem()
+)
+xitem(tt(b:)var(word-pat)tt(=)var(match-pat))
+xitem(tt(B:)var(word-pat)tt(=)var(match-pat))
+xitem(tt(e:)var(word-pat)tt(=)var(match-pat))
+item(tt(E:)var(word-pat)tt(=)var(match-pat))(
+
+For each consecutive substring at the tt(b:)eginning or tt(e:)nd of the current
+word that matches var(word-pat), broaden the corresponding part of the match
+pattern to additionally match var(match-pat).
+
+startitem()
+item(Examples:)(
+
+tt(b:-=+)' lets any number of minuses at the start of the current word be
+completed to a minus or a plus.
+
+tt(B:0=)' adds all zeroes at the beginning of the current word to the
+beginning of each match.
)
enditem()
+)
+xitem(tt(l:)tt(|)var(word-pat)tt(=)var(match-pat))
+xitem(tt(L:)tt(|)var(word-pat)tt(=)var(match-pat))
+xitem(tt(R:)var(word-pat)tt(|)tt(=)var(match-pat))
+item(tt(r:)var(word-pat)tt(|)tt(=)var(match-pat))(

-Each var(lpat), var(tpat) or var(anchor) is either an empty string or
-consists of a sequence of literal characters (which may be quoted with a
-backslash), question marks, character classes, and correspondence
-classes; ordinary shell patterns are not used. Literal characters match
-only themselves, question marks match any character, and character
-classes are formed as for globbing and match any character in the given
-set.
-
-Correspondence classes are defined like character classes, but with two
-differences: they are delimited by a pair of braces, and negated classes
-are not allowed, so the characters tt(!) and tt(^) have no special
-meaning directly after the opening brace. They indicate that a range of
-characters on the line match a range of characters in the trial
-completion, but (unlike ordinary character classes) paired according to
-the corresponding position in the sequence. For example, to make any
-ASCII lower case letter on the line match the corresponding upper case
-letter in the trial completion, you can use tt(m:{a-z}={A-Z})'
-(however, see below for the recommended form for this). More
-than one pair of classes can occur, in which case the first class before
-the tt(=) corresponds to the first after it, and so on. If one side has
-more such classes than the other side, the superfluous classes behave
-like normal character classes. In anchor patterns correspondence classes
-also behave like normal character classes.
-
-The standard tt([:)var(name)tt(:])' forms described for standard shell
-patterns (see
-ifnzman(noderef(Filename Generation))\
-ifzman(the section FILENAME GENERATION in zmanref(zshexpn)))
-may appear in correspondence classes as well as normal character
-classes. The only special behaviour in correspondence classes is if
-the form on the left and the form on the right are each one of
-tt([:upper:]), tt([:lower:]). In these cases the
-character in the word and the character on the line must be the same up
-to a difference in case. Hence to make any lower case character on the
-line match the corresponding upper case character in the trial
-completion you can use tt(m:{[:lower:]}={[:upper:]})'. Although the
-matching system does not yet handle multibyte characters, this is likely
-to be a future extension, at which point this syntax will handle
-arbitrary alphabets; hence this form, rather than the use of explicit
-ranges, is the recommended form. In other cases
-tt([:)var(name)tt(:])' forms are allowed. If the two forms on the left
-and right are the same, the characters must match exactly. In remaining
-cases, the corresponding tests are applied to both characters, but they
-are not otherwise constrained; any matching character in one set goes
-with any matching character in the other set: this is equivalent to the
-behaviour of ordinary character classes.
-
-The pattern var(tpat) may also be one or two stars, tt(*)' or
-tt(**)'. This means that the pattern on the command line can match
-any number of characters in the trial completion. In this case the
-pattern must be anchored (on either side); in the case of a single
-star, the var(anchor) then determines how much of the trial completion
-is to be included DASH()- only the characters up to the next appearance of
-the anchor will be matched. With two stars, substrings matched by
-the anchor can be matched, too. In the forms that include two
-anchors, tt(*)' can match characters from the additional anchor
-DASH()- var(lanchor) with tt(r) or var(ranchor) with tt(l).
-
-Examples:
-
-The keys of the tt(options) association defined by the tt(parameter)
-module are the option names in all-lower-case form, without
-underscores, and without the optional tt(no) at the beginning even
-though the builtins tt(setopt) and tt(unsetopt) understand option names
-with upper case letters, underscores, and the optional tt(no). The
-following alters the matching rules so that the prefix tt(no) and any
-underscore are ignored when trying to match the trial completions
-generated and upper case letters on the line match the corresponding
-lower case letters in the words:
-
-example(compadd -M 'L:|[nN][oO]= M:_= M:{[:upper:]}={[:lower:]}' - \
-  ${(k)options} )
-
-The first part says that the pattern tt([nN][oO])' at the beginning
-(the empty anchor before the pipe symbol) of the string on the
-line matches the empty string in the list of words generated by
-completion, so it will be ignored if present. The second part does the
-same for an underscore anywhere in the command line string, and the
-third part uses correspondence classes so that any
-upper case letter on the line matches the corresponding lower case
-letter in the word. The use of the upper case forms of the
-specification characters (tt(L) and tt(M)) guarantees that what has
-already been typed on the command line (in particular the prefix
-tt(no)) will not be deleted.
-
-Note that the use of tt(L) in the first part means that it matches
-only when at the beginning of both the command line string and the
-trial completion. I.e., the string tt(_NO_f)' would not be
-completed to tt(_NO_foo)', nor would tt(NONO_f)' be completed to
-tt(NONO_foo)' because of the leading underscore or the second
-tt(NO)' on the line which makes the pattern fail even though they are
-otherwise ignored. To fix this, one would use tt(B:[nN][oO]=)'
-instead of the first part. As described above, this matches at the
-beginning of the trial completion, independent of other characters or
-substrings at the beginning of the command line word which are ignored
-by the same or other var(match-spec)s.
-
-The second example makes completion case insensitive.  This is just
-the same as in the option example, except here we wish to retain the
-characters in the list of completions:
-
-example(compadd -M 'm:{[:lower:]}={[:upper:]}' ... )
-
-This makes lower case letters match their upper case counterparts.
-To make upper case letters match the lower case forms as well:
-
-example(compadd -M 'm:{[:lower:][:upper:]}={[:upper:][:lower:]}' ... )
-
-A nice example for the use of tt(*) patterns is partial word
-completion. Sometimes you would like to make strings like tt(c.s.u)'
-complete to strings like tt(comp.source.unix)', i.e. the word on the
-command line consists of multiple parts, separated by a dot in this
-example, where each part should be completed separately DASH()- note,
-however, that the case where each part of the word, i.e. tt(comp)',
-tt(source)' and tt(unix)' in this example, is to be completed from
-separate sets of matches
-is a different problem to be solved by the implementation of the
-completion widget.  The example can be handled by:
-
-example(compadd -M 'r:|.=* r:|=*' \
-  - comp.sources.unix comp.sources.misc ...)
-
-The first specification says that var(lpat) is the empty string, while
-var(anchor) is a dot; var(tpat) is tt(*), so this can match anything
-except for the tt(.)' from the anchor in
-the trial completion word.  So in tt(c.s.u)', the matcher sees tt(c)',
-followed by the empty string, followed by the anchor tt(.)', and
-likewise for the second dot, and replaces the empty strings before the
-anchors, giving tt(c)[tt(omp)]tt(.s)[tt(ources)]tt(.u)[tt(nix)]', where
-the last part of the completion is just as normal.
-
-With the pattern shown above, the string tt(c.u)' could not be
-completed to tt(comp.sources.unix)' because the single star means
-that no dot (matched by the anchor) can be skipped. By using two stars
-as in tt(r:|.=**)', however, tt(c.u)' could be completed to
-tt(comp.sources.unix)'. This also shows that in some cases,
-especially if the anchor is a real pattern, like a character class,
-the form with two stars may result in more matches than one would like.
-
-The second specification is needed to make this work when the cursor is
-in the middle of the string on the command line and the option
-tt(COMPLETE_IN_WORD) is set. In this case the completion code would
-normally try to match trial completions that end with the string as
-typed so far, i.e. it will only insert new characters at the cursor
-position rather than at the end.  However in our example we would like
-the code to recognise matches which contain extra characters after the
-string on the line (the tt(nix)' in the example).  Hence we say that the
-empty string at the end of the string on the line matches any characters
-at the end of the trial completion.
-
-More generally, the specification
-
-example(compadd -M 'r:|[.,_-]=* r:|=*' ... )
-
-allows one to complete words with abbreviations before any of the
-characters in the square brackets.  For example, to
-complete tt(veryverylongfile.c) rather than tt(veryverylongheader.h)
-with the above in effect, you can just type tt(very.c) before attempting
-completion.
-
-The specifications with both a left and a right anchor are useful to
-complete partial words whose parts are not separated by some
-special character. For example, in some places strings have to be
-completed that are formed tt(LikeThis)' (i.e. the separate parts are
-determined by a leading upper case letter) or maybe one has to
-complete strings with trailing numbers. Here one could use the simple
-form with only one anchor as in:
-
-example(compadd -M 'r:|[[:upper:]0-9]=* r:|=*' LikeTHIS FooHoo 5foo123 5bar234)
-
-But with this, the string tt(H)' would neither complete to tt(FooHoo)'
-nor to tt(LikeTHIS)' because in each case there is an upper case
-letter before the tt(H)' and that is matched by the anchor. Likewise,
-a tt(2)' would not be completed. In both cases this could be changed
-by using tt(r:|[[:upper:]0-9]=**)', but then tt(H)' completes to both
-tt(LikeTHIS)' and tt(FooHoo)' and a tt(2)' matches the other
-strings because characters can be inserted before every upper case
-letter and digit. To avoid this one would use:
-
-example(compadd -M 'r:[^[:upper:]0-9]||[[:upper:]0-9]=** r:|=*' \
-    LikeTHIS FooHoo foo123 bar234)
-
-By using these two anchors, a tt(H)' matches only upper case tt(H)'s that
-are immediately preceded by something matching the left anchor
-tt([^[:upper:]0-9])'. The effect is, of course, that tt(H)' matches only
-the string tt(FooHoo)', a tt(2)' matches only tt(bar234)' and so on.
-
-When using the completion system (see
-ifzman(zmanref(zshcompsys))\
+If there is a substring at the tt(l:)eft or tt(r:)ight edge of the current word
+that matches var(word-pat), then broaden the corresponding part of the match
+pattern to additionally match var(match-pat).
+
+For each tt(l:), tt(L:), tt(r:) and tt(R:) matcher (including the ones below),
+the pattern var(match-pat) may also be a tt(*)'.  This matches any number of
+characters in a completion.
+
+startitem()
+item(Examples:)(
+
+tt(r:|=*)' appends a tt(*)' to the match pattern, even when
+tt(COMPLETE_IN_WORD) is set and the cursor is not at the end of the current
+word.
+
+If the current word starts with a minus, then tt(L:|-=)' will prepend it to
+each match.
+)
+enditem()
+)
+xitem(tt(l:)var(anchor)tt(|)var(word-pat)tt(=)var(match-pat))
+xitem(tt(L:)var(anchor)tt(|)var(word-pat)tt(=)var(match-pat))
+xitem(tt(r:)var(word-pat)tt(|)var(anchor)tt(=)var(match-pat))
+item(tt(R:)var(word-pat)tt(|)var(anchor)tt(=)var(match-pat))(
+
+For each substring of the current word that matches var(word-pat) and has on
+its tt(l:)eft or tt(r:)ight another substring matching var(anchor), broaden the
+corresponding part of the match pattern to additionally match var(match-pat).
+
+Note that these matchers (and the ones below) modify only what is matched by
+var(word-pat); they do not change the matching behavior of what is matched by
+var(anchor) (or var(coanchor); see the matchers below).  Thus, unless its
+corresponding part of the match pattern has been modified, the anchor in the
+current word has to match literally in each completion, just like any other
+substring of the current word.
+
+If a matcher includes at least one anchor (which includes the matchers with two
+anchors, below), then var(match-pat) may also be tt(*)' or tt(**)'.  tt(*)'
+can match any part of a completion that does not contain any substrings
+matching var(anchor), whereas a tt(**)' can match any part of a completion,
+period.  (Note that this is different from the behavior of tt(*)' in the
+anchorless forms of tt(l:)' and tt(r:)' and also different from tt(*)'
+and tt(**)' in glob expressions.)
+
+startitem()
+item(Examples:)(
+
+tt(r:|.=*)' makes the completion tt(comp.sources.unix)' a match for the word
+tt(..u)' DASH()- but em(not) for the word tt(.u)'.
+
+Given a completion tt(-)tt(-foo)', the matcher tt(L:--|no-=)' will complete
+the word tt(-)tt(-no-)' to the match tt(-)tt(-no-foo)'.
+)
+enditem()
+)
+xitem(tt(l:)var(anchor)tt(||)var(coanchor)tt(=)var(match-pat))
+xitem(tt(L:)var(anchor)tt(||)var(coanchor)tt(=)var(match-pat))
+xitem(tt(r:)var(coanchor)tt(||)var(anchor)tt(=)var(match-pat))
+item(tt(R:)var(coanchor)tt(||)var(anchor)tt(=)var(match-pat))(
+
+For any two consecutive substrings of the current word that match var(anchor)
+and var(coanchor), in the order given, insert the pattern var(match-pat)
+between their corresponding parts in the match pattern.
+
+Note that, unlike var(anchor), the pattern var(coanchor) does not change what
+tt(*)' can match.
+
+startitem()
+item(Examples:)(
+
+tt(r:?||[[:upper:]]=*)' will complete the current word tt(fB)' to
+tt(fooBar)', but it will not complete it to tt(fooHooBar)' (because tt(*)'
+here cannot match anything that includes a match for tt([[:upper:]])), nor
+will it complete tt(B)' to tt(fooBar)' (because there is no character in the
+current word to match var(coanchor)).
+
+Given the current word tt(pass.n)' and a completion tt(pass.byname)', the
+matcher tt(L:.||[[:alpha:]]=by)' will produce the match tt(pass.name)'.
+)
+enditem()
+)
+item(tt(x:))(
+
+Ignore this matcher and all matchers to its right.
+
+This matcher is used to mark the end of a match specification.  In a single
+standalone list of matchers, this has no use, but where match specifications
+are concatenated, as is often the case when using the
+ifzman(completion system (see zmanref(zshcompsys)))\
ifnzman(noderef(Completion System))\
-), users can define match specifications that are to be used for
-specific contexts by using the tt(matcher) and tt(matcher-list)
-styles. The values for the latter will be used everywhere.
+, it can allow one match specification to override another.
+)
+enditem()

texinode(Completion Widget Example)()(Completion Matching Control)(Completion Widgets)
sect(Completion Widget Example)
@@ -1185,5 +1196,5 @@ matches, e.g.:

example(complete-files LPAR()RPAR() { compadd - * })

-This function will complete files in the current directory matching the
+This function will complete files in the current directory matching the
current word.
diff --git a/Test/Y02compmatch.ztst b/Test/Y02compmatch.ztst
index 621707482..4a0a1a060 100644
--- a/Test/Y02compmatch.ztst
+++ b/Test/Y02compmatch.ztst
@@ -378,15 +378,26 @@
comp.graphics.rendering.misc comp.graphics.rendering.raytracing
comp.graphics.rendering.renderman)
test_code $example4_matcher example4_list
- comptest $'tst c.s.u\t'
-0:Documentation example using input c.s.u
+ comptest $'tst .s.u\t'
+0:r:|.=* should complete .s.u
+>line: {tst comp.sources.unix }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{21}
+
+ example4b_matcher='r:[^.]||.=* r:|=*'
+ test_code $example4b_matcher example4_list
+ comptest $'tst .s.u\t^[bc\t'
+0f:r:[^.]||.=* should not complete .s.u, but should complete c.s.u
+>line: {tst .s.u}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
>line: {tst comp.sources.unix }{}
>COMPADD:{}
>INSERT_POSITIONS:{21}

test_code $example4_matcher example4_list
- comptest $'tst c.g.\ta\t.\tp\ta\tg\t'
-0:Documentation example using input c.g.\ta\t.\tp\ta\tg\t
+ comptest $'tst .g.\ta\t.\tp\ta\tg\t'
+0f:r:|.=* should complete .g.
>line: {tst comp.graphics.}{}
>INSERT_POSITIONS:{18}
@@ -424,9 +435,32 @@
>INSERT_POSITIONS:{32}

+ test_code $example4b_matcher example4_list
+ comptest $'tst .g.\t^[bc\t'
+0f:r:[^.]||.=* should not complete .g., but should complete c.g.
+>line: {tst .g.}{}
+>INSERT_POSITIONS:{}
+>line: {tst comp.graphics.}{}
+>INSERT_POSITIONS:{18}
+
test_code $example4_matcher example4_list
- comptest $'tst c...pag\t'
-0:Documentation example using input c...pag\t
+ comptest $'tst ...pag\t'
+0:r:|.=* should complete ...pag
+>line: {tst comp.graphics.apps.pagemaker }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{32}
+
+ test_code $example4b_matcher example4_list
+ comptest $'tst ...pag\t^[bc\t^Fg^F^Fa\t'
+0f:r:[^.]||.=* should not complete ...pag or c...pag, but should complete c.g.a.p
+>line: {tst ...pag}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst c...pag}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
>line: {tst comp.graphics.apps.pagemaker }{}
>COMPADD:{}
>INSERT_POSITIONS:{32}
@@ -444,8 +478,8 @@
example5_matcher='r:|[.,_-]=* r:|=*'
example5_list=(veryverylongfile.c veryverylongheader.h)
test_code $example5_matcher example5_list
- comptest $'tst v.c\tv.h\t'
-0:Documentation example using input v.c\t
+ comptest $'tst  .c\t.h\t'
+0:r:|[.,_-]=* should complete .c and .h
>line: {tst  veryverylongfile.c }{}
>INSERT_POSITIONS:{23}
@@ -453,6 +487,23 @@
>INSERT_POSITIONS:{44}

+ example5b_matcher='r:[^.,_-]||[.,_-]=* r:|=*'
+ test_code $example5b_matcher example5_list
+ comptest $'tst  .c\t^[bv\t.h\t^[bv\t'
+0f:r:[^.,_-]||[.,_-]=* should not complete .c or .h, but should complete v.c and v.h
+>line: {tst  .c}{}
+>INSERT_POSITIONS:{}
+>line: {tst  veryverylongfile.c }{}
+>INSERT_POSITIONS:{23}
+>line: {tst  veryverylongfile.c .h}{}
+>INSERT_POSITIONS:{}
+>INSERT_POSITIONS:{44}
+

example6_list=(LikeTHIS FooHoo 5foo123 5bar234)
test_code 'r:|[A-Z0-9]=* r:|=*' example6_list
@@ -493,16 +544,57 @@
example7_matcher="r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
example7_list=($example6_list)
test_code $example7_matcher example7_list
- comptest $'tst H\t2\t'
-0:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+ comptest $'tst H\t^BF\to\t2\t^B5\tb\t'
+0f:r:[^A-Z0-9]||[A-Z0-9]=** should not complete H, FH, 2 or 52, but should complete FoH and 5b2.
+>line: {tst H}{}
+>INSERT_POSITIONS:{}
+>line: {tst F}{H}
+>INSERT_POSITIONS:{}
>line: {tst FooHoo }{}
>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 2}{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo 5}{2}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo 5bar234 }{}
+>INSERT_POSITIONS:{18}
+
+ example7b_matcher="r:?||[A-Z0-9]=* r:|=*"
+ test_code $example7b_matcher example7_list
+ comptest $'tst H\t^BF2\t^B5\t'
+0f:r:?||[A-Z0-9]=* r:|=* should not complete H or 2, but should complete FH and 52.
+>line: {tst H}{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo }{}
+>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 2}{}
+>INSERT_POSITIONS:{}
>line: {tst FooHoo 5bar234 }{}
>INSERT_POSITIONS:{18}

+ example8_list=(passwd.byname)
+ test_code 'r:[^.]||.=* l:.||[^.]=*'
+ comptest $'tst .^B\tpass^Fname\t'
+0f:r:[^.]||.=* and l:.||[^.]=* should work symmetrically.
+>line: {tst }{.}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst passwd.byname }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{17}
+
+
workers_7311_matcher="m:{a-z}={A-Z} r:|[.,_-]=* r:|=*"
workers_7311_list=(Abc-Def-Ghij.txt Abc-def.ghi.jkl_mno.pqr.txt Abc_def_ghi_jkl_mno_pqr.txt)
test_code $workers_7311_matcher workers_7311_list
@@ -537,11 +629,11 @@
>INSERT_POSITIONS:{5}

- workers_11081_matcher='m:{a-zA-Z}={A-Za-z} r:|[.,_-]=* r:[^A-Z0-9]||[A-Z0-9]=* r:[A-Z0-9]||[^A-Z0-9]=* r:[^0-9]||[0-9]=* r:|=*'
+ workers_11081_matcher='m:{a-zA-Z}={A-Za-z} r:|[.,_-]=* r:|=*'
workers_11081_list=(build.out build.out1 build.out2)
test_code $workers_11081_matcher workers_11081_list
comptest $'tst bui\t\t\t'
-0:Bug from workers 11081
+0:Erratic completion bug from workers 11081: bui > build.out[] > build[.]out > build.out[] > build.out1[] > build.out2[]
>line: {tst build.out}{}
>INSERT_POSITIONS:{13}
@@ -578,7 +670,7 @@
workers_11586_list=(c00.abc c01.abc.def.00.0)
test_code $workers_11586_matcher workers_11586_list
comptest $'tst c00\t.\ta\t'
-0:Bug from workers 11586
+0:Disappearing characters bug from workers 11586: c00\t -> c0[], c00\t -> c0.abc[], c00.\t -> c0.abc[]
>line: {tst c00}{}
>INSERT_POSITIONS:{6}
@@ -611,12 +703,12 @@
>INSERT_POSITIONS:{22}

- workers_13320_matcher='r:|[.,_-]=** r:[^0-9]||[0-9]=**'
+ workers_13320_matcher='r:|[.,_-]=**'
workers_13320_list=(glibc-2.1.94-3.i386.rpm glibc-devel-2.1.94-3.i386.rpm)
workers_13320_list=($workers_13320_list glibc-profile-2.1.94-3.i386.rpm)
test_code $workers_13320_matcher workers_13320_list
comptest $'tst glibc-2.1\t'
-0:Test from workers 13320
+0:Incorrect cursor position bug from workers 13320: glibc-2.1\t -> glibc-2[.]1.94-3.i386.rpm
>line: {tst glibc}{-2.1.94-3.i386.rpm}
>COMPADD:{}
>INSERT_POSITIONS:{9:27}
@@ -641,11 +733,11 @@
>NO:{A.C}

- workers_13345b_matcher='r:|[.,_-]=** r:[^0-9]||[0-9]=**'
+ workers_13345b_matcher='r:|[.,_-]=** r:|[0-9]=**'
workers_13345b_list=(a-b_1_2_2 a-b_2_0.gz a-b_2_0.zip)
test_code $workers_13345b_matcher workers_13345b_list
comptest $'tst a-b_2\t'
-0:Second test from workers 13345
+0:Disappearing character bug from workers 13345: a-b_2\t -> a-b__
>line: {tst a-b_2_}{}
>INSERT_POSITIONS:{8:10}
--
2.33.1
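
For anyone who wants to poke at the semantics that the new test titles describe without building zsh, here is a rough model in plain sh. It is only a sketch under stated assumptions: `matches` is a hypothetical helper, not anything that exists in zsh, and it approximates a matcher by translating the typed word into an extended regular expression instead of calling the completion code. It assumes the word consists of letters, digits and dots and contains no '%'. Mode `co` models the intended behaviour of the `||` form, which is what the xfail (0f) tests ask for, and not necessarily what zsh does today.

```shell
#!/bin/sh
# Hypothetical helper (not part of zsh): approximates, with an ERE, which
# completions a word may match under three of the matchers discussed above.
#   mode 1:  'r:|.=* r:|=*'        gap before each dot, gap excludes dots
#   mode 2:  'r:|.=** r:|=*'       gap before each dot, gap may include dots
#   mode co: 'r:[^.]||.=* r:|=*'   gap only between a non-dot and a dot
# Assumes WORD holds only letters, digits and dots, and contains no '%'.
matches() {  # usage: matches WORD COMPLETION MODE
  case $3 in
    1)  re=$(printf '%s\n' "$1" | sed 's/\./[^.]*\\./g') ;;
    2)  re=$(printf '%s\n' "$1" | sed 's/\./.*\\./g') ;;
    co) # Mark dots that follow a non-dot with '%', escape the remaining
        # dots literally, then expand each '%' into "gap plus dot".
        re=$(printf '%s\n' "$1" |
             sed -e 's/\([^.]\)\./\1%/g' \
                 -e 's/\./\\./g' \
                 -e 's/%/[^.]*\\./g') ;;
    *)  return 2 ;;
  esac
  # The trailing '.*' plays the role of 'r:|=*'.
  printf '%s\n' "$2" | grep -Eq "^${re}.*\$"
}

for word in c.s.u .s.u c.u; do
  for mode in 1 2 co; do
    if matches "$word" comp.sources.unix "$mode"; then
      echo "$word mode=$mode: match"
    else
      echo "$word mode=$mode: no match"
    fi
  done
done
```

In this model c.u matches only in mode 2, and .s.u matches in modes 1 and 2 but not in mode co, which lines up with the test titles above. It says nothing about cursor placement or INSERT_POSITIONS, which only the real test harness can check.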


* Re: [RFC] Add xfail tests for || form of completion matchers
2021-10-25 18:41      Marlon Richert
@ 2021-10-25 19:41        Bart Schaefer
0 siblings, 0 replies; 14+ messages in thread
From: Bart Schaefer @ 2021-10-25 19:41 UTC (permalink / raw)
To: Marlon Richert; +Cc: Oliver Kiddle, Zsh hackers list

On Mon, Oct 25, 2021 at 11:42 AM Marlon Richert
<marlon.richert@gmail.com> wrote:
>
> On Thu, Oct 14, 2021 at 11:43 PM Oliver Kiddle <opk@zsh.org> wrote:
> >
> > I'm fairly sure that if complete_in_word is unset, missing characters
> > are still allowed at the cursor position.
>
> No, it does not appear to work like that in vanilla zsh:

I believe Oliver means that missing characters are allowed when a
matcher is specified, not in general.


end of thread, other threads:[~2021-10-25 19:42 UTC | newest]

2021-10-11 14:34 [RFC] Add xfail tests for || form of completion matchers Marlon Richert
2021-10-12 12:08  Marlon Richert
2021-10-12 15:25    Daniel Shahaf
2021-10-13  4:57      Bart Schaefer
2021-10-13  5:08      Bart Schaefer
2021-10-13 14:20        Marlon Richert
2021-10-13 19:37          Daniel Shahaf
2021-10-13 20:02            Bart Schaefer
2021-10-14 20:25            Oliver Kiddle
2021-10-14 20:43    Oliver Kiddle
2021-10-14 21:16      Bart Schaefer
2021-10-22 13:02    Oliver Kiddle
2021-10-25 18:41      Marlon Richert
2021-10-25 19:41        Bart Schaefer

