zsh-workers
* [RFC] Add xfail tests for || form of completion matchers
@ 2021-10-11 14:34 Marlon Richert
  2021-10-12 12:08 ` Marlon Richert
  0 siblings, 1 reply; 11+ messages in thread
From: Marlon Richert @ 2021-10-11 14:34 UTC (permalink / raw)
  To: Zsh hackers list; +Cc: Oliver Kiddle, Bart Schaefer

[-- Attachment #1: Type: text/plain, Size: 170 bytes --]

The tests show how :||= matchers should behave in order to provide
completion features that cannot be implemented with :|= matchers.

This is a follow-up to users/27228.
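
For anyone who wants to compare the two forms interactively rather than
through the test suite, here is a minimal sketch. The command name `tst` and
the function name `_tst` are placeholders borrowed from the test file, the
word list is example6_list from Y02compmatch.ztst, and the sketch assumes
compinit has already been run so that compdef exists. It makes no claim about
what the current code actually completes.

```zsh
# Matchers to compare: single-anchor form vs. double-anchor form.
single_anchor='r:|[A-Z0-9]=* r:|=*'
double_anchor='r:[^A-Z0-9]||[A-Z0-9]=** r:|=*'

# Switch this to $single_anchor to try the other form.
tst_matcher=$double_anchor

# Hypothetical completion function for a dummy command named `tst`.
_tst() {
  compadd -M "$tst_matcher" - LikeTHIS FooHoo 5foo123 5bar234
}

# Register it only if the completion system is loaded (compinit was run).
if command -v compdef >/dev/null 2>&1; then
  compdef _tst tst
fi
```

After sourcing this in an interactive zsh, typing `tst H` followed by Tab
exercises whichever matcher is selected.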

[-- Attachment #2: 0001-Add-xfail-tests-for-form-of-completion-matchers.txt --]
[-- Type: text/plain, Size: 5113 bytes --]

From 8640592169e90c89fd879baf39274f4a6a5822ee Mon Sep 17 00:00:00 2001
From: Marlon Richert <marlon.richert@gmail.com>
Date: Mon, 11 Oct 2021 17:30:07 +0300
Subject: [PATCH] Add xfail tests for || form of completion matchers

The tests show how :||= matchers should behave in order to provide
completion features that cannot be implemented with :|= matchers.
---
 Test/Y02compmatch.ztst | 108 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 98 insertions(+), 10 deletions(-)

diff --git a/Test/Y02compmatch.ztst b/Test/Y02compmatch.ztst
index 621707482..ee7e422c1 100644
--- a/Test/Y02compmatch.ztst
+++ b/Test/Y02compmatch.ztst
@@ -378,15 +378,26 @@
   comp.graphics.rendering.misc comp.graphics.rendering.raytracing
   comp.graphics.rendering.renderman)
  test_code $example4_matcher example4_list
- comptest $'tst c.s.u\t'
-0:Documentation example using input c.s.u
+ comptest $'tst .s.u\t'
+0:Documentation example using input .s.u
+>line: {tst comp.sources.unix }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{21}
+
+  example4b_matcher='r:[^.]||.=* r:|=*'
+ test_code $example4b_matcher example4_list
+ comptest $'tst .s.u\t^[bc\t'
+0f:Documentation example using input .s.u but with double anchor
+>line: {tst .s.u}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst comp.sources.unix }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{21}
 
  test_code $example4_matcher example4_list
- comptest $'tst c.g.\ta\t.\tp\ta\tg\t'
-0:Documentation example using input c.g.\ta\t.\tp\ta\tg\t
+ comptest $'tst .g.\ta\t.\tp\ta\tg\t'
+0:Documentation example using input .g.\ta\t.\tp\ta\tg\t
 >line: {tst comp.graphics.}{}
 >COMPADD:{}
 >INSERT_POSITIONS:{18}
@@ -424,9 +435,32 @@
 >COMPADD:{}
 >INSERT_POSITIONS:{32}
 
+ test_code $example4b_matcher example4_list
+ comptest $'tst .g.\t^[bc\t'
+0f:Documentation example using input .g. with double anchor
+>line: {tst .g.}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst comp.graphics.}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{18}
+
  test_code $example4_matcher example4_list
- comptest $'tst c...pag\t'
-0:Documentation example using input c...pag\t
+ comptest $'tst ...pag\t'
+0:Documentation example using input ...pag
+>line: {tst comp.graphics.apps.pagemaker }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{32}
+
+ test_code $example4b_matcher example4_list
+ comptest $'tst ...pag\t^[bc\t^Fg^F^Fa\t'
+0f:Documentation example using input ...pag with double anchor
+>line: {tst ...pag}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst c...pag}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst comp.graphics.apps.pagemaker }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{32}
@@ -444,8 +478,8 @@
  example5_matcher='r:|[.,_-]=* r:|=*'
  example5_list=(veryverylongfile.c veryverylongheader.h)
  test_code $example5_matcher example5_list
- comptest $'tst  v.c\tv.h\t'
-0:Documentation example using input v.c\t
+ comptest $'tst  .c\t.h\t'
+0:Documentation example using input .c
 >line: {tst  veryverylongfile.c }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{23}
@@ -453,6 +487,23 @@
 >COMPADD:{}
 >INSERT_POSITIONS:{44}
 
+ example5b_matcher='r:[^.,_-]||[.,_-]=* r:|=*'
+ test_code $example5b_matcher example5_list
+ comptest $'tst  .c\t^[bv\t.h\t^[bv'
+0f:Documentation example using input .c but with double anchor
+>line: {tst  .c}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst  veryverylongfile.c }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{23}
+>line: {tst  veryverylongfile.c .h}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst  veryverylongfile.c veryverylongheader.h }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{44}
+
 
  example6_list=(LikeTHIS FooHoo 5foo123 5bar234)
  test_code 'r:|[A-Z0-9]=* r:|=*' example6_list
@@ -493,15 +544,52 @@
  example7_matcher="r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
  example7_list=($example6_list)
  test_code $example7_matcher example7_list
- comptest $'tst H\t2\t'
-0:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+ comptest $'tst H\t^[bF\to2\t^[b5\tb\t'
+0f:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+>line: {tst H}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst F}{H}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst FooHoo }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{10}
+>line: {tst FooHoo 2}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo 5}{2}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst FooHoo 5bar234 }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{18}
 
+ example7b_matcher="r:?||[A-Z0-9]=* r:|=*"
+ test_code $example7b_matcher example7_list
+ comptest $'tst H\t^[bF2\t^[b5\t'
+0f:Documentation example using "r:?||[A-Z0-9]=* r:|=*"
+>line: {tst H}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 5bar234 }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{18}
+
+ example8_list=(passwd.byname)
+ test_code 'r:[^.]||.=* l:.||[^.]=*' example8_list
+ comptest $'tst .^B\tpass^Fname\t'
+0f:Symmetry between r and l
+>line: {tst }{.}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst passwd.byname }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{17}
+
 
  workers_7311_matcher="m:{a-z}={A-Z} r:|[.,_-]=* r:|=*"
  workers_7311_list=(Abc-Def-Ghij.txt Abc-def.ghi.jkl_mno.pqr.txt Abc_def_ghi_jkl_mno_pqr.txt)
-- 
2.33.0



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-11 14:34 [RFC] Add xfail tests for || form of completion matchers Marlon Richert
@ 2021-10-12 12:08 ` Marlon Richert
  2021-10-12 15:25   ` Daniel Shahaf
  2021-10-14 20:43   ` Oliver Kiddle
  0 siblings, 2 replies; 11+ messages in thread
From: Marlon Richert @ 2021-10-12 12:08 UTC (permalink / raw)
  To: Zsh hackers list; +Cc: Oliver Kiddle, Bart Schaefer

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
>
> The tests show how :||= matchers should behave in order to provide
> completion features that cannot be implemented with :|= matchers.
>
> This is a follow-up to users/27228.

I've now added an accompanying documentation update to the patch.
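
To try the combined m:/M: example from the updated documentation by hand,
something like the following sketch should do. The names `tst2` and `_tst2`
are made up for illustration, and compinit is assumed to have been run
already.

```zsh
# The combined matcher from the updated documentation text.
doc_matcher='m:{[:lower:]}={[:upper:]} M:_='

# Hypothetical completion function offering the single trial completion NO.
_tst2() {
  compadd -M "$doc_matcher" - NO
}

# Register it only if the completion system is loaded (compinit was run).
if command -v compdef >/dev/null 2>&1; then
  compdef _tst2 tst2
fi
```

Typing `tst2 _n_o_` followed by Tab then exercises the case described in the
documentation.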

[-- Attachment #2: 0001-Add-xfail-tests-for-form-of-completion-matchers.txt --]
[-- Type: text/plain, Size: 29640 bytes --]

From 3ec2fceced1f327eb2ac7484772bd1d3756bf8d2 Mon Sep 17 00:00:00 2001
From: Marlon Richert <marlon.richert@gmail.com>
Date: Tue, 12 Oct 2021 15:02:31 +0300
Subject: [PATCH] Add xfail tests for || form of completion matchers

The tests show how :||= matchers should behave in order to provide
completion features that cannot be implemented with :|= matchers.
---
 Doc/Zsh/compwid.yo     | 446 ++++++++++++++++++-----------------------
 Test/Y02compmatch.ztst | 108 +++++++++-
 2 files changed, 293 insertions(+), 261 deletions(-)

diff --git a/Doc/Zsh/compwid.yo b/Doc/Zsh/compwid.yo
index 3e86d3b42..5dd2127df 100644
--- a/Doc/Zsh/compwid.yo
+++ b/Doc/Zsh/compwid.yo
@@ -896,72 +896,210 @@ enditem()
 texinode(Completion Matching Control)(Completion Widget Example)(Completion Condition Codes)(Completion Widgets)
 sect(Completion Matching Control)
 
-It is possible by use of the
-tt(-M) option of the tt(compadd) builtin command to specify how the
-characters in the string to be completed (referred to here as the
-command line) map onto the characters in the list of matches produced by
-the completion code (referred to here as the trial completions). Note
-that this is not used if the command line contains a glob pattern and
-the tt(GLOB_COMPLETE) option is set or the tt(pattern_match) of the
-tt(compstate) special association is set to a non-empty string.
-
-The var(match-spec) given as the argument to the tt(-M) option (see
+By default, characters in the string to be completed (referred to here as the
+command line) map only onto identical characters in the list of matches
+produced by the completion code (referred to here as the trial completions) and
+missing characters are inserted only at the cursor position, if the shell
+option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
+otherwise.  However, it is possible to modify this behavior by use of the
+tt(-M) option of the tt(compadd) builtin command.  Note that this is not used
+if the command line contains a glob pattern and the shell
+option tt(GLOB_COMPLETE) is set or the tt(pattern_match) of the tt(compstate)
+special association is set to a non-empty string.
+
+The tt(-M) option (see
 ifzman(`Completion Builtin Commands' above)\
-ifnzman(noderef(Completion Builtin Commands))\
-) consists of one or more matching descriptions separated by
-whitespace.  Each description consists of a letter followed by a colon
-and then the patterns describing which character sequences on the line match
-which character sequences in the trial completion.  Any sequence of
-characters not handled in this fashion must match exactly, as usual.
-
-The forms of var(match-spec) understood are as follows. In each case, the
-form with an upper case initial character retains the string already
-typed on the command line as the final result of completion, while with
-a lower case initial character the string on the command line is changed
-into the corresponding part of the trial completion.
+ifnzman(noderef(Completion Builtin
+Commands))\
+) requires a var(match-spec) as its argument, consisting of one or more matching
+descriptions separated by whitespace.  Each description consists of a letter,
+followed by a colon, and then patterns describing which substrings on the
+command line map onto which substrings in the trial completion.  Descriptions
+are evaluated from left to right and are cumulative.  An earlier mapping can
+thus potentially change the outcome of a later mapping.  Finally, any unmapped
+substrings will be mapped using the default mapping of identical substrings.
+
+When using the completion system (see
+ifzman(zmanref(zshcompsys))\
+ifnzman(noderef(Completion System))\
+), users can define match specifications that are to be used for specific
+contexts by using the tt(matcher) and tt(matcher-list) styles.  The values for
+the latter will be used everywhere.
+
+Each pattern in a var(match-spec) is either an empty string or consists of a
+sequence of literal characters (which may be quoted with a backslash), question
+marks, character classes, and correspondence classes (see next paragraph).
+Ordinary shell patterns are not used.  Literal characters match only
+themselves, question marks match any character, and character classes are
+formed as for globbing and match any character in the given set.
+
+Correspondence classes are defined like character classes, but with two
+differences: They are delimited by a pair of braces, and negated classes are
+not allowed, so the characters tt(!) and tt(^) have no special meaning directly
+after the opening brace.  They indicate that a range of characters on the line
+match a range of characters in the trial completion, but (unlike ordinary
+character classes) paired according to the corresponding position in the
+sequence.  More than one pair of classes can occur, in which case the first
+class before the tt(=) corresponds to the first after it, and so on.  If one
+side has more such classes than the other side, the superfluous classes behave
+like normal character classes.
+
+The standard `tt([:)var(name)tt(:])' forms described for standard shell
+patterns (see 
+ifnzman(noderef(Filename Generation))\
+ifzman(the section
+FILENAME GENERATION in zmanref(zshexpn))\
+) may appear in correspondence classes as well as normal character classes.
+The only special behaviour in correspondence classes is if the form on the left
+and the form on the right are each one of tt([:upper:]), tt([:lower:]).  In
+these cases the character in the word and the character on the line must be the
+same up to a difference in case.  Although the matching system does not yet
+handle multibyte characters, this is likely to be a future extension, at which
+point this syntax will handle arbitrary alphabets; hence this form, rather than
+the use of explicit ranges, is the recommended form.  In other cases
+`tt([:)var(name)tt(:])' forms are allowed.  If the two forms on the left and
+right are the same, the characters must match exactly.  In remaining cases, the
+corresponding tests are applied to both characters, but they are not otherwise
+constrained; any matching character in one set goes with any matching character
+in the other set: this is equivalent to the behaviour of ordinary character
+classes.
+
+The forms of var(match-spec) understood are listed below.  For each of these,
+the form with an upper case initial character replaces mapped substrings in the
+trial completions with their counterparts from the command line, whereas with a
+lower case initial character, once a trial completion has been accepted,
+matched substrings on the command line are replaced with their counterparts
+from the accepted completion.
 
 startitem()
 xitem(tt(m:)var(lpat)tt(=)var(tpat))
 item(tt(M:)var(lpat)tt(=)var(tpat))(
-Here, var(lpat) is a pattern that matches on the command line,
-corresponding to var(tpat) which matches in the trial completion.
+Let any substring matching var(lpat) be completed to any substring matching
+var(tpat).
+
+Examples:
+
+tt(m:{[:lower:]}={[:upper:]}) lets any lower case character be completed to its
+uppercase counterpart.
+
+tt(M:_=) inserts every underscore on the command line into each trial
+completion, in the same relative position, determined by matching the
+substrings around it.  Note that the definition of what is matching can be
+modified by applying other matchers first.
+
+If these two matchers are combined to tt('m:{[:lower:]}={[:upper:]} M:_='),
+then given a trial completion `tt(NO)', it lets `tt(_n_o_)' be completed to
+`tt(_N_O_)', even though `tt(_N_O_)' itself is not present as a trial
+completion.  tt(m:{[:lower:]}={[:upper:]}) is evaluated first and makes `tt(n)'
+match `tt(N)' and `tt(o)' match `tt(O)', after which tt(M:_=) is then able to
+insert underscores into the correct positions.
+)
+xitem(tt(l:)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(L:)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(r:)var(lpat)tt(|)tt(=)var(tpat))
+item(tt(R:)var(lpat)tt(|)tt(=)var(tpat))(
+Let any substring matching var(lpat) at the left (for tt(l:) and tt(L:)) or
+right (for tt(r:) and tt(R:)) edge of the command line be completed to any
+substring matching var(tpat) in the same position in the trial completion.
+
+With these matchers, the pattern var(tpat) may also be a star, `tt(*)'.  This
+lets a matching command line substring be completed to any trial completion
+substring in the same relative position.
+
+Examples:
+
+tt(L:|[nN][oO]=) makes it so that, if there is a single `tt(no)', `tt(nO)',
+`tt(No)' or `tt(NO)' at the left end of the command line, then it is added to
+the left of each trial completion.
+
+tt(r:|=*) lets (the empty substring at) the right edge of the command line
+string be completed to any number of characters at the edge of each trial
+completion.
+
+If these two matchers are combined to tt('L:|[nN][oO]= r:|=*'), then given a
+trial completion `tt(foo)', it lets `tt(NOf)' be completed to `tt(NOfoo)'.
+First, tt(L:|[nN][oO]=) prefixes the trial completion with tt(NO), after which
+tt(r:|=*) is able to match the command line to the trial completion and
+complete the missing characters at the end.
 )
-xitem(tt(l:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
-xitem(tt(L:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
-xitem(tt(l:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(L:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
 xitem(tt(b:)var(lpat)tt(=)var(tpat))
-item(tt(B:)var(lpat)tt(=)var(tpat))(
-These letters are for patterns that are anchored by another pattern on
-the left side. Matching for var(lpat) and var(tpat) is as for tt(m) and
-tt(M), but the pattern var(lpat) matched on the command line must be
-preceded by the pattern var(lanchor).  The var(lanchor) can be blank to
-anchor the match to the start of the command line string; otherwise the
-anchor can occur anywhere, but must match in both the command line and
-trial completion strings.
-
-If no var(lpat) is given but a var(ranchor) is, this matches the gap
-between substrings matched by var(lanchor) and var(ranchor). Unlike
-var(lanchor), the var(ranchor) only needs to match the trial
-completion string.
-
-The tt(b) and tt(B) forms are similar to tt(l) and tt(L) with an empty 
-anchor, but need to match only the beginning of the word on the command line
-or trial completion, respectively.
-)
-xitem(tt(r:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
-xitem(tt(R:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
-xitem(tt(r:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
-xitem(tt(R:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
+xitem(tt(B:)var(lpat)tt(=)var(tpat))
 xitem(tt(e:)var(lpat)tt(=)var(tpat))
 item(tt(E:)var(lpat)tt(=)var(tpat))(
-As tt(l), tt(L), tt(b) and tt(B), with the difference that the command
-line and trial completion patterns are anchored on the right side.
-Here an empty var(ranchor) and the tt(e) and tt(E) forms force the
-match to the end of the command line or trial completion string.
-
-In the form where var(lanchor) is given, the var(lanchor) only needs
-to match the trial completion string.
+Let all substrings matching var(lpat) at the beginning (for tt(b:) and tt(B:))
+or end (for tt(e:) and tt(E:)) of the command line be completed to the same
+number of substrings matching var(tpat) in each trial completion in the same
+relative position.
+
+Example:
+
+tt(B:[nN][oO]=) adds all occurrences of `tt(no)', `tt(nO)', `tt(No)' and
+`tt(NO)' at the beginning of the command line to the beginning of each trial
+completion.  If tt(r:|=*) is added to this, then given a trial completion
+`tt(foo)', it lets `tt(noNOf)' be completed to `tt(noNOfoo)'.
+)
+xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
+xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
+item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
+Let any command line substring, which is left/right-adjacent (respectively) to
+a substring matching var(anchor) and which matches var(lpat), be completed to
+any trial completion substring, which
+startitemize()
+itemiz(\
+is adjacent to the same substring and which
+)
+itemiz(\
+matches var(tpat), but which
+)
+itemiz(\
+does not contain any substrings matching var(anchor).
+)
+enditemize()
+
+When a matcher includes at least one anchor (which also applies to the forms
+with two anchors, below), the pattern var(tpat) may also be one or two stars,
+`tt(*)' or `tt(**)'.  The first star can match any number of characters, within
+the constraints outlined above, whereas a second star removes the last
+constraint and can match substrings matching var(anchor).
+
+Example:
+
+tt(r:|.=*) lets each dot be completed to any substring that ends at the right
+in a dot, but does not otherwise contain any dots, in the trial string.  Thus,
+given a trial string `tt(comp.sources.unix)', `tt(..unix)' can be completed to
+it, but `tt(.unix)' cannot, since the matcher will refuse to map any dots other
+than the one matched by the var(anchor).
+)
+xitem(tt(l:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
+xitem(tt(L:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
+xitem(tt(r:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))
+item(tt(R:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))(
+Let the empty string between each two adjacent command line substrings
+matching var(anchor) and var(coanchor), in the order given, be completed to any
+trial completion substring, which
+startitemize()
+itemiz(\
+is adjacent to the same two substrings and which
+)
+itemiz(\
+matches var(tpat), but which
+)
+itemiz(\
+does not contain any substrings matching var(anchor).
+)
+enditemize()
+
+Note there is no restriction on substrings matching var(coanchor).
+
+Example:
+
+tt(r:?||[[:upper:]]=*) will complete `tt(fHoo)' to `tt(fooHoo)', but not
+`tt(Hoo)' to `tt(fooHoo)', because there is no character to the left of `tt(H)'
+on the command line.  Likewise, it will not complete `tt(lHIS)' to
+`tt(likeTHIS)', because, other than the one substring it maps to var(anchor),
+it cannot map any substring containing uppercase letters in the trial
+completion.
 )
 item(tt(x:))(
 This form is used to mark the end of matching specifications:
@@ -972,200 +1110,6 @@ function to override another.
 )
 enditem()
 
-Each var(lpat), var(tpat) or var(anchor) is either an empty string or
-consists of a sequence of literal characters (which may be quoted with a
-backslash), question marks, character classes, and correspondence
-classes; ordinary shell patterns are not used.  Literal characters match
-only themselves, question marks match any character, and character
-classes are formed as for globbing and match any character in the given
-set.
-
-Correspondence classes are defined like character classes, but with two
-differences: they are delimited by a pair of braces, and negated classes
-are not allowed, so the characters tt(!) and tt(^) have no special
-meaning directly after the opening brace.  They indicate that a range of
-characters on the line match a range of characters in the trial
-completion, but (unlike ordinary character classes) paired according to
-the corresponding position in the sequence.  For example, to make any
-ASCII lower case letter on the line match the corresponding upper case
-letter in the trial completion, you can use `tt(m:{a-z}={A-Z})'
-(however, see below for the recommended form for this).  More
-than one pair of classes can occur, in which case the first class before
-the tt(=) corresponds to the first after it, and so on.  If one side has
-more such classes than the other side, the superfluous classes behave
-like normal character classes.  In anchor patterns correspondence classes
-also behave like normal character classes.
-
-The standard `tt([:)var(name)tt(:])' forms described for standard shell
-patterns (see
-ifnzman(noderef(Filename Generation))\
-ifzman(the section FILENAME GENERATION in zmanref(zshexpn)))
-may appear in correspondence classes as well as normal character
-classes.  The only special behaviour in correspondence classes is if
-the form on the left and the form on the right are each one of
-tt([:upper:]), tt([:lower:]).  In these cases the
-character in the word and the character on the line must be the same up
-to a difference in case.  Hence to make any lower case character on the
-line match the corresponding upper case character in the trial
-completion you can use `tt(m:{[:lower:]}={[:upper:]})'.  Although the
-matching system does not yet handle multibyte characters, this is likely
-to be a future extension, at which point this syntax will handle
-arbitrary alphabets; hence this form, rather than the use of explicit
-ranges, is the recommended form.  In other cases
-`tt([:)var(name)tt(:])' forms are allowed.  If the two forms on the left
-and right are the same, the characters must match exactly.  In remaining
-cases, the corresponding tests are applied to both characters, but they
-are not otherwise constrained; any matching character in one set goes
-with any matching character in the other set:  this is equivalent to the
-behaviour of ordinary character classes.
-
-The pattern var(tpat) may also be one or two stars, `tt(*)' or
-`tt(**)'. This means that the pattern on the command line can match
-any number of characters in the trial completion. In this case the
-pattern must be anchored (on either side); in the case of a single
-star, the var(anchor) then determines how much of the trial completion
-is to be included DASH()- only the characters up to the next appearance of
-the anchor will be matched. With two stars, substrings matched by
-the anchor can be matched, too. In the forms that include two
-anchors, `tt(*)' can match characters from the additional anchor
-DASH()- var(lanchor) with tt(r) or var(ranchor) with tt(l).
-
-Examples:
-
-The keys of the tt(options) association defined by the tt(parameter)
-module are the option names in all-lower-case form, without
-underscores, and without the optional tt(no) at the beginning even
-though the builtins tt(setopt) and tt(unsetopt) understand option names
-with upper case letters, underscores, and the optional tt(no).  The
-following alters the matching rules so that the prefix tt(no) and any
-underscore are ignored when trying to match the trial completions
-generated and upper case letters on the line match the corresponding
-lower case letters in the words:
-
-example(compadd -M 'L:|[nN][oO]= M:_= M:{[:upper:]}={[:lower:]}' - \ 
-  ${(k)options} )
-
-The first part says that the pattern `tt([nN][oO])' at the beginning
-(the empty anchor before the pipe symbol) of the string on the
-line matches the empty string in the list of words generated by
-completion, so it will be ignored if present. The second part does the
-same for an underscore anywhere in the command line string, and the
-third part uses correspondence classes so that any
-upper case letter on the line matches the corresponding lower case
-letter in the word. The use of the upper case forms of the
-specification characters (tt(L) and tt(M)) guarantees that what has
-already been typed on the command line (in particular the prefix
-tt(no)) will not be deleted.
-
-Note that the use of tt(L) in the first part means that it matches
-only when at the beginning of both the command line string and the
-trial completion. I.e., the string `tt(_NO_f)' would not be
-completed to `tt(_NO_foo)', nor would `tt(NONO_f)' be completed to
-`tt(NONO_foo)' because of the leading underscore or the second
-`tt(NO)' on the line which makes the pattern fail even though they are 
-otherwise ignored. To fix this, one would use `tt(B:[nN][oO]=)'
-instead of the first part. As described above, this matches at the
-beginning of the trial completion, independent of other characters or
-substrings at the beginning of the command line word which are ignored
-by the same or other var(match-spec)s.
-
-The second example makes completion case insensitive.  This is just
-the same as in the option example, except here we wish to retain the
-characters in the list of completions:
-
-example(compadd -M 'm:{[:lower:]}={[:upper:]}' ... )
-
-This makes lower case letters match their upper case counterparts.
-To make upper case letters match the lower case forms as well:
-
-example(compadd -M 'm:{[:lower:][:upper:]}={[:upper:][:lower:]}' ... )
-
-A nice example for the use of tt(*) patterns is partial word
-completion. Sometimes you would like to make strings like `tt(c.s.u)'
-complete to strings like `tt(comp.source.unix)', i.e. the word on the
-command line consists of multiple parts, separated by a dot in this
-example, where each part should be completed separately DASH()- note,
-however, that the case where each part of the word, i.e. `tt(comp)',
-`tt(source)' and `tt(unix)' in this example, is to be completed from
-separate sets of matches
-is a different problem to be solved by the implementation of the
-completion widget.  The example can be handled by:
-
-example(compadd -M 'r:|.=* r:|=*' \ 
-  - comp.sources.unix comp.sources.misc ...)
-
-The first specification says that var(lpat) is the empty string, while
-var(anchor) is a dot; var(tpat) is tt(*), so this can match anything
-except for the `tt(.)' from the anchor in
-the trial completion word.  So in `tt(c.s.u)', the matcher sees `tt(c)',
-followed by the empty string, followed by the anchor `tt(.)', and
-likewise for the second dot, and replaces the empty strings before the
-anchors, giving `tt(c)[tt(omp)]tt(.s)[tt(ources)]tt(.u)[tt(nix)]', where
-the last part of the completion is just as normal.
-
-With the pattern shown above, the string `tt(c.u)' could not be
-completed to `tt(comp.sources.unix)' because the single star means
-that no dot (matched by the anchor) can be skipped. By using two stars 
-as in `tt(r:|.=**)', however, `tt(c.u)' could be completed to
-`tt(comp.sources.unix)'. This also shows that in some cases,
-especially if the anchor is a real pattern, like a character class,
-the form with two stars may result in more matches than one would like.
-
-The second specification is needed to make this work when the cursor is
-in the middle of the string on the command line and the option
-tt(COMPLETE_IN_WORD) is set. In this case the completion code would
-normally try to match trial completions that end with the string as
-typed so far, i.e. it will only insert new characters at the cursor
-position rather than at the end.  However in our example we would like
-the code to recognise matches which contain extra characters after the
-string on the line (the `tt(nix)' in the example).  Hence we say that the
-empty string at the end of the string on the line matches any characters
-at the end of the trial completion.
-
-More generally, the specification
-
-example(compadd -M 'r:|[.,_-]=* r:|=*' ... )
-
-allows one to complete words with abbreviations before any of the
-characters in the square brackets.  For example, to
-complete tt(veryverylongfile.c) rather than tt(veryverylongheader.h)
-with the above in effect, you can just type tt(very.c) before attempting
-completion.
-
-The specifications with both a left and a right anchor are useful to
-complete partial words whose parts are not separated by some
-special character. For example, in some places strings have to be
-completed that are formed `tt(LikeThis)' (i.e. the separate parts are
-determined by a leading upper case letter) or maybe one has to
-complete strings with trailing numbers. Here one could use the simple
-form with only one anchor as in:
-
-example(compadd -M 'r:|[[:upper:]0-9]=* r:|=*' LikeTHIS FooHoo 5foo123 5bar234)
-
-But with this, the string `tt(H)' would neither complete to `tt(FooHoo)'
-nor to `tt(LikeTHIS)' because in each case there is an upper case
-letter before the `tt(H)' and that is matched by the anchor. Likewise, 
-a `tt(2)' would not be completed. In both cases this could be changed
-by using `tt(r:|[[:upper:]0-9]=**)', but then `tt(H)' completes to both
-`tt(LikeTHIS)' and `tt(FooHoo)' and a `tt(2)' matches the other
-strings because characters can be inserted before every upper case
-letter and digit. To avoid this one would use:
-
-example(compadd -M 'r:[^[:upper:]0-9]||[[:upper:]0-9]=** r:|=*' \ 
-    LikeTHIS FooHoo foo123 bar234)
-
-By using these two anchors, a `tt(H)' matches only upper case `tt(H)'s that 
-are immediately preceded by something matching the left anchor
-`tt([^[:upper:]0-9])'. The effect is, of course, that `tt(H)' matches only
-the string `tt(FooHoo)', a `tt(2)' matches only `tt(bar234)' and so on.
-
-When using the completion system (see
-ifzman(zmanref(zshcompsys))\
-ifnzman(noderef(Completion System))\
-), users can define match specifications that are to be used for
-specific contexts by using the tt(matcher) and tt(matcher-list)
-styles. The values for the latter will be used everywhere.
-
 texinode(Completion Widget Example)()(Completion Matching Control)(Completion Widgets)
 sect(Completion Widget Example)
 cindex(completion widgets, example)
diff --git a/Test/Y02compmatch.ztst b/Test/Y02compmatch.ztst
index 621707482..ee7e422c1 100644
--- a/Test/Y02compmatch.ztst
+++ b/Test/Y02compmatch.ztst
@@ -378,15 +378,26 @@
   comp.graphics.rendering.misc comp.graphics.rendering.raytracing
   comp.graphics.rendering.renderman)
  test_code $example4_matcher example4_list
- comptest $'tst c.s.u\t'
-0:Documentation example using input c.s.u
+ comptest $'tst .s.u\t'
+0:Documentation example using input .s.u
+>line: {tst comp.sources.unix }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{21}
+
+  example4b_matcher='r:[^.]||.=* r:|=*'
+ test_code $example4b_matcher example4_list
+ comptest $'tst .s.u\t^[bc\t'
+0f:Documentation example using input .s.u but with double anchor
+>line: {tst .s.u}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst comp.sources.unix }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{21}
 
  test_code $example4_matcher example4_list
- comptest $'tst c.g.\ta\t.\tp\ta\tg\t'
-0:Documentation example using input c.g.\ta\t.\tp\ta\tg\t
+ comptest $'tst .g.\ta\t.\tp\ta\tg\t'
+0:Documentation example using input .g.\ta\t.\tp\ta\tg\t
 >line: {tst comp.graphics.}{}
 >COMPADD:{}
 >INSERT_POSITIONS:{18}
@@ -424,9 +435,32 @@
 >COMPADD:{}
 >INSERT_POSITIONS:{32}
 
+ test_code $example4b_matcher example4_list
+ comptest $'tst .g.\t^[bc\t'
+0f:Documentation example using input .g. with double anchor
+>line: {tst .g.}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst comp.graphics.}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{18}
+
  test_code $example4_matcher example4_list
- comptest $'tst c...pag\t'
-0:Documentation example using input c...pag\t
+ comptest $'tst ...pag\t'
+0:Documentation example using input ...pag
+>line: {tst comp.graphics.apps.pagemaker }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{32}
+
+ test_code $example4b_matcher example4_list
+ comptest $'tst ...pag\t^[bc\t^Fg^F^Fa\t'
+0f:Documentation example using input ...pag with double anchor
+>line: {tst .g.}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst c...pag}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst comp.graphics.apps.pagemaker }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{32}
@@ -444,8 +478,8 @@
  example5_matcher='r:|[.,_-]=* r:|=*'
  example5_list=(veryverylongfile.c veryverylongheader.h)
  test_code $example5_matcher example5_list
- comptest $'tst  v.c\tv.h\t'
-0:Documentation example using input v.c\t
+ comptest $'tst  .c\t.h\t'
+0:Documentation example using input .c
 >line: {tst  veryverylongfile.c }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{23}
@@ -453,6 +487,23 @@
 >COMPADD:{}
 >INSERT_POSITIONS:{44}
 
+ example5b_matcher='r:[^.,_-]||[.,_-]=* r:|=*'
+ test_code $example5b_matcher example5_list
+ comptest $'tst  .c\t^[bv\t.h\t^[bv'
+0f:Documentation example using input .c but with double anchor
+>line: {tst  .c}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst  veryverylongfile.c }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{23}
+>line: {tst  veryverylongfile.c .h}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst  veryverylongfile.c veryverylongheader.h }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{44}
+
 
  example6_list=(LikeTHIS FooHoo 5foo123 5bar234)
  test_code 'r:|[A-Z0-9]=* r:|=*' example6_list
@@ -493,15 +544,52 @@
  example7_matcher="r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
  example7_list=($example6_list)
  test_code $example7_matcher example7_list
- comptest $'tst H\t2\t'
-0:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+ comptest $'tst H\t^[bF\to2\t^[b5\tb\t'
+0f:Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"
+>line: {tst H}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst F}{H}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst FooHoo }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{10}
+>line: {tst FooHoo 2}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo 5}{2}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
 >line: {tst FooHoo 5bar234 }{}
 >COMPADD:{}
 >INSERT_POSITIONS:{18}
 
+ example7b_matcher="r:?||[A-Z0-9]=* r:|=*"
+ test_code $example7b_matcher example7_list
+ comptest $'tst H\t^[bF2\t^[b5\t'
+0f:Documentation example using "r:?||[A-Z0-9]=* r:|=*"
+>line: {tst H}{}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst FooHoo }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{10}
+>line: {tst FooHoo 5bar234 }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{18}
+
+ example8_list=(passwd.byname)
+ test_code 'r:[^.]||.=* l:.||[^.]=*' example8_list
+ comptest $'tst .^B\tpass^Fname\t'
+0f:Symmetry between r and l
+>line: {tst }{.}
+>COMPADD:{}
+>INSERT_POSITIONS:{}
+>line: {tst passwd.byname }{}
+>COMPADD:{}
+>INSERT_POSITIONS:{17}
+
 
  workers_7311_matcher="m:{a-z}={A-Z} r:|[.,_-]=* r:|=*"
  workers_7311_list=(Abc-Def-Ghij.txt Abc-def.ghi.jkl_mno.pqr.txt Abc_def_ghi_jkl_mno_pqr.txt)
-- 
2.33.0



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-12 12:08 ` Marlon Richert
@ 2021-10-12 15:25   ` Daniel Shahaf
  2021-10-13  4:57     ` Bart Schaefer
  2021-10-13  5:08     ` Bart Schaefer
  2021-10-14 20:43   ` Oliver Kiddle
  1 sibling, 2 replies; 11+ messages in thread
From: Daniel Shahaf @ 2021-10-12 15:25 UTC (permalink / raw)
  To: Marlon Richert; +Cc: Zsh hackers list

Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> >
> > The tests show how :||= matchers should behave in order to provide
> > completion features that cannot be implemented with :|= matchers.

Would this be backwards compatible?

> > This is a follow-up to users/27228.
> 
> I've now added an accompanying documentation update to the patch.

Thanks.  I have never found that section easy to follow.

Could you confirm that the text which the docs patch deletes or changes
was all confirmed correct (even if perhaps unclear)?  I'm concerned
about us possibly changing dense, accurate docs into clear, less-accurate
docs.

Case in point: The incumbent docs say that the coanchor is matched only
against the trial completion, but the new docs say something else.  If
that's an intentional change, it needs to be called out explicitly in
the log message.

Haven't looked for other differences.

In the man page rendering on my system, itemiz()'s bullet is vertically
aligned with the parent item()'s text.

Cheers,

Daniel

> From 3ec2fceced1f327eb2ac7484772bd1d3756bf8d2 Mon Sep 17 00:00:00 2001
> From: Marlon Richert <marlon.richert@gmail.com>
> Date: Tue, 12 Oct 2021 15:02:31 +0300
> Subject: [PATCH] Add xfail tests for || form of completion matchers
> 
> The tests show how :||= matchers should behave in order to provide
> completion features that cannot be implemented with :|= matchers.
> ---
>  Doc/Zsh/compwid.yo     | 446 ++++++++++++++++++-----------------------
>  Test/Y02compmatch.ztst | 108 +++++++++-
>  2 files changed, 293 insertions(+), 261 deletions(-)
> 
> diff --git a/Doc/Zsh/compwid.yo b/Doc/Zsh/compwid.yo
> index 3e86d3b42..5dd2127df 100644
> --- a/Doc/Zsh/compwid.yo
> +++ b/Doc/Zsh/compwid.yo
> @@ -896,72 +896,210 @@ enditem()
>  texinode(Completion Matching Control)(Completion Widget Example)(Completion Condition Codes)(Completion Widgets)
>  sect(Completion Matching Control)
>  
> -It is possible by use of the
> -tt(-M) option of the tt(compadd) builtin command to specify how the
> -characters in the string to be completed (referred to here as the
> -command line) map onto the characters in the list of matches produced by
> -the completion code (referred to here as the trial completions). Note
> -that this is not used if the command line contains a glob pattern and
> -the tt(GLOB_COMPLETE) option is set or the tt(pattern_match) of the
> -tt(compstate) special association is set to a non-empty string.
> -
> -The var(match-spec) given as the argument to the tt(-M) option (see
> +By default, characters in the string to be completed (referred to here as the
> +command line) map only onto identical characters in the list of matches
> +produced by the completion code (referred to here as the trial completions) and
> +missing characters are inserted only at the cursor position, if the shell
> +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
> +otherwise.  However, it is possible to modify this behavior by use of the
> +tt(-M) option of the tt(compadd) builtin command.  Note that this is not used
> +if the command line contains a glob pattern and the shell
> +option tt(GLOB_COMPLETE) is set or the tt(pattern_match) of the tt(compstate)
> +special association is set to a non-empty string.
> +
> +The tt(-M) option (see
>  ifzman(`Completion Builtin Commands' above)\
> -ifnzman(noderef(Completion Builtin Commands))\
> -) consists of one or more matching descriptions separated by
> -whitespace.  Each description consists of a letter followed by a colon
> -and then the patterns describing which character sequences on the line match
> -which character sequences in the trial completion.  Any sequence of
> -characters not handled in this fashion must match exactly, as usual.
> -
> -The forms of var(match-spec) understood are as follows. In each case, the
> -form with an upper case initial character retains the string already
> -typed on the command line as the final result of completion, while with
> -a lower case initial character the string on the command line is changed
> -into the corresponding part of the trial completion.
> +ifnzman(noderef(Completion Builtin
> +Commands))\
> +) requires a var(match-spec) as it argument, consisting of one or more matching
> +descriptions separated by whitespace.  Each description consists of a letter,
> +followed by a colon, and then patterns describing which substrings on the
> +command line map onto which substrings in the trial completion.  Descriptions
> +are evaluated from left to right and are cumulative.  An earlier mapping can
> +thus potentially change the outcome of a later mapping.  Finally, any unmapped
> +substrings will be mapped using the default mapping of identical substrings.
> +
> +When using the completion system (see
> +ifzman(zmanref(zshcompsys))\
> +ifnzman(noderef(Completion System))\
> +), users can define match specifications that are to be used for specific
> +contexts by using the tt(matcher) and tt(matcher-list) styles.  The values for
> +the latter will be used everywhere.
> +
> +Each pattern in a var(match-spec) is either an empty string or consists of a
> +sequence of literal characters (which may be quoted with a backslash), question
> +marks, character classes, and correspondence classes (see next paragraph).
> +Ordinary shell patterns are not used.  Literal characters match only
> +themselves, question marks match any character, and character classes are
> +formed as for globbing and match any character in the given set.
> +
> +Correspondence classes are defined like character classes, but with two
> +differences: They are delimited by a pair of braces, and negated classes are
> +not allowed, so the characters tt(!) and tt(^) have no special meaning directly
> +after the opening brace.  They indicate that a range of characters on the line
> +match a range of characters in the trial completion, but (unlike ordinary
> +character classes) paired according to the corresponding position in the
> +sequence.  More than one pair of classes can occur, in which case the first
> +class before the tt(=) corresponds to the first after it, and so on.  If one
> +side has more such classes than the other side, the superfluous classes behave
> +like normal character classes.
> +
> +The standard `tt([:)var(name)tt(:])' forms described for standard shell
> +patterns (see 
> +ifnzman(noderef(Filename Generation))\
> +ifzman(the section
> +FILENAME GENERATION in zmanref(zshexpn))\
> +) may appear in correspondence classes as well as normal character classes.
> +The only special behaviour in correspondence classes is if the form on the left
> +and the form on the right are each one of tt([:upper:]), tt([:lower:]).  In
> +these cases the character in the word and the character on the line must be the
> +same up to a difference in case.  Although the matching system does not yet
> +handle multibyte characters, this is likely to be a future extension, at which
> +point this syntax will handle arbitrary alphabets; hence this form, rather than
> +the use of explicit ranges, is the recommended form.  In other cases
> +`tt([:)var(name)tt(:])' forms are allowed.  If the two forms on the left and
> +right are the same, the characters must match exactly.  In remaining cases, the
> +corresponding tests are applied to both characters, but they are not otherwise
> +constrained; any matching character in one set goes with any matching character
> +in the other set: this is equivalent to the behaviour of ordinary character
> +classes.
> +
> +The forms of var(match-spec) understood are listed below.  For each of these,
> +the form with an upper case initial character replaces mapped substrings in the
> +trial completions with their counterparts from the command line, whereas with a
> +lower case initial character, once a trial completion has been accepted,
> +matched substrings on the command line are replaced with their counterparts
> +from the accepted completion.
>  
>  startitem()
>  xitem(tt(m:)var(lpat)tt(=)var(tpat))
>  item(tt(M:)var(lpat)tt(=)var(tpat))(
> -Here, var(lpat) is a pattern that matches on the command line,
> -corresponding to var(tpat) which matches in the trial completion.
> +Let any substring matching var(lpat) be completed to any substring matching
> +var(tpat).
> +
> +Examples:
> +
> +tt(m:{[:lower:]}={[:upper:]}) lets any lower case character be completed to its
> +uppercase counterpart.
> +
> +tt(M:_=) inserts every underscore on the command line into each trial
> +completion, in the same relative position, determined by matching the
> +substrings around it.  Note that the definition of what is matching can be
> +modified by applying other matchers first.
> +
> +If these two matchers are combined to tt('m:{[:lower:]}={[:upper:]} M:_='),
> +then given a trial completion `tt(NO)', it lets `tt(_n_o_)' be completed to
> +`tt(_N_O_)', even though `tt(_N_O_)' itself is not present as a trial
> +completion.  tt(m:{[:lower:]}={[:upper:]}) is evaluated first and makes `tt(n)'
> +match `tt(N)' and `tt(o)' match `tt(O)', after which tt(M:_=) is then able to
> +insert underscores into the correct positions.
> +)
> +xitem(tt(l:)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(L:)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(r:)var(lpat)tt(|)tt(=)var(tpat))
> +item(tt(R:)var(lpat)tt(|)tt(=)var(tpat))(
> +Let any substring matching var(lpat) at the left (for tt(l:) and tt(L:)) or
> +right (for tt(r:) and tt(R:)) edge of the command line be completed to any
> +substring matching var(tpat) in the same position in the trial completion.
> +
> +With these matchers, the pattern var(tpat) may also be a star, `tt(*)'.  This
> +lets a matching command line substring be completed to any trial completion
> +substring in the same relative position.
> +
> +Examples:
> +
> +tt(L:|[nN][oO]=) makes it so that, if there is a single `tt(no)', `tt(nO)',
> +`tt(No)' or `tt(NO)' at the left end of the command line, then it is added to
> +the left of each trial completion.
> +
> +tt(r:|=*) lets (the empty substring at) the right edge of the command line
> +string be completed to any number of characters at the edge of each trial
> +completion.
> +
> +If these two matchers are combined to tt('L:|[nN][oO]= r:|=*'), then given a
> +trial completion `tt(foo)', it lets `tt(NOf)' be completed to `tt(NOfoo)'.
> +First, tt(L:|[nN][oO]=) prefixes the trial completion with tt(NO), after which
> +tt(r:|=*) is able to match the command line to the trial completion and
> +complete the missing characters at the end.
>  )
> -xitem(tt(l:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
> -xitem(tt(L:)var(lanchor)tt(|)var(lpat)tt(=)var(tpat))
> -xitem(tt(l:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
> -xitem(tt(L:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
>  xitem(tt(b:)var(lpat)tt(=)var(tpat))
> -item(tt(B:)var(lpat)tt(=)var(tpat))(
> -These letters are for patterns that are anchored by another pattern on
> -the left side. Matching for var(lpat) and var(tpat) is as for tt(m) and
> -tt(M), but the pattern var(lpat) matched on the command line must be
> -preceded by the pattern var(lanchor).  The var(lanchor) can be blank to
> -anchor the match to the start of the command line string; otherwise the
> -anchor can occur anywhere, but must match in both the command line and
> -trial completion strings.
> -
> -If no var(lpat) is given but a var(ranchor) is, this matches the gap
> -between substrings matched by var(lanchor) and var(ranchor). Unlike
> -var(lanchor), the var(ranchor) only needs to match the trial
> -completion string.
> -
> -The tt(b) and tt(B) forms are similar to tt(l) and tt(L) with an empty 
> -anchor, but need to match only the beginning of the word on the command line
> -or trial completion, respectively.
> -)
> -xitem(tt(r:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
> -xitem(tt(R:)var(lpat)tt(|)var(ranchor)tt(=)var(tpat))
> -xitem(tt(r:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
> -xitem(tt(R:)var(lanchor)tt(||)var(ranchor)tt(=)var(tpat))
> +xitem(tt(B:)var(lpat)tt(=)var(tpat))
>  xitem(tt(e:)var(lpat)tt(=)var(tpat))
>  item(tt(E:)var(lpat)tt(=)var(tpat))(
> -As tt(l), tt(L), tt(b) and tt(B), with the difference that the command
> -line and trial completion patterns are anchored on the right side.
> -Here an empty var(ranchor) and the tt(e) and tt(E) forms force the
> -match to the end of the command line or trial completion string.
> -
> -In the form where var(lanchor) is given, the var(lanchor) only needs
> -to match the trial completion string.
> +Let all substrings matching var(lpat) at the beginning (for tt(b:) and tt(B:))
> +or end (for tt(e:) and tt(E:)) of the command line be completed to the same
> +number of substrings matching var(tpat) in each trial completion in the same
> +relative position.
> +
> +Example:
> +
> +tt(B:[nN][oO]=) adds all occurrences of `tt(no)', `tt(nO)', `tt(No)' and
> +`tt(NO)' at the beginning of the command line to the beginning of each trial
> +completion.  If tt(r:|=*) is added to this, then given a trial completion
> +`tt(foo)', it lets `tt(noNOf)' be completed to `tt(noNOfoo)'.
> +)
> +xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> +xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
> +item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
> +Let any command line substring, which is left/right-adjacent (respectively) to
> +a substring matching var(anchor) and which matches var(lpat), be completed to
> +any trial completion substring, which
> +startitemize()
> +itemiz(\
> +is adjacent to the same substring and which
> +)
> +itemiz(\
> +matches var(tpat), but which
> +)
> +itemiz(\
> +does not contain any substrings matching var(anchor).
> +)
> +enditemize()
> +
> +When a matcher includes at least one anchor (which also applies to the forms
> +with two anchors, below), the pattern var(tpat) may also be one or two stars,
> +`tt(*)' or `tt(**)'.  The first star can match any number of characters, within
> +the constraints outlined above, whereas a second star removes the last
> +constraint and can match substrings matching var(anchor).
> +
> +Example:
> +
> +tt(r:|.=*) lets each dot be completed to any substring that ends at the right
> +in a dot, but does not otherwise contain any dots, in the trial string.  Thus,
> +given a trial string `tt(comp.sources.unix)', `tt(..unix)' can be completed to
> +it, but `tt(.unix)' cannot, since the matcher will refuse to map any dots other
> +than the one matched by the var(anchor).
> +)
> +xitem(tt(l:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
> +xitem(tt(L:)var(anchor)tt(||)var(coanchor)tt(=)var(tpat))
> +xitem(tt(r:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))
> +item(tt(R:)var(coanchor)tt(||)var(anchor)tt(=)var(tpat))(
> +Lets the empty string between each two adjacent command line substrings
> +matching var(anchor) and var(coanchor), in the order given, be completed to any
> +trial completion substring, which
> +startitemize()
> +itemiz(\
> +is adjacent to the same two substrings and which
> +)
> +itemiz(\
> +matches var(tpat), but which
> +)
> +itemiz(\
> +does not contain any substrings matching var(anchor).
> +)
> +enditemize()
> +
> +Note there is no restriction on substrings matching var(coanchor).
> +
> +Example:
> +
> +tt(r:?||[[:upper:]]=*) will complete `tt(fHoo)' to `tt(fooHoo)', but not
> +`tt(Hoo)' to `tt(fooHoo)', because there is no character to the left of `tt(H)'
> +on the command line.  Likewise, it will not complete `tt(lHIS)' to
> +`tt(likeTHIS)', because, other than the one substring it maps to var(anchor),
> +it cannot map any substring containing uppercase letters in the trial
> +completion.
>  )
>  item(tt(x:))(
>  This form is used to mark the end of matching specifications:
> @@ -972,200 +1110,6 @@ function to override another.
>  )
>  enditem()
>  
> -Each var(lpat), var(tpat) or var(anchor) is either an empty string or
> -consists of a sequence of literal characters (which may be quoted with a
> -backslash), question marks, character classes, and correspondence
> -classes; ordinary shell patterns are not used.  Literal characters match
> -only themselves, question marks match any character, and character
> -classes are formed as for globbing and match any character in the given
> -set.
> -
> -Correspondence classes are defined like character classes, but with two
> -differences: they are delimited by a pair of braces, and negated classes
> -are not allowed, so the characters tt(!) and tt(^) have no special
> -meaning directly after the opening brace.  They indicate that a range of
> -characters on the line match a range of characters in the trial
> -completion, but (unlike ordinary character classes) paired according to
> -the corresponding position in the sequence.  For example, to make any
> -ASCII lower case letter on the line match the corresponding upper case
> -letter in the trial completion, you can use `tt(m:{a-z}={A-Z})'
> -(however, see below for the recommended form for this).  More
> -than one pair of classes can occur, in which case the first class before
> -the tt(=) corresponds to the first after it, and so on.  If one side has
> -more such classes than the other side, the superfluous classes behave
> -like normal character classes.  In anchor patterns correspondence classes
> -also behave like normal character classes.
> -
> -The standard `tt([:)var(name)tt(:])' forms described for standard shell
> -patterns (see
> -ifnzman(noderef(Filename Generation))\
> -ifzman(the section FILENAME GENERATION in zmanref(zshexpn)))
> -may appear in correspondence classes as well as normal character
> -classes.  The only special behaviour in correspondence classes is if
> -the form on the left and the form on the right are each one of
> -tt([:upper:]), tt([:lower:]).  In these cases the
> -character in the word and the character on the line must be the same up
> -to a difference in case.  Hence to make any lower case character on the
> -line match the corresponding upper case character in the trial
> -completion you can use `tt(m:{[:lower:]}={[:upper:]})'.  Although the
> -matching system does not yet handle multibyte characters, this is likely
> -to be a future extension, at which point this syntax will handle
> -arbitrary alphabets; hence this form, rather than the use of explicit
> -ranges, is the recommended form.  In other cases
> -`tt([:)var(name)tt(:])' forms are allowed.  If the two forms on the left
> -and right are the same, the characters must match exactly.  In remaining
> -cases, the corresponding tests are applied to both characters, but they
> -are not otherwise constrained; any matching character in one set goes
> -with any matching character in the other set:  this is equivalent to the
> -behaviour of ordinary character classes.
> -
> -The pattern var(tpat) may also be one or two stars, `tt(*)' or
> -`tt(**)'. This means that the pattern on the command line can match
> -any number of characters in the trial completion. In this case the
> -pattern must be anchored (on either side); in the case of a single
> -star, the var(anchor) then determines how much of the trial completion
> -is to be included DASH()- only the characters up to the next appearance of
> -the anchor will be matched. With two stars, substrings matched by
> -the anchor can be matched, too. In the forms that include two
> -anchors, `tt(*)' can match characters from the additional anchor
> -DASH()- var(lanchor) with tt(r) or var(ranchor) with tt(l).
> -
> -Examples:
> -
> -The keys of the tt(options) association defined by the tt(parameter)
> -module are the option names in all-lower-case form, without
> -underscores, and without the optional tt(no) at the beginning even
> -though the builtins tt(setopt) and tt(unsetopt) understand option names
> -with upper case letters, underscores, and the optional tt(no).  The
> -following alters the matching rules so that the prefix tt(no) and any
> -underscore are ignored when trying to match the trial completions
> -generated and upper case letters on the line match the corresponding
> -lower case letters in the words:
> -
> -example(compadd -M 'L:|[nN][oO]= M:_= M:{[:upper:]}={[:lower:]}' - \ 
> -  ${(k)options} )
> -
> -The first part says that the pattern `tt([nN][oO])' at the beginning
> -(the empty anchor before the pipe symbol) of the string on the
> -line matches the empty string in the list of words generated by
> -completion, so it will be ignored if present. The second part does the
> -same for an underscore anywhere in the command line string, and the
> -third part uses correspondence classes so that any
> -upper case letter on the line matches the corresponding lower case
> -letter in the word. The use of the upper case forms of the
> -specification characters (tt(L) and tt(M)) guarantees that what has
> -already been typed on the command line (in particular the prefix
> -tt(no)) will not be deleted.
> -
> -Note that the use of tt(L) in the first part means that it matches
> -only when at the beginning of both the command line string and the
> -trial completion. I.e., the string `tt(_NO_f)' would not be
> -completed to `tt(_NO_foo)', nor would `tt(NONO_f)' be completed to
> -`tt(NONO_foo)' because of the leading underscore or the second
> -`tt(NO)' on the line which makes the pattern fail even though they are 
> -otherwise ignored. To fix this, one would use `tt(B:[nN][oO]=)'
> -instead of the first part. As described above, this matches at the
> -beginning of the trial completion, independent of other characters or
> -substrings at the beginning of the command line word which are ignored
> -by the same or other var(match-spec)s.
> -
> -The second example makes completion case insensitive.  This is just
> -the same as in the option example, except here we wish to retain the
> -characters in the list of completions:
> -
> -example(compadd -M 'm:{[:lower:]}={[:upper:]}' ... )
> -
> -This makes lower case letters match their upper case counterparts.
> -To make upper case letters match the lower case forms as well:
> -
> -example(compadd -M 'm:{[:lower:][:upper:]}={[:upper:][:lower:]}' ... )
> -
> -A nice example for the use of tt(*) patterns is partial word
> -completion. Sometimes you would like to make strings like `tt(c.s.u)'
> -complete to strings like `tt(comp.source.unix)', i.e. the word on the
> -command line consists of multiple parts, separated by a dot in this
> -example, where each part should be completed separately DASH()- note,
> -however, that the case where each part of the word, i.e. `tt(comp)',
> -`tt(source)' and `tt(unix)' in this example, is to be completed from
> -separate sets of matches
> -is a different problem to be solved by the implementation of the
> -completion widget.  The example can be handled by:
> -
> -example(compadd -M 'r:|.=* r:|=*' \ 
> -  - comp.sources.unix comp.sources.misc ...)
> -
> -The first specification says that var(lpat) is the empty string, while
> -var(anchor) is a dot; var(tpat) is tt(*), so this can match anything
> -except for the `tt(.)' from the anchor in
> -the trial completion word.  So in `tt(c.s.u)', the matcher sees `tt(c)',
> -followed by the empty string, followed by the anchor `tt(.)', and
> -likewise for the second dot, and replaces the empty strings before the
> -anchors, giving `tt(c)[tt(omp)]tt(.s)[tt(ources)]tt(.u)[tt(nix)]', where
> -the last part of the completion is just as normal.
> -
> -With the pattern shown above, the string `tt(c.u)' could not be
> -completed to `tt(comp.sources.unix)' because the single star means
> -that no dot (matched by the anchor) can be skipped. By using two stars 
> -as in `tt(r:|.=**)', however, `tt(c.u)' could be completed to
> -`tt(comp.sources.unix)'. This also shows that in some cases,
> -especially if the anchor is a real pattern, like a character class,
> -the form with two stars may result in more matches than one would like.
> -
> -The second specification is needed to make this work when the cursor is
> -in the middle of the string on the command line and the option
> -tt(COMPLETE_IN_WORD) is set. In this case the completion code would
> -normally try to match trial completions that end with the string as
> -typed so far, i.e. it will only insert new characters at the cursor
> -position rather than at the end.  However in our example we would like
> -the code to recognise matches which contain extra characters after the
> -string on the line (the `tt(nix)' in the example).  Hence we say that the
> -empty string at the end of the string on the line matches any characters
> -at the end of the trial completion.
> -
> -More generally, the specification
> -
> -example(compadd -M 'r:|[.,_-]=* r:|=*' ... )
> -
> -allows one to complete words with abbreviations before any of the
> -characters in the square brackets.  For example, to
> -complete tt(veryverylongfile.c) rather than tt(veryverylongheader.h)
> -with the above in effect, you can just type tt(very.c) before attempting
> -completion.
> -
> -The specifications with both a left and a right anchor are useful to
> -complete partial words whose parts are not separated by some
> -special character. For example, in some places strings have to be
> -completed that are formed `tt(LikeThis)' (i.e. the separate parts are
> -determined by a leading upper case letter) or maybe one has to
> -complete strings with trailing numbers. Here one could use the simple
> -form with only one anchor as in:
> -
> -example(compadd -M 'r:|[[:upper:]0-9]=* r:|=*' LikeTHIS FooHoo 5foo123 5bar234)
> -
> -But with this, the string `tt(H)' would neither complete to `tt(FooHoo)'
> -nor to `tt(LikeTHIS)' because in each case there is an upper case
> -letter before the `tt(H)' and that is matched by the anchor. Likewise, 
> -a `tt(2)' would not be completed. In both cases this could be changed
> -by using `tt(r:|[[:upper:]0-9]=**)', but then `tt(H)' completes to both
> -`tt(LikeTHIS)' and `tt(FooHoo)' and a `tt(2)' matches the other
> -strings because characters can be inserted before every upper case
> -letter and digit. To avoid this one would use:
> -
> -example(compadd -M 'r:[^[:upper:]0-9]||[[:upper:]0-9]=** r:|=*' \ 
> -    LikeTHIS FooHoo foo123 bar234)
> -
> -By using these two anchors, a `tt(H)' matches only upper case `tt(H)'s that 
> -are immediately preceded by something matching the left anchor
> -`tt([^[:upper:]0-9])'. The effect is, of course, that `tt(H)' matches only
> -the string `tt(FooHoo)', a `tt(2)' matches only `tt(bar234)' and so on.
> -
> -When using the completion system (see
> -ifzman(zmanref(zshcompsys))\
> -ifnzman(noderef(Completion System))\
> -), users can define match specifications that are to be used for
> -specific contexts by using the tt(matcher) and tt(matcher-list)
> -styles. The values for the latter will be used everywhere.
> -
>  texinode(Completion Widget Example)()(Completion Matching Control)(Completion Widgets)
>  sect(Completion Widget Example)
>  cindex(completion widgets, example)
> diff --git a/Test/Y02compmatch.ztst b/Test/Y02compmatch.ztst
> index 621707482..ee7e422c1 100644
> --- a/Test/Y02compmatch.ztst
> +++ b/Test/Y02compmatch.ztst
> @@ -378,15 +378,26 @@



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-12 15:25   ` Daniel Shahaf
@ 2021-10-13  4:57     ` Bart Schaefer
  2021-10-13  5:08     ` Bart Schaefer
  1 sibling, 0 replies; 11+ messages in thread
From: Bart Schaefer @ 2021-10-13  4:57 UTC (permalink / raw)
  To: Marlon Richert; +Cc: Zsh hackers list

On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> >
> > I've now added an accompanying documentation update to the patch.
>
> Could you confirm that the text which the docs patch deletes or changes
> was all confirmed correct (even if perhaps unclear)?

For example, this part is misleading:

> > +By default, characters in the string to be completed (referred to here as the
> > +command line) map only onto identical characters in the list of matches
[...]
> > +missing characters are inserted only at the cursor position, if the shell
> > +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,

It's at the end of the current word, not the end of the command line.
The old wording nearly always says "string on the command line" which
is only somewhat better; if it's going to be completely rewritten to
drop "string on the", the phrase "command line" should become more
precise.  "Incomplete word" perhaps?

> > +) requires a var(match-spec) as it argument, consisting of one or more matching

"its"

> > +corresponding tests are applied to both characters, but they are not otherwise
> > +constrained; any matching character in one set goes with any matching character
> > +in the other set: this is equivalent to the behaviour of ordinary character
> > +classes.

What's an "ordinary" character class?  That is, what ordinary context
is implied?

> > +xitem(tt(l:)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(L:)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(r:)var(lpat)tt(|)tt(=)var(tpat))
> > +item(tt(R:)var(lpat)tt(|)tt(=)var(tpat))(
> > +Let any substring matching var(lpat) at the left (for tt(l:) and tt(L:)) or
> > +right (for tt(r:) and tt(R:)) edge of the command line be completed to any

Again, not the command line, just the current word under (or to the
left of) the cursor, but I'll stop mentioning this because it's a
problem with definition of terms.

> > +xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > +xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
> > +item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
> > +Let any command line substring, which is left/right-adjacent (respectively) to
> > +a substring matching var(anchor) and which matches var(lpat), be completed to
> > +any trial completion substring, which
> > +startitemize()
> > +itemiz(\
> > +is adjacent to the same substring and which

Unclear whether "the same substring" refers to "any command line
substring" or to "a substring matching anchor".  I believe you mean
the former (or perhaps the larger substring composed of both of the
former)?  Best to specify.

I believe the rest of the explanations are correct, but it would be
good if Oliver confirms.

Did you remove the assorted other examples because there is a problem with them?



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-12 15:25   ` Daniel Shahaf
  2021-10-13  4:57     ` Bart Schaefer
@ 2021-10-13  5:08     ` Bart Schaefer
  2021-10-13 14:20       ` Marlon Richert
  1 sibling, 1 reply; 11+ messages in thread
From: Bart Schaefer @ 2021-10-13  5:08 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Marlon Richert, Zsh hackers list

On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> > >
> > > The tests show how :||= matchers should behave in order to provide
> > > completion features that cannot be implemented with :|= matchers.
>
> Would this be backwards compatible?

In particular, with the exception of specific bug regression tests,
all the tests using || matchers have been converted to xfails.
Shouldn't there still be some generic tests of (the functionally
correct subset of) the current behavior of || ?  And, do you think any
of the regression tests would begin to fail if the xfail tests begin
to succeed?



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-13  5:08     ` Bart Schaefer
@ 2021-10-13 14:20       ` Marlon Richert
  2021-10-13 19:37         ` Daniel Shahaf
  0 siblings, 1 reply; 11+ messages in thread
From: Marlon Richert @ 2021-10-13 14:20 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Daniel Shahaf, Zsh hackers list

On Wed, Oct 13, 2021 at 8:08 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> > > >
> > > > The tests show how :||= matchers should behave in order to provide
> > > > completion features that cannot be implemented with :|= matchers.
> >
> > Would this be backwards compatible?

No, it would not, but that's unavoidable, since at present, the :||=
matchers don't work correctly. Please see my and Oliver's comments in
the thread of users/27228.

On the plus side, there are only two lines in the Zsh codebase where
:||= matchers are used, in _ssh and _x_color. It won't require much
work to convert those.
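
A rough way to locate such uses is a recursive grep for specs that contain
a doubled bar. This is only a sketch: the function name is made up for
illustration, and the pattern assumes the spec is written out literally as
l:, L:, r: or R: followed by text containing || before the =. It will miss
specs assembled from variables and may flag unrelated lines.

```shell
# Hypothetical helper: list lines that appear to contain a two-anchor (||)
# match specification such as  r:[^.]||.=*  or  l:x||y=z .
# Assumption: the spec is written literally, with no spaces or quotes in it.
find_double_anchor_matchers() {
  # [lLrR]:      spec type
  # [^|=' ]*     text before the doubled bar (no bar, =, quote or space)
  # \|\|         the doubled bar itself
  # [^=' ]*=     text after it, up to the = that starts the tpat
  grep -rnE "[lLrR]:[^|=' ]*\|\|[^=' ]*=" -- "$1"
}
```

It would be called with the directory to search, for example
`find_double_anchor_matchers Completion/` from the top of a source checkout
(the directory name is an assumption here).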

> In particular, with the exception of specific bug regression tests,
> all the tests using || matchers have been converted to xfails.
> Shouldn't there still be some generic tests of the (functionally
> correct subset of) the current behavior of || ?

There was exactly one non-regression test using :||= matchers,
«Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"», and
unfortunately, that one will no longer pass as written. However, I
will see if I can find some cases for which the current implementation
works correctly and add tests for them.

> And, do you think any
> of the regression tests would begin to fail if the xfail tests begin
> to succeed?

There are four regression tests that incorporate one or more :||=
matchers. I investigated them and this is what I found:
* Bug from workers 11081 is about the cursor jumping back and forth. I
was able to remove all three :||= matchers from the test without
breaking it.
* Bug from workers 11586 is about characters getting deleted while
inserting the "unambiguous" substring. Here, I was not able to remove
or replace the :||= matcher and still get the same output. This one
might break, but most of the output it expects is actually not
relevant to the bug it is testing.
* Test from workers 13320 is about cursor positioning. I was able to
remove the :||= matcher from the test without breaking it.
* Second test from workers 13345 is about a character getting deleted.
I was able to replace the :||= matcher with a :|= matcher without
breaking the test.

On Tue, Oct 12, 2021 at 6:25 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Thanks.  I have never found that section easy to follow.

You're not the only one. ;)

> Could you confirm that the text which the docs patch deletes or changes
> was all confirmed correct (even if perhaps unclear)?  I'm concerned
> about us possibly changing dense, accurate docs into clear, less-accurate
> docs.

The original docs were vague and ambiguous on many points, and even
self-contradictory on some.

> Case in point: The incumbent docs say that the coanchor is matched only
> against the trial completion, but the new docs say something else.  If
> that's an intentional change, it needs to be called out explicitly in
> the log message.

Yes, that's intentional. I'll add it to the commit message.

> In the man page rendering on my system, itemiz()'s bullet is vertically
> aligned with the parent item()'s text.

I can confirm that this indeed looks wrong. Do you know what I should
use to get them indented properly? Or if that's not possible, I can
add an ifzman check to format the text without bullets in the man page.
However, I would at least like to keep them in the html page, because
it helps make the text clearer.

On Wed, Oct 13, 2021 at 7:57 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Could you confirm that the text which the docs patch deletes or changes
> > was all confirmed correct (even if perhaps unclear)?
>
> For example, this part is misleading:
>
> > > +By default, characters in the string to be completed (referred to here as the
> > > +command line) map only onto identical characters in the list of matches
> [...]
> > > +missing characters are inserted only at the cursor position, if the shell
> > > +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
>
> It's at the end of the current word, not the end of the command line.
> The old wording nearly always says "string on the command line" which
> is only somewhat better; if it's going to be completely rewritten to
> drop "string on the", the phrase "command line" should become more
> precise.  "Incomplete word" perhaps?

The use of "command line" in this fashion is from the original text;
it is used there about half of the time without the addition of
"string". However, I agree that it's ambiguous. I'm fine replacing it
with "incomplete word" (unless we come up with a better term).

> > > +corresponding tests are applied to both characters, but they are not otherwise
> > > +constrained; any matching character in one set goes with any matching character
> > > +in the other set: this is equivalent to the behaviour of ordinary character
> > > +classes.
>
> What's an "ordinary" character class?  That is, what ordinary context
> is implied?

I didn't write that paragraph; it was already present in the original
doc. I just moved it around.

> > > +xitem(tt(l:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > > +xitem(tt(L:)var(anchor)tt(|)var(lpat)tt(=)var(tpat))
> > > +xitem(tt(r:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))
> > > +item(tt(R:)var(lpat)tt(|)var(anchor)tt(=)var(tpat))(
> > > +Let any command line substring, which is left/right-adjacent (respectively) to
> > > +a substring matching var(anchor) and which matches var(lpat), be completed to
> > > +any trial completion substring, which
> > > +startitemize()
> > > +itemiz(\
> > > +is adjacent to the same substring and which
>
> Unclear whether "the same substring" refers to "any command line
> substring" or to "a substring matching anchor".  I believe you mean
> the former (or perhaps the larger substring composed of both of the
> former)?  Best to specify.

Will do.

> I believe the rest of the explanations are correct, but it would be
> good if Oliver confirms.
>
> Did you remove the assorted other examples because there is a problem with them?

I split them into parts and moved each part directly beneath the
matcher(s) it uses. This makes the matchers easier to understand and
allows the examples to be explained with less text.



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-13 14:20       ` Marlon Richert
@ 2021-10-13 19:37         ` Daniel Shahaf
  2021-10-13 20:02           ` Bart Schaefer
  2021-10-14 20:25           ` Oliver Kiddle
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Shahaf @ 2021-10-13 19:37 UTC (permalink / raw)
  To: Marlon Richert; +Cc: Zsh hackers list

Marlon Richert wrote on Wed, Oct 13, 2021 at 17:20:09 +0300:
> On Wed, Oct 13, 2021 at 8:08 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > On Tue, Oct 12, 2021 at 8:26 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > >
> > > Marlon Richert wrote on Tue, Oct 12, 2021 at 15:08:46 +0300:
> > > > On Mon, Oct 11, 2021 at 5:34 PM Marlon Richert <marlon.richert@gmail.com> wrote:
> > > > >
> > > > > The tests show how :||= matchers should behave in order to provide
> > > > > completion features that cannot be implemented with :|= matchers.
> > >
> > > Would this be backwards compatible?
> 
> No, it would not, but that's unavoidable, since at present, the :||=
> matchers don't work correctly. Please see my and Oliver's comments in
> the thread of users/27228.
> 

Care to give a more specific pointer?  As in, specific cases where the
incumbent documentation doesn't match the implementation?  users/27228
itself reads rather along the lines of "let's re-design the feature
retroactively".

If you want to clarify the documentation of the feature as designed,
kudos.  If you want to increase test coverage, more kudos.  If you want
to throw out the existing documentation and implementation and
reimplement things differently… that's not to be done lightly,
notwithstanding that you went the right way about proposing it (first
clarifying the status quo, then posting docs and XFail tests).

> On the plus side, there are only two lines in the Zsh codebase where
> :||= matchers are used, in _ssh and _x_color. It won't require much
> work to convert those.

That's not how it works.  Documented semantics are API promises that
should be presumed to be used in the wild.  Any change that may break
anybody's proverbial spacebar heating is an incompatible change and
should be treated accordingly (avoided if possible, and failing that,
clearly documented for upgraders, designed with a reasonable failure
mode for old code on new zsh, etc.).

When you give your house key to a trusted friend, you can always change
the lock and give the friend a new key.  However, user code isn't like
a house key.  User code is more closely analogous to public roads in
that old story about how the width of a car was basically determined by
the Romans (because cars had to be compatible with existing roads): it
exists, it can't be changed, and anything new must be compatible with
it; it's a design constraint.

> > In particular, with the exception of specific bug regression tests,
> > all the tests using || matchers have been converted to xfails.
> > Shouldn't there still be some generic tests of the (functionally
> > correct subset of) the current behavior of || ?
> 
> There was exactly one non-regression test using :||= matchers,
> «Documentation example using "r:[^A-Z0-9]||[A-Z0-9]=** r:|=*"», and
> unfortunately, that one will no longer pass as written.

A patch that breaks a documentation example is the archetype of an
incompatible change.  Is there no alternative to that?  Can't you add
a new syntax?  It can be as simple as «-M 'v2: …'» (that's pretty common
in standards that retrofit themselves into DNS TXT records, for instance).

> However, I will see if I can find some cases for which the current
> implementation works correctly and add tests for them.

Thanks.

> > And, do you think any of the regression tests would begin to fail if
> > the xfail tests begin to succeed?
> 
> There are four regression tests that incorporate one or more :||=
> matchers. I investigated them and this is what I found:
> * Bug from workers 11081 is about the cursor jumping back and forth. I
> was able to remove all three :||= matchers from the test without
> breaking it.

The test isn't a unit test of :||=, where one would expect the output to
change when :||= is removed.  The test is a regression test that's
supposed to catch instances of a bug that could be reproduced only by
one person and only intermittently.  For such tests, changing their code
in any way might make them no longer test for the bug they claim to.

> * Bug from workers 11586 is about characters getting deleted while
> inserting the "unambiguous" substring. Here, I was not able to remove
> or replace the :||= matcher and still get the same output. This one
> might break,

Ack.

> but most of the output it expects is actually not relevant to the bug
> it is testing.

FWIW, the reply to 11586, 11634, mentions CamelCase briefly.

> * Test from workers 13320 is about cursor positioning. I was able to
> remove the :||= matcher from the test without breaking it.

So what?  The question isn't whether users who have used :||= could have
written their code without it.  The question is whether users who have
used :||= would see a behaviour change if :||='s semantics were changed
in the manner proposed.

> * Second test from workers 13345 is about a character getting deleted.
> I was able to replace the :||= matcher with a :|= matcher without
> breaking the test.
> 

Ditto.

> On Tue, Oct 12, 2021 at 6:25 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > Could you confirm that the text which the docs patch deletes or changes
> > was all confirmed correct (even if perhaps unclear)?  I'm concerned
> > about us possibly changing dense, accurate docs into clear, less-accurate
> > docs.
> 
> The original docs were vague and ambiguous on many points, and even
> self-contradictory on some.

Concrete examples, please?  users/27228 is a long thread, and the patch
itself is 500 80-character lines long.

> > Case in point: The incumbent docs say that the coanchor is matched only
> > against the trial completion, but the new docs say something else.  If
> > that's an intentional change, it needs to be called out explicitly in
> > the log message.
> 
> Yes, that's intentional. I'll add it to the commit message.

Thanks, but again, that was just a case in point.  You need to identify
all such cases, or better yet, split the patch into a series of small,
reviewable changes.  That's always a good idea, and more so for changes
that are _a priori_ controversial (in this case, due to being backwards
incompatible).

> > In the man page rendering on my system, itemiz()'s bullet is vertically
> > aligned with the parent item()'s text.
> 
> I can confirm that this indeed looks wrong. Do you know what I should
> use to get them indented properly?

No, sorry.  You might want to look in zman.yo and ztexi.yo to see if we
define or redefine startitem(), item(), startitemize(), or itemiz().  If
not, then it might be some issue that can be reproduced with yodl/nroff/man
alone, i.e., not an issue specific to zsh's yodl code.
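
A quick way to do that check might look like the following.  It is only
a sketch: the function name is made up, and it assumes that macro
definitions are written as def(NAME), redef(NAME) or DEFINEMACRO(NAME);
if the files use some other form, the pattern needs adjusting.

```shell
# Hypothetical helper: print any line in the given files that appears to
# define or redefine one of the list macros discussed above.
# Assumption: definitions look like def(NAME)..., redef(NAME)... or
# DEFINEMACRO(NAME)... .
find_list_macro_defs() {
  # /dev/null is appended so grep always prints the file name,
  # even when only one file is given.
  grep -nE '(redef|def|DEFINEMACRO)\((startitem|item|startitemize|itemiz)\)' \
    -- "$@" /dev/null
}
```

It would be run as find_list_macro_defs followed by the paths of zman.yo
and ztexi.yo in the source tree.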

> Or if that's not possible, I can add an ifzman check to format the text
> without bullets in the man page.

Or with ASCII bullets.

> However, I would at least like to keep them in the html page, because
> it helps make the text clearer.

Sure.



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-13 19:37         ` Daniel Shahaf
@ 2021-10-13 20:02           ` Bart Schaefer
  2021-10-14 20:25           ` Oliver Kiddle
  1 sibling, 0 replies; 11+ messages in thread
From: Bart Schaefer @ 2021-10-13 20:02 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Marlon Richert, Zsh hackers list

On Wed, Oct 13, 2021 at 12:38 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> If you want to clarify the documentation of the feature as designed,

Just for the record, I believe that's what Marlon has done in the doc
patch.  I would not apply the xfail patch in its current state (where
it removes existing tests and replaces them with xfails).  Adding that
set of xfails as new tests and documenting that they represent
proposed behavior changes (without removing the existing tests) would
be preferred.



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-13 19:37         ` Daniel Shahaf
  2021-10-13 20:02           ` Bart Schaefer
@ 2021-10-14 20:25           ` Oliver Kiddle
  1 sibling, 0 replies; 11+ messages in thread
From: Oliver Kiddle @ 2021-10-14 20:25 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Marlon Richert, Zsh hackers list

Daniel Shahaf wrote:
> That's not how it works.  Documented semantics are API promises that
> should be presumed to be used in the wild.  Any change that may break
> anybody's proverbial spacebar heating is an incompatible change and
> should be treated accordingly (avoided if possible, and failing that,
> clearly documented for upgraders, designed with a reasonable failure
> mode for old code on new zsh, etc.).

The existing documentation and implementation don't entirely match.
Following the history of the feature including original list posts, I
think it is fairly clear what the intended behaviour is supposed to be.
The behaviour has inconsistencies, quirks and bugs and even determining
what the behaviour is in a form that could be documented is not easy.
Many of your questions do have answers buried in the long thread that
started this.

I'm happy to see better documentation, and test cases will make it easier
for anyone brave enough to attempt to fix up issues. I've not had a chance
to review the patch properly but will do.

Oliver



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-12 12:08 ` Marlon Richert
  2021-10-12 15:25   ` Daniel Shahaf
@ 2021-10-14 20:43   ` Oliver Kiddle
  2021-10-14 21:16     ` Bart Schaefer
  1 sibling, 1 reply; 11+ messages in thread
From: Oliver Kiddle @ 2021-10-14 20:43 UTC (permalink / raw)
  To: Marlon Richert; +Cc: Zsh hackers list

This is just some initial comments. I'll delve into this in more detail
at a later date.

On 12 Oct, Marlon Richert wrote:

> +By default, characters in the string to be completed (referred to here as the
> +command line) map only onto identical characters in the list of matches
> +produced by the completion code (referred to here as the trial completions) and
> +missing characters are inserted only at the cursor position, if the shell
> +option tt(COMPLETE_IN_WORD) is set, or at the end of the command line,
> +otherwise.  However, it is possible to modify this behavior by use of the

I'm fairly sure that if complete_in_word is unset, missing characters
are still allowed at the cursor position. It has the effect of treating
the rest of the word after the cursor as being a separate following
word. I think the code even adds a fake space in for some paths to
achieve this.

Is "trial completions" the best term we can come up with? Where it
occurs in singular form it isn't obvious that it doesn't refer to what
is on the command-line. I tend to use "candidate matches", with the term
"matches" for those that remain following matching. Or does anyone have
other ideas? "completions" is already rather overloaded because it is
also used for completion definitions for commands.

Oliver



* Re: [RFC] Add xfail tests for || form of completion matchers
  2021-10-14 20:43   ` Oliver Kiddle
@ 2021-10-14 21:16     ` Bart Schaefer
  0 siblings, 0 replies; 11+ messages in thread
From: Bart Schaefer @ 2021-10-14 21:16 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: Marlon Richert, Zsh hackers list

On Thu, Oct 14, 2021 at 1:58 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> Is "trial completions" the best term we can come up with?

That's merely what Sven W. used.

> I tend to use "candidate matches", with the term
> "matches" for those that remain following matching.

I'm fine with that.



Thread overview: 11+ messages
2021-10-11 14:34 [RFC] Add xfail tests for || form of completion matchers Marlon Richert
2021-10-12 12:08 ` Marlon Richert
2021-10-12 15:25   ` Daniel Shahaf
2021-10-13  4:57     ` Bart Schaefer
2021-10-13  5:08     ` Bart Schaefer
2021-10-13 14:20       ` Marlon Richert
2021-10-13 19:37         ` Daniel Shahaf
2021-10-13 20:02           ` Bart Schaefer
2021-10-14 20:25           ` Oliver Kiddle
2021-10-14 20:43   ` Oliver Kiddle
2021-10-14 21:16     ` Bart Schaefer