help / color / mirror / code / Atom feed
From: Mikael Magnusson <mikachu@gmail.com>
To: zsh-workers@zsh.org
Subject: Re: PATCH: Fix inverted condition for unique completions
Date: Thu, 17 Mar 2022 18:08:50 +0100	[thread overview]
Message-ID: <CAHYJk3R3hHxEOfJE5Ko+mZakCeV3nuL8ZHw3qytbAj-Vi_W5HA@mail.gmail.com> (raw)
In-Reply-To: <20220315115648.10521-1-mikachu@gmail.com>

On 3/15/22, Mikael Magnusson <mikachu@gmail.com> wrote:
> This change makes most-recent-file[1] completion take a few milliseconds
> instead of close to a minute in a directory of 30000 files. I'm not sure
> exactly what's going on with these flags, but surely if they are not set,
> then we should not deduplicate the set of matches. Most likely this was
> a thinko-copy-paste-o from the first NOSORT case which does have to be
> negated. I'm not sure why regular tab-press was always fast, but it is
> still fast after this change for me.
> I don't know when these flags are supposed to be set, so I haven't tested
> that case.
> [1]
> zstyle ':completion:(*-|)most-recent-file:*' match-original both
> zstyle ':completion:(*-|)most-recent-file:*' file-sort modification
> zstyle ':completion:(*-|)most-recent-file:*' file-patterns '*:all\ files'
> zstyle ':completion:most-recent-file:*' hidden all
> zstyle ':completion:(*-|)most-recent-file:*' completer _files

Okay, I set out to test the case when the flags are set, and waded
through some confusion but I think the conclusion is that the comments
in the header are wrong, and the names of the constants are
misleading, and _path_files should probably use -V -2 since files
can't be duplicated anyway. Is there a useful case when the same file
could be added twice? (eg not just running _files twice by mistake).

Here's the wading through confusion part:
compcore.c has code like this (without my patch)
	    if (!(flags & CGF_UNIQCON)) {
		int dup;

		/* And delete the ones that occur more than once. */
	    if (!(flags & CGF_UNIQALL) && !(flags & CGF_UNIQCON)) {
                int dup;
	    } else if (!(flags & CGF_UNIQCON)) {
		int dup;

Checking the comments for these flags:
#define CGF_UNIQALL  8		/* remove all duplicates */
#define CGF_UNIQCON 16		/* remove consecutive duplicates */

So far it clearly seems like the checks are inverted.

addmatches() does this:
        gflags = (((dat->aflags & CAF_NOSORT ) ? CGF_NOSORT  : 0) |
                  ((dat->aflags & CAF_UNIQALL) ? CGF_UNIQALL : 0) |
                  ((dat->aflags & CAF_UNIQCON) ? CGF_UNIQCON : 0));

Okay, so we've copied over some other flags (note the CAF vs CGF),
what do those say?
#define CAF_UNIQCON  8    /* compadd -2: don't deduplicate */
#define CAF_UNIQALL 16    /* compadd -1: deduplicate */
okay... that's somewhat contradictory. Let's check the manpage:
  -1   If  given  together  with the -V option, makes only consecutive
       duplicates in the group be removed. If  combined  with  the  -J
       option,  this  has no visible effect. Note that groups with and
       without this flag are in different name spaces.

  -2   If given together with the -J or -V option,  makes  all  dupli‐
       cates  be kept. Again, groups with and without this flag are in
       different name spaces.

Okay, so UNIQALL means to remove only consecutive duplicates, and
UNIQCON means keep all duplicates. Right. That doesn't make any sense,
but it does match what the original code does (I think).

With that in mind, I have the following patch instead, any objections
to this? (Again, it makes completion be instant instead of taking
minutes in large directories). If there are objections, we can
probably improve the deduplication algorithm to from n^2 to nlog n
(sort array of pointers and use that to manipulate the original array
keeping relative order of kept elements. In fact, the input data to
the function is a list of pointers which we copy to an array of
elements which is then manipulated).

PS the manpage says -V is required for -1/-2 but -J with -o nosort
works as well which is what happens below.

-- 8< --

Subject: PATCH: _path_files: keep duplicates when not sorting

The deduplication algorithm is too slow to run on very large sets of
matches, and files can't be duplicate anyway.
 Completion/Unix/Type/_path_files | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Completion/Unix/Type/_path_files b/Completion/Unix/Type/_path_files
index c09ca6fa9e..2ee3030785 100644
--- a/Completion/Unix/Type/_path_files
+++ b/Completion/Unix/Type/_path_files
@@ -168,7 +168,7 @@ if zstyle -s ":completion:${curcontext}:"
file-sort tmp1; then
   if [[ "$sort" = on ]]; then
-    mopts=( -o nosort "${mopts[@]}" )
+    mopts=( -o nosort -2 "${mopts[@]}" )

     for tmp1 in "$pats[@]"; do

Mikael Magnusson

  reply	other threads:[~2022-03-17 17:09 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15 11:56 Mikael Magnusson
2022-03-17 17:08 ` Mikael Magnusson [this message]
2022-03-17 19:17   ` Bart Schaefer
2022-03-17 19:32     ` Mikael Magnusson
2022-03-26  1:40       ` PATCH 1/2: Fix comments for UNIQCON/ALL Mikael Magnusson
2022-03-26  1:43         ` PATCH 2/2: [WIP] Efficient dedup for unsorted completions Mikael Magnusson
2022-03-29 16:07           ` PATCHv2 2/2: " Mikael Magnusson
2022-03-30 16:41           ` PATCH 2/2: [WIP] " Bart Schaefer
2022-03-30 18:33             ` Mikael Magnusson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHYJk3R3hHxEOfJE5Ko+mZakCeV3nuL8ZHw3qytbAj-Vi_W5HA@mail.gmail.com \
    --to=mikachu@gmail.com \
    --cc=zsh-workers@zsh.org \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).