From: Mikael Magnusson
Date: Thu, 17 Mar 2022 18:08:50 +0100
Subject: Re: PATCH: Fix inverted condition for unique completions
To: zsh-workers@zsh.org
In-Reply-To: <20220315115648.10521-1-mikachu@gmail.com>
References: <20220315115648.10521-1-mikachu@gmail.com>
X-Seq: 49864

On 3/15/22, Mikael Magnusson wrote:
> This change makes most-recent-file[1] completion take a few milliseconds
> instead of close to a minute in a directory of 30000 files.
> I'm not sure exactly what's going on with these flags, but surely if they
> are not set, then we should not deduplicate the set of matches. Most
> likely this was a thinko-copy-paste-o from the first NOSORT case which
> does have to be negated. I'm not sure why regular tab-press was always
> fast, but it is still fast after this change for me.
>
> I don't know when these flags are supposed to be set, so I haven't tested
> that case.
>
> [1]
> zstyle ':completion:(*-|)most-recent-file:*' match-original both
> zstyle ':completion:(*-|)most-recent-file:*' file-sort modification
> zstyle ':completion:(*-|)most-recent-file:*' file-patterns '*:all\ files'
> zstyle ':completion:most-recent-file:*' hidden all
> zstyle ':completion:(*-|)most-recent-file:*' completer _files

Okay, I set out to test the case when the flags are set and waded
through some confusion. I think the conclusion is that the comments in
the header are wrong, the names of the constants are misleading, and
_path_files should probably use -V -2, since files can't be duplicated
anyway. Is there a useful case where the same file could be added twice
(i.e. not just running _files twice by mistake)?

Here's the wading through confusion part:

compcore.c has code like this (without my patch):

	if (!(flags & CGF_UNIQCON)) {
	    int dup;

	    /* And delete the ones that occur more than once. */
...
	    if (!(flags & CGF_UNIQALL) && !(flags & CGF_UNIQCON)) {
		int dup;
	} else if (!(flags & CGF_UNIQCON)) {
	    int dup;
...

Checking the comments for these flags:

#define CGF_UNIQALL  8		/* remove all duplicates */
#define CGF_UNIQCON 16		/* remove consecutive duplicates */

So far it clearly seems like the checks are inverted.

addmatches() does this:

    gflags = (((dat->aflags & CAF_NOSORT ) ? CGF_NOSORT  : 0) |
...
              ((dat->aflags & CAF_UNIQALL) ? CGF_UNIQALL : 0) |
              ((dat->aflags & CAF_UNIQCON) ? CGF_UNIQCON : 0));

Okay, so we've copied over some other flags (note the CAF vs CGF). What
do those say?

#define CAF_UNIQCON  8    /* compadd -2: don't deduplicate */
#define CAF_UNIQALL 16    /* compadd -1: deduplicate */

Okay... that's somewhat contradictory. Let's check the manpage:

       -1     If given together with the -V option, makes only consecutive
              duplicates in the group be removed. If combined with the -J
              option, this has no visible effect. Note that groups with and
              without this flag are in different name spaces.

       -2     If given together with the -J or -V option, makes all
              duplicates be kept. Again, groups with and without this flag
              are in different name spaces.

Okay, so UNIQALL means to remove only consecutive duplicates, and
UNIQCON means keep all duplicates. Right. That doesn't make any sense,
but it does match what the original code does (I think).

With that in mind, I have the following patch instead. Any objections to
this? (Again, it makes completion instant instead of taking minutes in
large directories.) If there are objections, we can probably improve the
deduplication algorithm from n^2 to n log n: sort an array of pointers
and use that to manipulate the original array, keeping the relative
order of the kept elements. (In fact, the input data to the function is
a list of pointers, which we copy to an array of elements that is then
manipulated.)

PS: the manpage says -V is required for -1/-2, but -J with -o nosort
works as well, which is what happens below.

-- 8< --
Subject: PATCH: _path_files: keep duplicates when not sorting

The deduplication algorithm is too slow to run on very large sets of
matches, and files can't be duplicates anyway.
---
 Completion/Unix/Type/_path_files | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Completion/Unix/Type/_path_files b/Completion/Unix/Type/_path_files
index c09ca6fa9e..2ee3030785 100644
--- a/Completion/Unix/Type/_path_files
+++ b/Completion/Unix/Type/_path_files
@@ -168,7 +168,7 @@ if zstyle -s ":completion:${curcontext}:" file-sort tmp1; then
   if [[ "$sort" = on ]]; then
     sort=
   else
-    mopts=( -o nosort "${mopts[@]}" )
+    mopts=( -o nosort -2 "${mopts[@]}" )
 
     tmp2=()
     for tmp1 in "$pats[@]"; do
-- 
Mikael Magnusson
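
For reference, the n log n idea mentioned before the patch could look
roughly like the sketch below. It is only an illustration under stated
assumptions, not the code in compcore.c: matches are treated as plain C
strings compared with strcmp(), whereas the real code works on zsh's own
match structures with its own comparison. The names struct keyed,
keyed_cmp and dedup_keep_order are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: a string paired with its original position. */
struct keyed {
    const char *str;
    size_t idx;
};

/* Order by string, then by original index, so that the first
 * occurrence of each string sorts first within its run. */
static int keyed_cmp(const void *a, const void *b)
{
    const struct keyed *ka = a, *kb = b;
    int c = strcmp(ka->str, kb->str);

    if (c)
        return c;
    return (ka->idx > kb->idx) - (ka->idx < kb->idx);
}

/*
 * Remove all duplicates from items[0..n), keeping the first occurrence
 * of each string and the relative order of the survivors.
 * Returns the new count, or n unchanged if allocation fails.
 */
size_t dedup_keep_order(const char **items, size_t n)
{
    struct keyed *sorted;
    unsigned char *drop;
    size_t i, out = 0;

    assert(items != NULL || n == 0);
    if (n < 2)
        return n;

    sorted = malloc(n * sizeof(*sorted));
    drop = calloc(n, 1);
    if (!sorted || !drop) {
        free(sorted);
        free(drop);
        return n;
    }

    for (i = 0; i < n; i++) {
        sorted[i].str = items[i];
        sorted[i].idx = i;
    }
    /* Sort the side array only; the original order stays untouched. */
    qsort(sorted, n, sizeof(*sorted), keyed_cmp);

    /* Within each run of equal strings, mark every entry after the first. */
    for (i = 1; i < n; i++)
        if (!strcmp(sorted[i].str, sorted[i - 1].str))
            drop[sorted[i].idx] = 1;

    /* Compact the original array in place, preserving relative order. */
    for (i = 0; i < n; i++)
        if (!drop[i])
            items[out++] = items[i];

    free(sorted);
    free(drop);
    return out;
}
```

The sort dominates the cost; the marking and compaction passes are
linear. qsort() is typically n log n, though the C standard does not
guarantee it. Called on { "b", "a", "b", "c", "a" }, the function leaves
"b", "a", "c" at the front of the array and returns 3.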