zsh-workers
 help / color / mirror / code / Atom feed
* Bug report; git tab-completion on macOS within paths containing unicode characters
@ 2024-04-26  7:17 Yves Delley
  2024-04-27 19:14 ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Yves Delley @ 2024-04-26  7:17 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1715 bytes --]

Hi,

I may have found an issue with tab-completion for git on macOS (using
oh-my-zsh; but they indicate that git tab-completion is an upstream issue;
see https://github.com/ohmyzsh/ohmyzsh/issues/12380).
If I attempt to tab-complete on`git add` within a repo whose path contains
unicode characters in NFD (decomposed) form, no completions are presented.
With unicode characters in NFC form, everything works as expected. It
appears that the macOS Finder application writes folder names in NFD form,
whereas on Terminal.app, you have to jump through some hoops to get there.

Steps to reproduce:

On macOS 14.4, with OMZ installed through homebrew and with the `git`
plugin enabled, try the following:
 - Verify that command completion works normally:
    - `mkdir /tmp/test1 && cd /tmp/test1`
    - `git init`
    - `touch test.txt`
    - Type `git add t`, then press TAB and see `test.txt` being suggested
    - Optional: `rmdir -r /tmp/test1`
 - Repeat with a unicode path in NFD form
   - `cd /tmp`
   - `mkdir $(echo 'u\xcc\x88')`. I did not find a better way to create a
folder with a unicode character in NFD form using the Terminal. In
particular, typing the corresponding key `ü` on a `de_CH` keyboard creates
that character in NFC form.
   - `cd ü/` - here, it doesn't matter which form you use; after all, they
are still the same unicode character
   - `mkdir test2`
   -  `cd test2`
   - `git init`
   - `touch test.txt`
   - Type `git add t`, then press TAB; nothing is being suggested
   - Optional: `rmdir -r /tmp/ü/test2`

Software versions:
 - macOS 14.4.1
 - zsh 5.9
Locale:
  LC_ALL=de_CH.UTF-8
  LANG=en_GB.UTF-8
  LC_CTYPE=UTF-8

Thank you,
Yves

[-- Attachment #2: Type: text/html, Size: 1932 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-26  7:17 Bug report; git tab-completion on macOS within paths containing unicode characters Yves Delley
@ 2024-04-27 19:14 ` Bart Schaefer
  2024-04-27 19:31   ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2024-04-27 19:14 UTC (permalink / raw)
  To: pu.y; +Cc: zsh-workers

I'm uncertain that this is zsh's fault.  "git ls-files" is returning
that the directory is not a git repository, at least when I try to
reproduce.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-27 19:14 ` Bart Schaefer
@ 2024-04-27 19:31   ` Bart Schaefer
  2024-04-28  7:22     ` Yves Delley
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2024-04-27 19:31 UTC (permalink / raw)
  To: pu.y; +Cc: zsh-workers

On Sat, Apr 27, 2024 at 12:14 PM Bart Schaefer
<schaefer@brasslantern.com> wrote:
>
> I'm uncertain that this is zsh's fault.  "git ls-files" is returning
> that the directory is not a git repository, at least when I try to
> reproduce.

See if this helps -- I'm not sure this is the only (or even best)
place to apply a change, but I'm curious whether it improves your
results.

diff --git a/Completion/Unix/Command/_git b/Completion/Unix/Command/_git
index 7370aaead..47c386106 100644
--- a/Completion/Unix/Command/_git
+++ b/Completion/Unix/Command/_git
@@ -7526,7 +7526,7 @@ __git_files () {
   # TODO: --directory should probably be added to $opts when --others is given.

   local pref=${(Q)${~PREFIX}}
-  [[ $pref[1] == '/' ]] || pref=$gittoplevel$gitprefix$pref
+  [[ $pref[1] == '/' ]] || pref=${(qqqq)gittoplevel}$gitprefix$pref

   # First allow ls-files to pattern-match in case of remote repository. Use the
   # icase pathspec magic word to ensure that we support case-insensitive path


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-27 19:31   ` Bart Schaefer
@ 2024-04-28  7:22     ` Yves Delley
  2024-04-28 15:28       ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Yves Delley @ 2024-04-28  7:22 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

Hey, thanks for looking into this! I did a quick test, where I
manually applied the patch to
`/opt/homebrew/Cellar/zsh/5.9/share/zsh/functions/_git` of my
currently running install (the only file by the name of `_git` that I
could `find`). As far as I could tell, that had no effect (I did open
a new terminal, but I didn't do a fresh OS login). Also, for me, `git
ls-files` seems to work normally independent of having unicode
characters in the path or not. I did find a somewhat related issue on
the fish issue tracker:
https://github.com/fish-shell/fish-shell/issues/474, suggesting that
this is in fact specific to the filesystem being used. The thread
suggests that bash managed to fix this issue. The obvious solution is
that whatever function/program does the globbing should apply some
form of unicode normalisation to both sides before the match (is this
what `qqqq` does here?). However, apparently that's easier said than
done.


On Sat, 27 Apr 2024 at 21:31, Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Sat, Apr 27, 2024 at 12:14 PM Bart Schaefer
> <schaefer@brasslantern.com> wrote:
> >
> > I'm uncertain that this is zsh's fault.  "git ls-files" is returning
> > that the directory is not a git repository, at least when I try to
> > reproduce.
>
> See if this helps -- I'm not sure this is the only (or even best)
> place to apply a change, but I'm curious whether it improves your
> results.
>
> diff --git a/Completion/Unix/Command/_git b/Completion/Unix/Command/_git
> index 7370aaead..47c386106 100644
> --- a/Completion/Unix/Command/_git
> +++ b/Completion/Unix/Command/_git
> @@ -7526,7 +7526,7 @@ __git_files () {
>    # TODO: --directory should probably be added to $opts when --others is given.
>
>    local pref=${(Q)${~PREFIX}}
> -  [[ $pref[1] == '/' ]] || pref=$gittoplevel$gitprefix$pref
> +  [[ $pref[1] == '/' ]] || pref=${(qqqq)gittoplevel}$gitprefix$pref
>
>    # First allow ls-files to pattern-match in case of remote repository. Use the
>    # icase pathspec magic word to ensure that we support case-insensitive path


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-28  7:22     ` Yves Delley
@ 2024-04-28 15:28       ` Bart Schaefer
  2024-04-28 18:59         ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2024-04-28 15:28 UTC (permalink / raw)
  To: pu.y; +Cc: zsh-workers

On Sun, Apr 28, 2024 at 12:23 AM Yves Delley <pu.y@delley.net> wrote:
>
> [...] for me, `git ls-files` seems to work normally

Try it with
  git ls-files $PWD
vs
  git ls-files ${(qqqq)PWD}

I think the problem is that NFD form ends up having the first byte of
the second nybble misinterpreted, so the string gets truncated at that
point.  That visibly happens in xterm (XQuartz) when I try to
copy-paste the pathname.  The (qqqq) flag just quotes the entire
string differently to push that through to ls-files, which then
properly composes it.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-28 15:28       ` Bart Schaefer
@ 2024-04-28 18:59         ` Bart Schaefer
  2024-04-28 19:01           ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2024-04-28 18:59 UTC (permalink / raw)
  To: pu.y; +Cc: zsh-workers

This used to work in 5.7.1 (default shell on Catalina).  Then this
change was made:

-  local pref=$gitcdup$gitprefix$PREFIX
+  local pref=${(Q)${~PREFIX}}
+  [[ $pref[1] == '/' ]] || pref=$gittoplevel$gitprefix$pref

Unquoting the prefix with (Q) seems to have broken it.  That came from
workers/45313 ("revised version of workers/41523").  Daniel?

A different fix to try:

diff --git a/Completion/Unix/Command/_git b/Completion/Unix/Command/_git
index 7370aaead..22b945e38 100644
--- a/Completion/Unix/Command/_git
+++ b/Completion/Unix/Command/_git
@@ -7531,7 +7531,7 @@ __git_files () {
   # First allow ls-files to pattern-match in case of remote repository. Use the
   # icase pathspec magic word to ensure that we support case-insensitive path
   # completion for users with the appropriate matcher configuration
-  files=(${(0)"$(_call_program files git ls-files -z
--exclude-standard ${(q)opts} -- ${(q)${pref:+:\(icase\)$pref\*}:-.}
2>/dev/null)"})
+  files=(${(0)"$(_call_program files git ls-files -z
--exclude-standard ${(q)opts} --
${(q)${pref:+:\(icase\)${(qq)pref}\*}:-.} 2>/dev/null)"})
   __git_command_successful $pipestatus || return

   # If ls-files succeeded but returned nothing, try again with no pattern. Note


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-28 18:59         ` Bart Schaefer
@ 2024-04-28 19:01           ` Bart Schaefer
  2024-04-29 17:37             ` Jun. T
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2024-04-28 19:01 UTC (permalink / raw)
  To: pu.y; +Cc: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 20 bytes --]

Ugh, line wrapping.

[-- Attachment #2: __git_files.txt --]
[-- Type: text/plain, Size: 862 bytes --]

diff --git a/Completion/Unix/Command/_git b/Completion/Unix/Command/_git
index 7370aaead..22b945e38 100644
--- a/Completion/Unix/Command/_git
+++ b/Completion/Unix/Command/_git
@@ -7531,7 +7531,7 @@ __git_files () {
   # First allow ls-files to pattern-match in case of remote repository. Use the
   # icase pathspec magic word to ensure that we support case-insensitive path
   # completion for users with the appropriate matcher configuration
-  files=(${(0)"$(_call_program files git ls-files -z --exclude-standard ${(q)opts} -- ${(q)${pref:+:\(icase\)$pref\*}:-.} 2>/dev/null)"})
+  files=(${(0)"$(_call_program files git ls-files -z --exclude-standard ${(q)opts} -- ${(q)${pref:+:\(icase\)${(qq)pref}\*}:-.} 2>/dev/null)"})
   __git_command_successful $pipestatus || return
 
   # If ls-files succeeded but returned nothing, try again with no pattern. Note

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-28 19:01           ` Bart Schaefer
@ 2024-04-29 17:37             ` Jun. T
  2024-04-29 17:54               ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Jun. T @ 2024-04-29 17:37 UTC (permalink / raw)
  To: zsh-workers


> 2024/04/29 4:01、Bart Schaefer <schaefer@brasslantern.com>のメール:
> 
> <__git_files.txt>

With this patch, 'git add t<TAB>' apparently works, but
_complete_debug indicates that the problem still remains.

In __git_files (called from __git_other_files), 'git ls-files'
is called two times.
The first one is (omitting a few irrelevant options):
    git ls-files --others -- ':\(icase\)\''/path/to/ü/test2/t\''\*'
and this gives nothing (no files, no error messages).
Then __git_files tries the second call:
    git ls-files --others --
and this returns 'test.txt'. But this returns only the files in
the current directory and its subdirectories. So if we are in a
subdirectory then files in the top directory are not listed.

I tried
    git ls-files path
with path in both in NFC and NFD (by using $'\x..'), but both
didn't work.
I guess this is a problem of git itself.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-29 17:37             ` Jun. T
@ 2024-04-29 17:54               ` Bart Schaefer
  2024-04-30  0:41                 ` Jun. T
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2024-04-29 17:54 UTC (permalink / raw)
  To: Jun. T; +Cc: zsh-workers

On Mon, Apr 29, 2024 at 10:38 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>
> In __git_files (called from __git_other_files), 'git ls-files'
> is called two times.

Comments in the function indicate this is intentional:
  # If ls-files succeeded but returned nothing, try again with no pattern. Note
  # that ls-files defaults to the CWD if not given a path, so if the file we
  # were trying to add is in an *adjacent* directory, this won't return anything
  # helpful either


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-29 17:54               ` Bart Schaefer
@ 2024-04-30  0:41                 ` Jun. T
  2024-04-30  1:57                   ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Jun. T @ 2024-04-30  0:41 UTC (permalink / raw)
  To: zsh-workers


> 2024/04/30 2:54、Bart Schaefer <schaefer@brasslantern.com>のメール:
> 
> On Mon, Apr 29, 2024 at 10:38 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>> 
>> In __git_files (called from __git_other_files), 'git ls-files'
>> is called two times.
> 
> Comments in the function indicate this is intentional:

Yes, but the second call is just for "better than nothing",
so the first call should better succeed, I think.

I tried
git ls-files $'/path/to/\xc3\xbc/test2'
and
git ls-files $'/path/to/u\xcc\x88/test2'
but both didn't work. Isn't this a bug of git?

# git does not consider \''/path/to/ü/test2/t\''\* as an
# absolute path.




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-30  0:41                 ` Jun. T
@ 2024-04-30  1:57                   ` Bart Schaefer
  2024-04-30  2:31                     ` Jun. T
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2024-04-30  1:57 UTC (permalink / raw)
  To: Jun. T; +Cc: zsh-workers

On Mon, Apr 29, 2024 at 5:41 PM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>
> > 2024/04/30 2:54、Bart Schaefer <schaefer@brasslantern.com>のメール:
> >
> > On Mon, Apr 29, 2024 at 10:38 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
> >>
> >> In __git_files (called from __git_other_files), 'git ls-files'
> >> is called two times.
> >
> > Comments in the function indicate this is intentional:
>
> Yes, but the second call is just for "better than nothing",
> so the first call should better succeed, I think.

I don't think the first call ever succeeds on files that are not
already known to git, which makes it odd to call it for "git add" in
the first place.  I'm not sure what the intention was here.  Hence my
attempt to call out Daniel.

> I tried
> git ls-files $'/path/to/\xc3\xbc/test2'
> and
> git ls-files $'/path/to/u\xcc\x88/test2'
> but both didn't work. Isn't this a bug of git?

Neither of those worked for me either, even without NFD characters in
the file name, until I was cd'd inside the repository.

It appears that "git ls-files" has to find a .git/ before it will do
any comparisons, even given an absolute path to a(nother) repository.
I don't know whether to class this as a bug in git.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report; git tab-completion on macOS within paths containing unicode characters
  2024-04-30  1:57                   ` Bart Schaefer
@ 2024-04-30  2:31                     ` Jun. T
  0 siblings, 0 replies; 12+ messages in thread
From: Jun. T @ 2024-04-30  2:31 UTC (permalink / raw)
  To: zsh-workers


> 2024/04/30 10:57、Bart Schaefer <schaefer@brasslantern.com>のメール:
> 
> On Mon, Apr 29, 2024 at 5:41 PM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>> 
>>> 2024/04/30 2:54、Bart Schaefer <schaefer@brasslantern.com>のメール:
>>> 
>>> On Mon, Apr 29, 2024 at 10:38 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>>>> 
>>>> In __git_files (called from __git_other_files), 'git ls-files'
>>>> is called two times.
>>> 
>>> Comments in the function indicate this is intentional:
>> 
>> Yes, but the second call is just for "better than nothing",
>> so the first call should better succeed, I think.
> 
> I don't think the first call ever succeeds on files that are not
> already known to git,

'git ls-fils --others' gives untracked files.

>> I tried
>> git ls-files $'/path/to/\xc3\xbc/test2'
>> and
>> git ls-files $'/path/to/u\xcc\x88/test2'
>> but both didn't work. Isn't this a bug of git?
> 
> Neither of those worked for me either, even without NFD characters in
> the file name, until I was cd'd inside the repository.

If there are NFD characters, they do not work even inside the repo.
But I think at least one of them should work.

% cd /path/to/ü
% git ls-files $'/path/to/u\xcc\x88'	# NFD
fatal: /path/to/ü: '/path/to/ü' is outside repository at '/path/to/ü'
(in the error message, the 1st and 2nd ü is NFC, the 3rd is NFD)

I run git within a debugger (lldb), and noticed that, on macOS,
git internally converted the pathspec on the command line (for example
$'/path/to/u\xcc\x88/test2') into NFC. But it treats the top level
working directory name in NFD, so the pathspec (NFC) is considered
as outside of the working tree (NFD).

I will send a bug report to git@vger.kernel.org.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-04-30  2:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-26  7:17 Bug report; git tab-completion on macOS within paths containing unicode characters Yves Delley
2024-04-27 19:14 ` Bart Schaefer
2024-04-27 19:31   ` Bart Schaefer
2024-04-28  7:22     ` Yves Delley
2024-04-28 15:28       ` Bart Schaefer
2024-04-28 18:59         ` Bart Schaefer
2024-04-28 19:01           ` Bart Schaefer
2024-04-29 17:37             ` Jun. T
2024-04-29 17:54               ` Bart Schaefer
2024-04-30  0:41                 ` Jun. T
2024-04-30  1:57                   ` Bart Schaefer
2024-04-30  2:31                     ` Jun. T

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).