zsh-users
* Re: Help wanted with debugging a weird glob behavior
       [not found] <0628A0E5-63F0-481E-AEC2-962658134620__9154.55124793283$1566242642$gmane$org@icloud.com>
@ 2019-08-20  7:37 ` Stephane Chazelas
       [not found]   ` <227BE55C-4B7E-4CAD-B212-D48F663BC09D@icloud.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stephane Chazelas @ 2019-08-20  7:37 UTC
  To: Aryn Starr; +Cc: zsh-users

2019-08-19 23:52:23 +0430, Aryn Starr:
> I have an album of mp3 files in `/Users/evar/my-music/Songs/Motion Picture's Soundtracks/More/Cécile Corbel - The Secret World Of Arrietty OST [FLAC]/`. I have put `/Users/evar/my-music/Songs/Motion Picture's Soundtracks/More/Cécile Corbel - The Secret World Of Arrietty OST [FLAC]/20 - Cécile Corbel - Arrietty's Song (original Japanese version).flac` in the file `path` (to help reproduce the bug; the file is accessible at https://git.io/fjF98).
> 
> You can see that the address saved in `path` is valid and points to a file:
[...]

Given that you seem to be on macOS, this possibly has something
to do with
https://unix.stackexchange.com/questions/399927/how-to-rename-filenames-with-accents-on-macos

and with the fact that macOS stores characters in their decomposed
form, which zsh tries to work around by converting them back to
precomposed form.

-- 
Stephane


* Re: Help wanted with debugging a weird glob behavior
       [not found]   ` <227BE55C-4B7E-4CAD-B212-D48F663BC09D@icloud.com>
@ 2019-08-20  8:21     ` Stephane Chazelas
  2019-08-20  8:47       ` Roman Perepelitsa
                         ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Stephane Chazelas @ 2019-08-20  8:21 UTC
  To: Aryn Starr; +Cc: Zsh Users List

2019-08-20 12:21:44 +0430, Aryn Starr:
> Indeed, using `echo "${$(cat path | iconv -f UTF-8-MAC -t UTF-8):h}"/*` works!
> Seeing that `zsh -f` works correctly without this shenanigan, is there an option that disables this? I have oh-my-zsh installed, which might have set an option to that effect ...
[...]

[back on-list]

You probably have the nocaseglob option on. That means that zsh
does read the content of directories to find matches and since
zreaddir returns é in its precomposed form (U+00E9), it matches
neither e<U+0301> (the decomposed form of é) nor E<U+0301> (the
decomposed form of É).
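
To make the mismatch concrete, here is a minimal sketch in plain POSIX shell (nothing zsh-specific) that builds both spellings of é from their UTF-8 bytes and compares them. The variable names `nfc`, `nfd`, `nfc_len`, `nfd_len` and `same` are only illustrative.

```shell
# Precomposed é: the single code point U+00E9, UTF-8 bytes c3 a9.
nfc=$(printf '\303\251')
# Decomposed é: U+0065 followed by combining U+0301, UTF-8 bytes 65 cc 81.
nfd=$(printf 'e\314\201')

# Count the bytes of each form (tr strips the padding some wc builds add).
nfc_len=$(printf '%s' "$nfc" | wc -c | tr -d '[:space:]')
nfd_len=$(printf '%s' "$nfd" | wc -c | tr -d '[:space:]')

# A plain string comparison treats the two spellings as different strings.
if [ "$nfc" = "$nfd" ]; then
  same=yes
else
  same=no
fi

printf 'precomposed: %s bytes, decomposed: %s bytes, equal: %s\n' \
  "$nfc_len" "$nfd_len" "$same"
```

Both strings render as the same glyph, but a matcher that compares them code point by code point, with or without case folding, has no reason to consider them equal.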

There is not much you can do about it (except use that iconv
conversion or install a proper OS ;-)).

If you disable that zsh workaround (for which I think you
need to recompile zsh), then you'll probably get worse problems.

-- 
Stephane


* Re: Help wanted with debugging a weird glob behavior
  2019-08-20  8:21     ` Stephane Chazelas
@ 2019-08-20  8:47       ` Roman Perepelitsa
  2019-08-20  9:02       ` Aryn Starr
       [not found]       ` <A227EBEE-60CC-460D-BBAD-D5E0A3386B4B__40377.5387721666$1566291837$gmane$org@icloud.com>
  2 siblings, 0 replies; 7+ messages in thread
From: Roman Perepelitsa @ 2019-08-20  8:47 UTC
  To: Aryn Starr, Zsh Users List

On Tue, Aug 20, 2019 at 10:22 AM Stephane Chazelas
<stephane.chazelas@gmail.com> wrote:
>
>
> Not much you can do about it (except that iconv conversion or
> install a proper OS ;-)).

HFS+ with its default settings is hands down the worst filesystem in
modern use. It has a case-sensitive mode, but unfortunately it's not
enabled by default and turning it on manually is not easy. Without the
case-insensitive, Unicode-corrupting madness you can get by with HFS+,
even though it'll still be worse than any other modern filesystem.

Roman.


* Re: Help wanted with debugging a weird glob behavior
  2019-08-20  8:21     ` Stephane Chazelas
  2019-08-20  8:47       ` Roman Perepelitsa
@ 2019-08-20  9:02       ` Aryn Starr
       [not found]       ` <A227EBEE-60CC-460D-BBAD-D5E0A3386B4B__40377.5387721666$1566291837$gmane$org@icloud.com>
  2 siblings, 0 replies; 7+ messages in thread
From: Aryn Starr @ 2019-08-20  9:02 UTC
  To: Stephane Chazelas; +Cc: Zsh Users List

Indeed, I do have `nocaseglob` :) Can’t zsh be made to try to match the decomposed form, too? (Perhaps as a new option?) I don’t think putting `iconv`s everywhere is a sustainable practice …
Apart from that, I think it’s a good idea to create a helper script, `zsh-doctor`, that warns users of such possible edge cases in their config. It can check if they are on macOS and have nocaseglob enabled, and print an appropriate warning. In time, this script might save people a lot of trouble.
(GitHub is probably a better home for such a diagnostic script, since it is more accessible and makes pull requests and the like easier …)
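
A minimal sketch of the kind of check such a script could run. No `zsh-doctor` exists in this thread, so the function name `doctor_check_nocaseglob`, its arguments and the warning text are all hypothetical; the caller passes in the OS type and the option state so that the check itself stays plain POSIX shell.

```shell
# Hypothetical helper, not part of zsh.
#   $1: OS type string (for example zsh's $OSTYPE, such as "darwin18.0")
#   $2: "on" or "off", the state of the nocaseglob option
# Prints a warning on stderr and returns 1 for the risky combination,
# otherwise returns 0 silently.
doctor_check_nocaseglob() {
  case $1 in
    darwin*)
      if [ "$2" = on ]; then
        echo "warning: nocaseglob is set on macOS; globs may fail to match file names spelled with decomposed characters" >&2
        return 1
      fi
      ;;
  esac
  return 0
}
```

From zsh, one way to call it would be to compute the state with `[[ -o nocaseglob ]] && state=on || state=off` and then run `doctor_check_nocaseglob "$OSTYPE" "$state"`.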

PS: I’m actually on APFS, not HFS. 



* Re: Help wanted with debugging a weird glob behavior
       [not found]       ` <A227EBEE-60CC-460D-BBAD-D5E0A3386B4B__40377.5387721666$1566291837$gmane$org@icloud.com>
@ 2019-08-20 11:04         ` Stephane Chazelas
  0 siblings, 0 replies; 7+ messages in thread
From: Stephane Chazelas @ 2019-08-20 11:04 UTC
  To: Aryn Starr; +Cc: Zsh Users List

2019-08-20 13:32:31 +0430, Aryn Starr:
> Indeed, I do have `nocaseglob` :) Can’t zsh be made to try to
> match the decomposed form, too? (Perhaps as a new option?) I
> don’t think putting `iconv`s everywhere is a sustainable
> practice …
[...]

I suppose we could have an option similar to nocaseglob, something
like unicodeequivalenceglob, where U+00E9 would match both U+00E9
and U+0065 U+0301 and vice versa, for instance. You could combine
it with nocaseglob, and it could probably be abused in a number of
ways and cause all sorts of security vulnerabilities, like that
HFS+ design and case-insensitive FS/nocaseglob already do.

I don't know if there's a standard C API for that. zsh may need
to pull in an ICU library dependency to implement it. Also note
that normalisation changes with each version of Unicode (like
case-insensitive comparison already changes with the locale and
version of the locale/system).
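
To illustrate what such equivalence matching would mean, here is a deliberately tiny sketch in POSIX shell that accepts a name starting with "Cécile" in either spelling. It hard-codes a single pair of forms and is not how a real option would work: a real one would need full Unicode normalisation. The function name `starts_with_cecile` is hypothetical.

```shell
nfc=$(printf '\303\251')     # precomposed é, U+00E9
nfd=$(printf 'e\314\201')    # decomposed é, U+0065 U+0301

# Hypothetical helper: succeeds if $1 starts with "Cécile" spelled with
# either the precomposed or the decomposed é, and fails otherwise.
starts_with_cecile() {
  case $1 in
    "C${nfc}cile"*|"C${nfd}cile"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Listing the alternatives by hand like this does not scale beyond a demonstration, which is the reason a library such as ICU comes up.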

That sounds like overkill just to work around the misdesigns of macOS.

-- 
Stephane


* Re: Help wanted with debugging a weird glob behavior
  2019-08-19 19:22 Aryn Starr
@ 2019-09-17 15:30 ` Mikael Magnusson
  0 siblings, 0 replies; 7+ messages in thread
From: Mikael Magnusson @ 2019-09-17 15:30 UTC
  To: Aryn Starr; +Cc: Zsh Users

On Mon, Aug 19, 2019 at 9:24 PM Aryn Starr <whereislelouch@icloud.com> wrote:
(this message was in my spam folder, but I don't think anyone else
replied to it)
> But when I get the dirname of that path and do a glob on it, zsh doesn’t find any match:
>
> ```
> ~/TMP/zbug
> $ ec "${$(cat path):h}"/*(D)
> zsh: no matches found: /Users/evar/my-music/Songs/Motion Picture's Soundtracks/More/Cécile Corbel - The Secret World Of Arrietty OST [FLAC]/*(D)
> ```
>
> I have tested this with `zsh -f`, and the bug is not present there. How can I find what in my config is causing this?

The most likely candidate is that you did setopt nobareglobqual, in
which case the example is needlessly complicated. Does a simple
echo .(D)
work for you?

--
Mikael Magnusson


* Help wanted with debugging a weird glob behavior
@ 2019-08-19 19:22 Aryn Starr
  2019-09-17 15:30 ` Mikael Magnusson
  0 siblings, 1 reply; 7+ messages in thread
From: Aryn Starr @ 2019-08-19 19:22 UTC
  To: zsh-users

I have an album of mp3 files in `/Users/evar/my-music/Songs/Motion Picture's Soundtracks/More/Cécile Corbel - The Secret World Of Arrietty OST [FLAC]/`. I have put `/Users/evar/my-music/Songs/Motion Picture's Soundtracks/More/Cécile Corbel - The Secret World Of Arrietty OST [FLAC]/20 - Cécile Corbel - Arrietty's Song (original Japanese version).flac` in the file `path` (to help reproduce the bug; the file is accessible at https://git.io/fjF98).

You can see that the address saved in `path` is valid and points to a file:

```
~/TMP/zbug
$ exa -a --oneline "$(cat path)"
/Users/evar/my-music/Songs/Motion Picture's Soundtracks/More/Cécile Corbel - The Secret World Of Arrietty OST [FLAC]/20 - Cécile Corbel - Arrietty's Song (original Japanese version).flac
```

But when I get the dirname of that path and do a glob on it, zsh doesn’t find any match:

```
~/TMP/zbug
$ ec "${$(cat path):h}"/*(D)
zsh: no matches found: /Users/evar/my-music/Songs/Motion Picture's Soundtracks/More/Cécile Corbel - The Secret World Of Arrietty OST [FLAC]/*(D)
```

I have tested this with `zsh -f`, and the bug is not present there. How can I find what in my config is causing this?




