zsh-workers
From: Bart Schaefer <schaefer@brasslantern.com>
To: "Zsh List Hackers'" <zsh-workers@zsh.org>
Subject: Re: Unicode, Korean, normalization form, Mac OS X and tab completion
Date: Sun, 01 Jun 2014 00:56:24 -0700
Message-ID: <140601005624.ZM3283@torch.brasslantern.com>
In-Reply-To: <20140601022527.GD1820@tarsus.local2>

On Jun 1,  2:25am, Daniel Shahaf wrote:
}
} What about, say, people doing 'ls' and copy-pasting a filename from the
} output into a command line?  Wouldn't that result in NFD keyboard
} input?

Yes, but there's only so far that it makes sense to go with this.  For
example, [[ fooá = fooá ]] (composed á on the left, decomposed a plus
combining acute accent on the right) arguably should not normalize, and
script file contents should not be normalized, etc.  I think messing with
the command input stream will create more problems than it solves.
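
To make the two spellings concrete, here is a minimal sketch in Python
(not zsh code; Python's standard unicodedata module is used purely for
illustration, and the helper name is hypothetical).  It shows that the
composed and decomposed forms are different code point sequences that
compare equal only once both sides are normalized to the same form.

```python
import unicodedata

composed = "foo\u00e1"      # precomposed U+00E1 (NFC spelling)
decomposed = "fooa\u0301"   # "a" followed by combining acute U+0301 (NFD spelling)

def same_after_normalizing(left, right, form="NFC"):
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, left) == unicodedata.normalize(form, right)

print(composed == decomposed)                        # False: raw sequences differ
print(same_after_normalizing(composed, decomposed))  # True: equal once normalized
# The UTF-8 encodings differ in length as well: 5 bytes versus 6 bytes.
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))
```

A [[ ... ]] test that compares the raw strings corresponds to the first
print; normalizing inside the test would correspond to the second.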

What we *might* need is for patcompile() also to normalize (though that
potentially violates what I just said about [[ ... ]], depending on which
encoding is the pattern and which is the string to be matched).  Maybe
this needs to be part of the (#u) qualifier handling, or a related new
qualifier.

(Note there's little to no existing support for wide characters in e.g.
matcher-list range specifications, so no point in going there yet.)

} FWIW, while OS X always returns NFD filenames, one could also imagine an
} OS that is normalization-aware (forbids creating a file if its
} normalized name is the same as the normalized name of an existing file)
} but octet-sequence-preserving, and on such an OS both the readdir()
} output and the user input would need to be normalized.

This case is ultimately the same as your first example.  Either the two
forms of name should be treated the same, in which case normalizing the
results of readdir() is enough, or they should be treated as different
even though you aren't allowed to create both of them, in which case
they should not be normalized at all (and then there better be some way
outside the shell, e.g., at the TTY driver layer, to choose the input
encoding).
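
As a rough illustration of the "normalize the results of readdir()"
option, here is a Python sketch (again not zsh code).  The listing is
simulated rather than read from a real directory, the function names
are hypothetical, and NFC is only an assumed input form.

```python
import unicodedata

def normalized_listing(readdir_names, form="NFC"):
    """Normalize every name from a (simulated) readdir() to one form."""
    return [unicodedata.normalize(form, name) for name in readdir_names]

def candidates(prefix, names):
    """Plain prefix match; the typed prefix itself is left untouched."""
    return [name for name in names if name.startswith(prefix)]

# Simulated OS X listing: names come back decomposed (NFD).
listing = ["fooa\u0301.txt", "bar.txt"]
# Keyboard input, assumed to arrive composed (NFC).
typed = "foo\u00e1"

print(candidates(typed, listing))                      # no match on raw names
print(candidates(typed, normalized_listing(listing)))  # matches after normalizing
```

Only the directory listing is normalized here; the command input is
used as typed, in line with not touching the input stream.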

Maybe the completion system should use (#u) more often, or maybe there
needs to be a setopt to cause all patterns to act as if (#u) ...

If there's a tricky bit, it's knowing which encoding is the default for
input so you can normalize to that one.

} Also, other unixes allow you to have both the NFC-form and NFD-form in
} the same directory, e.g., 'touch fooá fooá' works just fine on Linux
} ext4 (the first filename is composed, the second decomposed); in such
} cases normalization magic should not be done.

Hence my question about what compile-time tests we need for this, and
what if anything to do about Mac filesystems mounted on Linux.
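
This is not the compile-time test asked about above, but one possible
runtime heuristic can be sketched in Python: look for names in a single
listing that differ only by normalization form.  The function name and
the sample listing are hypothetical, and the listing is simulated
rather than read from an ext4 directory.

```python
import unicodedata
from collections import defaultdict

def normalization_collisions(names, form="NFC"):
    """Group names by normalized form; return groups with more than one member."""
    groups = defaultdict(list)
    for name in names:
        groups[unicodedata.normalize(form, name)].append(name)
    return [group for group in groups.values() if len(group) > 1]

# Simulated listing holding both the composed and the decomposed name.
ext4_listing = ["foo\u00e1", "fooa\u0301", "bar"]

# A non-empty result means normalizing would merge distinct files.
print(normalization_collisions(ext4_listing))
```

A directory for which this returns a non-empty list is one where the
"normalization magic" would hide one file behind another.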

