From: Chet Ramey <chet.ramey@case.edu>
To: Kwon Yeolhyun <yeolhyunkwon@me.com>
Cc: "Zsh List Hackers'" <zsh-workers@zsh.org>, chet.ramey@case.edu
Subject: Re: Unicode, Korean, normalization form, Mac OS X and tab completion
Date: Sat, 31 May 2014 11:21:56 -0400
Message-ID: <5389F394.60205@case.edu>
In-Reply-To: <AB81F9FB-8D84-4656-9EFE-F2F98B196861@me.com>
On 5/30/14, 11:56 PM, Kwon Yeolhyun wrote:
> I have to work with lots of files of Korean names.
> But the problem is that zsh failed in tab completion with Korean files.
> So I’ve done research to figure out what’s going on and I found some keywords such as unicode, normalization form, Mac OS X, and decomposition.
> Also I searched mailing list and read some threads related to unicode or multibyte support.
> But I can’t find any solution.
>
> I’m not an expert about Unicode, zsh, Mac OS X. So I’m asking your help..
Your description and solution are right on the mark. Mac OS X stores and
returns filenames in decomposed Unicode (NFD), while Mac keyboards return
characters in precomposed Unicode (NFC). Decomposed Unicode is as you
describe: certain characters are `decomposed' into multiple codepoints.
(My use of NFD and NFC is not exact, but it's useful shorthand.)
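
To make the difference concrete, here is a small Python sketch (not from the original message) that uses the standard unicodedata module. It treats plain NFC and NFD as stand-ins for the keyboard and file system forms, which is the same shorthand as above; the sample syllable and the codepoints() helper are only illustrative.

```python
import unicodedata

# U+D55C is a single precomposed Hangul syllable, as a keyboard would send it.
nfc = "\ud55c"

# Decomposing it yields three conjoining jamo, roughly the form the
# file system hands back (assumption: plain NFD approximates that form).
nfd = unicodedata.normalize("NFD", nfc)

def codepoints(s):
    """Return the code points of s as a list of U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in s]

print(codepoints(nfc))  # ['U+D55C']
print(codepoints(nfd))  # ['U+1112', 'U+1161', 'U+11AB']
print(nfc == nfd)       # False: the strings look identical but do not compare equal
print(unicodedata.normalize("NFC", nfd) == nfc)  # True once both sides are NFC
```

The two strings render identically on screen, which is why the failed match is hard to spot without looking at the code points.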
What I did in bash was to convert between keyboard and file system
representations when performing filename comparisons for filename
completion. Zsh can do the same using iconv, which provides (on Mac
OS X) the UTF-8-MAC encoding to do the conversion.
One possible strategy is to convert each filename to NFC for comparison,
something like the following:

1. Keyboard input stays in NFC and is converted (dequoted, for example)
   to a `raw' form for comparison.
2. Read the directory, assume each name will be returned in NFD, and
   convert the name to NFC.
3. Perform the comparison using whatever strategy you'd like (e.g., taking
   case into account, mapping equivalent characters, whatever).
4. If the comparison succeeds, add the matching filename (NFC) to the
   list of completions.
5. If you have to add the filename to the command line (e.g., there is a
   single match), you have already converted it to NFC and can insert it
   directly.
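
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the bash or zsh implementation: unicodedata.normalize stands in for the iconv/UTF-8-MAC conversion, the comparison is a plain prefix match, and the names to_nfc(), complete(), and the sample listing are hypothetical.

```python
import unicodedata

def to_nfc(name):
    # Stand-in for the file-system-to-keyboard conversion (assumption:
    # plain NFC is close enough for this illustration).
    return unicodedata.normalize("NFC", name)

def complete(typed, directory_names):
    """Return the NFC forms of the names that start with the typed prefix."""
    prefix = to_nfc(typed)                # step 1: keyboard input, NFC
    matches = []
    for raw in directory_names:           # step 2: names as read from the directory
        candidate = to_nfc(raw)           #         converted from NFD to NFC
        if candidate.startswith(prefix):  # step 3: simple prefix comparison
            matches.append(candidate)     # step 4: keep the NFC form
    return matches                        # step 5: results can be inserted as-is

# Simulate a directory listing that returns decomposed names.
listing = [unicodedata.normalize("NFD", n)
           for n in ["한국어.txt", "한글.md", "notes.txt"]]
typed = "한"

print(complete(typed, listing))                     # ['한국어.txt', '한글.md']
print([n for n in listing if n.startswith(typed)])  # []: naive comparison finds nothing
```

The last line shows the failure being discussed: without the conversion in step 2, the typed NFC prefix never matches the decomposed names.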
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/