From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm
List-Id: Zsh Workers List
X-Seq: 32652
Content-Type: text/plain; charset=utf-8
Subject: Re: Unicode, Korean, normalization form, Mac OS X and tab completion
From: "Jun T."
In-Reply-To: <140601005624.ZM3283@torch.brasslantern.com>
Date: Mon, 2 Jun 2014 02:00:55 +0900
References: <20140531201617.4ca60ab8@pws-pc.ntlworld.com> <140531142926.ZM556@torch.brasslantern.com> <20140601022527.GD1820@tarsus.local2> <140601005624.ZM3283@torch.brasslantern.com>
To: zsh-workers@zsh.org

There is a patch by a Japanese user which simply converts file names
obtained by readdir() into the composed form ("NFC"):

https://gist.github.com/waltarix/1403346

The patch in this gist is against zsh-5.0.0 (I guess). I attached the
same patch against the current git master below (I added
defined(__APPLE__) to the #if condition).

We may use this to see what kind of problems appear with this simple
approach. Kwon Yeolhyun, can you test this patch?

In the current zsh (without this patch),

$ ls 가

doesn't work if 가 is input from the keyboard (NFC), but works if it is
pasted from the ls output (NFD). With the patch, the opposite happens.

Of course this patch affects not only Korean but any language which
has decomposable characters.
For example, if you have a file named über in the current directory,
with the current zsh (without the patch):

$ ls u     # completes to über (useful for some users??)
$ ls ü     # fails to complete

and u* matches über while ü* doesn't.
With the patch, we get the opposite behavior.

Jun

diff --git a/Src/utils.c b/Src/utils.c
index 9439227..86b61f1 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -4270,6 +4270,13 @@ mod_export char *
 zreaddir(DIR *dir, int ignoredots)
 {
     struct dirent *de;
+#if defined(HAVE_ICONV) && defined(__APPLE__)
+    static iconv_t conv_ds = (iconv_t)NULL;
+    static char *conv_name = (char *)NULL;
+    char *temp_name;
+    char *temp_name_ptr, *orig_name_ptr;
+    size_t temp_name_len, orig_name_len;
+#endif
 
     do {
         de = readdir(dir);
@@ -4278,6 +4285,23 @@ zreaddir(DIR *dir, int ignoredots)
     } while(ignoredots && de->d_name[0] == '.' &&
             (!de->d_name[1] || (de->d_name[1] == '.' && !de->d_name[2])));
 
+#if defined(HAVE_ICONV) && defined(__APPLE__)
+    if (!conv_ds)
+        conv_ds = iconv_open("UTF-8", "UTF-8-MAC");
+    if (conv_ds) {
+        orig_name_ptr = de->d_name;
+        orig_name_len = strlen(de->d_name);
+        conv_name = zrealloc(conv_name, orig_name_len+1);
+        temp_name_ptr = conv_name;
+        temp_name_len = orig_name_len;
+        if (iconv(conv_ds,&orig_name_ptr,&orig_name_len,&temp_name_ptr,&temp_name_len) >= 0) {
+            *temp_name_ptr = '\0';
+            temp_name = conv_name;
+            return metafy(temp_name, -1, META_STATIC);
+        }
+    }
+#endif
+
     return metafy(de->d_name, -1, META_STATIC);
 }
 
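
To make the über example concrete, here is a small standalone sketch in C.
It is not part of the patch. The names has_prefix(), to_composed(),
UBER_NFC and UBER_NFD are made up for illustration. It shows why a
byte-wise prefix match sees "u" as a prefix of the decomposed name only,
and how the same "UTF-8-MAC" to "UTF-8" conversion could be written with
the error checks POSIX specifies: iconv_open() reports failure as
(iconv_t)-1 and iconv() as (size_t)-1. Because iconv() returns a size_t,
a test such as ">= 0" is always true. I assume here that the "UTF-8-MAC"
converter exists only in some iconv implementations (Apple's, for
instance), so the sketch simply returns -1 elsewhere.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* "über" as UTF-8 bytes in the two normalization forms.
 * The literals are split so "b" is not read as part of the hex escape. */
static const char UBER_NFC[] = "\xc3\xbc" "ber";   /* U+00FC */
static const char UBER_NFD[] = "u\xcc\x88" "ber";  /* U+0075 U+0308 */

/* Byte-wise prefix test: what a naive match on raw file names sees. */
static int has_prefix(const char *name, const char *prefix)
{
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Convert a decomposed name to composed UTF-8 (hypothetical helper).
 * Returns 0 on success, -1 if the converter is unavailable, the
 * conversion fails, or the output buffer is too small.
 */
static int to_composed(const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv() takes char **, not const */
    char *outp = out;
    size_t inleft = strlen(in);
    size_t outleft;
    size_t rc;

    if (outsize == 0)
        return -1;
    outleft = outsize - 1;      /* keep room for the trailing NUL */

    cd = iconv_open("UTF-8", "UTF-8-MAC");
    if (cd == (iconv_t)-1)      /* failure is (iconv_t)-1, not NULL */
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)       /* failure is (size_t)-1 */
        return -1;

    *outp = '\0';
    return 0;
}
```

With these, has_prefix(UBER_NFD, "u") is true and has_prefix(UBER_NFC, "u")
is false, which is the u* versus ü* difference described above. Where the
converter is available, to_composed(UBER_NFD, buf, sizeof buf) should
leave the bytes of UBER_NFC in buf.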