From: Bart Schaefer
To: Zsh hackers list
Date: Sun, 1 Jun 2014 12:13:54 -0700
Subject: Re: Unicode, Korean, normalization form, Mac OS X and tab completion
List-Id: Zsh Workers List
X-Seq: 32655
References: <20140531201617.4ca60ab8@pws-pc.ntlworld.com> <140531142926.ZM556@torch.brasslantern.com> <20140601022527.GD1820@tarsus.local2> <140601005624.ZM3283@torch.brasslantern.com>

On Sun, Jun 1, 2014 at 10:00 AM, Jun T.
wrote:

> There is a patch by a Japanese user which simply converts
> file names obtained by readdir() into the composed form ("NFC"):
> https://gist.github.com/waltarix/1403346
> The patch in this gist is against zsh-5.0.0 (I guess).
> I attached the same patch against the current git master below
> (I added defined(__APPLE__) to the #if condition).

Arigatoo gozaimasu!  (Watch me practice my limited and rusty Nihongo.)

> In the current zsh (without this patch),
> $ ls 가
> doesn't work if 가 is input from keyboard (NFC), but works if it is
> pasted from the ls output (NFD). With the patch, the opposite happens.

This is as expected; both might work if patcompile() were also smart
about it.

> For example, if you have a file named über
> in the current directory, with the current zsh (without the patch):
>
> $ ls u # completes to über (useful for some user??)
> $ ls ü # fails to complete
>
> and u* matches with über while ü* doesn't.
> With the patch, we get the opposite behavior.

The current behavior here is pretty much by accident, because the
decomposed character for "ü" happens to be "u+umlaut" and (if I'm
reading this correctly) at the lowest level the pattern match is applied
octet-wise rather than character-wise, so "*" matches the umlaut and "u"
is considered a prefix.  Arguably the current behavior is wrong.
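
To make the octet-wise point concrete, here is a minimal Python sketch. It is not zsh code: prefix_matches() is a hypothetical stand-in for a byte-level prefix comparison, and it assumes the file name comes back from the filesystem in plain NFD, as described for the ls output in this thread.

```python
import unicodedata

# "über" as typed on a keyboard: precomposed U+00FC (NFC).
name_nfc = "\u00fcber"
# The same name decomposed (NFD): "u" + U+0308 COMBINING DIAERESIS + "ber".
name_nfd = unicodedata.normalize("NFD", name_nfc)


def prefix_matches(prefix: str, filename: str) -> bool:
    """Hypothetical byte-wise prefix test, standing in for an octet-wise match."""
    return filename.encode("utf-8").startswith(prefix.encode("utf-8"))


# Against the decomposed name, a bare "u" is a byte prefix, typed "ü" is not.
print(prefix_matches("u", name_nfd))        # True
print(prefix_matches("\u00fc", name_nfd))   # False

# Composing the name first (roughly what the patch does) flips both results.
recomposed = unicodedata.normalize("NFC", name_nfd)
print(prefix_matches("u", recomposed))       # False
print(prefix_matches("\u00fc", recomposed))  # True
```

Normalizing both the pattern and the file name to the same form before comparing would let either input match, which is roughly what "smart about it" would mean for the pattern side.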