Date: Sun, 1 Jun 2014 02:25:27 +0000
From: Daniel Shahaf
To: Bart Schaefer
Cc: Zsh List Hackers'
Subject: Re: Unicode, Korean, normalization form, Mac OS X and tab completion
Message-ID: <20140601022527.GD1820@tarsus.local2>
References: <20140531201617.4ca60ab8@pws-pc.ntlworld.com> <140531142926.ZM556@torch.brasslantern.com>
In-Reply-To: <140531142926.ZM556@torch.brasslantern.com>
List-Id: Zsh Workers List
X-Seq: 32643
User-Agent: Mutt/1.5.21 (2010-09-15)

Bart Schaefer wrote on Sat, May 31, 2014 at 14:29:26 -0700:
> On May 31, 8:16pm, Peter Stephenson wrote:
> }
> } I'm currently wondering if there is scope for normalising keyboard input
> } really early --- before we feed it back to the shell --- and turning it
> } back into the usual keyboard form right at the end
>
> Per thread with Chet, I think normalizing the filesystem is the easier
> way to go. Keyboard input is already as close to normalized as it needs
> to be, I think, and with only a couple of exceptions all the names we
> get from the filesystem come through zreaddir().

What about, say, people doing 'ls' and copy-pasting a filename from the
output into a command line? Wouldn't that result in NFD keyboard input?

FWIW, while OS X always returns NFD filenames, one could also imagine an
OS that is normalization-aware (forbids creating a file if its
normalized name is the same as the normalized name of an existing file)
but octet-sequence-preserving, and on such an OS both the readdir()
output and the user input would need to be normalized.

Also, other unixes allow you to have both the NFC-form and NFD-form in
the same directory, e.g., 'touch fooá fooá' works just fine on linux
ext4 (the first filename is composed, the second decomposed); in such
cases normalization magic should not be done.

Fun! :-)

Daniel
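
To make the composed/decomposed distinction concrete, here is a minimal sketch in Python. It is not from the thread and says nothing about how zsh does or should handle this; the helper names nfc() and colliding_names() are hypothetical. It shows that the two spellings of 'fooá' are different octet sequences that compare equal only after normalization, and how one might detect a directory listing that contains both forms (the case where normalization should be left alone).

```python
import unicodedata
from collections import defaultdict


def nfc(name: str) -> str:
    """Return the NFC (composed) form of a name. Hypothetical helper."""
    return unicodedata.normalize("NFC", name)


def colliding_names(names):
    """Group names that differ only in normalization form.

    Returns a dict mapping each NFC form to the list of distinct
    spellings seen for it, keeping only groups with more than one
    spelling. Hypothetical helper for illustration.
    """
    groups = defaultdict(list)
    for name in names:
        groups[nfc(name)].append(name)
    return {key: forms for key, forms in groups.items() if len(forms) > 1}


composed = "foo\u00e1"     # precomposed U+00E1 (NFC)
decomposed = "fooa\u0301"  # 'a' followed by U+0301 COMBINING ACUTE ACCENT (NFD)

# The two names render the same but are different strings and octets.
print(composed == decomposed)
print(composed.encode("utf-8"), decomposed.encode("utf-8"))

# After normalizing both sides to one form, they compare equal.
print(nfc(composed) == nfc(decomposed))

# A listing containing both spellings is flagged as a collision.
print(colliding_names([composed, decomposed, "bar"]))
```

On a filesystem like the ext4 example above, a non-empty result from colliding_names() over a directory listing would indicate that both forms really exist side by side, so rewriting either name to the other form would point at the wrong file.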