List-Id: Zsh Users List
X-Seq: 17233
Date: Fri, 7 Sep 2012 18:48:21 -0400 (EDT)
From: "Benjamin R. Haskell"
To: ☈king
cc: zsh-users@zsh.org
Subject: Re: Encoding bug?
In-Reply-To: <504A24BE.3090904@sharpsaw.org>
References: <504A24BE.3090904@sharpsaw.org>
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="-1463810530-1093797140-1347058125=:24430"

---1463810530-1093797140-1347058125=:24430
Content-Type: TEXT/PLAIN; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8BIT

On Fri, 7 Sep 2012, ☈king wrote:

> In v5.0.0 (revision 361e171), if I do:
>
>     % echo ♔
>     % r
>
> I get Mojibake.
>
> If I instead do:
>
>     % echo ♔
>     % fc
>
> I also get a case of the 'baks.
>
> My locale settings are no different than from when I ran an older zsh:
> LANG=en_US.utf8
> LC_CTYPE="en_US.utf8"
> LC_NUMERIC="en_US.utf8"
> LC_TIME="en_US.utf8"
> LC_COLLATE="en_US.utf8"
> LC_MONETARY="en_US.utf8"
> LC_MESSAGES="en_US.utf8"
> LC_PAPER="en_US.utf8"
> LC_NAME="en_US.utf8"
> LC_ADDRESS="en_US.utf8"
> LC_TELEPHONE="en_US.utf8"
> LC_MEASUREMENT="en_US.utf8"
> LC_IDENTIFICATION="en_US.utf8"
> LC_ALL=en_US.utf8
>
>
> I did talk to one other user on #zsh that is using 5.0.0 and didn't
> experience the problem, and one who did. I have no clue what the
> difference between the odd man out might be.

Not sure where something's not being encoded properly, but I get the
same results here under 4.3.12 patch 1.5346 w/ LANG/LC_* set to
en_US.UTF-8. Also fails on git tag zsh-4.3.10. So, it's not a new
issue, AFAICT.

♔ = \U2654 = UTF-8: 0xe2 0x99 0x94

In my HISTFILE it ends up being encoded as:
0xe2 0x83 0xb9 0x83 0xb4

If I run:

$ echo ♔
$ r

and then arrow-up, it ends up displaying as:

echo <20f9><83>

(which jibes with the HISTFILE encoding, but seems notable inasmuch as
zsh is 'aware' that the characters are different).

In the process of git-bisecting, but I won't be able to finish til
tomorrow sometime (probably).

--
Best,
Ben
---1463810530-1093797140-1347058125=:24430--
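
The byte values quoted in this message can be checked with a short sketch. Python is used only for illustration and is not part of the thread; the helper name `undo_escape` is hypothetical. The first part confirms what the message reports: ♔ is 0xe2 0x99 0x94 in UTF-8, and the first three history-file bytes decode on their own to U+20F9, matching the `<20f9>` seen after arrow-up. The second part is an observation the thread does not state: dropping each 0x83 and XORing the byte after it with 0x20 gives back the original bytes. That pattern resembles zsh's internal Meta-escaped ("metafied") string form, but whether it explains the bug is an assumption here.

```python
# Byte values as reported in the message.
original = "♔".encode("utf-8")                       # U+2654
in_history = bytes([0xE2, 0x83, 0xB9, 0x83, 0xB4])    # as found in the history file

print(" ".join(f"{b:02x}" for b in original))         # e2 99 94

# The first three history bytes happen to be valid UTF-8 by themselves.
first = in_history[:3].decode("utf-8")
print(f"U+{ord(first):04X}")                          # U+20F9

# The remaining bytes (0x83, 0xb4) are lone continuation bytes (10xxxxxx),
# so they cannot be decoded as characters on their own.
leftover = in_history[3:]
print([f"{b:#04x}" for b in leftover if b & 0xC0 == 0x80])


def undo_escape(data: bytes, marker: int = 0x83, mask: int = 0x20) -> bytes:
    """Hypothetical helper: drop each marker byte and XOR the next byte with mask.

    Assumption: the marker/mask values mirror the pattern seen in the reported
    bytes; this is not taken from the thread.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == marker and i + 1 < len(data):
            out.append(data[i + 1] ^ mask)
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


recovered = undo_escape(in_history)
print(recovered == original)                          # True
```

If the assumption holds, the history-file bytes are not corrupt as such; they would be the escaped form being read or displayed without the escaping undone.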