Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 24925
Date: Mon, 5 May 2008 15:05:27 +0100
From: Peter Stephenson
To: Phil Pennock
Cc: "Zsh Hackers' List"
Subject: Re: zsh 4.3.6 FreeBSD bug
Message-ID: <20080505150527.28261c46@pws-pc>
In-Reply-To: <20080505003858.GA90654@redoubt.spodhuis.org>
References: <20080503073947.GA22661@redoubt.spodhuis.org>
 <20080504131913.5b0b0ca5@pws-pc>
 <20080504193746.1f663c4f@pws-pc>
 <20080505003858.GA90654@redoubt.spodhuis.org>

On Sun, 4 May 2008 17:38:58 -0700
Phil Pennock wrote:
> Can anyone confirm if these problems, outlined below, are widely seen or
> a platform issue?
>
> Note the differences between EURO SIGN first and POUND SIGN first,
> perhaps something to do with character width:
>  €: EURO SIGN [0x20ac]
>  £: POUND SIGN [0xa3]

I haven't seen any problems with either of these two characters on any
of the Fedora versions I've been using.

> Is this the problem which wcwidth() should fix?

No, it looks like something different.  wcwidth() tells the shell the
printing width; it's nothing to do with the number of bytes that make up
the character.  The symptom of wcwidth() failing is that the character
prints OK but the shell doesn't count the characters it's moving over or
deleting properly.

What does the following give for you (with pound sterling followed by
Euro)?

% echo £€ | xxd
0000000: c2a3 e282 ac0a                           ......

0xc2 0xa3 is the pound sign in UTF-8 and 0xe2 0x82 0xac is the Euro
symbol.  If it's the same as what I get, then the conversion to UTF-8 is
correct.

One possibility for what's going wrong is that 0xe2 0x82 0xac is being
reported as an invalid character by mbtowc() and its relatives.  The
shell's line editor would pick this up and show the individual bytes; a
dumb display would simply dump out the bytes and if the terminal is
working they would show up OK.

-- 
Peter Stephenson
Web page now at http://homepage.ntlworld.com/p.w.stephenson/
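
For anyone without xxd to hand, the byte-level check above can be
reproduced with a rough sketch along the following lines.  It is only an
illustration: Python's UTF-8 codec stands in for mbtowc(), the helper
names utf8_hex and decode_or_bytes are made up for this example, and the
<xx> notation for undecodable bytes is an assumption, not what zsh's
line editor actually prints.

```python
# Illustration only: Python's codec stands in for the C library's
# mbtowc(); this is not how zsh itself is implemented.

POUND = "\u00a3"  # POUND SIGN
EURO = "\u20ac"   # EURO SIGN


def utf8_hex(text):
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


def decode_or_bytes(raw):
    """Decode raw as UTF-8; on failure, show the individual bytes instead.

    Hypothetical stand-in for a display that falls back to raw bytes
    when a multibyte sequence is reported as invalid.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return " ".join(f"<{b:02x}>" for b in raw)


if __name__ == "__main__":
    for ch in (POUND, EURO):
        # One character each, but two and three bytes respectively.
        print(f"U+{ord(ch):04X}: {len(ch)} character, bytes {utf8_hex(ch)}")
    # The same bytes the xxd pipeline shows, including the trailing newline.
    print(utf8_hex(POUND + EURO + "\n"))
    # A truncated Euro sequence is invalid and falls back to raw bytes.
    print(decode_or_bytes(b"\xe2\x82"))
```

The byte counts (two for the pound sign, three for the Euro) are a
property of the encoding; the printing width that wcwidth() reports is a
separate matter and is not modelled here.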