From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 24208 invoked by alias); 13 May 2015 18:29:57 -0000
Mailing-List: contact zsh-users-help@zsh.org; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Id: Zsh Users List
List-Post:
List-Help:
X-Seq: 20207
Received: (qmail 21906 invoked from network); 13 May 2015 18:29:55 -0000
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_PASS
	autolearn=ham version=3.3.2
Date: Wed, 13 May 2015 11:29:42 -0700
From: Danek Duvall
To: Bart Schaefer
Cc: zsh-users@zsh.org
Subject: Re: zsh doesn't understand some multibyte characters
Message-ID: <20150513182942.GB4834@lorien.comfychair.org>
Mail-Followup-To: Danek Duvall, Bart Schaefer, zsh-users@zsh.org
References: <20150513161411.GA4834@lorien.comfychair.org>
	<150513104350.ZM28203@torch.brasslantern.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <150513104350.ZM28203@torch.brasslantern.com>
User-Agent: Mutt/1.5.20 (2010-04-22)

On Wed, May 13, 2015 at 10:43:50AM -0700, Bart Schaefer wrote:

> On May 13, 9:14am, Danek Duvall wrote:
> } Subject: zsh doesn't understand some multibyte characters
> }
> } Perhaps this is just on Solaris, I dunno. But for some multibyte
> } characters [...] if I move the cursor back over them or delete back
> } over them, zsh gets confused and moves two positions instead of one
> }
> } I'll note that the same thing happens with all the other shells on
> } Solaris [... ] Where else should I be looking for the problem?
>
> This sounds like the WCWIDTH() macro or function is returning the wrong
> value for some characters.

It does.

> If you are compiling your own zsh, can you (a) check whether config.h
> defines BROKEN_WCWIDTH, and (b) if it does not, try defining it and
> recompile to see if that makes any difference?

Not on its own; Solaris doesn't appear to define __STDC_ISO_10646__. But
if I #define that to 1 (because nothing in zsh uses its value), then it
does work.

If I set

    comb_acute_mb[] = { (char)0xe2, (char)0x80, (char)0xa6 };

in the test, it thinks that character's wcwidth() is 2, not 1. Perhaps
that should be a part of the test as well? I don't know why the
zero-width combining character was chosen as the test.

I'm less sure what to do about __STDC_ISO_10646__. I see that most of the
places it's checked you're also checking for __APPLE__, but not all of
them (and I'm not sure why that would be). I can talk to our
globalization folks who might know why this isn't defined, or what it
should be set to, or whatever, and file a bug if necessary.

I guess until we figure that out, I can just have our zsh build define it
on the commandline (assuming that you don't want to hold 5.0.8 for this,
and I wouldn't want you to).

Thanks,
Danek