Subject: Re: zsh doesn't understand some multibyte characters
From: "Jun T."
Date: Fri, 15 May 2015 01:43:45 +0900
To: zsh-users@zsh.org
In-Reply-To: <20150513182942.GB4834@lorien.comfychair.org>
References: <20150513161411.GA4834@lorien.comfychair.org> <150513104350.ZM28203@torch.brasslantern.com> <20150513182942.GB4834@lorien.comfychair.org>

2015/05/14 03:29, Danek Duvall wrote
>
> If I set
>
>     comb_acute_mb[] = { (char)0xe2, (char)0x80, (char)0xa6 };
>
> in the test, it thinks that character's wcwidth() is 2, not 1.

U+2026 is one of the characters whose "East Asian Width" property is
set to "Ambiguous". The widths of these characters are *really*
ambiguous: in western (monospaced) fonts they are single width, while
in (most of?) CJK fonts they are double width. Usually wcwidth()
returns 1 for these characters, so they are not displayed correctly
in CJK fonts unless applications take special care of them. For
example, xterm has an option -cjk_width to handle this problem.

Your report indicates that Solaris is one of the rare systems on which
wcwidth() returns 2 for U+2026. Are there any fonts in which U+2026
is double width on Solaris?

> I don't know why the zero-width
> combining character was chosen as the test.

The test was first introduced to detect a broken wcwidth() on Mac OS X, where wcwidth() returns 1 for combining characters.
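
To see what a given system reports, something like the following
could be used. It is only a rough sketch, not the actual zsh configure
test: the helper names (mb_width, try_utf8_locale, report_widths) and
the locale names "C.UTF-8" and "en_US.UTF-8" are my own assumptions,
and I am assuming that the combining character in question is U+0301
(bytes 0xcc 0x81), going by the variable name comb_acute_mb.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700       /* wcwidth() is an XSI function */
#endif
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Decode exactly one multibyte character from bytes[0..len-1] in the
 * current locale and return wcwidth() of it.  Returns -2 if the bytes
 * do not decode to a single character of that length (wcwidth()
 * itself may return -1 for non-printable characters). */
static int mb_width(const char *bytes, size_t len)
{
    wchar_t wc;
    int n;

    assert(bytes != NULL);
    (void)mbtowc(NULL, NULL, 0);        /* reset conversion state */
    n = mbtowc(&wc, bytes, len);
    if (n <= 0 || (size_t)n != len)
        return -2;
    return wcwidth(wc);
}

/* Try to switch LC_CTYPE to some UTF-8 locale.  The candidate names
 * are assumptions; they differ from system to system.
 * Returns 1 on success, 0 if no UTF-8 locale could be selected. */
static int try_utf8_locale(void)
{
    static const char *const names[] = { "", "C.UTF-8", "en_US.UTF-8" };
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (setlocale(LC_CTYPE, names[i]) != NULL &&
            strcmp(nl_langinfo(CODESET), "UTF-8") == 0)
            return 1;
    }
    return 0;
}

/* Print the widths reported for the two characters discussed above. */
static void report_widths(void)
{
    printf("U+0301 (combining acute): %d\n", mb_width("\xcc\x81", 2));
    printf("U+2026 (ellipsis):        %d\n", mb_width("\xe2\x80\xa6", 3));
}
```

Calling try_utf8_locale() and then report_widths() from main() prints
the two widths. Going by the reports in this thread, I would expect
0 and 1 on most systems, 1 for U+0301 where wcwidth() is broken in the
way described above, and 2 for U+2026 on Solaris, but I have not
checked this on every platform.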