From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 24003 invoked from network); 18 Apr 2008 16:05:25 -0000
X-Spam-Checker-Version: SpamAssassin 3.2.4 (2008-01-01) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.4
Received: from news.dotsrc.org (HELO a.mx.sunsite.dk) (130.225.247.88) by ns1.primenet.com.au with SMTP; 18 Apr 2008 16:05:25 -0000
Received-SPF: none (ns1.primenet.com.au: domain at sunsite.dk does not designate permitted sender hosts)
Received: (qmail 88418 invoked from network); 18 Apr 2008 16:05:21 -0000
Received: from sunsite.dk (130.225.247.90) by a.mx.sunsite.dk with SMTP; 18 Apr 2008 16:05:21 -0000
Received: (qmail 15539 invoked by alias); 18 Apr 2008 16:05:19 -0000
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 24843
Received: (qmail 15523 invoked from network); 18 Apr 2008 16:05:18 -0000
Received: from bifrost.dotsrc.org (130.225.254.106) by sunsite.dk with SMTP; 18 Apr 2008 16:05:18 -0000
Received: from cluster-g.mailcontrol.com (cluster-g.mailcontrol.com [85.115.41.190]) by bifrost.dotsrc.org (Postfix) with ESMTP id E1AE88043AC7 for ; Fri, 18 Apr 2008 18:05:14 +0200 (CEST)
Received: from cameurexb01.EUROPE.ROOT.PRI ([62.189.241.200]) by rly08g.srv.mailcontrol.com (MailControl) with ESMTP id m3IG575t031651 for ; Fri, 18 Apr 2008 17:05:08 +0100
Received: from news01.csr.com ([10.103.143.38]) by cameurexb01.EUROPE.ROOT.PRI with Microsoft SMTPSVC(6.0.3790.3959); Fri, 18 Apr 2008 17:05:06 +0100
Received: from news01.csr.com (localhost.localdomain [127.0.0.1]) by news01.csr.com (8.14.2/8.13.4) with ESMTP id m3IG56lp017782 for ; Fri, 18 Apr 2008 17:05:06 +0100
Received: from csr.com (pws@localhost) by news01.csr.com (8.14.2/8.14.2/Submit) with ESMTP id m3IG56PR017779 for ; Fri, 18 Apr 2008 17:05:06 +0100
Message-Id: <200804181605.m3IG56PR017779@news01.csr.com>
X-Authentication-Warning: news01.csr.com:
 pws owned process doing -bs
To: zsh-workers@sunsite.dk
Subject: Re: PATCH: (large) initial support for combining characters in ZLE.
In-reply-to: <4FB7F05D-1C15-4D51-9B45-28EEF6F10B02@kba.biglobe.ne.jp>
References: <20080413175442.0e95a241@pws-pc>
 <9F0DCF1B-F5FB-4150-A4FF-C441DE615404@kba.biglobe.ne.jp>
 <20080418104016.3cf8d12b@news01>
 <4FB7F05D-1C15-4D51-9B45-28EEF6F10B02@kba.biglobe.ne.jp>
Comments: In-reply-to "Jun T." message dated "Sat, 19 Apr 2008 00:48:50 +0900."
Date: Fri, 18 Apr 2008 17:05:06 +0100
From: Peter Stephenson
X-OriginalArrivalTime: 18 Apr 2008 16:05:06.0215 (UTC) FILETIME=[F7DEF370:01C8A16D]
X-Scanned-By: MailControl A-08-00-04 (www.mailcontrol.com) on 10.71.0.118
X-Virus-Scanned: ClamAV 0.91.2/6827/Fri Apr 18 16:42:03 2008 on bifrost
X-Virus-Status: Clean

"Jun T." wrote:
> On 2008/04/18, at 18:40, Peter Stephenson wrote:
> > so iswgraph() might be the thing.
>
> There are about 15 characters for which "wcwidth() > 0 && ! iswgraph()"
> is true, all of them are a kind of white space (no tab or such).
> I personally think "space + combining-char" is OK and
> just "wcwidth()>0" is enough for defining the base character.
> (wcwidth() is -1 for any control chars including tabs.)

It may well be OK from a standards point of view, but it's more
complicated in the shell, even if we restrict it to a space character
(which would still require an explicit test).  Space is a word delimiter
and without major surgery in the main shell any attached combining
character will be stripped off in lexical analysis; either that, or we
would need to ignore any spaces with combining characters.  This
fundamentally breaks the model that the main shell deals with an ASCII
byte stream with multibyte characters that aren't lexically significant.
Hence I'd prefer to make this clear from the manner of display.

A lexically non-significant (i.e. quoted) space in principle doesn't
need this special handling, but that would mean the line editor needs
lexical information, which I want to avoid: this is one of the things
that makes the completion code so hard to handle.

(There's no general problem with elements of $IFS, by the way: the
initial command line is special and the delimiters have to be
whitespace, presumably to avoid just this sort of problem.)

-- 
Peter Stephenson
Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK
Tel: +44 (0)1223 692070
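
For reference, the two candidate base-character tests discussed above can
be put side by side.  This is only a rough sketch: is_base_char() and
is_base_char_loose() are hypothetical names made up for illustration, not
functions in the zsh source, and the results for non-ASCII characters
depend on the current locale.

```c
#define _XOPEN_SOURCE 700   /* needed on glibc to declare wcwidth() */
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not part of zsh): stricter test.
 * A base character must occupy at least one column and be graphic,
 * so spaces and control characters are rejected.
 */
static int is_base_char(wchar_t wc)
{
    return wcwidth(wc) > 0 && iswgraph((wint_t)wc);
}

/*
 * Hypothetical helper (not part of zsh): looser test.
 * Anything that occupies at least one column counts, which lets a
 * space act as the base of a combining sequence.
 */
static int is_base_char_loose(wchar_t wc)
{
    return wcwidth(wc) > 0;
}
```

The two tests differ only for the handful of whitespace characters with
positive width; the stricter one keeps a space from ever carrying a
combining character, so the lexer never has to see such a sequence.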