From: Peter Stephenson
To: zsh-workers@sunsite.dk (Zsh hackers list)
Subject: Re: PATCH: assume "enhanced goodness" when --multibyte-enable
Date: Thu, 15 Dec 2005 12:09:07 +0000
In-reply-to: <20051215115203.38783.qmail@web25223.mail.ukl.yahoo.com>
References: <20051215115203.38783.qmail@web25223.mail.ukl.yahoo.com>
X-Seq: 22084

Oliver Kiddle wrote:
> Peter wrote:
> > In utils.c we don't enable the full multibyte code for converting
> > characters unless __STDC_ISO_10646__ is turned on.  However, everywhere
> > in zle we simply trust that if --multibyte-enable is turned on
> > everything just works.  That includes wctomb(), which is all we need
> > for character conversion.
>
> This doesn't make sense to me. With MULTIBYTE_SUPPORT enabled are you
> just assuming that wchar_t is UCS-4 everywhere?

ish...

> I don't understand how that'll work if you have a system which has
> perfectly good multibyte support but uses some other encoding for
> wchar_t.

It might well not, but up to now I've been assuming we need to know how
to convert it.  --enable-multibyte just says "go ahead and assume this
works".  Unless we can probe for what to do with a wchar_t I've been
assuming we're kind of stuck.

However, the assumptions we rely on are a bit different in the code for
converting Unicode characters and in the rest of zle, so quite likely
they shouldn't be tied...

In converting \U/\u sequences, as you say, we really need fully paid up
UCS-4.

In the rest of zle, we need wchar_t to be an integer which overlaps
with ASCII in positions 0 to 127, and we only need that in some places.
(A lot of the time we can work on the pre-converted multibyte string,
since that *must* have ASCII as a subset, and it's probably possible to
do that everywhere by additional conversions.)  I don't think it
necessarily has to be exactly UCS-4 and most of the time it probably
works if it isn't.

So maybe the change is wrong.

-- 
Peter Stephenson
Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK
Tel: +44 (0)1223 692070
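
To make the two different assumptions above concrete, here is a minimal
sketch in C.  It is not taken from the zsh source: the function names
wchar_is_iso10646() and wchar_overlaps_ascii() are hypothetical, and the
run-time check only looks at the current locale, so it is an
illustration of the kind of probe being discussed, not a proposed test.

```c
#include <wchar.h>

/*
 * Compile-time guarantee: returns 1 if the implementation defines
 * __STDC_ISO_10646__, i.e. promises that wchar_t values are
 * ISO 10646 (Unicode) code points.  This is the stronger property
 * needed for converting \U/\u sequences.
 */
int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/*
 * Run-time probe (hypothetical helper): returns 1 if, in the current
 * locale, every byte 0..127 converts to a wide character with the
 * same numeric value.  This is the weaker "overlaps with ASCII"
 * property described for the rest of zle.
 */
int wchar_overlaps_ascii(void)
{
    int c;

    for (c = 0; c < 128; c++) {
        /* btowc() returns WEOF if c is not a valid single-byte character */
        if (btowc(c) != (wint_t)c)
            return 0;
    }
    return 1;
}
```

A caller that wanted to probe the user's locale rather than the default
"C" locale would presumably call setlocale(LC_CTYPE, "") first.  A
system can pass the second check without passing the first, which is
the case where the two conditions arguably shouldn't be tied together.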