Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 20858
Message-Id: <200502231457.j1NEvfBI032390@news01.csr.com>
To: zsh-workers@sunsite.dk
Subject: Re: [PATCH] zle_refresh multibyte fix
In-reply-to: <200502231727.58923.arvidjaar@newmail.ru>
References: <200502231727.58923.arvidjaar@newmail.ru>
Date: Wed, 23 Feb 2005 14:57:41 +0000
From: Peter Stephenson

Andrey Borzenkov wrote:
> The patch allows you to edit multibyte input (do not press TAB it will crash
> zsh).

Hmm, I think Clint had already tried to write it so that it used
multibyte strings.  But whatever works.

> There are some bits missing, and most confusing is {lr,}prompt
> treatment that is still mb and not wc.

I think these could be converted when zleread() starts (and freed at the
end if necessary).

> Actually I find wc stuff very easy and suitable for using as internal
> representation in zsh core. But this is separate topic.

Apart from the inefficiency of extending every byte that comes into the
shell into (typically) a four-byte integer, we can't rely on input and
output bytes being a valid wide character in the current locale at all.
I think the shell has to handle arbitrary strings of bytes without
mutilating them.  Consider, for example:

  # Pass secret byte to my utility
  my_utility $'\xff'

(or any other string you like, the only point being that it isn't a
valid multibyte character string).  I don't see why we should
arbitrarily decide that doesn't work because it doesn't convert to a
wide character.  It will simply break far too many things.

However, in any case this isn't going to change soon.

> This does not use VARARR as is, I can add it in committed patch if deemed
> necessary. Where can I find more info about it?

See system.h; it's fairly simple: type, name, size.

> Please test it without ZLE_UNICODE_SUPPORT.

There's a comma missing between fwrite arguments, and ZS_memset is
incorrectly defined to wmemset in this case.  Otherwise it seems OK
after a quick test.

> I may have got confused by ZLE_CHAR_T vs. ZLE_STRING_T; Peter please get a
> look is usage is right.

Basically, any existing int that holds a character should be ZLE_CHAR_T
(though I'm coming to the view I should have made it wint_t, not
wchar_t, and dropped ZLE_INT_T, since the whole point of using int
instead of a character in the old code was to hold EOF --- or maybe
ZLE_INT_T is the right one to keep).  Any char * or unsigned char * that
refers to an array which is now a wide character should be ZLE_STRING_T.

-- 
Peter Stephenson
Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070