From: François-Xavier Coudert
Date: Fri, 14 Jan 2005 16:54:11 +0100
To: zsh-workers@sunsite.dk
Subject: Re: Some groundwork for Unicode in Zle
Message-ID: <20050114155411.GA23624@clipper.ens.fr>

Hi all,

I'm new to the list but I'm interested in UTF-8 inclusion into Zle.
My question is the following: have you considered continuing to store
strings such as the edited line in arrays of char (rather than wide
chars), and using a few functions to handle the fact that one Unicode
character may be represented by several chars? (One glyph may also be
made of several Unicode characters, but I'm not sure how that can be
handled.)

Using a few of the functions glib exports for Unicode (zsh could use
home-made functions if need be), I hacked some internals of Zle in the
following way (it is nowhere near pretty):

diff -r zsh-4.2.3/Src/Zle/zle_misc.c zsh-fx/Src/Zle/zle_misc.c
29a30
> #include
97,98c98,99
< cs += zmult;
< backdel(zmult);
---
> cs = (char *) (g_utf8_next_char (line + cs)) - (char *)line;
> backdel(((char *) line + cs) - (char *)g_utf8_prev_char (line + cs));
114a116,119
> if (zmult > cs)
>     backdel (cs);
> else
>     backdel(((char *) line + cs) - (char *)g_utf8_prev_char (line + cs) - 1);
diff -r zsh-4.2.3/Src/Zle/zle_move.c zsh-fx/Src/Zle/zle_move.c
29c29
<
---
> #include "glib.h"
162c162,167
< cs += zmult;
---
> cs = (char *) (g_utf8_next_char (line + cs)) - (char *)line;
174c179
< cs -= zmult;
---
> cs = (char *) (g_utf8_prev_char (line + cs)) - (char *)line;
diff -r zsh-4.2.3/Src/Zle/zle_utils.c zsh-fx/Src/Zle/zle_utils.c
29a30
> #include
94a96,97
> int next, i;
>
101,102c104,107
< line[to] = line[to + cnt];
< to++;
---
> next = (char *) (g_utf8_next_char (line + cnt)) - (char *)line - cnt;
> for (i = to; i < to + next; i++)
>     line[i] = line[i + cnt];
> to += next;

With this, one can correctly move around and delete (forward and
backward) Unicode characters with ease. Such modifications seem easy to
generalize.

So the points I'd like to get your thoughts on are:

  1. Is such an approach useful?
  2. What are the arguments against it? (It may need a wider rewrite of
     some builtins than other approaches.)

Thanks for your attention, and I hope I will be able to help make zsh
much more usable on UTF-8 systems!

FX
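
To make the "home-made functions" option more concrete, here is a minimal sketch of how the cursor could be stepped over a char array without glib. It is only an illustration: utf8_is_cont, utf8_next and utf8_prev are hypothetical names, not zsh or glib functions, and the sketch assumes the buffer holds valid UTF-8 and that the cursor always sits on a character boundary. It does nothing about glyphs built from several Unicode characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: true if byte is a UTF-8 continuation byte,
 * i.e. has the bit pattern 10xxxxxx. */
static int utf8_is_cont(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

/* Hypothetical helper: byte offset of the character following the one
 * that starts at pos. Returns len when already at the end. */
static size_t utf8_next(const char *buf, size_t len, size_t pos)
{
    assert(pos <= len);
    if (pos < len)
        pos++;                      /* step over the lead byte */
    while (pos < len && utf8_is_cont((unsigned char)buf[pos]))
        pos++;                      /* skip continuation bytes */
    return pos;
}

/* Hypothetical helper: byte offset of the character preceding pos.
 * Returns 0 when already at the start. */
static size_t utf8_prev(const char *buf, size_t len, size_t pos)
{
    assert(pos <= len);
    if (pos > 0)
        pos--;                      /* step onto the previous byte */
    while (pos > 0 && utf8_is_cont((unsigned char)buf[pos]))
        pos--;                      /* back up to the lead byte */
    return pos;
}
```

With helpers of this kind, moving forward one character would be something like cs = utf8_next(line, len, cs), where len is the current line length in bytes, and the number of bytes to delete backward would be cs - utf8_prev(line, len, cs).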