Date: Fri, 14 Jan 2005 16:54:11 +0100
From: François-Xavier Coudert <Francois-Xavier.Coudert@ens.fr>
To: zsh-workers@sunsite.dk
Subject: Re: Some groundwork for Unicode in Zle
Message-ID: <20050114155411.GA23624@clipper.ens.fr>

Hi all,

I'm new to the list, but I'm interested in UTF-8 support in Zle. My
question is the following: have you considered continuing to store
strings such as the edited line in arrays of char (not wide chars),
while using a few functions to handle the fact that one Unicode
character may be represented by several chars (and one glyph by several
Unicode characters, though I'm not sure how that can be handled)?
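
To make the idea concrete, here is a minimal sketch of what such
home-made helper functions might look like. The names utf8_next,
utf8_prev and utf8_delete_at are hypothetical (they are neither zsh nor
glib functions), and the sketch assumes the buffer holds valid UTF-8 and
that the cursor always sits on a character boundary. It does nothing
about glyphs made of several Unicode characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, not part of zsh or glib. They assume `line`
 * holds valid UTF-8 and that `cs` is on a character boundary. */

/* True for UTF-8 continuation bytes (bit pattern 10xxxxxx). */
static int utf8_is_cont(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Byte offset of the character after the one at `cs` (at most `ll`). */
static size_t utf8_next(const unsigned char *line, size_t ll, size_t cs)
{
    if (cs >= ll)
        return ll;
    cs++;
    while (cs < ll && utf8_is_cont(line[cs]))
        cs++;
    return cs;
}

/* Byte offset of the character before `cs` (never below 0). */
static size_t utf8_prev(const unsigned char *line, size_t cs)
{
    if (cs == 0)
        return 0;
    cs--;
    while (cs > 0 && utf8_is_cont(line[cs]))
        cs--;
    return cs;
}

/* Delete the whole character at `cs`; returns the new length in bytes. */
static size_t utf8_delete_at(unsigned char *line, size_t ll, size_t cs)
{
    size_t next;

    assert(cs <= ll);
    next = utf8_next(line, ll, cs);
    /* Shift the tail left over the bytes of the deleted character. */
    memmove(line + cs, line + next, ll - next);
    return ll - (next - cs);
}
```

With helpers of this kind the cursor stays a byte offset into the char
array, and only the code that moves or deletes by character needs to
know that a character may span several bytes.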

Using a few of the functions glib exports for Unicode (zsh could use
home-made functions if need be), I hacked some internals of Zle (the
result is nowhere near pretty) in the following way:

diff -r zsh-4.2.3/Src/Zle/zle_misc.c zsh-fx/Src/Zle/zle_misc.c
29a30
> #include <glib.h>
97,98c98,99
<       cs += zmult;
<       backdel(zmult);
---
>       cs = (char *) (g_utf8_next_char (line + cs)) - (char *)line;
>       backdel(((char *) line + cs) - (char *)g_utf8_prev_char (line +
>       cs));
114a116,119
>     if (zmult > cs)
>       backdel (cs);
>     else
>       backdel(((char *) line + cs) - (char *)g_utf8_prev_char (line +
>       cs) - 1);

diff -r zsh-4.2.3/Src/Zle/zle_move.c zsh-fx/Src/Zle/zle_move.c
29c29
< 
---
> #include "glib.h"
162c162,167
<     cs += zmult;
---
>     cs = (char *) (g_utf8_next_char (line + cs)) - (char *)line;
174c179
<     cs -= zmult;
---
>     cs = (char *) (g_utf8_prev_char (line + cs)) - (char *)line;

diff -r zsh-4.2.3/Src/Zle/zle_utils.c zsh-fx/Src/Zle/zle_utils.c
29a30
> #include <glib.h>
94a96,97
>     int next, i;
>     
101,102c104,107
<       line[to] = line[to + cnt];
<       to++;
---
>         next = (char *) (g_utf8_next_char (line + cnt)) - (char *)line
>         - cnt;
>       for (i = to; i < to + next; i++)
>         line[i] = line[i + cnt];
>       to += next;

With this, one can correctly move over and delete (forward and backward)
Unicode characters with ease. Such modifications seem easy to generalize.
So the points I'd like to get your thoughts on are:

  1. is such an approach useful?
  2. what are the arguments against it? (it may need a wider rewrite of
some builtins than other approaches)

Thanks for your attention, and I hope I will be able to help make zsh
much more viable on UTF-8 systems!

FX