From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <zsh-workers-return-16108-mason-zsh=primenet.com.au@sunsite.dk>
Received: (qmail 13826 invoked from network); 22 Oct 2001 12:02:45 -0000
Received: from sunsite.dk (130.225.247.90)
  by ns1.primenet.com.au with SMTP; 22 Oct 2001 12:02:45 -0000
Received: (qmail 11810 invoked by alias); 22 Oct 2001 12:02:36 -0000
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 16108
Received: (qmail 11790 invoked from network); 22 Oct 2001 12:02:35 -0000
From: Borsenkow Andrej <Andrej.Borsenkow@mow.siemens.ru>
To: "'Clint Adams'" <clint@zsh.org>
Cc: "'Geoff Wing'" <gcw@zsh.org>, "'Zsh Hackers'" <zsh-workers@sunsite.dk>
Subject: RE: multibyte backwarddeletechar
Date: Mon, 22 Oct 2001 16:02:22 +0400
Message-ID: <000f01c15af1$68294d80$21c9ca95@mow.siemens.ru>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook, Build 10.0.3311
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
In-Reply-To: <20011022073231.B31806@dman.com>
Importance: Normal

> 
> > implementation. Using wchar looks portable but the immediate problem is
> > that conventional str* functions stop working. Using UTF-8 is appealing
> 
> Since there are wide equivalents for most str* functions, that's not
> too severe a problem.
>

Mmm ... yes. We also need to deal with quoting. That may keep working as
it does now, either with char constants replaced by wchar_t constants (I
do not know how portable that is) or by using btowc to convert them on
the fly, which assumes the locale is upward compatible with ASCII (but we
silently assume that anyway).
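
To make the btowc variant concrete, here is a minimal sketch. The names
wc_is and wc_is_quote are hypothetical, not existing zsh functions, and
the set of quoting characters is only an illustration. It assumes, as
said above, that the locale is upward compatible with ASCII.

```c
#include <wchar.h>

/* Hypothetical helper: returns 1 if wc is the wide form of the
 * single-byte character c in the current locale, 0 otherwise. */
static int wc_is(wchar_t wc, char c)
{
    /* btowc converts one single-byte character to its wide form,
     * or returns WEOF if the byte is not a complete character. */
    wint_t w = btowc((unsigned char)c);

    return w != WEOF && (wint_t)wc == w;
}

/* Hypothetical example: is wc one of the basic quoting characters?
 * The char constants stay as they are; only the comparison changes. */
static int wc_is_quote(wchar_t wc)
{
    return wc_is(wc, '\'') || wc_is(wc, '"') || wc_is(wc, '\\');
}
```

Existing tests of the form c == '\\' would then become
wc_is(wc, '\\'), so no wide character constants are needed in the
source.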

> I did try once to replace shingetline with something that called
> a shingetwline (using wide equivalents) then ran it through wcstombs()
> to return the char * that was wanted.  It didn't function properly;
> probably something I don't understand about wide characters.

I am not sure I follow. What you actually have to do is

- on input: either get plain characters and convert them using btowc
(that is O.K. as a starting point), or read the multibyte stream with
the mb* functions and convert it with mbtowc (that is needed in the end
to be able to deal with UTF-8 encoding).

- on output: use either wctob or wctomb.

Looks like you did exactly the opposite :-)
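
A rough sketch of the two directions described above, using mbtowc on
input and wctomb on output. decode_line and encode_line are hypothetical
names for illustration only; they are not the shingetline code and make
no claim about how zsh should buffer its input.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <stdlib.h>   /* mbtowc, wctomb */
#include <string.h>   /* strlen, memcpy */
#include <wchar.h>

/* Input direction (hypothetical helper): decode the multibyte string
 * `in` into `out`, which has room for `outlen` wide characters
 * including the terminator. Input that does not fit is silently
 * truncated. Returns the number of wide characters stored, or
 * (size_t)-1 on an invalid sequence. */
static size_t decode_line(const char *in, wchar_t *out, size_t outlen)
{
    size_t n = 0, left = strlen(in);

    if (outlen == 0)
        return (size_t)-1;
    mbtowc(NULL, NULL, 0);              /* reset the shift state */
    while (left > 0 && n + 1 < outlen) {
        int len = mbtowc(&out[n], in, left);
        if (len <= 0)                   /* invalid or incomplete */
            return (size_t)-1;
        in += len;
        left -= (size_t)len;
        n++;
    }
    out[n] = L'\0';
    return n;
}

/* Output direction (hypothetical helper): encode the wide string `in`
 * into the multibyte buffer `out` of `outlen` bytes. Returns the
 * number of bytes stored, or (size_t)-1 if a character cannot be
 * represented or the buffer is too small. */
static size_t encode_line(const wchar_t *in, char *out, size_t outlen)
{
    char buf[MB_LEN_MAX];
    size_t n = 0;

    if (outlen == 0)
        return (size_t)-1;
    wctomb(NULL, 0);                    /* reset the shift state */
    for (; *in != L'\0'; in++) {
        int len = wctomb(buf, *in);
        if (len < 0 || n + (size_t)len >= outlen)
            return (size_t)-1;
        memcpy(out + n, buf, (size_t)len);
        n += (size_t)len;
    }
    out[n] = '\0';
    return n;
}
```

The point is only the direction: multibyte to wide when reading, wide
to multibyte when writing. For UTF-8 input the locale has to be set
with setlocale first.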


-andrej