zsh-users
From: Oliver Kiddle <okiddle@yahoo.co.uk>
To: Olivier Verdier <Olivier.Verdier@mas.ecp.fr>
Cc: zsh-users@sunsite.dk
Subject: Re: UTF-8
Date: Tue, 18 Dec 2001 16:51:17 +0000
Message-ID: <3C1F7405.A5957CC2@yahoo.co.uk>
In-Reply-To: <BC9BC140-F1A5-11D5-BA73-000393164560@mas.ecp.fr>

Olivier Verdier wrote:
> 
> I'm using Darwin and Mac OS X 10.1 together with zsh (zsh --version =
> zsh 4.0.4 (powerpc-apple-darwin1.4)), and I can't figure out how to make
> it work properly with UTF-8 encoding. All file names are indeed encoded
> in UTF-8 on macintosh hard-disk (HFS+ format). I use a terminal which is
> UTF-8 aware (apple Terminal.app). It works perfectly with
> UTF-8-configured 'less' and 'vim' commands.
> 
> Some examples of misbehaviors:
> 1) a 'ls' command for "Téléchargement" gives "Te??e??hargement"

The output of the ls command doesn't pass through zsh at all; it goes
straight to the terminal. So in this case it is either ls or the
Apple terminal that is failing to handle UTF-8.

>         *but* 'ls | less' gives "Téléchargement" if less is configured for
> UTF-8
>         so the output of 'ls' is correct, but is misinterpreted by the shell

That seems a little strange. I would suspect that the terminal is expecting
something like ISO-8859-1 and less is converting to that from UTF-8. Try
a less common character, one that ISO-8859-1 cannot represent, and see
what happens then.
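
To illustrate why a less common character makes a better test, here is a
rough Python sketch (Python is used only for illustration; the sample
characters and the helper name to_latin1 are assumptions, not part of the
original report). An "é" exists in ISO-8859-1, so a UTF-8 to ISO-8859-1
conversion succeeds quietly, whereas a character such as the euro sign has
no ISO-8859-1 encoding and the conversion cannot succeed.

```python
# Sketch: why a character outside ISO-8859-1 is a more telling test than "é".
# The helper name and the sample characters are illustrative assumptions.

def to_latin1(utf8_bytes):
    """Decode UTF-8 input and re-encode it as ISO-8859-1.

    Returns None when a character has no ISO-8859-1 representation.
    """
    text = utf8_bytes.decode("utf-8")
    try:
        return text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return None

e_acute = "\u00e9".encode("utf-8")  # precomposed é: two bytes in UTF-8
euro = "\u20ac".encode("utf-8")     # euro sign: three bytes, not in ISO-8859-1

print(e_acute.hex())             # c3a9
print(to_latin1(e_acute).hex())  # e9
print(to_latin1(euro))           # None
```

If the terminal still shows the right glyph for the second kind of
character, a conversion to ISO-8859-1 is unlikely to be the explanation.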

> 2) completion doesn't work; if 'Télé' is on the directory, Té[tab] gives
> nothing, but 'cd Télé' works...
>         *moreover* 'cd Té' writes 'cd T@' on screen, but 'cd Té[tab]' turns
> itself into 'cd Té'
> 
> 3) 'cd Télé' together with the option 'printeightbit' prints correctly
> the pwd; mkdir Télé works as expected.

I'm not quite sure why completion doesn't work there. It doesn't help
that I don't have a UTF-8 aware terminal to experiment with.

Unfortunately, zsh was never built to handle UTF-8 correctly. For many
things it would be transparent because of the way UTF-8 is designed.
Commands like echo and cd I would expect to work. In some areas though,
it won't work. For example, if you assign a UTF-8 string to a variable
and use $#var to get its length, it will report the wrong length
because it counts bytes, so each two-byte character counts as two.
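
As a small illustration of both points (a Python sketch rather than zsh
code; the string uses precomposed "é" and the path is hypothetical), the
byte length and the character length of a UTF-8 string differ, while
byte-level operations on ASCII delimiters such as "/" stay safe because
ASCII bytes never occur inside a multibyte UTF-8 sequence.

```python
# Sketch: byte count versus character count for a UTF-8 string.
# Assumption: "é" is stored precomposed (U+00E9), i.e. two bytes in UTF-8.
name = "T\u00e9l\u00e9"      # "Télé"
raw = name.encode("utf-8")

byte_len = len(raw)          # what a byte-oriented length would report
char_len = len(name)         # what the user expects

# Characters can be counted from the bytes alone by skipping
# continuation bytes, which always look like 10xxxxxx.
lead_count = sum(1 for b in raw if (b & 0xC0) != 0x80)

# Splitting on an ASCII byte cannot cut a multibyte character in half.
# The path below is a hypothetical example.
path = "/Users/olivier/T\u00e9l\u00e9".encode("utf-8")
parts = path.split(b"/")

print(byte_len, char_len, lead_count)  # 6 4 4
print(parts[-1] == raw)                # True
```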

Fixing this would be quite a big job because it would affect virtually
all the code, and it would need some initial thought to work out where
to use wide characters, where to use UTF-8 and where to do conversions
for input and output.
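
One common way to arrange such conversions, shown here only as a hedged
Python sketch and not as a description of how zsh is or would be
implemented, is to decode at the input boundary, work on characters
internally, and encode again at the output boundary. The function names
read_input, complete and write_output are hypothetical.

```python
# Sketch: convert at the edges, work in characters in between.
# All function names here are hypothetical, chosen for illustration.

def read_input(raw: bytes) -> str:
    """Input boundary: turn UTF-8 bytes from the terminal into characters."""
    return raw.decode("utf-8")

def complete(prefix: str, candidates: list) -> list:
    """Internal logic: compare whole characters, never partial byte sequences."""
    return [c for c in candidates if c.startswith(prefix)]

def write_output(text: str) -> bytes:
    """Output boundary: turn characters back into UTF-8 bytes."""
    return text.encode("utf-8")

typed = read_input(b"T\xc3\xa9")                    # the user typed "Té"
matches = complete(typed, ["T\u00e9l\u00e9", "Temp"])
echoed = write_output(matches[0])

print(matches)  # ['Télé']
```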

For future reference, send any zsh questions to zsh-users@sunsite.dk or
zsh-workers@sunsite.dk. The address you used just goes to the people who
maintain the web pages.

Oliver Kiddle