zsh-workers
From: "David Gómez" <david@dervishd.net>
To: Oliver Kiddle <okiddle@yahoo.co.uk>
Cc: "David Gómez" <david@dervishd.net>, Zsh-workers <zsh-workers@sunsite.dk>
Subject: Re: UTF-8 support
Date: Mon, 4 Oct 2004 18:08:58 +0200
Message-ID: <20041004160858.GA4510@fargo>
In-Reply-To: <23473.1096659965@trentino.logica.co.uk>

Hi Oliver ;),

> > So i conclude from your response that nobody is working on it ;).
> > I understand the time problem, everybody is short on time, including
> 
> Nothing has been done. A few people may have done some work that was
> never posted.

Is it still possible to find that work ;)?

> I got as far reading up, thinking about what the right
> approach would be and adding support for stuff like the following to
> print characters given their unicode code point:
>   echo '\u20ac'
> It seemed a good point to start because it'll be useful for testing.

Yes, being able to give Unicode code points as input to echo is useful
for testing. I've used it for some testing myself ;)

> Most parts of the source will need work but it is possible to add
> support in individual areas. So don't start with completion, find
> something simple like the print builtin (in particular -c and -C
> options).

I see: to split its arguments into columns, the print builtin needs to
know the real display width of each one when the input is UTF-8.

> Builtins in general are simple because they are relatively
> self-contained. If you try to attack zle first, you'll just get fed up
> with it being too hard.

I think you're totally right. zle is too hard for a start, given that I
have no experience with the zsh source. I'll take a look at the print
builtin and play a bit with the zsh code to learn more.

> done, another idea for something simple would be to add a Test/U01 test
> and add code to make it search for a UTF-8 locale ($langinfo[CODESET] in
> the langinfo module will help) 

Good, I didn't know about that module ;)

> The source and comments are the only documentation I know of but you can
> always ask on the list.

Thanks, I will ;)

> Do you know much about unicode/UTF-8? For the
> minimum, read http://www.joelonsoftware.com/articles/Unicode.html
> and then read http://www.cl.cam.ac.uk/~mgk25/unicode.html

I knew a bit. But I've been reading your links these days and have
refreshed my rusty UTF-8 knowledge ;).

> In my opinion it would be sensible to support multibyte encodings in
> general and not just UTF-8.

I think the point of using UTF-8 is not having to use any other
encodings at all, so in my opinion adding support for other multibyte
encodings wouldn't be needed. But, on the other hand, using the mbs*
functions from libc would make it easy to support whatever multibyte
encoding the current locale has selected.

> stateful encodings. There are a few characters which are defined to
> display as double width even in proportional fonts so keep that in mind.

In which scripts do these characters occur?

> You can detect whether UTF-8 is enabled with the C library's locale
> functions but we shouldn't need to: functions such as mbrlen do all the
> work for us.

Shouldn't mbrlen and company only be used when a UTF-8 locale is selected?

Thanks,

-- 
David Gómez

"The question of whether computers can think is just like the question of
whether submarines can swim." -- Edsger W. Dijkstra



Thread overview: 10+ messages
2004-09-30  8:29 David Gómez
2004-09-30  9:24 ` Peter Stephenson
2004-10-01 18:41 ` David Gómez
2004-10-01 19:46   ` Oliver Kiddle
2004-10-04 16:08     ` David Gómez [this message]
2004-10-04 16:15       ` Clint Adams
2004-10-05 11:13       ` Oliver Kiddle
2004-10-04 16:20     ` Peter Stephenson
2004-10-05 11:01       ` Oliver Kiddle
2004-10-05 11:32         ` Peter Stephenson
