zsh-workers
From: Oliver Kiddle <okiddle@yahoo.co.uk>
To: Peter Stephenson <pws@csr.com>
Cc: zsh-workers@sunsite.dk (Zsh hackers list)
Subject: Re: UTF-8 fonts
Date: Wed, 25 Sep 2002 18:29:35 +0100
Message-ID: <E17uFyh-0007Po-00@bimbo.logica.co.uk>
In-Reply-To: <10303.1032953780@csr.com>

On 25 Sep, Peter Stephenson wrote:
> Borzenkov Andrey wrote:
> > Just to make it clear. Is the aim to use UTF-8 internally or to support
> > (arbitrary) multibyte encoding?
> 
> The first with as much of the second as we can get in without too much

So is your aim to use UTF-8 internally in all cases or only when it is
the selected character set? I would have thought it would be easier to
just use whatever LC_CTYPE (the locale's selected encoding) is
internally and use the mb* functions so things work regardless of
whether or not LC_CTYPE is a multi-byte character encoding. I don't
know much about other multi-byte character encodings that can be used
for the input/output locale but I had gathered they at least have the
level of compatibility with basic ASCII that allows you to use ASCII
characters in string literals. To convert everything to UTF-8
internally, you would have to either use iconv or do messy stuff: the
mb* functions deal with whatever LC_CTYPE is and not UTF-8 (unless
that's what LC_CTYPE happens to be of course).
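To make the iconv route concrete, here is a minimal sketch of what converting a string to UTF-8 would involve. The helper name to_utf8() and its buffer handling are hypothetical and only for illustration; iconv_open(), iconv() and iconv_close() are the standard POSIX calls, but which codeset names they accept varies between systems.

```c
#include <assert.h>
#include <iconv.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `in`, encoded in
 * codeset `from`, to UTF-8 in `out` (`outsize` bytes including the NUL).
 * Returns 0 on success, -1 on any failure (unknown codeset, invalid input,
 * or output buffer too small). */
int to_utf8(const char *from, const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char *inp, *outp;
    size_t inleft, outleft, rc;

    assert(from != NULL && in != NULL && out != NULL && outsize > 0);

    inp = (char *)in;          /* iconv() wants char **, input is not modified */
    inleft = strlen(in);
    outp = out;
    outleft = outsize - 1;     /* keep room for the terminating NUL */

    cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush any shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    *outp = '\0';
    return 0;
}
```

In the shell, the `from` argument would presumably come from nl_langinfo(CODESET) after setlocale(LC_CTYPE, ""), and every string crossing the input/output boundary would need this round trip, which is the extra work the mb* approach avoids.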

> We are going to assume that bytes without the top-bit set are ASCII, and
> the remainder require mb* handling.

Isn't it easier to just do mb* handling on everything and not go around
checking the top bit? The mb*() functions should do that sort of thing
for us. For example, mbrtowc() can be used to consume one character of
a string, discarding the returned wchar_t. It then takes care of the
top bit of each byte, or whatever else the underlying multi-byte
character encoding requires.
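A rough sketch of that use of mbrtowc(): passing NULL as the first argument discards the wide character, and the return value is the number of bytes the character occupied. The helper names first_char_len() and count_chars() are made up for this example and are not existing zsh functions.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes in the first character of `s` under
 * the current LC_CTYPE. Returns 0 at end of string and (size_t)-1 if the
 * bytes are invalid or incomplete in that encoding. */
size_t first_char_len(const char *s, mbstate_t *st)
{
    size_t n;

    assert(s != NULL && st != NULL);
    if (*s == '\0')
        return 0;

    /* NULL as the first argument: we do not want the wchar_t itself */
    n = mbrtowc(NULL, s, strlen(s), st);
    if (n == (size_t)-1 || n == (size_t)-2)
        return (size_t)-1;
    return n;
}

/* Count characters (not bytes) in `s`, one mbrtowc() call per character.
 * No test of the top bit appears anywhere. */
size_t count_chars(const char *s)
{
    mbstate_t st;
    size_t count = 0, n;

    memset(&st, 0, sizeof st);   /* initial conversion state */
    while ((n = first_char_len(s, &st)) != 0) {
        if (n == (size_t)-1)
            return (size_t)-1;
        s += n;
        count++;
    }
    return count;
}
```

The caller is expected to have called setlocale(LC_CTYPE, "") beforehand; without that the program stays in the "C" locale, where only plain ASCII is guaranteed to work.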

> > Impossible. Local names are just arbitrary chosen strings; there is no
> > "character set code" defined in any locale definition, at least on Unix.

As has been mentioned: nl_langinfo(CODESET)
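For reference, a minimal sketch of querying the codeset this way. The function names current_codeset() and codeset_is_utf8() are invented for the example; the spelling of the returned name is implementation-defined ("UTF-8" on glibc, and I am assuming "utf8" as an alternative seen on some other systems), so the comparison is only illustrative.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: name of the character encoding selected by the
 * environment (LC_ALL, LC_CTYPE or LANG). */
const char *current_codeset(void)
{
    const char *cs;

    setlocale(LC_CTYPE, "");     /* adopt the user's locale settings */
    cs = nl_langinfo(CODESET);
    assert(cs != NULL);
    return cs;
}

/* Hypothetical helper: 1 if the selected codeset looks like UTF-8, else 0.
 * The accepted spellings are an assumption, not an exhaustive list. */
int codeset_is_utf8(void)
{
    const char *cs = current_codeset();

    return strcmp(cs, "UTF-8") == 0 || strcmp(cs, "utf8") == 0;
}
```

If the mb* functions are used throughout, the shell should rarely need this check at all; it matters mostly if something like iconv has to be told the source encoding.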

> Read the document at the link I gave which suggests otherwise.  However,
> I now think we can in any case leave this to the mb* suite to decide.

Yes, I think we can.

I'm sure you can all use google, but other possibly useful links I had
in my bookmarks are these:

  IBM's patches to various GNU stuff:
    https://www-124.ibm.com/developer/opensource/linux/patches/i18n/
  IBM article that serves as a basic intro:
    http://www-106.ibm.com/developerworks/library/l-linuni.html
  Unicode HOWTO:
    http://www.tldp.org/HOWTO/Unicode-HOWTO-6.html

Oliver


