From: Peter Stephenson <pws@csr.com>
To: zsh-workers@sunsite.dk
Subject: Re: Silent UTF-8 assumption?
Date: Thu, 10 May 2007 10:46:09 +0100
Message-ID: <200705100946.l4A9k9FI001151@news01.csr.com>
In-Reply-To: <200705101156.19776.arvidjaar@newmail.ru>

Andrey Borzenkov wrote:
> This caught my attention:
> 
> static wchar_t
> charref(char *x, char *y)
> {
>     wchar_t wc;
>     size_t ret;
> 
>     if (!(patglobflags & GF_MULTIBYTE) || !(STOUC(*x) & 0x80))
>         return (wchar_t) STOUC(*x);
> 
> well, this is definitely not valid for arbitrary multibyte character
> set.

We're not using an arbitrary character set; we're using one that has the
portable character set (i.e. ASCII) as a 7-bit subset, and that shares
the property of UTF-8 that any true multibyte sequence has the eighth bit
set in all octets.  That's entirely for the practical reason that, if we
don't make that assumption, all hell will break loose because we'd have to
make *every* part of the shell that ever tests a character, even an
ASCII character, multibyte aware.

There's a good chance the multibyte character set in question is UTF-8,
but it doesn't necessarily have to be.
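
To make the assumption concrete, here is a rough sketch of the idea (not
the actual charref() from the pattern code).  The function name, the
`multibyte' flag and the `consumed' out-parameter are made up for
illustration; only mbrtowc() is the real library call.  Octets without
the eighth bit are taken to be ASCII and never reach the multibyte
decoder; everything else is converted according to the current locale.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: decode one character from s (len >= 1 octets
 * available).  Stores the number of octets used in *consumed.
 */
static wchar_t
decode_char_sketch(const char *s, size_t len, int multibyte, size_t *consumed)
{
    unsigned char first = (unsigned char) *s;
    mbstate_t state;
    wchar_t wc;
    size_t ret;

    /*
     * Fast path: multibyte handling is off, or the eighth bit is clear.
     * Under the stated assumption such an octet is a complete ASCII
     * character, so no conversion is needed.
     */
    if (!multibyte || !(first & 0x80)) {
        *consumed = 1;
        return (wchar_t) first;
    }

    /* Slow path: let the locale's converter handle the sequence. */
    memset(&state, 0, sizeof state);
    ret = mbrtowc(&wc, s, len, &state);
    if (ret == (size_t) -1 || ret == (size_t) -2 || ret == 0) {
        /* Invalid, incomplete or unexpected null: use the raw octet. */
        *consumed = 1;
        return (wchar_t) first;
    }
    *consumed = ret;
    return wc;
}
```

The fast path is what keeps the rest of the shell simple: code that only
ever compares against ASCII characters can go on looking at single
octets.  An encoding in which an octet of a multibyte sequence can fall
in the ASCII range (Shift_JIS trail bytes, for example) would not satisfy
the assumption.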

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070

