9fans - fans of the OS Plan 9 from Bell Labs
From: Rob Pike <robpike@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] plan9 and the Unicode Consortium definitions
Date: Fri, 19 Aug 2005 08:29:39 -0700
Message-ID: <7359f04905081908294942438a@mail.gmail.com>
In-Reply-To: <bcba51a05081907514df66d76@mail.gmail.com>

The addition of surrogates was a serious error of judgement for Unicode,
in my opinion.  As I understand it, the original idea for Unicode was to
have a useful 16-bit subset of the 32-bit ISO 10646 standard, yet now
we see Unicode growing until it no longer fits in 16 bits, in order to
include some politically expedient characters.  This is unwise.  It's a
fact of life, but it's unwise.

As far as Plan 9 is concerned, it shouldn't be too hard to cope with
surrogates but the solution will be to bump Rune to unsigned int from
unsigned short.  It's not worth doing until everything else, for instance
Java, has made a similar jump.  I don't see a lot of pressure to pull
the systems in line with the standard.  Everyone is annoyed.
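
To make the cost of surrogates concrete, here is a minimal sketch in
portable C of what a 32-bit rune type has to absorb: two 16-bit code
units combined into one code point above U+FFFF.  The names Rune32 and
utf16torune are hypothetical, chosen for illustration; this is not
Plan 9 library code.

```c
#include <stdint.h>

typedef uint32_t Rune32;	/* hypothetical wide rune; not the Plan 9 Rune */

enum {
	SurrHiMin = 0xD800,	/* first high (leading) surrogate */
	SurrHiMax = 0xDBFF,	/* last high surrogate */
	SurrLoMin = 0xDC00,	/* first low (trailing) surrogate */
	SurrLoMax = 0xDFFF,	/* last low surrogate */
	Bad = 0xFFFD		/* U+FFFD REPLACEMENT CHARACTER */
};

/*
 * Decode one code point from n 16-bit units at s into *r.
 * Returns the number of units consumed (0 if n <= 0).
 * An unpaired surrogate yields Bad and consumes one unit.
 */
int
utf16torune(Rune32 *r, const uint16_t *s, int n)
{
	uint16_t hi;

	if(n <= 0)
		return 0;
	hi = s[0];
	if(hi < SurrHiMin || hi > SurrLoMax){
		*r = hi;	/* ordinary 16-bit character */
		return 1;
	}
	if(hi <= SurrHiMax && n >= 2 && s[1] >= SurrLoMin && s[1] <= SurrLoMax){
		/* 10 bits from each half, offset by 0x10000 */
		*r = 0x10000 + (((Rune32)hi - SurrHiMin) << 10) + (s[1] - SurrLoMin);
		return 2;
	}
	*r = Bad;
	return 1;
}
```

The pair D834 DD1E, for example, decodes to U+1D11E, which cannot be
held in an unsigned short.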

Plan 9's libraries provide a modest, convenient subset of the standard.
They could use an update to (the non-surrogate part of) Unicode 3.0.
It's not hard; I've seen code to auto-generate the appropriate tables from
the Unicode data set.  I'll see about digging them up.  If you need the
full monstrosity, though, you may need to accept something like ICU.
It's instructive to compare the magnitude of software growth that will
encompass.
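
As a rough illustration of what generating tables from the Unicode
data set involves (a sketch only, not the code mentioned above), the
following C pulls the simple uppercase mapping out of one line of
UnicodeData.txt.  It assumes the standard semicolon-separated layout,
with the code point in field 0 and the simple uppercase mapping in
field 12; getfield and upperpair are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Copy the n-th (0-based) semicolon-separated field of line into buf.
 * Empty fields are allowed.  Returns 0 on success, -1 if the field
 * is missing or does not fit in buf.
 */
int
getfield(const char *line, int n, char *buf, size_t len)
{
	const char *p, *e;

	p = line;
	while(n-- > 0){
		p = strchr(p, ';');
		if(p == NULL)
			return -1;
		p++;
	}
	e = p + strcspn(p, ";\n");
	if((size_t)(e - p) >= len)
		return -1;
	memcpy(buf, p, e - p);
	buf[e - p] = '\0';
	return 0;
}

/*
 * If line carries a simple uppercase mapping, store the code point
 * and its uppercase form and return 1; otherwise return 0.
 */
int
upperpair(const char *line, unsigned long *code, unsigned long *upper)
{
	char c[16], u[16];

	if(getfield(line, 0, c, sizeof c) < 0 || getfield(line, 12, u, sizeof u) < 0)
		return 0;
	if(c[0] == '\0' || u[0] == '\0')
		return 0;
	*code = strtoul(c, NULL, 16);
	*upper = strtoul(u, NULL, 16);
	return 1;
}
```

A generator would run this over every line of the file and print each
pair as an entry in a sorted C array, ready for binary search.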

All that aside, Plan 9 doesn't do some things it should.  For instance,
it should canonicalize all characters to separate out the diacritics and
merge them on display.  At the moment, it doesn't handle diacritics at
all.
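
A minimal sketch of that idea, in portable C: split a precomposed
character into a base letter plus a combining mark, and merge the pair
again when it is time to display.  The five-entry table is a
hand-written excerpt for illustration, where a real one would be
generated from the Unicode data; Rune32, decompose and compose are
hypothetical names, not Plan 9 library functions.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Rune32;	/* hypothetical wide rune type */

typedef struct Decomp Decomp;
struct Decomp {
	Rune32 composed;	/* precomposed character */
	Rune32 base;		/* base letter */
	Rune32 mark;		/* combining diacritic */
};

/* tiny illustrative excerpt; a real table would be generated */
static const Decomp decomptab[] = {
	{0x00E0, 0x0061, 0x0300},	/* a + combining grave */
	{0x00E7, 0x0063, 0x0327},	/* c + combining cedilla */
	{0x00E9, 0x0065, 0x0301},	/* e + combining acute */
	{0x00F1, 0x006E, 0x0303},	/* n + combining tilde */
	{0x00FC, 0x0075, 0x0308},	/* u + combining diaeresis */
};
enum { Ndecomp = sizeof decomptab / sizeof decomptab[0] };

/*
 * Write the canonical form of r to out: base then mark if r is in
 * the table, otherwise r alone.  Returns the number of runes written.
 */
int
decompose(Rune32 r, Rune32 out[2])
{
	size_t i;

	for(i = 0; i < Ndecomp; i++)
		if(decomptab[i].composed == r){
			out[0] = decomptab[i].base;
			out[1] = decomptab[i].mark;
			return 2;
		}
	out[0] = r;
	return 1;
}

/*
 * Merge base and mark for display.  Returns the precomposed
 * character, or 0 if the table has no such pair.
 */
Rune32
compose(Rune32 base, Rune32 mark)
{
	size_t i;

	for(i = 0; i < Ndecomp; i++)
		if(decomptab[i].base == base && decomptab[i].mark == mark)
			return decomptab[i].composed;
	return 0;
}
```

Text would be stored and compared in the decomposed form, so that the
two spellings of the same accented letter are equal; compose would be
called only by the drawing code.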

-rob

