From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <1485.63.165.50.175.1090270909.squirrel@wish.cooper.edu>
In-Reply-To: <7359f049040718120571c93b25@mail.gmail.com>
References: <6e35c06204071810312daa31a9@mail.gmail.com><000701c46cf6$814c4370$92ec7d50@SOMA> <7359f049040718120571c93b25@mail.gmail.com>
Date: Mon, 19 Jul 2004 17:01:49 -0400
Subject: Re: [9fans] UTF-8 criticism?
From: "Joel Salomon"
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@cse.psu.edu>
User-Agent: SquirrelMail/1.4.2
MIME-Version: 1.0
Content-Type: text/plain;charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Topicbox-Message-UUID: c320a8d4-eacd-11e9-9e20-41e7f4b1d025

>> that's not the real problem. it's implementing the collation
>> sequences. the internal representation as 16 bit unsigneds
>> is not a problem.
>
> actually it is, now that surrogates are well-established.
>
> -rob
>
Would moving to 32 bit signed (and only 0 -- 2^21 allowed, plus -1 for EOF)
as in the more recent revisions of Unicode take care of the surrogates
problem?

--Joel

p.s.
>>> surrogates?
>>
>> Yeah, that is what happens when people settle on
>> too small a code space.
>
> The Klingon Language Institute probably agrees.
Far out man!

> And I want native tengwar support...
Go Geeks!!
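
To make the question above concrete, here is a minimal sketch in C of a
32-bit signed character type with -1 reserved for EOF, plus the one place
surrogates would still show up: converting from 16-bit data. The names
Rune32, RuneEOF, RuneBad, validrune and fromutf16 are hypothetical and are
not Plan 9's actual library interface. The upper bound used is 0x10FFFF,
the top of the Unicode code space, which fits in 21 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wide character type: signed, so -1 can stand for EOF. */
typedef int32_t Rune32;

enum {
	RuneEOF = -1,        /* out-of-band end-of-file marker */
	RuneMax = 0x10FFFF,  /* largest Unicode code point */
	RuneBad = 0xFFFD     /* replacement character for malformed input */
};

/* Return 1 if r is a legal character value: in range and not a surrogate. */
int
validrune(Rune32 r)
{
	return r >= 0 && r <= RuneMax && !(r >= 0xD800 && r <= 0xDFFF);
}

/*
 * Decode one code point from the UTF-16 code units s[0..n).
 * Stores the number of units consumed in *used.
 * Returns RuneEOF on empty input and RuneBad for an unpaired surrogate.
 */
Rune32
fromutf16(const uint16_t *s, size_t n, size_t *used)
{
	if(n == 0){
		*used = 0;
		return RuneEOF;
	}
	*used = 1;
	if(s[0] < 0xD800 || s[0] > 0xDFFF)
		return s[0];	/* ordinary 16-bit character */
	if(s[0] <= 0xDBFF && n >= 2 && s[1] >= 0xDC00 && s[1] <= 0xDFFF){
		/* high surrogate followed by low surrogate: combine the pair */
		*used = 2;
		return 0x10000 + (((Rune32)s[0] - 0xD800) << 10) + (s[1] - 0xDC00);
	}
	return RuneBad;		/* lone or misordered surrogate */
}
```

Under these assumptions, surrogate pairs only need handling where 16-bit
data enters; everything past that point sees one value per character.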