From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <012a01c3477a$018ce920$b9844051@insultant.net>
From: "boyd, rounin"
To: <9fans@cse.psu.edu>
References: <20030710160854.E7106@cackle.proxima.alt.za>
 <4b98d4a6bc053f2a6d06aed8997d50ff@plan9.bell-labs.com>
 <20030710162509.F7106@cackle.proxima.alt.za>
 <3F0D8198.6000009@nas.com>
 <20030711064145.G7106@cackle.proxima.alt.za>
 <00f601c34772$215ea340$b9844051@insultant.net>
 <20030711081111.H7106@cackle.proxima.alt.za>
Subject: Re: [9fans] A simple question
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Date: Fri, 11 Jul 2003 08:59:43 +0200
Content-Transfer-Encoding: quoted-printable
Topicbox-Message-UUID: f5942c0c-eacb-11e9-9e20-41e7f4b1d025

> Yes, that's where composition came into the picture. But that
> needs to be clever, with character scaling to make room for accents
> becoming more than a trivial nuisance. Technically, it doesn't
> matter what a symbol stands for, as much as it needs to be _presented_
> in an unambiguous, clear fashion.

yes that is not a problem.

> Whether it's a pronunciation issue or a distinct character (is the
> final letter in "papa" and "papà" a pronunciation aid or a different
> character in the sense of differentiating words with different
> meanings?) is not important to its internal or external representation.

but it's a big problem when you come to write sort(1).

> But I do get your point that overlaps of alphabets for different
> languages does add complexity. Maybe there is enough scope in
> UTF-8 or Unicode to allow many-to-one internal to external mappings.

it really is a nasty problem. python [makes a crucifix sign] lets you add
codecs and all sorts of horrible crap.

> The existence of a phonetic alphabet is a different issue, too vast
> to address here (without composition capabilities, specially).

phonetic alphabets are easy. well japanese would be perfect if
they had stuck with the kana, but the kanji really messes things up.
i think the worst character has 17 readings and that can ruin your
whole day. just my rounin tattoo:

    http://www.insultant.net/images/rounin.jpg

has three readings and a 4th in chinese.

> Suffice to say even in Plan 9 there are fonts that do not have all
> the useful glyphs in them, so whereas UTF-8 is a great abstraction
> for internal purposes, there should be a more definite standard
> about externalising it.

well maybe. i'd be happier if i could choose my keyboard layout
as part of the install. much as i hate lunix, the redhat install
was pretty impressive, but that's the 'army of programmers' tar pit.
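
to make the sort(1) remark above concrete, here is a minimal sketch of why "papa" and "papà" are awkward to order. it assumes python's standard unicodedata module; fold_key is a hypothetical helper made up for illustration, not what any real sort(1) does, and it ignores locale-specific collation rules entirely.

```python
import unicodedata

def fold_key(word):
    """Hypothetical sort key: base letters first, accents only break ties."""
    # NFD splits a precomposed "à" (U+00E0) into "a" + combining grave (U+0300).
    decomposed = unicodedata.normalize("NFD", word)
    # Drop the combining marks to get the unaccented base spelling.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Tie-break on the composed (NFC) form so equal bases still order stably.
    return (base, unicodedata.normalize("NFC", word))

words = ["pape", "pap\u00e0", "papa"]

# Plain code-point order puts the accented word after every ASCII word.
by_codepoint = sorted(words)

# The folded key keeps "papà" next to "papa".
by_fold = sorted(words, key=fold_key)

# Composition: the same visible word can be stored two different ways.
precomposed = "pap\u00e0"    # single code point for à
combining = "papa\u0300"     # "a" followed by a combining grave accent
same_after_nfc = unicodedata.normalize("NFC", combining) == precomposed

print(by_codepoint)
print(by_fold)
print(precomposed == combining, same_after_nfc)
```

the two spellings of "papà" compare unequal as raw strings and equal only after normalisation, which is the many-to-one mapping the quoted text asks about. whether an accent should merely break ties, as assumed here, or make a different letter altogether depends on the language.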