From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 20 Jun 2011 13:18:45 +0200
From: tlaronde@polynum.com
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Message-ID: <20110620111845.GA540@polynum.com>
References: <20110616121700.GA9131@polynum.com>
 <9556bc097d90b774c37c16af5a7c20eb@brasstown.quanstro.net>
 <20110619163458.GA424@polynum.com>
 <3c7e401c771bdd0d9bd8950ceb60eb9e@ladd.quanstro.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3c7e401c771bdd0d9bd8950ceb60eb9e@ladd.quanstro.net>
User-Agent: Mutt/1.4.2.3i
Subject: Re: [9fans] [RFC] fonts and unicode/utf [TeX]
Topicbox-Message-UUID: f2fc1544-ead6-11e9-9d60-3106f5b1d025

On Sun, Jun 19, 2011 at 06:38:59PM -0400, erik quanstrom wrote:
>
> nobody cares what font encoding tex uses internally. the
> real issue is the input to tex. i sure would be very reluctant
> to load anything on my system that will mangle utf-8, especially
> for codepoints <256. that's the path to wchar_t.

That TeX on Plan9 should accept utf-8 is not in question. But TeX has
a present state, and kerTeX has a present state.

For now, TeX only chews bytes (octets); there are apparently some
acrobatics with a LaTeX macro set trying to accommodate utf in input
(according to Russ Cox, if I understood correctly what he wrote).

One can use TeX with utf as long as one uses only ASCII (by
design/definition of utf). That is, one can use TeX in interactive
mode on Plan9 in conformance with the TeXbook, since the TeXbook uses
only ASCII, even to create non-ASCII glyphs (accented letters obtained
with escape sequences). TeX will do undesired things if it chews
non-ASCII characters encoded in utf (and this starts already with the
Unicode latin1 range).

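To make the byte-level point concrete, here is a minimal Python sketch.
It is not part of kerTeX or TeX; the sample word and the helper name
byte_view are assumptions chosen only for illustration. It lists the
octets a byte-oriented program would read for the same text in utf-8
and in an 8-bit latin1 encoding.

```python
# Sketch only: compare the octets of the same text in UTF-8 and Latin-1.
# The sample word and the helper name are illustrative assumptions.

def byte_view(text, encoding):
    """Return the octets a byte-oriented program would read."""
    return list(text.encode(encoding))

ascii_text = "\\'ecole"       # TeXbook-style accent escape, pure ASCII
latin_text = "\u00e9cole"     # same word with U+00E9 typed directly

# Pure ASCII input is byte-for-byte identical in both encodings.
assert byte_view(ascii_text, "utf-8") == byte_view(ascii_text, "latin-1")

utf8_bytes = byte_view(latin_text, "utf-8")
latin1_bytes = byte_view(latin_text, "latin-1")

# In UTF-8, U+00E9 is two octets (0xC3 0xA9); in Latin-1 it is the
# single octet 0xE9, equal to the Unicode code point.
print([hex(b) for b in utf8_bytes])
print([hex(b) for b in latin1_bytes])
```

A program that reads one octet per character sees two characters, 0xC3
and 0xA9, where the utf-8 text has only one; with the latin1 bytes it
sees a single octet whose value is the code point itself.
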
BUT, since the "codepoints" of the latin1 subrange are present (except
for /dcroat and /Dcroat) in the 229-glyph PostScript Core fonts, and
since I can create fonts (tfm) for TeX covering the "ASCII/latin1"
characters, people who use this wider (even if limited) range can
enter the text on Plan9, use tcs(1) to convert it to latin1, i.e. an
8-bit encoding, and feed this file (non-interactively) to TeX.

This adds, for now (and for systems other than Plan9 that still use
chars == octets), some supplementary ability without removing anything.

I have to make a choice. YES, "latin1" is no less of a special case
than non-ASCII in utf; but the glyphs are there (in the PS core fonts);
the values are the same as in Unicode; so it seems more natural to
choose this than any other _for now_.

Paris was not built in a day. Neither was KerTeX.
--
Thierry Laronde
http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C