From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 30 Jun 2011 19:00:48 +0200
From: tlaronde@polynum.com
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Message-ID: <20110630170048.GA999@polynum.com>
References: <20110627114856.GA7099@polynum.com>
	<9308c52f360f6274e0730399741278ce@ladd.quanstro.net>
	<20110627172006.GA497@polynum.com> <4E08DDDE.94AB.00CC.0@wlu.ca>
	<20110628111915.GA498@polynum.com> <4E0B804C.94AB.00CC.0@wlu.ca>
	<20110630130254.GA7276@polynum.com> <4E0C5549.94AB.00CC.0@wlu.ca>
	<20110630162524.GA442@polynum.com>
	<59e4d419ba69189bc467a330651c7044@ladd.quanstro.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <59e4d419ba69189bc467a330651c7044@ladd.quanstro.net>
User-Agent: Mutt/1.4.2.3i
Subject: Re: [9fans] [RFC] fonts and unicode/utf [TeX]
Topicbox-Message-UUID: f8adaeda-ead6-11e9-9d60-3106f5b1d025

On Thu, Jun 30, 2011 at 12:31:17PM -0400, erik quanstrom wrote:
> > I don't despise XeTeX. Nor Unicode. And I will take Unicode as is. But I
> > will take TeX conventions as is too, since I'm working on TeX, and not
> > another formatting system; since these conventions are confined to the
> > ASCII subrange and only diverging from ASCII for the not glyph
> > positions. I still fail to see what's the big deal?
>
> you can't have it both ways.  you can't at the same time say tex is
> only defined for ascii, so utf-8 is a non sequitor, and at the same time
> put out a version of tex that takes latin1 input.

No, this is an error you and others are making.

There is a distinction between the input encoding (for the moment TeX
expects only 8-bit input) and some conventions in the font organization.

The Computer Modern fonts provide the ASCII "visible" characters (glyphs)
in the ASCII positions. But there are other positions in the 0-127 range
that are free. These positions are used "internally" by the plain TeX
conventions (TeX is the compiler/interpreter; tex(1) is the interpreter
having loaded a special set of conventions, the ones of plain TeX; one
can do almost entirely without them, or entirely differently). These free
(as far as a font is concerned) positions are filled with non-ASCII
characters/glyphs. For example, in the text font layout, the 0x1a
position has the glyph for \ae. If a user, using plain TeX,
specifies \ae, the constructed TFM will give the correct metrics for the
glyph, and the DVI driver will place the correct glyph.

This does not preclude the user from directly entering the Unicode
code point: if you want, the glyph information can be duplicated in the
TFM, once in the conventional plain TeX position and once, as a literal,
in the Unicode position.

In this case, the same glyph is reached whether through the plain TeX
\ae char definition, the 0x1a code (ASCII control "SUB"), or the 0x00e6
Unicode code point.
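
To make the idea concrete, here is a minimal sketch in Python. It is
only an illustration of the mapping described above, not how a TFM file
is actually laid out; the names AE_GLYPH, font_map, macros and lookup
are made up for the example.

```python
# Hypothetical sketch: one glyph entry reachable from several font positions.
# This models the idea only; it is not the real TFM file format.

AE_GLYPH = {"name": "ae"}  # stands in for the glyph's metric information

# Font position -> glyph information. The same entry is stored twice.
font_map = {
    0x1A: AE_GLYPH,    # conventional plain TeX position (ASCII control SUB)
    0x00E6: AE_GLYPH,  # duplicate at the Unicode code point of the ae ligature
}

# Control sequence -> font position, in the spirit of a plain TeX char definition.
macros = {r"\ae": 0x1A}


def lookup(token):
    """Resolve a control sequence (str) or a numeric code (int) to a glyph."""
    position = macros[token] if isinstance(token, str) else token
    return font_map[position]
```

All three routes, lookup(r"\ae"), lookup(0x1A) and lookup(0x00E6), end
at the same entry, and nothing here says anything about how the input
bytes were encoded.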

This is not the input encoding; this is a font mapping.
--
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                      http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C