Date: Mon, 27 Jun 2011 13:48:57 +0200
From: tlaronde@polynum.com
To: Michael Kerpan
Cc: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] [RFC] fonts and unicode/utf [TeX]

On Sun, Jun 26, 2011 at 09:01:13PM -0400, Michael Kerpan wrote:
> On Sun, Jun 26, 2011 at 3:57 AM, wrote:
>
> > I don't know what "automagic" ligatures are; but ligatures are here
> > in the kerTeX fonts, the user having nothing special to do to have
> > them. Small caps are here. Using the system fonts is here too, at
> > least for T1 fonts: afm2tfm(1) makes them available. For other font
> > formats, writing a whatever2tfm(1) will do the job.
>
> In general, using a simple Type 1 font isn't going to get you things
> like true small caps, ligatures (beyond maybe the basic "fi" and
> "fl"), or the ability to choose between old-style and lining figures.

These are not limitations of the software itself but limitations due to
the obscurity of the whole process. Ligatures can be added via the
encoding passed to afm2tfm(1). As an example, in the next-to-come
publication I add the standard classical TeX ones (``, '', fi, fl,
en-dash, em-dash, inverted punctuation for Spanish) plus << and >> for
French guillemets, and ,, for the base double quote (quotedblbase).
Once you know how it is done (and since, if the corresponding glyphs do
not exist, the rule is simply discarded), it is just a matter of
calling the utility with the correct encoding; see the sketch below.
And once this is documented, no more "wizards" needed...

> The 256 glyph limit means that you have to split things up into
> multiple fonts. This works well enough for simply creating a
> PostScript file that will be fed straight to a laser printer, but for
> creating searchable PDF files, it's far from ideal. In TeX, it also
> requires a lot of manual work above and beyond what would be needed
> to get those features using Computer Modern. With OpenType support
> (and using OpenType fonts, of course), typographic features become as
> easy to use with third-party fonts as they are with Computer Modern.

Same answer. TeX does not need the design of the glyphs; it needs only
the metrics (Adobe has published the AFMs for the core PostScript
fonts; the definition of the outlines is not public, which is why the
"urw" clones are used). These are not limitations of TeX itself, but of
the surrounding environment and of the "freedom wizardry by obscurity".

That's also why I want to preserve DVI: one can always write a
dvi2whatever, while emitting PDF directly as the layout language ties
TeX to external support. The huge mess "TeX distributions" have become
will sooner or later kill TeX. One of the major things kerTeX lacks now
is a DVI display renderer (for X and for Rio), so that the system is
standalone and sheltered from external moods. What Donald E. Knuth
wanted was the ability to write his books without depending on anyone
else anymore ("we can't print this way, since this is deprecated,
unavailable, etc."). kerTeX will definitely miss that goal if it
depends on something else.
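To show how little wizardry is involved, here is a minimal sketch of
such an encoding file. The file and font names (kertex.enc, ptmr,
rptmr) are hypothetical placeholders; the LIGKERN comment syntax is
the one documented for afm2tfm(1)/dvips, and, as said above, a rule
naming a glyph the AFM lacks is simply discarded:

    % kertex.enc -- hypothetical encoding file passed to afm2tfm(1).
    % Ligature rules live in LIGKERN comments:
    % LIGKERN f i =: fi ; f l =: fl ;
    % LIGKERN hyphen hyphen =: endash ; endash hyphen =: emdash ;
    % LIGKERN quoteleft quoteleft =: quotedblleft ;
    % LIGKERN quoteright quoteright =: quotedblright ;
    % LIGKERN comma comma =: quotedblbase ;
    % LIGKERN less less =: guillemotleft ;
    % LIGKERN greater greater =: guillemotright ;
    % LIGKERN exclam quoteleft =: exclamdown ;
    % LIGKERN question quoteleft =: questiondown ;
    %
    % The encoding proper is a PostScript array of exactly 256 glyph
    % names (/.notdef for the unused slots):
    /KerTeXEncoding [
    /.notdef /.notdef % ... 256 names in all ...
    ] def

It could then be used along the lines of:

    afm2tfm Times-Roman.afm -T kertex.enc -v ptmr rptmr
    vptovf ptmr.vpl ptmr.vf ptmr.tfm

where rptmr.tfm describes the raw, re-encoded font and the virtual
font ptmr gives TeX the ligatures above. Nothing beyond the documented
encoding is needed.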
The other intellectual context (on my side) is the following: how did
Michael Ventris find the clues to decipher Linear B? The signs were too
numerous to be alphabetic, and not numerous enough to be ideographic.
So he guessed the script was syllabic, with some standalone ideograms.

I suspect that, where some civilizations have not evolved rapidly, this
is due in part to the way knowledge is transmitted. An alphabetic
script is easy to learn and, furthermore, it disconnects the signs
partly from the sound and totally from the sight of the object (for
concrete ones). Alphabetic writing has rules, while ideographic writing
requires erudition; and since it seems unnatural to have an ideographic
base (a few signs that, combined, can describe higher-level notions),
it makes new ideas more difficult to express and transmit.

Unicode is a good idea: it avoids "guessing" the language, and it
avoids plaguing code with language knowledge. With this, the UTF
encoding is the best idea, keeping ASCII and keeping the smallest
addressable unit, i.e. bytes. But I don't want to be obliged to "know"
65536 signs to express what I want to express. I'm sorry, but I think
the vast majority (remember that in latin1/latin2 the accented letters
are just variants, so they need less "user memory" than entirely
distinct characters) can do with blocks of (fewer than) 256 signs, and
switch fonts when "speaking" about special things (the switch can be
automatic, by the way).

As far as TeX is concerned, all the control code points (positions) are
useless in the fonts, so there is still room available even in the
latin1-encoded TFMs built for the (next) kerTeX from the PostScript
core fonts. Does a whole Unicode "Times-Roman" font make sense?
Ideograms in "Times-Roman"? So Unicode is not a panacea. It is a means,
not an end. ("Un moyen, pas une fin.")

--
Thierry Laronde
http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C