Date: Mon, 27 Jun 2011 13:48:57 +0200
From: tlaronde@polynum.com
To: Michael Kerpan
Cc: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] [RFC] fonts and unicode/utf [TeX]

On Sun, Jun 26, 2011 at 09:01:13PM -0400, Michael Kerpan wrote:
> On Sun, Jun 26, 2011 at 3:57 AM, wrote:
>
> > I don't know what "automagic" ligatures are; but ligatures are here
> > in the kerTeX fonts, the user having nothing special to do to have
> > them. Small caps are here. Using the system fonts is here too, at
> > least for T1 fonts: afm2tfm(1) makes them available. For other font
> > formats, writing a whatever2tfm(1) will do the job.
>
> In general, using a simple Type 1 font isn't going to get you things
> like true small caps, ligatures (beyond maybe the basic "fi" and
> "fl"), or the ability to choose between old-style and lining figures.

These are not limitations of the software itself but limitations due to
the obscurity of the whole process. Ligatures can be added via the
encoding passed to afm2tfm(1). As an example, in the next-to-come
publication I add the standard classical TeX ones (``, '', fi, fl,
en-dash, em-dash, inverted punctuation for Spanish) plus << and >> for
French guillemets, and ,, for the base double quote (quotedblbase).
Once you know how it is done (and since, if the corresponding glyphs do
not exist, the rule is simply discarded), it is just a matter of
calling the utility with the correct encoding; see the sketch below.
And once this is documented, no more "wizards" needed...

> The 256 glyph limit means that you have to split things up into
> multiple fonts. This works well enough for simply creating a
> PostScript file that will be fed straight to a laser printer, but for
> creating searchable PDF files, it's far from ideal. In TeX, it also
> requires a lot of manual work above and beyond what would be needed
> to get those features using Computer Modern. With OpenType support
> (and using OpenType fonts, of course), typographic features become as
> easy to use with third-party fonts as they are with Computer Modern.

Same answer. TeX does not need the design of the glyphs; it needs only
the metrics (Adobe has published the AFMs for the core PostScript
fonts; the definition of the outlines is not public, which is why the
"urw" clones are used). These are not limitations of TeX itself, but of
the surrounding environment and of the "freedom wizardry by obscurity".

That's also why I want to preserve DVI: one can always write a
dvi2whatever, while emitting PDF directly as the layout language ties
TeX to external support. The huge mess "TeX distributions" have become
will sooner or later kill TeX. One of the major things kerTeX lacks now
is a DVI display renderer (for X and for Rio), so that the system is
standalone and sheltered from external moods. What Donald E. Knuth
wanted was the ability to write his books without depending on anyone
else anymore ("we can't print this way, since this is deprecated,
unavailable, etc."). kerTeX will definitely miss that goal if it
depends on something else.
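To show how little wizardry is involved, here is a minimal sketch of
such an encoding file. The file and font names (kertex.enc, ptmr,
rptmr) are hypothetical placeholders; the LIGKERN comment syntax is
the one documented for afm2tfm(1)/dvips, and, as said above, a rule
naming a glyph the AFM lacks is simply discarded:

    % kertex.enc -- hypothetical encoding file passed to afm2tfm(1).
    % Ligature rules live in LIGKERN comments:
    % LIGKERN f i =: fi ; f l =: fl ;
    % LIGKERN hyphen hyphen =: endash ; endash hyphen =: emdash ;
    % LIGKERN quoteleft quoteleft =: quotedblleft ;
    % LIGKERN quoteright quoteright =: quotedblright ;
    % LIGKERN comma comma =: quotedblbase ;
    % LIGKERN less less =: guillemotleft ;
    % LIGKERN greater greater =: guillemotright ;
    % LIGKERN exclam quoteleft =: exclamdown ;
    % LIGKERN question quoteleft =: questiondown ;
    %
    % The encoding proper is a PostScript array of exactly 256 glyph
    % names (/.notdef for the unused slots):
    /KerTeXEncoding [
    /.notdef /.notdef % ... 256 names in all ...
    ] def

It could then be used along the lines of:

    afm2tfm Times-Roman.afm -T kertex.enc -v ptmr rptmr
    vptovf ptmr.vpl ptmr.vf ptmr.tfm

where rptmr.tfm describes the raw, re-encoded font and the virtual
font ptmr gives TeX the ligatures above. Nothing beyond the documented
encoding is needed.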
The other intellectual context (on my side) is the following: how did
Michael Ventris find the clues to decipher Linear B? The signs were too
numerous to be alphabetic, and not numerous enough to be ideographic.
So he guessed the script was syllabic, with some standalone ideograms.

I suspect that, where some civilizations have not evolved rapidly, this
is due in part to the way knowledge is transmitted. An alphabetic
script is easy to learn and, furthermore, it disconnects the signs
partly from the sound and totally from the sight of the object (for
concrete ones). Alphabetic writing has rules, while ideographic writing
requires erudition; and since it seems unnatural to have an ideographic
base (a few signs that, combined, can describe higher-level notions),
it makes new ideas more difficult to express and transmit.

Unicode is a good idea: it avoids "guessing" the language, and it
avoids plaguing code with language knowledge. With this, the UTF
encoding is the best idea, keeping ASCII and keeping the smallest
addressable unit, i.e. bytes. But I don't want to be obliged to "know"
65536 signs to express what I want to express. I'm sorry, but I think
the vast majority (remember that in latin1/latin2 the accented letters
are just variants, so they need less "user memory" than entirely
distinct characters) can do with blocks of (fewer than) 256 signs, and
switch fonts when "speaking" about special things (the switch can be
automatic, by the way).

As far as TeX is concerned, all the control code points (positions) are
useless in the fonts, so there is still room available even in the
latin1-encoded TFMs built for the (next) kerTeX from the PostScript
core fonts. Does a whole Unicode "Times-Roman" font make sense?
Ideograms in "Times-Roman"? So Unicode is not a panacea. It is a means,
not an end. ("Un moyen, pas une fin.")

--
Thierry Laronde
http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C