9fans - fans of the OS Plan 9 from Bell Labs
From: tlaronde@polynum.com
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] [RFC] fonts and unicode/utf [TeX]
Date: Sat, 25 Jun 2011 08:50:17 +0200
Message-ID: <20110625065017.GA638@polynum.com>
In-Reply-To: <iu357j$o3k$1@dough.gmane.org>

On Fri, Jun 24, 2011 at 11:05:23PM +0000, Mauricio CA wrote:
>
> I found this text in TeX by Topic[1] that seems to support Quanstrom's
> idea. It describes how TeX reads input, and says it's done one line at
> a time (where it follows what the system defines as lines) and then for
> each line it first removes trailing spaces; then (possibly) adds a return
> to the end of the line; and then, since "computers may also differ in
> the character encoding (the most common schemes are ASCII and EBCDIC),
> so TeX converts the characters that are read from the file to its own
> character codes. These codes are then used exclusively [...]"

This is simply an extract of what is explained partly in the
TeXbook and partly in TeX: The Program, two of the five volumes of
D.E. Knuth's series on computer typesetting.

The initial exchange between characters happens, shall we say, at the
"system" level. But in the code it is limited to the ASCII (7 bits)
range (and even if virtex(1) is almost the bare metal, it can only be
bootstrapped by ASCII macro commands); furthermore, TeX is "8 bits
clean", that is, for "text" it uses only 8 bits for input... and as
CID for fonts.

The exchange is defined at compilation time, but can also be remapped
via macro-commands.
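
To make the quoted description concrete, here is a minimal sketch of
the per-line input step: strip trailing spaces, translate each byte
through a 256-entry table, then append the end-of-line character. The
function names are hypothetical (they are not the ones in the TeX
sources), the table is the identity unless remapped, and 13 is assumed
as the end-of-line code.

```python
# Illustrative sketch only: names are hypothetical, not those of tex.web.
END_LINE_CHAR = 13  # assumed end-of-line code (carriage return)


def make_xord(remap=None):
    """Build a 256-entry external -> internal table (identity by default)."""
    xord = list(range(256))
    for external, internal in (remap or {}).items():
        xord[external] = internal
    return xord


def read_line(raw, xord, end_line_char=END_LINE_CHAR):
    """Return the internal codes for one input line (no newline in raw)."""
    stripped = raw.rstrip(b" ")           # remove trailing spaces
    codes = [xord[b] for b in stripped]   # translate byte by byte
    if 0 <= end_line_char <= 255:
        codes.append(end_line_char)       # (possibly) add the "return"
    return codes
```

With the identity table, read_line(b"ab  ", make_xord()) gives the codes
for "a", "b" and the end-of-line character; a different table changes
only the translation step, which is why this part is easy to adapt.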

So casting utf in 8 bits:
	- is useless for ASCII (by definition);
	- will work only for latin1 input.
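
A small check of these two points, as a sketch: it assumes that
"casting in 8 bits" means keeping one byte per rune, which Python's
latin-1 codec does for code points 0 to 255. The helper name is
hypothetical.

```python
def cast_to_8_bits(text):
    """Return one byte per rune, or None if some rune is above 0xFF."""
    try:
        return text.encode("latin-1")
    except UnicodeEncodeError:
        return None


# ASCII: the UTF-8 bytes are already the 8-bit bytes, nothing to do.
ascii_same = "TeX".encode("utf-8") == cast_to_8_bits("TeX")

# latin1: the cast works, but differs from the UTF-8 byte sequence.
latin1_bytes = cast_to_8_bits("\u00e9t\u00e9")     # e-acute, t, e-acute
utf8_bytes = "\u00e9t\u00e9".encode("utf-8")

# Beyond latin1 (here U+0153, the oe ligature): no 8-bit cast exists.
beyond = cast_to_8_bits("\u0153uvre")
```

Here ascii_same is true, latin1_bytes has three bytes where utf8_bytes
has five, and beyond is None.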

Extending TeX to wydes (runes) will be, superficially, relatively easy
for input and output (because D.E.K. has organized the code so that
these parts can be easily changed), but will not work with TeX fonts:
the whole font machinery has to be changed.

Furthermore, this will not work, as is, with the whole Unicode
range, since TeX is "left-to-right". (What is fundamental is that,
all in all, with the possible exception of Frege's ideography, all
languages seem to be linear; so a switch in TeX for the computed width
and height of the boxes, and hints for dvi drivers to flip/mirror, can
achieve the task.) So this too has to be adapted (hence the suggestion
of XeTeX).

So for now, TeX is kept 8 bits. I make no assumption about the encoding:
the user has to feed an "8 bits encoding" to TeX. ASCII users have
nothing to change; others, if they want to use another 8 bits encoding
directly (e.g. latin1 codes for accented letters), have to tcs(1) the
file first.
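
For those without tcs(1) at hand, the conversion step can be
approximated as below. This is only a stand-in under stated
assumptions, not tcs itself: the function name is hypothetical, the
source is assumed to be UTF-8 and the target latin1, and a rune with no
code in the target is reported as an error instead of being silently
dropped.

```python
def to_8_bits(data, source="utf-8", target="latin-1"):
    """Re-encode bytes; raise ValueError on a rune the target lacks."""
    text = data.decode(source)
    try:
        return text.encode(target)
    except UnicodeEncodeError as err:
        bad = text[err.start]
        raise ValueError(f"no {target} code for {bad!r}") from err
```

An ASCII file comes out byte for byte unchanged, so running it on such
input is harmless.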

What I will change is only the set of fonts available.

For historical reasons, the fonts derived from the standard PostScript
ones were in the "EC" encoding, aka Cork, which maps mainly latin1
characters into the 128-255 range, but not at the latin1 positions
(because it was defined in 1990).

A macro set shall install its own expected fonts.

KerTeX shall be usable to its full extent (relative to its present
state) with the data KerTeX provides, here fonts. And to avoid providing
non-D.E.K. fonts under the same (cryptic) names as the ones commonly
found in other TeX distributions, the kerTeX ones will use a Unix
feature, the directory hierarchy, to express the dependencies: not an
initial letter for the font foundry, but a subdirectory: adobe/ etc.

This does not prevent anyone from generating other flavours, especially
since everything is shown and explained by the directory layout and the
conf/KERTEX.post-install Bourne shell script.
--
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                      http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



