From mboxrd@z Thu Jan  1 00:00:00 1970
Message-Id: <4E0B804C.94AB.00CC.0@wlu.ca>
Date: Wed, 29 Jun 2011 19:43:08 -0400
From: "Karljurgen Feuerherm" <kfeuerherm@wlu.ca>
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@9fans.net>
References: <20110625150327.GA425@polynum.com>
	<iu52m9$a54$1@dough.gmane.org> <20110625171134.GA3661@polynum.com>
	<BANLkTikoagmZ41qpH8Zqf5xw_btH1iP7Vg@mail.gmail.com>
	<20110626075745.GA395@polynum.com>
	<BANLkTi=WQCj2vL0j=G4FW08FDy_KrYpDMQ@mail.gmail.com>
	<20110627114856.GA7099@polynum.com>
	<9308c52f360f6274e0730399741278ce@ladd.quanstro.net>
	<20110627172006.GA497@polynum.com> <4E08DDDE.94AB.00CC.0@wlu.ca>
	<20110628111915.GA498@polynum.com>
In-Reply-To: <20110628111915.GA498@polynum.com>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="=__Part0926419C.0__="
Subject: Re: [9fans] [RFC] fonts and unicode/utf [TeX]
Topicbox-Message-UUID: f7cf48f2-ead6-11e9-9d60-3106f5b1d025

--=__Part0926419C.0__=
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

I=27d like to make a few comments concerning what you say below.=20

1. I=27ve been involved with Unicode, both UTC and as a representative to
WG2, and I can confidently affirm that there is no Unicode God. No one
has ever said There is no Code but Unicode, and UTC/WG2 is its prophet,
or anything like that. If you have a reference to the Unicode Standard
where I can read in black and white what you are referring to, I will
happily look at it. (This is not intended as a smart remark. I=27m quite
seriously interested in understanding the facts of this issue.)=20

2. Anyone involved in Unicode, including inner core members of UTC etc.,
recognizes that it=27s far from perfect. There is acknowledgement that a
number of things could have been handled differently, but weren=27t.
Stability Policy may seem like a problematic restriction to some in
cases like this, but it guarantees backward compatibility, so has wisdom
to it.=20

3. Whatever views one may have on Unicode, for better or worse, it is
what it is. As you said yourself, it is a means and not an end (=22c=27est
un moyen et non pas une fin=22). One is free to use it, or not, and/or
devise alternatives. (But more on alternatives below.)=20

4. You suggested in an earlier email that you=27d like to think the whole
thing through carefully in advance, rather than implement things in
stages, as others do, who then never get to the advanced stages. To me
this raises the question of whether such is always the case.
In particular, if anyone or any group tried/had tried to implement all
of what Unicode proposes to be/become (UCS--Universal Character Set),
the sheer magnitude of the task (which of course grows over time since
scripts either in themselves or as a set are not static), he/she/they
would never get the thing off the ground. This is in part why there are
(arguably) flaws in Unicode. In any case, I seriously doubt that even if
one attempted to =22redo=22 it =22the right way this time=22 one would =
manage.
This is just not within the grasp of human endeavour. The mistakes would
simply be different or in different areas. Likewise, there are plenty of
things one could bring against the process of Unicode endorsing
proposals, i.e. the inherent politics of interested groups, but that
again is always a reality.=20

5. All that being said--Plan 9, as far as I can see, intentionally
supports Unicode (see http://plan9.bell-labs.com/plan9/about.html).
So to me, it=27s a
non-starter to want to port *TeX to Plan 9 but rail against Unicode,
whether justifiably or through misunderstanding.=20

6. Unicode isn=27t Eternal, any more than any other encoding standard.
(I=27m sure there were--and perhaps still are--those who think that BCD,
no wait=21 EBCD, no wait=21 ASCII, no wait...=21--were/are the be all and =
end
all). In time, something else will develop in response to developing
needs.=20

7. But at present, the recognized standard out there for most
practical intents and purposes (in particular, to service the needs of
something other than just North American anglophone techie society) is
Unicode, with whatever blemishes it may have.=20

So it seems to me that in keeping with your principle alluded to above,
and given that we=27re talking about a Plan 9 environment here, you ought
to be talking UTF-8 right off the bat.=20

As I said--=22seems to me=22. Could be I=27m seriously misunderstanding =
the
discussion... but then again, the diminishing dialogue in terms of
number of participants suggests to me that there may be at least *some*
truth in what I=27m thinking....=20

Please don=27t think this is intended as a rant, either due to the way
I=27ve formatted this or on account of the content. I=27m interested in
following what you=27re doing; I=27m just a bit puzzled, and I sincerely
wish you the best in your efforts with this project.=20

K

>>> <tlaronde=40polynum.com> 06/28/11 7:19 AM >>>
On Mon, Jun 27, 2011 at 07:45:34PM -0400, Karljurgen Feuerherm wrote:
> Thierry,
>
> > I only say that:
>
> > 1) Forcing, as this was written in the XeTeX FAQ, user to enter the
> special codepoint for the fi ligature since, white eyes, scornful wave
> of the hand: =22this is the way this is done with Unicode=22 is sheer
> stupidity.
>
> I don=27t know who told you that...  just because there is a codepoint
for something does not mean that one has to access that codepoint
directly in all cases. Software at various levels can render a ligature
on the basis of various actual character sequences (e.g. f + i, or f, i
when ligatures are forced, etc.)
>
> It=27s simply a level of what support one wishes to offer....

This is exactly what I=27m trying to say. If one enters =5C=27e, =5C=27 is =
just
the =22charname=22 or macro command to access the acute accent in the =
font.
One can enter directly the code for the acute accent. Or one can enter
directly the =C3=A9 (if the CID entered is classified as =22other=22 =
=5Bliteral=5D,
and the fonts have something at the corresponding index).

BUT the documentation found told that with =22modern=22 fonts, one has the
absolute obligation threatened by Thy Unicode GOD to enter the codepoint
and that ligatures were deprecated.

TeX is absolutely agnostic. It is an engine, a compiler/interpreter.
Even tex(1) is just the name of an instance of TeX with a special
convention: D.E. Knuth=27s plain TeX.
>
> KF

--
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                      http://www.kergis.com/
Key fingerprint =3D 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C





--=__Part0926419C.0__=--