From mboxrd@z Thu Jan 1 00:00:00 1970
Message-Id: <4E085DBF.94AB.00CC.0@wlu.ca>
Date: Mon, 27 Jun 2011 10:38:55 -0400
From: "Karljurgen Feuerherm"
To: <9fans@9fans.net>
References: <20110621105626.GA536@polynum.com>
 <20110625065017.GA638@polynum.com>
 <522e1e2a38aa18c291305563d362abfe@ladd.quanstro.net>
 <20110625150327.GA425@polynum.com>
 <20110625171134.GA3661@polynum.com>
 <20110626075745.GA395@polynum.com>
 <20110627114856.GA7099@polynum.com>
 <9308c52f360f6274e0730399741278ce@ladd.quanstro.net>
In-Reply-To: <9308c52f360f6274e0730399741278ce@ladd.quanstro.net>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="=__Part547B1FEF.0__="
Subject: Re: [9fans] [RFC] fonts and unicode/utf [TeX]
Topicbox-Message-UUID: f6e2b24e-ead6-11e9-9d60-3106f5b1d025

This is a MIME message. If you are reading this text, you may want to
consider changing to a mail reader or gateway that understands how to
properly handle MIME multipart messages.

--=__Part547B1FEF.0__=
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Thanks for bringing up Sumerian (better: Sumero-Akkadian Cuneiform). I
was thinking along exactly those lines. For me at least, solutions that
satisfy 'the majority' are no solutions at all. And obviously, I'm not
alone.

(Though it could well be that I missed the intent of Thierry's comment
and am barking up the wrong tree.)

K

>>> erik quanstrom <quanstro@quanstro.net> 06/27/11 8:36 AM >>>
> But I don't want to have the obligation to "know" 65536 signs to
> express what I want to express. I'm sorry, but I think that the
> main majority (remember that for latin1/latin2 accented letters
> are just variants so need less "user memory" than plain different
> characters) can do with (less than) 256 signs blocks, and switch
> fonts when "speaking" about special things (the switch can be
> automatic by the way). As far as TeX is concerned, all the control
> codepoints (positions) are useless in the fonts. There is still
> available room even if for the latin1 encoded tfm built for (next)
> kerTeX from PostScript core.

there are currently 0x10ffff+1 codepoints (1114112), not 65536,
but only 23669 + the large chinese blocks are currently defined.

but anyway, i think you are missing the point.  every one of those
codepoints is used, or was used in human written communication.
the fact that you or i probably don't know them all is beside the
point entirely.

there are 600000 words in the oxford english dictionary.  i don't
know them all.  let's suppose i had the power to eliminate all
the ones that i don't know.  wouldn't that be a horrible idea?
then i would not be able to learn any new words.  odious.

so with unicode.  if you strip out all the languages you don't know
by restricting yourself to the latin1 codepoints [0, 256), then you
can't easily add, say, greek or sumerian codepoints should you or
anyone else need them.

since, as you can see, there is a 1:1 identity mapping between latin1
and unicode codepoints [0, 256), i don't see why one wouldn't
give oneself the option to increase this subset to cover more ground.
i use alphas, arrows, math symbols, etc. quite often in code.  and
even more often when i used to use tex.  it's really quite a drag to
read \alpha instead of “α.”

> Does a whole Unicode "Times-Roman" font make sense? Ideograms in
> "Times-Roman"?

i get confused on terms.  i think the right term is typeface.
extended font collections of a given typeface covering very
wide sections of unicode do exist and are sold by the major
font vendors.

i don't think that it's too hard to imagine that one can make
most symbols look compatible enough.  in fact, i'm using a font
with ~32000 glyphs on my plan 9 terminal right now.

and there's no penalty for having that many glyphs.  it just
means that my font file has a couple hundred subfonts.  these
are only open if needed.  typically only 3 subfonts are open
at any one time.

- erik

--=__Part547B1FEF.0__=
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Content-Description: HTML

--=__Part547B1FEF.0__=--
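
Two of the numeric claims in the message above are easy to check mechanically. The following is a minimal sketch, assuming Python 3 and its standard library only; the names `TOTAL_CODEPOINTS`, `latin1_is_identity` and `utf8_length` are made up for illustration and come from neither the thread nor Plan 9.

```python
import sys

# Unicode code points run from 0 through 0x10FFFF inclusive,
# so the size of the code space is 0x10FFFF + 1.
TOTAL_CODEPOINTS = sys.maxunicode + 1


def latin1_is_identity() -> bool:
    """Return True if every Latin-1 byte value b decodes to code point b."""
    return all(ord(bytes([b]).decode("latin-1")) == b for b in range(256))


def utf8_length(ch: str) -> int:
    """Return the number of bytes used to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    print(TOTAL_CODEPOINTS)        # size of the Unicode code space
    print(latin1_is_identity())    # Latin-1 is the [0, 256) prefix of Unicode
    print(utf8_length("α"))        # bytes needed for U+03B1 in UTF-8
```

Because the Latin-1 range is an identity prefix of the Unicode code space, text restricted to [0, 256) stays valid if the permitted subset is widened later, which is the point the message makes about keeping that option open.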