* Re: Chinese
[not found] <20051212173143.94765127FA@ronja.ntg.nl>
@ 2005-12-13 8:07 ` Duncan Hothersall
2005-12-13 9:52 ` Chinese Hans Hagen
0 siblings, 1 reply; 11+ messages in thread
From: Duncan Hothersall @ 2005-12-13 8:07 UTC (permalink / raw)
Hans wrote:
> chinese is not yet defined in utf so if you want that, we need to do it
...
> assuming this, how about making a set of tfm,enc,map files that match
> the unicode positions (volunteers ...)
I'm very willing to help, especially if there is some drudge work
involved in constructing the files. I don't know enough (yet) about the
logic of it all to help with setting up the system, but if someone can
supply skeleton files and/or a method for constructing the necessary
files, I'm happy to do any leg-work.
Duncan
* Re: Re: Chinese
2005-12-13 8:07 ` Chinese Duncan Hothersall
@ 2005-12-13 9:52 ` Hans Hagen
2005-12-13 10:03 ` sjoerd siebinga
2005-12-13 12:33 ` Adam Lindsay
0 siblings, 2 replies; 11+ messages in thread
From: Hans Hagen @ 2005-12-13 9:52 UTC (permalink / raw)
Duncan Hothersall wrote:
>Hans wrote:
>
>>chinese is not yet defined in utf so if you want that, we need to do it
>
>...
>
>>assuming this, how about making a set of tfm,enc,map files that match
>>the unicode positions (volunteers ...)
>
>I'm very willing to help, especially if there is some drudge work
>involved in constructing the files. I don't know enough (yet) about the
>logic of it all to help with setting up the system, but if someone can
>supply skeleton files and/or a method for constructing the necessary
>files, I'm happy to do any leg-work.
>
what we need is a set of encoding files like
/UniEncoding52 [
....
/uni52DF
/uni52E0
/uni52E1
/uni52E2
/uni52E3
/uni52E4
...
/.notdef
....
] def
that represent the ranges and can be used to construct tfm files.
(or whatever index entry is needed in order to filter the metrics from
the ttf file)
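A minimal sketch of how one such encoding file could be generated, assuming the set of code points covered by the font is already known (for instance from a glyph list). The function name build_enc_row and the "available" argument are hypothetical and not part of any existing tool; slots the font does not cover are written as /.notdef, as in the example above.

```python
def build_enc_row(row, available):
    """Return the text of an encoding file for one 256-slot Unicode row.

    row       -- high byte of the code points, e.g. 0x52
    available -- set of code points the font covers (assumed to be known)
    """
    lines = ["/UniEncoding%02X [" % row]
    for slot in range(256):
        codepoint = (row << 8) | slot
        # Slots without a glyph in the font become /.notdef.
        if codepoint in available:
            lines.append("/uni%04X" % codepoint)
        else:
            lines.append("/.notdef")
    lines.append("] def")
    return "\n".join(lines) + "\n"


# Hypothetical coverage: only three glyphs of row 52 are present.
enc = build_enc_row(0x52, {0x52DF, 0x52E0, 0x52E1})
```

Each file always holds exactly 256 names, so slot E1 of row 52 is the glyph for U+52E1.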
maybe patrick's font code already can do that:
- read in a ttf file (or a glyph list produced by ttf2tfm or ttf2afm)
- make a range of enc and tfm files
actually, this is rather generic, since pdftex can handle symbolic names
like /index... and /uni..., so if we have such a set, we can stick to
one bunch of enc files
the utf handler can then simply access char E1 from htsong-52.tfm
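The lookup described here can be sketched as follows. This only illustrates the arithmetic and is not the actual utf handler; the function name locate and the lowercase hex suffix in the font name are assumptions modelled on the htsong-52 example.

```python
def locate(codepoint, family="htsong"):
    """Map a code point to (tfm name, character slot in that font)."""
    if not 0 <= codepoint <= 0xFFFF:
        raise ValueError("only code points up to U+FFFF are handled here")
    # The high byte selects the font, the low byte the slot within it.
    row, slot = divmod(codepoint, 256)
    return "%s-%02x" % (family, row), slot


font, slot = locate(0x52E1)  # ('htsong-52', 225), and 225 is "E1
```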
testing is rather simple:
\pdfmapline{htsong-52 <uni-52.enc <htsong.ttf}
\font\test=htsong-52 \char"e1
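To test more than one row, the map entries could be generated in the same form as the \pdfmapline argument above. This is a sketch only: the function name map_lines is hypothetical, and the row range 4E to 9F (the CJK Unified Ideographs block) is an assumption about which rows are wanted.

```python
def map_lines(family, ttf, rows):
    """Return one map entry per Unicode row, e.g. for row 0x52:
    'htsong-52 <uni-52.enc <htsong.ttf'
    """
    return ["%s-%02x <uni-%02x.enc <%s" % (family, row, row, ttf)
            for row in rows]


# Assumed range: rows 4E..9F, i.e. U+4E00..U+9FFF.
entries = map_lines("htsong", "htsong.ttf", range(0x4E, 0xA0))
```

Because the enc files depend only on the row, the same uni-XX.enc set can be reused for htkai, hthei and the other fonts by changing the first two arguments.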
Hans
* Re: Re: Chinese
2005-12-13 9:52 ` Chinese Hans Hagen
@ 2005-12-13 10:03 ` sjoerd siebinga
2005-12-13 10:34 ` Hans Hagen
2005-12-13 10:46 ` Re: Chinese Tobias Burnus
2005-12-13 12:33 ` Adam Lindsay
1 sibling, 2 replies; 11+ messages in thread
From: sjoerd siebinga @ 2005-12-13 10:03 UTC (permalink / raw)
On 13 Dec 2005, at 10:52, Hans Hagen wrote:
> Duncan Hothersall wrote:
>
>> Hans wrote:
>>
>>
>>> chinese is not yet defined in utf so if you want that, we need to
>>> do it
>>>
>> ...
>>
>>> assuming this, how about making a set of tfm,enc,map files that
>>> match the unicode positions (volunteers ...)
>>>
>>
>> I'm very willing to help, especially if there is some drudge work
>> involved in constructing the files. I don't know enough (yet) about
>> the logic of it all to help with setting up the system, but if
>> someone can supply skeleton files and/or a method for constructing
>> the necessary files, I'm happy to do any leg-work.
>>
> what we need is a set of encoding files like
>
> /UniEncoding52 [
> ....
> /uni52DF
> /uni52E0
> /uni52E1
> /uni52E2
> /uni52E3
> /uni52E4
> ...
> /.notdef
> ....
> ] def
I have made a Ruby-script (for personal use loosely based on Adam's
xsl-files) which generates all the encoding- and symbolfiles from a
given cmapfile. If someone could send me the ttf-font, I can generate
all the necessary encodingfiles for you.
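The script itself is not shown in this thread. As a rough illustration of the first step such a generator needs, the sketch below groups the code points found in a cmap into 256-slot rows, one row per encoding file. The function name group_by_row is hypothetical, and reading the cmap from the font is left out; the code points are simply passed in as a list.

```python
from collections import defaultdict


def group_by_row(codepoints):
    """Group code points into {row: [slots]} with one entry per enc file."""
    rows = defaultdict(list)
    for cp in sorted(set(codepoints)):
        if cp <= 0xFFFF:  # this sketch ignores anything above U+FFFF
            rows[cp >> 8].append(cp & 0xFF)
    return dict(rows)


# Hypothetical cmap content, given here as plain code points.
rows = group_by_row([0x52E1, 0x52DF, 0x4E00, 0x306E])
```

Only rows that appear as keys need an enc and tfm file; empty rows can be skipped.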
Sjoerd
* Re: Re: Chinese
2005-12-13 10:03 ` sjoerd siebinga
@ 2005-12-13 10:34 ` Hans Hagen
2005-12-13 11:26 ` sjoerd siebinga
2005-12-13 10:46 ` Re: Chinese Tobias Burnus
1 sibling, 1 reply; 11+ messages in thread
From: Hans Hagen @ 2005-12-13 10:34 UTC (permalink / raw)
sjoerd siebinga wrote:
> I have made a Ruby-script (for personal use loosely based on Adam's
> xsl-files) which generates all the encoding- and symbolfiles from a
> given cmapfile. If someone could send me the ttf-font, I can generate
> all the necessary encodingfiles for you.
the chinese fonts mentioned in the context garden qualify for such a
treatment (htsong cum suis)
Hans
* Re: Re: Chinese
2005-12-13 10:03 ` sjoerd siebinga
2005-12-13 10:34 ` Hans Hagen
@ 2005-12-13 10:46 ` Tobias Burnus
2005-12-13 10:56 ` Hans Hagen
1 sibling, 1 reply; 11+ messages in thread
From: Tobias Burnus @ 2005-12-13 10:46 UTC (permalink / raw)
Hi,
sjoerd siebinga wrote:
> I have made a Ruby-script (for personal use loosely based on Adam's
> xsl-files) which generates all the encoding- and symbolfiles from a
> given cmapfile. If someone could send me the ttf-font, I can generate
> all the necessary encodingfiles for you.
Nice! The recommended (by Xiao Jianfeng) TrueType fonts are given at
http://wiki.contextgarden.net/Chinese
They are
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/htfs.ttf
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/hthei.ttf
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/htkai.ttf
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/htsong.ttf
Richard Gabriel wrote:
> But yet another question: What about Japanese? I've made only small
> research so far, but unlike Chinese, there's almost no information
> about Japanese in TeX. How much of work would be to adjust the current
> "chinese" ConTeXt module for Japanese? What would you need for it?
> [Of course, meanwhile I'll investigate some other ways of typesetting
> Japanese...]
(I don't know much about Japanese.)
In Japanese, contrary to Chinese, they mix different character sets:
- The Chinese characters ("Kanji"), which seem to make up most of the
(scientific) text (I've seen);
in addition some pronunciation-based characters are used:
- ("Kana":) Hiragana and Katakana; the former are rather round
characters in Japanese texts, most prominent should be "の" [means
something like "of" in English]. They are mostly used for
suffixes/prefixes where no Chinese equivalent exists. Whereas Katakana
is used to write words which have been taken from (mostly) European
languages.
For Kanji there should be no problem with the Chinese module, for Kana
you need additional support for these characters. Since they are
pronunciation-based, they only consist of < 50 characters each.
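In Unicode terms the three scripts sit in separate ranges: Hiragana at U+3040 to U+309F, Katakana at U+30A0 to U+30FF, and the common Chinese characters at U+4E00 to U+9FFF. A small sketch of telling them apart follows; the function name classify is hypothetical, and only these three blocks are checked.

```python
def classify(ch):
    """Classify a single character by its Unicode block."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"  # CJK Unified Ideographs, shared with Chinese
    return "other"


kinds = [classify(ch) for ch in "の\u30ab\u52e1"]
```

Both kana blocks fall in row 30, so with the per-row encoding files discussed in this thread a single additional row would be enough to reach them.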
Tobias
(Hmm, I never thought I would end up so deep in linguistics during my
PhD thesis in physics. But having three Chinese in the group and
regularly doing some measurements at a research centre in Taiwan - I
couldn't help picking up something.)
* Re: Re: Chinese
2005-12-13 10:46 ` Re: Chinese Tobias Burnus
@ 2005-12-13 10:56 ` Hans Hagen
0 siblings, 0 replies; 11+ messages in thread
From: Hans Hagen @ 2005-12-13 10:56 UTC (permalink / raw)
Tobias Burnus wrote:
> (Hmm, I never thought I would end up so deep in linguistics during my
> PhD thesis in physics. But having three Chinese in the group and
> regularly doing some measurements at a research centre in Taiwan - I
> couldn't help picking up something.)
well, there is a certain charm in those characters, even if you cannot
read them (during a 2*10 hour trip in a chinese bus during the last tug
conference one quickly learns to recognize the symbols for gas stations
and such -)
browsing a chinese-english dictionary is also fun (i have a small one on
my desk; some day i should start collecting dictionaries of all
languages that context supports -); with a bit of puzzling one can find
out the system behind the way words are made up
Hans
* Re: Re: Chinese
2005-12-13 10:34 ` Hans Hagen
@ 2005-12-13 11:26 ` sjoerd siebinga
2005-12-13 13:02 ` your mails at go Hans Hagen
0 siblings, 1 reply; 11+ messages in thread
From: sjoerd siebinga @ 2005-12-13 11:26 UTC (permalink / raw)
On 13 Dec 2005, at 11:34, Hans Hagen wrote:
> sjoerd siebinga wrote:
>
>> I have made a Ruby-script (for personal use loosely based on
>> Adam's xsl-files) which generates all the encoding- and
>> symbolfiles from a given cmapfile. If someone could send me the
>> ttf-font, I can generate all the necessary encodingfiles for you.
>
> the chinese fonts mentioned in the context garden qualify for such
> a treatment (htsong cum suis)
>
Ok. Where can I send the chinese encodingfiles?
* Re: Re: Chinese
2005-12-13 9:52 ` Chinese Hans Hagen
2005-12-13 10:03 ` sjoerd siebinga
@ 2005-12-13 12:33 ` Adam Lindsay
2005-12-13 15:12 ` Hans Hagen
1 sibling, 1 reply; 11+ messages in thread
From: Adam Lindsay @ 2005-12-13 12:33 UTC (permalink / raw)
Hans Hagen wrote:
> what we need is a set of encoding files like
>
> /UniEncoding52 [
> ....
> /uni52DF
> /uni52E0
I hate to be negative, but I have doubts about how generic this approach
may be. In some tentative experiments, I discovered that many (most?)
CJK fonts don't use traditional postscript names, but rather map from
unicode to an indexed glyph number.
Fortunately, ttf2tfm's -w enco@Unicode@ notation seems to address this
in most of the old test cases I tried.
adam
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Adam T. Lindsay, Computing Dept. atl@comp.lancs.ac.uk
Lancaster University, InfoLab21 +44(0)1524/510.514
Lancaster, LA1 4WA, UK Fax:+44(0)1524/510.492
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
* your mails at go
2005-12-13 11:26 ` sjoerd siebinga
@ 2005-12-13 13:02 ` Hans Hagen
0 siblings, 0 replies; 11+ messages in thread
From: Hans Hagen @ 2005-12-13 13:02 UTC (permalink / raw)
sjoerd siebinga wrote:
>
> On 13 Dec 2005, at 11:34, Hans Hagen wrote:
>
>> sjoerd siebinga wrote:
>>
>>> I have made a Ruby-script (for personal use loosely based on
>>> Adam's xsl-files) which generates all the encoding- and
>>> symbolfiles from a given cmapfile. If someone could send me the
>>> ttf-font, I can generate all the necessary encodingfiles for you.
>>
>>
>> the chinese fonts mentioned in the context garden qualify for such a
>> treatment (htsong cum suis)
>>
>
> Ok. Where can I send the chinese encodingfiles?
you can send me a zip
maybe we should start thinking on how to set up a repository at
https://foundry.supelec.fr/
taco and patrick have more experience in this area than i have so maybe
they have some ideas on how to organize this
Hans
* Re: Re: Chinese
2005-12-13 12:33 ` Adam Lindsay
@ 2005-12-13 15:12 ` Hans Hagen
2005-12-13 15:29 ` Adam Lindsay
0 siblings, 1 reply; 11+ messages in thread
From: Hans Hagen @ 2005-12-13 15:12 UTC (permalink / raw)
Adam Lindsay wrote:
> Hans Hagen wrote:
>
>> what we need is a set of encoding files like
>>
>> /UniEncoding52 [
>> ....
>> /uni52DF
>> /uni52E0
>
>
> I hate to be negative, but I have doubts about how generic this
> approach may be. In some tentative experiments, I discovered that many
> (most?) CJK fonts don't use traditional postscript names, but rather
> map from unicode to an indexed glyph number.
>
> Fortunately, ttf2tfm's -w enco@Unicode@ notation seems to address this
> in most of the old test cases I tried.
afaik pdftex can handle the indexXXXX and uniXXXX entries as
alternatives for glyphnames
Hans
* Re: Re: Chinese
2005-12-13 15:12 ` Hans Hagen
@ 2005-12-13 15:29 ` Adam Lindsay
0 siblings, 0 replies; 11+ messages in thread
From: Adam Lindsay @ 2005-12-13 15:29 UTC (permalink / raw)
Hans Hagen wrote:
> Adam Lindsay wrote:
>> Fortunately, ttf2tfm's -w enco@Unicode@ notation seems to address this
>> in most of the old test cases I tried.
>
>
> afaik pdftex can handle the indexXXXX and uniXXXX entries as
> alternatives for glyphnames
Yes. Sorry I wasn't clear on that.
It's just that ttf2tfm is the tool that does a good job at extracting
those entries when other tools fail.
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Adam T. Lindsay, Computing Dept. atl@comp.lancs.ac.uk
Lancaster University, InfoLab21 +44(0)1524/510.514
Lancaster, LA1 4WA, UK Fax:+44(0)1524/510.492
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
end of thread, other threads:[~2005-12-13 15:29 UTC | newest]
Thread overview: 11+ messages
[not found] <20051212173143.94765127FA@ronja.ntg.nl>
2005-12-13 8:07 ` Chinese Duncan Hothersall
2005-12-13 9:52 ` Chinese Hans Hagen
2005-12-13 10:03 ` sjoerd siebinga
2005-12-13 10:34 ` Hans Hagen
2005-12-13 11:26 ` sjoerd siebinga
2005-12-13 13:02 ` your mails at go Hans Hagen
2005-12-13 10:46 ` Re: Chinese Tobias Burnus
2005-12-13 10:56 ` Hans Hagen
2005-12-13 12:33 ` Adam Lindsay
2005-12-13 15:12 ` Hans Hagen
2005-12-13 15:29 ` Adam Lindsay