ntg-context - mailing list for ConTeXt users
* MKIV Chinese typesetting
@ 2008-01-27  9:51 Yue Wang
  2008-01-28  2:17 ` Arthur Reutenauer
  0 siblings, 1 reply; 13+ messages in thread
From: Yue Wang @ 2008-01-27  9:51 UTC (permalink / raw)
  To: ntg-context

Dear Hans and other friends:

I have been playing a little with MKIV's Chinese support and find it
incomplete. It has some serious problems and lacks important features
as well.
I know from mk.pdf that Chinese support is under construction, so in
this email I just want to give some suggestions on horizontal Chinese
typesetting for development (vertical typesetting is more frequently
used in Hong Kong, Taiwan and Japan, so maybe people from there are
eager to help):

First, a feature many Chinese users would like:
Many Chinese users write essays using full-width punctuation (that is,
a punctuation mark is the same size as a Chinese character), and I
found that the punctuation is compressed well in ConTeXt with good
penalty settings (although there are minor problems in the PDF).
However, it is sometimes up to the publisher which kind of punctuation
is used, and many scientific books and papers use half-width
punctuation.
So I think a new feature should be added that maps all Chinese
punctuation marks to their English counterparts and, at the same time,
adds a space after each English punctuation mark. That is, changing
中文，中文
into
中文, 中文
Notice the space in the second example.
There is an OpenType feature called hwid, but it should not be used because:
- it changes the English letters as well
- in most Chinese fonts the feature is missing or buggy
- it does not automatically add the space
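As a rough illustration of the mapping proposed above, here is a minimal sketch in Python (chosen only for readability; MKIV itself would do this in Lua). The table `FULLWIDTH_TO_HALFWIDTH` and the function `map_punctuation` are hypothetical names, and the table covers only a few marks:

```python
# Hypothetical mapping table: full-width mark -> half-width replacement.
FULLWIDTH_TO_HALFWIDTH = {
    "\uFF0C": ",",  # fullwidth comma
    "\uFF1B": ";",  # fullwidth semicolon
    "\uFF1A": ":",  # fullwidth colon
    "\uFF01": "!",  # fullwidth exclamation mark
    "\uFF1F": "?",  # fullwidth question mark
}


def map_punctuation(text):
    """Replace full-width marks and add one space after each replacement."""
    out = []
    for i, ch in enumerate(text):
        half = FULLWIDTH_TO_HALFWIDTH.get(ch)
        if half is None:
            out.append(ch)
            continue
        out.append(half)
        # Add the space only inside the text, and only if none is there yet.
        if i + 1 < len(text) and not text[i + 1].isspace():
            out.append(" ")
    return "".join(out)


print(map_punctuation("中文\uFF0C中文"))  # 中文, 中文
```

Unlike hwid, a table like this leaves Latin letters alone and does not depend on what the font provides.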
These are the minor problems in mk.pdf:
- p. 118, penultimate example, box 2, line 1: the ' punctuation mark
should not appear at the end of the line
- p. 118, last example, box 2, line 2: for perfect Chinese typesetting,
every punctuation mark that begins or ends a line should sit close to
the margin, which makes the type block look good (especially when you
look closely at the right or left edge of the type block), whereas this
' has a lot of space before it. In English this is quite easy, and good
old TeX does a good job, because all the letters and punctuation marks
are designed carefully. In Chinese things are much more difficult
because all the glyphs in a Chinese font, including the punctuation
marks, have the same width, and what makes the problem worse is that
different Chinese fonts position these marks differently. So a dynamic
method that calculates the leftmost and rightmost edge of a
punctuation glyph should be used, and it is not available in
LuaTeX :( Will it be added later?
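The edge calculation asked for above could look roughly like this. It is a Python sketch under the assumption that the font loader can report a glyph's advance width and the left and right extent of its ink (its bounding box); `margin_kern` and the numbers in the example are made up for illustration and do not come from any real font:

```python
def margin_kern(advance_width, ink_left, ink_right, at_line_start):
    """Kern (in font units) that moves a glyph's ink flush with the margin.

    ink_left and ink_right are the horizontal extent of the glyph's
    outline, measured from the glyph origin.
    """
    if at_line_start:
        # Remove the empty space to the left of the ink.
        return -ink_left
    # At the end of a line, remove the empty space to the right of the ink.
    return -(advance_width - ink_right)


# Hypothetical full-width opening quote: ink sits in the right part of the em box.
print(margin_kern(1000, 600, 900, at_line_start=True))   # -600
# Hypothetical full-width comma: ink sits in the left part of the em box.
print(margin_kern(1000, 100, 350, at_line_start=False))  # -650
```

Because the kern is computed from each glyph's own outline, the same code would adapt to fonts that place their punctuation differently.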

Second, and most important, bilingual typesetting:
English words are frequently used in Chinese text nowadays, so great
care must be taken in handling them. Chinese users write their
manuscripts in one of two ways:
中文 English 中文
or
中文English中文
The first looks good in an editor with a monospaced font, and the
second is more convenient to type. So users of both styles will be
happy if both give the same result in ConTeXt.
In MKII, users are forced to write the following, which drives them
crazy and should be avoided:
中文English\ 中文
A small skip should be left between Chinese and English, which makes
the result much better. Usually the space is a quarter of the width of
a Chinese character. In TeX this would look like:
\hspace{0.25em plus 0.125em minus 0.08em}
but I think it is better to introduce a \setupxxxwidth for the user.
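A minimal sketch of how both input styles could be normalised to the same result, written in Python for illustration only (the real implementation would work on TeX's node list). The glue is represented by a plain tuple, and `is_cjk`, `is_latin` and `add_cjk_latin_glue` are hypothetical helpers that only look at basic CJK ideographs and ASCII letters and digits:

```python
# Natural width, stretch and shrink in em, taken from the skip quoted above.
GLUE = ("glue", 0.25, 0.125, 0.08)


def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def is_latin(ch):
    """True for ASCII letters and digits."""
    return ch.isascii() and ch.isalnum()


def add_cjk_latin_glue(text):
    """Return a list of characters with GLUE at every CJK/Latin boundary.

    A typed space at such a boundary is replaced by the glue, so both
    input styles give the same list. Other spaces are kept.
    """
    items = []
    prev = None            # last non-space character seen
    pending_space = False  # a space was typed since prev
    for ch in text:
        if ch == " ":
            pending_space = True
            continue
        if prev is not None:
            boundary = (is_cjk(prev) and is_latin(ch)) or (
                is_latin(prev) and is_cjk(ch)
            )
            if boundary:
                items.append(GLUE)
            elif pending_space:
                items.append(" ")
        items.append(ch)
        prev = ch
        pending_space = False
    return items


same = add_cjk_latin_glue("中文 English 中文") == add_cjk_latin_glue("中文English中文")
print(same)  # True
```

Spaces between two English words are not at a boundary, so they survive unchanged.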
The last important point for bilingual English and Chinese
typesetting: do not use the English glyphs in Chinese fonts, because:
- the English glyphs in most current Chinese fonts are ugly
- users have no right to modify the Chinese fonts
- many other reasons
So when setting up a Chinese typeface, it is better to let the user
choose the accompanying English typeface.
During typesetting, the Chinese typography rules must not interfere
with the English ones; that is, the Chinese and English parts each use
their own line-breaking model.
Here are two bugs in the current ConTeXt:
- 中文 English English English 中文 will produce 中文EnglishEnglishEnglish中文;
the spaces between the English words are all missing.
- the following script produces an error: Invalid field id penalty for
node type glyph (1). Here is a sample:
\definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]
\definefontsynonym[songti][name:AdobeSongStd-Light][features=Chinese]
\definefontsynonym[Serif][songti]
\definetypeface[song][rm][serif][songti]
\setupbodyfont[song,12pt,rm]
\starttext 中文 English: 中文 \stoptext

Third, indenting:
Chinese paragraphs are indented by two Chinese characters, and the
first paragraph is indented as well. ConTeXt is so powerful that this
kind of thing is easy to do, thank you :)
I think it would be better to make this part of the Chinese
typesetting module?

Urgency: bilingual typesetting > fixing the wrong penalty > mapping to
English punctuation marks as an alternative > indenting > punctuation
close to the margins.

I am eager to see full support for Chinese and other Asian languages
(and I am eager to help too), and I hope my suggestions here are
helpful to the developers.
I am grateful to Hans, Taco and many others (e.g. Zhichu Chen) who
have done a very good job on the recent MKIV Chinese support. Thank
you for your time, effort and continuous contributions!

Yue Wang
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: MKIV Chinese typesetting
  2008-01-27  9:51 MKIV Chinese typesetting Yue Wang
@ 2008-01-28  2:17 ` Arthur Reutenauer
  2008-01-28 16:15   ` Hans Hagen
                     ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Arthur Reutenauer @ 2008-01-28  2:17 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

	Hello,

  Thanks for this comprehensive review.  If I'm not mistaken, there is
no specific code for CJKV typesetting in Mark IV; the examples in mk.pdf
seem to use the generic font loading mechanism.

  I would like to answer more completely, but don't have much time for
the moment.  About some of your remarks:

> so I think a new feature should be added to map all the Chinese puncts
> into english while at the same time, a space should be added after the
> English punct marks.

  Would it not be better to automatically add shrinkable glue after
Chinese punctuation, rather than replacing the character by force?  This
would be very much in line with the general TeX philosophy of setting
text (and would probably suppress the need for half-width forms in the
font altogether).

> - pp118, penultimate example, box 2, line1, the ' punct mark should
> not appear at the end of the line

  This should be taken care of by adding an appropriate penalty before
the character.

> - pp118, ultimate example, box 2, line2, in fact, if you want do
> perfect Chinese typesetting, all the puncts which begin a line or end
> a line should be closed to the margin line

  Do you mean simply closer to the margin, or in the margin itself
(protruding)? Protruding is already possible in pdfTeX; I believe it is
available in LuaTeX as well, although it might be broken for the moment
(Taco?).  Setting the character closer to the margin should be possible
as well, as a modified form of protruding, I trust.

> A small skip should be left between Chinese and English which makes
> the result much better. usually the space is a quarter of a chinese
> character width. A TeX expression should like:
> \hspace{0.25em plus 0.125em minus 0.08em}

  Again, this can be taken care of by automatically adding this glue
between pairs of characters of the appropriate category.

> The last important thing for English and Chinese bi-lingual
> typesetting is that: do not use English glyphs in Chinese fonts

  Sure, there should be a possibility of specifying a Western font to be
used inside Chinese text.

> - the following script produce an error: Invalid field id penalty for
> node type glyph (1).

  I don't have that error here.  This is a very big font; are you sure it
has been read entirely and correctly written to the cache?  Lua crashed
on my machine when I first compiled your example, and only a partial
font hash was written to the cache (ConTeXt didn't crash, so the first
compilation apparently ended well, but the cache was already filled with
a partial font).  I can imagine that problems will arise in the presence
of a partially hashed font in the cache.

  Anyway, the code looks quite weird to me:

> \definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]

  This means that you activate two different scripts at the same time
(hang == Hangul and hani == Han ideographs), and also two languages at
the same time (zht == Chinese Traditional and dlft is probably a typo
for dflt == default).  I can't imagine what that is supposed to mean,
and activating Traditional Chinese is probably wrong with Adobe Song Std
which is a Simplified Chinese font.  A saner definition of that feature
would be in my opinion:

	\definefontfeature[chinese-traditional][mode=node,script=hani,lang=zhs]

  I know this code comes from mk.pdf, but I think it is a mistake.
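  A small Python sketch that makes the problem visible by listing keys given more than once in a feature specification. `duplicate_keys` is a hypothetical helper; how ConTeXt itself resolves repeated keys is not something this sketch tries to model:

```python
def duplicate_keys(spec):
    """Return {key: [values]} for every key that occurs more than once.

    spec is a comma-separated list of key=value pairs, as in the second
    argument of the feature definition quoted above.
    """
    seen = {}
    for item in spec.split(","):
        key, _, value = item.partition("=")
        seen.setdefault(key.strip(), []).append(value.strip())
    return {key: values for key, values in seen.items() if len(values) > 1}


spec = "mode=node,script=hang,lang=zht,script=hani,lang=dlft"
print(duplicate_keys(spec))
# {'script': ['hang', 'hani'], 'lang': ['zht', 'dlft']}
```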

  Finally, there is an interesting article from a couple of years ago by
Jin-Hwan Cho (the dvipdfmx author) and Haruhiko Okumura about CJKV
typesetting with Omega.  They have implemented all of the rules you
mention above and a bit more; and although they used OTPs at the time,
it should be quite straightforward to transpose it to Lua code
(actually, I did that a couple of months ago, but I used plain LuaTeX,
and in ConTeXt it should probably be done using node processors or
something).

	http://project.ktug.or.kr/omega-cjk/tug2004-preprint.pdf

		Arthur

* Re: MKIV Chinese typesetting
  2008-01-28  2:17 ` Arthur Reutenauer
@ 2008-01-28 16:15   ` Hans Hagen
  2008-01-28 17:07     ` Arthur Reutenauer
  2008-01-28 16:26   ` Wolfgang Schuster
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Hans Hagen @ 2008-01-28 16:15 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Arthur Reutenauer wrote:

>   Thanks for this comprehensive review.  If I'm not mistaken, there is
> no specific code for CJKV typesetting in Mark IV; the examples in mk.pdf
> seem to use the generic font loading mechanism.
> 
>   I would like to answer more completely, but don't have much time for
> the moment.  About some of your remarks:

actually, there is code in there but you need to specify chinese as feature

\definefontfeature
   [chinese-traditional]
   [mode=node,script=hang,lang=zht]
\definefontfeature
   [chinese-simple]
   [mode=node,script=hang,lang=zhs]


>> so I think a new feature should be added to map all the Chinese puncts
>> into english while at the same time, a space should be added after the
>> English punct marks.

>   Would it not be better to automatically add shrinkable glue after
> Chinese punctuation, rather than replacing the character by force?  This
> would be very much in line with the general TeX philosophy of setting
> text (and would probably suppress the need for half-width forms in the
> font altogether).

there are penalties and glue nodes injected (based on specs given by
some users)

>> - pp118, penultimate example, box 2, line1, the ' punct mark should
>> not appear at the end of the line

probably an old mk.pdf (i'm awaiting some feedback before i post a new one)

>   This should be taken care of by adding an appropriate penalty before
> the character.

adding penalties is done based on a couple of tables

>> - pp118, ultimate example, box 2, line2, in fact, if you want do
>> perfect Chinese typesetting, all the puncts which begin a line or end
>> a line should be closed to the margin line
> 
>   Do you mean simply closer to the margin, or in the margin itself
> (protruding)? Protruding is already possible in pdfTeX; I believe it is
> available in LuaTeX as well, although it might be broken for the moment
> (Taco?).  Setting the character closer to the margin should be possible
> as well, as a modified form of protruding, I trust.

this is always a bit of a trade off; i use samples with small width so at
some point you run into tex optimizing situations; i'll make things
configurable

>> A small skip should be left between Chinese and English which makes
>> the result much better. usually the space is a quarter of a chinese
>> character width. A TeX expression should like:
>> \hspace{0.25em plus 0.125em minus 0.08em}
> 
>   Again, this can be taken care of by automatically adding this glue
> between pairs of characters of the appropriate category.
> 
>> The last important thing for English and Chinese bi-lingual
>> typesetting is that: do not use English glyphs in Chinese fonts
> 
>   Sure, there should be a possibility of specifying a Western font to be
> used inside Chinese text.

font switching; i still have to look into mixed fonts

>> - the following script produce an error: Invalid field id penalty for
>> node type glyph (1).
> 
>   I don't have that error here.  This is a very big font; are you sure it
> has been read entirely and correctly written to the cache?  Lua crashed
> on my machine when I first compiled your example, and only a partial
> font hash was written to the cache (ConTeXt didn't crash, so the first
> compilation apparently ended well, but the cache was already filled with
> a partial font).  I can imagine that problems will arise in the presence
> of a partially hashed font in the cache.
> 
>   Anyway, the code looks quite weird to me:
> 
>> \definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]
> 
>   This means that you activate two different scripts at the same time
> (hang == Hangul and hani == Han ideographs), and also two languages at
> the same time (zht == Chinese Traditional and dlft is probably a typo
> for dflt == default).  I can't imagine what that is supposed to mean,
> and activating Traditional Chinese is probably wrong with Adobe Song Std
> which is a Simplified Chinese font.  A saner definition of that feature
> would be in my opinion:

indeed this disables chinese ...

> 	\definefontfeature[chinese-traditional][mode=node,script=hani,lang=zhs]
> 
>   I know this code comes from mk.pdf, but I think it is a mistake.
> 
>   Finally, there is an interesting article by Jin-Hwan Cho (the dvipdfmx
> author) and Haruhiko Okumura about CJKV typesetting with Omega a couple
> of years ago.  They have implemented all of the rules you mention above
> and a bit more; and although they used OTPs at the time, it should be
> quite straightforward to transpose it in Lua code (actually, I've done it
> a couple of months ago, but I have used plain LuaTeX, and in ConTeXt it
> should probably be done using node processors or something).

indeed

> 	http://project.ktug.or.kr/omega-cjk/tug2004-preprint.pdf

i'll have a look

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: MKIV Chinese typesetting
  2008-01-28  2:17 ` Arthur Reutenauer
  2008-01-28 16:15   ` Hans Hagen
@ 2008-01-28 16:26   ` Wolfgang Schuster
  2008-01-28 23:19     ` Arthur Reutenauer
  2008-01-28 16:43   ` Taco Hoekwater
  2008-01-28 17:05   ` Yue Wang
  3 siblings, 1 reply; 13+ messages in thread
From: Wolfgang Schuster @ 2008-01-28 16:26 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

On Jan 28, 2008 3:17 AM, Arthur Reutenauer
<arthur.reutenauer@normalesup.org> wrote:
>         Hello,
>
>   Thanks for this comprehensive review.  If I'm not mistaken, there is
> no specific code for CJKV typesetting in Mark IV; the examples in mk.pdf
> seem to use the generic font loading mechanism.

This is wrong: font-otf contains a few Lua functions for line breaking,
and char-def has information about character width (full width, half
width ...) and other properties such as opening punctuation and
parentheses, but none of this is finished.

>   I would like to answer more completely, but don't have much time for
> the moment.  About some of your remarks:
>
> > so I think a new feature should be added to map all the Chinese puncts
> > into english while at the same time, a space should be added after the
> > English punct marks.
>
>   Would it not be better to automatically add shrinkable glue after
> Chinese punctuation, rather than replacing the character by force?  This
> would be very much in line with the general TeX philosophy of setting
> text (and would probably suppress the need for half-width forms in the
> font altogether).
>
> > - pp118, penultimate example, box 2, line1, the ' punct mark should
> > not appear at the end of the line
>
>   This should be taken care of by adding an appropriate penalty before
> the character.
>
> > - pp118, ultimate example, box 2, line2, in fact, if you want do
> > perfect Chinese typesetting, all the puncts which begin a line or end
> > a line should be closed to the margin line
>
>   Do you mean simply closer to the margin, or in the margin itself
> (protruding)? Protruding is already possible in pdfTeX; I believe it is
> available in LuaTeX as well, although it might be broken for the moment
> (Taco?).  Setting the character closer to the margin should be possible
> as well, as a modified form of protruding, I trust.
>
> > A small skip should be left between Chinese and English which makes
> > the result much better. usually the space is a quarter of a chinese
> > character width. A TeX expression should like:
> > \hspace{0.25em plus 0.125em minus 0.08em}
>
>   Again, this can be taken care of by automatically adding this glue
> between pairs of characters of the appropriate category.
>
> > The last important thing for English and Chinese bi-lingual
> > typesetting is that: do not use English glyphs in Chinese fonts
>
>   Sure, there should be a possibility of specifying a Western font to be
> used inside Chinese text.

Could be done with virtual fonts but we need an interface.

> > - the following script produce an error: Invalid field id penalty for
> > node type glyph (1).
>
>   I don't have that error here.  This is a very big font; are you sure it
> has been read entirely and correctly written to the cache?  Lua crashed
> on my machine when I first compiled your example, and only a partial
> font hash was written to the cache (ConTeXt didn't crash, so the first
> compilation apparently ended well, but the cache was already filled with
> a partial font).  I can imagine that problems will arise in the presence
> of a partially hashed font in the cache.
>
>   Anyway, the code looks quite weird to me:
>
> > \definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]
>
>   This means that you activate two different scripts at the same time
> (hang == Hangul and hani == Han ideographs), and also two languages at
> the same time (zht == Chinese Traditional and dlft is probably a typo
> for dflt == default).  I can't imagine what that is supposed to mean,
> and activating Traditional Chinese is probably wrong with Adobe Song Std
> which is a Simplified Chinese font.  A saner definition of that feature
> would be in my opinion:
>
>         \definefontfeature[chinese-traditional][mode=node,script=hani,lang=zhs]

You need the hang script; it takes care of the line breaking.

>   I know this code comes from mk.pdf, but I think it is a mistake.
>
>   Finally, there is an interesting article by Jin-Hwan Cho (the dvipdfmx
> author) and Haruhiko Okumura about CJKV typesetting with Omega a couple
> of years ago.  They have implemented all of the rules you mention above
> and a bit more; and although they used OTPs at the time, it should be
> quite straightforward to transpose it in Lua code (actually, I've done it
> a couple of months ago, but I have used plain LuaTeX, and in ConTeXt it
> should probably be done using node processors or something).
>
>         http://project.ktug.or.kr/omega-cjk/tug2004-preprint.pdf

This is currently done in font-otf.lua.

Greetings,

Wolfgang

* Re: MKIV Chinese typesetting
  2008-01-28  2:17 ` Arthur Reutenauer
  2008-01-28 16:15   ` Hans Hagen
  2008-01-28 16:26   ` Wolfgang Schuster
@ 2008-01-28 16:43   ` Taco Hoekwater
  2008-01-28 20:17     ` Hans Hagen
  2008-01-28 17:05   ` Yue Wang
  3 siblings, 1 reply; 13+ messages in thread
From: Taco Hoekwater @ 2008-01-28 16:43 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Arthur Reutenauer wrote:
> 
>   Do you mean simply closer to the margin, or in the margin itself
> (protruding)? Protruding is already possible in pdfTeX; I believe it is
> available in LuaTeX as well, although it might be broken for the moment
> (Taco?).  

Protrusion should be available in luatex as well, but it may be
incompatible with the mkiv code.


Best wishes,
Taco

* Re: MKIV Chinese typesetting
  2008-01-28  2:17 ` Arthur Reutenauer
                     ` (2 preceding siblings ...)
  2008-01-28 16:43   ` Taco Hoekwater
@ 2008-01-28 17:05   ` Yue Wang
  3 siblings, 0 replies; 13+ messages in thread
From: Yue Wang @ 2008-01-28 17:05 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Thank you very much for your mail!

On Mon, Jan 28, 2008 at 10:17 AM, Arthur Reutenauer
<arthur.reutenauer@normalesup.org> wrote:
>         Hello,
>
>   Thanks for this comprehensive review.  If I'm not mistaken, there is
>  no specific code for CJKV typesetting in Mark IV; the examples in mk.pdf
>  seem to use the generic font loading mechanism.
>

Yes, there is; see the last part of font-otf.lua.

>   I would like to answer more completely, but don't have much time for
>  the moment.  About some of your remarks:
>
>

Thank you for your time and effort:)

>  > so I think a new feature should be added to map all the Chinese puncts
>  > into english while at the same time, a space should be added after the
>  > English punct marks.
>
>   Would it not be better to automatically add shrinkable glue after
>  Chinese punctuation, rather than replacing the character by force?  This
>  would be very much in line with the general TeX philosophy of setting
>  text (and would probably suppress the need for half-width forms in the
>  font altogether).
>

Sorry, I made a mistake here, forgive me.
According to the official Chinese rules, Chinese punctuation should
not be mapped to English punctuation, sorry about that.
But there are two kinds of full stop in Chinese: one is a circle, the
other is a dot. In Chinese scientific typesetting we should usually
map the circle full stop to the dot.


>
>  > - pp118, penultimate example, box 2, line1, the ' punct mark should
>  > not appear at the end of the line
>
>   This should be taken care of by adding an appropriate penalty before
>  the character.

You are right :) There must be some problem in the penalty settings in
font-otf.lua, but I need some time to trace where. I think we should do
something after the three elseif branches at lines 4563, 4579 and 4588.

>
>
>  > - pp118, ultimate example, box 2, line2, in fact, if you want do
>  > perfect Chinese typesetting, all the puncts which begin a line or end
>  > a line should be closed to the margin line
>
>   Do you mean simply closer to the margin, or in the margin itself
>  (protruding)? Protruding is already possible in pdfTeX; I believe it is
>  available in LuaTeX as well, although it might be broken for the moment
>  (Taco?).  Setting the character closer to the margin should be possible
>  as well, as a modified form of protruding, I trust.

Closer to the margin, not in the margin.
It is possible, but we don't know how much to adjust because the
punctuation marks sit at different positions in different fonts. Of
course, we could choose an adjustment that suits most fonts.

>
>
>  > A small skip should be left between Chinese and English which makes
>  > the result much better. usually the space is a quarter of a chinese
>  > character width. A TeX expression should like:
>  > \hspace{0.25em plus 0.125em minus 0.08em}
>
>   Again, this can be taken care of by automatically adding this glue
>  between pairs of characters of the appropriate category.
>

Yes, and I think this should be added to font-otf.lua as well.

>
>  > The last important thing for English and Chinese bi-lingual
>  > typesetting is that: do not use English glyphs in Chinese fonts
>
>   Sure, there should be a possibility of specifying a Western font to be
>  used inside Chinese text.

Yes, and I think users should be given an option for this when they
set up their accompanying fonts.

>
>
>  > - the following script produce an error: Invalid field id penalty for
>  > node type glyph (1).
>
>   I don't have that error here.  This is a very big font; are you sure it
>  has been read entirely and correctly written to the cache?  Lua crashed
>  on my machine when I first compiled your example, and only a partial
>  font hash was written to the cache (ConTeXt didn't crash, so the first
>  compilation apparently ended well, but the cache was already filled with
>  a partial font).  I can imagine that problems will arise in the presence
>  of a partially hashed font in the cache.
>

I am sure Lua parsed it correctly (I get the tma and tmc files in the cache).
I am using the 01.16 beta.

>   Anyway, the code looks quite weird to me:
>
>
>  > \definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]
>
>   This means that you activate two different scripts at the same time
>  (hang == Hangul and hani == Han ideographs), and also two languages at
>  the same time (zht == Chinese Traditional and dlft is probably a typo
>  for dflt == default).  I can't imagine what that is supposed to mean,
>  and activating Traditional Chinese is probably wrong with Adobe Song Std
>  which is a Simplified Chinese font.  A saner definition of that feature
>  would be in my opinion:
>
>         \definefontfeature[chinese-traditional][mode=node,script=hani,lang=zhs]
>
>   I know this code comes from mk.pdf, but I think it is a mistake.
>

Umm... it is a mess...
What does hang mean?
Maybe it refers to fonts.analyzers.methods.hang and
fonts.analyzers.methods.hani in font-otf.lua (lines 4505 and 4583),
which are used to adjust the penalty between different CJK categories?

>   Finally, there is an interesting article by Jin-Hwan Cho (the dvipdfmx
>  author) and Haruhiko Okumura about CJKV typesetting with Omega a couple
>  of years ago.  They have implemented all of the rules you mention above
>  and a bit more; and although they used OTPs at the time, it should be
>  quite straightforward to transpose it in Lua code (actually, I've done it
>  a couple of months ago, but I have used plain LuaTeX, and in ConTeXt it
>  should probably be done using node processors or something).

Thank you for the link. In fact, many of the rules appear in the last
part of font-otf.lua, but the implementation is incomplete.
Chinese typesetting is easier than English typesetting because in
Chinese we can break the line at any character and no hyphenation
algorithm is needed.
The only issues are the spaces between punctuation marks and the
penalties before and after them.
When English words are introduced, we also have to take font switching
and the glue between Chinese and English words into account.
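The line-breaking rule described here (a break is allowed between any two characters except where punctuation forbids it) can be sketched as follows. This is Python for illustration, not the code in font-otf.lua; the two character sets are short, hypothetical samples rather than complete tables:

```python
# Marks that must not start a line (closing punctuation and stops).
NO_LINE_START = set("，。、；：！？）》」』’”")
# Marks that must not end a line (opening punctuation).
NO_LINE_END = set("（《「『‘“")


def allowed_breaks(text):
    """Return every index i where a break between text[i-1] and text[i] is allowed."""
    breaks = []
    for i in range(1, len(text)):
        if text[i] in NO_LINE_START:
            continue  # the next line would start with a forbidden mark
        if text[i - 1] in NO_LINE_END:
            continue  # the current line would end with a forbidden mark
        breaks.append(i)
    return breaks


print(allowed_breaks("他说：“中文。”好"))  # [1, 3, 5, 8]
```

In TeX terms, each skipped position corresponds to a high penalty and each listed position to a legal breakpoint.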

>
>         http://project.ktug.or.kr/omega-cjk/tug2004-preprint.pdf
>
>                 Arthur
>

* Re: MKIV Chinese typesetting
  2008-01-28 16:15   ` Hans Hagen
@ 2008-01-28 17:07     ` Arthur Reutenauer
  0 siblings, 0 replies; 13+ messages in thread
From: Arthur Reutenauer @ 2008-01-28 17:07 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> actually, there is code in there but you need to specify chinese as feature
> 
> \definefontfeature
>    [chinese-traditional]
>    [mode=node,script=hang,lang=zht]
> \definefontfeature
>    [chinese-simple]
>    [mode=node,script=hang,lang=zhs]

  OK, but "hang" should still be replaced by "hani" if you want to use
OpenType features for Chinese (Traditional or Simplified).
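
  Applied to the two definitions quoted above, that change would look
roughly like the fragment below.  It is only a sketch: the feature names
and the lang= key are copied unchanged from the quoted message, and it
assumes the font actually provides the hani script.  Whether this MkIV
snapshot still starts its CJK line-break analyser without script=hang is
not confirmed here.

```tex
% Sketch only: the quoted definitions with the script tag changed to hani.
% Assumption: the font provides the hani script with ZHT/ZHS language
% systems; feature names and keys are copied from the quoted message.
\definefontfeature
   [chinese-traditional]
   [mode=node,script=hani,lang=zht]
\definefontfeature
   [chinese-simple]
   [mode=node,script=hani,lang=zhs]
```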

	Arthur

* Re: MKIV Chinese typesetting
  2008-01-28 16:43   ` Taco Hoekwater
@ 2008-01-28 20:17     ` Hans Hagen
  2008-01-28 20:24       ` Arthur Reutenauer
  0 siblings, 1 reply; 13+ messages in thread
From: Hans Hagen @ 2008-01-28 20:17 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Taco Hoekwater wrote:
> Arthur Reutenauer wrote:
>>   Do you mean simply closer to the margin, or in the margin itself
>> (protruding)? Protruding is already possible in pdfTeX; I believe it is
>> available in LuaTeX as well, although it might be broken for the moment
>> (Taco?).  
> 
> Protrusion should be available in luatex as well, but it may be
> incompatible with the mkiv code.

i'm not going to waste time on protruding in mkiv, later this year we 
will have proper font related protruding and hz tables and then i will 
pick up that thread

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: MKIV Chinese typesetting
  2008-01-28 20:17     ` Hans Hagen
@ 2008-01-28 20:24       ` Arthur Reutenauer
  0 siblings, 0 replies; 13+ messages in thread
From: Arthur Reutenauer @ 2008-01-28 20:24 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> i'm not going to waste time on protruding in mkiv, later this year we 
> will have proper font related protruding and hz tables and then i will 
> pick up that thread

  Anyway, if you read Yue's reply, he says the glyphs should not
protrude in Chinese anyway ;-)  But I'm pretty sure they can in Japanese
(for some typographers at least).

	Arthur

* Re: MKIV Chinese typesetting
  2008-01-28 16:26   ` Wolfgang Schuster
@ 2008-01-28 23:19     ` Arthur Reutenauer
  2008-01-28 23:22       ` Hans Hagen
                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Arthur Reutenauer @ 2008-01-28 23:19 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> This is wrong, font-otf contains a few Lua macros about linebreaking
> and char-def has information about the character width (full width,
> half width ...)
> and other information like opening punctuation, parenthesis but none
> of them is finished.

  OK, I thought line breaking would be managed in node-*, so I didn't
look in font-otf for it.

>>   Sure, there should be a possibility of specifying a Western font to be
>> used inside Chinese text.
> 
> Could be done with virtual fonts but we need an interface.

  Sure, no need to rush things.

> You need the hang script, it takes care of the linebreak.

  What do you mean?  How does it take care of the linebreak?  And how
can it be relevant for Chinese characters?  Default Chinese fonts from
Adobe like AdobeSongStd don't have a "hang" script at all anyway.  Do
you know fonts that have?

	Arthur

* Re: MKIV Chinese typesetting
  2008-01-28 23:19     ` Arthur Reutenauer
@ 2008-01-28 23:22       ` Hans Hagen
  2008-01-28 23:25       ` Hans Hagen
  2008-01-29 11:25       ` Wolfgang Schuster
  2 siblings, 0 replies; 13+ messages in thread
From: Hans Hagen @ 2008-01-28 23:22 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Arthur Reutenauer wrote:

>> You need the hang script, it takes care of the linebreak.
> 
>   What do you mean?  How does it take care of the linebreak?  And how
> can it be relevant for Chinese characters?  Default Chinese fonts from
> Adobe like AdobeSongStd don't have a "hang" script at all anyway.  Do
> you know fonts that have?

hey, i just gambled ... it's you who have to tell me what script/lang 
combinations to use; i just needed a value to kickstart the analyser and 
nobody bothered to correct me

(same for arab, i just picked some)

you don't seriously think that i can read chinese eh?

Hans


* Re: MKIV Chinese typesetting
  2008-01-28 23:19     ` Arthur Reutenauer
  2008-01-28 23:22       ` Hans Hagen
@ 2008-01-28 23:25       ` Hans Hagen
  2008-01-29 11:25       ` Wolfgang Schuster
  2 siblings, 0 replies; 13+ messages in thread
From: Hans Hagen @ 2008-01-28 23:25 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Arthur Reutenauer wrote:

> Adobe like AdobeSongStd don't have a "hang" script at all anyway.  Do
> you know fonts that have?

btw, the same is true for japanese and korean ... i like these glyphs 
and playing with them but i need input from users on how to organize 
things, i.e. script/lang combinations and rules for treating them so 
that i can write the analyzers

Hans


* Re: MKIV Chinese typesetting
  2008-01-28 23:19     ` Arthur Reutenauer
  2008-01-28 23:22       ` Hans Hagen
  2008-01-28 23:25       ` Hans Hagen
@ 2008-01-29 11:25       ` Wolfgang Schuster
  2 siblings, 0 replies; 13+ messages in thread
From: Wolfgang Schuster @ 2008-01-29 11:25 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

On Jan 29, 2008 12:19 AM, Arthur Reutenauer
<arthur.reutenauer@normalesup.org> wrote:
> > This is wrong, font-otf contains a few Lua macros about linebreaking
> > and char-def has information about the character width (full width,
> > half width ...)
> > and other information like opening punctuation, parenthesis but none
> > of them is finished.
>
>   OK, I thought line breaking would be managed in node-*, so I didn't
> look in font-otf for it.
>
> >>   Sure, there should be a possibility of specifying a Western font to be
> >> used inside Chinese text.
> >
> > Could be done with virtual fonts but we need an interface.
>
>   Sure, no need to rush things.
>
> > You need the hang script, it takes care of the linebreak.
>
>   What do you mean?  How does it take care of the linebreak?  And how
> can it be relevant for Chinese characters?  Default Chinese fonts from
> Adobe like AdobeSongStd don't have a "hang" script at all anyway.  Do
> you know fonts that have?

You need the hang script in \definefontfeature to enable ConTeXt's line
breaking for CJK; don't ask me why you have to use it as the value for script.

Wolfgang

end of thread, other threads:[~2008-01-29 11:25 UTC | newest]

Thread overview: 13+ messages
2008-01-27  9:51 MKIV Chinese typesetting Yue Wang
2008-01-28  2:17 ` Arthur Reutenauer
2008-01-28 16:15   ` Hans Hagen
2008-01-28 17:07     ` Arthur Reutenauer
2008-01-28 16:26   ` Wolfgang Schuster
2008-01-28 23:19     ` Arthur Reutenauer
2008-01-28 23:22       ` Hans Hagen
2008-01-28 23:25       ` Hans Hagen
2008-01-29 11:25       ` Wolfgang Schuster
2008-01-28 16:43   ` Taco Hoekwater
2008-01-28 20:17     ` Hans Hagen
2008-01-28 20:24       ` Arthur Reutenauer
2008-01-28 17:05   ` Yue Wang
