From: "Yue Wang"
Newsgroups: gmane.comp.tex.context
Subject: Re: MKIV Chinese typesetting
Date: Tue, 29 Jan 2008 01:05:52 +0800
Message-ID: <68bfdc900801280905w7c961075t33adbcc30e013c26@mail.gmail.com>
References: <68bfdc900801270151u7e763fa6y7d8935fc35637b3f@mail.gmail.com> <20080128021711.GA13410@phare.normalesup.org>
Reply-To: mailing list for ConTeXt users
To: "Mailing list for ConTeXt users"
In-Reply-To: <20080128021711.GA13410@phare.normalesup.org>

Thank you very much for your mail!

On Mon, Jan 28, 2008 at 10:17 AM, Arthur Reutenauer wrote:
> Hello,
>
> Thanks for this comprehensive review. If I'm not mistaken, there is
> no specific code for CJKV typesetting in Mark IV; the examples in mk.pdf
> seem to use the generic font loading mechanism.
>

Yes, there is; see the last part of font-otf.lua.

> I would like to answer more completely, but don't have much time for
> the moment. About some of your remarks:
>

Thank you for your time and effort :)

> > so I think a new feature should be added to map all the Chinese puncts
> > into english while at the same time, a space should be added after the
> > English punct marks.
>
> Would it not be better to automatically add shrinkable glue after
> Chinese punctuation, rather than replacing the character by force? This
> would be very much in line with the general TeX philosophy of setting
> text (and would probably suppress the need for half-width forms in the
> font altogether).
>

Sorry, I made a mistake here; please forgive me. According to the
official Chinese punctuation rules, Chinese punctuation marks should
not be mapped to English ones. However, there are two kinds of full
stop in Chinese: one is a circle, the other is a dot. In Chinese
scientific typesetting we usually map the circle full stop to the dot.

> >
> > - pp118, penultimate example, box 2, line1, the ' punct mark should
> > not appear at the end of the line
>
> This should be taken care of by adding an appropriate penalty before
> the character.

You are right :) There must be a problem with the penalty settings in
font-otf.lua, but I need some time to trace where. I think we should
do something after the three elseif branches at lines 4563, 4579 and
4588.

> >
> > - pp118, ultimate example, box 2, line2, in fact, if you want do
> > perfect Chinese typesetting, all the puncts which begin a line or end
> > a line should be closed to the margin line
>
> Do you mean simply closer to the margin, or in the margin itself
> (protruding)? Protruding is already possible in pdfTeX; I believe it is
> available in LuaTeX as well, although it might be broken for the moment
> (Taco?). Setting the character closer to the margin should be possible
> as well, as a modified form of protruding, I trust.

Closer to the margin, not in the margin. It is possible, but we don't
know by how much to adjust, because the punctuation glyphs sit at
different positions in different fonts. Of course, we could choose an
adjustment that suits most fonts.

> >
> > A small skip should be left between Chinese and English which makes
> > the result much better. usually the space is a quarter of a chinese
> > character width. A TeX expression should like:
> > \hspace{0.25em plus 0.125em minus 0.08em}
>
> Again, this can be taken care of by automatically adding this glue
> between pairs of character of the appropriate category.
>

Yes, and I think this should be added to font-otf.lua as well.
> >
> > The last important thing for English and Chinese bi-lingual
> > typesetting is that: do not use English glyphs in Chinese fonts
>
> Sure, there should be a possibility of specifying a Western font to be
> used inside Chinese text.

Yes, and I think users should be given an option for this when they
set up their accompanying fonts.

> >
> > - the following script produce an error: Invalid field id penalty for
> > node type glyph (1).
>
> I don't have that error here. This is very big font; are you sure it
> has been read entirely and correctly written to the cache? Lua crashed
> on my machine when I first compiled your example, and only a partial
> font hash was written to the cache (ConTeXt didn't crash, so the first
> compilation apparently ended well, but the cache was already filled with
> a partial font). I can imagine that problems will arise in the presence
> of a partially hashed font in the cache.
>

I am sure Lua parsed it correctly (I get the tma and tmc files in the
cache). I am using the 01.16 beta.

> Anyway, the code looks quite weird to me:
>
> > \definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]
>
> This means that you activate two different scripts at the same time
> (hang == Hangul and hani == Han ideographs), and also two languages at
> the same time (zht == Chinese Traditional and dlft is probably a typo
> for dflt == default). I can't imagine what that is supposed to mean,
> and activating Traditional Chinese is probably wrong with Adobe Song Std
> which is a Simplified Chinese font. A saner definition of that feature
> would be in my opinion:
>
> \definefontfeature[chinese-traditional][mode=node,script=hani,lang=zhs]
>
> I know this code comes from mk.pdf, but I think it is a mistake.
>

Hmm... it is a mess. What does hang mean here? Maybe it refers to
fonts.analyzers.methods.hang and fonts.analyzers.methods.hani in
font-otf.lua (lines 4505 and 4583), which are used to adjust the
penalties between different CJK categories?
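The repeated keys that Arthur points out can be found mechanically. A
minimal Python sketch, assuming the feature list is a plain
comma-separated key=value string; it does not model how ConTeXt itself
resolves repeated keys, and the function names and the KNOWN_LANGS set
(limited to the language tags mentioned in this thread) are
hypothetical.

```python
# Language tags mentioned in this thread only; not a complete list.
KNOWN_LANGS = {"dflt", "zhs", "zht"}


def collect_features(spec):
    """Map each key of a 'key=value,key=value' string to all its values."""
    values = {}
    for item in spec.split(","):
        key, _, value = item.partition("=")
        values.setdefault(key.strip(), []).append(value.strip())
    return values


def report(spec):
    """Return (keys given more than once, lang values not in KNOWN_LANGS)."""
    values = collect_features(spec)
    repeated = {k: v for k, v in values.items() if len(v) > 1}
    unknown = [v for v in values.get("lang", []) if v not in KNOWN_LANGS]
    return repeated, unknown


if __name__ == "__main__":
    spec = "mode=node,script=hang,lang=zht,script=hani,lang=dlft"
    repeated, unknown = report(spec)
    print(repeated)  # {'script': ['hang', 'hani'], 'lang': ['zht', 'dlft']}
    print(unknown)   # ['dlft']
```

Run on the feature list from mk.pdf, it reports script and lang as
given twice and flags dlft as a tag it does not know.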
> Finally, there is an interesting article by Jin-Hwan Cho (the dvipdfmx
> author) and Haruhiko Okumura about CJKV typesetting with Omega a couple
> of years ago. They have implemented all of the rules you mention above
> and a bit more; and although they used OTPs at the time, it should be
> quite straighforward to transpose it in Lua code (actually, I've done it
> a couple of months ago, but I have used plain LuaTeX, and in ConTeXt it
> should probably done using node processors or something).

Thank you for the link. In fact, many of these rules already appear in
the last part of font-otf.lua, but the implementation is incomplete.
Chinese typesetting is easier than English typesetting, because in
Chinese a line can be broken at any character and no hyphenation
algorithm is needed. What remains is the spacing between punctuation
marks and the penalties before and after them. When English words are
mixed in, we also have to take font switching and the glue between
Chinese and English words into account.

>
> http://project.ktug.or.kr/omega-cjk/tug2004-preprint.pdf
>
> Arthur
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to the Wiki!
>
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage : http://www.pragma-ade.nl / http://tex.aanhet.net
> archive : https://foundry.supelec.fr/projects/contextrev/
> wiki : http://contextgarden.net
> ___________________________________________________________________________________