From: Arthur Reutenauer
Newsgroups: gmane.comp.tex.context
Subject: Re: MKIV Chinese typesetting
Date: Mon, 28 Jan 2008 03:17:11 +0100
Message-ID: <20080128021711.GA13410@phare.normalesup.org>
To: Mailing list for ConTeXt users

Hello,

Thanks for this comprehensive review.  If I'm not mistaken, there is no
specific code for CJKV typesetting in Mark IV; the examples in mk.pdf
seem to use the generic font loading mechanism.

I would like to answer more completely, but don't have much time for
the moment.  About some of your remarks:

> so I think a new feature should be added to map all the Chinese puncts
> into english while at the same time, a space should be added after the
> English punct marks.

Would it not be better to automatically add shrinkable glue after
Chinese punctuation, rather than replacing the character by force?  This
would be very much in line with the general TeX philosophy of setting
text (and would probably suppress the need for half-width forms in the
font altogether).
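
To make the shrinkable-glue idea concrete, here is a small model in
Python.  It is only a sketch of the logic, not LuaTeX code: the Glue
class, the punctuation set and the half-em shrink value are all
assumptions chosen for illustration.  In a real setup the same step
would presumably run on the node list before line breaking.

```python
from dataclasses import dataclass

# Closing punctuation that carries blank space on its right-hand side.
# Illustrative set only, not an exhaustive list.
CJK_CLOSING_PUNCT = set("，。、；：！？」』）》")


@dataclass(frozen=True)
class Glue:
    """Simplified TeX-style glue; all dimensions are in em."""
    width: float
    stretch: float
    shrink: float


# Assumed value: adds no width, but lets the line take back up to 0.5em.
PUNCT_GLUE = Glue(width=0.0, stretch=0.0, shrink=0.5)


def add_punct_glue(text):
    """Return characters and Glue items, with glue after each closing mark."""
    out = []
    for ch in text:
        out.append(ch)
        if ch in CJK_CLOSING_PUNCT:
            out.append(PUNCT_GLUE)
    return out


def width_range(items, char_width=1.0):
    """Return (minimum, natural) width in em for a list of items.

    Every character is assumed to be char_width wide (full-width glyphs).
    """
    natural = sum(i.width if isinstance(i, Glue) else char_width for i in items)
    shrink = sum(i.shrink for i in items if isinstance(i, Glue))
    return natural - shrink, natural
```

With two punctuation marks in a six-character line, the line keeps its
natural width of 6em but may be compressed down to 5em when the
paragraph builder needs it, without touching the glyphs themselves.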

> - pp118, penultimate example, box 2, line1, the ' punct mark should
> not appear at the end of the line

This should be taken care of by adding an appropriate penalty before
the character.

> - pp118, ultimate example, box 2, line2, in fact, if you want do
> perfect Chinese typesetting, all the puncts which begin a line or end
> a line should be closed to the margin line

Do you mean simply closer to the margin, or in the margin itself
(protruding)?  Protruding is already possible in pdfTeX; I believe it
is available in LuaTeX as well, although it might be broken for the
moment (Taco?).  Setting the character closer to the margin should be
possible as well, as a modified form of protruding, I trust.

> A small skip should be left between Chinese and English which makes
> the result much better. usually the space is a quarter of a chinese
> character width. A TeX expression should like:
> \hspace{0.25em plus 0.125em minus 0.08em}

Again, this can be taken care of by automatically adding this glue
between pairs of characters of the appropriate categories.

> The last important thing for English and Chinese bi-lingual
> typesetting is that: do not use English glyphs in Chinese fonts

Sure, there should be a possibility of specifying a Western font to be
used inside Chinese text.

> - the following script produce an error: Invalid field id penalty for
> node type glyph (1).

I don't have that error here.  This is a very big font; are you sure it
has been read entirely and correctly written to the cache?  Lua crashed
on my machine when I first compiled your example, and only a partial
font hash was written to the cache (ConTeXt didn't crash, so the first
compilation apparently ended well, but the cache was already filled
with a partial font).  I can imagine that problems will arise when a
partially hashed font is present in the cache.
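
As a rough illustration of the two automatic insertions suggested above
(a penalty next to punctuation, and glue between Chinese and Western
characters), here is a small Python model.  It is a sketch under stated
assumptions, not LuaTeX code: the Penalty and Glue classes, the two
punctuation sets and the crude is_han / is_latin tests are made up for
the example.  Only the glue values come from the quoted message, and
10000 is the usual TeX value for "never break here".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Penalty:
    value: int


@dataclass(frozen=True)
class Glue:
    """Simplified TeX-style glue; all dimensions are in em."""
    width: float
    stretch: float
    shrink: float


NO_BREAK = Penalty(10000)                 # TeX forbids a break at 10000
CJK_LATIN_GLUE = Glue(0.25, 0.125, 0.08)  # values from the quoted message

# Illustrative sets only: marks that must not start or end a line.
NO_LINE_START = set("，。、；：！？’”」』）》")
NO_LINE_END = set("‘“「『（《")


def is_han(ch):
    """Crude test: CJK Unified Ideographs basic block only."""
    return "\u4e00" <= ch <= "\u9fff"


def is_latin(ch):
    """Crude test: ASCII letters and digits only."""
    return ch.isascii() and ch.isalnum()


def annotate(text):
    """Insert NO_BREAK and CJK_LATIN_GLUE items between characters."""
    out = []
    for i, ch in enumerate(text):
        if i > 0:
            prev = text[i - 1]
            if ch in NO_LINE_START or prev in NO_LINE_END:
                # Keep the mark attached to its neighbour.
                out.append(NO_BREAK)
            elif (is_han(prev) and is_latin(ch)) or (is_latin(prev) and is_han(ch)):
                # Boundary between the two scripts.
                out.append(CJK_LATIN_GLUE)
        out.append(ch)
    return out
```

The point of the model is that both rules only look at pairs of
adjacent characters and their categories, so they can be applied in a
single pass over the text without changing any character.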

Anyway, the code looks quite weird to me:

> \definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]

This means that you activate two different scripts at the same time
(hang == Hangul and hani == Han ideographs), and also two languages at
the same time (zht == Chinese Traditional, and dlft is probably a typo
for dflt == default).  I can't imagine what that is supposed to mean,
and activating Traditional Chinese is probably wrong with Adobe Song
Std, which is a Simplified Chinese font.  A saner definition of that
feature would be, in my opinion:

    \definefontfeature[chinese-simplified][mode=node,script=hani,lang=zhs]

I know this code comes from mk.pdf, but I think it is a mistake.

Finally, there is an interesting article by Jin-Hwan Cho (the dvipdfmx
author) and Haruhiko Okumura about CJKV typesetting with Omega, from a
couple of years ago.  They have implemented all of the rules you
mention above and a bit more; and although they used OTPs at the time,
it should be quite straightforward to transpose it into Lua code
(actually, I did that a couple of months ago, but I used plain LuaTeX;
in ConTeXt it should probably be done using node processors or
something).

    http://project.ktug.or.kr/omega-cjk/tug2004-preprint.pdf

	Arthur