From: Hans Hagen
Newsgroups: gmane.comp.tex.context
Subject: Re: MKIV Chinese typesetting
Date: Mon, 28 Jan 2008 17:15:41 +0100
Message-ID: <479DFFAD.9020101@wxs.nl>
In-Reply-To: <20080128021711.GA13410@phare.normalesup.org>
To: Mailing list for ConTeXt users

Arthur Reutenauer wrote:

> Thanks for this comprehensive review. If I'm not mistaken, there is
> no specific code for CJKV typesetting in Mark IV; the examples in mk.pdf
> seem to use the generic font loading mechanism.
>
> I would like to answer more completely, but don't have much time for
> the moment. About some of your remarks:

actually, there is code in there, but you need to specify chinese as a feature

\definefontfeature
  [chinese-traditional]
  [mode=node,script=hang,lang=zht]

\definefontfeature
  [chinese-simple]
  [mode=node,script=hang,lang=zhs]

>> so I think a new feature should be added to map all the Chinese puncts
>> into english while at the same time, a space should be added after the
>> English punct marks.

> Would it not be better to automatically add shrinkable glue after
> Chinese punctuation, rather than replacing the character by force? This
> would be very much in line with the general TeX philosophy of setting
> text (and would probably suppress the need for half-width forms in the
> font altogether).

there are penalties and glue nodes injected (based on specs given by some
users)

>> - pp118, penultimate example, box 2, line1, the ' punct mark should
>> not appear at the end of the line

probably an old mk.pdf (i'm awaiting some feedback before i post a new one)

> This should be taken care of by adding an appropriate penalty before
> the character.

adding penalties is done based on a couple of tables

>> - pp118, ultimate example, box 2, line2, in fact, if you want to do
>> perfect Chinese typesetting, all the puncts which begin a line or end
>> a line should be close to the margin line
>
> Do you mean simply closer to the margin, or in the margin itself
> (protruding)? Protruding is already possible in pdfTeX; I believe it is
> available in LuaTeX as well, although it might be broken for the moment
> (Taco?). Setting the character closer to the margin should be possible
> as well, as a modified form of protruding, I trust.

this is always a bit of a trade off; i use samples with small width so at
some point you run into tex optimizing situations; i'll make things
configurable

>> A small skip should be left between Chinese and English which makes
>> the result much better. usually the space is a quarter of a chinese
>> character width. A TeX expression should look like:
>> \hspace{0.25em plus 0.125em minus 0.08em}
>
> Again, this can be taken care of by automatically adding this glue
> between pairs of characters of the appropriate category.
>
>> The last important thing for English and Chinese bi-lingual
>> typesetting is that: do not use English glyphs in Chinese fonts
>
> Sure, there should be a possibility of specifying a Western font to be
> used inside Chinese text.

font switching; i still have to look into mixed fonts

>> - the following script produces an error: Invalid field id penalty for
>> node type glyph (1).
>
> I don't have that error here. This is a very big font; are you sure it
> has been read entirely and correctly written to the cache?
> Lua crashed on my machine when I first compiled your example, and only a
> partial font hash was written to the cache (ConTeXt didn't crash, so the
> first compilation apparently ended well, but the cache was already filled
> with a partial font). I can imagine that problems will arise in the
> presence of a partially hashed font in the cache.
>
> Anyway, the code looks quite weird to me:
>
>> \definefontfeature[chinese][mode=node,script=hang,lang=zht,script=hani,lang=dlft]
>
> This means that you activate two different scripts at the same time
> (hang == Hangul and hani == Han ideographs), and also two languages at
> the same time (zht == Chinese Traditional and dlft is probably a typo
> for dflt == default). I can't imagine what that is supposed to mean,
> and activating Traditional Chinese is probably wrong with Adobe Song Std
> which is a Simplified Chinese font. A saner definition of that feature
> would be in my opinion:

indeed this disables chinese ...

> \definefontfeature[chinese-traditional][mode=node,script=hani,lang=zhs]
>
> I know this code comes from mk.pdf, but I think it is a mistake.
>
> Finally, there is an interesting article by Jin-Hwan Cho (the dvipdfmx
> author) and Haruhiko Okumura about CJKV typesetting with Omega a couple
> of years ago. They have implemented all of the rules you mention above
> and a bit more; and although they used OTPs at the time, it should be
> quite straightforward to transpose it in Lua code (actually, I've done it
> a couple of months ago, but I have used plain LuaTeX, and in ConTeXt it
> should probably be done using node processors or something).

indeed

> http://project.ktug.or.kr/omega-cjk/tug2004-preprint.pdf

i'll have a look

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------
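
For reference, the quarter-em skip between Chinese and English proposed in the quoted text (\hspace{0.25em plus 0.125em minus 0.08em}) can be tried by hand with the \hskip primitive. This is only a minimal sketch: \CJKLatinSkip is a hypothetical name, not a ConTeXt command, and the sample assumes a font with Han glyphs has been set up separately. It does not show how MkIV injects its own glue and penalty nodes.

```tex
% Hypothetical helper (not a ConTeXt built-in): the stretchable
% quarter-em skip from the quoted proposal. \relax ends the glue
% specification so that following text is never read as part of it.
\def\CJKLatinSkip{\hskip 0.25em plus 0.125em minus 0.08em\relax}

\starttext
% The skip is placed by hand at each Chinese/English boundary.
中文\CJKLatinSkip English\CJKLatinSkip 中文
\stoptext
```

Placing the skip manually like this is only useful for judging the spacing values; the automatic approach discussed in the thread would add the same glue between adjacent characters of different scripts.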