From: "Tim 't Hart"
Newsgroups: gmane.comp.tex.context
Subject: RE: Writing Japanese using ConTeXt
Date: Mon, 9 Jun 2003 18:33:49 +0200
Message-ID: <000501c32ea4$eb16a570$0a01a8c0@TIMBO>
References: <3EE496BB.1010605@zam.att.ne.jp>
Reply-To: ntg-context@ntg.nl

Matthew Huggett wrote:

> I asked about Japanese a while back. Hans requested more information on
> encodings, fonts, etc. I don't know enough about these things or
> ConTeXt to know what is needed exactly.

> From what I've read, unicode is not that popular in Japan itself. ...

Unicode wasn't that popular because Unix-like operating systems used EUC as their encoding, and Microsoft used its own Shift-JIS encoding. So there is still a lot of digital text out there written in these encodings, and a lot of tools still use them. But I think that if you want to write new texts, using Unicode shouldn't be a problem for most users. I guess that most editors supporting Asian encodings also make it possible to save in UTF-8. I think nowadays it's easier to find a Unicode-enabled editor than it is to find a Shift-JIS/EUC editor! (Well, on Windows anyway...) Since ConTeXt already supports UTF-8, I don't see a reason to make things more difficult than they already are by writing text in other encodings.

When I look at the source of the Chinese module, the most difficult part for me to understand is the part about font encoding: the enco-chi.tex file and the use of \defineuclass in that file. I guess it has something to do with mapping the written text to the font. If I understand correctly, the Chinese module doesn't use Unicode fonts, but GBK or Big5 encoded fonts.

I guess that if you want to make a proper Japanese module, you'll need to support JIS or Shift-JIS encoded fonts. But on the other hand, maybe we don't need to support that, since there are a lot of Japanese Unicode fonts available. I use WinXP, and there we have msmincho.ttc and msgothic.ttc, which are both Unicode fonts. I also use kochi-mincho.ttf and kochi-gothic.ttf, which are both freely available Japanese Unicode fonts.
And Cyberbit is a Unicode font as well. Commercially available fonts by Dynalab (the Dynafont Japanese TrueType collection is quite cheap and very good) are also Unicode fonts. Again, I don't think we should make it difficult for ourselves by trying to support non-Unicode fonts while Unicode Japanese fonts are easy to use and widely available.

> Typesetting Japanese could be more complicated than Chinese because of
> the concurrent use of four writing systems

The fact that Japanese uses four writing systems is not really a problem. Hiragana and Katakana (Kana) are simply in different Unicode ranges than Kanji/Chinese. Things might get difficult if you want to use different fonts for Kana than you are using for Kanji. Then you need to assign a different font to a different Unicode range. But I have no idea why somebody would want to do such a thing! Just using Unicode and a Japanese Unicode font will take care of things.

If you type Romaji/Latin characters in the example I posted yesterday, they get printed in CMR. I did some tests and I could change the font to any other font I wanted, just by using the normal ConTeXt font mechanisms. So I guess it is easy to mix Japanese fonts with normal Latin fonts.

> I guess I need to track down a few sample documents. I tried to turn up
> some info on Japanese typesetting rules but had no luck.

The only info I have is from Ken Lunde's CJKV book, where he mentions some rules about CJK line breaking. Also, some characters are allowed to protrude into the right margin. I have some OTPs for Omega which handle all of this. They can be seen here:

http://www.math.jussieu.fr/~zoonek/LaTeX/Omega-Japanese/doc.html

At first I wanted to use Omega with ConTeXt so that I could use these OTPs, but Omega isn't really stable.

With the ConTeXt example that I posted yesterday, I am already able to write Japanese in UTF-8, use a Unicode Japanese font in ConTeXt, and get Japanese output.
I hope the hard part is already behind me! :-) The only thing that still puzzles me is how I can add interglyph space so that TeX can break the lines. If someone can help, I would really appreciate it!

My best,
Tim
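
To make the interglyph question concrete, here is a minimal sketch of the idea, written in Python as a preprocessing step on the UTF-8 source rather than in TeX. It is not a ConTeXt feature and not the Omega OTP approach. The names `BREAK`, `NO_START`, `NO_END`, `is_cjk` and `mark_breaks` are made up for illustration, and the two character sets are only a small, assumed subset of the characters that line-breaking rules keep away from the start or end of a line.

```python
# Sketch only: mark possible line-break points between Japanese glyphs.
# All names here are hypothetical; nothing below is part of ConTeXt.

BREAK = "\u200b"  # zero-width space, used as a stand-in break marker

# Illustrative subset: characters that should not start a line.
NO_START = set("、。，．）」』】！？ー")
# Illustrative subset: characters that should not end a line.
NO_END = set("（「『【")


def is_cjk(ch: str) -> bool:
    """True if ch lies in one of the Unicode blocks used for Japanese text."""
    cp = ord(ch)
    return (
        0x3000 <= cp <= 0x303F      # CJK Symbols and Punctuation
        or 0x3040 <= cp <= 0x309F   # Hiragana
        or 0x30A0 <= cp <= 0x30FF   # Katakana
        or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs (Kanji)
        or 0xFF00 <= cp <= 0xFFEF   # Halfwidth and Fullwidth Forms
    )


def mark_breaks(text: str) -> str:
    """Insert BREAK between adjacent CJK characters where a break is allowed."""
    out = []
    for cur, nxt in zip(text, text[1:]):
        out.append(cur)
        allowed = (
            is_cjk(cur)
            and is_cjk(nxt)
            and cur not in NO_END      # do not strand an opening bracket
            and nxt not in NO_START    # do not start a line with punctuation
        )
        if allowed:
            out.append(BREAK)
    out.append(text[-1:])  # last character (empty string for empty input)
    return "".join(out)
```

For example, `mark_breaks("日本語のテキスト。").replace(BREAK, "|")` gives `日|本|語|の|テ|キ|ス|ト。`: a break point after every glyph except before the final full stop. On the TeX side the marker would have to be turned into something TeX can break at, such as a small stretchable glue; how best to do that inside ConTeXt is exactly the open question.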