From: Hans Hagen
Newsgroups: gmane.comp.tex.context
Subject: Re: Chinese
Date: Mon, 12 Dec 2005 16:53:51 +0100
Message-ID: <439D9D0F.6080406@wxs.nl>
References: <20051209152442.e84454c3@mx1.kerio.com>
Reply-To: mailing list for ConTeXt users

Richard Gabriel wrote:

> Hi guys,
>
> I can confirm that the UTF-8 input doesn't work for me either.
> If I convert the file into GBK (CP936), it works fine [I suggest to
> use the 'iconv' utility for the conversion :-)].
>
> I tested the UTF-8 output the following ways:
>
> 1)
> \enableregime[utf]
> \usemodule[chinese]
>

chinese is not yet defined in utf, so if you want that, we need to do it now. since the chinese remapping stuff is rather complex, the best method is to consider a dedicated mechanism.

question: do the unicode tables cover gbk and big5 well? assuming they do, how about making a set of tfm, enc and map files that match the unicode positions (volunteers ...)
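One possible reading of "enc files that match the unicode positions" is one 256-slot encoding vector per block of 256 code points, with glyph names in the uniXXXX form. The Python sketch below generates such a vector as text. The function name enc_vector and the encoding name unicode-51 are hypothetical, and it is an assumption that the fonts at hand actually carry uniXXXX glyph names; the matching tfm and map files are not covered.

```python
def enc_vector(vector: int, prefix: str = "uni") -> str:
    """Return the text of an encoding vector for one block of 256 code points.

    `vector` is the code point divided by 256 (e.g. 81 covers U+5100..U+51FF).
    The naming scheme "unicode-<hex vector>" is an illustrative assumption.
    """
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must be in the range 0..255")
    # One glyph name per slot, in code point order.
    names = [f"/{prefix}{vector * 256 + slot:04X}" for slot in range(256)]
    header = f"/unicode-{vector:02x} ["
    return "\n".join([header, *names, "] def"]) + "\n"
```

Calling enc_vector(81) yields a vector whose slots run from /uni5100 to /uni51FF, so slot n in the font corresponds directly to the low byte of the code point.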
we can extend the utf handler with a kind of plugin mechanism:

    \unprotect

    \def\utfunihashglyph#1%
      {\@EA\doutfunihashglyph\@EA{\number\utfdiv{#1}}{#1}} % only div once

    \def\doutfunihashglyph#1#2% div raw
      {\csname
         \ifnum#2<\utf@i
           \strippedcsname\unicodeasciicharacter
         \else\ifcsname\@@unicommand#1\endcsname
           \@@unicommand#1%
         \else\ifcsname\@@univector#1\endcsname
           \@@univector#1%
         \else
           \strippedcsname\unicodeunknowncharacter
         \fi\fi\fi
       \@EA\endcsname\@EA{\number\utfmod{#2}}} % only mod once

    \def\unicodeunknowncharacter#1%
      {\unknownchar}

    \let\utfunihash\utfunihashglyph

    \def\@@unicommand{@@unicommand}

    \def\defineutfcommand #1 #2%
      {\setvalue{\@@unicommand#1}##1{#2{#1}{##1}}}

so we can define plug-in handlers for e.g. chinese:

    \defineutfcommand 81 {\uchar}

(this bombs due to missing fonts, so for testing:)

    \def\NotYet#1#2{[#1 #2]}

    \defineutfcommand 81 {\NotYet}

(next comes adapting the chinese files; i can imagine that we redo the big5 and gbk definitions so that they remap to utf-8 as common encoding)

so .. the question is ... who is going to make the tfm/enc/map files

Hans
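The lookup order in \doutfunihashglyph can be modelled outside TeX. The Python sketch below is not the ConTeXt implementation, only an illustration of the dispatch: the code point is split into a vector (div) and a slot (mod), a plug-in registered for that vector wins, then a built-in vector table, and otherwise the character is reported as unknown. The names plugins, vectors, define_utf_command and utf_uni_hash_glyph are hypothetical, and the threshold 128 is an assumption standing in for \utf@i.

```python
from typing import Callable, Dict

ASCII_LIMIT = 128  # assumption: plays the role of \utf@i in the macro code

# Plug-in handlers, keyed by vector; the role of \defineutfcommand.
plugins: Dict[int, Callable[[int, int], str]] = {}
# Built-in per-vector tables; the role of the \@@univector macros.
vectors: Dict[int, Callable[[int], str]] = {}


def define_utf_command(vector: int, handler: Callable[[int, int], str]) -> None:
    """Register a handler that receives (div, mod) for one vector."""
    plugins[vector] = handler


def utf_uni_hash_glyph(code: int) -> str:
    """Dispatch one code point in the same order as the macro."""
    if code < ASCII_LIMIT:
        return chr(code)                    # plain ascii character
    div, mod = divmod(code, 256)            # vector and slot
    if div in plugins:
        return plugins[div](div, mod)       # plug-in handler gets both parts
    if div in vectors:
        return vectors[div](mod)            # built-in table gets the slot only
    return "?"                              # unknown character


def not_yet(div: int, mod: int) -> str:
    """Test handler mirroring \\NotYet: show the two numbers."""
    return f"[{div} {mod}]"


define_utf_command(81, not_yet)
```

With this registration, U+5142 is split into vector 81 and slot 66 and handed to the test handler, while a code point in a vector without any handler falls through to the unknown-character branch.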