From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/16559
Path: main.gmane.org!not-for-mail
From: "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp>
Newsgroups: gmane.emacs.gnus.general
Subject: Re: MULE primer
Date: Tue, 1 Sep 1998 17:46:32 +0900 (JST)
Sender: owner-ding@hpc.uh.edu
Message-ID: <13803.46184.111160.960378@tanko.sk.tsukuba.ac.jp>
References: <m3btp0euuf.fsf@sparky.gnus.org>
	<m2af4klppy.fsf@altair.xemacs.org>
NNTP-Posting-Host: coloc-standby.netfonds.no
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
X-Trace: main.gmane.org 1035155412 28569 80.91.224.250 (20 Oct 2002 23:10:12 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Sun, 20 Oct 2002 23:10:12 +0000 (UTC)
Return-Path: <owner-ding@hpc.uh.edu>
Original-Received: from gizmo.hpc.uh.edu (gizmo.hpc.uh.edu [129.7.102.31])
	by sclp3.sclp.com (8.8.5/8.8.5) with ESMTP id EAA18906
	for <jason@mailhost.sclp.com>; Tue, 1 Sep 1998 04:48:45 -0400 (EDT)
Original-Received: from sina.hpc.uh.edu (sina.hpc.uh.edu [129.7.3.5]) by gizmo.hpc.uh.edu (8.7.6/8.7.3) with ESMTP id DAF27694; Tue, 1 Sep 1998 03:18:25 -0500
Original-Received: by sina.hpc.uh.edu (TLB v0.09a (1.20 tibbs 1996/10/09 22:03:07)); Tue, 01 Sep 1998 03:47:08 -0500 (CDT)
Original-Received: from sclp3.sclp.com (root@sclp3.sclp.com [209.195.19.139]) by sina.hpc.uh.edu (8.7.3/8.7.3) with ESMTP id DAA01012 for <ding@hpc.uh.edu>; Tue, 1 Sep 1998 03:46:47 -0500 (CDT)
Original-Received: from tanko.sk.tsukuba.ac.jp (root@tanko.sk.tsukuba.ac.jp [130.158.99.155])
	by sclp3.sclp.com (8.8.5/8.8.5) with ESMTP id EAA18890
	for <ding@gnus.org>; Tue, 1 Sep 1998 04:46:41 -0400 (EDT)
Original-Received: by tanko.sk.tsukuba.ac.jp
	id m0zDm5B-00013HC
	(Debian Smail-3.2.0.101 1997-Dec-17 #2); Tue, 1 Sep 1998 17:46:33 +0900 (JST)
Original-To: ding@gnus.org, xemacs-mule@xemacs.org
In-Reply-To: <m2af4klppy.fsf@altair.xemacs.org>
X-Mailer: VM 6.53 under 21.0 "Uzbek Black" XEmacs Lucid
Precedence: list
X-Majordomo: 1.94.jlt7
Xref: main.gmane.org gmane.emacs.gnus.general:16559
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:16559

>>>>> "sb" == SL Baur <steve@xemacs.org> writes:

    sb> [cc'ed to xemacs-mule]

Which is where I picked it up.  I imagine it will continue on both
lists; I don't subscribe to ding (and don't have time to :( ).

    sb> Lars Magne Ingebrigtsen <larsi@gnus.org> writes in
    sb> ding@gnus.org:

    >> What is *the* way to learn about all this MULE stuff?

    sb> For detailed technical documentation about charsets, JIS,
    sb> etc. there's the O'Reilly Blowfish book -- Understanding
    sb> Japanese Information Processing.  It makes very dry reading
    sb> except perhaps for Chapter 5 -- "Japanese Input" which does
    sb> give some clues how to make things work.

I agree with the reviewer's opinion about "dry"; for people who aren't
implementing core Mule functionality, the book is relevant primarily
as background.  BTW, "Son-of-Blowfish" is supposed to go to press RSN.
It has the advantage of being much more internationally-oriented,
supporting Korean, Chinese, and Vietnamese as well as Japanese.  A
partial and old version is available on-line at

      file://ftp.uu.net/vendor/oreilly/nutshell/ujip/doc/cjk.inf

(about 165k, and now over two years old).

    sb> Unfortunately, unless you can read Japanese there doesn't
    sb> appear to be any documentation.

I must not understand what the question was, because in XEmacs Info
there are useful discussions of Mule under both Internals and Lispref.
The extensive comments in the C code implementing XEmacs's Mule are
also useful; I think those are mostly in mule-coding.c and
mule-charset.c.  (There was nothing like them in Mule 2.3; I don't
know whether the commenting has increased in recent ETL/FSF Mules.)
All the functions I have ever needed to use are documented there or
via `describe-function'.

The latter is only useful if you know what function you want, of
course.  The resources listed are not a programmer's guide,
unfortunately.  A very coarse approximation to a programmer's guide
would be the chapters on internationalizing applications in O'Reilly's
"Xlib Programmer's Manual" (vol 1 of the series; the name may be
inexact).  You need either the separate R5 volume or the revised R6
edition, which is much better.  (I did write "_very_ coarse", OK?)
At least it gives you some idea of how to use an internationalized
programming system to create applications that can use many languages.

Mule is somewhat more complicated, because it is multilingual, and the
X internationalization model doesn't really address that (viz. the
clumsy contortions you need to go through to use multiple X Input
Methods).

For internal changes to XEmacs to support new functionality, Hrvoje
Niksic has written a Mule-izing guide for the C code (sorry, dunno the
URL offhand), but I don't think that's what you have in mind.

The ISO 2022 and Unicode v2 standards documents have some rationale
about language handling, but that should be considered deep background
(sort of an appendix to "Blowfish").  Don't bother with ISO 10646; it
contains no rationale.

_Before_ you read the FSF's preprint elisp manual (the section
entitled, informatively enough, "non-ASCII characters") read the
corresponding XEmacs documentation.  The XEmacs documentation gets the
abstractions and semantics right; the FSF's docs are unclear, if not
misleading, in several places.  The preprint elisp manual is also not
a programmer's guide (zero examples and little sense of the context in
which functions might be useful). :-(

    sb> A cheap pocket tourist's dictionary is good enough to get some
    sb> examples to type in.

There is also Jim Breen's edict package

	 file://ftp.monash.edu.au/pub/nihongo/edict.{gz,doc}

(a flat file, about 1MB; to read it directly in an XEmacs buffer you
may need to set the coding system to euc-japanese manually for reasons
I don't understand), which can be accessed via the XEmacs package
edict (which does not contain the dictionary itself at the
maintainer's request).  The edict package is also at least somewhat
compatible with Emacs 20.2, but I found working in that environment
really painful (the differences from Emacs 19.34/Mule 2.3 were, ah,
poorly documented) so I can't vouch for more than "it will load and
look up a couple of words".
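
For anyone who wants to poke at the file outside Emacs, here is a
minimal sketch in Python of what "set the coding system to
euc-japanese" amounts to: decode the bytes as EUC-JP, then split the
line.  The sample line is made up, and the HEADWORD [READING] /gloss/
layout and the name parse_edict_line are my assumptions for
illustration; check the layout against edict.doc before relying on it.

```python
# A made-up entry in the assumed layout: HEADWORD [READING] /gloss/.../
SAMPLE = "日本語 [にほんご] /Japanese language/"


def parse_edict_line(raw):
    """Decode one EUC-JP encoded line and split it into its parts."""
    text = raw.decode("euc_jp").rstrip("\n")
    head, _, rest = text.partition(" /")
    glosses = [g for g in rest.split("/") if g]
    if " [" in head:
        word, _, reading = head.partition(" [")
        reading = reading.rstrip("]")
    else:
        # Assumed: an entry without brackets is its own reading.
        word, reading = head, head
    return word, reading, glosses


# Simulate the bytes as they would sit on disk, then parse them back.
raw = SAMPLE.encode("euc_jp")
entry = parse_edict_line(raw)
print(entry)
```

Decoding with the wrong codec (say, Latin-1) would not fail; it would
just produce garbage, which is presumably what a mis-guessed coding
system looks like in a buffer.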

    >> I have a distinct feeling I must be misunderstanding something
    >> basic here.  I mean, here's (日本語) some Japanese text, so I
    >> must be doing something right, but I still don't understand it.

Most things (such as handling buffers containing characters from
various codesets) are done fairly automatically by Mule, and aren't
your problem.  Some other things (such as inputting the characters)
are handled by external subsystems that interface directly to Mule,
and aren't your problem.
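
To make "characters from various codesets" a little more concrete,
here is a small sketch (Python, not Mule; the sample text and variable
names are mine) of the same mixed ASCII/Japanese string in two
external encodings.  This is roughly the bookkeeping that Mule does
for you when it reads or writes a file.

```python
# The sample text is arbitrary; any ASCII/Japanese mix would do.
text = "Japanese: 日本語"

# ISO-2022-JP is a 7-bit encoding: escape sequences announce each
# switch between ASCII and the two-byte JIS X 0208 set.
iso = text.encode("iso2022_jp")

# EUC-JP carries the same JIS X 0208 code points with the high bit
# set on each byte, so it needs no escape sequences.
euc = text.encode("euc_jp")

ESC_TO_JISX0208 = b"\x1b$B"  # ESC $ B: switch to JIS X 0208
ESC_TO_ASCII = b"\x1b(B"     # ESC ( B: switch back to ASCII

# Strip the escapes from the ISO-2022-JP form and set the high bit on
# the two-byte characters; the result should match the EUC-JP form.
start = iso.index(ESC_TO_JISX0208) + len(ESC_TO_JISX0208)
end = iso.index(ESC_TO_ASCII, start)
rebuilt = iso[:start - len(ESC_TO_JISX0208)] + bytes(b | 0x80 for b in iso[start:end])

print(iso)
print(euc)
print(rebuilt == euc)
```

Both byte strings decode to the same text; only the external
representation differs.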

It's only when you want to do things like set the MIME Content-Type
header correctly, encode/decode non-ASCII message headers, encode
multilingual buffers to Postscript, or handle right to left languages
and bidirectional text (don't bother; nobody knows how to do this
right yet) that "knowing how to program Mule" becomes an issue.
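
As an illustration of the first two items, here is a hedged sketch
using Python's standard email package rather than anything in Mule or
Gnus.  The charset choice (iso-2022-jp) and the sample strings are my
assumptions; it only shows what "encode/decode non-ASCII headers" and
"set Content-Type correctly" mean on the wire, not how Gnus should do
it.

```python
from email.header import Header, decode_header, make_header
from email.message import Message

subject_text = "日本語"

# Encode a non-ASCII header as an RFC 2047 "encoded word".
encoded_subject = Header(subject_text, charset="iso-2022-jp").encode()

# Decode it again: decode_header yields (bytes, charset) pairs and
# make_header turns them back into a readable string.
decoded_subject = str(make_header(decode_header(encoded_subject)))

# Build a message whose body is declared via the Content-Type charset
# parameter.
msg = Message()
msg["Subject"] = encoded_subject
msg.set_payload("本文 (body text)", charset="iso-2022-jp")

print(encoded_subject)
print(msg["Content-Type"])
print(msg.get_content_charset())
```

The encoded header is pure ASCII, so it survives 7-bit transport, and
the charset parameter tells the receiving side how to interpret the
body bytes.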

    sb> I'm not sure where to begin.

Move to Hong Kong? :-)  I would suggest starting with the Xlib
documentation on writing internationalizable applications.  That
gives a feeling for the problems and the abstractions involved in
internationalizing an app.  There's also some discussion in the Motif
Programming Guide
(same series, vol 6A I think).  If "Son-of-Blowfish" should happen to
appear on a bookshelf near you, I would recommend that.  I haven't
seen a draft, but an acquaintance at O'Reilly says that this book is
going to be more oriented to programming than its predecessor,
although still heavily theoretical.

One thing about the differences between Mule/XEmacs and Mule/FSF
Emacs: they're not going to get smaller any time soon, and the FSF has
been making many changes to the interface, consistently over the 20.x
releases.  If you want the Emacs with the latest and greatest in
multilingual features, you'll probably want to track the changes the
ETL/FSF people are making.  Lots of pain for not much gain, IMHO.

XEmacs's Mule interface will probably be rather stable, at least for
as long as it takes you to learn how to do it.  I can predict with
fair confidence that Emacs's Mule will be UNstable for at least that
long, and in any case you will probably have to support 19.x, 20.0,
20.1, 20.2, and 20.3 Emacs separately in multilingual code.

Caveat: this last dismal assessment only applies if you mean to (a)
implement MIME standards for language handling directly and fully
internally to Gnus, or (b) get handling of buffers with multilingual
content absolutely pedantically precisely correct.  As mentioned
above, most basic text manipulations are handled automatically by all
the Mules.

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091