Re: MULE primer - Stephen J. Turnbull

Gnus development mailing list

From: "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp>
Subject: Re: MULE primer
Date: Tue, 1 Sep 1998 17:46:32 +0900 (JST)
Message-ID: <13803.46184.111160.960378@tanko.sk.tsukuba.ac.jp>
In-Reply-To: <m2af4klppy.fsf@altair.xemacs.org>

>>>>> "sb" == SL Baur <steve@xemacs.org> writes:

    sb> [cc'ed to xemacs-mule]

Which is where I picked it up; I imagine it will continue on both
lists.  I don't subscribe to ding (and don't have time to :( ).

    sb> Lars Magne Ingebrigtsen <larsi@gnus.org> writes in
    sb> ding@gnus.org:

    >> What is *the* way to learn about all this MULE stuff?

    sb> For detailed technical documentation about charsets, JIS,
    sb> etc. there's the O'Reilly Blowfish book -- Understanding
    sb> Japanese Information Processing.  It makes very dry reading
    sb> except perhaps for Chapter 5 -- "Japanese Input" which does
    sb> give some clues how to make things work.

I agree with the reviewer's opinion about "dry," and it's primarily
relevant (to people who aren't implementing core Mule functionality)
as background.  BTW, "Son-of-Blowfish" is supposed to go to press RSN.
It has the advantage of being much more internationally-oriented,
supporting Korean, Chinese, and Vietnamese as well as Japanese.  A
partial and old version is available on-line at

      file://ftp.uu.net/vendor/oreilly/nutshell/ujip/doc/cjk.inf

(about 165k, and now over two years old).

    sb> Unfortunately, unless you can read Japanese there doesn't
    sb> appear to be any documentation.

I must not understand what the question was, because in XEmacs Info
there are useful discussions of Mule under both Internals and Lispref.
The extensive comments in the C code implementing XEmacs's Mule (there
was nothing like them in Mule 2.3, I don't know if the commenting has
increased in recent ETL/FSF Mules) are also useful.  I think those are
mostly in mule-coding.c and mule-charset.c.  All the functions I have
ever needed to use are documented there or via `describe-function'.

The latter is only useful if you know what function you want, of
course.  The resources listed are not a programmer's guide,
unfortunately.  A very coarse approximation to a programmer's guide
would be the chapters on internationalizing applications in O'Reilly's
"Xlib Programmer's Manual" (vol 1 of the series; the name may be
inexact).  You need the R5 version (a separate volume) or the R6 one
(revised edition, much better).  (I did write "_very_ coarse", OK?)
At least it gives
you some idea of how to use an internationalized programming system to
create applications that can use many languages.

Mule is somewhat more complicated, because it is multilingual, and the
X internationalization model doesn't really address that (viz the
clumsy contortions you need to go through to use multiple X Input
Methods).

For internal changes to XEmacs to support new functionality, Hrvoje
Niksic has written a Mule-izing guide for the C code (sorry, dunno the
URL offhand), but I don't think that's what you have in mind.

The ISO 2022 and Unicode v2 standards documents have some rationale
about language handling, but that should be considered deep background 
(sort of an appendix to "Blowfish").  Don't bother with ISO 10646; it
contains no rationale.

_Before_ you read the FSF's preprint elisp manual (the section
entitled, informatively enough, "non-ASCII characters") read the
corresponding XEmacs documentation.  The XEmacs documentation gets the
abstractions and semantics right; the FSF's docs are unclear, if not
misleading, in several places.  The preprint elisp manual is also not
a programmer's guide (zero examples and little sense of the context in
which functions might be useful). :-(

    sb> A cheap pocket tourist's dictionary is good enough to get some
    sb> examples to type in.

There is also Jim Breen's edict package

	 file://ftp.monash.edu.au/pub/nihongo/edict.{gz,doc}

(a flat file, about 1MB; to read it directly in an XEmacs buffer you
may need to set the coding system to euc-japanese manually for reasons
I don't understand), which can be accessed via the XEmacs package
edict (which does not contain the dictionary itself at the
maintainer's request).  The edict package is also at least somewhat
compatible with Emacs 20.2, but I found working in that environment
really painful (the differences from Emacs 19.34/Mule 2.3 were, ah,
poorly documented) so I can't vouch for more than "it will load and
look up a couple of words".
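
To illustrate the coding-system remark above, here is a minimal sketch
in Python (not Emacs Lisp) of decoding EUC-JP bytes by hand and
splitting one dictionary-style entry.  The sample line and the helper
name parse_edict_line are made up for illustration, and the
"WORD [READING] /gloss/gloss/" layout is an assumption about the file
format; edict.doc is the authoritative description.

```python
import re

# Hypothetical sample entry, encoded as EUC-JP like the dictionary file.
SAMPLE = "日本語 [にほんご] /Japanese language/\n".encode("euc_jp")

# Assumed layout: headword, optional [reading], then /-delimited glosses.
ENTRY = re.compile(r"^(\S+)\s+(?:\[([^\]]*)\]\s+)?/(.*)/\s*$")


def parse_edict_line(raw: bytes):
    """Decode one EUC-JP line and return (headword, reading, glosses)."""
    # This decode is the Python analogue of setting the coding system
    # manually before reading the file.
    text = raw.decode("euc_jp")
    match = ENTRY.match(text)
    if match is None:
        raise ValueError("not an EDICT-style entry: %r" % text)
    word, reading, glosses = match.groups()
    return word, reading, glosses.split("/")
```

Calling parse_edict_line(SAMPLE) yields the headword, the reading (or
None when the entry has no bracketed reading), and a list of glosses.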

    >> I have a distinct feeling I must be misunderstanding something
    >> basic here.  I mean, here's (日本語) some Japanese text, so I
    >> must be doing something right, but I still don't understand it.

Most things (such as handling buffers containing characters from
various codesets) are done fairly automatically by Mule, and aren't
your problem.  Some other things (such as inputting the characters)
are handled by external subsystems that interface directly to Mule,
and aren't your problem.
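
As a rough picture of what "characters from various codesets" means
once such text leaves the editor, here is a small Python sketch that
encodes a string mixing Japanese and Korean with ISO-2022-JP-2, an
ISO 2022-based 7-bit encoding that switches charsets with escape
sequences.  It says nothing about how Mule represents text internally;
the function names are illustrative only.

```python
# One string drawing on two national charsets (Japanese and Korean).
MIXED = "日本語 한국어"


def encode_7bit(text: str) -> bytes:
    """Encode with ISO-2022-JP-2, which switches charsets via ESC sequences."""
    return text.encode("iso2022_jp_2")


def escape_count(data: bytes) -> int:
    """Count ESC (0x1B) bytes, which introduce the charset-switching sequences."""
    return data.count(b"\x1b")
```

Decoding the result with the same codec gives back the original string,
and every byte of the encoded form fits in 7 bits.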

It's only when you want to do things like set the MIME Content-Type
header correctly, encode/decode non-ASCII message headers, encode
multilingual buffers to Postscript, or handle right to left languages
and bidirectional text (don't bother; nobody knows how to do this
right yet) that "knowing how to program Mule" becomes an issue.
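
For a concrete sense of the header encode/decode task, here is a hedged
sketch using Python's standard email.header module to produce and parse
RFC 2047 encoded words.  It is only an illustration of the problem, not
anything Gnus or Mule does; the helper names and the default charset
are my own choices.

```python
from email.header import Header, decode_header


def encode_subject(text: str, charset: str = "iso-2022-jp") -> str:
    """Return an RFC 2047 encoded-word form of a non-ASCII header value."""
    return Header(text, charset).encode()


def decode_subject(value: str) -> str:
    """Decode RFC 2047 encoded words back into a plain string."""
    parts = []
    for chunk, charset in decode_header(value):
        # Encoded words come back as bytes plus a charset name;
        # a purely ASCII header may come back as a plain str.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)
```

A header value round-trips through encode_subject and decode_subject,
and the encoded form contains only ASCII characters, which is what mail
transport requires of headers.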

    sb> I'm not sure where to begin.

Move to Hong Kong? :-)  I would suggest starting with the Xlib
documentation on writing internationalizable applications.  That gives
a feeling for the problems and the abstractions involved in
internationalizing an
app.  There's also some discussion in the Motif Programming Guide
(same series, vol 6A I think).  If "Son-of-Blowfish" should happen to
appear on a bookshelf near you, I would recommend that.  I haven't
seen a draft, but an acquaintance at O'Reilly says that this book is
going to be more oriented to programming than its predecessor,
although still heavily theoretical.

One thing about the differences between Mule/XEmacs and Mule/FSF
Emacs: they're not going to get smaller any time soon, and the FSF has
been making many changes to the interface, consistently over the 20.x
releases.  If you want the Emacs with the latest and greatest in
multilingual features, you'll probably want to track the changes the
ETL/FSF people are making.  Lots of pain for not much gain, IMHO.

XEmacs's Mule interface will probably be rather stable, at least for
as long as it takes you to learn how to do it.  I can predict with
fair confidence that Emacs's Mule will be UNstable for at least that
long, and in any case you will probably have to support 19.x, 20.0,
20.1, 20.2, and 20.3 Emacs separately in multilingual code.

Caveat: this last dismal assessment only applies if you mean to (a)
implement MIME standards for language handling directly and fully
internally to Gnus, or (b) get handling of buffers with multilingual
content absolutely pedantically precisely correct.  As mentioned
above, most basic text manipulations are handled automatically by all
the Mules.

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091
