Gnus development mailing list
From: "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp>
Subject: Re: More charset things
Date: Fri, 5 Feb 1999 09:47:18 +0900 (JST)
Message-ID: <14010.16278.215333.623477@tanko.sk.tsukuba.ac.jp>
In-Reply-To: <m34sp283fp.fsf@quimbies.gnus.org>

>>>>> "Lars" == Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
    Lars> Hrvoje Niksic <hniksic@srce.hr> writes:

    >> MULE is little else than a Japanese version of Emacs, and it
    >> appears that the Japanese are not interested in Unicode.  So it

The MULE development group is nearly entirely Japanese, including the
people implementing Devanagari (for sure) and Arabic and Ethiopic
(IIRC).  Not surprisingly, the tuning (and tuning is absolutely
necessary; the linguists don't know enough about language for charset
guessing and the like to be more than heuristic) is best for Japanese,
and bugs for non-Japanese languages don't get found and fixed quickly.

But MULE is the only truly multilingual platform there is at the
moment, to the best of my knowledge; Unicode doesn't satisfy the needs 
of lots of people, and is not easily extensible without changing the
standard.  MULE is.  MULE is more than a Japanese version of Emacs.

The Japanese are divided on Unicode; some are vehemently opposed,
others are interested.  There don't seem to be any strong advocates,
though.

    >> wasn't implemented.  I'm not sure about FSF, but for XEmacs, I
    >> know of no plans to implement it in the near future.

    Lars> A partial implementation of utf-mumble was posted recently
    Lars> somewhere by someone.  (Could I possibly get any more
    Lars> vague?)  So I'm Cc'ing this to the xemacs-mule list.

Morioka-san ported (IIRC) a Lisp-level implementation of UTF-8.  The
attachments were broken on the ML (so Steve never was able to look at
it); I'll restore from the archive the working (I hope) copy I got from
Morioka.  Martin Buchholz believes that since the tables are in Lisp,
the performance impact will be huge.

    Lars> I asked before for a likely book that would introduce me to
    Lars> the basic concepts, and someone (Stephen Turnbull?) told me,
    Lars> but then I forgot.

Prices are vague recollections; the list is in decreasing order of
importance for basic understanding:

Ken Lunde.  Chinese, Japanese, Korean and Vietnamese Information
    Processing.  O'Reilly & Associates.  Probably the most useful single
    volume, although it doesn't cover single-octet encodings.
ISO.  ISO-2022:  Extension Techniques for Coded Character Sets.  US$75.
Unicode Consortium.  The Unicode Standard, v2.x.  About US$70 from Amazon.
ISO.  ISO-10646:  Universal Multi-octet Character Set Encoding
    Standard.  About US$125.  Don't bother unless you've got extra
    money; the Unicode Standard is much more complete and readable.  All
    ISO-10646 has extra is 4-octet encoding, which is presently
    useless, and it is very likely that any UTF-8 .

I don't know of any textbooks on character set stuff, though there
must be some somewhere.  Lunde's book will have a very extensive
bibliography.

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091
__________________________________________________________________________
__________________________________________________________________________
What are those two straight lines for?  "Free software rules."

