9fans - fans of the OS Plan 9 from Bell Labs
From: Eris Discordia <eris.discordia@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] Simplified Chinese plan 9
Date: Fri, 11 Sep 2009 19:36:33 +0100
Message-ID: <3DA76AC1A01346A9B22FA838@[192.168.1.2]>
In-Reply-To: <509071940909110954i7f3e6a31ic1a93cb9b741f60@mail.gmail.com>

> anyway, the general idea is that it can compose kanji from strings of
> hiragana. it's also been used for other languages (although my memory of
> that says it was mostly for the transliteration function, rather than the
> compositing function). is it possible to do something similar for the
> hanzi, composing them up from roots/stems? i've seen reference to the
> idea in chinese dictionaries, but have no idea if it's use is widespread.

Kana to kanji conversion is peculiar to Japanese and that's basically how
all Japanese IMEs work. You input a series of kana (in Roman/Latin letters
converted on-the-fly), then either assert them as they are or accept a
corresponding kanji the IME offers. It's called inline conversion.
Conversion may also be explicitly requested from the software when for some
reason inline conversion results are unsatisfactory. It takes really good
UI design to make the process practical.
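
To make the two steps concrete, here is a minimal sketch of that flow: Roman letters are converted to kana, and the kana reading is then looked up for kanji candidates. The tables and the function names (romaji_to_kana, candidates) are made up for illustration and are nowhere near a real IME dictionary; this is not how ktrans or any particular IME is implemented.

```python
# Toy romaji -> hiragana table (assumption: a real IME covers the
# whole syllabary plus special cases such as doubled consonants).
ROMAJI_TO_KANA = {
    "ka": "か", "n": "ん", "ji": "じ", "ni": "に", "ho": "ほ", "go": "ご",
}

# Toy dictionary from a kana reading to candidate kanji spellings.
KANA_TO_KANJI = {
    "かんじ": ["漢字", "感じ", "幹事"],
    "にほんご": ["日本語"],
}

def romaji_to_kana(text):
    """Greedy longest-match conversion of romaji to hiragana."""
    out = []
    i = 0
    while i < len(text):
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            # Unknown input is passed through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)

def candidates(kana):
    """Kanji candidates for a reading; the kana itself is the last option."""
    return KANA_TO_KANJI.get(kana, []) + [kana]

print(candidates(romaji_to_kana("kanji")))
```

Keeping the plain kana as the last candidate mirrors the "assert them as they are" choice: the user can always decline the conversion.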

For Chinese, input from a standardized romanization is required, Pinyin
being the most widely used (cellphones, computers, people who learn Chinese
as a second language and would have an immensely hard time if they were to
write in ideographs, even many Chinese people). Kana to kanji conversion is
not viable there simply because kana is not the syllabary system used to
express Chinese. Chinese syllables do not correspond to kana, plus Chinese
is tonal while Japanese is not. Phonetically, and therefore input-wise
since practical CJK input is based on sounds rather than meanings, the two
languages are universes apart even though they share Han characters in the
semantic sphere. Actually, any practical input system should rely on sound
representation rather than meaning--there are only so many sounds while there
are infinitely many meanings.
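
As a rough illustration of sound-based lookup for Chinese, here is a sketch that maps a Pinyin syllable, with or without a trailing tone number, to candidate hanzi. The four-entry table and the function name lookup are assumptions for the example only; a real Pinyin IME uses a large dictionary and ranks whole phrases.

```python
# Toy table keyed by (syllable, tone); tones 1-4 of "ma" only.
PINYIN_TO_HANZI = {
    ("ma", 1): ["妈"],
    ("ma", 2): ["麻"],
    ("ma", 3): ["马"],
    ("ma", 4): ["骂"],
}

def lookup(syllable):
    """Accept 'ma3' (tone given) or 'ma' (toneless, all tones offered)."""
    if syllable and syllable[-1].isdigit():
        base, tone = syllable[:-1], int(syllable[-1])
        return list(PINYIN_TO_HANZI.get((base, tone), []))
    result = []
    for (base, tone), chars in sorted(PINYIN_TO_HANZI.items()):
        if base == syllable:
            result.extend(chars)
    return result

print(lookup("ma3"))
print(lookup("ma"))
```

Typing the tone narrows the candidate list, while toneless input leaves the user to pick among homophones, which is why the tone distinction matters for input.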

Roots/stems you refer to are elements in the ideographs used to classify
Han characters. They are more properly called radicals and are ordered by
stroke count, i.e. the number of times you put down the pen to compose one
from the basic strokes. Most IMEs, _besides_ automatic conversion, offer
the option to choose a kanji/hanzi/hanja by any one of various lookup
methods. Radical lookup is one such method. There are other classifications
of Han characters such as Hadamitzky-Spahn (applicable to kanji) which
aren't present in many IMEs.
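
Radical lookup can be sketched roughly as follows: pick a radical from a list ordered by stroke count, then browse the characters filed under it, again ordered by stroke count. The tiny tables and the function names (radicals_in_order, by_radical) are assumptions for illustration; real indexes cover a couple of hundred radicals and thousands of characters.

```python
# Stroke count of each radical (toy subset).
RADICAL_STROKES = {"木": 4, "口": 3}

# Character -> (radical, total stroke count), toy subset.
CHARACTERS = {
    "本": ("木", 5),
    "林": ("木", 8),
    "校": ("木", 10),
    "森": ("木", 12),
    "右": ("口", 5),
    "名": ("口", 6),
    "味": ("口", 8),
}

def radicals_in_order():
    """Radicals sorted by their own stroke count."""
    return sorted(RADICAL_STROKES, key=RADICAL_STROKES.get)

def by_radical(radical):
    """Characters filed under a radical, sorted by total stroke count."""
    hits = [(strokes, ch) for ch, (rad, strokes) in CHARACTERS.items()
            if rad == radical]
    return [ch for strokes, ch in sorted(hits)]

print(radicals_in_order())
print(by_radical("木"))
```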

This is a great example of a full-blown Japanese word processor (it's
Windows freeware):

<http://www.physics.ucla.edu/~grosenth/jwpce.html>

It features nearly everything expected from a CJK input system and works
independently of MS IME, although it can also be used in conjunction with it.

At present, Windows and MS Office do an unrivalled job of enabling
multi-lingual input and display. I can't help but feel this is sort of a
lock-in situation for people who need/fancy that sort of capability. This
isn't really something I would revel in but it's at least reassuring that
there is _some_ convenient, stable, uniform way to get these things done.



--On Friday, September 11, 2009 12:54 -0400 Anthony Sorace
<anothy@gmail.com> wrote:

> i know very little about existing chinese input methods, so this is more a
> question for my own understanding than a suggestion, but:
>
> there is ktrans for Plan 9; the latest version i'm aware of is described
> here: 	http://basalt.cias.osakafu-u.ac.jp/plan9/s39.html
> although that page is a bit hard to read since line breaks are not
> preserved. the contents are just the README from the tar file; maybe
> easier to just download that and read there.
>
> anyway, the general idea is that it can compose kanji from strings of
> hiragana. it's also been used for other languages (although my memory of
> that says it was mostly for the transliteration function, rather than the
> compositing function). is it possible to do something similar for the
> hanzi, composing them up from roots/stems? i've seen reference to the
> idea in chinese dictionaries, but have no idea if it's use is widespread.
>
> i've had ktrans working on 4th edition in the past, although i just tried
> again (after a long gap), and it blows an assert, which i've not looked
> into yet.
>




Thread overview: 22+ messages
2009-09-11  8:40 xiangyu
2009-09-11 10:23 ` erik quanstrom
2009-09-11 11:29   ` Alexander Sychev
2009-09-11 16:13     ` Eris Discordia
2009-09-11 17:49       ` erik quanstrom
2009-09-11 19:14         ` Eris Discordia
     [not found]         ` <68F5914168759B188DF09A60@192.168.1.2>
2009-09-11 19:53           ` Anthony Sorace
2009-09-11 21:28             ` Eris Discordia
2009-09-11 22:16               ` erik quanstrom
2009-09-12  1:19                 ` Eris Discordia
2009-09-12  1:46                   ` erik quanstrom
2009-09-12  7:05                     ` Eris Discordia
2009-09-12  8:39                       ` Daniel Lyons
2009-09-12 14:22                         ` Eris Discordia
2009-09-12 14:27                           ` erik quanstrom
2009-09-12 14:39                             ` Eris Discordia
     [not found]                             ` <160F5E4B5D4057F12BB54C75@192.168.1.2>
2009-09-12 20:22                               ` Nick LaForge
     [not found]             ` <C890B1F2A8C2EC12D5383D7C@192.168.1.2>
2009-09-11 21:59               ` Anthony Sorace
2009-09-14  9:33         ` Paul Donnelly
2009-09-14 12:47           ` Eris Discordia
2009-09-11 16:54     ` Anthony Sorace
2009-09-11 18:36       ` Eris Discordia [this message]
