* [9fans] updated version of ktrans {JIS coded]
@ 1998-08-16 9:07 Kenji
From: Kenji @ 1998-08-16 9:07 UTC
I have solved the problem I asked about last night myself. This is
because I listened to great violin music by MIDORI last night, and
she refreshed my brain effectively. ^_^
By the way, I think the work is done, and I will put it up as a bundle (please
tell me how to make a bundle file safely) on our Web page for Plan 9,
http://basalt.cias.osakafu-u.ac.jp/plan9_doc/plan9_index.html, which has been
our secret Web page, written in Japanese, since 1995. :-) The page is EUC-encoded
Japanese text, and I will add a small English part to hold
the bundle file (if I can make it safely). So don't be afraid
of the many encrypted-looking lines. :-)
The final version (I hope) of the README.kenji file is attached below.
-------- from here -----
This is a modified version of the original ktrans from the Plan 9 distribution,
by Kenji Okamoto, Aug. 16, 1998. I tried (actually, learned) to keep to the
level of art of the original code. However, it is too smart for me, and seems beyond
my skill.
I'm afraid I have made it seriously brain damaged. If so, I apologize, because
I know who wrote the original. My only excuse for this version is "it
works".
Newly added features are as follows:
1) a "local" dictionary file for translation from kana to kanji, which can
be easily edited by the user. The default file name is $home/lib/ktrans-jisho.
If you want to use another dictionary file, set the KTJISHO environment
variable to point to that file.
2) capital romaji input for words with okurigana, such as verbs or adjectives,
following the idea of the SKK system by Masahiko Sato of Kyoto Univ.
(masahiko@kuis.kyoto-u.ac.jp). If you want to get the kanji string (runes)
"動かす", which is a verb, type "ugoKasu" on the keyboard.
Note that the k of "Kasu" is a capital (important). You will see the hiragana
runes "うごかす", and then the kanji runes "動かす" when you hit 'ctl-t'.
If you are satisfied with that translation, continue on to the next word.
If you are not pleased with that candidate, hit 'ctl-t' once more to see
more candidates for that hiragana input. When no more candidates are
registered in your dictionary, you will see the initial hiragana input again.
3) for Japanese "joshi", short postpositional words placed after a noun, you can use another
method which I developed for this work. If you want the kanji string (runes)
"私は", then type "watashiHA" on the keyboard. Note that the sound
of "wa(ha)" is expressed as a capitalized "HA". You will see the hiragana string
"わたしは", and then "私は" after 'ctl-t'.
4) a control sequence, 'ctl-l', is introduced to leave the input hiragana runes unchanged.
This is occasionally necessary.
5) a simple learning mechanism has been implemented in the in-memory hashed dictionary:
the most recently used kanji runes (candidate) move to the top of the
list of candidates. This lasts only for the session in which you called ktrans.
This is intentional, because the present learning method is ..well...
naive. ^_^ I know this; however, I believe you can solve it by making a good
dictionary that best fits your own purpose.
6) 'ctl-q' ends the session, for when you want to edit your kana-kanji translation
dictionary with sam. I chose this only for the sake of simplicity:
the dictionary is read only once, at the beginning of the ktrans session.
7) the change to kana-input mode is triggered by 'ctl-n', not by 'ctl-g' (the original).
The simple reason is that I feel 'ctl-n' is a better trigger key for Japanese
(nihongo). I re-assigned 'ctl-g' to Greek mode. If I'm doing something
wrong, please tell me. I know how this feels to a Japanese speaker, but not to a
Greek one.
8) as a starting $home/lib/ktrans-jisho, you may re-format the SKK-JISYO.S
(66.9KB) of the SKK system, which can be fetched from ftp.kuis.kyoto-u.ac.jp.
The next three lines show a short sed filter that transforms an SKK-type
dictionary into the Plan 9 format. Before this, you should convert the kanji code from ujis
(EUC) to UTF-8 with the tcs utility, of course.
s/\// /g
s/  / /g
s/ $//g
The header items are sorted in a strange order in the original SKK
dictionary. The present implementation does not care about the order, so
you can change it yourself.
9) an SKK jisho, such as SKK-JISYO.S, is composed of two parts: okuri-ari and
okuri-nashi entries. This split depends heavily on Japanese grammar:
okuri-ari entries may represent verbs, adjectives, etc., i.e., not nouns.
These two parts work differently in the original SKK system; however,
I did not employ that method, but rather the simple approach described
in (2) and (3). Here, there is no difference between the two parts,
and the only reason I left the two-part structure in place is
to make the file easier to read when editing. Of course, you can change it
without any side effects.
10) This implementation of the Japanese input method essentially converts one word
per trigger key. This may feel cumbersome
to Nihongo users who are accustomed to, say, Windows. I know this.
However, I intended to keep the code as compact as possible, as a first step
toward developing a Nihongo input system on Plan 9. Furthermore, I have never seen
the latter kind of system work perfectly. I think a conversion has essentially failed
when we see more than, say, five or six candidates for one run of input hiragana
runes.
11) a usage example: if you want to produce the Japanese text below:
私は毎日35分以上も歩いて, 更に10分電車に乗って学校に通います.
健康の維持にも役だっていますが, なかなかたのしいものです.
your keystroke stream should be:
watashiHA[^t]mainichi[^t]35[^l]fun[^t]ijouMO[^t]aruIte, [^t]saraNI[^t]
10[^l]fun[^t]denshaNI[^t]noTte[^t]gakkouNI[^t]kayoImasu.[^t]
kenkouNO[^t]ijiNImo[^t]yakuDAtteimasuga, [^t]nakanaka[^l]tanoshiI[^t] <-- NO! :-)
monodesu.[^l]
where [^t] and [^l] indicate 'ctl-t' and 'ctl-l' respectively.
12) yes, a ton of TODO items, of course. ^_^
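The capital-letter convention of (2) and (3) can be pictured with a small C sketch. This is only an illustration under assumptions: split_at_capital is a hypothetical helper, not a function from ktrans, and the README does not say how ktrans itself treats the two parts. The sketch only shows that the first capital letter marks where the okurigana or joshi begins.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from ktrans: split a romaji word at its
 * first capital letter. The letters before the capital go to stem; the rest,
 * folded to lower case, goes to tail. Both buffers hold n bytes and longer
 * parts are truncated. Returns 1 if a capital was found, 0 otherwise.
 */
int
split_at_capital(const char *in, char *stem, char *tail, size_t n)
{
	size_t i, len, k;
	int found;

	assert(n > 0);
	/* find the first capital letter, if any */
	for(i = 0; in[i] != '\0' && !isupper((unsigned char)in[i]); i++)
		;
	found = in[i] != '\0';

	/* everything before it is the stem */
	len = i < n-1 ? i : n-1;
	memcpy(stem, in, len);
	stem[len] = '\0';

	/* the rest is the okurigana or joshi, lower-cased */
	for(k = 0; in[i] != '\0' && k < n-1; i++)
		tail[k++] = (char)tolower((unsigned char)in[i]);
	tail[k] = '\0';
	return found;
}
```

With this helper, "ugoKasu" splits into the stem "ugo" and the tail "kasu", "watashiHA" splits into "watashi" and "ha", and a word without a capital such as "mainichi" is returned whole as the stem.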
Kenji August 15, 1998
----------
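The 'ctl-t' cycling and the naive learning described in the README can also be sketched in C. The Entry type and the functions show_after and accept_candidate are names made up for this illustration, not the structures used in ktrans, and the wrap-around back to the first candidate after the kana has been shown again is an assumption; the README only says that the initial hiragana reappears when the candidates run out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAXCAND = 8 };

/*
 * Illustrative type only, not the ktrans data structure:
 * one dictionary entry maps a kana reading to a short candidate list.
 */
typedef struct Entry Entry;
struct Entry {
	const char *kana;           /* reading as typed */
	const char *cand[MAXCAND];  /* kanji candidates, most preferred first */
	int ncand;                  /* number of candidates in use */
};

/*
 * What to show after the press-th ctl-t (counting from 1): the candidates
 * in order, then the original kana once they are exhausted, then (by
 * assumption) the candidates again.
 */
const char*
show_after(const Entry *e, int press)
{
	int slot;

	assert(press >= 1);
	slot = (press - 1) % (e->ncand + 1);
	return slot < e->ncand ? e->cand[slot] : e->kana;
}

/*
 * Naive learning: move the accepted candidate to the front of the list so
 * that it is offered first next time. Only the in-memory list changes;
 * nothing is written back to the dictionary file.
 */
void
accept_candidate(Entry *e, int i)
{
	const char *picked;

	assert(i >= 0 && i < e->ncand);
	picked = e->cand[i];
	/* shift cand[0..i-1] up by one slot, then put the pick in front */
	memmove(&e->cand[1], &e->cand[0], (size_t)i * sizeof e->cand[0]);
	e->cand[0] = picked;
}
```

For an entry with the reading かう and the candidates 買う and 飼う, successive presses show 買う, 飼う, and then かう again. After accept_candidate picks 飼う, the next conversion of かう offers 飼う first, until the session ends.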
* [9fans] updated version of ktrans {JIS coded]
@ 1998-08-15 9:13 Kenji
From: Kenji @ 1998-08-15 9:13 UTC
I still have one more problem, regarding reading /dev/kbd.
In the main() function of translate.c in the original ktrans, I added
two lines just before the line that gets mp:
mp = match(bp, &nchar, table);
The added lines are a very simple tolower() call, like this:
/* in kana mode, fold an upper-case ASCII letter to lower case */
if(table == kana && *bp >= 'A' && *bp <= 'Z')
	*bp = tolower(*bp);
Here, I expected that we would see "か" when I type
"KA" on the keyboard. However, I got "kあ".
I've read the source code many times, but I could not
understand why it behaves like that. I know my brain
is reaching its limit, though...
Kenji