From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 13 Aug 1998 11:52:45 +0900
From: Kenji Okamoto okamoto@earth.cias.osakafu-u.ac.jp
Subject: [9fans] updated version of ktrans
Topicbox-Message-UUID: 7c433364-eac8-11e9-9e20-41e7f4b1d025
Message-ID: <19980813025245.1JCaDfgde1bJY7FAdMc0Z4x7K-ux26A_k3vlHU4Cmuc@z>

Kenji again.

First of all, please forgive me for writing this to the public
mailing list. I tried twice to reply to Rob yesterday, but both
attempts failed to reach you, with the error messages
"451 read error from plan9.bell-labs.com" and
"Connection reset by plan9.bell-labs.com". I have no other means to
reach you, so please forgive me for including a personal letter to
you, attached below.

After I wrote this e-mail to Rob, I made some improvements to this
project, and I now believe I'm in a phase of minor (I hope) bug
fixing, such as memory leaks.

I tried three sizes of the SKK kana-kanji conversion dictionary
(S = 61 KB, M = 171 KB, L = 3.7 MB), and got the feeling that the
smallest dictionary may be enough to write a short text such as
e-mail. The medium-sized dictionary made me much more comfortable;
however, its process size grows to 1.5 MB (640 KB for the smallest).
Therefore, I think this implementation is at a good level for much
everyday Japanese text editing. Here, I am not imagining the naive
users that personal computers are targeting. This may fit the
concept of Plan 9, I believe. :-)

The new version of my README file is as follows (I wrote this in JIS
code, because I'm now writing this mail on a unix machine, and it
does something wrong with UTF runes, as I learned from the previous
failure). If someone has problems reading this, please use tcs on a
Plan 9 system.

--------README------
kktrans is a version of ktrans modified by Kenji Okamoto
(Aug. 11, 1998), based on the original source of the Plan 9
distribution.

Newly added features are as follows:

1) It has a "local" dictionary file for translation from kana to
   kanji, which can easily be edited by the user.
   The default file name is $home/lib/kktrans-jisho.

2) Capital roma-ji input for a word with okurigana, which follows
   the idea of the SKK system by Masahiko Sato
   (masahiko@kuis.kyoto-u.ac.jp). If you want to translate to the
   kanji string "動かす", you may input "ugoKasu". Note that the k
   of "Kasu" is a capital (important). This is because the sound
   "Ka" can be "Ke" or "Ki" etc. in Japanese for various usages.
   Therefore, we need some skill to keep the size of the dictionary
   compact. SKK's method is very interesting, I believe. You will
   then see the kanji rune "動" when you hit "ctl-t".
   If you are satisfied with that translation, type the next word
   (or hit "ctl-l", which is newly implemented to leave hiragana
   runes intentionally unchanged). You will then see the rest of the
   okurigana, like "動かす".
   If you are not pleased with that candidate, hit "ctl-t" once more
   to see more candidates for that hiragana input.

3) A simple learning mechanism has been implemented in the in-memory
   dictionary hash table, where the most recently used kanji runes
   (candidate) move to the top of the list of candidates.

4) As the starting $home/lib/kktrans-jisho, you may re-format the
   SKK-JISYO.S (66.9KB) of the SKK system, which can be obtained from
   ftp.kuis.kyoto-u.ac.jp. The next three lines show a short sed
   filter to transform an SKK-type dictionary to the Plan 9 type.
   (You may first have to run tcs -f ujis :-)

	s/\// /g
	s/  / /g
	s/ $//g

5) An SKK jisho, such as SKK-JISYO.S, is composed of two parts:
   okuri-ari and okuri-nashi entries. This depends greatly on
   Japanese grammar; okuri-ari entries represent verbs, adjectives,
   etc., i.e., not nouns.
   These two parts work differently in the original SKK system;
   however, I did not employ that way, but rather the simple approach
   described in (2). Here, there is no difference between these two
   parts, and the two-part structure was left as is just to make the
   file easier to read when editing.

6) This implementation of a Japanese input method essentially
   converts one word at a time. This may feel cumbersome to Nihongo
   users. I know this. However, I intended to keep the code as
   compact as possible, as a first step in developing a Nihongo input
   system on Plan 9. To improve this, we would have to introduce some
   automatic mechanism like SKK's... I've not decided what would be
   best for us...

7) Tons of TODO lists, of course. ^_^

Kenji
August 12, 1998

-----personal mail-----
Hi Rob--

Thanks for your quick advice on the philosophical view of Plan 9.
I will update my code to follow your advice, i.e.,
$home/lib/kktrans-jisho for a personal on-line quick dictionary.

After I posted my last mail to the mailing list, I found two points
to improve in my user interface. One concerns your advice about a
"default" dictionary version. I'm now considering implementing this
as a slow, large dictionary which will be read from HDD on demand.
This large dictionary should then keep in memory only a hashed table
of grouped offset addresses. If I can finish it, I will put it in
/sys/lib. For this dictionary, I think I can apply Ritchie-san's
hlook.c algorithm with very small changes.

Another important point to be improved concerns the user interface
itself. Last night, when I was in bed, I got a new idea for
incorporating SKK's method into ktrans's basic scheme. I'll try it
now, because we are in the Obon vacation and I have time.

By the way, the new kktrans can be used to write short Japanese text
like e-mail, which was the single reason why I did it this time,
although it should be improved much more to write longer Japanese
text such as papers.

A Japanese input method is essential for us Japanese who want to use
Plan 9 for daily work. This has kept me from going into Plan 9 for a
long time. However, this new update may improve it a little; at
least, I hope so.

Kenji

PS.
This mail was sent in the morning and, because of some trouble, was
returned to me. I've now finished incorporating SKK.
-------cut here-----
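
For illustration only, here is a minimal Python sketch of README items 3 and 4. It is not taken from the kktrans source, which is written in C. The names skk_to_plan9 and Jisho are hypothetical, the sample entries are made up in SKK's "reading /cand1/cand2/" style, and the middle substitution (squeezing a doubled space) is an assumption about what the sed filter intends.

```python
# Hypothetical sketch; names and sample data are assumptions,
# not part of kktrans.

def skk_to_plan9(line):
    """Mimic the three sed substitutions on one dictionary line."""
    line = line.replace("/", " ")    # s/\// /g : slashes become spaces
    line = line.replace("  ", " ")   # assumed: squeeze a doubled space
    if line.endswith(" "):           # s/ $//g : drop one trailing space
        line = line[:-1]
    return line


class Jisho:
    """In-memory table: reading -> candidates, most recently used first."""

    def __init__(self):
        self.table = {}

    def add_line(self, line):
        # A converted line is "reading cand1 cand2 ...".
        fields = line.split(" ")
        if len(fields) >= 2:
            self.table[fields[0]] = fields[1:]

    def candidates(self, reading):
        # Return a copy so callers cannot reorder the table by accident.
        return list(self.table.get(reading, []))

    def select(self, reading, kanji):
        # "Learning": move the chosen candidate to the front of its list.
        cands = self.table.get(reading)
        if cands and kanji in cands:
            cands.remove(kanji)
            cands.insert(0, kanji)


jisho = Jisho()
for raw in ["うごk /動/", "かんじ /漢字/幹事/感じ/"]:
    jisho.add_line(skk_to_plan9(raw))

jisho.select("かんじ", "幹事")
print(jisho.candidates("かんじ"))
```

Moving the chosen candidate to the front keeps the learning step cheap and needs no usage counters, at the cost of remembering only the most recent choice for each reading.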