9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] updated version of ktrans {JIS coded]
From: Kenji @ 1998-08-16  9:07 UTC


I have solved the problem I asked about last night by myself.  This is
because I listened to great violin music by MIDORI last night, and
she refreshed my brain effectively.  ^_^


By the way, I think I have finished it, and I will put it as a bundle (please
tell me how to make a bundle file safely) on our Web page for Plan 9,
http://basalt.cias.osakafu-u.ac.jp/plan9_doc/plan9_index.html, which has
been our secret Web page, written in Japanese, since 1995.  :-)  The page is
EUC-encoded Japanese text, and I will add a small English part to hold
the bundle file (if I can make it safely).  Therefore, don't be afraid
of the many encrypted-looking lines. :-)


The final version (I hope) of the README.kenji file is attached below.

-------- from here -----
  This is a version of ktrans, modified from the original in the Plan 9
distribution by Kenji Okamoto, Aug. 16, 1998.   I tried (actually, learned) to
keep to the artistic level of the original code.  However, it is too smart for
me, and seems beyond my skill.

  I'm afraid I have made it seriously brain damaged.  If so, I apologize, because
I know who wrote the original.  My only excuse for this version is "it
works".


Newly added features are as follows:

1) a "local" dictionary file for translation from kana to kanji, which the user
      can easily edit.  The default file name is $home/lib/ktrans-jisho.
      If you want to use another dictionary file, set the KTJISHO environment
      variable to point to that file.

2) capital romaji input for a word such as a verb or adjective with okurigana,
      which follows the idea of the SKK system by Masahiko Sato of Kyoto Univ.
      (masahiko@kuis.kyoto-u.ac.jp).  If you want to get the kanji string (runes)
      "動かす", which is a verb, you may input "ugoKasu" from the keyboard.
      Note that the k of Kasu is a capital (important).  You will see the hiragana
      runes "うごかす", and then the kanji runes "動かす" when you hit 'ctl-t'.
        If you are satisfied with that translation, continue to input the next word.
      If you are not pleased with that candidate, hit 'ctl-t' once more to see
      more candidates for that hiragana input.  When no more candidates are
      registered in your dictionary, you will see the initial hiragana input.

3) for Japanese "joshi", a short postpositional word after a noun, you can use another
      method which I developed for this work.  If you want the kanji string (runes)
      "私は", then try typing "watashiHA" on the keyboard.  Note that the sound
      "wa(ha)" is expressed as capitalized "HA".  You will see the hiragana string
      "わたしは", and then "私は" after 'ctl-t'.

4) a control sequence, 'ctl-l', is introduced to leave the input hiragana runes unchanged.
      This is occasionally necessary.

5) a simple learning mechanism has been implemented in the in-memory hashed dictionary,
      where the most recently used kanji runes (candidate) move to the top of the
      list of candidates.  This is valid only during the session in which you called
      ktrans.  This is done intentionally, because the present learning method is
      ..well... naive. ^_^  I know this; however, I believe you can solve it by making
      a good dictionary best fitted to your purpose by yourself.

6) 'ctl-q' ends the session when you want to edit your kana-kanji translation
      dictionary with sam.  I chose this only for the sake of simplicity.
      The dictionary is read only once, at the beginning of the ktrans session.

7) the change to kana-input mode is triggered by 'ctl-n', not 'ctl-g' (original).
      The simple reason is that I feel it is a better trigger key for Japanese
      (nihongo).  I reassigned 'ctl-g' to Greek mode.  If I'm doing something
      wrong, please tell me.  I know the feeling from Japanese, but not from
      Greek.

8) as the starting $home/lib/ktrans-jisho, you may re-format the SKK-JISYO.S
      (66.9KB) of the SKK system, which can be obtained from ftp.kuis.kyoto-u.ac.jp.
      The next three lines show a short sed filter to transform an SKK type
      dictionary to the Plan 9 format.  Before this, you should change the kanji
      code from ujis (euc) to UTF-8 with the tcs utility, of course.
            s/\// /g
            s/  /	/g
            s/ $//g
      The header items are sorted in a strange order in the original SKK
      dictionary.  The present implementation does not care about the order;
      therefore, you can change it by yourself.

9) an SKK jisho, such as SKK-JISYO.S, is composed of two parts, okuri-ari and
      okuri-nashi entries.  This depends greatly on Japanese grammar,
      and okuri-ari may represent verbs/adjectives etc., i.e., not nouns.
        These two parts work differently in the original SKK system; however,
      I did not employ that method, but rather the simple approach described
      in (2) and (3).  Here, there is no difference between these two parts,
      and the only reason I left the two-part structure in place is to make
      the dictionary easier to read when editing.  Of course, you can change it
      without any side effects.
    
10) This implementation of the Japanese input method essentially converts one
      word per trigger key.  This may feel cumbersome to Nihongo users who are
      accustomed to, say, Windows.  I know this.  However, I intended to keep
      the code as compact as possible as a first step toward developing a
      Nihongo input system on Plan 9.  Furthermore, I have never seen the
      latter work perfectly.  I think the conversion has essentially failed
      when we see more than, say, five or six candidates for one hiragana
      input.

11) a usage example: if you want to make the Japanese text below:

           私は毎日35分以上も歩いて, 更に10分電車に乗って学校に通います.
           健康の維持にも役だっていますが, なかなかたのしいものです.

      your keyboard input stream should be:

      watashiHA[^t]mainichi[^t]35[^l]fun[^t]ijouMO[^t]aruIte, [^t]saraNI[^t]
      10[^l]fun[^t]denshaNI[^t]noTte[^t]gakkouNI[^t]kayoImasu.[^t]
      kenkouNO[^t]ijiNImo[^t]yakuDAtteimasuga, [^t]nakanaka[^l]tanoshiI[^t] <-- NO! :-)
      monodesu.[^l]

      where [^t] and [^l] indicate 'ctl-t' and 'ctl-l' respectively.

12) yes, there are tons of TODO items, of course. ^_^


		       				Kenji    August 15, 1998 
----------
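
A minimal sketch of the dictionary lookup rule in feature (1), not taken
from the ktrans source.  The function name jishopath and its interface are
hypothetical; only the KTJISHO variable and the default
$home/lib/ktrans-jisho come from the README.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not from ktrans): choose the dictionary path.
 * ktjisho is the value of $KTJISHO and home the value of $home;
 * either may be NULL or empty.  Returns 0 and fills buf on success,
 * -1 if no path can be built or buf (n bytes) is too small.
 */
int
jishopath(const char *ktjisho, const char *home, char *buf, size_t n)
{
	static const char suffix[] = "/lib/ktrans-jisho";

	assert(buf != NULL && n > 0);
	if(ktjisho != NULL && ktjisho[0] != '\0'){
		/* an explicit KTJISHO wins over the default */
		if(strlen(ktjisho) >= n)
			return -1;
		strcpy(buf, ktjisho);
		return 0;
	}
	if(home != NULL && home[0] != '\0'){
		/* fall back to $home/lib/ktrans-jisho */
		if(strlen(home) + strlen(suffix) >= n)
			return -1;
		strcpy(buf, home);
		strcat(buf, suffix);
		return 0;
	}
	return -1;
}
```

A caller would pass in the two environment values, for example
jishopath(getenv("KTJISHO"), getenv("home"), buf, sizeof buf).  The
variable is named HOME rather than home outside Plan 9.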

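The capital-letter convention in (2) and (3) can be sketched as below.
This is only an illustration: the README does not say how ktrans uses the
position of the capital internally, so the function capsplit, its
interface, and the idea of returning the boundary index are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not from ktrans): copy the romaji input to out
 * in lower case and return the index of the first capital letter,
 * i.e. where the okurigana or joshi starts, or -1 if there is none.
 * out has room for n bytes; the input must fit.
 */
int
capsplit(const char *in, char *out, size_t n)
{
	int i, mark;
	unsigned char c;

	assert(in != NULL && out != NULL && strlen(in) < n);
	mark = -1;
	for(i = 0; in[i] != '\0'; i++){
		c = (unsigned char)in[i];
		if(mark < 0 && isupper(c))
			mark = i;	/* remember only the first capital */
		out[i] = (char)tolower(c);
	}
	out[i] = '\0';
	return mark;
}
```

With this sketch, "ugoKasu" becomes "ugokasu" with the boundary at index 3,
"watashiHA" becomes "watashiha" with the boundary at index 7, and a word
without capitals such as "mainichi" returns -1.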

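The learning mechanism in (5) is a move-to-front rule.  A minimal sketch
follows, assuming the candidates for one reading are held in a plain array
of strings; the real ktrans data structure is not shown in the README, and
the name movetofront is hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not from ktrans): move the chosen candidate
 * cand[pick] to the head of the list, shifting the candidates that
 * were in front of it down by one place.
 */
void
movetofront(const char **cand, int ncand, int pick)
{
	const char *chosen;

	assert(cand != NULL && pick >= 0 && pick < ncand);
	chosen = cand[pick];
	/* slide cand[0..pick-1] to cand[1..pick] */
	memmove(&cand[1], &cand[0], (size_t)pick * sizeof cand[0]);
	cand[0] = chosen;
}
```

For example, if the candidates for one reading are 紙, 神, 髪 and the user
settles on the third one, the list becomes 髪, 紙, 神, so 髪 is offered first
the next time.  Nothing is written back to the dictionary file, which matches
the statement that the learning lasts only for the session.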

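The three sed commands for converting an SKK dictionary can also be read
as one small function.  The sketch below mirrors them step by step on a
single line that is already UTF-8 and has no trailing newline.  The name
skk2plan9 is hypothetical, and the sample entries mentioned afterwards are
made up for illustration rather than quoted from SKK-JISYO.S.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: convert one SKK entry of the form
 * "reading /cand1/cand2/" in place, following the sed filter.
 */
void
skk2plan9(char *line)
{
	char *r, *w;
	size_t n;

	assert(line != NULL);
	/* s/\// /g : every slash becomes a space */
	for(r = line; *r != '\0'; r++)
		if(*r == '/')
			*r = ' ';
	/* s/  /<tab>/g : each pair of spaces becomes one tab */
	for(r = w = line; *r != '\0'; r++){
		if(r[0] == ' ' && r[1] == ' '){
			*w++ = '\t';
			r++;
		}else
			*w++ = *r;
	}
	*w = '\0';
	/* s/ $//g : drop one trailing space */
	n = strlen(line);
	if(n > 0 && line[n-1] == ' ')
		line[n-1] = '\0';
}
```

An entry such as "あい /愛/相/" comes out as the reading, a tab, and the
candidates separated by single spaces.  The bytes of UTF-8 kana and kanji
never equal '/' or ' ', so working byte by byte is safe here.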




* [9fans] updated version of ktrans {JIS coded]
From: Kenji @ 1998-08-15  9:13 UTC


I still have one more problem regarding reading /dev/kbd.

In the main() function of translate.c in the original ktrans, I added
two lines just before the line that gets mp:
   mp = match(bp, &nchar, table);

The added lines are a very simple use of tolower(), like this:

if (table == kana && (*bp <='Z' && *bp>= 'A'))
	*bp = tolower(*bp);

Here, I expected to see "か" when I type "KA" on the keyboard.
However, I got "kあ".
I've read the source code many times, but I couldn't
understand why it behaves like that.  I know my brain
is reaching its limit, though...

Kenji






