9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] input methods for non-ascii languages
@ 2003-07-29 12:06 sasa
  0 siblings, 0 replies; 20+ messages in thread
From: sasa @ 2003-07-29 12:06 UTC (permalink / raw)
  To: 9fans


i'm interested in the path from keyboard scan code to rune and to UTF-8.
kenji helped me with his sources.

sasa.
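
For readers following along, here is a minimal sketch of the path sasa describes: scan code, then rune, then UTF-8. It is not taken from Kenji's sources or from the Plan 9 kernel. The three-entry `KEYMAP` table and the function names are assumptions made up for illustration; the scan codes are the usual PC set-1 codes for the Q, W and E keys.

```python
# Hypothetical three-entry keymap: PC set-1 scan code -> rune (code point).
# A real keymap covers the whole keyboard and also handles shift, alt, etc.
KEYMAP = {
    0x10: ord("q"),
    0x11: ord("w"),
    0x12: ord("e"),
}


def rune_to_utf8(rune: int) -> bytes:
    """Encode one code point as UTF-8 by hand.

    Surrogates are not rejected here; for every other code point the
    result matches chr(rune).encode("utf-8").
    """
    if rune < 0x80:
        return bytes([rune])
    if rune < 0x800:
        return bytes([0xC0 | (rune >> 6), 0x80 | (rune & 0x3F)])
    if rune < 0x10000:
        return bytes([
            0xE0 | (rune >> 12),
            0x80 | ((rune >> 6) & 0x3F),
            0x80 | (rune & 0x3F),
        ])
    return bytes([
        0xF0 | (rune >> 18),
        0x80 | ((rune >> 12) & 0x3F),
        0x80 | ((rune >> 6) & 0x3F),
        0x80 | (rune & 0x3F),
    ])


def scancodes_to_utf8(codes):
    """Map each known scan code to a rune, then to its UTF-8 bytes."""
    return b"".join(rune_to_utf8(KEYMAP[c]) for c in codes if c in KEYMAP)
```

The keymap step is the only part that depends on the keyboard; the UTF-8 step is the same for every rune, whatever produced it.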



* Re: [9fans] input methods for non-ascii languages
  2003-07-30  8:26         ` Anthony Mandic
@ 2003-07-30 11:35           ` boyd, rounin
  0 siblings, 0 replies; 20+ messages in thread
From: boyd, rounin @ 2003-07-30 11:35 UTC (permalink / raw)
  To: 9fans

> But going back to your original comment where you said that
> Japanese has 4 character sets, in terms of computer character
> sets (i.e. s-jis etc.) it's only one, isn't it?

err, iirc, no.  there are multiple jis versions and it's pretty ugly.
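
A small illustration of why this gets ugly, using only codecs that ship with Python. It shows the same Japanese text in several byte encodings that are all in common use. It does not show the differences between revisions of the JIS standards themselves, which is a separate problem; the sample string and the helper name `encode_all` are just choices for this sketch.

```python
TEXT = "日本語"
ENCODINGS = ["utf-8", "shift_jis", "euc_jp", "iso2022_jp"]


def encode_all(text):
    """Return {encoding name: encoded bytes} for each encoding listed above."""
    return {name: text.encode(name) for name in ENCODINGS}


# Print the bytes side by side; every encoding gives a different sequence.
for name, raw in encode_all(TEXT).items():
    print(f"{name:11} {raw.hex(' ')}")
```

A program that receives such bytes without a label has to guess which encoding was used, which is a large part of the ugliness.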




* Re: [9fans] input methods for non-ascii languages
  2003-07-30  2:33               ` okamoto
  2003-07-30  2:40                 ` boyd, rounin
  2003-07-30  4:45                 ` Skip Tavakkolian
@ 2003-07-30  8:26                 ` Anthony Mandic
  2 siblings, 0 replies; 20+ messages in thread
From: Anthony Mandic @ 2003-07-30  8:26 UTC (permalink / raw)
  To: 9fans

okamoto@granite.cias.osakafu-u.ac.jp wrote:
> 
> > yes, calligraphy is hard.  i can't even write ascii based stuff now after
> 
> I was raised in a Kanji culture where it's very natural that we can't read
> ancient Kanji writings, because they're too artificial.
...
> I felt that every nation has a similar tendency
> to write their words in an art-like way if possible.   However, in the case of
> Kanji, it's hard to read when written that way, and importantly we feel it is
> more beautiful when art-like...

	Even in the West, illuminated texts and such fonts as Gothic,
	even when printed, are hard to read. Any flowery or showy
	type of font is difficult because you have to work out what
	each letter is, which slows down the reading pace. It gets
	worse when characters look similar.

	Handwriting varies from person to person and I find some hard to
	read. Reading copperplate is difficult if the writer stylises it
	too much. So it can be much the same as written Kanji. Artists
	stylise their calligraphy even further, which doesn't help, but
	it's done on purpose. I can't even make out much of what's written
	in graffiti tags. But then I suspect I'm not meant to.

-am	© 2003



* Re: [9fans] input methods for non-ascii languages
  2003-07-29 20:47       ` boyd, rounin
@ 2003-07-30  8:26         ` Anthony Mandic
  2003-07-30 11:35           ` boyd, rounin
  0 siblings, 1 reply; 20+ messages in thread
From: Anthony Mandic @ 2003-07-30  8:26 UTC (permalink / raw)
  To: 9fans

"boyd, rounin" wrote:
> 
> > These two could be considered to be different cases for the
> > same syllabic sound. So they'd be something akin to upper and
> > lower case in European character sets ...
> 
> not at all.  you DO NOT write 'pure' japanese with katakana -- ever.

	Well, you could if you wanted to, but I do understand what you
	mean. It's not normal and would look as odd to the Japanese,
	possibly, as all uppercase does to us.

	But going back to your original comment where you said that
	Japanese has 4 character sets, in terms of computer character
	sets (i.e. s-jis etc.) it's only one, isn't it?

-am	© 2003



* Re: [9fans] input methods for non-ascii languages
  2003-07-30  2:33               ` okamoto
  2003-07-30  2:40                 ` boyd, rounin
@ 2003-07-30  4:45                 ` Skip Tavakkolian
  2003-07-30  8:26                 ` Anthony Mandic
  2 siblings, 0 replies; 20+ messages in thread
From: Skip Tavakkolian @ 2003-07-30  4:45 UTC (permalink / raw)
  To: 9fans

> I felt that every nation has a similar tendency
> to write their words in an art-like way if possible.

In Persian writings, especially poetry, calligraphy is used to
visually match the beauty of what is being said, and to give a hint of
its tone.  Making the words readable is not the primary consideration.




* Re: [9fans] input methods for non-ascii languages
  2003-07-30  2:33               ` okamoto
@ 2003-07-30  2:40                 ` boyd, rounin
  2003-07-30  4:45                 ` Skip Tavakkolian
  2003-07-30  8:26                 ` Anthony Mandic
  2 siblings, 0 replies; 20+ messages in thread
From: boyd, rounin @ 2003-07-30  2:40 UTC (permalink / raw)
  To: 9fans

> > yes, calligraphy is hard.  i can't even write ascii based stuff now after
>
> I was raised in a Kanji culture where it's very natural that we can't read
> ancient Kanji writings, because they're too artificial.

yes, it's too (what is the word?), idiomatic (although that usually applies
to speech).  it is an art.




* Re: [9fans] input methods for non-ascii languages
  2003-07-30  2:17             ` boyd, rounin
@ 2003-07-30  2:33               ` okamoto
  2003-07-30  2:40                 ` boyd, rounin
                                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: okamoto @ 2003-07-30  2:33 UTC (permalink / raw)
  To: 9fans

> yes, calligraphy is hard.  i can't even write ascii based stuff now after

I was raised in a Kanji culture where it's very natural that we can't read
ancient Kanji writings, because they're too artificial.

When I was at the old Bible museum in Jerusalem, I was very surprised
that a young mother was reading that Bible, written 2000 years ago!, to
her daughter.   I couldn't help asking her how she could read such
an old document, and her answer was, well, it's not easy to read,
because it's a bit art-like.   I felt that every nation has a similar tendency
to write their words in an art-like way if possible.   However, in the case of
Kanji, it's hard to read when written that way, and importantly we feel it is
more beautiful when art-like...

Kenji




* Re: [9fans] input methods for non-ascii languages
  2003-07-30  2:06           ` okamoto
@ 2003-07-30  2:17             ` boyd, rounin
  2003-07-30  2:33               ` okamoto
  0 siblings, 1 reply; 20+ messages in thread
From: boyd, rounin @ 2003-07-30  2:17 UTC (permalink / raw)
  To: 9fans

> Is that your tattoo, which I said you should not get?

it most certainly is [~5x3cm]:

    http://japan.chez.tiscali.fr/TokyoWeb/E-Ronin.htm

gaijin dakara, wakaranai ...

suits me fine, 'cos i'm:

    a) masterless
    b) unemployed
    c) have a CS major, but no BSc

yes, calligraphy is hard.  i can't even write ascii based stuff now after
all this keyboard work and general language confusion.

now, if i could just get some 'asari' ...




* Re: [9fans] input methods for non-ascii languages
  2003-07-30  1:19         ` boyd, rounin
@ 2003-07-30  2:06           ` okamoto
  2003-07-30  2:17             ` boyd, rounin
  0 siblings, 1 reply; 20+ messages in thread
From: okamoto @ 2003-07-30  2:06 UTC (permalink / raw)
  To: 9fans

do you have (see attached)?

I did it now.

Is that your tattoo, which I said you should not get?
I hesitate to say anything here.
Writing Kanji is a kind of art, for which I have no great skill,
although I had lessons in it in my youth.

Kenji




* Re: [9fans] input methods for non-ascii languages
  2003-07-30  0:51       ` okamoto
  2003-07-30  1:14         ` okamoto
@ 2003-07-30  1:19         ` boyd, rounin
  2003-07-30  2:06           ` okamoto
  1 sibling, 1 reply; 20+ messages in thread
From: boyd, rounin @ 2003-07-30  1:19 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 132 bytes --]

> In ktrans, I know, we have no proper nouns, such as names of
> people or places, though.

do you have (see attached)?

[-- Attachment #2: rounin.jpg --]
[-- Type: image/jpeg, Size: 16624 bytes --]


* Re: [9fans] input methods for non-ascii languages
  2003-07-30  0:51       ` okamoto
@ 2003-07-30  1:14         ` okamoto
  2003-07-30  1:19         ` boyd, rounin
  1 sibling, 0 replies; 20+ messages in thread
From: okamoto @ 2003-07-30  1:14 UTC (permalink / raw)
  To: 9fans

One addition!

If you believe it's most important to write as fast as possible,
ktrans is not for you.   However, I don't read such documents.
Writing is not something easily done in a short time, and our speed
of writing is not so high anyway, I believe.

Kenji




* Re: [9fans] input methods for non-ascii languages
  2003-07-29 12:20     ` David Presotto
@ 2003-07-30  0:51       ` okamoto
  2003-07-30  1:14         ` okamoto
  2003-07-30  1:19         ` boyd, rounin
  0 siblings, 2 replies; 20+ messages in thread
From: okamoto @ 2003-07-30  0:51 UTC (permalink / raw)
  To: 9fans


> Some years back I saw a grammar based editor at Sony.
> It allowed you to type a sentence in romaji and it would display
> a romaji/kanji/kana representation of the sentence at the bottom
> of the screen.  Since it 'understood' the sentence, by the time
> you got to the end of the sentence, it had a pretty high probability
> of having it right.  You could then tab over to any word and cycle
> through hiragana, katakana, romaji and the possible kanji equivs.
> Of course, as you fixed each word, it could be changing all of the
> unfixed part to match.  I'd hate to think how much code was
> behind it.

I don't believe it got any acceptance from highly skilled Japanese writers.
There may be two strategies for translating Kana/Romaji input to Kanji:
one for beginners in Japanese, by which I don't mean foreigners,
and the other for well-educated Japanese writers.
For the former, automatic translation would be helpful; however,
for the latter, it would be deeply annoying.   I'm using ktrans for
everyday mail and fairly long documents without any problem.
I added many words though.

In ktrans, I know, we have no proper nouns, such as names of
people or places, though.

Kenji




* Re: [9fans] input methods for non-ascii languages
  2003-07-29 15:07     ` Anthony Mandic
@ 2003-07-29 20:47       ` boyd, rounin
  2003-07-30  8:26         ` Anthony Mandic
  0 siblings, 1 reply; 20+ messages in thread
From: boyd, rounin @ 2003-07-29 20:47 UTC (permalink / raw)
  To: 9fans

> These two could be considered to be different cases for the
> same syllabic sound. So they'd be something akin to upper and
> lower case in European character sets ...

not at all.  you DO NOT write 'pure' japanese with katakana -- ever.

if you want to emphasise something in japanese you don't use
katakana.  if i wanted to write:

    baka desu

and i wanted to emphasise it, i'd write:

    baka yaro

[iirc] and it would _not_ be written in katakana.

like it says in Johnny English, by Pascal Sauvage [a french guy]:

    as we say in france -- the top, the best of

    http://us.imdb.com/Title?0274166




* Re: [9fans] input methods for non-ascii languages
  2003-07-29 11:29   ` boyd, rounin
  2003-07-29 12:20     ` David Presotto
@ 2003-07-29 15:07     ` Anthony Mandic
  2003-07-29 20:47       ` boyd, rounin
  1 sibling, 1 reply; 20+ messages in thread
From: Anthony Mandic @ 2003-07-29 15:07 UTC (permalink / raw)
  To: 9fans

"boyd, rounin" wrote:

> japanese is a special case 'cos it has 4 character sets:
> 
>     - hiragana [phonetic set for japanese words]
>     - katakana [phonetic set for foreign words]

	These two could be considered to be different cases for the
	same syllabic sound. So they'd be something akin to upper and
	lower case in European character sets (although I don't know
	how they are actually generated on Japanese keyboards, I had
	thought it might be by using the shift key). Hence one character
	set.
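
	As a side note on the "different cases" idea: in Unicode, at least, the
	two kana blocks are laid out in parallel, so mapping one onto the other
	is a fixed offset, much like ASCII upper and lower case. The sketch below
	assumes only the basic ranges U+3041..U+3096 (hiragana) and
	U+30A1..U+30F6 (katakana); the function names are made up, and nothing
	here says anything about how the two scripts are actually used in
	written Japanese.

```python
KANA_OFFSET = 0x60  # distance between the hiragana and katakana blocks


def hira_to_kata(text):
    """Shift basic hiragana (U+3041..U+3096) up into the katakana block."""
    return "".join(
        chr(ord(c) + KANA_OFFSET) if "\u3041" <= c <= "\u3096" else c
        for c in text
    )


def kata_to_hira(text):
    """Shift basic katakana (U+30A1..U+30F6) down into the hiragana block."""
    return "".join(
        chr(ord(c) - KANA_OFFSET) if "\u30a1" <= c <= "\u30f6" else c
        for c in text
    )
```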

>     - kanji [the ideographs]
>     - romaji [romanised representation]
> 
> i've seen numerous systems and keyboards for doing this
> and other things (the various japanese on 9fans know better
> than i, obviously) and it's pretty nasty.
> 
> some keyboards have the kana imposed on a qwerty keyboard
> and you use a 'shift' key to get at them.
>
> for typing the kanji, well the system i like is that you type the
> stem of the pronunciation and you then cycle through a
> set of ideographs until the one you want turns up.  i'm
> not sure, but there should be no reason why such a
> system couldn't sort them by frequency, on a personalised
> basis.

	I recall seeing a science show on the (Australian) ABC a
	few years back where a team from an Australian university
	came up with using the numeric keypad and working off
	stroke order. Since Chinese characters have specific strokes
	and a stroke order, they claimed it was easy to assign the
	strokes to the numeric keys and let the computer determine
	the character from the stroke order. I don't know what became
	of this method - perhaps it just never took off.
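
	The details of that university's method aren't given here, so the
	following is only a guess at the general shape of a stroke-order keypad
	scheme. It assumes five stroke classes on the keys 1 to 5 (horizontal,
	vertical, left-falling, dot or right-falling, turning) and a toy
	four-entry index; both the table and the function names are
	illustrative.

```python
# Keys: 1 horizontal, 2 vertical, 3 left-falling, 4 dot/right-falling, 5 turning.
# Toy index from stroke-key sequence to the characters written that way.
STROKE_INDEX = {
    "12": ["十"],
    "34": ["人", "八"],
    "134": ["大"],
    "1234": ["木"],
}


def lookup(keys):
    """Characters whose full stroke sequence is exactly `keys`."""
    return list(STROKE_INDEX.get(keys, []))


def complete(prefix):
    """Characters whose stroke sequence starts with `prefix`, shortest first."""
    hits = []
    for seq in sorted(STROKE_INDEX, key=lambda s: (len(s), s)):
        if seq.startswith(prefix):
            hits.extend(STROKE_INDEX[seq])
    return hits
```

	Even this toy table shows one weakness: different characters can share a
	stroke sequence, so the user still has to pick from a list.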

> iirc the basic set of kanji is around 800, then there's a jump
> to 2000 and most newspapers use around 6000.
> 
> reading them is hard enough, but in writing them you have
> to remember the 'stroke order', not some random set of
> strokes that will get you the character (this goes for the
> kana as well, but they are simple).

	Stroke order isn't too hard, and it's easy to learn once you get
	the hang of it. It's fairly natural, actually. What I found
	to be hard was getting the correct pronunciation of the kanji.
	Since you had On and Kun etc. it wasn't easy.

-am	© 2003



* Re: [9fans] input methods for non-ascii languages
  2003-07-29 10:44 ` Skip Tavakkolian
  2003-07-29 11:29   ` boyd, rounin
@ 2003-07-29 12:23   ` David Presotto
  1 sibling, 0 replies; 20+ messages in thread
From: David Presotto @ 2003-07-29 12:23 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 346 bytes --]

Dmr had done a kbd daemon in the past that lets you type
romaji and cycle each character or word through different
representations.  It wasn't as nice as the Sony system
but it was better than typing lots of alts.  You might
ask him for the code and make it work in the current world.
You'd have to mount on top of the keyboard to make it work.
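
Dmr's code isn't shown in this thread, so the following is not his daemon, only a rough sketch of the idea of typing romaji and cycling through representations. The romaji table is a tiny made-up excerpt, the function names are hypothetical, and kanji candidates (which would need a dictionary) are left out.

```python
# Tiny romaji -> hiragana table; a real one has well over a hundred entries.
ROMAJI_TO_HIRAGANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "na": "な", "ni": "に", "ho": "ほ", "n": "ん",
}


def to_hiragana(romaji):
    """Greedy longest-match conversion; unknown letters pass through."""
    out, i = [], 0
    while i < len(romaji):
        for size in (3, 2, 1):  # try the longest chunk first
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            out.append(romaji[i])
            i += 1
    return "".join(out)


def representations(romaji):
    """The forms a user could cycle through: romaji, hiragana, katakana."""
    hira = to_hiragana(romaji)
    # Basic katakana sit 0x60 above the matching hiragana in Unicode.
    kata = "".join(
        chr(ord(c) + 0x60) if "\u3041" <= c <= "\u3096" else c for c in hira
    )
    return [romaji, hira, kata]
```

Wrapping the result in `itertools.cycle(representations("nihon"))` and calling `next()` on each key press would give the cycling behaviour.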

[-- Attachment #2: Type: message/rfc822, Size: 2460 bytes --]

From: "Skip Tavakkolian" <fst@centurytel.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] input methods for non-ascii languages
Date: Tue, 29 Jul 2003 03:44:41 -0700
Message-ID: <0488901b9c79a39ff7a6284c92c17653@centurytel.net>



* Re: [9fans] input methods for non-ascii languages
  2003-07-29 11:29   ` boyd, rounin
@ 2003-07-29 12:20     ` David Presotto
  2003-07-30  0:51       ` okamoto
  2003-07-29 15:07     ` Anthony Mandic
  1 sibling, 1 reply; 20+ messages in thread
From: David Presotto @ 2003-07-29 12:20 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 597 bytes --]

Some years back I saw a grammar based editor at Sony.
It allowed you to type a sentence in romaji and it would display
a romaji/kanji/kana representation of the sentence at the bottom
of the screen.  Since it 'understood' the sentence, by the time
you got to the end of the sentence, it had a pretty high probability
of having it right.  You could then tab over to any word and cycle
through hiragana, katakana, romaji and the possible kanji equivs.
Of course, as you fixed each word, it could be changing all of the
unfixed part to match.  I'd hate to think how much code was
behind it.

[-- Attachment #2: Type: message/rfc822, Size: 3913 bytes --]

From: "boyd, rounin" <boyd@insultant.net>
To: <9fans@cse.psu.edu>
Subject: Re: [9fans] input methods for non-ascii languages
Date: Tue, 29 Jul 2003 13:29:32 +0200
Message-ID: <010c01c355c4$af2cb6c0$b9844051@insultant.net>



* Re: [9fans] input methods for non-ascii languages
  2003-07-29 10:44 ` Skip Tavakkolian
@ 2003-07-29 11:29   ` boyd, rounin
  2003-07-29 12:20     ` David Presotto
  2003-07-29 15:07     ` Anthony Mandic
  2003-07-29 12:23   ` David Presotto
  1 sibling, 2 replies; 20+ messages in thread
From: boyd, rounin @ 2003-07-29 11:29 UTC (permalink / raw)
  To: 9fans

> I don't know if this would also be relevant, but you can check the
> 'nemo' directory on sources for devkbmap stuff. If you are set up
> to get to sources, you'll find it here:

that stuff is based on the fact that you have a keyboard that
allows you to type the characters directly.  well, i should make
myself clear:

    all pc keyboards generate the same scan codes for
    the same key (modulo weirdness) but the keytops have
    different symbols on them.  eg: where a us keyboard
    would have qwerty i have azerty.  when typing either
    sequence the same set of scan codes is generated.

to make it more difficult not all of the characters can be typed
directly; to get ê [&ecirc;] i have to type ^ then e.
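
a rough sketch of both points, for illustration only: the same scan codes run through two layout tables, plus a dead key that waits for the next letter. the tables are tiny excerpts (set-1 scan codes for the first six keys of the top letter row and the key right of P), and the names `US_LAYOUT`, `FR_LAYOUT`, `DEAD_KEYS` and `translate` are made up for this example; this is not how devkbmap does it.

```python
import unicodedata

# Same scan codes on both keyboards; only the table differs.
US_LAYOUT = {0x10: "q", 0x11: "w", 0x12: "e", 0x13: "r", 0x14: "t", 0x15: "y", 0x1A: "["}
FR_LAYOUT = {0x10: "a", 0x11: "z", 0x12: "e", 0x13: "r", 0x14: "t", 0x15: "y", 0x1A: "^"}

# Dead-key symbol -> Unicode combining mark (assumed mapping for this sketch).
DEAD_KEYS = {"^": "\u0302"}


def translate(scancodes, layout, dead_keys=()):
    """Turn scan codes into text, composing dead keys with the next letter."""
    out = []
    pending = None  # dead-key symbol waiting for its base letter
    for code in scancodes:
        ch = layout.get(code)
        if ch is None:
            continue  # key not in our toy table
        if pending is not None:
            composed = unicodedata.normalize("NFC", ch + DEAD_KEYS[pending])
            # Fall back to the two plain characters if nothing precomposed exists.
            out.append(composed if len(composed) == 1 else pending + ch)
            pending = None
        elif ch in dead_keys:
            pending = ch
        else:
            out.append(ch)
    return "".join(out)
```

with these tables the six top-row scan codes come out as qwerty through `US_LAYOUT` and azerty through `FR_LAYOUT`, and `translate([0x1A, 0x12], FR_LAYOUT, dead_keys={"^"})` gives ê.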

japanese is a special case 'cos it has 4 character sets:

    - hiragana [phonetic set for japanese words]
    - katakana [phonetic set for foreign words]
    - kanji [the ideographs]
    - romaji [romanised representation]

i've seen numerous systems and keyboards for doing this
and other things (the various japanese on 9fans know better
than i, obviously) and it's pretty nasty.

some keyboards have the kana imposed on a qwerty keyboard
and you use a 'shift' key to get at them.

for typing the kanji, well the system i like is that you type the
stem of the pronunciation and you then cycle through a
set of ideographs until the one you want turns up.  i'm
not sure, but there should be no reason why such a
system couldn't sort them by frequency, on a personalised
basis.

iirc the basic set of kanji is around 800, then there's a jump
to 2000 and most newspapers use around 6000.

reading them is hard enough, but in writing them you have
to remember the 'stroke order', not some random set of
strokes that will get you the character (this goes for the
kana as well, but they are simple).




* Re: [9fans] input methods for non-ascii languages
  2003-07-29  7:42 sasa
  2003-07-29  7:56 ` okamoto
@ 2003-07-29 10:44 ` Skip Tavakkolian
  2003-07-29 11:29   ` boyd, rounin
  2003-07-29 12:23   ` David Presotto
  1 sibling, 2 replies; 20+ messages in thread
From: Skip Tavakkolian @ 2003-07-29 10:44 UTC (permalink / raw)
  To: 9fans

> i'd like to see some sources of programs that make it possible
> to input unicode characters in a different way instead of the
> default input method (ALT+some keys) in plan9.
> can you show me some links to papers, sources?!

I don't know if this would also be relevant, but you can check the
'nemo' directory on sources for devkbmap stuff. If you are set up
to get to sources, you'll find it here:

/n/sources/nemo/sys/src/9/pc

Otherwise look for 'Sources Extras' under 'Additional Software' at
the Labs.




* Re: [9fans] input methods for non-ascii languages
  2003-07-29  7:42 sasa
@ 2003-07-29  7:56 ` okamoto
  2003-07-29 10:44 ` Skip Tavakkolian
  1 sibling, 0 replies; 20+ messages in thread
From: okamoto @ 2003-07-29  7:56 UTC (permalink / raw)
  To: 9fans

> this is especially a question to japanese, russian, greek etc. users.
> i'd like to see some sources of programs that make it possible
> to input unicode characters in a different way

You can download ktrans from
http://basalt.cias.osakafu-u.ac.jp/plan9/s39.html.

Kenji




* [9fans] input methods for non-ascii languages
@ 2003-07-29  7:42 sasa
  2003-07-29  7:56 ` okamoto
  2003-07-29 10:44 ` Skip Tavakkolian
  0 siblings, 2 replies; 20+ messages in thread
From: sasa @ 2003-07-29  7:42 UTC (permalink / raw)
  To: 9fans


hi planiners!

this is especially a question to japanese, russian, greek etc. users.
i'd like to see some sources of programs that make it possible
to input unicode characters in a different way instead of the
default input method (ALT+some keys) in plan9.
can you show me some links to papers, sources?!

thanx9.

sasa babic (babic@icpf.cas.cz)





