9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] ctrans - Chinese language input for Plan9
@ 2022-07-20  3:20 smj
  2022-07-20  3:54 ` Lucio De Re
  0 siblings, 1 reply; 19+ messages in thread
From: smj @ 2022-07-20  3:20 UTC (permalink / raw)
  To: 9fans

With the recent commit of 'ktrans' to 9front, SDF boot camper 'ldb' has taken the
idea and created 'ctrans' https://9p.sdf.org/who/ldb

As Kenji Okamoto has pointed out, 'ktrans' would be difficult to extend to 
Chinese due to the massive number of characters necessary.  While Japanese can
get away with ~2500 daily use characters, Chinese requires quite a bit more. 
The advantage in Japanese is that there are two other scripts (the kana
syllabaries) which are purely phonetic and useful for importing foreign words.

ldb's 'ctrans' had to take the 'ktrans' idea and optimize it a bit more to support
20,000 characters.  The result is a mechanism that more or less behaves like ktrans
but is quick (even over drawterm) to cycle through character lists.
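
To illustrate what "cycling through character lists" can look like, here is a
minimal Python sketch. It is not ctrans or ktrans code: the `Cycler` class, the
`TABLE` contents and `candidates_for` are hypothetical names chosen only for
this example.

```python
class Cycler:
    """Steps through the candidates for one typed code, wrapping at the end."""

    def __init__(self, candidates):
        self.candidates = list(candidates)
        self.index = 0

    def current(self):
        # None signals that the code has no candidates at all.
        return self.candidates[self.index] if self.candidates else None

    def advance(self):
        # Move to the next candidate; wrap so repeated presses keep cycling.
        if self.candidates:
            self.index = (self.index + 1) % len(self.candidates)
        return self.current()


# Hypothetical toy table: one typed code mapped to several candidates.
TABLE = {
    "shi": ["是", "十", "时"],
    "ma": ["马"],
}


def candidates_for(code):
    # A single dictionary lookup, whatever the size of the table.
    return TABLE.get(code, [])
```

The lookup cost here does not grow with the number of characters loaded; only
the length of one candidate list affects how many presses a user needs.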

moody has seen this work, and it has inspired adapting the approach to 'ktrans' for
even faster Kanji lookup, which could allow more esoteric Kanji to be added.

In addition a new font 'HanaMinA' has been adapted which beautifully supports both
Japanese and Chinese characters and it is what we recommend folks use on 9p.sdf.org.

Thank you ldb for your great work!

ldb, お疲れ様です!

smj

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M2b9bec354e89720b17643a6a
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] ctrans - Chinese language input for Plan9
  2022-07-20  3:20 [9fans] ctrans - Chinese language input for Plan9 smj
@ 2022-07-20  3:54 ` Lucio De Re
  2022-07-20  6:50   ` sirjofri
  0 siblings, 1 reply; 19+ messages in thread
From: Lucio De Re @ 2022-07-20  3:54 UTC (permalink / raw)
  To: 9fans

I have only one word for all the above: amazing!

As a dumb occidental, I have no idea where one starts with ideograms,
but I realise how different the concept is and how its complexity can
stimulate technical creativity.

Well done, all!

Lucio.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M69038fd1b148474da50c0796
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] ctrans - Chinese language input for Plan9
  2022-07-20  3:54 ` Lucio De Re
@ 2022-07-20  6:50   ` sirjofri
  2022-07-21  2:44     ` [9fans] " cigar562hfsp952fans
  0 siblings, 1 reply; 19+ messages in thread
From: sirjofri @ 2022-07-20  6:50 UTC (permalink / raw)
  To: 9fans


20.07.2022 05:54:34 Lucio De Re <lucio.dere@gmail.com>:

> I have only one word for all the above: amazing!

Yes, truly amazing.

> As a dumb occidental, I have no idea where one starts with ideograms,
> but I realise how different the concept is and how its complexity can
> stimulate technical creativity.

Or just make us realize how dumb "american" computers are.

I'm pretty sure that pure Chinese computers would look different.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M64d55dd90735730acadd808b
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-20  6:50   ` sirjofri
@ 2022-07-21  2:44     ` cigar562hfsp952fans
  2022-07-21  6:57       ` sirjofri
                         ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: cigar562hfsp952fans @ 2022-07-21  2:44 UTC (permalink / raw)
  To: 9fans

sirjofri <sirjofri+ml-9fans@sirjofri.de> writes:

> I'm pretty sure that pure Chinese computers would look different.

I've often wondered that.  What input methods do Chinese speakers use?
What do Chinese keyboards look like?  How do they find/select the
character they want?  Are different sets of characters available on
different computers, or are input methods standardized?  I wonder.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mfd7cc77a83bcefbc998c371e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-21  2:44     ` [9fans] " cigar562hfsp952fans
@ 2022-07-21  6:57       ` sirjofri
  2022-07-21  9:46         ` adr
  2022-07-21  9:45       ` Lucio De Re
  2022-07-22 19:14       ` andpuke
  2 siblings, 1 reply; 19+ messages in thread
From: sirjofri @ 2022-07-21  6:57 UTC (permalink / raw)
  To: 9fans


21.07.2022 04:44:53 cigar562hfsp952fans@icebubble.org:

> sirjofri <sirjofri+ml-9fans@sirjofri.de> writes:
>
>> I'm pretty sure that pure Chinese computers would look different.
>
> I've often wondered that.  What input methods do Chinese speakers use?
> What do Chinese keyboards look like?  How do they find/select the
> character they want?  Are different sets of characters available on
> different computers, or are input methods standardized?  I wonder.

I was more referring to computers built without any american influence at 
all, so no ansi, no ascii, no LTR, probably different keycodes...


I can't give you an answer as I'm not from an asian culture (although I 
studied it a little) and it's hard to answer anyway since I'm heavily 
influenced by american computers. I'd really need a few years studying 
those cultures heavily to be able to describe a possible tendency.

I can imagine though to look at early russian (and maybe even chinese, if 
there is) space technology. I know that the russian tech was very 
isolated compared to modern technology.

sirjofri

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M817c5719a75708c69b3cfd05
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-21  2:44     ` [9fans] " cigar562hfsp952fans
  2022-07-21  6:57       ` sirjofri
@ 2022-07-21  9:45       ` Lucio De Re
  2022-07-21 10:20         ` adr
  2022-07-22 19:14       ` andpuke
  2 siblings, 1 reply; 19+ messages in thread
From: Lucio De Re @ 2022-07-21  9:45 UTC (permalink / raw)
  To: 9fans

On 7/21/22, cigar562hfsp952fans@icebubble.org
<cigar562hfsp952fans@icebubble.org> wrote:
> sirjofri <sirjofri+ml-9fans@sirjofri.de> writes:
>
>> I'm pretty sure that pure Chinese computers would look different.
>
> I've often wondered that.  What input methods do Chinese speakers use?
> What do Chinese keyboards look like?  How do they find/select the
> character they want?  Are different sets of characters available on
> different computers, or are input methods standardized?  I wonder.
>
I stumbled onto an instructive video on youtube not that long ago. I'm
sure there are a few you'll be able to search for. If I understand
correctly, it's a combination of entering the phoneme by the nearest
Latin letter, then select from a diminishing range of suitable options
on the screen.

The video was more focused specifically on how this need - which
Chinese, Japanese and Koreans somewhat reacted differently to - caused
the Chinese to make great strides in computing.

Lucio.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M0df0a84a156b182c700ca96c
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-21  6:57       ` sirjofri
@ 2022-07-21  9:46         ` adr
  0 siblings, 0 replies; 19+ messages in thread
From: adr @ 2022-07-21  9:46 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 1321 bytes --]

> I know that the russian tech was very
> isolated compared to modern technology.

The most interesting for me are the Setun ternary computers designed by Nikolay Brusentsov in the late '50s, running a Forth-like system. They did a lot of research and came to the conclusion that Forth was _the_ language. They saw Forth as a discovery by Chuck Moore, not an invention (to give him more credit, not less). The binary computers that became popular (M-3, Ural, etc.) were slowly replaced by clones of Western computers (PDP-11, Intel, VAX, etc.). The operating systems were mostly clones too. The computers of the '80s and '90s in schools and homes were clones of PC, Apple, Z80. The Spectrum clones were very popular. Asian computer technology was imported from the Western or Soviet worlds, so they had to add devices or methods to enter their own characters (look up some of the crazy keyboards built in Taiwan). The early input methods (from the '70s?) were pretty much like the ones we use today. As far as I know, there wasn't any Asian computer created without Western or Soviet influence.

adr
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mfe79a57631b9a0b4b7b839e8
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1941 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-21  9:45       ` Lucio De Re
@ 2022-07-21 10:20         ` adr
  2022-07-22 12:30           ` Silvan Jegen
  0 siblings, 1 reply; 19+ messages in thread
From: adr @ 2022-07-21 10:20 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 1092 bytes --]

> I stumbled onto an instructive video on youtube not that long ago. I'm
> sure there are a few you'll be able to search for. If I understand
> correctly, it's a combination of entering the phoneme by the nearest
> Latin letter, then select from a diminishing range of suitable options
> on the screen.

There are other input methods based on the shape of the characters. Some are better with traditional Chinese characters, others with simplified characters, it's complicated... Let's see if some Chinese comrade shares with us his daily life experience. Japanese is input by writing kana directly with a Japanese keyboard or by romaji with roman characters on Western keyboards (ka -> か, &c) and then transformed to kanji when necessary. There are different IMEs, but the principle is the same. I suppose that ktrans is similar, I haven't tried it yet.
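
The romaji -> kana step can be sketched in a few lines of Python. This is only an illustration of the principle, not how any particular IME (or ktrans) does it: the table is a small hand-picked subset and the function name `romaji_to_kana` is made up for the example. It uses greedy longest match, so "na" is preferred over "n" followed by "a".

```python
# Small, incomplete romaji -> hiragana table (illustration only).
ROMAJI_TO_HIRAGANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "shi": "し", "su": "す", "se": "せ", "so": "そ",
    "ta": "た", "chi": "ち", "tsu": "つ", "te": "て", "to": "と",
    "na": "な", "ni": "に", "nu": "ぬ", "ne": "ね", "no": "の",
    "n": "ん",
}
MAX_KEY = max(len(key) for key in ROMAJI_TO_HIRAGANA)


def romaji_to_kana(text):
    """Convert romaji to hiragana using greedy longest match."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shorter ones.
        for length in range(min(MAX_KEY, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += length
                break
        else:
            # No table entry matched: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

The kana -> kanji step that follows would then be a dictionary lookup on the kana string, with the user choosing among the candidates.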

adr
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M428fc6fd31a9ffdb29d773bc
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1757 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-21 10:20         ` adr
@ 2022-07-22 12:30           ` Silvan Jegen
  2022-07-22 14:43             ` adr
  2022-07-22 18:06             ` Sebastian Higgins
  0 siblings, 2 replies; 19+ messages in thread
From: Silvan Jegen @ 2022-07-22 12:30 UTC (permalink / raw)
  To: 9fans

adr@sdf.org wrote:
> > I stumbled onto an instructive video on youtube not that long ago. I'm
> > sure there are a few you'll be able to search for. If I understand
> > correctly, it's a combination of entering the phoneme by the nearest
> > Latin letter, then select from a diminishing range of suitable options
> > on the screen.
> 
> There are other input methods based on the shape of the
> characters. Some are better with traditional Chinese characters,
> other with simplified characters, it's complicated... Let see if some
> Chinese comrade share with us his daily life experience. The Japanese
> is input writing kana directly with a Japanese keyboard or by romaji
> with roman characters on western keyboards (ka -> か, &c) and then
> transformed to kanji when necessary. There are different IMEs, but the
> principle is the same. I suppose that ktrans is similar, I haven't
> tried jet.

ktrans seems to be quite different actually. According to the
documentation it uses the Cangjie input method [0] which is based on the
so called "radicals". These are some more basic elements that the Chinese
characters are made of (note that the "radicals" chosen for Cangjie are
not identical to the 214 radicals that are commonly used to classify
Chinese characters. For the latter see [1]).

Every one of these 24 Cangjie radicals gets mapped to an ASCII character
and their combinations then uniquely identify a Chinese character (the
wikipage at [0] illustrates the approach very well).

This input method seems to be old and I have never seen a Chinese person
use it. From what I understand, most Chinese people nowadays just write
text in Pinyin (a Latin transliteration of the Chinese pronunciation)
and then the IME helps you choose the correct combination of Chinese
characters (potentially taking the context of the text already written
into account).


Cheers,

Silvan

[0] https://en.wikipedia.org/wiki/Cangjie_input_method
[1] https://en.wikipedia.org/wiki/Kangxi_radical

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M9589c3997fe9cf5b52b599d5
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 12:30           ` Silvan Jegen
@ 2022-07-22 14:43             ` adr
  2022-07-22 18:06             ` Sebastian Higgins
  1 sibling, 0 replies; 19+ messages in thread
From: adr @ 2022-07-22 14:43 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 964 bytes --]

Yep, Cangjie is one of those input methods based on shape I was talking about, more appropriate for the traditional Chinese characters used in Taiwan, Hong Kong, etc. South Korea still uses kanji similar to traditional Chinese, but I don't know what input method they use. Note that in mainland China people use Pinyin because they imposed the use of simplified Chinese characters.  It surprises me to hear that ktrans uses Cangjie: Japanese keyboards let you input kana directly, and the use of kana to write without kanji is common, especially in books for kids, so it seems more natural to me to make a kana->kanji conversion (or romaji->kana->kanji on Western keyboards). But I'm not Japanese; maybe Cangjie is faster, I've never tried it.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M9f10d9140a5f0838d615958f
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1487 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 12:30           ` Silvan Jegen
  2022-07-22 14:43             ` adr
@ 2022-07-22 18:06             ` Sebastian Higgins
  2022-07-22 19:09               ` Silvan Jegen
  2022-07-22 19:13               ` Jacob Moody
  1 sibling, 2 replies; 19+ messages in thread
From: Sebastian Higgins @ 2022-07-22 18:06 UTC (permalink / raw)
  To: 9fans

A few things:

1.  Cangjie is still widely used in places that use traditional Chinese characters. You would still be required to be good at it if you apply for text-heavy office jobs in these places.
2.  Radical-based/shape-based methods were extremely popular when the prediction technology wasn't as good (which means Pinyin was significantly slower). It wasn't until the late 2000s to early 2010s that this situation changed.
3.  Pinyin without prediction is slow because of what we call the 重码 (lit. "overlap of encoding") problem. For Pinyin the encoding overlaps because many characters may have the same Pinyin; the purpose of all shape-based methods is to reduce the overlap problem and thus increase the input speed.
4.  ctrans uses Cangjie because (1) implementing shape-based methods was much, much simpler than phonetic-based methods, because most (if not all) of the job is table lookup; (2) if we were to use the same UI (or lack thereof) as ktrans, the overlap-of-encoding problem of Pinyin would very probably drive you nuts when using it; (3) it is the input method the author uses, although I do admit using Cangjie for simplified Chinese input is kinda peculiar.

Source: me, a native Chinese speaker who learned Wubi (a shape-based method for simplified Chinese) in primary school.
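
A rough Python sketch of the overlap-of-encoding point in items 3 and 4. The two tables are tiny, hand-picked samples, and the Cangjie codes are written from memory, so treat them as illustrative and check them against a real table; `build_table` and `average_candidates` are names invented for this example.

```python
from collections import defaultdict


def build_table(pairs):
    """Group characters by the code a user would type."""
    table = defaultdict(list)
    for code, char in pairs:
        table[code].append(char)
    return table


def average_candidates(table):
    """Average number of characters sharing one code (1.0 means no overlap)."""
    return sum(len(chars) for chars in table.values()) / len(table)


# Toneless Pinyin for six characters: every code is shared by two of them.
PINYIN = [("mu", "木"), ("mu", "目"), ("ming", "明"),
          ("ming", "名"), ("shi", "十"), ("shi", "石")]

# Cangjie codes for the same six characters (illustrative, from memory).
CANGJIE = [("d", "木"), ("bu", "目"), ("ab", "明"),
           ("nir", "名"), ("j", "十"), ("mr", "石")]

pinyin_table = build_table(PINYIN)
cangjie_table = build_table(CANGJIE)
```

With these samples `average_candidates(pinyin_table)` is 2.0 and `average_candidates(cangjie_table)` is 1.0. On a full dictionary the gap is what makes a plain table lookup workable for shape-based codes, while Pinyin needs candidate selection or prediction on top.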


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mf1934dc65975e0ca3989d488
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 18:06             ` Sebastian Higgins
@ 2022-07-22 19:09               ` Silvan Jegen
  2022-07-22 19:13               ` Jacob Moody
  1 sibling, 0 replies; 19+ messages in thread
From: Silvan Jegen @ 2022-07-22 19:09 UTC (permalink / raw)
  To: 9fans

Heyhey!

Sebastian Higgins <bctnry@outlook.com> wrote:
> A few things:
> 
> 1.  Cangjie is still widely used in places that uses traditional
> Chinese characters. You would still be required to be good at it if
> you apply for text-heavy office jobs in these places.

Ah, I didn't know that! I also don't know anyone who does office work
in a place where traditional Chinese characters are used though ...


> 2.  Radical-based/shape-based methods were extremely popular when
> the prediction technology wasn't as good (which means Pinyin was
> significantly slower). It wasn't until late 2000s to early 2010s
> before this situation has changed.

At least in Japan I have never met anyone using a
radical-based/shape-based input method. I have not even met anyone using
direct Kana input, only through romaji. That said, maybe an earlier
generation used it more commonly ...


> 3.  Pinyin without prediction is slow because of what we called the
> 重码 (lit. "overlap of encoding") problem. For Pinyin the encoding
> overlaps because many characters may have the same Pinyin; the purpose
> of all shape-based method is to reduce the overlap problem and thus
> increase the input speed.

Yeah, it's due to the high homophone count. Only the tones differ and
these are not supported in pinyin input methods (as far as I know ...)


> 4. ctrans uses cangjie because (1) implementing shape-based methods
> was much, much more simpler than phonetic-based methods because most
> (if not all) of the job is table lookup; (2) if we were to use the
> same UI (or lack thereof) as ktrans the overlap-of-encoding problem
> of Pinyin would very probably drive you nuts when using it; (3) it is
> the input method the author uses, however I do admit using Cangjie for
> simplified Chinese input is kinda peculiar.
> 
> Source: me who is a native Chinese speaker and have learned Wubi
> (a shape-based method for simplified Chinese) in primary school.

Thanks for the insights. I appreciate it!


Cheers,
Silvan

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M977f609261cd764b55ad5dbf
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 18:06             ` Sebastian Higgins
  2022-07-22 19:09               ` Silvan Jegen
@ 2022-07-22 19:13               ` Jacob Moody
  1 sibling, 0 replies; 19+ messages in thread
From: Jacob Moody @ 2022-07-22 19:13 UTC (permalink / raw)
  To: 9fans

On 7/22/22 12:06, Sebastian Higgins wrote:
> A few things:
> 
> 1.  Cangjie is still widely used in places that uses traditional Chinese characters. You would still be required to be good at it if you apply for text-heavy office jobs in these places.
> 2.  Radical-based/shape-based methods were extremely popular when the prediction technology wasn't as good (which means Pinyin was significantly slower). It wasn't until late 2000s to early 2010s before this situation has changed.
> 3.  Pinyin without prediction is slow because of what we called the 重码 (lit. "overlap of encoding") problem. For Pinyin the encoding overlaps because many characters may have the same Pinyin; the purpose of all shape-based method is to reduce the overlap problem and thus increase the input speed.
> 4.  ctrans uses cangjie because (1) implementing shape-based methods was much, much more simpler than phonetic-based methods because most (if not all) of the job is table lookup; (2) if we were to use the same UI (or lack thereof) as ktrans the overlap-of-encoding problem of Pinyin would very probably drive you nuts when using it; (3) it is the input method the author uses, however I do admit using Cangjie for simplified Chinese input is kinda peculiar.
> 
> Source: me who is a native Chinese speaker and have learned Wubi (a shape-based method for simplified Chinese) in primary school.

I had taken a naive attempt at getting ktrans to support a form
of Chinese input. Admittedly, my interest was mostly in stress testing
my rewrite of the hashmap used in ktrans; throwing a ~100k character
dictionary at it seemed like a fun way to test it. The dictionary I
imported was one that used Wubi based mapping for characters, posted
by jxy to the 9front mailing list a week or so ago.

If anyone is curious the dictionary itself can be found here:
https://raw.githubusercontent.com/fcitx/fcitx-table-data/master/wbx.txt
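
For anyone who wants to play with such a table outside of Plan 9, here is a
small Python sketch that loads "code word" lines into a dictionary. The file
layout is an assumption on my part (comment lines starting with ';', a few
'key=value' settings, '[Section]' markers, then whitespace-separated code and
word pairs); the sample data below is made up and is not real Wubi, and
`load_table` is a name invented for this example.

```python
import io


def load_table(lines):
    """Build {code: [words...]} from an iterable of text lines."""
    table = {}
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, section markers and key=value settings
        # (assumed layout, see the note above).
        if not line or line.startswith((";", "[")) or "=" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        code, word = parts[0], parts[1]
        # Keep file order so earlier entries stay the first candidates.
        table.setdefault(code, []).append(word)
    return table


# Made-up sample in the assumed layout; these are not real Wubi codes.
SAMPLE = """\
;sample header
KeyCode=abcdefghijklmnopqrstuvwxy
Length=4
[Data]
aa 甲
aa 乙
ab 丙
"""

table = load_table(io.StringIO(SAMPLE))
```

Passing an open file object instead of the `io.StringIO` sample works the same
way, since the function only iterates over lines.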

This has been super interesting to me from a learning perspective.

Thanks for the insight!
Jacob Moody

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mcf3888dbfc4013192d8c471e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-21  2:44     ` [9fans] " cigar562hfsp952fans
  2022-07-21  6:57       ` sirjofri
  2022-07-21  9:45       ` Lucio De Re
@ 2022-07-22 19:14       ` andpuke
  2022-07-22 19:24         ` andpuke
  2 siblings, 1 reply; 19+ messages in thread
From: andpuke @ 2022-07-22 19:14 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 1921 bytes --]

On Wednesday, 20 July 2022, at 11:15 PM, cigar562hfsp952fans wrote:
> I've often wondered that.  What input methods do Chinese speakers use?
> What do Chinese keyboards look like?  How do they find/select the
> character they want?  Are different sets of characters available on
> different computers, or are input methods standardized?  I wonder.
Most Chinese speakers just use standard "British and American keyboards". There are keycaps engraved with Wubi or Cangjie or Bopomofo (or Zhuyin), but they are all compatible with QWERTY.

On Thursday, 21 July 2022, at 1:58 AM, sirjofri wrote:
> I was more referring to computers built without any american influence at 
> all, so no ansi, no ascii, no LTR, probably different keycodes...
Cangjie was the first solution to Chinese processing with *personal computers* (at the time of the Apple ][ it was sold as extension boards).
There used to be other encoding methods, such as using only the numpad (Four-Corner Method) or special keyboards (Ming Kwai typewriter); an input method for Chinese was even invented in the US, https://patents.google.com/patent/US2412777A, but they have almost disappeared.

There are a few other considerations with regard to adopting Cangjie besides those in https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mf1934dc65975e0ca3989d488/ctrans-chinese-language-input-for-plan9:

1. Cangjie is copyright free and related IMEs are distributed as free software, while (at least the newer version of) Wubi is patented.
2. Personally, I have noticed that the order of strokes has changed during the last 10 years or so, and similarly the pronunciation of certain characters has also shifted over time.

Best wishes
---
ldb
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M63f7777cb9504cafbca55334
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 2900 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 19:14       ` andpuke
@ 2022-07-22 19:24         ` andpuke
  2022-07-22 20:37           ` Silvan Jegen
  0 siblings, 1 reply; 19+ messages in thread
From: andpuke @ 2022-07-22 19:24 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 645 bytes --]

On Friday, 22 July 2022, at 2:09 PM, Silvan Jegen wrote:
> Ah, I didn't know that! I also don't know anyone who does office work
> in a place where traditional Chinese characters are used though ...
They would use RIME, https://rime.im, free software widely recognized among Chinese users who are not satisfied with the default Pinyin. But unfortunately that thing is written in C++, so making a port is unlikely.

ldb
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Mc5ba1baecec99ea1967578b2
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1263 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 19:24         ` andpuke
@ 2022-07-22 20:37           ` Silvan Jegen
  2022-07-22 22:29             ` LdBeth
  0 siblings, 1 reply; 19+ messages in thread
From: Silvan Jegen @ 2022-07-22 20:37 UTC (permalink / raw)
  To: 9fans

andpuke@foxmail.com wrote:
> On Friday, 22 July 2022, at 2:09 PM, Silvan Jegen wrote:
> > Ah, I didn't know that! I also don't know anyone who does office work
> > in a place where traditional Chinese characters are used though ...
>
> They would use RIME, https://rime.im a free software widely
> recognized among Chinese users who are not satisfied with default
> Pinyin. But unfortunately that thing is written in C++ so making a
> port is unliky.

Funnily enough I use Rime on my Linux machine to input Simplified
Chinese. I honestly just switched a Rime input setting to something that
looks like pinyin but the suggestions seem better to me than the old
IME that I used ... I should probably invest some time in understanding
how the thing actually is supposed to be used (documentation in English
seems sparse and my Chinese sucks).

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M9e59e41273b1269646ab8584
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 20:37           ` Silvan Jegen
@ 2022-07-22 22:29             ` LdBeth
  2022-07-26 12:29               ` adr
  0 siblings, 1 reply; 19+ messages in thread
From: LdBeth @ 2022-07-22 22:29 UTC (permalink / raw)
  To: 9fans

>>>>> In <288YQ7Y33V3RF.38NPGPX4H2CHU@homearch.localdomain> 
>>>>>   "Silvan Jegen" <me@sillymon.ch> wrote:
SJ> andpuke@foxmail.com wrote:
>> On Friday, 22 July 2022, at 2:09 PM, Silvan Jegen wrote:
>> > Ah, I didn't know that! I also don't know anyone who does office work
>> > in a place where traditional Chinese characters are used though ...
>>
>> They would use RIME, https://rime.im a free software widely
>> recognized among Chinese users who are not satisfied with default
>> Pinyin. But unfortunately that thing is written in C++ so making a
>> port is unlikely.
SJ> Funnily enough I use Rime on my Linux machine to input Simplified
SJ> Chinese. I honestly just switched a Rime input setting to something that
SJ> looks like pinyin but the suggestions seem better to me than the old
SJ> IME that I used ... I should probably invest some time in understanding
SJ> how the thing actually is supposed to be used (documentation in English
SJ> seems sparse and my Chinese sucks).

RIME became popular because most other Pinyin-based IMEs on the
market suck for traditional Chinese input: their suggestion
dictionaries were usually converted directly from the simplified Chinese
versions, but mapping simplified Chinese to traditional Chinese is
very context sensitive. A byproduct of RIME is the OpenCC library,
https://github.com/BYVoid/OpenCC, which handles all the
details of this kind of conversion.
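
To illustrate why the mapping is context sensitive: a simplified
character such as 发 corresponds to 發 or 髮 depending on the word it
appears in. The following is a minimal Python sketch, not OpenCC's actual
code or data; the tiny tables and the names naive_convert and
phrase_convert are made up for illustration. It compares a per-character
substitution with a longest-match phrase lookup.

```python
# Toy simplified-to-traditional converter. The tables below are tiny,
# hand-picked illustrations (hypothetical data), not OpenCC's dictionaries.

# Phrase-level entries resolve characters with several traditional forms.
PHRASES = {
    "头发": "頭髮",  # 发 -> 髮 (hair)
    "发现": "發現",  # 发 -> 發 (to discover)
    "皇后": "皇后",  # 后 stays 后 (empress)
    "以后": "以後",  # 后 -> 後 (afterwards)
}

# Per-character fallback: one fixed default per simplified character.
CHARS = {
    "头": "頭",
    "发": "發",
    "现": "現",
    "后": "後",
}

MAX_PHRASE_LEN = max(len(p) for p in PHRASES)


def naive_convert(text: str) -> str:
    """Substitute character by character, ignoring context."""
    return "".join(CHARS.get(ch, ch) for ch in text)


def phrase_convert(text: str) -> str:
    """Prefer the longest phrase match, fall back to single characters."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate phrase first, down to length 2.
        for n in range(min(MAX_PHRASE_LEN, len(text) - i), 1, -1):
            chunk = text[i:i + n]
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
        else:
            # No phrase matched here: use the per-character default.
            out.append(CHARS.get(text[i], text[i]))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ("头发", "皇后", "以后发现"):
        print(word, naive_convert(word), phrase_convert(word))
```

The per-character version turns 头发 into 頭發 and 皇后 into 皇後, both
wrong, while the phrase lookup gets them right. A dictionary substituted
directly from a simplified one runs into the same kind of ambiguity.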

The simplified Chinese support for RIME was contributed by the community,
I think, and the author of RIME uses Cangjie. Cangjie was not officially
designed for simplified Chinese but was extended to handle it. I have heard
rumors that the author refused to add a switch to prioritize simplified
Chinese characters for Cangjie in RIME, so users who want that behavior
have to use an external dictionary.

---
LDB

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M7654c6f7091bf0a32c7e3bca

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-22 22:29             ` LdBeth
@ 2022-07-26 12:29               ` adr
  2022-07-29  8:04                 ` Silvan Jegen
  0 siblings, 1 reply; 19+ messages in thread
From: adr @ 2022-07-26 12:29 UTC (permalink / raw)
  To: 9fans

> Silvan Jegen wrote:
> ktrans seems to be quite different actually. According to the
> documentation it uses the Cangjie input method
I was really surprised when I read this and, of course, it is not true. I suppose you meant ctrans.

https://git.sansfontieres.com/~romi/ktrans/tree/front/item/README.kenji

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-M0049de1a1058af72e04fe22c

* Re: [9fans] Re: ctrans - Chinese language input for Plan9
  2022-07-26 12:29               ` adr
@ 2022-07-29  8:04                 ` Silvan Jegen
  0 siblings, 0 replies; 19+ messages in thread
From: Silvan Jegen @ 2022-07-29  8:04 UTC (permalink / raw)
  To: 9fans, adr



On July 26, 2022 3:29:15 PM GMT+03:00, adr@sdf.org wrote:
>> Silvan Jegen wrote:
>> ktrans seems to be quite different actually. According to the
>> documentation it uses the Cangjie input method
>I was really surprised when I read this and, of course, it is not true. I suppose you meant ctrans.

Ah, my bad. I must have confused the two.


Cheers,
Silvan

> https://git.sansfontieres.com/~romi/ktrans/tree/front/item/README.kenji

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tba6835d445e07919-Ma5d0eb2bc4af1d14fe1e7e30

Thread overview: 19+ messages
2022-07-20  3:20 [9fans] ctrans - Chinese language input for Plan9 smj
2022-07-20  3:54 ` Lucio De Re
2022-07-20  6:50   ` sirjofri
2022-07-21  2:44     ` [9fans] " cigar562hfsp952fans
2022-07-21  6:57       ` sirjofri
2022-07-21  9:46         ` adr
2022-07-21  9:45       ` Lucio De Re
2022-07-21 10:20         ` adr
2022-07-22 12:30           ` Silvan Jegen
2022-07-22 14:43             ` adr
2022-07-22 18:06             ` Sebastian Higgins
2022-07-22 19:09               ` Silvan Jegen
2022-07-22 19:13               ` Jacob Moody
2022-07-22 19:14       ` andpuke
2022-07-22 19:24         ` andpuke
2022-07-22 20:37           ` Silvan Jegen
2022-07-22 22:29             ` LdBeth
2022-07-26 12:29               ` adr
2022-07-29  8:04                 ` Silvan Jegen
