caml-list - the Caml user's mailing list
From: Tao Stein <taostein@gmail.com>
To: Allan Wegan <allanwegan@allanwegan.de>
Cc: OCaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] error messages in multiple languages ?
Date: Wed, 12 Apr 2017 08:12:11 +0800
Message-ID: <CABs4TjLcYR=8HMP2q5gWoWPU3MB34rswutO6Rrm103=qP+P+8Q@mail.gmail.com>
In-Reply-To: <041c1cd1-b697-9889-55e6-2db7f611dc6b@allanwegan.de>

German and French are closer to English than Arabic or Chinese, especially
in the script.

As an experiment in empathy, I encourage folks to examine this working
OCaml code where I've replaced the Latin tokens and identifiers with
Chinese ones: https://github.com/taostein/hanma/blob/master/example.hm .
Chinese lacks capital letters [1], so I use the prefix "卜" instead. The
mapping of tokens is here (in the parsing/lexer.mll diff):
https://github.com/taostein/hanma/blob/master/lexer.mll.diff
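
To make the prefix idea concrete, here is a minimal sketch of how a lexer
could classify words once the capital-letter distinction is replaced by a
"卜" prefix. It is only an illustration: the keyword spellings, the token
type, and the names keyword_table, has_prefix and classify are all
hypothetical and are not taken from the lexer.mll diff linked above.

```ocaml
(* Hypothetical sketch: the keyword spellings below are illustrative and
   are NOT taken from the hanma lexer.mll diff. *)

type token =
  | LET
  | IN
  | IF
  | UIDENT of string  (* constructor-like identifier, marked by the prefix *)
  | LIDENT of string  (* ordinary identifier *)

(* Assumed keyword table; the real mapping lives in the linked diff. *)
let keyword_table : (string * token) list =
  [ ("让", LET); ("在", IN); ("如果", IF) ]

(* "卜" stands in for the capital-letter marker described above. *)
let upper_prefix = "卜"

(* Byte-wise prefix test; this assumes the source text is UTF-8 encoded. *)
let has_prefix ~prefix s =
  let n = String.length prefix in
  String.length s >= n && String.sub s 0 n = prefix

(* Keywords win first; otherwise the prefix decides the identifier class. *)
let classify (word : string) : token =
  match List.assoc_opt word keyword_table with
  | Some tok -> tok
  | None ->
    if has_prefix ~prefix:upper_prefix word then UIDENT word
    else LIDENT word
```

With this sketch, classify applied to a word starting with "卜" yields a
UIDENT, which is the role a leading capital letter plays in standard OCaml.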

Reading code is hard when the script model isn't functioning in the fast
processing part of your brain. Granted, Chinese has more characters than
Latin, but training a brain to do fast processing of script takes years,
even if it's Latin. Sometimes we forget it took us years to learn to read;
for most of us that was a long time ago.

I've taught Chinese students OCaml programming using Latin tokens and I've
taught the same replacing those Latin tokens with Chinese ones. I tried
this as an experiment and I was surprised at the outcome. Previously, I
thought as most of you probably do -- come on, it's just a few tokens plus
logic -- not hard. How many tokens are there in C, like 30? I could
memorize those in a day! I WAS WRONG. The students were markedly more
motivated and enthusiastic when coding in their own script. And these are
smart people, among China's brightest. Motivated learners learn better and
are also more fun to teach. This teaching experience is what inspired me to
undertake this translation project.

My observations are qualitative, because I've been focused on the teaching
part, as opposed to the research-about-teaching part, but I hope to gather
more data in future semesters and write a report about these findings. The
qualitative results were strong -- script matters. I believe it's about
script, not language. Parsing a foreign script quickly is really hard on
the brain. We need the brain for the hard parts of programming.

There are obviously many pieces of OCaml that need translation: manuals,
errors and warnings, libraries, the core code, comments. I think error
messages are a good place to start. We can work on different pieces in
parallel. And hopefully we can build something useful for scripts other
than Chinese, like Arabic and Russian. If you are interested in helping
with this project, please get in touch with me directly.

Yes, we want to build a global tech community. We must start from empathy.
Maybe the Arabs and Chinese (and Russians and Koreans and Japanese)
"should" or "shouldn't" learn English (or German or French or Latin or some
other Western European language), under some definition of "should" (refer
to various moral theories). But "should" is academic -- they're NOT going
to learn English. If anything, the trend is moving in the other direction.
China, for example, is lowering its university-level English requirements.
So the question is: how global and how big do we want this so-called
"global" tech community to be? Empathy and good translation tools can help
us make it a real global (no scare quotes) community.

Tao Stein / 石涛 / تاو شتاين

Yes, by Arabic numbers I meant the numeric script used by Arabs, not what
the Oxford English Dictionary calls arabic (lower-case) numbers.

[1] Chinese also lacks a plural form, which does somewhat ease error
messaging.

On 12 April 2017 at 07:04, Allan Wegan <allanwegan@allanwegan.de> wrote:

> > careful here, the “(hindu‐)arabic digits” used in European languages
> > (0123456789) are similar, but not identical to, the symbols that actual
> > arabic languages use nowadays (“eastern arabic digits”,
> > ٠‎١‎٢‎٣‎٤‎٥‎٦‎٧‎٨‎٩). there even are false friends (e·g· the eastern 4
> > looks like a reversed western 3, the eastern 5 looks like a western 0,
> > the eastern 6 looks like a western 7).
> >
> > yeah. confusing.
>
> Indeed. Must have been wishful thinking on my side.
>
> Not translating the thing at all may be the wiser option. It might serve
> the greater goal of finally establishing one universal world script and
> language, everyone has to learn to be able to participate in the global
> tech community (and written English is at least somewhat easy to learn)...
>
>
>
> Greetings from Germany
> --
> Allan Wegan
> <http://www.allanwegan.de/>
> Jabber: allanwegan@ffnord.net
>  OTR-Fingerprint: E4DCAA40 4859428E B3912896 F2498604 8CAA126F
> Jabber: allanwegan@jabber.ccc.de
>  OTR-Fingerprint: A1AAA1B9 C067F988 4A424D33 98343469 29164587
> ICQ: 209459114
>  OTR-Fingerprint: 71DE5B5E 67D6D758 A93BF1CE 7DA06625 205AC6EC
>
>
