From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <041c1cd1-b697-9889-55e6-2db7f611dc6b@allanwegan.de>
References: <20170411140512.GK28111@annexia.org> <0a2f848f-c697-b267-e371-d53ae281aec2@crans.org> <041c1cd1-b697-9889-55e6-2db7f611dc6b@allanwegan.de>
From: Tao Stein
Date: Wed, 12 Apr 2017 08:12:11 +0800
To: Allan Wegan
Cc: OCaml Mailing List
Content-Type: multipart/alternative; boundary=001a11479a146188ba054ced0fe6
Subject: Re: [Caml-list] error messages in multiple languages ?

--001a11479a146188ba054ced0fe6
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

German and French are closer to English than Arabic or Chinese are,
especially in script.

As an experiment in empathy, I encourage folks to examine this working
OCaml code, where I've replaced the Latin tokens and identifiers with
Chinese ones:
https://github.com/taostein/hanma/blob/master/example.hm
Chinese lacks capital letters [1], so I use the prefix "卜" instead. The
mapping of tokens is here (in the parsing/lexer.mll diff):
https://github.com/taostein/hanma/blob/master/lexer.mll.diff

Reading code is hard when the script isn't handled by the fast-processing
part of your brain. Granted, Chinese has more characters than Latin, but
training a brain to process any script quickly takes years, even if it's
Latin. Sometimes we forget that it took us years to learn to read; for
most of us that was a long time ago.

I've taught Chinese students OCaml programming using Latin tokens, and
I've taught the same material with those Latin tokens replaced by Chinese
ones. I tried this as an experiment and I was surprised at the outcome.
Previously, I thought as most of you probably do -- come on, it's just a
few tokens plus logic -- not hard. How many tokens are there in C, like
30? I could memorize those in a day! I WAS WRONG. The students were
markedly more motivated and enthusiastic when coding in their own script.
And these are smart people, among China's brightest. Motivated learners
learn better and are also more fun to teach. This teaching experience is
what inspired me to undertake this translation project.

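For anyone who wants a feel for what such a token mapping involves, here
is a minimal sketch. Everything in it is an assumption made for
illustration: the Chinese spellings are invented and are not the ones in
lexer.mll.diff, and the names translate_token, translate_line and
is_constructor_like are hypothetical. It only rewrites whitespace-separated
tokens, whereas the real change lives in the compiler's lexer. The prefix
check mirrors the "卜" prefix mentioned above, which stands in where OCaml
would otherwise expect a capital letter.

```ocaml
(* Illustrative only: these Chinese spellings are made up for this sketch
   and are NOT the mapping used in lexer.mll.diff. *)
let keyword_table : (string * string) list = [
  ("让", "let");
  ("若", "if");
  ("则", "then");
  ("否则", "else");
  ("函", "fun");
]

(* Translate one token: known keywords are replaced by their Latin
   spelling, everything else is returned unchanged. *)
let translate_token (tok : string) : string =
  match List.assoc_opt tok keyword_table with
  | Some latin -> latin
  | None -> tok

(* Translate a line token by token. Splitting on the ASCII space byte is
   safe for UTF-8 input, because that byte never occurs inside a
   multi-byte character. *)
let translate_line (line : string) : string =
  line
  |> String.split_on_char ' '
  |> List.map translate_token
  |> String.concat " "

(* Hypothetical helper: treat an identifier as "capitalised" when it
   starts with the prefix "卜" (compared byte-wise as UTF-8). *)
let is_constructor_like (ident : string) : bool =
  let prefix = "卜" in
  let n = String.length prefix in
  String.length ident >= n && String.sub ident 0 n = prefix

let () =
  (* prints: let x = if y then 1 else 2 *)
  print_endline (translate_line "让 x = 若 y 则 1 否则 2")
```

A table lookup like this is enough for keywords; identifiers, comments and
string literals need the lexer itself, which is why the actual work is a
diff against lexer.mll.
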
My observations are qualitative, because I've been focused on the teaching
part, as opposed to the research-about-teaching part, but I hope to gather
more data in future semesters and write a report about these findings. The
qualitative results were strong -- script matters. I believe it's about
script, not language. Parsing a foreign script quickly is really hard on
the brain. We need the brain for the hard parts of programming.

There are obviously many pieces of OCaml that need translation: manuals,
errors and warnings, libraries, the core code, comments. I think error
messages are a good place to start. We can work on different pieces in
parallel. And hopefully we can build something useful for scripts other
than Chinese, like Arabic and Russian. If you are interested in helping
with this project, please get in touch with me directly.

Yes, we want to build a global tech community. We must start from empathy.
Maybe the Arabs and Chinese (and Russians and Koreans and Japanese)
"should" or "shouldn't" learn English (or German or French or Latin or
some other Western European language), under some definition of "should"
(refer to various moral theories). But "should" is academic -- they're NOT
going to learn English. If anything, the trend is moving in the other
direction. China, for example, is lowering its university-level English
requirements. So the question is: how global and how big do we want this
so-called "global" tech community to be? Empathy and good translation
tools can help us make it a truly global (no scare quotes) community.

Tao Stein / 石涛 / تاو شتاين

Yes, by Arabic numbers I meant the numeric script used by Arabs, not what
the Oxford English Dictionary calls arabic (lower-case) numbers.

[1] Chinese also lacks a plural form, which does somewhat ease error
messaging.

On 12 April 2017 at 07:04, Allan Wegan <allanwegan@allanwegan.de> wrote:

> > careful here, the “(hindu‐)arabic digits” used in European languages
> > (0123456789) are similar, but not identical to, the symbols that actual
> > arabic languages use nowadays (“eastern arabic digits”,
> > ٠١٢٣٤٥٦٧٨٩). there even are false friends (e·g· the eastern 4
> > looks like a reversed western 3, the eastern 5 looks like a western 0,
> > the eastern 6 looks like a western 7).
> >
> > yeah. confusing.
>
> Indeed. Must have been wishful thinking on my side.
>
> Not translating the thing at all may be the wiser option. It might serve
> the greater goal of finally establishing one universal world script and
> language, everyone has to learn to be able to participate in the global
> tech community (and written English is at least somewhat easy to learn)...
>
> Greetings from Germany
> --
> Allan Wegan
> <http://www.allanwegan.de/>
> Jabber: allanwegan@ffnord.net
>  OTR-Fingerprint: E4DCAA40 4859428E B3912896 F2498604 8CAA126F
> Jabber: allanwegan@jabber.ccc.de
>  OTR-Fingerprint: A1AAA1B9 C067F988 4A424D33 98343469 29164587
> ICQ: 209459114
>  OTR-Fingerprint: 71DE5B5E 67D6D758 A93BF1CE 7DA06625 205AC6EC


--001a11479a146188ba054ced0fe6--