From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: weis
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id MAA17641 for caml-redistribution; Fri, 15 Oct 1999 12:06:15 +0200 (MET DST)
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id KAA28908 for <caml-list@pauillac.inria.fr>; Fri, 15 Oct 1999 10:26:23 +0200 (MET DST)
Received: from givry.inria.fr (givry.inria.fr [193.51.193.144])
	by concorde.inria.fr (8.8.7/8.8.7) with ESMTP id KAA16825;
	Fri, 15 Oct 1999 10:26:14 +0200 (MET DST)
Received: from givry.inria.fr (givry.inria.fr [193.51.193.144]) by givry.inria.fr (8.7.6/8.7.3) with ESMTP id KAA12903; Fri, 15 Oct 1999 10:26:13 +0200 (MET DST)
Message-Id: <199910150826.KAA12903@givry.inria.fr>
From: Francis Dupont <Francis.Dupont@inria.fr>
To: skaller <skaller@maxtal.com.au>
cc: STARYNKEVITCH Basile <Basile.Starynkevitch@cea.fr>, caml-list@inria.fr
Subject: Re: localization, internationalization and Caml 
In-reply-to: Your message of Fri, 15 Oct 1999 08:20:20 +1000.
             <38065724.1CEC055F@maxtal.com.au> 
Date: Fri, 15 Oct 1999 10:26:12 +0200
Sender: weis

 In your previous mail you wrote:

   	The current 'support' for 8 bit characters in ocaml should be
   deprecated immediately. It is an extremely bad thing to have, since
   Latin-1 et al are archaic 8 bit standards incompatible with the
   international standard for ISO10646 communication, namely 
   the UTF-8 encoding.

=> there is a rather strong opposition against UTF-8 in France
because it is not a natural encoding (ie. if ASCII maps to ASCII
it is not the case for ISO 8859-* characters, imagine a new UTF-X
encoding maps ASCII to strange things and you'd be able to understand
our concern).

   Yes, I know Latin-1 is useful now for French.

=> it is more than useful, Latin-1 (soon ISO IS 8859-15) is necessary
if you need really readable texts in French.

   The way forward may well be to provide an input filter to convert
   Latin-1 (or any other encoding) to UTF8, and have ocaml process that.

=> my problem is the output of the filter will be no more readable when
I've put too much French in the program (in comments for instance).

   This requires almost no changes to the compiler: the design should
   open the set of characters acceptable in identifiers, probably
   to some subset of the set recommended in one of the ISO10646 related
   documents; the other change required is to accept \uXXXX and \UXXXXXXXX
   escapes in strings. String processing functions should generally
   continue to be 8 bit [per octet]: full internationalisation of client
   string handling functions is a very complex, non-trivial, task]
   
=> I believe internationalization should not be done by countries
where English is the only used language: this is at least awkward...

Regards

Francis.Dupont@inria.fr