From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id KAA17087 for caml-red; Fri, 12 Jan 2001 10:14:54 +0100 (MET) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id IAA14747 for ; Fri, 12 Jan 2001 08:56:58 +0100 (MET) Received: from localhost.localdomain (kenny77.zip.com.au [61.8.18.205]) by nez-perce.inria.fr (8.11.1/8.10.0) with ESMTP id f0C7ut522253 for ; Fri, 12 Jan 2001 08:56:56 +0100 (MET) Received: from ozemail.com.au (IDENT:root@localhost [127.0.0.1]) by localhost.localdomain (8.9.3/8.8.7) with ESMTP id SAA07100; Fri, 12 Jan 2001 18:55:25 +1100 Message-ID: <3A5EB86C.86B0DF9D@ozemail.com.au> Date: Fri, 12 Jan 2001 18:55:24 +1100 From: John Max Skaller X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.12-20 i686) X-Accept-Language: en MIME-Version: 1.0 To: Alain Frisch CC: OCAML Subject: Re: JIT-compilation for OCaml? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: weis@pauillac.inria.fr Alain Frisch wrote: > What are you calling 'Unicode' ? For me it is the 'Unicode standard' > from the 'Unicode consortium' (http://www.unicode.org/), and it doesn't > seem like it was abandoned. Actually, it is in sync with ISO/IEC 10646 > as for the character set. Perhaps I should have been clearer: the Unicode consortium is indeed active, and is one of the main players in the ISO-10646 and related standards forum. It is also responsible for making data available freely, unlike ISO and National Standards Bodies, for I am very thankful. What I meant was that, as a separate Standard, Unicode is considered archaic: it is subsumed by ISO-10646. No one should provide (merely) Unicode support in a modern language, (IMHO) they should provide ISO-10646 support. ISO usually provides the basis for National Standards when these exist, for example the US National Standard for C++ is a rubber stamp of the ISO one (although the US was the main, but not only, contributor to the formation of the document). Unicode was useful in the days when industry couldn't wait for international consensus, but that consensus now exists, and the standard to use for international character sets is ISO-10646, not Unicode. This does require more work, since while tables of size 2^16 words are viable, 2^31 generally isn't. However the 2^31 code points leave plenty of room for expansion, while Unicode is already too small to encode all the world's languages (although the coverage is quite good), in particular, there's little room for extensive coverage of mathematical symbols. Note that some government type projects require conformance to ISO standards, and merely providing Unicode may threaten the commerical viability of a product. For example, countries like Australia and New Zealand have a need for working with text in many languages, and they're likely to require ISO-10646, since it has scope to include Aboriginal and Maori, which Unicode does not. I note that Unicode is still important, because many operating systems support Unicode filenames, and some programming languages like Java and Python provide Unicode support. One still has to support 'legacy' systems. [BTW: Java's lack of ISO-10646 support would likely have been an obstacle in the process of ISO Standardisation, had that been followed through] -- John (Max) Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 checkout Vyper http://Vyper.sourceforge.net download Interscript http://Interscript.sourceforge.net