From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id KAA16832 for caml-red; Fri, 12 Jan 2001 10:11:45 +0100 (MET) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id BAA09045 for ; Fri, 12 Jan 2001 01:21:48 +0100 (MET) Received: from mailserver.cli.di.unipi.it (crudelia.cli.di.unipi.it [131.114.11.37]) by concorde.inria.fr (8.11.1/8.10.0) with ESMTP id f0C0Llj04970 for ; Fri, 12 Jan 2001 01:21:47 +0100 (MET) Organization: Centro di Calcolo - Dipartimento di Informatica di Pisa - Italy Received: from fire.cli.di.unipi.it (fire-ext.cli.di.unipi.it [131.114.11.52]) by mailserver.cli.di.unipi.it (8.9.3/8.9.3) with SMTP id BAA26227 for ; Fri, 12 Jan 2001 01:20:28 +0100 Received: (qmail 21010 invoked by uid 7794); 12 Jan 2001 00:21:43 -0000 Received: from carlotta.cli.di.unipi.it(131.114.11.15) via SMTP by crudelia.cli.di.unipi.it, id smtpda20023; Fri Jan 12 01:19:02 2001 Received: from localhost (bernardp@localhost) by carlotta.cli.di.unipi.it (8.8.5/8.6.12) with SMTP id BAA00912 for ; Fri, 12 Jan 2001 01:19:15 +0100 (MET) X-Authentication-Warning: carlotta.cli.di.unipi.it: bernardp owned process doing -bs Date: Fri, 12 Jan 2001 01:19:15 +0100 (MET) From: Pierpaolo BERNARDI To: OCAML Subject: Re: Unicode (was RE: JIT-compilation for OCaml?) In-Reply-To: <3145774E67D8D111BE6E00C0DF418B663AD720@nt.kal.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: weis@pauillac.inria.fr On Thu, 11 Jan 2001, Dave Berry wrote: > I thought Unicode was a recognised subset of ISO-10646, corresponding to the > range 0-2^16. No. ISO-10646 and Unicode contains exactly the same code points. Unicode has room for about 2^20 code points. The ISO committee has agreed to limit ISO-10646 to the same range. The current version of Unicode 3.0.1 (and consequently of ISO-10646) has less than 2^16 code points assigned. The next version, due out in a couple of months will contain about 100.000 characters > My knowledge of C/C++ is probably out of date, but I thought they just used > the wide character type, without requiring a particular internal > representation. This is correct. > In what way do ISO C/C++ support ISO-10646? I have not been following this discussion, so I missed the message you are replying to. One can say that C supports ISO-10646 in the sense that a C environment *can* use ISO-10646 as its internal representation for wide chars, if the C implementor so chooses. Many compilers do just this, in fact. P.