From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 5D712BB83 for ; Sat, 19 Aug 2006 14:42:02 +0200 (CEST) Received: from blaster.systems.pipex.net (blaster.systems.pipex.net [62.241.163.7]) by nez-perce.inria.fr (8.13.6/8.13.6) with ESMTP id k7JCg2Jh008145 for ; Sat, 19 Aug 2006 14:42:02 +0200 Received: from [192.168.1.5] (81-86-133-45.dsl.pipex.com [81.86.133.45]) by blaster.systems.pipex.net (Postfix) with ESMTP id 4105CE000476; Sat, 19 Aug 2006 13:41:46 +0100 (BST) Message-ID: <44E7070A.40704@jollys.org> Date: Sat, 19 Aug 2006 13:41:46 +0100 From: Peter Jolly User-Agent: Thunderbird 1.5.0.5 (Windows/20060719) MIME-Version: 1.0 To: Jonathan Roewen Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Supporting unicode in ocaml... References: <20060819090500.GB2213@furbychan.cocan.org> In-Reply-To: Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit X-Spam: no; 0.00; ocaml:01 preprocess:01 afaik:01 ocaml:01 bom:98 constants:01 wrote:01 minor:01 caml-list:01 encoding:02 namely:02 string:02 string:02 module:03 handles:03 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 Jonathan Roewen wrote: > Yes I did (as if you read closer, I did mention). But being able to > use actual utf8 rather than manually encoding characters in the string > is what would be nice. Maybe I just need a script to preprocess > sources in utf8... AFAIK, all you need to do is get rid of the BOM, which is redundant in UTF-8 anyway. I have no problems at all using UTF-8 encoded string constants in OCaml, and Camomile's UTF8 module then handles them seamlessly. YMMV. (There is one minor issue, which is that there are legal OCaml identifiers which become illegal when the file is encoded in UTF-8 -- namely those that use accented letters.)