Date: Wed, 29 Sep 2010 10:14:47 +0200
From: Jerome Vouillon
To: Paul Steckler
Cc: David Allsopp, "caml-list@yquem.inria.fr"
Subject: Re: [Caml-list] Windows filenames and Unicode
Message-ID: <20100929081447.GA10841@pps.jussieu.fr>

On Wed, Sep 29, 2010 at 05:26:52PM +1000, Paul Steckler wrote:
[...]
> For Linux, I was planning on enforcing the invariant that all
> strings inside my program are UTF-8. For Windows, I could use the
> same invariant, and modify the OCaml runtime so that all calls to
> Windows file primitives have those strings translated to UTF-16 (and
> return values translated back to UTF-8). That is, I'd have to build
> a custom version of OCaml and wrap CreateFile, etc. with such
> Unicode translation functions.

The Unison file synchronizer
(http://www.cis.upenn.edu/~bcpierce/unison/) has bindings for the
UTF-16 Windows API. You should have a look at it. At the moment,
the code is fairly tied to Unison (in particular because we still
want to be able to access the 8-bit API for compatibility with
previous versions of Unison), but it would be great to turn it into
a standalone library. One possibility would be to write a Unicode
version of the Unix library.

> All this is made slightly more complicated by the fact that I'm using
> the MinGW version of OCaml.

There is no difficulty accessing the UTF-16 Windows API with the MinGW
version of OCaml. Actually, I'm even cross-compiling from Linux
without any difficulty.

-- Jerome
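
For illustration, the UTF-8/UTF-16 translation step discussed in this thread could be sketched in pure OCaml roughly as follows. This is not Unison's code: the names transcode, utf8_to_utf16le and utf16le_to_utf8 are hypothetical, and the sketch assumes OCaml 4.14 or later, whose standard library provides String.get_utf_8_uchar and String.get_utf_16le_uchar.

```ocaml
(* Sketch only: assumes OCaml >= 4.14.  The function names below are
   hypothetical; they are not part of Unison or of the OCaml runtime. *)

(* Transcode [s] by decoding one code point at a time with [decode] and
   re-encoding it into a buffer with [encode].  Malformed input is
   replaced by U+FFFD, which is what [Uchar.utf_decode_uchar] returns
   for an invalid decode. *)
let transcode decode encode s =
  let buf = Buffer.create ((2 * String.length s) + 2) in
  let rec loop i =
    if i < String.length s then begin
      let d = decode s i in
      encode buf (Uchar.utf_decode_uchar d);
      (* Always advances by at least one byte, so the loop terminates. *)
      loop (i + Uchar.utf_decode_length d)
    end
  in
  loop 0;
  Buffer.contents buf

(* UTF-8 string -> UTF-16LE bytes (no terminating NUL is appended). *)
let utf8_to_utf16le =
  transcode String.get_utf_8_uchar Buffer.add_utf_16le_uchar

(* UTF-16LE bytes -> UTF-8 string. *)
let utf16le_to_utf8 =
  transcode String.get_utf_16le_uchar Buffer.add_utf_8_uchar
```

A file name such as "caf\xC3\xA9.txt" survives a round trip through both functions unchanged. Before the UTF-16 bytes are handed to a wide-character Windows call such as CreateFileW (presumably from a C stub), a 16-bit NUL terminator would still have to be appended.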