From: David Allsopp
To: Paul Steckler, "caml-list@yquem.inria.fr"
Subject: RE: [Caml-list] Windows filenames and Unicode
Date: Wed, 29 Sep 2010 06:23:04 +0000

Paul Steckler wrote:
> In Windows, NTFS filenames are specified in Unicode (UTF-16). Am I right
> in thinking that OCaml file primitives, like open_in, readdir, etc. cannot
> handle NTFS filenames containing characters with codepoints greater than
> 255?

Given that the WinAPI "wide" functions use UTF-16, you can of course fake UTF-16 on top of normal OCaml strings, but I think you'll hit a brick wall: the I/O primitives are built on the underlying C library functions, which ultimately call the ANSI versions of the Windows API functions, not the Unicode ones.

> I'm aware of the Camomile library, which gives the ability to manipulate
> UTF-16 strings inside of OCaml. But it looks like crucial points of
> OCaml's I/O, like Sys.argv and file primitives are strictly limited to
> 8-bit characters.
>
> Is there a way around this limitation, other than rewriting the file I/O
> primitives?

One way (not foolproof on Windows 7 and Windows 2008 R2, because short-name generation can be disabled there) would be to wrap the GetShortPathName Windows API function[1], which converts the pathname to its DOS 8.3 form, which will not contain Unicode characters.
Another way might be to wrap CreateFileW (the Unicode version of CreateFile) and convert the resulting handle into one compatible with the standard library functions, but I reckon that could be tricky!


David

[1] http://msdn.microsoft.com/en-us/library/aa364989(v=VS.85).aspx
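
P.S. For what it's worth, here is a minimal sketch of the "fake UTF-16 on top of normal OCaml strings" idea mentioned above. It only covers the encoding half of the problem and does nothing about the ANSI I/O primitives. The function name utf16le_of_utf8 is made up for illustration, and the sketch assumes that the input string holds UTF-8 and that OCaml 4.14 or later is available (for String.get_utf_8_uchar).

```ocaml
(* Hypothetical helper: re-encode a UTF-8 string as UTF-16LE, with the
   result still stored in an ordinary OCaml (byte) string.
   Assumes OCaml >= 4.14 and UTF-8 input. *)
let utf16le_of_utf8 (s : string) : string =
  let buf = Buffer.create (2 * String.length s) in
  let rec loop i =
    if i < String.length s then begin
      (* Decode one code point starting at byte offset i. Malformed
         input decodes to U+FFFD instead of raising an exception. *)
      let d = String.get_utf_8_uchar s i in
      (* Append it as one or two little-endian 16-bit units. *)
      Buffer.add_utf_16le_uchar buf (Uchar.utf_decode_uchar d);
      loop (i + Uchar.utf_decode_length d)
    end
  in
  loop 0;
  Buffer.contents buf
```

For example, utf16le_of_utf8 "a\xC3\xA9" (an "a" followed by U+00E9) gives the four bytes 61 00 E9 00. A C stub handing such a string to one of the "wide" API functions would presumably also have to append a two-byte NUL terminator, which this sketch does not do.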