From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 89348BBAF for ; Wed, 29 Sep 2010 07:05:40 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgkCAJ9lokxKfVK0kGdsb2JhbACUGY17CBUBAQEBCQkMBxEDH7Aem2aFRASKOg X-IronPort-AV: E=Sophos;i="4.57,251,1283724000"; d="scan'208";a="59425498" Received: from mail-wy0-f180.google.com ([74.125.82.180]) by mail3-smtp-sop.national.inria.fr with ESMTP; 29 Sep 2010 07:05:40 +0200 Received: by wyj26 with SMTP id 26so528081wyj.39 for ; Tue, 28 Sep 2010 22:05:39 -0700 (PDT) MIME-Version: 1.0 Received: by 10.216.235.106 with SMTP id t84mr903148weq.46.1285736739602; Tue, 28 Sep 2010 22:05:39 -0700 (PDT) Received: by 10.216.137.84 with HTTP; Tue, 28 Sep 2010 22:05:39 -0700 (PDT) X-Originating-IP: [203.143.161.115] Date: Wed, 29 Sep 2010 15:05:39 +1000 Message-ID: Subject: Windows filenames and Unicode From: Paul Steckler To: caml-list@yquem.inria.fr Content-Type: text/plain; charset=ISO-8859-1 X-Spam: no; 0.00; steckler:01 steck:01 ocaml:01 readdir:01 camomile:01 ocaml:01 ocaml's:01 argv:01 rewriting:01 strings:01 primitives:01 primitives:01 filenames:02 filenames:02 sys:03 In Windows, NTFS filenames are specified in Unicode (UTF-16). Am I right in thinking that OCaml file primitives, like open_in, readdir, etc. cannot handle NTFS filenames containing characters with codepoints greater than 255? I'm aware of the Camomile library, which gives the ability to manipulate UTF-16 strings inside of OCaml. But it looks like crucial points of OCaml's I/O, like Sys.argv and file primitives are strictly limited to 8-bit characters. Is there a way around this limitation, other than rewriting the file I/O primitives? -- Paul