caml-list - the Caml user's mailing list
* Re: Unicode, update
@ 2010-10-19  1:59 Paul Steckler
  2010-10-19 14:33 ` [Caml-list] " Ashish Agarwal
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Steckler @ 2010-10-19  1:59 UTC
  To: caml-list

Sylvain Le Gall <sylvain@le-gall.net> wrote:
> Would it be possible to publish them as an external library?

What I did isn't really complete enough to constitute a library, I'd say.

Here's what I did:

From Pervasives:

  open_out_win32
  open_out_bin_win32
  open_out_gen_win32
  open_in_win32
  open_in_bin_win32
  open_in_gen_win32

From Sys:

  file_exists_win32
  getcwd_win32
  chdir_win32
  missing: is_directory_win32, readdir_win32

I did not code up Win32/UTF8 equivalents of anything in the Unix module.
A complete library of Win32/UTF8 file functions would include a number of
items from that module.
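
As a rough sketch of the surface listed above, the functions could be
grouped under one OCaml signature.  The types are an assumption (each
function is given the type of the Pervasives or Sys function it replaces),
and the module names WIN32_UTF8_FILES and Portable are made up for
illustration; the actual code is not shown in this thread.

```ocaml
(* Assumed signatures: each *_win32 function gets the type of the
   standard library function it stands in for. *)
module type WIN32_UTF8_FILES = sig
  val open_out_win32 : string -> out_channel
  val open_out_bin_win32 : string -> out_channel
  val open_out_gen_win32 : open_flag list -> int -> string -> out_channel
  val open_in_win32 : string -> in_channel
  val open_in_bin_win32 : string -> in_channel
  val open_in_gen_win32 : open_flag list -> int -> string -> in_channel
  val file_exists_win32 : string -> bool
  val getcwd_win32 : unit -> string
  val chdir_win32 : string -> unit
end

(* Hypothetical placeholder that forwards to the standard library, so
   code written against the signature also builds where no Win32
   stubs are linked in. *)
module Portable : WIN32_UTF8_FILES = struct
  let open_out_win32 = open_out
  let open_out_bin_win32 = open_out_bin
  let open_out_gen_win32 = open_out_gen
  let open_in_win32 = open_in
  let open_in_bin_win32 = open_in_bin
  let open_in_gen_win32 = open_in_gen
  let file_exists_win32 = Sys.file_exists
  let getcwd_win32 = Sys.getcwd
  let chdir_win32 = Sys.chdir
end
```

The two functions listed as missing, is_directory_win32 and
readdir_win32, would presumably mirror Sys.is_directory and Sys.readdir
in the same way.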

If anyone's interested in finishing off what I've done, I could send along
what I have.

-- Paul



* Re: [Caml-list] Re: Unicode, update
  2010-10-19  1:59 Unicode, update Paul Steckler
@ 2010-10-19 14:33 ` Ashish Agarwal
  2010-10-19 15:19   ` Michael Ekstrand
  0 siblings, 1 reply; 5+ messages in thread
From: Ashish Agarwal @ 2010-10-19 14:33 UTC
  To: Paul Steckler; +Cc: caml-list

Have you considered adding these to the Batteries project? It would be good
to get general purpose functionality added directly there. Also that way you
don't have to feel you've done a lot of things; a single useful function
could be added.



* Re: [Caml-list] Re: Unicode, update
  2010-10-19 14:33 ` [Caml-list] " Ashish Agarwal
@ 2010-10-19 15:19   ` Michael Ekstrand
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Ekstrand @ 2010-10-19 15:19 UTC
  To: caml-list, Batteries Development

On 10/19/2010 09:33 AM, Ashish Agarwal wrote:
> Have you considered adding these to the Batteries project? It would be
> good to get general purpose functionality added directly there. Also
> that way you don't have to feel you've done a lot of things; a single
> useful function could be added.

I think there is promise in adding such functionality to Batteries'
BatFile module, particularly if we adopt a design similar to that of
Glib's filename support:

* An abstract (or private) type 'filename': on Win32, this is a UTF-8 or
UCS-2 Unicode string; on Unix, it is a string in the locale encoding
(which is the best you can do, unfortunately).
* Functions to convert to UTF-8 and the current locale for printing and
display.
* Idempotent conversion to/from UTF-8 for portable storage.  Glib
implements this with file:/// URLs, so that the non-ASCII characters can
be URL-encoded and thus preserved.
* Conversion from UTF-8 to handle user input, with the caveat that
converting to UTF-8 and back will not necessarily result in the same
filename.
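
One way to picture the design in the list above is the following OCaml
sketch.  The module name Fname and all of its functions are hypothetical;
this is neither Glib's nor Batteries' API.  It keeps the filename as
opaque native bytes and stores it portably by percent-encoding every byte
outside a small ASCII set, which is the mechanism behind URL encoding,
here without the file:/// prefix.

```ocaml
(* Hypothetical module: names are illustrative only. *)
module Fname : sig
  type t                                (* opaque, platform-native bytes *)
  val of_native : string -> t           (* wrap bytes as given by the OS *)
  val to_native : t -> string
  val to_portable : t -> string         (* lossless, ASCII-only form *)
  val of_portable : string -> t option  (* inverse of to_portable *)
end = struct
  type t = string
  let of_native s = s
  let to_native s = s

  (* Bytes that are copied through unchanged. *)
  let is_plain c =
    match c with
    | 'a'..'z' | 'A'..'Z' | '0'..'9' | '-' | '_' | '.' | '~' | '/' -> true
    | _ -> false

  (* Every other byte becomes %XX, so arbitrary bytes survive storage. *)
  let to_portable s =
    let buf = Buffer.create (String.length s) in
    String.iter
      (fun c ->
        if is_plain c then Buffer.add_char buf c
        else Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
      s;
    Buffer.contents buf

  let hex c =
    match c with
    | '0'..'9' -> Some (Char.code c - Char.code '0')
    | 'A'..'F' -> Some (Char.code c - Char.code 'A' + 10)
    | 'a'..'f' -> Some (Char.code c - Char.code 'a' + 10)
    | _ -> None

  (* Returns None on a truncated or non-hexadecimal escape. *)
  let of_portable s =
    let n = String.length s in
    let buf = Buffer.create n in
    let rec loop i =
      if i >= n then Some (Buffer.contents buf)
      else if s.[i] <> '%' then (Buffer.add_char buf s.[i]; loop (i + 1))
      else if i + 2 >= n then None
      else
        match hex s.[i + 1], hex s.[i + 2] with
        | Some h, Some l ->
          Buffer.add_char buf (Char.chr (16 * h + l));
          loop (i + 3)
        | _ -> None
    in
    loop 0
end
```

Because the portable form works on bytes rather than characters, a Unix
filename that is not valid in any Unicode encoding still round-trips.
Conversion for display, where loss is acceptable, would be a separate
function.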

There might be types and definitions in lablgtk2 that we could try to be
compatible with (particularly so you can use the results of a lablgtk2
file chooser as a filename to open).

OTOH, Qt seems to get along using Unicode strings for filenames, so the
encoding issues might not be that big a deal in practice.  The easy path
of just adding UTF-8 file open functions to BatFile (taking BatUTF8.t)
might be OK.
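
A minimal sketch of that easy path, assuming OCaml 4.14 or later for
String.is_valid_utf_8.  The Utf8 module is a hypothetical stand-in for
BatUTF8.t, and open_in_utf8 / open_out_utf8 are illustrative names.  The
bodies simply forward to the standard open functions; on Win32 they would
instead call wide-character primitives such as the *_win32 functions
described earlier in the thread.

```ocaml
(* Hypothetical stand-in for a validated UTF-8 string type. *)
module Utf8 : sig
  type t = private string
  val of_string : string -> t option   (* None if not valid UTF-8 *)
end = struct
  type t = string
  let of_string s = if String.is_valid_utf_8 s then Some s else None
end

(* Open a file by UTF-8 name.  The type guarantees the name was
   validated; the call itself is an ordinary open here. *)
let open_in_utf8 (name : Utf8.t) : in_channel =
  open_in (name :> string)

let open_out_utf8 (name : Utf8.t) : out_channel =
  open_out (name :> string)
```

The private type means callers cannot pass an unchecked string by
accident, while the implementation can still coerce back to string at no
cost.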

- Michael


* Re: Unicode, update
  2010-10-14 10:18 Paul Steckler
@ 2010-10-14 10:32 ` Sylvain Le Gall
  0 siblings, 0 replies; 5+ messages in thread
From: Sylvain Le Gall @ 2010-10-14 10:32 UTC
  To: caml-list

Hello,

On 14-10-2010, Paul Steckler <steck@stecksoft.com> wrote:
> A couple of weeks ago or so, I asked about using OCaml file primitives
> with the Camomile library for Unicode on Windows.  I thought I'd update
> people on the list about my resolution of these issues.
>
> I decided to make the application UTF-8 throughout, so that the string
> type always means UTF-8 -- OK, there are a few exceptions to that rule.
> The SQLite3 library already deals with UTF-8 in a graceful way.  The
> same is true for the C/C++ parsing library I'm using.  That leaves the
> OCaml library procedures, like open_in and open_out, which definitely
> don't handle Unicode filenames on Windows.
>
> I took the OCaml sources and made modified versions of functions, like
> file_exists, open_in, and so on, that convert filenames from UTF-8 to
> UTF-16 and then used "wide" versions of the underlying Win32 primitives.
> In some cases, I had to convert UTF-16 back to UTF-8.  The Win32
> functions MultiByteToWideChar and WideCharToMultiByte handle those
> conversions nicely.  I link in these new functions, named
> file_exists_win32, open_in_win32, etc., and everything works a treat.
>

Would it be possible to publish them as an external library? 

Thanks for the update
Sylvain Le Gall



* Unicode, update
@ 2010-10-14 10:18 Paul Steckler
  2010-10-14 10:32 ` Sylvain Le Gall
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Steckler @ 2010-10-14 10:18 UTC
  To: caml-list

A couple of weeks ago or so, I asked about using OCaml file primitives
with the Camomile library for Unicode on Windows.  I thought I'd update
people on the list about my resolution of these issues.

I decided to make the application UTF-8 throughout, so that the string
type always means UTF-8 -- OK, there are a few exceptions to that rule.
The SQLite3 library already deals with UTF-8 in a graceful way.  The
same is true for the C/C++ parsing library I'm using.  That leaves the
OCaml library procedures, like open_in and open_out, which definitely
don't handle Unicode filenames on Windows.

I took the OCaml sources and made modified versions of functions, like
file_exists, open_in, and so on, that convert filenames from UTF-8 to
UTF-16 and then used "wide" versions of the underlying Win32 primitives.
In some cases, I had to convert UTF-16 back to UTF-8.  The Win32
functions MultiByteToWideChar and WideCharToMultiByte handle those
conversions nicely.  I link in these new functions, named
file_exists_win32, open_in_win32, etc., and everything works a treat.
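
For illustration, the conversion step can also be written in pure OCaml.
This is a sketch, not the code described above, which calls
MultiByteToWideChar and WideCharToMultiByte from C.  It assumes OCaml
4.14 or later; the function names utf8_to_utf16le and utf16le_to_utf8 are
made up here.

```ocaml
(* Re-encode a UTF-8 string as UTF-16LE bytes.  A C stub passing the
   result to a wide Win32 call would also need to append a terminating
   NUL code unit (two zero bytes); that is left out here. *)
let utf8_to_utf16le (s : string) : string =
  let buf = Buffer.create (2 * String.length s) in
  let rec loop i =
    if i < String.length s then begin
      let d = String.get_utf_8_uchar s i in
      if not (Uchar.utf_decode_is_valid d) then
        invalid_arg "utf8_to_utf16le: malformed UTF-8";
      (* Characters above U+FFFF are written as a surrogate pair. *)
      Buffer.add_utf_16le_uchar buf (Uchar.utf_decode_uchar d);
      loop (i + Uchar.utf_decode_length d)
    end
  in
  loop 0;
  Buffer.contents buf

(* The reverse direction, for names returned by the OS (as with getcwd). *)
let utf16le_to_utf8 (s : string) : string =
  let buf = Buffer.create (String.length s) in
  let rec loop i =
    if i < String.length s then begin
      let d = String.get_utf_16le_uchar s i in
      if not (Uchar.utf_decode_is_valid d) then
        invalid_arg "utf16le_to_utf8: malformed UTF-16";
      Buffer.add_utf_8_uchar buf (Uchar.utf_decode_uchar d);
      loop (i + Uchar.utf_decode_length d)
    end
  in
  loop 0;
  Buffer.contents buf
```

Both functions raise Invalid_argument on malformed input rather than
substituting a replacement character, so a bad filename fails loudly
instead of silently naming a different file.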

-- Paul


