caml-list - the Caml user's mailing list
From: "Daniel Bünzli" <daniel.buenzli@erratique.ch>
To: David Allsopp <dra-news@metastack.com>
Cc: Christophe TROESTLER <Christophe.Troestler@umons.ac.be>,
	OCaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] GSoC: better UTF-8 support
Date: Mon, 28 Feb 2011 12:21:32 +0100
Message-ID: <AANLkTi=dHMofPWCoMwMVtiB+Q-9A32JwP2OuuqxGLYoh@mail.gmail.com>
In-Reply-To: <E51C5B015DBD1348A1D85763337FB6D949100C29@Remus.metastack.local>

> If it's to go into the standard library then yes, it should exactly replicate the interface of the Char and String modules, that way within the standard library UTF8 can be used as a drop-in replacement
[...]

I'm not sure many programs would actually benefit from that. At a
certain point, if you really want to process Unicode at the character
level, you'll need a proper library. Using these ASCII/Latin-1 oriented
interfaces to process Unicode at the character level would be
debilitating and frustrating for your final users (e.g. no treatment
of normal forms; you do realize that in Unicode there's more than one
way of representing the character 'é').
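
To make the normal-forms point concrete, here is a minimal sketch using
only byte strings. The names e_acute_nfc and e_acute_nfd are made up for
this illustration. It shows two valid UTF-8 encodings of 'é' that a
byte-oriented String interface treats as different values.

```ocaml
(* Two UTF-8 encodings of what a reader sees as the same character 'é'.
   The identifiers below are illustrative, not from any library. *)

(* Precomposed form (NFC): U+00E9 LATIN SMALL LETTER E WITH ACUTE. *)
let e_acute_nfc = "\xC3\xA9"

(* Decomposed form (NFD): U+0065 'e' followed by
   U+0301 COMBINING ACUTE ACCENT. *)
let e_acute_nfd = "e\xCC\x81"

let () =
  (* String.length counts bytes, not characters: prints 2 and 3. *)
  Printf.printf "NFC bytes: %d, NFD bytes: %d\n"
    (String.length e_acute_nfc) (String.length e_acute_nfd);
  (* Structural equality compares bytes, so this prints false. *)
  Printf.printf "byte-wise equal: %b\n" (e_acute_nfc = e_acute_nfd)
```

Comparing such strings the way a user expects requires normalizing both
to the same form first, which is the job of a dedicated Unicode library.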

The current status quo already allows you to handle UTF-8 encoded
strings, as long as you don't try to look into them at the character
level, which is fine for many programs.
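
As a rough illustration of that opaque-bytes style (the values and
names below are invented for the example), UTF-8 text can be stored,
concatenated and printed with the existing String functions, provided
nothing indexes into it expecting one byte per character.

```ocaml
(* UTF-8 text handled as opaque byte strings; example values only. *)
let greeting = "h\xC3\xA9llo"   (* "héllo" encoded as UTF-8 *)
let name = "B\xC3\xBCnzli"      (* "Bünzli" encoded as UTF-8 *)

(* Concatenation never inspects individual characters, so the result
   is still valid UTF-8. *)
let line = String.concat ", " [greeting; name]

let () =
  (* Output is passed through byte for byte. *)
  print_endline line;
  (* The length is a byte count (15 here), not a character count. *)
  Printf.printf "%d bytes\n" (String.length line)
```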

> Out of interest, what are your complaints against the String and Char modules - missing functions or something deeper?

Every time I have to explode a string at a given separator and want to
use only the standard library I complain.
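
For reference, a minimal sketch of the kind of helper one ends up
rewriting. The name split_on is hypothetical. It relies only on
String.index_from and String.sub. Later OCaml releases (4.04 onward)
added String.split_on_char to the standard library, which covers this
case.

```ocaml
(* Split [s] at every occurrence of the byte [sep].
   [split_on] is a hypothetical helper, not a standard library function. *)
let split_on (sep : char) (s : string) : string list =
  let len = String.length s in
  let rec loop acc start =
    (* Find the next separator at or after [start], if any. *)
    let next =
      try Some (String.index_from s start sep) with Not_found -> None
    in
    match next with
    | Some i -> loop (String.sub s start (i - start) :: acc) (i + 1)
    | None -> List.rev (String.sub s start (len - start) :: acc)
  in
  loop [] 0

let () =
  (* Empty fields are kept: prints [a] [b] [] [c]. *)
  List.iter (Printf.printf "[%s] ") (split_on ',' "a,b,,c");
  print_newline ()
```

Because an ASCII byte never appears inside a multi-byte UTF-8 sequence,
splitting on an ASCII separator this way is also safe for UTF-8 encoded
input.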

> It is mildly amusing that you criticise the String and Char modules, yet have interest in this module given
[...]

The thing is that this support could be included without changing the
interfaces at all. Only the regexp language needs to be extended (and
I guess the underlying implementation wouldn't have to be changed).

Best,

Daniel

