caml-list - the Caml user's mailing list
From: "Mattias Waldau" <mattias.waldau@abc.se>
To: <caml-list@inria.fr>
Cc: "Xavier Leroy" <xavier.leroy@inria.fr>
Subject: RE: [Caml-list] Non-mutable strings
Date: Wed, 16 Jan 2002 20:22:36 +0100
Message-ID: <AAEBJHFJOIPMMIILCEPBKEPBDGAA.mattias.waldau@abc.se>
In-Reply-To: <20020110185619.A20606@pauillac.inria.fr>

Nice to see that there is general interest in non-mutable strings.
However, as Xavier says, it may be a bit late.

We have another string problem, namely handling non-ASCII text. As I
understand it, one of the points of nML (http://ropas.kaist.ac.kr/n/), which
is a new ML language currently built using OCaml, is that it handles Asian
characters. Also, there was a post to this list recently about Asian
character encodings.

I don't think any language can remain pure ASCII forever. One of the
reasons for Ruby's success is that it handles non-ASCII text (I think it was
created by a Japanese developer). However, even we Swedes have problems: only
2 of our 3 special characters are in the lower 7 bits, and sorting is always
wrong.

A Unicode character takes between 1 and 4 bytes (in UTF-8), which means that
str[i] doesn't work (unless you do as NT or Java do and store strings as wide
chars internally, which of course OCaml could do too). You always have to
start at the beginning of the string to find the i-th character.
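
To make the cost concrete, here is a minimal sketch of finding the i-th
character by scanning from the start. It assumes the string holds well-formed
UTF-8; the names utf8_seq_len, utf8_offset and utf8_nth are made up for
illustration and are not part of any OCaml library.

```ocaml
(* Byte length of the UTF-8 sequence whose lead byte is [c].
   Assumption: input is well-formed UTF-8; other bytes are rejected. *)
let utf8_seq_len (c : char) : int =
  let b = Char.code c in
  if b < 0x80 then 1
  else if b land 0xE0 = 0xC0 then 2
  else if b land 0xF0 = 0xE0 then 3
  else if b land 0xF8 = 0xF0 then 4
  else invalid_arg "utf8_seq_len: not a lead byte"

(* Byte offset of the i-th character (0-based).
   There is no shortcut: every earlier character must be stepped over,
   so the cost grows linearly with i. *)
let utf8_offset (s : string) (i : int) : int =
  let n = String.length s in
  let rec go pos k =
    if pos >= n then invalid_arg "utf8_offset: index out of bounds"
    else if k = 0 then pos
    else go (pos + utf8_seq_len s.[pos]) (k - 1)
  in
  if i < 0 then invalid_arg "utf8_offset: negative index" else go 0 i

(* The i-th character, returned as the substring holding its bytes. *)
let utf8_nth (s : string) (i : int) : string =
  let pos = utf8_offset s i in
  String.sub s pos (utf8_seq_len s.[pos])
```

With s bound to the UTF-8 bytes of a Swedish word such as
"r\xC3\xA4ksm\xC3\xB6rg\xC3\xA5s", s.[1] is only the first byte of the second
character, while utf8_nth s 1 returns both of its bytes.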

Thus, introducing Unicode strings (or something similar; I have heard that
Asians don't like Unicode at all) and introducing non-mutable strings should
preferably be done simultaneously.

In order to have 8-bit char strings and Unicode strings simultaneously, we
need something like 'u"', and maybe the possibility to say that all strings
are Unicode. Can this be done using a module, just as 'open Float'
redefines '+' to '+.'?
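
For comparison, this is roughly how the operator case works. The sketch below
uses a made-up module name, Float_ops, rather than any real library module.
Operators are ordinary identifiers, so opening a module can shadow them; a
string literal is not an identifier, so the same trick would not by itself
change the type of a literal.

```ocaml
(* Hypothetical module: rebinds the integer operators to their
   floating-point counterparts. *)
module Float_ops = struct
  let ( + ) = ( +. )
  let ( - ) = ( -. )
  let ( * ) = ( *. )
  let ( / ) = ( /. )
end

(* Inside the local open, '+' and '*' mean '+.' and '*.'. *)
let area (r : float) : float =
  let open Float_ops in
  3.14159 * r * r

(* Outside the open, '+' is integer addition again. *)
let three : int = 1 + 2
```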

Or should OCaml v4 go the whole way and let all strings (and also
identifiers) be Unicode?

/mattias

P.S. Microsoft NT, 2000, and XP handle double-byte chars everywhere; the
string type is called BSTR, and library routines are called all the time for
string comparison etc. However, since a Unicode character can take 4 bytes, I
don't know how that is encoded into 2 bytes.
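
For reference, the UTF-16 encoding handles this with surrogate pairs: a code
point above U+FFFF is split across two 16-bit units. The sketch below shows
the arithmetic only; the function name to_utf16 is made up, and whether a
given Windows API treats its 16-bit strings as UTF-16 is an assumption not
checked here.

```ocaml
(* Encode one Unicode code point as a list of 16-bit UTF-16 code units. *)
let to_utf16 (cp : int) : int list =
  if cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF) then
    invalid_arg "to_utf16: not a Unicode scalar value"
  else if cp < 0x10000 then
    (* Fits in a single 16-bit unit. *)
    [cp]
  else
    (* Subtract 0x10000, leaving 20 bits; the high 10 bits go into a unit
       in D800..DBFF and the low 10 bits into a unit in DC00..DFFF. *)
    let v = cp - 0x10000 in
    [0xD800 lor (v lsr 10); 0xDC00 lor (v land 0x3FF)]
```

So the Swedish letters still take one 16-bit unit each, while a character
outside the first 65536 code points takes two, and indexing by unit is again
not the same as indexing by character.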


