caml-list - the Caml user's mailing list
From: Goswin von Brederlow <goswin-v-b@web.de>
To: Daniel Buenzli <daniel.buenzli@erratique.ch>
Cc: OCaml List <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] Strings
Date: Fri, 03 Apr 2009 23:44:04 +0200
Message-ID: <87prftifa3.fsf@frosties.localdomain>
In-Reply-To: <8697F924-0485-4E00-81DF-9BCF74D872EA@erratique.ch> (Daniel Buenzli's message of "Fri, 3 Apr 2009 19:50:48 +0200")

Daniel Bünzli <daniel.buenzli@erratique.ch> writes:

> On 3 Apr 2009, at 18:52, Martin Jambon wrote:
>
>> I love this recurrent discussion!
>
> I love your carefully argumented response !
>
>> - I see absolutely no practical advantage of having an immutable
>> "character string" type.
>
> In fact I find the result of the following sequence of operations very
> disappointing for a functional programming language :
>
>         Objective Caml version 3.11.0
>
> # Sys.os_type;;
> - : string = "Unix"
> # let s = Sys.os_type;;
> val s : string = "Unix"
> # s.[0] <- 'a';;
> - : unit = ()
> # Sys.os_type;;
> - : string = "anix"
>
> I think it is a design error to conflate strings and byte arrays. You
> clearly want both, but each with its own type and strings as
> immutable. Individual character mutability is rarely needed in text
> processing and having immutable strings avoids the kind of quirks as
> seen above.

I think that is a design flaw in Sys. Strings are mutable, but os_type
is a constant, so Sys should not hand out mutable access to it.

With the current String module, a better way would be to return a copy
of os_type on each invocation. The drawback is that two calls then
never return physically equal strings:

Sys.os_type () != Sys.os_type ()
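
A minimal sketch of that trade-off, assuming a current OCaml where
mutable byte sequences live in Bytes (under 3.11 the same idea would use
String.copy). The names os_type_master and os_type are hypothetical
stand-ins, not the real Sys API:

```ocaml
(* Hypothetical stand-in for Sys.os_type, not the real API. *)
let os_type_master = "Unix"

(* Hand out a fresh mutable copy on every call. *)
let os_type () : Bytes.t = Bytes.of_string os_type_master

let a = os_type ()
let b = os_type ()

let () =
  Bytes.set a 0 'a';                     (* mutating one copy ... *)
  assert (Bytes.to_string b = "Unix");   (* ... leaves the other untouched *)
  assert (a != b)                        (* but two results are never physically equal *)
```

The copies still compare equal with structural equality (=) as long as
nobody mutates them; only physical equality (==) is lost.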

> You'll think that's a marginal example, but that actually happens in
> practice. For example in xmlm when I return a signal for a start tag I
> do not String.copy the tag name to avoid allocating too much. Thus in
> the documentation there's the following ugly advice :
>
> "The module assumes strings are immutable, thus strings the client
> gives or receives during the input and output process must not be
> modified."
>
> And if you don't follow the advice and mutate the tag's name before
> the end tag was parsed (or output) you'll get a tag mismatch error
> even though the document (or the output) is perfectly valid.
>
> Having immutable strings would not rely on the client for correctness
> of operation and that's always an advantage. Of course you'll tell me
> just use String.copy inside xmlm et voilà, but then you traded
> correctness for performance in a case where you could have both with
> immutable strings.

This is not just a problem for strings. Any mutable data type can suffer
from the same aliasing.
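
For illustration, the same aliasing with an array. This is only a
sketch; the Primes module and its first_four function are made up for
the example:

```ocaml
(* Hypothetical module that caches a table and hands out the array itself. *)
module Primes = struct
  let table = [| 2; 3; 5; 7 |]
  let first_four () = table   (* no copy: callers share the storage *)
end

let () =
  let t = Primes.first_four () in
  t.(0) <- 4;                               (* the client mutates the shared array *)
  assert ((Primes.first_four ()).(0) = 4)   (* the module's "constant" has changed *)
```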

>> - There is nothing to change in OCaml's string type because it is an
>> "array of bytes", with type char representing single bytes.
>
>
> Oh no, there's nothing to change at all, that's a perfect
> implementation of byte arrays. You just want another type for
> immutable strings.
>
> Best,
>
> Daniel

It wouldn't be too hard to change the String module to allow for both
mutable and immutable strings:

module S :
sig
  type const
  type mutabl
  type 'a t
  val make : string -> mutabl t
  val set : mutabl t -> int -> char -> unit
  val get : 'a t -> int -> char
  val const : 'a t -> const t
  val print : 'a t -> unit
end = struct
  type const
  type mutabl
  type 'a t = string
  let make s = s
  let set = String.set
  let get = String.get
  let const s = s
  let print = print_string
end

# let str = S.make "hallo" in
  S.set str 0 'H'; S.print str;;
# let str = S.const (S.make "hallo") in
  S.set str 0 'H'; S.print str;;
        ^^^
Error: This expression has type S.const S.t but is here used with type
         S.mutabl S.t

By adding a phantom type, the type system can keep track of where a
string is mutable and where it is not. The only restriction is that
"const" does not mean the string will never change. It only means that
the string cannot be changed through that reference:

# let str = S.make "hallo" in
  let str2 = S.const str in
    S.set str 0 'H'; S.print str2;;
Hallo- : unit = ()

If you let a mutable reference to the string escape and then assume it
remains const, that is your problem. This is easily avoidable in a
library or module.
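
One way a library could avoid the escape is to copy when it needs a
value that really stays fixed. The following is a sketch only, assuming
a current OCaml with Bytes as the mutable representation; freeze and
to_string are names invented here and are not part of the module above:

```ocaml
module S : sig
  type const
  type mutabl
  type 'a t
  val make : string -> mutabl t
  val set : mutabl t -> int -> char -> unit
  val get : 'a t -> int -> char
  val const : 'a t -> const t     (* read-only view, shares storage *)
  val freeze : 'a t -> const t    (* read-only private copy *)
  val to_string : 'a t -> string
end = struct
  type const
  type mutabl
  type 'a t = Bytes.t
  let make s = Bytes.of_string s
  let set = Bytes.set
  let get = Bytes.get
  let const s = s
  let freeze s = Bytes.copy s
  let to_string s = Bytes.to_string s
end

let () =
  let str = S.make "hallo" in
  let view = S.const str in       (* still aliases str *)
  let snap = S.freeze str in      (* no mutable reference to it exists *)
  S.set str 0 'H';
  assert (S.to_string view = "Hallo");
  assert (S.to_string snap = "hallo")
```

The copy costs an allocation, which is the same performance trade-off
discussed for xmlm above, but the client can no longer break the
library's assumption.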

Regards,
        Goswin

