caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* RE: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
@ 2006-05-29 20:52 Harrison, John R
  0 siblings, 0 replies; 18+ messages in thread
From: Harrison, John R @ 2006-05-29 20:52 UTC (permalink / raw)
  To: Martin Jambon, Caml List; +Cc: Harrison, John R

Hi Martin,

| OK, but let's be pragmatic: what kind of interface and implementation
do
| you have in mind?

I did indeed have a very specific example in mind, my theorem prover HOL
Light. I have an OCaml type of typed lambda-terms:

  type term =
      Var of string * hol_type
    | Const of string * hol_type
    | Comb of term * term
    | Abs of term * term

The type "term" is private, and the abstract type interface only permits
you to construct well-typed terms, via interface functions like "mk_var"
and "mk_comb". For example, the call "mk_comb(s,t)" gives "Comb(s,t)"
provided the types agree, and fails otherwise.

I would like the user to be able to write "mk_var(x,ty)" and the net
result to be just one cons "Var(x,ty)" with "x" and "ty" identical to
the
input arguments. But with mutable strings, it is possible in principle
for the string "x" inside that object to get modified by other code.
Of course, it's a bit artificial, but I would like it to be impossible,
given that the principle of LCF provers is that the user should be able
to use arbitrary programs while having soundness enforced by the ML
type system.

Of course, I can use my own private type of strings, but then I need
to convert every time I use standard library functions, pattern matching
is a bit less convenient, etc.

John.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-28 23:20 Harrison, John R
  2006-05-29  2:36 ` Martin Jambon
@ 2006-05-31 12:53 ` Jean-Christophe Filliatre
  1 sibling, 0 replies; 18+ messages in thread
From: Jean-Christophe Filliatre @ 2006-05-31 12:53 UTC (permalink / raw)
  To: Harrison, John R; +Cc: Martin Jambon, Caml List


Harrison, John R writes:
 > The point is not that I will mutate a string by accident.

I once discovered a bug in  the Coq proof assistant that was precisely
due  to  a  string (an  identifier)  mutated  by  accident (may  be  I
shouldn't say  it :-) A name  was capitalized in-place  somewhere in a
piece  of code  unrelated with  the Coq  kernel but  of course  it had
consequences all over the system (including the kernel).

So I'm definitely in favor of immutable strings, for the exact reasons
mentioned by John.

But I  think an abstract data type  is not really an  issue, since one
does little pattern-matching on  strings in practice.  And having your
own  abstract data type  for immutable  strings has  other advantages,
such as  the ability  to share equal  strings (using  hash-consing) to
speedup  names comparisons. Even  printing is  not painful  provided a
suitable formatter-based printing function and %a.

-- 
Jean-Christophe Filliâtre (http://www.lri.fr/~filliatr)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-28 23:20 Harrison, John R
@ 2006-05-29  2:36 ` Martin Jambon
  2006-05-31 12:53 ` Jean-Christophe Filliatre
  1 sibling, 0 replies; 18+ messages in thread
From: Martin Jambon @ 2006-05-29  2:36 UTC (permalink / raw)
  To: Harrison, John R; +Cc: Caml List

Hi John,

On Sun, 28 May 2006, Harrison, John R wrote:

> With immutable strings, you'd never need to do conversions at the module
> interfaces. As with any other functional data structure, you only copy
> when you want to change part of it.

OK, but let's be pragmatic: what kind of interface and implementation do 
you have in mind?

(and then: isn't it possible to implement in OCaml?)


If anyone is interested:

Before posting I tried a polymorphic (wrt mutability) string type.
It was fun enough, but it doesn't scale very well. I put it there:

   http://martin.jambon.free.fr/ocaml.html#gstring



Martin

--
Martin Jambon, PhD
http://martin.jambon.free.fr

Edit http://wikiomics.org, bioinformatics wiki


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
@ 2006-05-28 23:20 Harrison, John R
  2006-05-29  2:36 ` Martin Jambon
  2006-05-31 12:53 ` Jean-Christophe Filliatre
  0 siblings, 2 replies; 18+ messages in thread
From: Harrison, John R @ 2006-05-28 23:20 UTC (permalink / raw)
  To: Martin Jambon, Caml List; +Cc: Harrison, John R

Hi Martin,

| I disagree: has it ever happened to you to mutate a string by
accident?

The point is not that I will mutate a string by accident. I've never
done
it by accident or by design. The point is that I can't depend on code
that I call, or code that calls mine, not to subsequently modify strings
that are passed as arguments. So if I really need to reliably fix them I
am forced into expensive copy operations.

In practice, the obvious library calls are safe, so like Aleksey, I use
the built-in strings for the sake of convenience and compatibility. But
it's unsatisfactory intellectually. Some of us want to program in a
primarily functional style, yet the implementation of one of the most
basic and useful datatypes is not functional.

| Yes, so how do you avoid copies without using the "unsafe" conversions
all
| over the place?

With immutable strings, you'd never need to do conversions at the module
interfaces. As with any other functional data structure, you only copy
when you want to change part of it.

John.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25 11:14               ` Brian Hurt
  2006-05-25 19:42                 ` Oliver Bandel
@ 2006-05-26  6:51                 ` Alain Frisch
  1 sibling, 0 replies; 18+ messages in thread
From: Alain Frisch @ 2006-05-26  6:51 UTC (permalink / raw)
  To: Brian Hurt; +Cc: caml-list

Brian Hurt wrote:
> Note that this also eliminates an inefficiency of ocaml.  When you use a
> constant string, it has to allocate a new string for you.  Consider
> Ocaml's response to the expression:
> 
> "foo" == "foo";;
> 
> The reason ocaml returns false in the above expression is that when you
> give a constant string in an expression, Ocaml first has to make a copy
> of that string (in case you modify it), and then it returns the copy. 

This is not true. The two strings above are physically different because
the two constants appear at different location of the source.

The expression

let f () = "foo" in f () == f ()

evaluates to true. The block for "foo" is allocated only once; the
string is not copied not each time the body of f is evaluated.
As a consequence,

let f () = "foo" in (f ()).[0] <- 'F'; f ()

returns "Foo".

-- Alain


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25 23:00                   ` Jon Harrop
@ 2006-05-25 23:15                     ` Martin Jambon
  0 siblings, 0 replies; 18+ messages in thread
From: Martin Jambon @ 2006-05-25 23:15 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

On Fri, 26 May 2006, Jon Harrop wrote:

> On Thursday 25 May 2006 19:44, Tom wrote:
> [immutable strings]
>> But why don't you add such a type yourself?
>
> You would have to make your new type abstract in order to distinguish it from
> the build-in mutable string type. Then you lose the ability to do pattern
> matching and s.[i].

The former problem can be solved with camlp4 (which would use a "unsafe" 
function to do the type conversion). The latter can be solved with camlp4 
too, or simply by opening a module which defines a String module which 
provides a "get" function.

Martin

--
Martin Jambon, PhD
http://martin.jambon.free.fr

Edit http://wikiomics.org, bioinformatics wiki


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25 18:44                 ` Tom
@ 2006-05-25 23:00                   ` Jon Harrop
  2006-05-25 23:15                     ` Martin Jambon
  0 siblings, 1 reply; 18+ messages in thread
From: Jon Harrop @ 2006-05-25 23:00 UTC (permalink / raw)
  To: caml-list

On Thursday 25 May 2006 19:44, Tom wrote:
[immutable strings]
> But why don't you add such a type yourself?

You would have to make your new type abstract in order to distinguish it from 
the build-in mutable string type. Then you lose the ability to do pattern 
matching and s.[i].

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25 17:31               ` Aleksey Nogin
@ 2006-05-25 19:54                 ` Martin Jambon
  0 siblings, 0 replies; 18+ messages in thread
From: Martin Jambon @ 2006-05-25 19:54 UTC (permalink / raw)
  To: Caml List; +Cc: Martin Jambon

On Thu, 25 May 2006, Aleksey Nogin wrote:

> On 24.05.2006 22:56, Martin Jambon wrote:
>
>>> I think it's OK to have (mutable) byte arrays, but strings should simply
>>> always be immutable.
>>  OCaml strings are compact byte arrays which serve their purpose well.
>
> Yes, however immutable strings are also very useful and that functionality is 
> simply missing in OCaml. The usage I am very interested in is essentially 
> using strings as "printable tokens". In other words, a data type that is easy 
> to compare and has an obvious I/O representation.
>
>> Having a whole different type for immutable strings is in my opinion a 
>> waste of energy. The problem is that freezing or unfreezing a string safely 
>> involves a copy of the whole string. And obviously it's not possible to 
>> handle only immutable strings since somehow you have to create them, and 
>> unlike record fields, they won't be set in one operation but in n 
>> operations, n being the length of the string.
>
> This is not true. All I want is having a purely functional interface with:
> - Constants (a compiler flag for turning "..." constants into immutable 
> strings instead of mutable ones).
> - Inputing from a channel
> - Concatenation
> - Things like string_of_int for immutable string.

Isn't it a bit limited? What if I want other functions?

But if it satisfies you, you can do the syntax part with an unsafe_freeze 
function and a bit of camlp4. The rest is just plain old OCaml.

> Of course, it might be the case that the standard library might have to use 
> some sort of "unsafe" operations that would "inappropriately" mutate the 
> newly created immutable string buffer, but this is IMHO no different than how 
> the unsafe operations are already used in standard library for arrays and 
> strings.

I disagree: has it ever happened to you to mutate a string by accident?
I never met this situation and this is mostly why I don't see the point of 
your suggestions. This strongly constrasts with mistakes in array/string 
indices which happen all the time.


>> So I'd really love to see actual examples where using immutable strings 
>> would be such an improvement over mutable strings.
>> If the problem is just to ensure that string data won't be changed by the 
>> user of a library, then it is trivial using module signatures and 
>> String.copy for the conversions.
>
> Such a copy operation can be extremely prohibitive in a setting that assumes 
> that a data structure is immutable and tries really hard to preserve sharing 
> (including using functions like a sharing-preserving version of map (*), 
> etc). In such a setting, these extra copies can potentially have a 
> devastating effect on memory usage, cache performance, etc. And this 
> situation is exactly what we have in our MetaPRL project - there we have 
> resorted to simply using strings and pretending they are immutable, but this 
> is clearly suboptimal.

Yes, so how do you avoid copies without using the "unsafe" conversions all 
over the place?


> ----
> (*)
> let rec smap f = function
>   [] -> []
> | (hd :: tl) as l ->
>      let hd' = f hd in
>      let tl' = smap f tl in
>         if hd == hd' && tl == tl' then l else hd' :: tl'

In order to maximize sharing, I'd rather use a global weak hash table.
In your context, it seems that you could afford String.copy, as long as it 
doesn't break sharing:

let freeze s =
   let s' = make_constant s (* using a copy! *) in
   if s' is in the table then return the element from the table
   else add s' and return s'



Martin

--
Martin Jambon, PhD
http://martin.jambon.free.fr

Edit http://wikiomics.org, bioinformatics wiki


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25 11:14               ` Brian Hurt
@ 2006-05-25 19:42                 ` Oliver Bandel
  2006-05-26  6:51                 ` Alain Frisch
  1 sibling, 0 replies; 18+ messages in thread
From: Oliver Bandel @ 2006-05-25 19:42 UTC (permalink / raw)
  To: caml-list

On Thu, May 25, 2006 at 07:14:55AM -0400, Brian Hurt wrote:
[...]
> Note that even pure-imperitive languages, like C/C++, hit this problem. 
> What is the output of the following peice of C code:
>     for (i = 0; i < 10; ++i) {
>         char * p = "foo";
>         p[0] += 1;
>         printf("%s\n", p);
>     }
[...]

Heheh, this brings a "Bus Error" (which one might call an "exception" ;-)
 (an exception th C'ish way)).

Ciao,
   Oliver


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25  7:23               ` j h woodyatt
  2006-05-25 10:22                 ` Jon Harrop
@ 2006-05-25 19:28                 ` Oliver Bandel
  1 sibling, 0 replies; 18+ messages in thread
From: Oliver Bandel @ 2006-05-25 19:28 UTC (permalink / raw)
  To: caml-list

On Thu, May 25, 2006 at 12:23:51AM -0700, j h woodyatt wrote:
> On May 24, 2006, at 10:56 PM, Martin Jambon wrote:
> >
> >So I'd really love to see actual examples where using immutable  
> >strings would be such an improvement over mutable strings.
> >If the problem is just to ensure that string data won't be changed  
> >by the user of a library, then it is trivial using module  
> >signatures and String.copy for the conversions.
> 
> I have no dog in this fight, but I can imagine a pragmatic approach  
> that might satisfy some of these concerns without introducing much in  
> the way of extra runtime complexity costs.
> 
> Change the way strings are boxed so that there is an additional tag  
> for immutable strings as opposed to the normal mutable ones.  In all  
> respects, allow string objects with either tag to be treated the same  
> by functions that do not modify the content of the string.  When the  
> "safe" variants of the string mutation functions are called on a  
> string object with the immutable tag, throw a runtime exception.   
[...]

Compiletime error would be what I would prefer.


[...]
> It *might*  
> give a pragmatic benefit of surfacing unintentional programming  
> errors earlier than they might otherwise be found,
[...]

The earlier the better it is.
It's like non-matching types found by the compiler: it helps
a lot in development. Debugging normallay takes the most time
in development; and with OCaml debugging time decreases much - that
is what I experienced. And this comes from early getting a hint
to possible problems (possible with compilers that would accept
such code; im possible when the compiler says: "no! I don't accept
these sources!").

Ciao,
  Oliver


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25 17:34               ` Aleksey Nogin
@ 2006-05-25 18:44                 ` Tom
  2006-05-25 23:00                   ` Jon Harrop
  0 siblings, 1 reply; 18+ messages in thread
From: Tom @ 2006-05-25 18:44 UTC (permalink / raw)
  To: Caml List

[-- Attachment #1: Type: text/plain, Size: 609 bytes --]

On 25/05/06, Aleksey Nogin <nogin@cs.caltech.edu> wrote:
>
> On 25.05.2006 04:18, Brian Hurt wrote:
>
> > Changing this behavior in Ocaml
> > would break huge amounts of code (including some of my code).
>
> It is obvious that at this point changing the default behavior would not
> be possible. However, adding the new "immutable strings" type would
> definitely be great and adding a compiler flag that would turn all
> string constants, "^", etc in a given module into immutable string
> operations would be nice as well and would not break any existing code.


But why don't you add such a type yourself?

[-- Attachment #2: Type: text/html, Size: 943 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25 11:18             ` Brian Hurt
@ 2006-05-25 17:34               ` Aleksey Nogin
  2006-05-25 18:44                 ` Tom
  0 siblings, 1 reply; 18+ messages in thread
From: Aleksey Nogin @ 2006-05-25 17:34 UTC (permalink / raw)
  To: Caml List

On 25.05.2006 04:18, Brian Hurt wrote:

> I wish to reiterate that my comments on immutable strings are solely in 
> the context of designing a new language, and should not be construed to 
> indicate a desire to change Ocaml. 

Well, I am advocating a change. IMHO the immutable strings is a very 
useful concept and I would really like to see it in OCaml.

> Changing this behavior in Ocaml 
> would break huge amounts of code (including some of my code).

It is obvious that at this point changing the default behavior would not 
be possible. However, adding the new "immutable strings" type would 
definitely be great and adding a compiler flag that would turn all 
string constants, "^", etc in a given module into immutable string 
operations would be nice as well and would not break any existing code.

-- 
Aleksey Nogin

Home Page: http://nogin.org/
E-Mail: nogin@cs.caltech.edu (office), aleksey@nogin.org (personal)
Office: Moore 04, tel: (626) 395-2200


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25  5:56             ` [Caml-list] " Martin Jambon
  2006-05-25  7:23               ` j h woodyatt
  2006-05-25 11:14               ` Brian Hurt
@ 2006-05-25 17:31               ` Aleksey Nogin
  2006-05-25 19:54                 ` Martin Jambon
  2 siblings, 1 reply; 18+ messages in thread
From: Aleksey Nogin @ 2006-05-25 17:31 UTC (permalink / raw)
  To: Caml List; +Cc: Martin Jambon

On 24.05.2006 22:56, Martin Jambon wrote:

>> I think it's OK to have (mutable) byte arrays, but strings should simply
>> always be immutable.
>  
> OCaml strings are compact byte arrays which serve their purpose well.

Yes, however immutable strings are also very useful and that 
functionality is simply missing in OCaml. The usage I am very interested 
in is essentially using strings as "printable tokens". In other words, a 
data type that is easy to compare and has an obvious I/O representation.

> Having a whole different type for immutable strings is in my opinion a 
> waste of energy. The problem is that freezing or unfreezing a string 
> safely involves a copy of the whole string. And obviously it's not 
> possible to handle only immutable strings since somehow you have to 
> create them, and unlike record fields, they won't be set in one 
> operation but in n operations, n being the length of the string.

This is not true. All I want is having a purely functional interface with:
- Constants (a compiler flag for turning "..." constants into immutable 
strings instead of mutable ones).
- Inputing from a channel
- Concatenation
- Things like string_of_int for immutable string.

Of course, it might be the case that the standard library might have to 
use some sort of "unsafe" operations that would "inappropriately" mutate 
the newly created immutable string buffer, but this is IMHO no different 
than how the unsafe operations are already used in standard library for 
arrays and strings.

> So I'd really love to see actual examples where using immutable strings 
> would be such an improvement over mutable strings.
> If the problem is just to ensure that string data won't be changed by 
> the user of a library, then it is trivial using module signatures and 
> String.copy for the conversions.

Such a copy operation can be extremely prohibitive in a setting that 
assumes that a data structure is immutable and tries really hard to 
preserve sharing (including using functions like a sharing-preserving 
version of map (*), etc). In such a setting, these extra copies can 
potentially have a devastating effect on memory usage, cache 
performance, etc. And this situation is exactly what we have in our 
MetaPRL project - there we have resorted to simply using strings and 
pretending they are immutable, but this is clearly suboptimal.

----
(*)
let rec smap f = function
    [] -> []
  | (hd :: tl) as l ->
       let hd' = f hd in
       let tl' = smap f tl in
          if hd == hd' && tl == tl' then l else hd' :: tl'

-- 
Aleksey Nogin

Home Page: http://nogin.org/
E-Mail: nogin@cs.caltech.edu (office), aleksey@nogin.org (personal)
Office: Moore 04, tel: (626) 395-2200


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25  4:32           ` immutable strings (Re: " Stefan Monnier
  2006-05-25  5:56             ` [Caml-list] " Martin Jambon
@ 2006-05-25 11:18             ` Brian Hurt
  2006-05-25 17:34               ` Aleksey Nogin
  1 sibling, 1 reply; 18+ messages in thread
From: Brian Hurt @ 2006-05-25 11:18 UTC (permalink / raw)
  To: caml-list


I wish to reiterate that my comments on immutable strings are solely in 
the context of designing a new language, and should not be construed to 
indicate a desire to change Ocaml.  Changing this behavior in Ocaml would 
break huge amounts of code (including some of my code).

Brian


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25  5:56             ` [Caml-list] " Martin Jambon
  2006-05-25  7:23               ` j h woodyatt
@ 2006-05-25 11:14               ` Brian Hurt
  2006-05-25 19:42                 ` Oliver Bandel
  2006-05-26  6:51                 ` Alain Frisch
  2006-05-25 17:31               ` Aleksey Nogin
  2 siblings, 2 replies; 18+ messages in thread
From: Brian Hurt @ 2006-05-25 11:14 UTC (permalink / raw)
  To: Martin Jambon; +Cc: Stefan Monnier, caml-list



On Wed, 24 May 2006, Martin Jambon wrote:

> And obviously it's not possible to handle only immutable strings since 
> somehow you have to create them, and unlike record fields, they won't be 
> set in one operation but in n operations, n being the length of the 
> string.

This is the advantage of regular expressions- they can be implemented on 
immutable strings (regular expressions don't modify the old strings, they 
produce new strings), and can change large chunks of the string in one 
"operation", limiting the number of unnecessary string copies you need to 
do.  Yes, the underlying implementation would be mutable- but so is they 
underlying implementation of everything else.  The semantics remain 
applicative.

Note that this also eliminates an inefficiency of ocaml.  When you use a 
constant string, it has to allocate a new string for you.  Consider 
Ocaml's response to the expression:

"foo" == "foo";;

The reason ocaml returns false in the above expression is that when you 
give a constant string in an expression, Ocaml first has to make a copy of 
that string (in case you modify it), and then it returns the copy.  So it 
makes two copies of the string "foo" above, and then compares them for 
referential equality- which, since they're different copies, returns 
false.

Note that even pure-imperitive languages, like C/C++, hit this problem. 
What is the output of the following peice of C code:
     for (i = 0; i < 10; ++i) {
         char * p = "foo";
         p[0] += 1;
         printf("%s\n", p);
     }

C/C++ itself wants *some* strings to be immutable.  Fortran used to let 
you change the value of 2, which made for some interesting (in the bad 
sense) code tricks.  I comment that they've since fixed this.  But mutable 
strings allow you to change the value of "two", which is almost as bad.

Brian


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25  7:23               ` j h woodyatt
@ 2006-05-25 10:22                 ` Jon Harrop
  2006-05-25 19:28                 ` Oliver Bandel
  1 sibling, 0 replies; 18+ messages in thread
From: Jon Harrop @ 2006-05-25 10:22 UTC (permalink / raw)
  To: caml-list

On Thursday 25 May 2006 08:23, j h woodyatt wrote:
> Change the way strings are boxed so that there is an additional tag
> for immutable strings as opposed to the normal mutable ones.  In all
> respects, allow string objects with either tag to be treated the same
> by functions that do not modify the content of the string.  When the
> "safe" variants of the string mutation functions are called on a
> string object with the immutable tag, throw a runtime exception.

Blech. I didn't come all this way for dynamic typing... ;-)

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25  5:56             ` [Caml-list] " Martin Jambon
@ 2006-05-25  7:23               ` j h woodyatt
  2006-05-25 10:22                 ` Jon Harrop
  2006-05-25 19:28                 ` Oliver Bandel
  2006-05-25 11:14               ` Brian Hurt
  2006-05-25 17:31               ` Aleksey Nogin
  2 siblings, 2 replies; 18+ messages in thread
From: j h woodyatt @ 2006-05-25  7:23 UTC (permalink / raw)
  To: The Caml Trade

On May 24, 2006, at 10:56 PM, Martin Jambon wrote:
>
> So I'd really love to see actual examples where using immutable  
> strings would be such an improvement over mutable strings.
> If the problem is just to ensure that string data won't be changed  
> by the user of a library, then it is trivial using module  
> signatures and String.copy for the conversions.

I have no dog in this fight, but I can imagine a pragmatic approach  
that might satisfy some of these concerns without introducing much in  
the way of extra runtime complexity costs.

Change the way strings are boxed so that there is an additional tag  
for immutable strings as opposed to the normal mutable ones.  In all  
respects, allow string objects with either tag to be treated the same  
by functions that do not modify the content of the string.  When the  
"safe" variants of the string mutation functions are called on a  
string object with the immutable tag, throw a runtime exception.   
Finally, provide a function that changes the tag of a string from  
mutable to immutable, i.e. like a special blessed case of  
Obj.set_tag.  Not from immutable to mutable, though-- that should be  
disallowed.  If the -unsafe is not set, then allow a new compiler  
option -constliteral (or something) that would make all string  
literals immutable.  You'd need to use String.copy to get a mutable  
string with contents copied from an immutable one.

Of course, I can't offer a convincing reason to believe this would  
bring an improvement over what we have today.  My proposal would not  
give the programmer any assurance that contents of string values  
passed as parameters to library functions are not modified by the  
library, i.e. the "unsafe" functions would still work.  It *might*  
give a pragmatic benefit of surfacing unintentional programming  
errors earlier than they might otherwise be found, but I'm not  
prepared to argue in defense of that.

If you did immutable strings this way, it would be nice for the sake  
of consistency to have a comparable feature for arrays.


—
j h woodyatt <jhw@conjury.org>




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-25  4:32           ` immutable strings (Re: " Stefan Monnier
@ 2006-05-25  5:56             ` Martin Jambon
  2006-05-25  7:23               ` j h woodyatt
                                 ` (2 more replies)
  2006-05-25 11:18             ` Brian Hurt
  1 sibling, 3 replies; 18+ messages in thread
From: Martin Jambon @ 2006-05-25  5:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: caml-list

On Thu, 25 May 2006, Stefan Monnier wrote:

>> IMHO strings should be possibly made immutable by using the "immutable"
>> keyword, which is the opposite of the "mutable" keyword as it is used for
>> records. So the user/programmer can chose the kind of strings he7she wants.
>
> I think it's OK to have (mutable) byte arrays, but strings should simply
> always be immutable.

OCaml strings are compact byte arrays which serve their purpose well.
Having a whole different type for immutable strings is in my opinion a 
waste of energy. The problem is that freezing or unfreezing a string 
safely involves a copy of the whole string. And obviously it's not 
possible to handle only immutable strings since somehow you have to create 
them, and unlike record fields, they won't be set in one operation but in 
n operations, n being the length of the string.

So I'd really love to see actual examples where using immutable strings 
would be such an improvement over mutable strings.
If the problem is just to ensure that string data won't be changed by the 
user of a library, then it is trivial using module signatures and 
String.copy for the conversions.


Martin

--
Martin Jambon, PhD
http://martin.jambon.free.fr

Edit http://wikiomics.org, bioinformatics wiki


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2006-05-31 12:53 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-05-29 20:52 [Caml-list] Re: immutable strings (Re: Array 4 MB size limit) Harrison, John R
  -- strict thread matches above, loose matches on Subject: below --
2006-05-28 23:20 Harrison, John R
2006-05-29  2:36 ` Martin Jambon
2006-05-31 12:53 ` Jean-Christophe Filliatre
2006-05-15 18:12 Array 4 MB size limit akalin
2006-05-19  5:57 ` [Caml-list] " Frederick Akalin
2006-05-19 16:28   ` Jozef Kosoru
2006-05-19 21:26     ` Jon Harrop
2006-05-20  1:06       ` Brian Hurt
2006-05-20 21:11         ` immutable strings (Re: [Caml-list] Array 4 MB size limit) Oliver Bandel
2006-05-25  4:32           ` immutable strings (Re: " Stefan Monnier
2006-05-25  5:56             ` [Caml-list] " Martin Jambon
2006-05-25  7:23               ` j h woodyatt
2006-05-25 10:22                 ` Jon Harrop
2006-05-25 19:28                 ` Oliver Bandel
2006-05-25 11:14               ` Brian Hurt
2006-05-25 19:42                 ` Oliver Bandel
2006-05-26  6:51                 ` Alain Frisch
2006-05-25 17:31               ` Aleksey Nogin
2006-05-25 19:54                 ` Martin Jambon
2006-05-25 11:18             ` Brian Hurt
2006-05-25 17:34               ` Aleksey Nogin
2006-05-25 18:44                 ` Tom
2006-05-25 23:00                   ` Jon Harrop
2006-05-25 23:15                     ` Martin Jambon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).