caml-list - the Caml user's mailing list
* RE: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
@ 2006-05-28 23:20 Harrison, John R
  2006-05-29  2:36 ` Martin Jambon
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Harrison, John R @ 2006-05-28 23:20 UTC (permalink / raw)
  To: Martin Jambon, Caml List; +Cc: Harrison, John R

Hi Martin,

| I disagree: has it ever happened to you to mutate a string by accident?

The point is not that I will mutate a string by accident. I've never done
it by accident or by design. The point is that I can't depend on code
that I call, or code that calls mine, not to subsequently modify strings
that are passed as arguments. So if I really need to reliably fix them I
am forced into expensive copy operations.

In practice, the obvious library calls are safe, so like Aleksey, I use
the built-in strings for the sake of convenience and compatibility. But
it's unsatisfactory intellectually. Some of us want to program in a
primarily functional style, yet the implementation of one of the most
basic and useful datatypes is not functional.

| Yes, so how do you avoid copies without using the "unsafe" conversions
| all over the place?

With immutable strings, you'd never need to do conversions at the module
interfaces. As with any other functional data structure, you only copy
when you want to change part of it.
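
A minimal sketch of this point, assuming a modern OCaml (4.06 or later)
in which string is immutable and Bytes.t is the mutable counterpart. The
module and function names are hypothetical and only illustrate where
copies are needed in each case.

```ocaml
(* With a mutable buffer, a careful module must copy at its boundary. *)
module Mutable_name = struct
  type t = { name : Bytes.t }
  (* Defensive copy on the way in: the caller may mutate b later. *)
  let make (b : Bytes.t) = { name = Bytes.copy b }
  (* Defensive copy on the way out: the caller may mutate the result. *)
  let name t = Bytes.copy t.name
end

(* With an immutable string, sharing is safe and no copy is needed. *)
module Immutable_name = struct
  type t = { name : string }
  let make (s : string) = { name = s }
  let name t = t.name
end

(* A copy happens only when part of the value changes. *)
let capitalized (s : string) : string = String.capitalize_ascii s

let () =
  let s = "coq" in
  let n = Immutable_name.make s in
  let s' = capitalized s in
  (* The stored name is physically the same string: nothing was copied. *)
  assert (Immutable_name.name n == s);
  (* The original is untouched; only the changed version is new. *)
  assert (s = "coq" && s' = "Coq")
```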

John.



* RE: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-28 23:20 [Caml-list] Re: immutable strings (Re: Array 4 MB size limit) Harrison, John R
@ 2006-05-29  2:36 ` Martin Jambon
  2006-05-31 12:53 ` Jean-Christophe Filliatre
  2006-06-05 20:54 ` immutable strings Matti Jokinen
  2 siblings, 0 replies; 5+ messages in thread
From: Martin Jambon @ 2006-05-29  2:36 UTC (permalink / raw)
  To: Harrison, John R; +Cc: Caml List

Hi John,

On Sun, 28 May 2006, Harrison, John R wrote:

> With immutable strings, you'd never need to do conversions at the module
> interfaces. As with any other functional data structure, you only copy
> when you want to change part of it.

OK, but let's be pragmatic: what kind of interface and implementation do 
you have in mind?

(and then: isn't it possible to implement in OCaml?)


If anyone is interested:

Before posting I tried a polymorphic (wrt mutability) string type.
It was fun enough, but it doesn't scale very well. I put it there:

   http://martin.jambon.free.fr/ocaml.html#gstring
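
One way such a mutability-polymorphic string type could look is a phantom
type parameter over a single byte representation. This is only a sketch
under that assumption and is not the gstring code linked above; the
module name Gstr and all of its functions are hypothetical.

```ocaml
module Gstr : sig
  type ro                          (* phantom tag: read-only *)
  type rw                          (* phantom tag: read-write *)
  type 'perm t                     (* 'perm is ro or rw *)
  val create : int -> char -> rw t
  val of_string : string -> ro t
  val get : 'perm t -> int -> char
  val set : rw t -> int -> char -> unit
  val freeze : rw t -> ro t        (* copies, so later writes stay invisible *)
  val to_string : 'perm t -> string
end = struct
  type ro
  type rw
  type 'perm t = Bytes.t
  let create = Bytes.make
  let of_string = Bytes.of_string
  let get = Bytes.get
  let set = Bytes.set
  let freeze = Bytes.copy
  let to_string = Bytes.to_string
end

let () =
  let b = Gstr.create 1 'X' in
  let frozen = Gstr.freeze b in
  Gstr.set b 0 'Y';
  (* The frozen value does not see the later write. *)
  assert (Gstr.to_string frozen = "X");
  assert (Gstr.to_string b = "Y")
  (* Gstr.set frozen 0 'Z' would be rejected by the type checker. *)
```

Functions that only read can accept any 'perm t, while writers demand
rw t, so the read-only guarantee is checked at compile time. The price is
that every string-consuming function has to be restated over the new
type.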



Martin

--
Martin Jambon, PhD
http://martin.jambon.free.fr

Edit http://wikiomics.org, bioinformatics wiki



* RE: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit)
  2006-05-28 23:20 [Caml-list] Re: immutable strings (Re: Array 4 MB size limit) Harrison, John R
  2006-05-29  2:36 ` Martin Jambon
@ 2006-05-31 12:53 ` Jean-Christophe Filliatre
  2006-06-05 20:54 ` immutable strings Matti Jokinen
  2 siblings, 0 replies; 5+ messages in thread
From: Jean-Christophe Filliatre @ 2006-05-31 12:53 UTC (permalink / raw)
  To: Harrison, John R; +Cc: Martin Jambon, Caml List


Harrison, John R writes:
 > The point is not that I will mutate a string by accident.

I once discovered a bug in the Coq proof assistant that was precisely
due to a string (an identifier) mutated by accident (maybe I shouldn't
say it :-) A name was capitalized in place somewhere in a piece of code
unrelated to the Coq kernel, but of course it had consequences all over
the system (including the kernel).

So I'm definitely in favor of immutable strings, for the exact reasons
mentioned by John.

But I think an abstract data type is not really an issue, since one
does little pattern-matching on strings in practice. And having your
own abstract data type for immutable strings has other advantages,
such as the ability to share equal strings (using hash-consing) to
speed up name comparisons. Even printing is not painful, provided a
suitable formatter-based printing function and %a.
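
A rough sketch of such an abstract type, assuming OCaml 4.06 or later
(immutable string, Hashtbl.find_opt). The module name Name and its
functions are hypothetical. The table here holds strong references, so
names are never reclaimed; a real implementation might use weak tables
instead.

```ocaml
module Name : sig
  type t
  val of_string : string -> t
  val to_string : t -> string
  val equal : t -> t -> bool
  val pp : Format.formatter -> t -> unit
end = struct
  type t = string

  (* Hash-consing table: maps each string to its unique shared copy. *)
  let table : (string, string) Hashtbl.t = Hashtbl.create 251

  let of_string s =
    match Hashtbl.find_opt table s with
    | Some shared -> shared
    | None -> Hashtbl.add table s s; s

  let to_string s = s

  (* Equal names are physically shared, so a pointer comparison suffices. *)
  let equal a b = a == b

  (* Formatter-based printer, usable with %a. *)
  let pp = Format.pp_print_string
end

let () =
  let a = Name.of_string "foo" in
  let b = Name.of_string ("f" ^ "oo") in
  assert (Name.equal a b);
  Format.printf "%a@." Name.pp a
```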

-- 
Jean-Christophe Filliâtre (http://www.lri.fr/~filliatr)



* Re: immutable strings
  2006-05-28 23:20 [Caml-list] Re: immutable strings (Re: Array 4 MB size limit) Harrison, John R
  2006-05-29  2:36 ` Martin Jambon
  2006-05-31 12:53 ` Jean-Christophe Filliatre
@ 2006-06-05 20:54 ` Matti Jokinen
  2006-06-07  0:36   ` [Caml-list] " Jacques Garrigue
  2 siblings, 1 reply; 5+ messages in thread
From: Matti Jokinen @ 2006-06-05 20:54 UTC (permalink / raw)
  To: Caml List

> In practice, the obvious library calls are safe, so like Aleksey, I use
> the built-in strings for the sake of convenience and compatibility. But
> it's unsatisfactory intellectually.

Actually, there are cases of unsafe sharing even in the standard library.


# let x = "X" in
  let g = Genlex.make_lexer [x] in
  let s = Stream.of_string "X" in
  let t = g s in
  let _ = Stream.peek t in
  x.[0] <- 'Y';
  Stream.peek t;;

result:

- : Genlex.token option = Some (Genlex.Kwd "Y")


Format:

# let x = "X" in
  let f = Format.make_formatter (output stdout) (fun () -> flush stdout) in
  Format.pp_print_string f x;
  x.[0] <- 'Y';
  Format.pp_print_newline f (); Format.pp_print_flush f ();;

output:

        Y


I think this demonstrates that the problem is real: it is too easy to
forget copying.
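
The same kind of aliasing can be reproduced in a current OCaml, where
in-place mutation requires Bytes.t rather than string. The sketch below
does not use Genlex or Format; the keyword table and its functions are
hypothetical and simply stand in for a library that stores its argument
without copying.

```ocaml
(* Hypothetical keyword table that keeps its argument without copying. *)
let keywords : Bytes.t list ref = ref []

let add_keyword (k : Bytes.t) = keywords := k :: !keywords

let is_keyword (s : string) =
  List.exists (fun k -> Bytes.to_string k = s) !keywords

let () =
  let x = Bytes.of_string "X" in
  add_keyword x;
  assert (is_keyword "X");
  (* The caller mutates the buffer after handing it over. *)
  Bytes.set x 0 'Y';
  (* The table has silently changed: "X" is gone and "Y" appeared. *)
  assert (not (is_keyword "X"));
  assert (is_keyword "Y")
```

Storing Bytes.copy k (or an immutable string) in add_keyword would avoid
the problem.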

- Matti Jokinen



* Re: [Caml-list] Re: immutable strings
  2006-06-05 20:54 ` immutable strings Matti Jokinen
@ 2006-06-07  0:36   ` Jacques Garrigue
  0 siblings, 0 replies; 5+ messages in thread
From: Jacques Garrigue @ 2006-06-07  0:36 UTC (permalink / raw)
  To: moj; +Cc: caml-list

From: moj@utu.fi (Matti Jokinen)

> > In practice, the obvious library calls are safe, so like Aleksey, I use
> > the built-in strings for the sake of convenience and compatibility. But
> > it's unsatisfactory intellectually.
> 
> Actually, there are cases of unsafe sharing even in the standard library.
> 
> 
> # let x = "X" in
>   let g = Genlex.make_lexer [x] in
>   let s = Stream.of_string "X" in
>   let t = g s in
>   let _ = Stream.peek t in
>   x.[0] <- 'Y';
>   Stream.peek t;;
> 
> result:
> 
> - : Genlex.token option = Some (Genlex.Kwd "Y")
[...]
> I think this demonstrates that the problem is real: it is too easy to
> forget copying.

I don't think this is what the original poster meant by "unsafe".
Standard library functions do not mutate strings when this is not
explicitly stated.
If you apply this principle to user behaviour, it means that you
shouldn't mutate a string passed to or from a library function except
when it is explicitly ok.

In practice this usually works well, because the string type is
actually used as two independent types:
* mutable strings for some I/O and buffers
* immutable strings for all other uses
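
In later OCaml versions (4.06 and up) these two roles map onto two
distinct types, Bytes.t and string. A small sketch of the convention,
with a hypothetical read_all helper: the mutable buffer stays private to
the I/O loop, and only an immutable string crosses the interface.

```ocaml
(* Read a whole channel. The mutable chunk never escapes this function. *)
let read_all (ic : in_channel) : string =
  let chunk = Bytes.create 4096 in        (* mutable I/O buffer *)
  let acc = Buffer.create 4096 in
  let rec loop () =
    let n = input ic chunk 0 (Bytes.length chunk) in
    if n > 0 then begin
      Buffer.add_subbytes acc chunk 0 n;
      loop ()
    end
  in
  loop ();
  Buffer.contents acc                     (* immutable result *)
```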

But this still puts a burden on users.

Jacques Garrigue



