From: Goswin von Brederlow
To: Daniel Buenzli
Cc: OCaml List
Subject: Re: [Caml-list] Strings
Date: Fri, 03 Apr 2009 23:44:04 +0200
Message-ID: <87prftifa3.fsf@frosties.localdomain>
In-Reply-To: <8697F924-0485-4E00-81DF-9BCF74D872EA@erratique.ch>
 (Daniel Buenzli's message of "Fri, 3 Apr 2009 19:50:48 +0200")
References: <200904031256.33357.jon@ffconsultancy.com>
 <200904031546.14071.jon@ffconsultancy.com>
 <49D63EE6.2020407@ens-lyon.org>
 <8697F924-0485-4E00-81DF-9BCF74D872EA@erratique.ch>

Daniel Bünzli writes:

> On 3 Apr 09 at 18:52, Martin Jambon wrote:
>
>> I love this recurrent discussion!
>
> I love your carefully argumented response !
>
>> - I see absolutely no practical advantage of having an immutable
>> "character string" type.
>
> In fact I find the result of the following sequence of operations very
> disappointing for a functional programming language :
>
>         Objective Caml version 3.11.0
>
> # Sys.os_type;;
> - : string = "Unix"
> # let s = Sys.os_type;;
> val s : string = "Unix"
> # s.[0] <- 'a';;
> - : unit = ()
> # Sys.os_type;;
> - : string = "anix"
>
> I think it is a design error to conflate strings and byte arrays. You
> clearly want both, but each with its own type and strings as
> immutable. Individual character mutability is rarely needed in text
> processing and having immutable strings avoids the kind of quirks as
> seen above.

I think that is a design flaw in Sys. Strings are mutable, but os_type
is a constant, and Sys should not hand out mutable access to a
constant. With the current String module a better way would be to
return a copy of os_type on each invocation. The drawback is that then

Sys.os_type () != Sys.os_type ()

because != is physical inequality and every call returns a fresh copy.

> You'll think that's a marginal example, but that actually happens in
> practice. For example in xmlm when I return a signal for a start tag I
> do not String.copy the tag name to avoid allocating too much. Thus in
> the documentation there's the following ugly advice :
>
> "The module assumes strings are immutable, thus strings the client
> gives or receives during the input and output process must not be
> modified."
>
> And if you don't follow the advice and mutate the tag's name before
> the end tag was parsed (or output) you'll get a tag mismatch error
> even though the document (or the output) is perfectly valid.
>
> Having immutable strings would not rely on the client for correctness
> of operation and that's always an advantage. Of course you'll tell me
> just use String.copy inside xmlm et voilà, but then you traded
> correctness for performance in a case where you could have both with
> immutable strings.

This is not just a problem for strings. Any data type can suffer the
same.

>> - There is nothing to change in OCaml's string type because it is an
>> "array of bytes", with type char representing single bytes.
>
> Oh no, there's nothing to change at all, that's a perfect
> implementation of byte arrays. You just want another type for
> immutable strings.
>
> Best,
>
> Daniel

It wouldn't be too hard to change the string module to allow for both
mutable and immutable strings:

module S : sig
  type const
  type mutabl
  type 'a t
  val make : string -> mutabl t
  val set : mutabl t -> int -> char -> unit
  val get : 'a t -> int -> char
  val const : 'a t -> const t
  val print : 'a t -> unit
end = struct
  type const
  type mutabl
  type 'a t = string
  let make s = s
  let set = String.set
  let get = String.get
  let const s = s
  let print = print_string
end

let str = S.make "hallo" in S.set str 0 'H'; S.print str

let str = S.const (S.make "hallo") in S.set str 0 'H'; S.print str
                                            ^^^
Error: This expression has type S.const S.t but is here used with type
S.mutabl S.t

By adding a phantom type the type system can keep track of where a
string is mutable and where not. The only restriction is that "const"
does not mean the string will not change.
It only means that this particular reference to the string cannot be
used to change it:

# let str = S.make "hallo" in
  let str2 = S.const str in
  S.set str 0 'H';
  S.print str2;;
Hallo- : unit = ()

If you let a mutable reference to the string escape and then assume it
remains const, that is your problem. This is easily avoidable in a
library or module.

Regards,
Goswin
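
The aliasing caveat above can also be sketched on a current compiler.
This is only an illustration, assuming OCaml 4.03 or later, where
string is immutable and Bytes.t is the mutable byte sequence. The
module name P and the function freeze are hypothetical additions that
do not appear in the module S above. Here const returns a read-only
view that shares storage with the mutable handle, while freeze copies,
so later writes cannot show through:

```ocaml
(* Sketch only: assumes OCaml 4.03 or later. [P] and [freeze] are
   hypothetical names, not part of the original module S or any library. *)
module P : sig
  type const
  type mutabl
  type 'a t
  val make : string -> mutabl t              (* copies its argument *)
  val set : mutabl t -> int -> char -> unit  (* needs a mutable handle *)
  val get : 'a t -> int -> char
  val const : 'a t -> const t                (* read-only view, shared storage *)
  val freeze : 'a t -> const t               (* read-only copy, private storage *)
  val to_string : 'a t -> string
end = struct
  type const
  type mutabl
  type 'a t = Bytes.t        (* the phantom parameter 'a is never stored *)
  let make s = Bytes.of_string s
  let set = Bytes.set
  let get = Bytes.get
  let const b = b
  let freeze b = Bytes.copy b
  let to_string b = Bytes.to_string b
end

let () =
  let str = P.make "hallo" in
  let view = P.const str in        (* aliases str *)
  let snapshot = P.freeze str in   (* independent copy *)
  P.set str 0 'H';
  print_endline (P.to_string view);
  print_endline (P.to_string snapshot)
```

The view follows the write made through str and reads "Hallo", while
the snapshot still reads "hallo". The copy in freeze costs an
allocation, which is the same correctness versus performance trade
discussed for String.copy earlier in the thread.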