From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: ** X-Spam-Status: No, score=2.1 required=5.0 tests=AWL,DNS_FROM_RFC_POST, SPF_NEUTRAL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 60D49BBC4 for ; Sat, 4 Apr 2009 12:55:36 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlMCAKjZ1klKfSwdk2dsb2JhbACVZz8BAQEBCQkKCREDpS6BCY9dAQMBA4QMBg X-IronPort-AV: E=Sophos;i="4.39,323,1235948400"; d="scan'208";a="25623845" Received: from yx-out-2324.google.com ([74.125.44.29]) by mail3-smtp-sop.national.inria.fr with ESMTP; 04 Apr 2009 12:55:35 +0200 Received: by yx-out-2324.google.com with SMTP id 8so979056yxm.3 for ; Sat, 04 Apr 2009 03:55:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=9eNnHLDMFcKeVRCo9XRqd+6HGFhXO3RBuuvGU8rpWuE=; b=GXSejKO/li6XWMtzhtEOzJpGuFn8l4JBIPeuWFbY6vh6QGD7SQl+4hBu2bPv461zF3 NJRpzJpl/1muYRODQrNr91YAb9slxkkF4aO+pPc7ch50d1w7MIkx5+Ob0odXZkACW602 k1puo50sb74NHXmow2LBvXI5Ty5VTzUGVLigs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=wZEf9tuhRAKvriFV824gnif7XN3LtTcbbn0uR07vsGL8FaheXlcSqgVlTgl2FSBmgZ PlhwBSSyYQ9hW01vxxAIv7zzpvDSqBcFhnkeDFIkEZpKQ0V1o92Y4mkmL5EGH5vgWn0g V9GR/AqZ+6zZwK4XqxHnnY1vYwydWNgkYOfG0= MIME-Version: 1.0 Received: by 10.150.91.20 with SMTP id o20mr4133439ybb.70.1238842534174; Sat, 04 Apr 2009 03:55:34 -0700 (PDT) In-Reply-To: References: <200904031256.33357.jon@ffconsultancy.com> <1238836474.6250.11.camel@Blefuscu> Date: Sat, 4 Apr 2009 12:55:34 +0200 Message-ID: <527cf6bc0904040355x32830525h4d191032f34b271f@mail.gmail.com> Subject: Re: [Caml-list] Strings From: blue storm To: caml-list@yquem.inria.fr Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Spam: no; 0.00; haskell's:01 ocaml's:01 lacks:01 haskell:01 enum:01 amortized:01 mutable:01 byte:01 arrays:01 explicitely:01 mutable:01 mutation:01 storm:98 2009:98 char:01 On Sat, Apr 4, 2009 at 11:26 AM, Alp Mestan wrote: > However, let's study Haskell's strings. > They simply are a list of characters. This let the ability to use heavily > list-related functions (take, takeWhile, drop, dropWhile, map, etc.). On the > other hand, OCaml's standard library lacks of many functions for strings ! I > think this is too much imperative oriented. Maybe we could try to implement > (for Batteries or in a separate project) string lists and then use the power > of Batteries' list module with many (really many... that's a real pleasure > to read its documentation) functions to work with. I think with a bit of > internal laziness, we could get a great immutable string type with many > functions to manipulate it. > > But I guess there are many cons to do so, otherwise it would have been > "standardized". This is actually a bad idea : strings and lists are used for different purpose. Some operations are common (takeWhile is probably a good example) but the concepts are differents (does "tail" as a primitive on a string really match your usage pattern ?) and using lists entails bad performance caracteristics (hence the need for byteString, etc., the lack of standardization among the different Haskell libraries using strings, and the general confusion resulting). In the (rare imho) cases where you exactly want a list of characters, you can use (string -> char list) (or string -> char Enum.t) conversion functions. In the immutable string land, we have much better representations (that is, implementations that match the common use case of strings better), such as ropes. Ropes provide (amortized ?) constant-time concatenation, wich I think is the killer feature. I very rarely use mutable strings, whereas in my experience string concatenation is a very common operation (more than random access). I expect performance benefits for most real programs. In some cases, byte arrays are still useful (as John Harrop noted), but I think they are anecdotical and I'm ready to explicitely convert my strings to a mutable format before performing any mutation.