From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA13109; Mon, 3 Mar 2003 22:43:45 +0100 (MET) Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA12756 for caml-list@pauillac.inria.fr; Mon, 3 Mar 2003 22:43:43 +0100 (MET) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id UAA07468 for ; Sun, 2 Mar 2003 20:03:21 +0100 (MET) Received: from nef.ens.fr (nef.ens.fr [129.199.96.32]) by nez-perce.inria.fr (8.11.1/8.11.1) with ESMTP id h22J3KH10090; Sun, 2 Mar 2003 20:03:20 +0100 (MET) Received: from clipper.ens.fr (clipper-gw.ens.fr [129.199.1.22]) by nef.ens.fr (8.10.1/1.01.28121999) with ESMTP id h22J3IM30380 ; Sun, 2 Mar 2003 20:03:18 +0100 (CET) Received: from localhost (frisch@localhost) by clipper.ens.fr (8.12.3/jb-1.1) id h22J3ClX010800 ; Sun, 2 Mar 2003 20:03:12 +0100 (MET) X-Authentication-Warning: clipper.ens.fr: frisch owned process doing -bs Date: Sun, 2 Mar 2003 20:03:12 +0100 (MET) From: Alain.Frisch@ens.fr X-X-Sender: frisch@clipper.ens.fr Reply-To: Alain.Frisch@ens.fr To: Xavier Leroy cc: brogoff@speakeasy.net, Oliver Bandel , "caml-list@inria.fr" Subject: Re: [Caml-list] Strings as arrays or lists... In-Reply-To: <20030302193437.A6487@pauillac.inria.fr> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-milter (http://amavis.org/) X-Spam: no; 0.00; alain:01 frisch:01 caml-list:01 haskell:01 cduce:01 char:01 homogeneous:01 runtime:01 buffer:01 deconstruct:01 lazily:01 appending:01 advocating:01 arrays:01 unicode:01 X-Spam: no; 0.00; alain:01 frisch:01 caml-list:01 haskell:01 cduce:01 char:01 homogeneous:01 runtime:01 buffer:01 deconstruct:01 lazily:01 appending:01 advocating:01 arrays:01 unicode:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Sun, 2 Mar 2003, Xavier Leroy wrote: > > > in Haskell, strings are lists of chars. > > > > And what a horrible design decision that is! > > Agreed. Well, it's a great way to multiply the memory requirements > for your strings by a factor of 12 (on 32-bit platforms) or 24 (on > 64-bit platforms), while at the same time losing constant-time > indexing :-) Not necessarily. Considering strings as lists of chars is a design decision about the semantics of the language, not its implementation. For instance in CDuce, we choosed to define the type String as [Char*], that is, homogeneous sequences of (Unicode) characters. The idea is that we want to have characters and elements at the same level inside XML elements (the other solution is to have elements and strings; but when you concatenate two sequences of elements-or-strings, you want to perform implicit concatenation of strings at the middle to avoid two consecutive strings). For the implementation, we introduce a compact representation of strings at runtime, namely a pointer to a segment of a buffer (OCaml string) + a "continuation" (which can be a representation of the same kind, or a real list cell). When a pattern tries to deconstruct such a representation to see it as list cell (head,tail), the runtime system extracts the first character from the buffer and compute a pointer to the next character (depending on the internal Unicode encoding of the buffer). When a pattern extracts a consecutive subsequence (substrings) - patterns in CDuce are roughly regular expressions - it actually returns a pointer to a subsegment of the buffer. Concatenation of sequences is computed lazily, to avoid quadractic behavior when appending elements one by one at the end. Basically, in many situations, we avoid space overhead and keep a reasonable time overhead. I'm not advocating this kind of techniques for a fully general language such as OCaml; I just wanted to defend the "horrible" design for specific applications... -- Alain ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners