From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA00379; Tue, 4 Mar 2003 13:44:26 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA32640 for ; Tue, 4 Mar 2003 13:44:24 +0100 (MET) Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129]) by nez-perce.inria.fr (8.11.1/8.11.1) with ESMTP id h24CiNH16802 for ; Tue, 4 Mar 2003 13:44:23 +0100 (MET) Received: from ibm3.cicrp.jussieu.fr (ibm3.cicrp.jussieu.fr [134.157.15.3]) by shiva.jussieu.fr (8.12.5/jtpda-5.4) with ESMTP id h24CiMYY088899 ; Tue, 4 Mar 2003 13:44:22 +0100 (CET) Received: from ibm1.cicrp.jussieu.fr (ibm1.cicrp.jussieu.fr [134.157.15.1]) by ibm3.cicrp.jussieu.fr (8.8.8/jtpda/mob-V8) with ESMTP id NAA26454 ; Tue, 4 Mar 2003 13:43:21 +0100 Received: from localhost (fernande@localhost) by ibm1.cicrp.jussieu.fr (8.8.8/jtpda/mob-v8) with ESMTP id NAA4595800 ; Tue, 4 Mar 2003 13:43:20 +0100 Date: Tue, 4 Mar 2003 13:43:20 +0100 (NFT) From: Diego Olivier Fernandez Pons To: Nicolas George cc: caml-list@inria.fr Subject: Re: [Caml-list] [RANT] String representation (was: Strings as arrays or lists...) In-Reply-To: <20030303221022.GA24499@clipper.ens.fr> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Antivirus: scanned by sophie at shiva.jussieu.fr X-Spam: no; 0.00; pons:01 cicrp:01 caml-list:01 rant:01 bugged:01 catenable:01 buffer:01 operands:01 alain:01 frisch:01 cduce:01 bookkeeping:01 cursors:01 logarithmic:01 low-level:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Bonjour, Some of the features you wish are 'not so hard to implement', at least if you already have 'conceptually bugged low-level' strings : > - Strings need fast concatenation and parts extraction, arrays do > not need that. Any 'fast mergeable' data structure will do it : - trees of strings - fast catenable lists > - Arrays need fast random access by numeric index, strings do not > need that. You can easily hava a (log n/k + 1) acces where n is the total size of the string and k is the size of each bucket (if you choose a data structure with constant buffer size) > - string is an abstract type, with fast concatenation (especially > when one of the operands is not used anymore; this is like > tail-recursion optimisation, but for strings); You just have to free the unused buckets of you string. If you really want tail recursive functions try a 'list' based data structure. That seems to be what Alain Frisch did in CDuce. > - there is also a `cursor' type, which is something like a pair > (string, index in that string); Easy... You can even do better : using a zipper you have constant acces to the pointed element (instead of log n), with some bookkeeping every time you move the index. > - there are functions to move cursors forward and backward, take > substrings between two cursors, etc.; Zippers only allow one (functional) pointer by data structure. For this you will have to stick to an index with logarithmic time acces. You can of course mix both (zippers + indexes) > - there is a buffer type which is a byte array for low-level > operations, and conversion functions between buffers and strings, > with several possible encodings; Yes... strings. Diego Olivier ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners