Message-ID: <6f9f8f4a0710090942u2afe6c5erc2d5b11ecfff2253@mail.gmail.com>
Date: Tue, 9 Oct 2007 18:42:32 +0200
From: "Loup Vaillant"
To: "Vincent Hanquez"
Subject: Re: [Caml-list] Re: Rope is the new string
Cc: "Jon Harrop", caml-list@yquem.inria.fr
In-Reply-To: <20071009155702.GB26282@snarc.org>

2007/10/9, Vincent Hanquez:
> On Tue, Oct 09, 2007 at 02:40:48PM +0100, Jon Harrop wrote:
> > Out of curiosity, do your ropes handle UTF-8 and UTF-16?
>
> Out of curiosity, why would a string implementation (has a handle of
> chars bundle together) has to handle UTF-X ?

My 2 cents:

It is more convenient to consider strings as character arrays. These
characters are then handled as atoms, even if they take several bytes in
the chosen encoding. Of course, multi-byte characters must be supported
as well.

Still, I can use byte arrays as strings, but that limits me to ASCII and
Latin-like encodings: if I want to do UTF-X, then I have to worry about
multi-byte characters myself. Internationalization made hard...

I would find it very convenient to have plain Unicode strings (and
chars), with appropriate scan, print, byte_array_from_string, and
string_from_byte_array functions, one bundle per supported encoding.
That way, I don't need to think about the internals of such a string.

Loup Vaillant
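
For illustration only, here is a minimal sketch of what "one bundle per supported encoding" could look like. Everything in it is an assumption rather than an existing library: the ustring type (an array of code points), the ENCODING signature, and the Latin1 and Utf8 modules are hypothetical names, and byte arrays are represented as plain OCaml strings. Only the two function names byte_array_from_string and string_from_byte_array come from the message above. The UTF-8 decoder is deliberately simplified and does not reject overlong forms or surrogates.

```ocaml
(* Hypothetical: a Unicode string is an array of code points, so each
   character is one atom regardless of its encoded width in bytes. *)
type ustring = int array

(* Hypothetical bundle: one implementation per supported encoding.
   Byte arrays are represented here as ordinary OCaml strings. *)
module type ENCODING = sig
  val string_from_byte_array : string -> ustring   (* decode *)
  val byte_array_from_string : ustring -> string   (* encode *)
end

module Latin1 : ENCODING = struct
  (* Latin-1 maps each byte directly to the code point of the same value. *)
  let string_from_byte_array s =
    Array.init (String.length s) (fun i -> Char.code s.[i])

  let byte_array_from_string u =
    let b = Buffer.create (Array.length u) in
    Array.iter (fun cp ->
      if cp < 0 || cp > 0xFF then invalid_arg "Latin1: not representable";
      Buffer.add_char b (Char.chr cp)) u;
    Buffer.contents b
end

module Utf8 : ENCODING = struct
  (* Simplified decoder: checks sequence structure only. *)
  let string_from_byte_array s =
    let n = String.length s in
    let out = ref [] and i = ref 0 in
    (* Return the 6 payload bits of the continuation byte at index k. *)
    let cont k =
      if k >= n then invalid_arg "Utf8: truncated sequence";
      let c = Char.code s.[k] in
      if c land 0xC0 <> 0x80 then invalid_arg "Utf8: bad continuation byte";
      c land 0x3F
    in
    while !i < n do
      let c = Char.code s.[!i] in
      let cp, len =
        if c < 0x80 then (c, 1)
        else if c land 0xE0 = 0xC0 then
          (((c land 0x1F) lsl 6) lor (cont (!i + 1)), 2)
        else if c land 0xF0 = 0xE0 then
          (((c land 0x0F) lsl 12) lor ((cont (!i + 1)) lsl 6)
           lor (cont (!i + 2)), 3)
        else if c land 0xF8 = 0xF0 then
          (((c land 0x07) lsl 18) lor ((cont (!i + 1)) lsl 12)
           lor ((cont (!i + 2)) lsl 6) lor (cont (!i + 3)), 4)
        else invalid_arg "Utf8: bad leading byte"
      in
      out := cp :: !out;
      i := !i + len
    done;
    Array.of_list (List.rev !out)

  let byte_array_from_string u =
    let b = Buffer.create (Array.length u) in
    let add x = Buffer.add_char b (Char.chr x) in
    Array.iter (fun cp ->
      if cp < 0 || cp > 0x10FFFF then invalid_arg "Utf8: not a code point"
      else if cp < 0x80 then add cp
      else if cp < 0x800 then begin
        add (0xC0 lor (cp lsr 6));
        add (0x80 lor (cp land 0x3F))
      end else if cp < 0x10000 then begin
        add (0xE0 lor (cp lsr 12));
        add (0x80 lor ((cp lsr 6) land 0x3F));
        add (0x80 lor (cp land 0x3F))
      end else begin
        add (0xF0 lor (cp lsr 18));
        add (0x80 lor ((cp lsr 12) land 0x3F));
        add (0x80 lor ((cp lsr 6) land 0x3F));
        add (0x80 lor (cp land 0x3F))
      end) u;
    Buffer.contents b
end

(* Usage: transcode a Latin-1 byte array to UTF-8 through a ustring. *)
let () =
  let u = Latin1.string_from_byte_array "caf\233" in
  assert (Array.length u = 4);        (* four characters *)
  let utf8 = Utf8.byte_array_from_string u in
  assert (String.length utf8 = 5);    (* five bytes once encoded as UTF-8 *)
  assert (Utf8.string_from_byte_array utf8 = u)
```

With this layout, code that works on a ustring never sees encoded bytes; the encoding matters only at the boundary, where one bundle is picked for reading and possibly another for writing.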