Date: Sat, 19 Aug 2006 23:02:56 +1200
From: "Jonathan Roewen"
To: "Richard Jones"
Subject: Re: [Caml-list] Supporting unicode in ocaml...
Cc: OCaml
In-Reply-To: <20060819090500.GB2213@furbychan.cocan.org>

> Have a look at Camomile:
>
> http://camomile.sourceforge.net/
>
> Generally speaking, though, I just always use string == UTF-8 string
> and avoid using some of the unsafe functions from the standard
> library, such as String.lowercase.

Yes, I did (as I mentioned, if you read more closely). But being able to use actual UTF-8, rather than manually encoding characters in the string, is what would be nice. Maybe I just need a script to preprocess sources in UTF-8...
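
For illustration, here is roughly what "manually encoding characters in the string" looks like. This is only a minimal sketch, not code from the thread: the names cafe and utf8_length are hypothetical, and the counting function assumes its input is well-formed UTF-8.

```ocaml
(* "é" (U+00E9) written by hand as its two UTF-8 bytes, 0xC3 0xA9,
   using OCaml's decimal escapes. The name cafe is hypothetical. *)
let cafe = "caf\195\169"

(* Count code points by skipping UTF-8 continuation bytes (10xxxxxx).
   Hypothetical helper; assumes the input is well-formed UTF-8. *)
let utf8_length s =
  let n = ref 0 in
  String.iter (fun c -> if Char.code c land 0xC0 <> 0x80 then incr n) s;
  !n

let () =
  (* String.length counts bytes, not characters. *)
  Printf.printf "bytes: %d, code points: %d\n"
    (String.length cafe) (utf8_length cafe)
```

This prints "bytes: 5, code points: 4". Byte-oriented standard library functions see the five bytes rather than the four characters, which is one reason to be careful with them on UTF-8 strings.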