From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 20921
In-reply-to: <200503021608.j22G8ips020857@news01.csr.com>
From: Oliver Kiddle
References: <5964.1109263147@trentino.logica.co.uk> <200503011246.j21CkRGS031240@news01.csr.com> <31072.1109700154@trentino.groupinfra.com>
 <200503021608.j22G8ips020857@news01.csr.com>
To: Zsh workers
Subject: Re: configure tests for iconv
Date: Thu, 03 Mar 2005 12:10:53 +0100
Message-ID: <28818.1109848253@trentino.groupinfra.com>

Peter wrote:
>
> No, that doesn't work either. The error is from returning -1 from
>     cd = iconv_open(nl_langinfo(CODESET), "ISO-10646");

I tried downloading GNU libiconv and, sure enough, it doesn't like
"ISO-10646". I had imagined libiconv was the same code as glibc uses but
perhaps not. At least with this being the problem, I'm now fairly
confident that the configure tests are working.

Trying a few different systems, it seems UCS-4BE is a much more portable
name for identifying the character encoding. Given that the endianness
is explicit, that might be a better choice anyway. So with the following
patch it should now work. If this breaks on any system, we can always
try multiple names for the encoding.

Incidentally, the patch also helps on Solaris 8. The Solaris machines I
have access to didn't previously have any of the UTF-8 iconv packages
installed, so I had assumed it simply couldn't do the necessary
conversions. Also below is a patch against _iconv to pick up these
character encodings on Solaris.

> > Does /usr/bin/printf's \u work?
>
> This fails too, but with the slightly odd error "invalid universal
> character name".

It's not a problem with the input format, however. I think it's telling
you that it refuses to convert characters in that particular range, not
that it especially *can't* convert them. It won't handle the basic ASCII
characters on any system. I think it also prints that message for
certain reserved or unallocated ranges. I really can't see the point of
that but it's a GNU coreutils issue.

> This gives US-ASCII, which might be part of the problem, though I really
> haven't the faintest idea. A quick scan of the regional and language
> settings didn't suggest anything.

Well, with the patch below, it should hopefully now cope with stuff like
\\u0061, which is as much as we can hope for in a US-ASCII locale. The
rest is obviously a Cygwin issue. Perhaps we should add an UNKNOWN_CHAR
variable or a similar mechanism to allow something else to be
substituted instead of an error message.

Oliver

Index: Completion/Unix/Command/_iconv
===================================================================
RCS file: /cvsroot/zsh/zsh/Completion/Unix/Command/_iconv,v
retrieving revision 1.4
diff -u -r1.4 _iconv
--- Completion/Unix/Command/_iconv 17 Jun 2004 13:12:26 -0000 1.4
+++ Completion/Unix/Command/_iconv 3 Mar 2005 10:29:49 -0000
@@ -1,7 +1,8 @@
 #compdef iconv

-local expl curcontext="$curcontext" state line codeset ret=1
+local expl curcontext="$curcontext" state line ret=1
 local LOCPATH="${LOCPATH:-/usr/lib/nls/loc}"
+local -U codeset

 if _pick_variant gnu=GNU unix --version; then

@@ -40,6 +41,7 @@
 if [[ $state = codeset ]]; then
   if [[ -f /usr/lib/iconv/iconv_data ]]; then # IRIX & Solaris
     codeset=( ${${(f)"$(=0;i--) {
 	    inbuf[i] = wval & 0xff;
 	    wval >>= 8;
 	}
-	cd = iconv_open(nl_langinfo(CODESET), "ISO-10646");
+	cd = iconv_open(nl_langinfo(CODESET), "UCS-4BE");
 	if (cd == (iconv_t)-1) {
 	    zerr("cannot do charset conversion", NULL, 0);
 	    if (fromwhere == 4) {
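
For reference, the "try multiple names for the encoding" idea mentioned above could look roughly like the sketch below. It is not part of the patch. The helper names open_ucs4_converter() and convert_codepoint() are hypothetical, the candidate list uses only the two names that came up in this thread, and the fallback order is an assumption. Note that a fallback name without explicit endianness may not interpret the big-endian input buffer the same way on every system.

```c
#include <iconv.h>
#include <stddef.h>

/* Candidate names for the source encoding, most portable first.
 * Both names are taken from the thread; the order is an assumption. */
static const char *const ucs4_names[] = { "UCS-4BE", "ISO-10646" };

/* Hypothetical helper: open a converter from UCS-4 to `tocode`, trying
 * each candidate name in turn. Returns (iconv_t)-1 if none is accepted. */
static iconv_t open_ucs4_converter(const char *tocode)
{
    size_t n = sizeof ucs4_names / sizeof ucs4_names[0];
    size_t k;

    for (k = 0; k < n; k++) {
        iconv_t cd = iconv_open(tocode, ucs4_names[k]);
        if (cd != (iconv_t)-1)
            return cd;
    }
    return (iconv_t)-1;
}

/* Hypothetical helper: convert one code point to `tocode`.
 * Writes at most outsize-1 bytes plus a terminating NUL into `out`.
 * Returns the number of bytes written, or -1 on any failure. */
static int convert_codepoint(unsigned long wval, const char *tocode,
                             char *out, size_t outsize)
{
    char inbuf[4];
    char *inptr = inbuf, *outptr = out;
    size_t inleft = sizeof inbuf, outleft, rc;
    iconv_t cd;
    int i;

    if (outsize == 0)
        return -1;
    outleft = outsize - 1;          /* reserve room for the NUL */

    /* Store the code point as four bytes, most significant first. */
    for (i = 3; i >= 0; i--) {
        inbuf[i] = (char)(wval & 0xff);
        wval >>= 8;
    }

    cd = open_ucs4_converter(tocode);
    if (cd == (iconv_t)-1)
        return -1;                  /* no candidate name was accepted */

    rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                  /* e.g. not representable in tocode */

    *outptr = '\0';
    return (int)(outptr - out);
}
```

With an iconv that accepts these names, convert_codepoint(0x61, "US-ASCII", buf, sizeof buf) should leave "a" in buf, which matches the US-ASCII case discussed above. Returning -1 rather than printing an error leaves room for the caller to substitute something else, along the lines of the UNKNOWN_CHAR idea.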