From: Oliver Kiddle <okiddle@yahoo.co.uk>
To: Zsh workers <zsh-workers@sunsite.dk>
Subject: Re: configure tests for iconv
Date: Thu, 03 Mar 2005 12:10:53 +0100
Message-ID: <28818.1109848253@trentino.groupinfra.com>
In-Reply-To: <200503021608.j22G8ips020857@news01.csr.com>

Peter wrote:
> 
> No, that doesn't work either.  The error is from returning -1 from
>     	    	    cd = iconv_open(nl_langinfo(CODESET), "ISO-10646");

I tried downloading GNU libiconv and, sure enough, it doesn't like
"ISO-10646". I had imagined libiconv was the same code that glibc uses,
but perhaps not. At least with this being the problem, I'm now fairly
confident that the configure tests are working.

Trying a few different systems, it seems UCS-4BE is a much more portable
name for identifying the character encoding. Since it makes the
endianness explicit, it might be a better choice anyway. So with the
following patch it should now work. If this breaks on any system, we
can always try multiple names for the encoding.
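
For illustration only, here is a minimal sketch of what trying multiple
names could look like. It is not part of the patch below. The helper
names open_ucs4_converter and convert_codepoint and the candidate list
are my own assumptions, and it assumes an iconv() whose input argument
is a plain char ** (some systems want const char **, which is what
ICONV_CONST is for in utils.c).

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: try several names for 4-byte ISO 10646 input.
 * Only "UCS-4BE" states the byte order explicitly; the other two
 * names leave it to the implementation and are fallbacks only. */
static iconv_t open_ucs4_converter(const char *tocode)
{
    static const char *const names[] = { "UCS-4BE", "UCS-4", "ISO-10646" };
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        iconv_t cd = iconv_open(tocode, names[i]);
        if (cd != (iconv_t)-1)
            return cd;
    }
    return (iconv_t)-1;
}

/* Hypothetical helper: convert one code point to the encoding tocode.
 * Returns the number of bytes written to out, or (size_t)-1 on error. */
size_t convert_codepoint(unsigned long wval, const char *tocode,
                         char *out, size_t outsize)
{
    char inbuf[4];
    char *inptr = inbuf, *outptr = out;
    size_t inbytes = sizeof inbuf, outbytes = outsize, rc;
    iconv_t cd;
    int i;

    assert(out != NULL);

    /* store value in big endian form, as utils.c does */
    for (i = 3; i >= 0; i--) {
        inbuf[i] = (char)(wval & 0xff);
        wval >>= 8;
    }

    cd = open_ucs4_converter(tocode);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    rc = iconv(cd, &inptr, &inbytes, &outptr, &outbytes);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;
    return outsize - outbytes;
}
```

In zsh itself the target would be nl_langinfo(CODESET) rather than a
fixed name; the sketch takes it as an argument so it can be tried
against any encoding the local iconv knows about.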

Incidentally, the patch also helps on Solaris 8. The Solaris machines I
have access to didn't previously have any of the UTF-8 iconv packages
installed so I had assumed it simply couldn't do the necessary
conversions. Below is also a patch against _iconv to pick up these
character encodings on Solaris.

> > Does /usr/bin/printf's \u work?
> 
> This fails too, but with the slightly odd error "invalid universal
> character name".  It's not a problem with the input format, however,

I think it's telling you that it refuses to convert characters in that
particular range, not that it *can't* convert them. It won't handle
the basic ASCII characters on any system. I think it also prints that
message for certain reserved or unallocated ranges. I really can't see
the point of that, but it's a GNU coreutils issue.

> This gives US-ASCII, which might be part of the problem, though I really
> haven't the faintest idea.  A quick scan of the regional and language
> settings didn't suggest anything.

Well, with the patch below, it should hopefully now cope with stuff like
\\u0061, which is as much as we can hope for in a US-ASCII locale. The
rest is obviously a Cygwin issue. Perhaps we should add an UNKNOWN_CHAR
variable or a similar mechanism to allow something else to be
substituted instead of an error message.
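
To make the substitution idea concrete, here is a rough sketch. It is
purely hypothetical: nothing like it is in the patch, the function name
convert_or_substitute and its "unknown" parameter are invented for
illustration, and it again assumes an iconv() that takes a plain
char ** input argument. A nonzero return from iconv() is treated as
"not representable", since implementations differ on whether they fail
with an error or substitute a character of their own choosing.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: convert one code point to tocode, writing a
 * NUL-terminated result to out. If the conversion cannot be done
 * cleanly, copy the caller's substitute string instead of reporting
 * an error. Returns the length of the result, excluding the NUL. */
size_t convert_or_substitute(unsigned long wval, const char *tocode,
                             const char *unknown, char *out, size_t outsize)
{
    char inbuf[4];
    char *inptr = inbuf, *outptr = out;
    size_t inbytes = sizeof inbuf, outbytes, len;
    size_t rc = (size_t)-1;
    iconv_t cd;
    int i;

    assert(out != NULL && unknown != NULL && outsize > 0);
    outbytes = outsize - 1;     /* keep room for the terminating NUL */

    /* store value in big endian form */
    for (i = 3; i >= 0; i--) {
        inbuf[i] = (char)(wval & 0xff);
        wval >>= 8;
    }

    cd = iconv_open(tocode, "UCS-4BE");
    if (cd != (iconv_t)-1) {
        rc = iconv(cd, &inptr, &inbytes, &outptr, &outbytes);
        iconv_close(cd);
    }

    if (rc != 0) {
        /* open failed, conversion failed, or conversion was lossy */
        len = strlen(unknown);
        if (len > outsize - 1)
            len = outsize - 1;
        memcpy(out, unknown, len);
    } else {
        len = (size_t)(outptr - out);
    }
    out[len] = '\0';
    return len;
}
```

With "US-ASCII" as the target and "?" as the substitute, 0x61 should
come out as "a", while something like 0xe9 should come out as "?"
rather than producing an error.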

Oliver

Index: Completion/Unix/Command/_iconv
===================================================================
RCS file: /cvsroot/zsh/zsh/Completion/Unix/Command/_iconv,v
retrieving revision 1.4
diff -u -r1.4 _iconv
--- Completion/Unix/Command/_iconv	17 Jun 2004 13:12:26 -0000	1.4
+++ Completion/Unix/Command/_iconv	3 Mar 2005 10:29:49 -0000
@@ -1,7 +1,8 @@
 #compdef iconv
 
-local expl curcontext="$curcontext" state line codeset ret=1
+local expl curcontext="$curcontext" state line ret=1
 local LOCPATH="${LOCPATH:-/usr/lib/nls/loc}"
+local -U codeset
 
 if _pick_variant gnu=GNU unix --version; then
 
@@ -40,6 +41,7 @@
   if [[ $state = codeset ]]; then
     if [[ -f /usr/lib/iconv/iconv_data ]]; then  # IRIX & Solaris
       codeset=( ${${(f)"$(</usr/lib/iconv/iconv_data)"}%%[[:blank:]]*} )
+      codeset+=( /usr/lib/iconv/*%*.so(Ne.'reply=( ${${REPLY:t}%%%*} ${${REPLY:r}#*%} )'.) )
     elif [[ -d $LOCPATH/iconv ]]; then  # OSF
       codeset=( $LOCPATH/iconv/*(N:t) )
       codeset=( ${(j:_:s:_:)codeset} )
Index: Src/utils.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
retrieving revision 1.75
diff -u -r1.75 utils.c
--- Src/utils.c	25 Feb 2005 10:21:01 -0000	1.75
+++ Src/utils.c	3 Mar 2005 10:29:50 -0000
@@ -3617,13 +3617,13 @@
 		    ICONV_CONST char *inptr = inbuf;
     	    	    inbytes = 4;
 		    outbytes = 6;
-		    /* assume big endian convention for UCS-4 */
+		    /* store value in big endian form */
 		    for (i=3;i>=0;i--) {
 			inbuf[i] = wval & 0xff;
 			wval >>= 8;
 		    }
 
-    	    	    cd = iconv_open(nl_langinfo(CODESET), "ISO-10646");
+    	    	    cd = iconv_open(nl_langinfo(CODESET), "UCS-4BE");
 		    if (cd == (iconv_t)-1) {
 			zerr("cannot do charset conversion", NULL, 0);
 			if (fromwhere == 4) {

