Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 23173
Date: Tue, 13 Feb 2007 23:48:14 -0800
From: Bart Schaefer
Subject: Re: Quoting problem and crashes
 with ${(#)var}
In-reply-to: <200702132111.l1DLB5rA003849@pwslaptop.csr.com>
To: zsh-workers@sunsite.dk
Message-id: <070213234815.ZM5424@torch.brasslantern.com>
MIME-version: 1.0
X-Mailer: OpenZMail Classic (0.9.2 24April2005)
Content-type: text/plain; charset=us-ascii
References: <200702132111.l1DLB5rA003849@pwslaptop.csr.com>
Comments: In reply to Peter Stephenson
 "Re: Quoting problem and crashes with ${(#)var}" (Feb 13, 9:11pm)

On Feb 13, 9:11pm, Peter Stephenson wrote:
}
} Bart Schaefer wrote:
} > I'm a bit puzzled, given this test ...
} >
} > }     if (isset(MULTIBYTE) && ires > 127) {
} >
} > ... why ${(V)x} for x in 128 through 159 display as \u0080 through
} > \u009f, but then 160 through 255 are treated as directly printable.
}
} On my terminal, I've got different effects, which worries me more: if I
} assign the UTF-8 representation of character 128 to a variable, ${(V)x}
} tries to print it out directly (and it only shows up if send it through
} xxd or equivalent).

Did you remember to use "print -R"?  If I do

    print ${(V)x}

then print interprets the \u0080 sequence and sends a raw byte.  That
doesn't happen with

    print -R ${(V)x}

} (However, the ZLE function insert-unicode-char correctly
} shows it as control character, ^ followed by A with a grave accent.)

That's what I expected ${(V)x} to do, but instead it displays it as a
\u escape.

} > % for x in {1..254}; h[x]=${(V#)x}
} > zsh: character not in range
} >
} > That seems wrong.
}
} Well, because you've (explicitly or otherwise) got it set to a locale
} with no knowledge of characters beyond 127; it only knows about the
} portable character set.  It's simply telling you it doesn't know what to
} do with them.
}
} What you're asking is for some kludged special case for LANG=C

Well, no, I'm not.
I'm asking for two things: (1) when "character not in range" we
don't treat it as a fatal error and bail out of the whole surrounding
loop; and (2) regardless of the locale, single-byte values should always
be convertible to something "viewable", either \u00xy or \M-c.

There might be cases where "character not in range" is a fatal error,
but this doesn't seem as though it ought to be one of them.

--- /tmp/subst.c	2007-02-13 23:44:46.000000000 -0800
+++ /tmp/subst.c5229YwW	2007-02-13 23:44:46.000000000 -0800
@@ -1193,18 +1193,22 @@
 substevalchar(char *ptr)
 {
     zlong ires = mathevali(ptr);
-    int len;
+    int len = 0;
 
     if (errflag)
         return NULL;
 #ifdef MULTIBYTE_SUPPORT
     if (isset(MULTIBYTE) && ires > 127) {
+        int one = noerrs;
         char buf[10];
 
         /* inefficient: should separate out \U handling from getkeystring */
         sprintf(buf, "\\U%.8x", (unsigned int)ires);
+        noerrs = 1;
         ptr = getkeystring(buf, &len, GETKEYS_BINDKEY, NULL);
-    } else
+        noerrs = one, errflag = 0;
+    }
+    if (len == 0)
 #endif
     {
         ptr = zhalloc(2);
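
For anyone reading the patch without the zsh source at hand, the
following is a minimal standalone sketch of its control flow: save
noerrs, silence errors around the conversion, restore the old value,
clear errflag, and fall through to the single-byte path when the
conversion produced nothing.  It is not zsh code.  The noerrs and
errflag variables here are local stand-ins for the shell's globals,
convert_char() and eval_char() are hypothetical names, the max_known
parameter is an invented way to imitate a locale that knows nothing
beyond some value, and the one-byte fallback is an assumption about what
the block after "if (len == 0)" does.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the shell's error state (not zsh's globals). */
static int noerrs = 0;   /* nonzero: do not report errors */
static int errflag = 0;  /* nonzero: an error is pending */

/*
 * Hypothetical converter standing in for getkeystring().  It pretends the
 * locale knows nothing above max_known and flags an error for larger
 * values.  Otherwise it writes the two-byte UTF-8 form, which is only
 * valid for 128..2047, so anything outside that range is rejected too.
 * Returns the number of bytes written, or 0 on failure.
 */
static int convert_char(unsigned int value, unsigned int max_known, char *out)
{
    if (value > max_known || value < 0x80 || value > 0x7FF) {
        if (!noerrs)
            fprintf(stderr, "character not in range\n");
        errflag = 1;
        return 0;
    }
    out[0] = (char)(0xC0 | (value >> 6));
    out[1] = (char)(0x80 | (value & 0x3F));
    out[2] = '\0';
    return 2;
}

/*
 * Same shape as the patched function: a failed conversion is neither
 * reported nor left pending, and the caller still gets a result.
 * out must have room for at least 3 bytes.
 */
static int eval_char(unsigned int value, unsigned int max_known, char *out)
{
    int len = 0;

    assert(out != NULL);
    if (value > 127) {
        int saved = noerrs;          /* remember the caller's setting */

        noerrs = 1;                  /* suppress "character not in range" */
        len = convert_char(value, max_known, out);
        noerrs = saved;              /* restore it */
        errflag = 0;                 /* do not abort the surrounding loop */
    }
    if (len == 0) {
        /* Assumed fallback: keep the low byte as a one-character string. */
        out[0] = (char)value;
        out[1] = '\0';
        len = 1;
    }
    return len;
}
```

With max_known set to 127 (imitating LANG=C), eval_char(200, 127, buf)
takes the fallback path and leaves errflag clear, so a loop such as the
{1..254} one quoted above would keep going.  With max_known set to
0x7FF the same value takes the multibyte path instead.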