From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 20123 invoked from network); 17 Aug 2007 12:09:00 -0000
X-Spam-Checker-Version: SpamAssassin 3.2.1 (2007-05-02) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.1
Received: from news.dotsrc.org (HELO a.mx.sunsite.dk) (130.225.247.88) by ns1.primenet.com.au with SMTP; 17 Aug 2007 12:09:00 -0000
Received-SPF: none (ns1.primenet.com.au: domain at sunsite.dk does not designate permitted sender hosts)
Received: (qmail 41554 invoked from network); 17 Aug 2007 12:08:54 -0000
Received: from sunsite.dk (130.225.247.90) by a.mx.sunsite.dk with SMTP; 17 Aug 2007 12:08:54 -0000
Received: (qmail 29314 invoked by alias); 17 Aug 2007 12:08:51 -0000
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 23767
Received: (qmail 29304 invoked from network); 17 Aug 2007 12:08:49 -0000
Received: from news.dotsrc.org (HELO a.mx.sunsite.dk) (130.225.247.88) by sunsite.dk with SMTP; 17 Aug 2007 12:08:49 -0000
Received: (qmail 41226 invoked from network); 17 Aug 2007 12:08:49 -0000
Received: from acolyte.scowler.net (216.254.112.45) by a.mx.sunsite.dk with SMTP; 17 Aug 2007 12:08:46 -0000
Received: by acolyte.scowler.net (Postfix, from userid 1000) id D3B8E5CDA6; Fri, 17 Aug 2007 08:08:44 -0400 (EDT)
Date: Fri, 17 Aug 2007 08:08:44 -0400
From: Clint Adams
To: zsh-workers@sunsite.dk
Cc: Alan Curry, 419832-forwarded@bugs.debian.org
Subject: Re: Bug#419832: zsh: expanding non-ASCII filenames with
Message-ID: <20070817120844.GA9936@scowler.net>
Mail-Followup-To: zsh-workers@sunsite.dk, Alan Curry, 419832-forwarded@bugs.debian.org
References: <20070817001222.GA19399@scowler.net> <200708170905.l7H9521T1534406@shell01.TheWorld.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200708170905.l7H9521T1534406@shell01.TheWorld.com>
User-Agent: Mutt/1.5.16 (2007-06-11)

On Fri, Aug 17, 2007 at 05:05:02AM -0400, Alan Curry wrote:
> >> In the following demonstration, the first keypress inserted the $'\300'
> >> for me. The second keypress, typed immediately after the asterisk,
> >> should expand the glob into $'\300' also, but instead it just erases the
> >> asterisk, replacing it with nothing at all. If Return is pressed after the
> >> tab, the cat is executed with no arguments and reads from the tty.
> >> Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
> >Non-ASCII characters don't exist in the C locale; maybe you want to pick
> >a better one.
>
> That's a pretty lame brush-off.
>
> My locale is set correctly (to be precise, it is unset correctly; none of
> those environment variables are set). It represents the type of output I want
> to get from all programs that recognize locales: text in English if possible,
> and traditional sort order, not that new-fangled chaotic LANG=en order, where
> ls hides your Makefile in the middle of all your lowercase source files! (Why
> do you think they made make(1) recognize Makefiles with a capital M? Because
> it belongs at the start of the listing, that's why.)
>
> If you think this behavior is justified, for what am I being punished? Using
> the default ("C") locale? It accurately describes what language I can read.
> Having a file that is not a valid sequence of characters in that locale?
> Maybe I should go file bug reports on all the programs that allow me to
> create a file with such a name. That will be a lot of bug reports.
>
> Or maybe we could admit that regardless of one's preferred locale, it is
> inevitable that one will occasionally obtain files whose names are not valid
> character strings in that locale. It would be nice if our tools would not
> choke on those, would it not?
>
> The $'\300' notation is a vast improvement over what older zsh versions did,
> just dump the wacky bytes directly to the terminal. The current version
> already automatically inserts $'\300' when completing; I only suggest that it
> behave identically when expanding.
>
> Expanding a glob to an empty list, when in fact it matched something, surely
> can't be considered acceptable behavior. Even worse if it matched several
> things and only one of them had a nasty byte and got omitted, you might not
> notice and then go ahead and act on the wrong set of files.
>
> Come on.

There is apparently a bit of inconsistency here. expand-or-complete and
expand-word will eat the asterisk, but _expand_word won't.

What's worse is that if you `touch a b$'\300' c` in an empty directory,
cat * will only expand to "a b".
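
For anyone who wants a baseline to compare against, here is a rough sketch of that test directory built from a plain script. It only counts what a non-interactive glob matches; it does not exercise the line editor widgets named above, so it shows what an interactive expansion of `*` ought to produce rather than reproducing the bug itself. The scratch directory from `mktemp -d` and the variable names `dir`, `odd` and `count` are made up for illustration, and it assumes a filesystem that accepts arbitrary bytes in filenames.

```shell
#!/bin/sh
# Illustrative sketch only: dir, odd and count are hypothetical names,
# not taken from the bug report.
LC_ALL=C
export LC_ALL

# Work in a scratch directory so the glob sees only our three files.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Build the name "b" followed by byte 0xC0 (octal 300) with printf,
# so the script does not depend on $'\300' quoting being available.
odd=$(printf 'b\300')
touch a "$odd" c

# Expand * non-interactively and count the resulting words.
set -- *
count=$#
echo "glob matched $count names"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

If the count printed here is larger than the number of words left on the command line after expanding `*` with the editor widget in the same directory, some name was dropped during the interactive expansion.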