From: "Bart Schaefer"
Date: Mon, 6 Jul 1998 11:14:24 -0700
To: "C. v. Stuckrad", Zsh workers list
Subject: Re: 'LC_COLLATE=de ls [A-Z]*' expands to 'every file' including lowercase
Message-Id: <980706111424.ZM5205@candle.brasslantern.com>
X-Mailing-List: archive/latest/4205

On Jul 6, 7:28pm, C. v. Stuckrad wrote:
} Subject: 'LC_COLLATE=de ls [A-Z]*' expands to 'every file' including lower
}
} Is it 'really correct' that after setting 'LANG=de' or 'LC_COLLATE=de',
} ranges of characters no longer differentiate between uppercase
} and lowercase? So 'rm [A-Z]' will remove not only 'FOO' but 'bar' too!

Ranges like [A-Z] are computed using strcoll() when it is available. If that collation function reports that "b" sorts after "A" and before "Z", then 'b' is considered to be in the range [A-Z].

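To make the range test above concrete, here is a minimal sketch in C. It is not the code from matchonce() in glob.c; the names in_range, collate_fn, mixed_rank and mixed_case_coll are hypothetical and exist only for illustration. The second comparator is an assumed stand-in for a locale that interleaves case (AaBbCc...), so the example does not depend on any particular locale being installed.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>   /* callers need this for setlocale() */
#include <string.h>

/* Comparator with the same contract as strcoll():
 * negative, zero, or positive for less, equal, greater. */
typedef int (*collate_fn)(const char *, const char *);

/* Return 1 if c lies in the range [lo-hi] under the ordering cmp,
 * 0 otherwise. Each character is compared as a one-character string. */
static int in_range(char lo, char hi, char c, collate_fn cmp)
{
    char l[2] = { lo, '\0' };
    char h[2] = { hi, '\0' };
    char s[2] = { c, '\0' };

    return cmp(l, s) <= 0 && cmp(s, h) <= 0;
}

/* Hypothetical ordering AaBbCc...Zz, standing in for a locale that
 * interleaves upper and lower case. Only defined for ASCII letters. */
static int mixed_rank(char ch)
{
    unsigned char u = (unsigned char)ch;

    assert(isalpha(u));
    return 2 * (tolower(u) - 'a') + (islower(u) ? 1 : 0);
}

/* strcoll()-style comparator built on mixed_rank(); it looks only at
 * the first character of each string. */
static int mixed_case_coll(const char *a, const char *b)
{
    return mixed_rank(a[0]) - mixed_rank(b[0]);
}
```

With LC_COLLATE set to "C", strcoll() compares by character value, so in_range('A', 'Z', 'b', strcoll) is false. Passing mixed_case_coll instead makes the same call true, because 'b' then sorts between 'A' and 'Z'. A program that wants the user's locale would call setlocale(LC_COLLATE, "") first; what strcoll() returns after that depends on the locale data installed on the system.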
It's entirely possible that setting LANG and/or LC_COLLATE to something other than C or ASCII could cause sorting to become case-insensitive or to mix the letters (e.g. AaBbCcDd...). In the latter case, [A-Z] would include 'a' through 'y' but not 'z', which is seriously confusing.

} Is this a bug ? Or a feature I've not been warned of by the manuals.

I'd have to list it as the latter, but it sure creeps awfully close to being a bug, because it's totally unexpected if you actually know about the numeric values of your character set.

I'd vote in favor of removing HAVE_STRCOLL from matchonce() in glob.c.

-- 
Bart Schaefer                             Brass Lantern Enterprises
http://www.well.com/user/barts            http://www.brasslantern.com