X-Seq: 41233
Date: Tue, 6 Jun 2017 10:22:00 +0100
From: Stephane Chazelas
To: Bart Schaefer
Cc: Zsh hackers list
Subject: Re: Surprising behaviour with numeric glob sort
Message-ID: <20170606092200.GA8595@chaz.gmail.com>
In-Reply-To: <170605201354.ZM16693@torch.brasslantern.com>

2017-06-05 20:13:54 -0700, Bart Schaefer:
[...]
> Like I said, I think it does this wrong.  If I'm reading the code
> correctly, it first compares the strings for absolute identity while
> searching for embedded nuls, and if they are identical up to the nul
> it then orders the shorter string before the longer one; otherwise
> it skips past the last nul and then relies on strcoll() for the rest
> of both strings.
> It would seem to me that the collation order should
> be checked before any nul as well as after, otherwise the first loop
> might conclude the strings differ when strcoll() would order them the
> same.  (However, read below.)

I see, like in:

$ print -lo $'\u2461\0d' $'\u2463\0c' $'\u2460\0b' $'\u2462\0a' | tr -cd 'abcd\n'
d
c
b
a

Even though \u2460 .. \u2462 all sort the same in my locale, so
the order should be:

a
b
c
d

[...]
> (I don't think zero-padding will work as we
> don't know how many zeroes are needed to make the strings be the same
> number of digits.)

Yes, like I said, that would mean an extra scan of the whole list
to find the widest number. Or, since most of the rest of zsh
can't cope with decimal integer numbers that are more than 19
digits, pad to 19 digits (at the expense of memory and
unnecessary byte comparisons when it comes to comparing those
large numbers of zeros), like in my n() sorting function for
*(o+n) as a replacement of *(n):

n() REPLY=${REPLY//(#m)<->/${(l:20::0:)MATCH}}

-- 
Stephane
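
P.S. For anyone who wants to try the padding idea outside zsh, here is
a rough sketch in portable shell. It is only an illustration, not what
zsh does internally: pad_key and sort_numeric are hypothetical names,
sed/sort/cut stand in for the (#m) and (l:20::0:) expansions, and the
width of 20 is simply taken from n() above. Digit runs longer than 20
characters are truncated on the left by this sketch, and names
containing tabs or newlines are not handled.

```shell
# Hypothetical helper: left-pad every run of digits in $1 with zeros to
# 20 characters, so that a plain byte-wise comparison of the results
# orders the embedded numbers numerically.
pad_key() {
  zeros=$(printf '%020d' 0)   # a string of 20 zeros
  # 1st expression: prefix each digit run with 20 zeros.
  # 2nd expression: keep only the last 20 digits of each (longer) run.
  printf '%s\n' "$1" |
    sed -E -e "s/[0-9]+/${zeros}&/g" \
           -e 's/[0-9]*([0-9]{20})/\1/g'
}

# Hypothetical helper: print the arguments one per line, ordered by
# their padded keys (decorate, sort byte-wise, undecorate).
sort_numeric() {
  for name in "$@"; do
    printf '%s\t%s\n' "$(pad_key "$name")" "$name"
  done | LC_ALL=C sort | cut -f2-
}
```

With that, "sort_numeric file10.txt file2.txt file1.txt" should list
file1.txt, file2.txt, file10.txt in that order. The cost is the one
mentioned above: every key grows by up to 19 zeros per number, and
those zeros all have to be compared.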