From mboxrd@z Thu Jan 1 00:00:00 1970
List-Id: Zsh Workers List
X-Seq: 41222
X-Envelope-From: schaefer@brasslantern.com
From: Bart Schaefer
Message-Id: <170604150135.ZM13291@torch.brasslantern.com>
Date: Sun, 4 Jun 2017 15:01:35 -0700
In-Reply-To: <20170604173157.GB9094@chaz.gmail.com>
Comments: In reply to Stephane Chazelas
        "Re: Surprising behaviour with numeric glob sort" (Jun 4, 6:31pm)
References: <20170531212453.GA31563@chaz.gmail.com>
        <170601152943.ZM4783@torch.brasslantern.com>
        <20170602090332.GA6574@chaz.gmail.com>
        <170602161905.ZM10488@torch.brasslantern.com>
        <20170603211645.GA17785@chaz.gmail.com>
        <170603170724.ZM15645@torch.brasslantern.com>
        <20170604173157.GB9094@chaz.gmail.com>
X-Mailer: OpenZMail Classic (0.9.2 24April2005)
To: Zsh hackers list
Subject: Re: Surprising behaviour with numeric glob sort
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Jun 4, 6:31pm, Stephane Chazelas wrote:
}
} "Slow" (though probably quite negligible compared to strcoll()
} which already does things like in addition to a lot more hard
} wark) but working.

According to one strcoll man page I have, the right thing to do is
convert all the strings with strxfrm and then strcmp those. It provides
no advice on how to order the original array to match the sorted result
of the xfrm'd array (the transform is not reversible), nor how to
determine how much space to allocate for each transform.

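For illustration only, here is a rough C sketch of the
strxfrm-then-strcmp idea. It is not zsh code: the names sort_item,
cmp_item and sort_by_xfrm are made up for the example. It keeps each
original string next to its transformed key so that sorting the keys
reorders the originals. For sizing, it relies on the C standard's rule
that strxfrm returns the length the transformed string needs, and that
the destination may be a null pointer when the size argument is 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: an original string plus its collation key. */
struct sort_item {
    const char *orig;   /* original string, not owned by the item */
    char *key;          /* strxfrm() result, malloc'd */
};

/* qsort() callback: order items by their transformed keys. */
static int cmp_item(const void *a, const void *b)
{
    const struct sort_item *x = a, *y = b;
    return strcmp(x->key, y->key);
}

/*
 * Sort strs[0..n-1] in place in the collation order of the current
 * locale. Returns 0 on success, or -1 if memory runs out, in which
 * case strs is left unchanged.
 */
int sort_by_xfrm(const char **strs, size_t n)
{
    struct sort_item *items;
    size_t i;
    int ret = -1;

    assert(n == 0 || strs != NULL);
    if (n == 0)
        return 0;

    items = calloc(n, sizeof *items);
    if (!items)
        return -1;

    for (i = 0; i < n; i++) {
        /* With a size of 0, strxfrm() writes nothing and only reports
         * how long the transformed string would be. */
        size_t need = strxfrm(NULL, strs[i], 0) + 1;

        items[i].orig = strs[i];
        items[i].key = malloc(need);
        if (!items[i].key)
            goto out;
        strxfrm(items[i].key, strs[i], need);
    }

    qsort(items, n, sizeof *items, cmp_item);

    /* Copy the originals back in the order of their sorted keys. */
    for (i = 0; i < n; i++)
        strs[i] = items[i].orig;
    ret = 0;

out:
    for (i = 0; i < n; i++)
        free(items[i].key);     /* free(NULL) is a no-op */
    free(items);
    return ret;
}
```

Each string is transformed once, so the cost of the locale lookup is
paid n times rather than on every comparison. The sketch does not
handle embedded '\0' bytes or numeric substrings.
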
Zsh has the additional complication of needing to deal with strings
having embedded '\0' bytes, which neither strcoll nor strxfrm is able
to deal with. I'm not 100% confident that zsh's current algorithm
deals correctly with this either.

A possible approach would be to pre-allocate a hash table and fill it
on the fly, with the original strings as keys and the strxfrm results
as values. An additional strxfrm could be called on the trailing bits
after any embedded nul. Then sort the original array by comparing the
values in the hash. Doesn't solve the question of how much space to
reserve, but it would allow the current algorithm for picking out
numeric substrings to be used.

For parameter array sorting we could extend the (m) flag to indicate
that this transformation is required [it already means to count
character widths for (l), (r), etc.] so as to avoid the overhead when
it is not needed. For globbing, we'd have to rely on something else
such as whether MULTIBYTE is set.
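
As a rough illustration of calling strxfrm again on the bits after an
embedded nul, here is one possible C sketch. It is not zsh code, and
the names xfrm_with_nuls and cmp_keys are invented for the example.
It assumes strings arrive as pointer plus length, and it assumes that
comparing the nul-separated runs one after another, with nul sorting
lowest, is an acceptable ordering. It leaves out the hash table and
the numeric-substring handling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a collation key for the len bytes at s, which may contain
 * '\0'. Each NUL-free run goes through strxfrm() on its own and the
 * results are joined with single '\0' separators. Returns a malloc'd
 * buffer and stores its length in *keylen, or returns NULL if memory
 * runs out.
 */
char *xfrm_with_nuls(const char *s, size_t len, size_t *keylen)
{
    char *copy, *key;
    size_t pos, total = 0, used = 0;

    assert(s != NULL && keylen != NULL);

    /* Copy with a guaranteed terminator so the last run is a C string. */
    copy = malloc(len + 1);
    if (!copy)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';

    /* Pass 1: add up the space needed, one terminator per run. */
    for (pos = 0; pos <= len; pos += strlen(copy + pos) + 1)
        total += strxfrm(NULL, copy + pos, 0) + 1;

    key = malloc(total);
    if (key) {
        /* Pass 2: transform each run into its place in the key. */
        for (pos = 0; pos <= len; pos += strlen(copy + pos) + 1)
            used += strxfrm(key + used, copy + pos, total - used) + 1;
        *keylen = used - 1;     /* leave the final terminator uncounted */
    }
    free(copy);
    return key;
}

/*
 * Compare two keys built by xfrm_with_nuls(). Bytes are compared as
 * unsigned values, and a key that is a prefix of the other sorts first.
 */
int cmp_keys(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int c = memcmp(a, b, n);

    if (c != 0)
        return c;
    return (alen > blen) - (alen < blen);
}
```

Because strxfrm output never contains a '\0' of its own, the separator
bytes in the joined key can only line up with other separators or with
the end of a shorter run, so a plain byte comparison of two keys gives
the same answer as comparing the runs pairwise with strcmp.
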