From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 1761 invoked by alias); 1 Oct 2015 13:45:42 -0000
Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Id: Zsh Workers List
List-Post:
List-Help:
X-Seq: 36733
Received: (qmail 5873 invoked from network); 1 Oct 2015 13:45:41 -0000
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM,
    T_DKIM_INVALID autolearn=ham autolearn_force=no version=3.4.0
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
    h=mime-version:in-reply-to:references:from:date:message-id:subject:to
    :content-type;
    bh=p/jYzq2OjFEWv+5adShcbcwqKsKaxeUMxo8ttCVbvCI=;
    b=mSke7Gg2mUkpVQBeqo0pXnDyomuuoRgz4iSFnSBdY+kuvBXyx0zzSg6N3MVIKkEeB4
    Tvkecj7Oe1zrfD5nEqIm3Ri5SuQ/+BI4/t4GWmZ3SqATyOs3HdEryO+8A8R2EKkzl4vq
    7pnsdw85XZVULgFexp4wkhKd8cb522jqn8lMXudKo0Pjqssacqf3UZTwV6/pNVrpW61Q
    ZwxqYAzeCGbLx6GFUCJxBE4/vEU488QmYVX8z6oi3KzwVdMXcnuF3mRE0CWSwHMcpXnY
    q8qE8pVgo8Ie0AWIFqDknzF2I/Pm6Px7vTn7Dw+FSyRN62HqZV0YvNrJSssmpi8fs3Ho
    BsUQ==
X-Received: by 10.152.21.200 with SMTP id x8mr2863464lae.23.1443707137531;
    Thu, 01 Oct 2015 06:45:37 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <150927091121.ZM25721@torch.brasslantern.com>
References: <150926134410.ZM17546@torch.brasslantern.com>
    <150927091121.ZM25721@torch.brasslantern.com>
From: Sebastian Gniazdowski
Date: Thu, 1 Oct 2015 15:45:17 +0200
Message-ID:
Subject: Re: Substitution ${...///} slows down when certain UTF character occurs
To: zsh-workers@zsh.org
Content-Type: multipart/mixed; boundary=089e0158aef4186dd405210b43fb

--089e0158aef4186dd405210b43fb
Content-Type: text/plain; charset=UTF-8

On 27 September 2015 at 18:11, Bart Schaefer wrote:
> It'll get worse if there are partial matches, e.g., if you had 30000
> repetitions of "wfei" and scanned for "wfeiwj" there'd be a whole lot
> of backtracking.
> There are no "w" anywhere in your sample $str so
> each of the comparisons is only one equality test.

It's still effectively instant for 30k repetitions of "wfei" (the script
is attached). I also tried generating a string made up only of the
characters in [wfeiwj], and that is fast too:

    cat /dev/urandom | env LC_CTYPE=C tr -cd 'wfeiwj' | head -c 120000 > input

I gave it one more try with
"wfeiwjwoiejfowiejfowijefoiwjefoiwjefoijwoeifjwoiejf" (again 30k of it,
searching for that same string). It becomes slower (0.15s instead of
0.012s) but is still effectively instant.

>
> Still I think the biggest issue is that unmetafication happening too
> low down. Since pattry*() is being called repeatedly with the same two
> first arguments (prog and string) it might be possible to cache the
> unmetafied string after the first call.

I wonder why the slowdown depends on the Zsh version and/or environment
(OS, etc.). It doesn't seem related to unmetafication, unless that code
was changed after 5.0.2.

Best regards,
Sebastian Gniazdowski

--089e0158aef4186dd405210b43fb
Content-Type: application/octet-stream; name="test-script.zsh"
Content-Disposition: attachment; filename="test-script.zsh"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_if89w50h0

IyEvdXNyL2xvY2FsL2Jpbi96c2gKCmVtdWxhdGUgLUxSIHpzaAoKc3RyPSIiCgppPTMwMDAwCndo
aWxlICgoIGkgLS0gKSk7IGRvCiAgICBzdHJbMSwwXT0id2ZlaSIKZG9uZQojZWNobyAiJHN0ciIg
PiBpbnB1dAoKCiNjaGFyPSLigJMiCmNoYXI9IsKlIgojY2hhcj0ixYMiCiNjaGFyPSLFgSIKI2No
YXI9IseNIgojY2hhcj0ix54iCgpzdHJbNjBdPSRjaGFyCnRpbWUgKCBvdXRfYXJyYXk9KCAiJHtz
dHIvLygjYmkpd2ZlaXdqL2F9IiApICkKc3RyWzYwXT0iYSIKCnN0clsxMDAwMF09JGNoYXIKdGlt
ZSAoIG91dF9hcnJheT0oICIke3N0ci8vKCNiaSl3ZmVpd2ovYX0iICkgKQpzdHJbMTAwMDBdPSJh
IgoKc3RyWzUwMDAwXT0kY2hhcgp0aW1lICggb3V0X2FycmF5PSggIiR7c3RyLy8oI2JpKXdmZWl3
ai9hfSIgKSApCnN0cls1MDAwMF09ImEiCgpzdHJbMTAwMDAwXT0kY2hhcgp0aW1lICggb3V0X2Fy
cmF5PSggIiR7c3RyLy8oI2JpKXdmZWl3ai9hfSIgKSApCnN0clsxMDAwMDBdPSJhIgoK
--089e0158aef4186dd405210b43fb--
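
The two kinds of test input described in the message above (a fixed word
repeated many times, and a random string over a small alphabet) could be
regenerated with something like the following POSIX shell sketch. The
function names repeat_string and random_from_set are hypothetical and do
not come from the thread. random_from_set uses awk's rand() rather than
the /dev/urandom pipeline quoted in the message, purely so that the
output length is exact and no pipe has to be cut short; it is an
assumption that this difference does not matter for the timing
experiment.

```shell
#!/bin/sh
# Hypothetical helpers for rebuilding the benchmark inputs.

# repeat_string WORD COUNT
# Print WORD repeated COUNT times, with no trailing newline.
repeat_string() {
    awk -v w="$1" -v n="$2" 'BEGIN {
        for (i = 0; i < n; i++) printf "%s", w
    }'
}

# random_from_set CHARS COUNT
# Print COUNT characters, each picked at random from CHARS
# (single-byte characters assumed), with no trailing newline.
random_from_set() {
    awk -v chars="$1" -v n="$2" 'BEGIN {
        srand()                      # seed from the current time
        len = length(chars)
        for (i = 0; i < n; i++)
            printf "%s", substr(chars, int(rand() * len) + 1, 1)
    }'
}
```

For example, "repeat_string wfei 30000 > input" writes 120000
characters of repeated "wfei", and "random_from_set wfeij 120000 > input"
writes a random string of the same length. The resulting file can then
be read into $str in zsh and the substitution timed as in the message.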