From: Bart Schaefer
Date: Fri, 5 Jan 2018 14:23:57 -0800
Subject: Re: Idea for optimization (use case: iterate string with index parameter)
To: "zsh-workers@zsh.org"
List-Id: Zsh Workers List
X-Seq: 42232

On Fri, Jan 5, 2018 at 5:38 AM, Sebastian Gniazdowski wrote:
> iterating string with index parameter is quite slow, because unicode
> characters are skipped and counted using mbrtowc().

I can't remember the last time I needed to do that kind of iteration.

> For example, I saw z-sy-h uses such loops, my projects sometimes use
> them too. The point is that iterating a string and doing something
> with letters, e.g. counting brackets, is a very common use case, and
> the optimization would be triggered often.

Hmm. Whether this is worthwhile depends on the size of the typical
processed string. I can see this affecting z-sy-h when e.g. running
zed on a big function, but probably not when editing a typical command
line.

Maybe it would be reasonable to do something in shell code, e.g.:

  typeset -a iter=(${(s//)string})
  for ((i=1; i <= $#iter; i++)); do something with $iter[i]; done
  string=${(j//)iter}  # if needed

That is more memory-intensive, of course, but it also assists with
cases of unordered access into the array of characters.

> In general, the array would hold #N (5-10 or so) last string-index requests.
> If new request would target the same string, but index greater by 1,
> getarg() would call mbrtowc() once (via MB_METACHARLEN macro) reusing
> the previous in-string pointer.

Why only when greater by 1? If greater, scan to and record the next
needed position. Same number of mbrtowc() conversions, overall.
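
To illustrate the "scan forward from the last recorded position" idea,
here is a minimal standalone C sketch. It is not zsh source: the names
idx_cache, next_char(), char_offset() and scan_calls are hypothetical,
the cache has a single entry rather than the 5-10 proposed above, and a
simple UTF-8 continuation-byte skip stands in for mbrtowc() /
MB_METACHARLEN so that the example does not depend on the locale.
Cache invalidation when the string is modified is left out entirely.

```c
#include <stddef.h>

#define NOT_FOUND ((size_t)-1)

/* Hypothetical one-entry cache: where character `index` of `str` begins. */
struct idx_cache {
    const char *str;   /* string the entry refers to (compared by pointer) */
    size_t index;      /* 1-based character index last resolved, 0 = empty */
    size_t offset;     /* byte offset of that character within str */
};

/* Counts single-character steps, i.e. what would be mbrtowc() calls. */
size_t scan_calls;

/* Stand-in for mbrtowc(): advance past one UTF-8 character.
 * Must only be called when s[off] is not the terminating NUL. */
size_t next_char(const char *s, size_t off)
{
    off++;
    while (((unsigned char)s[off] & 0xC0) == 0x80)
        off++;                     /* skip continuation bytes */
    return off;
}

/* Return the byte offset of 1-based character `want`, or NOT_FOUND if
 * the string is shorter than that. Resumes from the cached position
 * whenever the request is for the same string at an equal or greater
 * index, not only when the index is greater by exactly 1. */
size_t char_offset(struct idx_cache *c, const char *s, size_t want)
{
    size_t idx = 1, off = 0;

    if (want == 0)
        return NOT_FOUND;
    if (c->str == s && c->index != 0 && c->index <= want) {
        idx = c->index;            /* reuse the recorded position */
        off = c->offset;
    }
    while (idx < want && s[off] != '\0') {
        off = next_char(s, off);
        scan_calls++;
        idx++;
    }
    if (s[off] == '\0')
        return NOT_FOUND;          /* ran off the end; cache left as is */
    c->str = s;                    /* record the position for next time */
    c->index = want;
    c->offset = off;
    return off;
}
```

With a zero-initialized struct idx_cache, asking for characters 1, 2,
3, 4 in turn costs one step per request, and asking for 1 and then 4
costs the same three steps in total. A request for a smaller index than
the cached one falls back to scanning from the start of the string.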