zsh-workers
From: Sebastian Gniazdowski <psprint@fastmail.com>
To: zsh-workers@zsh.org
Subject: [PATCH] Optimization for mb_metastrlenend()
Date: Thu, 03 Nov 2016 12:44:12 -0700
Message-ID: <1478202252.540329.776653241.444797D3@webmail.messagingengine.com>

Hello,
mb_metastrlenend() can count a character quickly, without calling
mbrtowc(), if the character is ASCII (0..127) and follows a complete
character. I found a good test for this: a syntax highlighting parser
working on 823 lines of Zsh code. It comes from my project HSMW and is
a modified and optimized zsh-syntax-highlighting parser. Running time
before the optimization: 2237 ms, after: 2027 ms, so this is roughly a
10% improvement for long buffers. I repeated the test many times and
it's a clear win. For short buffers (calling the parser line by line on
different, hard input) the gain is ~30 ms on run times of ~1450 ms, so
no real win there. Zprof results for long buffers and instructions for
repeating the test are attached. I checked that all Zsh tests pass.
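
To show the idea outside of zsh, here is a minimal standalone sketch of
the same fast path. It is not the zsh code: count_chars_fast() and its
arguments are hypothetical names, there is no Meta-decoding and no width
handling, and it assumes an ASCII-compatible encoding such as UTF-8,
where a byte in 0..127 that follows a complete character always stands
for itself. For stateful encodings that assumption may not hold.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of zsh: count the characters in
 * s[0..len) according to the current locale's multibyte encoding.
 * Bytes of a trailing incomplete sequence are not counted here.
 */
size_t count_chars_fast(const char *s, size_t len)
{
    mbstate_t st;
    size_t i = 0, count = 0;
    int complete = 1;   /* 1 if the previous character was fully decoded */

    memset(&st, 0, sizeof(st));
    while (i < len) {
        unsigned char c = (unsigned char)s[i];
        size_t ret;

        /* Fast path: ASCII byte right after a complete character,
         * so there is no need to call mbrtowc() at all. */
        if (complete && c <= 0x7f) {
            count++;
            i++;
            continue;
        }

        /* Slow path: feed one byte at a time to mbrtowc(). */
        ret = mbrtowc(NULL, s + i, 1, &st);
        i++;
        if (ret == (size_t)-2) {
            /* Incomplete: we are in the middle of a character. */
            complete = 0;
        } else {
            if (ret == (size_t)-1) {
                /* Invalid byte: reset state, count it as one character. */
                memset(&st, 0, sizeof(st));
            }
            count++;
            complete = 1;
        }
    }
    return count;
}
```

For pure ASCII input the loop never reaches mbrtowc(), which is where
the saving on long, mostly-ASCII buffers would come from.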



diff --git a/Src/utils.c b/Src/utils.c
index db43529..5bc9ef4 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5323,7 +5323,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     char inchar, *laststart;
     size_t ret;
     wchar_t wc;
-    int num, num_in_char;
+    int num, num_in_char, complete;

     if (!isset(MULTIBYTE))
        return ztrlen(ptr);
@@ -5331,6 +5331,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     laststart = ptr;
     ret = MB_INVALID;
     num = num_in_char = 0;
+    complete = 1;

     memset(&mb_shiftstate, 0, sizeof(mb_shiftstate));
     while (*ptr && !(eptr && ptr >= eptr)) {
@@ -5339,6 +5340,14 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
        else
            inchar = *ptr;
        ptr++;
+
+        if ( complete && ( inchar >= 0 && inchar <= 0x7f ) ) {
+            num ++;
+            laststart = ptr;
+            num_in_char = 0;
+            continue;
+        }
+
        ret = mbrtowc(&wc, &inchar, 1, &mb_shiftstate);

        if (ret == MB_INCOMPLETE) {
@@ -5358,6 +5367,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
             * so we don't count characters twice.
             */
            num_in_char++;
+            complete = 0;
        } else {
            if (ret == MB_INVALID) {
                /* Reset, treat as single character */
@@ -5378,8 +5388,10 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
                }
            } else
                num++;
+
            laststart = ptr;
            num_in_char = 0;
+            complete = 1;
        }
     }

-- 
  Sebastian Gniazdowski
  psprint@fastmail.com

