From: Sebastian Gniazdowski <psprint@fastmail.com>
To: zsh-workers@zsh.org
Subject: [PATCH] Optimization for mb_metastrlenend()
Date: Thu, 03 Nov 2016 12:44:12 -0700
Message-ID: <1478202252.540329.776653241.444797D3@webmail.messagingengine.com>
Hello
mb_metastrlenend() can count a character immediately, without calling
mbrtowc(), when the byte is ASCII (0..127) and follows a complete
character. I found a good test for this: a syntax highlighting parser
working on 823 lines of Zsh code. It comes from my project HSMW and is a
modified and optimized zsh-syntax-highlighting parser. Running time
before the optimization: 2237 ms, after: 2027 ms, so this is a gain of
roughly 10% for long buffers. I repeated the test many times and it is a
clear win. For short buffers (calling the parser line by line on
different, hard input) the gain is ~30 ms for run times of ~1450 ms, so
no real win there. Zprof results for long buffers and instructions for
repeating the test are attached. I checked that all Zsh tests pass.
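For readers who want the idea in isolation, here is a minimal standalone
sketch of the fast path. It is not the zsh code: count_chars_fast() is a
hypothetical helper, it works on a plain NUL-terminated string rather
than a metafied one, it ignores the width argument, and it counts a
trailing incomplete sequence as one character, which is a simplification
of my own. It compares the byte as unsigned char so the range test does
not depend on whether plain char is signed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of zsh: count the characters of a
 * NUL-terminated string in the current locale's encoding, skipping
 * mbrtowc() for ASCII bytes that start at a character boundary.
 */
size_t count_chars_fast(const char *s)
{
    mbstate_t st;
    size_t num = 0;
    int complete = 1;   /* 1 when the previous byte ended a character */

    assert(s != NULL);
    memset(&st, 0, sizeof(st));

    while (*s) {
        unsigned char c = (unsigned char)*s;
        wchar_t wc;
        size_t ret;

        if (complete && c <= 0x7f) {
            /* Fast path: ASCII byte right after a complete character. */
            num++;
            s++;
            continue;
        }

        /* Slow path: feed one byte at a time to the converter. */
        ret = mbrtowc(&wc, s, 1, &st);
        s++;

        if (ret == (size_t)-2) {
            /* Incomplete sequence: wait for more bytes. */
            complete = 0;
        } else {
            if (ret == (size_t)-1) {
                /* Invalid byte: reset the state, count it as one character. */
                memset(&st, 0, sizeof(st));
            }
            num++;
            complete = 1;
        }
    }

    /* Assumption of this sketch: a dangling partial sequence counts once. */
    if (!complete)
        num++;

    return num;
}
```

To count in the user's encoding, a caller would first run
setlocale(LC_CTYPE, "") from <locale.h>. The complete flag matters for
encodings whose trailing bytes can fall in the ASCII range, where a byte
below 0x80 in the middle of a sequence must still go through mbrtowc().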
diff --git a/Src/utils.c b/Src/utils.c
index db43529..5bc9ef4 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5323,7 +5323,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
char inchar, *laststart;
size_t ret;
wchar_t wc;
- int num, num_in_char;
+ int num, num_in_char, complete;
if (!isset(MULTIBYTE))
return ztrlen(ptr);
@@ -5331,6 +5331,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
laststart = ptr;
ret = MB_INVALID;
num = num_in_char = 0;
+ complete = 1;
memset(&mb_shiftstate, 0, sizeof(mb_shiftstate));
while (*ptr && !(eptr && ptr >= eptr)) {
@@ -5339,6 +5340,14 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
else
inchar = *ptr;
ptr++;
+
+ if ( complete && ( inchar >= 0 && inchar <= 0x7f ) ) {
+ num ++;
+ laststart = ptr;
+ num_in_char = 0;
+ continue;
+ }
+
ret = mbrtowc(&wc, &inchar, 1, &mb_shiftstate);
if (ret == MB_INCOMPLETE) {
@@ -5358,6 +5367,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
* so we don't count characters twice.
*/
num_in_char++;
+ complete = 0;
} else {
if (ret == MB_INVALID) {
/* Reset, treat as single character */
@@ -5378,8 +5388,10 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
}
} else
num++;
+
laststart = ptr;
num_in_char = 0;
+ complete = 1;
}
}
--
Sebastian Gniazdowski
psprint@fastmail.com