* [PATCH] Optimization for mb_metastrlenend()
@ 2016-11-03 19:44 ` Sebastian Gniazdowski
2016-11-03 19:47 ` Sebastian Gniazdowski
2016-11-04 9:59 ` Peter Stephenson
0 siblings, 2 replies; 3+ messages in thread
From: Sebastian Gniazdowski @ 2016-11-03 19:44 UTC
To: zsh-workers
Hello,
mb_metastrlenend() can count a character quickly when it is ASCII
(0..127) and follows a complete character. A good test case for this is
a syntax-highlighting parser working on 823 lines of Zsh code. It comes
from my project HSMW and is a modified, optimized
zsh-syntax-highlighting parser. Running time before the optimization:
2237 ms, after: 2027 ms, so this is a gain of roughly 10% for long
buffers. I repeated the test many times and it is a clear win. For short
buffers (calling the parser line by line on different, hard input) the
gain is ~30 ms on run times of ~1450 ms, so no real win. Zprof results
for long buffers and instructions for repeating the test are attached.
I checked that all Zsh tests pass.
diff --git a/Src/utils.c b/Src/utils.c
index db43529..5bc9ef4 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5323,7 +5323,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
char inchar, *laststart;
size_t ret;
wchar_t wc;
- int num, num_in_char;
+ int num, num_in_char, complete;
if (!isset(MULTIBYTE))
return ztrlen(ptr);
@@ -5331,6 +5331,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
laststart = ptr;
ret = MB_INVALID;
num = num_in_char = 0;
+ complete = 1;
memset(&mb_shiftstate, 0, sizeof(mb_shiftstate));
while (*ptr && !(eptr && ptr >= eptr)) {
@@ -5339,6 +5340,14 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
else
inchar = *ptr;
ptr++;
+
+ if ( complete && ( inchar >= 0 && inchar <= 0x7f ) ) {
+ num ++;
+ laststart = ptr;
+ num_in_char = 0;
+ continue;
+ }
+
ret = mbrtowc(&wc, &inchar, 1, &mb_shiftstate);
if (ret == MB_INCOMPLETE) {
@@ -5358,6 +5367,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
* so we don't count characters twice.
*/
num_in_char++;
+ complete = 0;
} else {
if (ret == MB_INVALID) {
/* Reset, treat as single character */
@@ -5378,8 +5388,10 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
}
} else
num++;
+
laststart = ptr;
num_in_char = 0;
+ complete = 1;
}
}
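
For readers who want to try the idea outside of zsh, the fast path in the
patch above can be sketched as a standalone function. This is only an
illustration under stated assumptions, not the code in Src/utils.c: the
name count_chars_fastpath() is hypothetical, the input is a plain
NUL-terminated string (no Meta decoding, no width calculation, no eptr),
and the (size_t)-2 / (size_t)-1 return values of mbrtowc() stand in for
zsh's MB_INCOMPLETE / MB_INVALID.

```c
#include <assert.h>
#include <locale.h>   /* setlocale(), needed by callers of this sketch */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, NOT the zsh function: count the characters in a
 * NUL-terminated multibyte string, skipping mbrtowc() for 7-bit ASCII
 * bytes that start at a character boundary.
 */
static size_t count_chars_fastpath(const char *s)
{
    const char *laststart;        /* start of the character being decoded */
    mbstate_t st;
    wchar_t wc;
    size_t ret, num = 0, num_in_char = 0;
    int complete = 1;             /* 1 when not inside a multibyte character */

    assert(s != NULL);
    laststart = s;
    memset(&st, 0, sizeof(st));

    while (*s) {
        char inchar = *s++;

        if (complete && (unsigned char)inchar <= 0x7f) {
            /* Fast path: ASCII byte right after a complete character */
            num++;
            laststart = s;
            continue;
        }

        ret = mbrtowc(&wc, &inchar, 1, &st);
        if (ret == (size_t)-2) {
            /* Incomplete: remember the byte, keep feeding the decoder */
            num_in_char++;
            complete = 0;
        } else {
            if (ret == (size_t)-1) {
                /* Invalid: reset and count the first byte on its own */
                memset(&st, 0, sizeof(st));
                s = laststart + 1;
            }
            num++;
            laststart = s;
            num_in_char = 0;
            complete = 1;
        }
    }

    /* Bytes of a trailing incomplete sequence count as single characters */
    return num + num_in_char;
}
```

A caller would normally select a locale first, for example with
setlocale(LC_ALL, ""), and then call count_chars_fastpath() on the
string. The `complete` flag is what keeps the shortcut safe: an ASCII
byte is only counted directly when the decoder is not in the middle of a
multibyte sequence.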
--
Sebastian Gniazdowski
psprint@fastmail.com
* Re: [PATCH] Optimization for mb_metastrlenend()
From: Sebastian Gniazdowski @ 2016-11-03 19:47 UTC
To: zsh-workers
[-- Attachment #1: Type: text/plain, Size: 70 bytes --]
The missing files
--
Sebastian Gniazdowski
psprint@fastmail.com
[-- Attachment #2: mbrtowc_utils.diff --]
[-- Type: text/plain, Size: 1452 bytes --]
diff --git a/Src/utils.c b/Src/utils.c
index db43529..5bc9ef4 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5323,7 +5323,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
char inchar, *laststart;
size_t ret;
wchar_t wc;
- int num, num_in_char;
+ int num, num_in_char, complete;
if (!isset(MULTIBYTE))
return ztrlen(ptr);
@@ -5331,6 +5331,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
laststart = ptr;
ret = MB_INVALID;
num = num_in_char = 0;
+ complete = 1;
memset(&mb_shiftstate, 0, sizeof(mb_shiftstate));
while (*ptr && !(eptr && ptr >= eptr)) {
@@ -5339,6 +5340,14 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
else
inchar = *ptr;
ptr++;
+
+ if ( complete && ( inchar >= 0 && inchar <= 0x7f ) ) {
+ num ++;
+ laststart = ptr;
+ num_in_char = 0;
+ continue;
+ }
+
ret = mbrtowc(&wc, &inchar, 1, &mb_shiftstate);
if (ret == MB_INCOMPLETE) {
@@ -5358,6 +5367,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
* so we don't count characters twice.
*/
num_in_char++;
+ complete = 0;
} else {
if (ret == MB_INVALID) {
/* Reset, treat as single character */
@@ -5378,8 +5388,10 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
}
} else
num++;
+
laststart = ptr;
num_in_char = 0;
+ complete = 1;
}
}
[-- Attachment #3: zprof_results.txt --]
[-- Type: text/plain, Size: 2013 bytes --]
git clone https://github.com/psprint/history-search-multi-word.git
cd history-search-multi-word/test; ./parse.zsh ./to-parse.zsh
823 lines parsed with modified, optimized zsh-syntax-highlighting code
After optimization, minimum obtainable time:
Running time: 2.0280520000
num  calls                time                       self            name
-----------------------------------------------------------------------------------
 1)    1        2027,49  2027,49  100,00%   1898,46  1898,46   93,63%  -hsmw-highlight-process
 2)  754         109,36     0,15    5,39%    109,36     0,15    5,39%  -hsmw-highlight-main-type
 3)  395          11,57     0,03    0,57%     11,57     0,03    0,57%  -hsmw-highlight-check-path
 4)   22           5,78     0,26    0,28%      5,78     0,26    0,28%  -hsmw-highlight-string
 5)    6           2,33     0,39    0,11%      2,33     0,39    0,11%  -hsmw-highlight-dollar-string
 6)    1           0,07     0,07    0,00%      0,07     0,07    0,00%  -hsmw-highlight-fill-option-variables
 7)    1           0,01     0,01    0,00%      0,01     0,01    0,00%  -hsmw-highlight-init
Before optimization, minimum obtainable time:
Running time: 2.2383990000
num  calls                time                       self            name
-----------------------------------------------------------------------------------
 1)    1        2237,79  2237,79  100,00%   2104,55  2104,55   94,04%  -hsmw-highlight-process
 2)  754         113,73     0,15    5,08%    113,73     0,15    5,08%  -hsmw-highlight-main-type
 3)  395          11,24     0,03    0,50%     11,24     0,03    0,50%  -hsmw-highlight-check-path
 4)   22           6,02     0,27    0,27%      6,02     0,27    0,27%  -hsmw-highlight-string
 5)    6           2,26     0,38    0,10%      2,26     0,38    0,10%  -hsmw-highlight-dollar-string
 6)    1           0,07     0,07    0,00%      0,07     0,07    0,00%  -hsmw-highlight-fill-option-variables
 7)    1           0,01     0,01    0,00%      0,01     0,01    0,00%  -hsmw-highlight-init
* Re: [PATCH] Optimization for mb_metastrlenend()
From: Peter Stephenson @ 2016-11-04 9:59 UTC
To: zsh-workers
On Thu, 03 Nov 2016 12:44:12 -0700
Sebastian Gniazdowski <psprint@fastmail.com> wrote:
> mb_metastrlenend() can count a character quickly when it is ASCII
> (0..127) and follows a complete character. A good test case for this is
> a syntax-highlighting parser working on 823 lines of Zsh code. It comes
> from my project HSMW and is a modified, optimized
> zsh-syntax-highlighting parser. Running time before the optimization:
> 2237 ms, after: 2027 ms, so this is a gain of roughly 10% for long
> buffers. I repeated the test many times and it is a clear win. For short
> buffers (calling the parser line by line on different, hard input) the
> gain is ~30 ms on run times of ~1450 ms, so no real win. Zprof results
> for long buffers and instructions for repeating the test are attached.
> I checked that all Zsh tests pass.
Thanks, we do rely throughout on US-ASCII as a 7-bit subset so this is a
reasonable thing to do.
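
The 7-bit assumption can be spot-checked for a given locale with a small
test along the following lines. This is a hedged sketch rather than
anything from the zsh sources: both function names are hypothetical, it
only examines the initial shift state, and it assumes that wchar_t
values for ASCII characters equal their byte values, which holds on
common platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical check, not zsh code: return 1 if byte b, decoded from
 * the initial shift state, is a complete one-byte character whose wide
 * value equals b in the current locale; otherwise return 0.
 */
static int byte_is_own_char(int b)
{
    mbstate_t st;
    wchar_t wc;
    char c;

    assert(b > 0 && b <= 0xff);
    c = (char)b;
    memset(&st, 0, sizeof(st));
    return mbrtowc(&wc, &c, 1, &st) == 1 && wc == (wchar_t)b;
}

/* Return 1 if every byte in 1..127 passes the check above. */
static int ascii_is_single_byte(void)
{
    int b;

    for (b = 1; b <= 0x7f; b++) {
        if (!byte_is_own_char(b))
            return 0;
    }
    return 1;
}
```

If ascii_is_single_byte() returns 0 for some locale, the shortcut in the
patch would presumably count characters differently from mbrtowc() in
that locale.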
I had to apply it by hand and I've slightly reformatted it, but the code
is identical.
pws