* [PATCH] Optimization for mb_metastrlenend()
From: Sebastian Gniazdowski @ 2016-11-03 19:44 UTC
To: zsh-workers

Hello
mb_metastrlenend can quickly count character if it's ASCII (0..127) and
occurs after complete char. A good test for this has been found – syntax
highlighting parser working on 823 lines of Zsh-code input. It comes
from my project HSMW, is a modified and optimized
zsh-syntax-highlighting parser. Running time before optimizations: 2237
ms, after: 2027 ms, so this is a 10% optimization for long buffers.
Repeated the test many times, it's a clear win. For short buffers
(line-by-line calling the parser on different, hard input) the gain is
~30 ms for run times ~1450 ms, so no win. Zprof results for long buffers
and instruction to repeat the test are attached. Checked that all Zsh
tests are passing.

diff --git a/Src/utils.c b/Src/utils.c
index db43529..5bc9ef4 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5323,7 +5323,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     char inchar, *laststart;
     size_t ret;
     wchar_t wc;
-    int num, num_in_char;
+    int num, num_in_char, complete;

     if (!isset(MULTIBYTE))
         return ztrlen(ptr);
@@ -5331,6 +5331,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     laststart = ptr;
     ret = MB_INVALID;
     num = num_in_char = 0;
+    complete = 1;

     memset(&mb_shiftstate, 0, sizeof(mb_shiftstate));
     while (*ptr && !(eptr && ptr >= eptr)) {
@@ -5339,6 +5340,14 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
         else
             inchar = *ptr;
         ptr++;
+
+        if ( complete && ( inchar >= 0 && inchar <= 0x7f ) ) {
+            num ++;
+            laststart = ptr;
+            num_in_char = 0;
+            continue;
+        }
+
         ret = mbrtowc(&wc, &inchar, 1, &mb_shiftstate);

         if (ret == MB_INCOMPLETE) {
@@ -5358,6 +5367,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
              * so we don't count characters twice.
              */
             num_in_char++;
+            complete = 0;
         } else {
             if (ret == MB_INVALID) {
                 /* Reset, treat as single character */
@@ -5378,8 +5388,10 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
                 }
             } else
                 num++;
+
             laststart = ptr;
             num_in_char = 0;
+            complete = 1;
         }
     }

--
Sebastian Gniazdowski
psprint@fastmail.com

* Re: [PATCH] Optimization for mb_metastrlenend()
From: Sebastian Gniazdowski @ 2016-11-03 19:47 UTC
To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 70 bytes --]

The missing files

--
Sebastian Gniazdowski
psprint@fastmail.com

[-- Attachment #2: mbrtowc_utils.diff --]
[-- Type: text/plain, Size: 1452 bytes --]

diff --git a/Src/utils.c b/Src/utils.c
index db43529..5bc9ef4 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5323,7 +5323,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     char inchar, *laststart;
     size_t ret;
     wchar_t wc;
-    int num, num_in_char;
+    int num, num_in_char, complete;

     if (!isset(MULTIBYTE))
         return ztrlen(ptr);
@@ -5331,6 +5331,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     laststart = ptr;
     ret = MB_INVALID;
     num = num_in_char = 0;
+    complete = 1;

     memset(&mb_shiftstate, 0, sizeof(mb_shiftstate));
     while (*ptr && !(eptr && ptr >= eptr)) {
@@ -5339,6 +5340,14 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
         else
             inchar = *ptr;
         ptr++;
+
+        if ( complete && ( inchar >= 0 && inchar <= 0x7f ) ) {
+            num ++;
+            laststart = ptr;
+            num_in_char = 0;
+            continue;
+        }
+
         ret = mbrtowc(&wc, &inchar, 1, &mb_shiftstate);

         if (ret == MB_INCOMPLETE) {
@@ -5358,6 +5367,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
              * so we don't count characters twice.
              */
             num_in_char++;
+            complete = 0;
         } else {
             if (ret == MB_INVALID) {
                 /* Reset, treat as single character */
@@ -5378,8 +5388,10 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
                 }
             } else
                 num++;
+
             laststart = ptr;
             num_in_char = 0;
+            complete = 1;
         }
     }

[-- Attachment #3: zprof_results.txt --]
[-- Type: text/plain, Size: 2013 bytes --]

git clone https://github.com/psprint/history-search-multi-word.git
cd test; ./parse.zsh ./to-parse.zsh

823 lines parsed with modified, optimized zsh-syntax-highlighting code

After optimization, minimum obtainable time:

Running time: 2.0280520000
num  calls                time                       self            name
-----------------------------------------------------------------------------------
 1)    1        2027,49  2027,49  100,00%   1898,46  1898,46   93,63%  -hsmw-highlight-process
 2)  754         109,36     0,15    5,39%    109,36     0,15    5,39%  -hsmw-highlight-main-type
 3)  395          11,57     0,03    0,57%     11,57     0,03    0,57%  -hsmw-highlight-check-path
 4)   22           5,78     0,26    0,28%      5,78     0,26    0,28%  -hsmw-highlight-string
 5)    6           2,33     0,39    0,11%      2,33     0,39    0,11%  -hsmw-highlight-dollar-string
 6)    1           0,07     0,07    0,00%      0,07     0,07    0,00%  -hsmw-highlight-fill-option-variables
 7)    1           0,01     0,01    0,00%      0,01     0,01    0,00%  -hsmw-highlight-init

Before optimization, minimum obtainable time:

Running time: 2.2383990000
num  calls                time                       self            name
-----------------------------------------------------------------------------------
 1)    1        2237,79  2237,79  100,00%   2104,55  2104,55   94,04%  -hsmw-highlight-process
 2)  754         113,73     0,15    5,08%    113,73     0,15    5,08%  -hsmw-highlight-main-type
 3)  395          11,24     0,03    0,50%     11,24     0,03    0,50%  -hsmw-highlight-check-path
 4)   22           6,02     0,27    0,27%      6,02     0,27    0,27%  -hsmw-highlight-string
 5)    6           2,26     0,38    0,10%      2,26     0,38    0,10%  -hsmw-highlight-dollar-string
 6)    1           0,07     0,07    0,00%      0,07     0,07    0,00%  -hsmw-highlight-fill-option-variables
 7)    1           0,01     0,01    0,00%      0,01     0,01    0,00%  -hsmw-highlight-init

* Re: [PATCH] Optimization for mb_metastrlenend()
From: Peter Stephenson @ 2016-11-04 9:59 UTC
To: zsh-workers

On Thu, 03 Nov 2016 12:44:12 -0700
Sebastian Gniazdowski <psprint@fastmail.com> wrote:
> mb_metastrlenend can quickly count character if it's ASCII (0..127) and
> occurs after complete char. A good test for this has been found – syntax
> highlighting parser working on 823 lines of Zsh-code input. It comes
> from my project HSMW, is a modified and optimized
> zsh-syntax-highlighting parser. Running time before optimizations: 2237
> ms, after: 2027 ms, so this is a 10% optimization for long buffers.
> Repeated the test many times, it's a clear win. For short buffers
> (line-by-line calling the parser on different, hard input) the gain is
> ~30 ms for run times ~1450 ms, so no win. Zprof results for long buffers
> and instruction to repeat the test are attached. Checked that all Zsh
> tests are passing.

Thanks, we do rely throughout on US-ASCII as a 7-bit subset so this is a
reasonable thing to do.

I had to apply it by hand and I've slightly reformatted it, but the code
is identical.

pws
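
The fast path discussed in this thread can be sketched outside zsh roughly
as follows. This is a minimal illustration, not the code in Src/utils.c:
count_chars() and its use_fast_path flag are hypothetical names, and the
sketch leaves out zsh's Meta decoding, the eptr limit, and the width
handling. It assumes, as the reply above does, that bytes 0..0x7f stand for
themselves whenever no multibyte character is pending.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not zsh's mb_metastrlenend()): count the characters
 * in the NUL-terminated multibyte string s, feeding mbrtowc() one byte at
 * a time.  When use_fast_path is nonzero, a 7-bit byte that follows a
 * complete character is counted without calling mbrtowc() at all.
 */
size_t count_chars(const char *s, int use_fast_path)
{
    mbstate_t state;
    size_t num = 0;          /* complete characters counted so far */
    size_t num_in_char = 0;  /* bytes of a pending incomplete character */
    int complete = 1;        /* 1 when no multibyte character is pending */
    const char *laststart = s;

    assert(s != NULL);
    memset(&state, 0, sizeof(state));

    while (*s) {
        unsigned char byte = (unsigned char)*s++;

        /* Fast path: assumes 0..0x7f is plain ASCII between characters. */
        if (use_fast_path && complete && byte <= 0x7f) {
            num++;
            laststart = s;
            continue;
        }

        wchar_t wc;
        char c = (char)byte;
        size_t ret = mbrtowc(&wc, &c, 1, &state);

        if (ret == (size_t)-2) {
            /* Incomplete: keep reading bytes of the same character. */
            num_in_char++;
            complete = 0;
        } else {
            if (ret == (size_t)-1) {
                /* Invalid: reset and count the first byte on its own. */
                memset(&state, 0, sizeof(state));
                s = laststart + 1;
            }
            num++;
            laststart = s;
            num_in_char = 0;
            complete = 1;
        }
    }

    /* A trailing incomplete sequence is counted as one character. */
    return num + (num_in_char ? 1 : 0);
}
```

For pure ASCII input both settings of use_fast_path give the same count;
the difference is only that the fast path avoids one mbrtowc() call per
byte, which is where the saving reported for long buffers would come from.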