zsh-workers
 help / color / mirror / code / Atom feed
* Idea for optimization (use case: iterate string with index parameter)
@ 2018-01-05 13:38 Sebastian Gniazdowski
  2018-01-05 22:23 ` Bart Schaefer
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Gniazdowski @ 2018-01-05 13:38 UTC (permalink / raw)
  To: zsh-workers

Hello,
iterating string with index parameter is quite slow, because unicode characters are skipped and counted using mbrtowc(). This function takes much CPU time. It is called theta(0.5 * n^2) times when iterating string of length n. The goal is to make this theta(n).

The idea is to avoid calling mbrtowc by holding array of 5-10 structures with:
- pointer to beginning of string
- requested index (request to getarg() / getindex())
- computed in-string pointer for that index (by getarg() / getindex())

In general, the array would hold #N (5-10 or so) last string-index requests. If new request would target the same string, but index greater by 1, getarg() would call mbrtowc() once (via MB_METACHARLEN macro) reusing the previous in-string pointer.

So the idea is to catch not too large loops (using not too many scalar indexing) into the cache. For example, I saw z-sy-h uses such loops, my projects sometimes use them too. The point is that iterating a string and doing something with letters, e.g. counting brackets, is a very common use case, and the optimization would be triggered often.

I'm sharing as maybe there's better way to do this and I'm not sure when I'll get time to write this, and maybe someone has time and energy to do it. It would solve the rather shocking revelation of a typical C programmer, that strings are iterated like lists :)
--  
Sebastian Gniazdowski
psprint /at/ zdharma.org


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-01-06  5:16 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-05 13:38 Idea for optimization (use case: iterate string with index parameter) Sebastian Gniazdowski
2018-01-05 22:23 ` Bart Schaefer
2018-01-06  5:16   ` Sebastian Gniazdowski

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).