* Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Sebastian Gniazdowski @ 2017-06-17 11:34 UTC
To: zsh-users

Hello,
I've tried to optimize my fast-syntax-highlighting. The idea is simple.
Instead of a loop:

    for cur_widget in $widgets_to_bind; do
        case $widgets[$cur_widget] in
            ...
            builtin)
                eval "_zsh_highlight_widget_${(q)prefix}-${(q)cur_widget}() { _call_widget .${(q)cur_widget} -- \"\$@\" }"
                zle -N $cur_widget _zsh_highlight_widget_$prefix-$cur_widget;;
            ...
    ...

I do, in the same loop:

    ...
    print -r "zle -N" "$prefix-$cur_widget" "${widgets[$cur_widget]#*:}" >>| ~/.fsh_cache
    ...

and so on, so that later startups only need to detect ~/.fsh_cache and
source it, skipping the loop. Times of "zsh -i -c exit" are:

- normal FSH: 0.3968 sec on average
- cache-feature FSH: 0.4329 sec on average
- zcompiled cache: 0.3831 sec on average

So only after compiling ~/.fsh_cache do I get a slightly better time;
uncompiled, it is ~30 ms slower. I would expect sourcing to always be
faster, and by a larger margin. Why isn't it? I now suspect that there
is simply more to parse: the loop doesn't have 554 lines like
~/.fsh_cache, so it is parsed more quickly.

Test data:
https://github.com/zdharma/hacking-private/tree/master/FSH

-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org
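The two strategies compared in this message can be sketched in portable shell. This is only an illustration under stated assumptions, not the fast-syntax-highlighting code: the names (accept_line, loop_wrap_*, cache_wrap_*, the temporary cache path) are hypothetical, and plain functions that echo a string stand in for the real `zle -N` bindings, which need an interactive zsh.

```shell
# Hypothetical stand-ins: real zle widget names contain hyphens, and the
# real wrappers call zle; here each wrapper just echoes a marker string.
set -- accept_line backward_char forward_char
cache_file="${TMPDIR:-/tmp}/wrapper_cache_sketch.$$"
: > "$cache_file"

for name in "$@"; do
    # Strategy 1: define the wrapper immediately with eval, inside the loop.
    eval "loop_wrap_${name}() { echo 'loop:${name}'; }"
    # Strategy 2: append an equivalent definition to a cache file instead.
    printf '%s\n' "cache_wrap_${name}() { echo 'cache:${name}'; }" >> "$cache_file"
done

# A later startup would skip the loop and only source the cache file.
. "$cache_file"
rm -f "$cache_file"

result_loop=$(loop_wrap_accept_line)
result_cache=$(cache_wrap_accept_line)
echo "$result_loop $result_cache"
```

Both strategies end with the same functions defined. The difference the thread goes on to examine is where the time goes: the cached variant moves the work from executing a short loop to reading and parsing a much longer file.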
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Bart Schaefer @ 2017-06-17 14:05 UTC
To: Sebastian Gniazdowski; +Cc: Zsh Users

You might get faster parsing if you setopt noaliases. Just a thought,
can't try it right now.
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Sebastian Gniazdowski @ 2017-06-17 14:56 UTC
To: Bart Schaefer; +Cc: Zsh Users

On 17.06.2017 at 16:06:59, Bart Schaefer (schaefer@brasslantern.com) wrote:
> You might get faster parsing if you setopt noaliases. Just a thought,
> can't try it right now.

I tried it. I think there's no change. The setup that reads ~/.fsh_cache
takes 0.414 vs 0.417 (noaliases). Earlier I reported 0.432, but that
average included a single large value of 0.599, so this time I summed 9
numbers and divided by 9 (giving 0.414). As a reminder, the normal time
is 0.396 (0.389 in a second test).

The time for zcompiled ~/.fsh_cache is 0.375 in the new test (previous:
0.383). So it's 20 ms faster than the slower normal reading of 0.396.
That almost reaches the goal of 40 ms. For someone with a startup time
of 200-250 ms, those 40 ms would matter.

-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Sebastian Gniazdowski @ 2017-06-17 15:44 UTC
To: Bart Schaefer; +Cc: Zsh Users

On 17 June 2017 at 16:56:12, Sebastian Gniazdowski (psprint@zdharma.org) wrote:
> Time for zcompiled ~/.fsh_cache is 0.375 in new test (previous: 0.383). So it's 20 ms
> compared to slower normal read 0.396. Almost reached the goal of 40 ms. For someone having
> startup time 200-250 ms the 40 ms would matter.

I've checked with zprof how much time the loop normally takes:

    19,41 19,41 36,83% 19,41 19,41 36,83% _zsh_highlight_bind_widgets

It varies between 16 ms and 20 ms. So the gain from zcompiled .fsh_cache
seems to be maximal.

-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Bart Schaefer @ 2017-06-17 16:39 UTC
To: Zsh Users

On Jun 17, 5:44pm, Sebastian Gniazdowski wrote:
}
} So the gain from zcompiled .fsh_cache seems to be maximal.

As long as you've got a good way to test these timings ... compile
your .fsh_cache file with "zcompile -R" and see if you still get
any speedup?

Some simple tests that I did seem to indicate that zcompile does
not do much good if the default "zcompile -M" behavior is disabled.
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Sebastian Gniazdowski @ 2017-06-17 17:25 UTC
To: Bart Schaefer, Zsh Users

On 17 Jun 2017 at 18:39:55, Bart Schaefer (schaefer@brasslantern.com) wrote:
> On Jun 17, 5:44pm, Sebastian Gniazdowski wrote:
> }
> } So the gain from zcompiled .fsh_cache seems to be maximal.
>
> As long as you've got a good way to test these timings ... compile
> your .fsh_cache file with "zcompile -R" and see if you still get
> any speedup?
>
> Some simple tests that I did seem to indicate that zcompile does
> not do much good if the default "zcompile -M" behavior is disabled.
>

No problem. The results seem the same: the best "zcompile" time
(previously reported) is 0.375 s, and the best "zcompile -R" time is
0.376 s. I ran this twice, worried that I had forgotten "-R" (very
unlikely): 0.374 s.

BTW, I have a wacky idea:

1. Invoke "source_prepare ~/.plugins/aplugin.plugin.zsh.zwc", etc. for
   all used plugins
2. Continue with normal zshrc
3. After it, invoke source_load with the same paths
4. source_prepare will use threads to load .zwc files
5. There will be an internal hash table mapping paths to Eprogs, plus
   mutexes
6. If the Eprog is not yet ready, source_load will block on the mutex

The point is to carry on with normal zshrc execution while the bytecode
is loaded in the background. I checked that ~/.fsh_cache.zwc is 158 kB
in size, which is quite a lot. But all this is probably a lost cause:
the mutex use and thread creation will eat up the gain. Still, it's a
cool thing to code, so I think I will do it anyway.

-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Stephane Chazelas @ 2017-06-19 12:24 UTC
To: Sebastian Gniazdowski; +Cc: zsh-users

Note:

    $ time zsh -c 'repeat 100 . ./fsh_cache'
    [...]
    ./fsh_cache:zle:269: invalid widget `.menu-select'
    ./fsh_cache:zle:269: invalid widget `.menu-select'
    zsh -c 'repeat 100 . ../hacking-private/FSH/fsh_cache'  1.13s user 0.98s system 99% cpu 2.109 total

A lot of "system" time.

    $ wc ./fsh_cache
      554  2964 58524 ./fsh_cache

    $ strace -c zsh -c '. ./fsh_cache'
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.000996           0     60022           rt_sigprocmask

60022 calls to rt_sigprocmask sounds a bit much. They seem to all come
from:

    #0  0x00007ffff730d730 in __sigprocmask (how=1, set=0x7fffffffb1c0, oset=0x7fffffffb120) at ../sysdeps/unix/sysv/linux/x86_64/sigprocmask.c:36
    #1  0x000000000049b2c0 in signal_unblock (set=...) at signals.c:274
    #2  0x00000000004580ac in shingetline () at input.c:148
    #3  0x000000000045899b in inputline () at input.c:278
    #4  0x000000000045882a in ingetc () at input.c:226
    #5  0x000000000046211e in gettok () at lex.c:611
    #6  0x000000000046183b in zshlex () at lex.c:275
    #7  0x0000000000484825 in parse_event (endtok=37) at parse.c:569
    #8  0x0000000000453f6e in loop (toplevel=0, justonce=0) at init.c:146
    #9  0x0000000000456db0 in source (s=0x708930 "../hacking-private/FSH/fsh_cache") at init.c:1386
    #10 0x0000000000425a0e in bin_dot (name=0x7ffff7ff2550 ".", argv=0x7ffff7ff25b0, ops=0x7fffffffd980, func=0) at builtin.c:5699
    #11 0x00000000004105ff in execbuiltin (args=0x7ffff7ff2580, assigns=0x0, bn=0x6dc7c0 <builtins+384>) at builtin.c:485
    #12 0x0000000000437fd4 in execcmd_exec (state=0x7fffffffe300, eparams=0x7fffffffdef0, input=0, output=0, how=18, last1=1) at exec.c:3958
    #13 0x0000000000431a50 in execpline2 (state=0x7fffffffe300, pcode=131, how=18, input=0, output=0, last1=1) at exec.c:1873
    #14 0x0000000000430665 in execpline (state=0x7fffffffe300, slcode=4098, how=18, last1=1) at exec.c:1602
    #15 0x000000000042f95a in execlist (state=0x7fffffffe300, dont_change_job=0, exiting=1) at exec.c:1360
    #16 0x000000000042efd4 in execode (p=0x7ffff7ff2488, dont_change_job=0, exiting=1, context=0x4c37a2 "cmdarg") at exec.c:1141
    #17 0x000000000042ee9c in execstring (s=0x7fffffffe772 ". ../hacking-private/FSH/fsh_cache", dont_change_job=0, exiting=1, context=0x4c37a2 "cmdarg") at exec.c:1107
    #18 0x0000000000456a61 in init_misc (cmd=0x7fffffffe772 ". ../hacking-private/FSH/fsh_cache", zsh_name=0x7fffffffe76a "zsh") at init.c:1292
    #19 0x0000000000457e8e in zsh_main (argc=3, argv=0x7fffffffe4f8) at init.c:1678
    #20 0x000000000040f7f6 in main (argc=3, argv=0x7fffffffe4f8) at ./main.c:93

Which probably explains why one gets about as many rt_sigprocmask calls
as there are bytes in the file.

    $ time zsh -c 'repeat 100 eval "$(<fsh_cache)"'

gives:

    1.18s user 0.05s system 99% cpu 1.239 total

With "only" 942 rt_sigprocmask calls according to strace -c.

There's probably scope for optimisation here, though I can't
comment further as I don't know why that signal handling code is
there in the first place.

-- 
Stephane
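The comparison in this message, sourcing a file versus slurping it into a string and evaluating that, can be reproduced in miniature. The sketch below is a hedged illustration only: the 554-line throwaway script reuses the line count reported in the thread, but its content (a counter increment per line) and the temporary path are assumptions, and `$(cat file)` is used as a portable stand-in for zsh's `$(<file)`.

```shell
# Build a throwaway 554-line script; each line is a harmless counter
# increment, not a real zle call.
script_file="${TMPDIR:-/tmp}/source_vs_eval_sketch.$$"
i=0
while [ "$i" -lt 554 ]; do
    printf '%s\n' 'count=$((count + 1))'
    i=$((i + 1))
done > "$script_file"

# Variant 1: the shell reads and parses the file itself.
count=0
. "$script_file"
sourced_count=$count

# Variant 2: read the whole file into a string first, then parse the string.
count=0
eval "$(cat "$script_file")"
evaled_count=$count

rm -f "$script_file"
echo "sourced=$sourced_count evaled=$evaled_count"
```

Both variants execute the same 554 commands, so any difference in cost comes from how the input reaches the parser. To look at the system-call side, each variant could be wrapped as in the message above, for example with `strace -c zsh -c '...'`.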
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Bart Schaefer @ 2017-06-19 15:31 UTC
To: zsh-users

On Jun 19, 1:24pm, Stephane Chazelas wrote:
}
} There's probably scope for optimisation here, though I can't
} comment further as I don't know why that signal handling code is
} there in the first place.

rt_sigprocmask should not be significantly more expensive than an
assignment to an integer.

The signal handling code is there because the shell MUST NOT respond
instantly to arbitrary signals while doing operations such as token
interpretation or memory management -- the signal handlers might
themselves invoke shell commands/functions and many of those layers
are not safe for re-entrancy -- but it MUST respond to those signals
whenever it may be blocked for an unknown length of time, such as when
reading from a file descriptor.

Many years of "I can't interrupt my script when X" or "interrupting
my script when Y causes a crash" resulted in the current signal
paradigm. When the shell was first written, processors weren't fast
enough and process scheduling not well-threaded enough to expose a
lot of these issues, but the better our computers get the greater
the likelihood of hitting an ever-smaller race condition window, so
those windows have to be aggressively closed.
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Stephane Chazelas @ 2017-06-19 16:16 UTC
To: Bart Schaefer; +Cc: zsh-users

2017-06-19 08:31:16 -0700, Bart Schaefer:
> On Jun 19, 1:24pm, Stephane Chazelas wrote:
> }
> } There's probably scope for optimisation here, though I can't
> } comment further as I don't know why that signal handling code is
> } there in the first place.
>
> rt_sigprocmask should not be significantly more expensive than an
> assignment to an integer.

Still,

    $ time zsh -c 'repeat 100 . ./fsh_cache' 2> /dev/null
    zsh -c 'repeat 100 . ./fsh_cache' 2> /dev/null  0.73s user 0.78s system 99% cpu 1.522 total
    $ time zsh -c 'repeat 100 eval "$(<fsh_cache)"' 2> /dev/null
    zsh -c 'repeat 100 eval "$(<fsh_cache)"' 2> /dev/null  0.80s user 0.04s system 99% cpu 0.848 total

See how the system time falls to almost 0 with the eval variant.

I get the same kind of performance gain if I comment out the line
that eventually calls rt_sigprocmask there: winch_unblock() (so only
for SIGWINCH).

> The signal handling code is there because the shell MUST NOT respond
> instantly to arbitrary signals while doing operations such as token
> interpretation or memory management -- the signal handlers might
> themselves invoke shell commands/functions and many of those layers
> are not safe for re-entrancy -- but it MUST respond to those signals
> whenever it may be blocked for an unknown length of time, such as when
> reading from a file descriptor.
>
> Many years of "I can't interrupt my script when X" or "interrupting
> my script when Y causes a crash" resulted in the current signal
> paradigm. When the shell was first written, processors weren't fast
> enough and process scheduling not well-threaded enough to expose a
> lot of these issues, but the better our computers get the greater
> the likelihood of hitting an ever-smaller race condition window, so
> those windows have to be aggressively closed.

I suspected it would be something like that, but note that here it's
done for every byte of the data, even though the code is read in full
chunks at a time (by stdio's fgetc).

If you look at the strace output, you see something like:

    open("/etc/zsh/zshenv", O_RDONLY|O_NOCTTY) = 3
    fcntl(3, F_DUPFD, 10) = 11
    read(11, "# /etc/zsh/zshenv: system-wide ."..., 4096) = 623
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [], 8) = 0
    [...]
    open("./fsh_cache", O_RDONLY|O_NOCTTY) = 3
    fcntl(3, F_DUPFD, 10) = 13
    read(13, "zle -N orig-s0.0000060000-r9037-"..., 4096) = 4096
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [CHLD], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [CHLD], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [CHLD], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [CHLD], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [CHLD], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [CHLD], 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [WINCH], [CHLD], 8) = 0
    [...]

Most of those rt_sigprocmask calls are unnecessary.

That defeats a benefit of stdio, which saves read() system calls by
reading in chunks, if we end up doing one system call per byte
anyway.

-- 
Stephane
* Re: Why sourcing a file is not faster than doing a loop with eval, zle -N
From: Bart Schaefer @ 2017-06-19 19:14 UTC
To: Stephane Chazelas; +Cc: zsh-users

This is now WELL into zsh-workers territory, please direct replies there
rather than to the -users list.

On Jun 19, 5:16pm, Stephane Chazelas wrote:
}
} That defeats a benefit of stdio saving read() systems calls by
} reading in chunk if we end up doing one system call per byte
} anyway.

Unfortunately we need to read from stdio one byte at a time, and as far
as I know there is no way to "ask" stdio whether it is still working on
a buffer, or is instead going to refill its buffer (and therefore
possibly block) on the next attempted getc() -- and to find out would
likely be more expensive than doing the system call. Also stdio is
*itself* not re-entrant, so we have to control signals around all stdio
operations.

Just to demonstrate why the signal handling is necessary, consider this:

    % echo $(trap '' INT; sleep 100)

That shell is now un-interruptible for 100 seconds, because readoutput()
does not do signal management around its fgetc() calls. Worse, if you
type ^Z the sleep is silently suspended and the parent is hung forever.