* Extended globbing seems to have become much slower in recent versions of Zsh @ 2016-02-28 20:18 Jesper Nygårds 2016-02-29 19:12 ` Bart Schaefer 0 siblings, 1 reply; 22+ messages in thread From: Jesper Nygårds @ 2016-02-28 20:18 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 1879 bytes --] I had a feeling that some of my scripts dealing with a large number of files had become significantly slower lately, but I haven't been sure. Today I took some time to find out whether or not I was only imagining things. The results of my experiments seem to indicate that something has changed relatively recently with regards to extended globbing. I am using a Mac for the moment, and I have currently have two versions of Zsh installed: 5.0.5 and 5.2. See below for my test run. As you can see, I do some globbing in a fairly large directory (2.4 Gb and 231748 files). Notice that with a complicated glob pattern, assigning matching files to an array takes about 33 seconds under 5.2, whereas the same assignment takes about 6 seconds under 5.0.5. The speed difference with a less complicated glob (**/*) is negligable. Is there some change that could explain this difference, and is there something to do about it? The difference in speed is so large that it affects scripts that I run regularly quite a lot. The test run, I hope it's self explanatory. All commands are run from the same directory. % print -l **/* | wc -l 231748 % du -sh . 2.4G . % /usr/local/bin/zsh --version zsh 5.2 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB % date; myfiles=(**/*); date Sun Feb 28 20:58:12 CET 2016 Sun Feb 28 20:58:27 CET 2016 % date; myfiles=((#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D)); date Sun Feb 28 20:58:39 CET 2016 Sun Feb 28 20:59:12 CET 2016 % /bin/zsh --version zsh 5.0.5 (x86_64-apple-darwin14.0) % /bin/zsh -f % setopt EXTENDED_GLOB % date; myfiles=(**/*); date Sun Feb 28 21:00:43 CET 2016 Sun Feb 28 21:00:56 CET 2016 % date; myfiles=((#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D)); date Sun Feb 28 21:01:04 CET 2016 Sun Feb 28 21:01:10 CET 2016 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-02-28 20:18 Extended globbing seems to have become much slower in recent versions of Zsh Jesper Nygårds @ 2016-02-29 19:12 ` Bart Schaefer 2016-03-01 11:39 ` Jesper Nygårds 0 siblings, 1 reply; 22+ messages in thread From: Bart Schaefer @ 2016-02-29 19:12 UTC (permalink / raw) To: Zsh Users On Feb 28, 9:18pm, Jesper Nygards wrote: } } The results of my experiments seem to indicate that something has changed } relatively recently with regards to extended globbing. There were a number of changes mostly related to correct handling of multibyte characters and single bytes in the range 128-255 appearing in file and directory names. This means some strings are passing through the "unmetafication" process more often than they were, but I can't say for sure this is the source of any slowdown. It'd be helpful if you re-ran your tests with more variations of the pattern: **/* **/*(-.D) **/*~(your long list of exclusions) (#i)**/*~(your long list of exclusions) **/*~(your long list of exclusions)(-.D) Also, first declare "float SECONDS" and then use "print $SECONDS" instead of "date", so we get a more precise measure of where the time is spent. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-02-29 19:12 ` Bart Schaefer @ 2016-03-01 11:39 ` Jesper Nygårds 2016-03-01 18:28 ` Bart Schaefer 0 siblings, 1 reply; 22+ messages in thread From: Jesper Nygårds @ 2016-03-01 11:39 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 3404 bytes --] Here's another run with your supplied patterns. I hope I didn't misunderstand how you wanted me to use the SECONDS variable. In any case, as you can see below, the globbing more or less takes the same time for the simple patterns, but as soon as I introduce the exclusions, the 5.2 version is around six times slower than the 5.0.5 version. % du -sh . 2.4G . % print -l **/*(D) | wc -l 272681 ------------------------------------------------------ % /usr/local/bin/zsh --version zsh 5.2 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % pwd /opt/intyg % setopt EXTENDED_GLOB % float SECONDS MYNOW % MYNOW=$SECONDS; myfiles=(**/*); print $(( $SECONDS - $MYNOW )) 20.477513999999996 % MYNOW=$SECONDS; myfiles=(**/*(-.D)); print $(( $SECONDS - $MYNOW )) 20.415430000000001 % MYNOW=$SECONDS; myfiles=(**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D)); print $(( $SECONDS - $MYNOW )) 36.53519 % MYNOW=$SECONDS; myfiles=((#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D)); print $(( $SECONDS - $MYNOW )) 38.467647999999997 % MYNOW=$SECONDS; myfiles=(**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)); print $(( $SECONDS - $MYNOW )) 23.871832999999981 MYNOW=$SECONDS; myfiles=((#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)); print $(( $SECONDS - $MYNOW )) 23.234887999999955 ------------------------------------------------------- % /bin/zsh --version zsh 5.0.5 (x86_64-apple-darwin14.0) % /bin/zsh -f % pwd /opt/intyg % setopt EXTENDED_GLOB % float SECONDS MYNOW % MYNOW=$SECONDS; myfiles=(**/*); print $(( $SECONDS - $MYNOW )) 15.603122999999997 % MYNOW=$SECONDS; myfiles=(**/*(-.D)); print $(( $SECONDS - $MYNOW )) 17.826644999999999 % MYNOW=$SECONDS; myfiles=(**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D)); print $(( $SECONDS - $MYNOW )) 6.5746579999999994 % MYNOW=$SECONDS; myfiles=((#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D)); print $(( $SECONDS - $MYNOW )) 6.1519139999999766 % MYNOW=$SECONDS; myfiles=(**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)); print $(( $SECONDS - $MYNOW )) 3.6434740000000261 % MYNOW=$SECONDS; myfiles=((#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)); print $(( $SECONDS - $MYNOW )) 4.0673059999999737 On Mon, Feb 29, 2016 at 8:12 PM, Bart Schaefer <schaefer@brasslantern.com> wrote: > On Feb 28, 9:18pm, Jesper Nygards wrote: > } > } The results of my experiments seem to indicate that something has changed > } relatively recently with regards to extended globbing. > > There were a number of changes mostly related to correct handling of > multibyte characters and single bytes in the range 128-255 appearing > in file and directory names. This means some strings are passing > through the "unmetafication" process more often than they were, but > I can't say for sure this is the source of any slowdown. > > It'd be helpful if you re-ran your tests with more variations of the > pattern: > > **/* > **/*(-.D) > **/*~(your long list of exclusions) > (#i)**/*~(your long list of exclusions) > **/*~(your long list of exclusions)(-.D) > > Also, first declare "float SECONDS" and then use "print $SECONDS" instead > of "date", so we get a more precise measure of where the time is spent. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-01 11:39 ` Jesper Nygårds @ 2016-03-01 18:28 ` Bart Schaefer 2016-03-01 19:11 ` Jesper Nygårds 0 siblings, 1 reply; 22+ messages in thread From: Bart Schaefer @ 2016-03-01 18:28 UTC (permalink / raw) To: Zsh Users On Mar 1, 12:39pm, Jesper Nygards wrote: } } Here's another run with your supplied patterns. I hope I didn't } misunderstand how you wanted me to use the SECONDS variable. Close enough. } In any case, as you can see below, the globbing more or less takes the same } time for the simple patterns, but as soon as I introduce the exclusions, } the 5.2 version is around six times slower than the 5.0.5 version. That tends to point to changes in pattern matching rather than globbing (related but distinct operations), but can you also try it with : pattern-goes-here instead of myfiles=( pattern-goes-here ) ?? I'd like to factor out the contributions of memory management for the array assignment. I ran a couple of quick local tests and with assignment there are about 50% more calls to malloc in the current git head than in 5.0.5, but without assignment there are slightly fewer in current. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-01 18:28 ` Bart Schaefer @ 2016-03-01 19:11 ` Jesper Nygårds 2016-03-02 0:03 ` Bart Schaefer 0 siblings, 1 reply; 22+ messages in thread From: Jesper Nygårds @ 2016-03-01 19:11 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 2585 bytes --] New runs with your suggestion: % /usr/local/bin/zsh --version zsh 5.2 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 36.463269999999994 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 45.178413000000006 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 26.012812999999994 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 23.646717999999964 ------------------------------------------------------- % /bin/zsh --version zsh 5.0.5 (x86_64-apple-darwin14.0) % /bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 13.899865999999989 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 10.239834999999999 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 5.4469640000000084 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 6.2737860000000296 On Tue, Mar 1, 2016 at 7:28 PM, Bart Schaefer <schaefer@brasslantern.com> wrote: > On Mar 1, 12:39pm, Jesper Nygards wrote: > } > } Here's another run with your supplied patterns. I hope I didn't > } misunderstand how you wanted me to use the SECONDS variable. > > Close enough. > > } In any case, as you can see below, the globbing more or less takes the > same > } time for the simple patterns, but as soon as I introduce the exclusions, > } the 5.2 version is around six times slower than the 5.0.5 version. > > That tends to point to changes in pattern matching rather than globbing > (related but distinct operations), but can you also try it with > > : pattern-goes-here > > instead of > > myfiles=( pattern-goes-here ) > > ?? I'd like to factor out the contributions of memory management for the > array assignment. I ran a couple of quick local tests and with assignment > there are about 50% more calls to malloc in the current git head than in > 5.0.5, but without assignment there are slightly fewer in current. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-01 19:11 ` Jesper Nygårds @ 2016-03-02 0:03 ` Bart Schaefer 2016-03-02 8:39 ` Jesper Nygårds 2016-03-02 9:32 ` Peter Stephenson 0 siblings, 2 replies; 22+ messages in thread From: Bart Schaefer @ 2016-03-02 0:03 UTC (permalink / raw) To: Zsh Users On Mar 1, 8:11pm, Jesper Nygards wrote: } } New runs with your suggestion: Well, that's interesting. Seems to indicate that building argv to pass to builtins is even more expensive than an array assignment. Still, I can't find anything on my system that causes as dramatic a difference as yours from 5.0.5 -> 5.2+. The (-.D) qualifiers seem to have the largest effect for me, for you its the exclusion pattern. For the sake of covering all bases, are you sure the same compiler optimization flags were applied to the 5.0.5 and 5.2 builds you are testing? With your exclusion pattern and about 13,000 files in **/*, there are about 100,000 calls to strlen() in either version, so any optimization of that might make a large difference. I did find a small but observable speedup when compiling without MULTIBYTE_SUPPORT, too; there were a lot of multibyte repairs to pattern.c after 5.0.5. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-02 0:03 ` Bart Schaefer @ 2016-03-02 8:39 ` Jesper Nygårds 2016-03-03 0:06 ` Bart Schaefer 2016-03-02 9:32 ` Peter Stephenson 1 sibling, 1 reply; 22+ messages in thread From: Jesper Nygårds @ 2016-03-02 8:39 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 4465 bytes --] Previously I was using the Apple-provided version of 5.0.5, and the Homebrew version of 5.2. Now I have instead used different versions of Homebrew zsh, and I have made sure that the compiler flags were the same through all versions. I have then run the same tests again, with versions 5.0.5, 5.0.6, 5.1 and 5.2. % /usr/local/bin/zsh --version zsh 5.0.5 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 21.925625999999998 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 15.809167000000002 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 9.8283599999999893 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 4.0556160000000006 ------------------------------------------------------- % /usr/local/bin/zsh --version zsh 5.0.6 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 51.819927999999997 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 54.338758999999996 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 35.885299000000003 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 29.573196999999993 ------------------------------------------------------- % /usr/local/bin/zsh --version zsh 5.1 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 55.068858999999989 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 55.551114999999982 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 31.761401000000035 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 33.984271000000035 ------------------------------------------------------- % /usr/local/bin/zsh --version zsh 5.2 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 46.693610000000007 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 48.288181999999992 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 31.169990000000013 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 25.161075000000011 On Wed, Mar 2, 2016 at 1:03 AM, Bart Schaefer <schaefer@brasslantern.com> wrote: > On Mar 1, 8:11pm, Jesper Nygards wrote: > } > } New runs with your suggestion: > > Well, that's interesting. Seems to indicate that building argv to pass > to builtins is even more expensive than an array assignment. > > Still, I can't find anything on my system that causes as dramatic a > difference as yours from 5.0.5 -> 5.2+. The (-.D) qualifiers seem to > have the largest effect for me, for you its the exclusion pattern. > > For the sake of covering all bases, are you sure the same compiler > optimization flags were applied to the 5.0.5 and 5.2 builds you are > testing? With your exclusion pattern and about 13,000 files in **/*, > there are about 100,000 calls to strlen() in either version, so any > optimization of that might make a large difference. > > I did find a small but observable speedup when compiling without > MULTIBYTE_SUPPORT, too; there were a lot of multibyte repairs to > pattern.c after 5.0.5. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-02 8:39 ` Jesper Nygårds @ 2016-03-03 0:06 ` Bart Schaefer 2016-03-04 8:17 ` Jesper Nygårds 0 siblings, 1 reply; 22+ messages in thread From: Bart Schaefer @ 2016-03-03 0:06 UTC (permalink / raw) To: Zsh Users On Wed, Mar 2, 2016 at 12:39 AM, Jesper Nygårds <jesper.nygards@gmail.com> wrote: > Previously I was using the Apple-provided version of 5.0.5, and the > Homebrew version of 5.2. Now I have instead used different versions of > Homebrew zsh, and I have made sure that the compiler flags were the same > through all versions. I have then run the same tests again, with versions > 5.0.5, 5.0.6, 5.1 and 5.2. OK, thanks; next thing to try: configure --with-stack-allocation I've scanned through diffs and I just don't see any structural changes that would account for that slowdown; the above is a bit of a longshot. Incidentally (for -workers) there's a typo in ChangeLog -- 32932: Src/glob.c -- should say Src/subst.c. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-03 0:06 ` Bart Schaefer @ 2016-03-04 8:17 ` Jesper Nygårds 2016-03-04 13:22 ` Jesper Nygårds 0 siblings, 1 reply; 22+ messages in thread From: Jesper Nygårds @ 2016-03-04 8:17 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 2679 bytes --] I have run 5.2, compiled with and without the --with-stack-allocation flag. It made no difference however. You can see that the runs below are very similar in the measured times. With "--with-stack-allocation": % zsh --version zsh 5.2 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 54.868589 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 47.001608000000019 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 38.805462000000006 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 32.296016000000009 ------------------------------------------------------- Without "--with-stack-allocation": % /usr/local/bin/zsh --version zsh 5.2 (x86_64-apple-darwin14.5.0) % /usr/local/bin/zsh -f % setopt EXTENDED_GLOB; float SECONDS MYNOW % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 53.115647000000003 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); print $(( $SECONDS - $MYNOW )) 42.988532000000006 % MYNOW=$SECONDS; : **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 36.619570999999993 % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 30.786290000000008 On Thu, Mar 3, 2016 at 1:06 AM, Bart Schaefer <schaefer@brasslantern.com> wrote: > On Wed, Mar 2, 2016 at 12:39 AM, Jesper Nygårds > <jesper.nygards@gmail.com> wrote: > > Previously I was using the Apple-provided version of 5.0.5, and the > > Homebrew version of 5.2. Now I have instead used different versions of > > Homebrew zsh, and I have made sure that the compiler flags were the same > > through all versions. I have then run the same tests again, with versions > > 5.0.5, 5.0.6, 5.1 and 5.2. > > OK, thanks; next thing to try: configure --with-stack-allocation > > I've scanned through diffs and I just don't see any structural changes > that would account for that slowdown; the above is a bit of a > longshot. > > Incidentally (for -workers) there's a typo in ChangeLog -- 32932: > Src/glob.c -- should say Src/subst.c. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-04 8:17 ` Jesper Nygårds @ 2016-03-04 13:22 ` Jesper Nygårds 2016-03-04 14:03 ` Peter Stephenson 2016-03-05 17:37 ` Bart Schaefer 0 siblings, 2 replies; 22+ messages in thread From: Jesper Nygårds @ 2016-03-04 13:22 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 5107 bytes --] I don't know if this is relevant, but I have some more findings. I wanted to know which sub directory was contributing the most to the amount of time taken to process the root directory. I then realized that the sum of the time it took to process each sub directory separately was much lower than processing the whole root directory at once. In the run below, you can see that whilst it takes about 36 seconds to process the root directory, it only takes about 13 seconds to process all directories one at a time. Furthermore, when I descend into the sub directory taking the longest time in the first run, and run all its sub directories in sequence, the sum of these sub-sub directories is significantly lower than for the whole sub directory. So obviously the processing time is not linear with the number of files. % float SECONDS MYNOW; % MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )) 35.810789000000113 % for dir in */; do print $dir; cd $dir; MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )); cd -q - >|/dev/null; done common-pom/ 0.00011700000004566391 common.wiki/ 0.00033800000028350041 common/ 1.0273979999997209 intygstjanst-konfiguration/ 0.00071800000023358734 intygstjanst/ 0.08040000000028158 intygstyper/ 9.1739539999998669 minaintyg-konfiguration/ 0.00063700000009703217 minaintyg/ 0.67290499999990061 privatlakarportal-konfiguration/ 0.0015870000001996232 privatlakarportal/ 0.83380199999965043 schemas/ 0.11772999999993772 tools/ 0.0098799999996117549 webcert-konfiguration/ 0.0018289999998160056 webcert/ 1.4324129999999968 % cd intygstyper % for dir in */; do print $dir; cd $dir; MYNOW=$SECONDS; : (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); print $(( $SECONDS - $MYNOW )); cd -q - >|/dev/null; done fk/ 1.9322819999997591 fk7263/ 0.37481900000011592 target/ 0.17797399999972185 tools/ 0.0080309999998462445 ts-bas/ 0.31286899999986417 ts-diabetes/ 0.22946499999989101 ts-parent/ 0.0033199999998032581 On Fri, Mar 4, 2016 at 9:17 AM, Jesper Nygårds <jesper.nygards@gmail.com> wrote: > I have run 5.2, compiled with and without the --with-stack-allocation > flag. It made no difference however. You can see that the runs below are > very similar in the measured times. > > With "--with-stack-allocation": > > % zsh --version > zsh 5.2 (x86_64-apple-darwin14.5.0) > > % /usr/local/bin/zsh -f > % setopt EXTENDED_GLOB; float SECONDS MYNOW > > % MYNOW=$SECONDS; : > **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); > print $(( $SECONDS - $MYNOW )) > 54.868589 > > % MYNOW=$SECONDS; : > (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); > print $(( $SECONDS - $MYNOW )) > 47.001608000000019 > > % MYNOW=$SECONDS; : > **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); > print $(( $SECONDS - $MYNOW )) > 38.805462000000006 > > % MYNOW=$SECONDS; : > (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); > print $(( $SECONDS - $MYNOW )) > 32.296016000000009 > > ------------------------------------------------------- > > Without "--with-stack-allocation": > > % /usr/local/bin/zsh --version > zsh 5.2 (x86_64-apple-darwin14.5.0) > > % /usr/local/bin/zsh -f > % setopt EXTENDED_GLOB; float SECONDS MYNOW > > % MYNOW=$SECONDS; : > **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); > print $(( $SECONDS - $MYNOW )) > 53.115647000000003 > > % MYNOW=$SECONDS; : > (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)(-.D); > print $(( $SECONDS - $MYNOW )) > 42.988532000000006 > > % MYNOW=$SECONDS; : > **/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); > print $(( $SECONDS - $MYNOW )) > 36.619570999999993 > > % MYNOW=$SECONDS; : > (#i)**/*~(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*); > print $(( $SECONDS - $MYNOW )) > 30.786290000000008 > > > On Thu, Mar 3, 2016 at 1:06 AM, Bart Schaefer <schaefer@brasslantern.com> > wrote: > >> On Wed, Mar 2, 2016 at 12:39 AM, Jesper Nygårds >> <jesper.nygards@gmail.com> wrote: >> > Previously I was using the Apple-provided version of 5.0.5, and the >> > Homebrew version of 5.2. Now I have instead used different versions of >> > Homebrew zsh, and I have made sure that the compiler flags were the same >> > through all versions. I have then run the same tests again, with >> versions >> > 5.0.5, 5.0.6, 5.1 and 5.2. >> >> OK, thanks; next thing to try: configure --with-stack-allocation >> >> I've scanned through diffs and I just don't see any structural changes >> that would account for that slowdown; the above is a bit of a >> longshot. >> >> Incidentally (for -workers) there's a typo in ChangeLog -- 32932: >> Src/glob.c -- should say Src/subst.c. >> > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-04 13:22 ` Jesper Nygårds @ 2016-03-04 14:03 ` Peter Stephenson 2016-03-04 14:20 ` Peter Stephenson 2016-03-05 17:37 ` Bart Schaefer 1 sibling, 1 reply; 22+ messages in thread From: Peter Stephenson @ 2016-03-04 14:03 UTC (permalink / raw) To: Zsh Users On Fri, 04 Mar 2016 14:22:23 +0100 Jesper Nygårds <jesper.nygards@gmail.com> wrote: > I don't know if this is relevant, but I have some more findings. I wanted > to know which sub directory was contributing the most to the amount of time > taken to process the root directory. I then realized that the sum of the > time it took to process each sub directory separately was much lower than > processing the whole root directory at once. In the run below, you can see > that whilst it takes about 36 seconds to process the root directory, it > only takes about 13 seconds to process all directories one at a time. > Furthermore, when I descend into the sub directory taking the longest time > in the first run, and run all its sub directories in sequence, the sum of > these sub-sub directories is significantly lower than for the whole sub > directory. So obviously the processing time is not linear with the number > of files. You'd think that would point towards something higher level than the pattern matching itself, anyway. Brainstorming [this is a euphemism for I haven't the first clue what I'm talking about]... - Memory management associated with multiple directories; could be allocation or heap management. - Poor structuring of globbing code meaning it's repeating operations, i.e. it's not walking trees in a sensible fashion. - Some auxiliary operation associated with the file tree that's giving a similar effect to the foregoing (i.e. we walk the tree itself OK but then repeat some associated operation that should be cached). - Some form of context switching problem [now it's really getting silly] Most of these guesses ought to be amenable to tracing through some sort of profiling suite. Even some fprintfs to check what directories we've handled at what point might help. pws ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-04 14:03 ` Peter Stephenson @ 2016-03-04 14:20 ` Peter Stephenson 2016-03-04 21:49 ` Peter Stephenson 0 siblings, 1 reply; 22+ messages in thread From: Peter Stephenson @ 2016-03-04 14:20 UTC (permalink / raw) To: Peter Stephenson; +Cc: Zsh Users On Fri, 4 Mar 2016 14:03:42 +0000 Peter Stephenson <p.stephenson@samsung.com> wrote: > - Memory management associated with multiple directories; could be > allocation or heap management. I'm forgetting it's not simply speed of glob/pattern code, it's speed of certain forms of glob/pattern code. So tree structure problems don't really fit. Another possible clue here: the exclusion modifiers require additional allocations when running the platform --- see line 3037 of pattern.c (there's another one later). This is currently permanently rather than heap allocated. It's easy to change to heap, but I wouldn't suggest that --- with any kind of complicated backtracking this would eat memory like there's no tomorrow. It would be possible to use local pools of memory for this, however, so it didn't get allocated and freed every single time it was needed. I think we only need one allocation for each occurrence of an exclusion, and usually there are very few in a pattern, so this isn't hard to manage. This might be an optimisation worth having anyway. Although this hasn't changed significantly since I put the pattern code in roughly its current form *many* years ago, 1999 apparently, it's hairy enough that it's quite possible it could trip over pathologies in interacting with the system and libraries. (So brainstorming wasn't quite so futile after all...) pws ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-04 14:20 ` Peter Stephenson @ 2016-03-04 21:49 ` Peter Stephenson 2016-03-05 17:47 ` Bart Schaefer 0 siblings, 1 reply; 22+ messages in thread From: Peter Stephenson @ 2016-03-04 21:49 UTC (permalink / raw) To: Zsh Users On Fri, 04 Mar 2016 14:20:46 +0000 Peter Stephenson <p.stephenson@samsung.com> wrote: > Another possible clue here: the exclusion modifiers require additional > allocations when running the platform --- see line 3037 of pattern.c > (there's another one later). This is currently permanently rather than > heap allocated. It's easy to change to heap, but I wouldn't suggest > that --- with any kind of complicated backtracking this would eat memory > like there's no tomorrow. > > It would be possible to use local pools of memory for this, however, so > it didn't get allocated and freed every single time it was needed. I > think we only need one allocation for each occurrence of an exclusion, > and usually there are very few in a pattern, so this isn't hard to > manage. This might be an optimisation worth having anyway. This is what I meant. It may not be related to the slow down in question, but it should speed up repeated use of exclusion patterns in any case. If you have DEBUG turned on you get an output for the maximum number of pool entries that were in use at one time in the globbing tests, which is currently 4. That's a measure of how complcated repeated patterns with exclusion that required backtracking became. 4 suggests the pool method is eminently practical. I ran the globtests under valgrind and it reported no errors. Follow-ups on the details of this patch should go to zsh-workers. diff --git a/Src/glob.c b/Src/glob.c index 2051016..3aa5a3c 100644 --- a/Src/glob.c +++ b/Src/glob.c @@ -1827,11 +1827,14 @@ zglob(LinkList list, LinkNode np, int nountok) sizeof(struct gmatch)); matchct = 0; pattrystart(); + patopt_glob_use(); /* The actual processing takes place here: matches go into * * matchbuf. This is the only top-level call to scanner(). */ scanner(q, shortcircuit); + patopt_glob_release(); + /* Deal with failures to match depending on options */ if (matchct) badcshglob |= 2; /* at least one cmd. line expansion O.K. */ diff --git a/Src/pattern.c b/Src/pattern.c index 72c7d97..348779c 100644 --- a/Src/pattern.c +++ b/Src/pattern.c @@ -73,7 +73,7 @@ /* * The following union is used mostly for alignment purposes. - * Normal nodes are longs, while certain nodes take a char * as an argument; + * Normal nodes are longs, while certain nodes take a pointer as an argument; * here we make sure that they both work out to the same length. * The compiled regexp we construct consists of upats stuck together; * anything else to be added (strings, numbers) is stuck after and @@ -84,6 +84,7 @@ union upat { long l; unsigned char *p; + Patopt_ref ref; }; typedef union upat *Upat; @@ -1852,10 +1853,129 @@ static void patoptail(long p, long val) /* + * Optimisation of memory allocation for pattern exclusion. + * This is to reduce the amount of allocation we do when + * running a platform. We'll keep a pool available for + * this purpose. We also optimise so that we never free + * or reduce string sizes across globbing for a particular + * pattern as this will have similar uses multiple times. + */ + +/* The size of the pool */ +#define PATOPT_MEM_POOL_SIZE (16) +/* The maximum length of entries in the pool we'll retain normally */ +#define PATOPT_MEM_MAX_LEN (128) + +/* Flag that we are globbing and should keep memory */ +static int patopt_glob; +/* The pool */ +struct patopt_mem patopt_pool[PATOPT_MEM_POOL_SIZE]; +#ifdef DEBUG +/* The most pool entries we've usd at once */ +static int patopt_pool_max_used; +#endif + +static void +patopt_free_excess(void) +{ + int iref; + Patopt_mem entry; + for (iref = 0, entry = patopt_pool; + iref < PATOPT_MEM_POOL_SIZE; + iref++, entry++) { + if (!entry->in_use /* wow, that's paranoid */ + && entry->len > PATOPT_MEM_MAX_LEN) + { + free(entry->mem.ptr); + entry->mem.ptr = NULL; + entry->len = 0; + } + } +} + +/**/ +mod_export void +patopt_glob_use(void) +{ + patopt_glob++; +} + +/**/ +mod_export void +patopt_glob_release(void) +{ + if (!--patopt_glob) + patopt_free_excess(); + DPUTS(patopt_glob < 0, "unmatched patopt_glob_release call"); +} + +/**/ +static Patopt_ref +patopt_mem_alloc(size_t allocsize) +{ + /* + * The strategy here is we find the first unused entry and if + * it's too small we resize it immediately, on the basis that + * there will be many calls requiring the same (maximum) allocation + * size. + */ + int iref; + Patopt_mem entry; + Patopt_ref ref; + for (iref = 0, entry = patopt_pool; + iref < PATOPT_MEM_POOL_SIZE; + iref++, entry++) { + if (!entry->in_use) { + /* Grab this one */ + entry->in_use = 1; + entry->mem.ref = iref; + if (entry->len < allocsize) { + /* Handles null */ + entry->mem.ptr = (unsigned char *) + zrealloc(entry->mem.ptr, allocsize); + entry->len = allocsize; + } +#ifdef DEBUG + if (iref+1 > patopt_pool_max_used) { + patopt_pool_max_used = iref + 1; + setiparam("ZSH_PATOPT_POOL_MAX_USED", patopt_pool_max_used); + } +#endif + return &entry->mem; + } + } + /* + * Overflowed pool, just allocate normally, except we need + * the reference structure as well which we'll tack on the + * front. + */ + ref = (Patopt_ref)zalloc(sizeof(struct patopt_ref) + allocsize); + ref->ptr = (unsigned char *)ref + sizeof(struct patopt_ref); + ref->ref = -1; + return ref; +} + + +static void +patopt_mem_free(Patopt_ref ref) +{ + Patopt_mem entry; + if (ref->ref < 0) { + free((void *)ref); + return; + } + entry = patopt_pool + ref->ref; + entry->in_use = 0; +} + +/* * Run a pattern. + * + * P.S. this code is not re-entrant... */ static char *patinstart; /* Start of input string */ static char *patinend; /* End of input string */ +static int patintotlen; /* Full length of input string */ static char *patinput; /* String input pointer */ static char *patinpath; /* Full path for use with ~ exclusions */ static int patinlen; /* Length of last successful match. @@ -2291,6 +2411,7 @@ pattryrefs(Patprog prog, char *string, int stringlen, int unmetalenin, patflags = prog->flags; patinend = patinstart + stringlen; + patintotlen = stringlen; /* * From now on we do not require NULL termination of * the test string. There should also be no more references @@ -2302,7 +2423,7 @@ pattryrefs(Patprog prog, char *string, int stringlen, int unmetalenin, * Either we are testing against a pure string, * or we can match anything at all. */ - int ret, pstrlen; + int pstrlen; char *pstr; if (patstralloc->alloced) { @@ -2404,8 +2525,6 @@ pattryrefs(Patprog prog, char *string, int stringlen, int unmetalenin, } } } - - return ret; } else { int q = queue_signal_level(); @@ -2456,6 +2575,7 @@ pattryrefs(Patprog prog, char *string, int stringlen, int unmetalenin, dont_queue_signals(); + patopt_glob_use(); if (patmatch((Upat)progstr)) { /* * we were lazy and didn't save the globflags if an exclusion @@ -2594,11 +2714,11 @@ pattryrefs(Patprog prog, char *string, int stringlen, int unmetalenin, ret = 1; } else ret = 0; + patopt_glob_release(); restore_queue_signals(q); - - return ret; } + return ret; } /* @@ -2947,9 +3067,9 @@ patmatch(Upat prog) after = P_OPERAND(scan); DPUTS(!P_ISEXCLUDE(after), "BUG: EXCSYNC not followed by EXCLUDE."); - DPUTS(!P_OPERAND(after)->p, + DPUTS(!P_OPERAND(after)->ref, "BUG: EXCSYNC not handled by EXCLUDE"); - syncptr = P_OPERAND(after)->p + (patinput - patinstart); + syncptr = P_OPERAND(after)->ref->ptr + (patinput - patinstart); /* * If we already matched from here, this time we fail. * See WBRANCH code for story about error count. @@ -3018,7 +3138,7 @@ patmatch(Upat prog) */ Upat syncstrp; char *origpatinend; - unsigned char *oldsyncstr; + Patopt_ref oldsyncstr; char *matchpt = NULL; int ret, savglobdots, matchederrs = 0; int savparsfound = parsfound; @@ -3032,9 +3152,15 @@ patmatch(Upat prog) * (foo~bar)(foo~bar)... from the exclusion point * of view, so we use a different sync string. */ - oldsyncstr = syncstrp->p; - syncstrp->p = (unsigned char *) - zshcalloc((patinend - patinstart) + 1); + oldsyncstr = syncstrp->ref; + /* + * Allocate maximum length we could need, in + * case we do this repeatedly, so we don't + * to reallocate. + */ + syncstrp->ref = patopt_mem_alloc(patintotlen+1); + memset(syncstrp->ref->ptr, 0, + (patinend - patinstart) + 1); origpatinend = patinend; while ((ret = patmatch(P_OPERAND(scan)))) { unsigned char *syncpt; @@ -3054,9 +3180,11 @@ patmatch(Upat prog) * possibilities for approximation have been * checked.) */ - for (syncpt = syncstrp->p; !*syncpt; syncpt++) + for (syncpt = syncstrp->ref->ptr; + !*syncpt; + syncpt++) ; - synclen = syncpt - syncstrp->p; + synclen = syncpt - syncstrp->ref->ptr; if (patinstart + synclen != patinend) { /* * Temporarily mark the string as @@ -3129,9 +3257,8 @@ patmatch(Upat prog) patglobflags = savglobflags; errsfound = saverrsfound; } - zfree((char *)syncstrp->p, - (patinend - patinstart) + 1); - syncstrp->p = oldsyncstr; + patopt_mem_free(syncstrp->ref); + syncstrp->ref = oldsyncstr; if (ret) { patinput = matchpt; errsfound = matchederrs; @@ -3141,7 +3268,8 @@ patmatch(Upat prog) P_ISEXCLUDE(scan)) ; } else { - int ret = 1, pfree = 0; + int ret = 1; + Patopt_ref pfree = NULL; Upat ptrp = NULL; unsigned char *ptr; if (P_OP(scan) == P_WBRANCH) { @@ -3162,12 +3290,13 @@ patmatch(Upat prog) */ opnd = P_OPERAND(scan); ptrp = opnd++; - if (!ptrp->p) { - ptrp->p = (unsigned char *) - zshcalloc((patinend - patinstart) + 1); - pfree = 1; + if (!ptrp->ref) { + pfree = ptrp->ref = + patopt_mem_alloc(patintotlen+1); + memset(pfree->ptr, 0, + (patinend - patinstart) + 1); } - ptr = ptrp->p + (patinput - patinstart); + ptr = ptrp->ref->ptr + (patinput - patinstart); /* * Without approximation, this is just a @@ -3188,9 +3317,8 @@ patmatch(Upat prog) if (ret) ret = patmatch(opnd); if (pfree) { - zfree((char *)ptrp->p, - (patinend - patinstart) + 1); - ptrp->p = NULL; + patopt_mem_free(pfree); + ptrp->ref = NULL; } if (ret) return 1; diff --git a/Src/zsh.h b/Src/zsh.h index eee31da..34bdc50 100644 --- a/Src/zsh.h +++ b/Src/zsh.h @@ -1613,6 +1613,38 @@ typedef struct zpc_disables_save *Zpc_disables_save; #define PP_UNKWN 20 /* Range: token followed by the (possibly multibyte) start and end */ #define PP_RANGE 21 +/* + * A reference to memory that may be from the pool or not. + * For convenience in use it contains the reference to say where it + * has come from as well as the memory. + * + * If this was from the pool, ptr is allocated from zalloc(). + * + * If this was not from the pool, ref is -1, the structure is + * allocated form zalloc() with ptr pointing sizeof(the structure) + * bytes into that chunk. + */ + +/**/ +struct patopt_ref +{ + /* Reference to a pool entry, or -1 if from zalloc() */ + int ref; + /* + * Pointer to the memory. + */ + unsigned char *ptr; +}; +typedef struct patopt_ref *Patopt_ref; + +/* A chunk of emory from the pool */ +struct patopt_mem +{ + size_t len; + struct patopt_ref mem; + int in_use; +}; +typedef struct patopt_mem *Patopt_mem; /* * Argument to get_match_ret() in glob.c diff --git a/Test/D02glob.ztst b/Test/D02glob.ztst index a6b704a..76209a0 100644 --- a/Test/D02glob.ztst +++ b/Test/D02glob.ztst @@ -7,7 +7,11 @@ : >glob.tmp/{,{dir1,dir2}/}{a,b,c} globtest () { - $ZTST_testdir/../Src/zsh -f $ZTST_srcdir/../Misc/$1 + $ZTST_testdir/../Src/zsh -fc " + . $ZTST_srcdir/../Misc/$1 + if [[ -n \$ZSH_PATOPT_POOL_MAX_USED ]]; then + print -u $ZTST_fd \"Max patopt_pool use: \$ZSH_PATOPT_POOL_MAX_USED\" + fi" } regress_absolute_path_and_core_dump() { ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-04 21:49 ` Peter Stephenson @ 2016-03-05 17:47 ` Bart Schaefer 2016-03-06 18:10 ` Peter Stephenson 0 siblings, 1 reply; 22+ messages in thread From: Bart Schaefer @ 2016-03-05 17:47 UTC (permalink / raw) To: Zsh Users On Mar 4, 9:49pm, Peter Stephenson wrote: } Subject: Re: Extended globbing seems to have become much slower in recent } } This is what I meant. It may not be related to the slow down in } question, but it should speed up repeated use of exclusion patterns in } any case. Sadly, on my desktop this actually SLOWS DOWN the process even further. The below is for a glob of 13,000 files (largest/deepest tree I have handy) with "repeat 3" and Jesper's pattern: zsh-5.0.5: 1.35s user 0.38s system 99% cpu 1.738 total zsh-5.0.5: 1.17s user 0.21s system 98% cpu 1.394 total zsh-5.0.5: 1.29s user 0.19s system 99% cpu 1.481 total Src/zsh: 1.44s user 0.21s system 99% cpu 1.657 total Src/zsh: 1.58s user 0.25s system 99% cpu 1.847 total Src/zsh: 1.45s user 0.23s system 99% cpu 1.689 total Src/zsh+u21352: 1.63s user 0.26s system 99% cpu 1.896 total Src/zsh+u21352: 1.43s user 0.25s system 99% cpu 1.692 total Src/zsh+u21352: 1.68s user 0.31s system 99% cpu 1.990 total ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-05 17:47 ` Bart Schaefer @ 2016-03-06 18:10 ` Peter Stephenson 2016-03-07 9:59 ` Jesper Nygårds 0 siblings, 1 reply; 22+ messages in thread From: Peter Stephenson @ 2016-03-06 18:10 UTC (permalink / raw) To: Zsh Users On Sat, 5 Mar 2016 09:47:23 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > On Mar 4, 9:49pm, Peter Stephenson wrote: > } Subject: Re: Extended globbing seems to have become much slower in recent > } > } This is what I meant. It may not be related to the slow down in > } question, but it should speed up repeated use of exclusion patterns in > } any case. > > Sadly, on my desktop this actually SLOWS DOWN the process even further. > The below is for a glob of 13,000 files (largest/deepest tree I have > handy) with "repeat 3" and Jesper's pattern: That suggests either I've made a mistake in the optimisation portion (which I didn't explicit test, only that it gave the correct behaviour), or the repeated allocation ad nfreeing of similar length strings is close to optimal in allocation already... or a combination. pws ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-06 18:10 ` Peter Stephenson @ 2016-03-07 9:59 ` Jesper Nygårds 2016-03-07 10:15 ` Peter Stephenson 0 siblings, 1 reply; 22+ messages in thread From: Jesper Nygårds @ 2016-03-07 9:59 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 2173 bytes --] I see no real difference with or without Peter's patch, I'm afraid. I will incorporate the suggestions from Bart in my solution. But I feel I haven't explained very well (or at all) what I'm trying to achieve. I use this kind of globbing in a sort of "recursive grep" which I use interactively to grep in my project's codebase. The inputs to the script include both patterns that I search for and patterns that I want to exclude. Up until now, I have composed the list of exclusions from the input without separating ordinary files from directories, but Bart's tip made me aware that such a separation would be worth doing to make it faster. As a side note, since I have implemented this in zsh, I followed what I thought was "zsh best practice", and composed this rather complicated file list with a zsh glob expression, instead of relying on an external command such as gnu "find". But after I discovered this bottle neck in my scripts, I have experimented with instead letting "find" find the files, and indeed that seems a whole lot faster. I mention that just to be sure that I am not causing you to waste your time if my use case is so extreme that it's not worth bothering about its performance. On Sun, Mar 6, 2016 at 7:10 PM, Peter Stephenson < p.w.stephenson@ntlworld.com> wrote: > On Sat, 5 Mar 2016 09:47:23 -0800 > Bart Schaefer <schaefer@brasslantern.com> wrote: > > On Mar 4, 9:49pm, Peter Stephenson wrote: > > } Subject: Re: Extended globbing seems to have become much slower in > recent > > } > > } This is what I meant. It may not be related to the slow down in > > } question, but it should speed up repeated use of exclusion patterns in > > } any case. > > > > Sadly, on my desktop this actually SLOWS DOWN the process even further. > > The below is for a glob of 13,000 files (largest/deepest tree I have > > handy) with "repeat 3" and Jesper's pattern: > > That suggests either I've made a mistake in the optimisation portion > (which I didn't explicit test, only that it gave the correct behaviour), > or the repeated allocation ad nfreeing of similar length strings is > close to optimal in allocation already... or a combination. > > pws > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-07 9:59 ` Jesper Nygårds @ 2016-03-07 10:15 ` Peter Stephenson 0 siblings, 0 replies; 22+ messages in thread From: Peter Stephenson @ 2016-03-07 10:15 UTC (permalink / raw) To: Zsh Users On Mon, 07 Mar 2016 10:59:31 +0100 Jesper Nygårds <jesper.nygards@gmail.com> wrote: > As a side note, since I have implemented this in zsh, I followed what I > thought was "zsh best practice", and composed this rather complicated file > list with a zsh glob expression, instead of relying on an external command > such as gnu "find". But after I discovered this bottle neck in my scripts, > I have experimented with instead letting "find" find the files, and indeed > that seems a whole lot faster. I mention that just to be sure that I am not > causing you to waste your time if my use case is so extreme that it's not > worth bothering about its performance. "find" is much more optimised for, well, finding files, rather than doing general globbing / pattern matching tasks, so I think that makes perfect sense if that's what your basic job is. I often do that myself for a very deep tree. However, that doesn't mean everything on the zsh side is as good as it can be. I won't commit my patch unless I can find some way it's clearly not doing what I intended that I can fix --- but it's useful to know the way it's done at the moment isn't obviously causing problems as I've always wondered. pws ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-04 13:22 ` Jesper Nygårds 2016-03-04 14:03 ` Peter Stephenson @ 2016-03-05 17:37 ` Bart Schaefer 2016-03-06 17:31 ` Bart Schaefer 1 sibling, 1 reply; 22+ messages in thread From: Bart Schaefer @ 2016-03-05 17:37 UTC (permalink / raw) To: Zsh Users On Mar 4, 2:22pm, Jesper Nygards wrote: } } I don't know if this is relevant, but I have some more findings. I wanted } to know which sub directory was contributing the most to the amount of time } taken to process the root directory. I then realized that the sum of the } time it took to process each sub directory separately was much lower than } processing the whole root directory at once. [...] } So obviously the processing time is not linear with the number } of files. Since exclusions are applied after globbing is completed, the processing time for exclusions is proportional to the sum of the lengths of all the file paths found by the recursive glob. If you glob individual sub- directories, you're not excluding based on the names of the container directories. E.g. you should find that x=(**/*) y=(${x:#(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)) takes about the same amount of time as your **/*~pat formulation, and x=(**/*(:t)) y=(${x:#(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)) takes about the same amount of time as your loop over subdirs. I still don't see a reason why pattern matching would have slowed down. On the other hand, your pattern could be a lot more precise. You have it starting with (|*/) and ending with (|/*) which means that you do want to exclude some directories, but obviously not all of the strings in the middle are directory names. If I were to guess that the directory names are build, target, node, and node_modules, then things might go a lot faster if you used **/^(.idea|*.min.js|*.iml|TAGS)~(|*/)(build|target|node|node_modules)(|/*) and even faster with skipdirs() { reply=( ${REPLY%((build|target|node|node_modules)(|/*)} ) } **/^(.idea|*.min.js|*.iml|TAGS)(+skipdirs) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-05 17:37 ` Bart Schaefer @ 2016-03-06 17:31 ` Bart Schaefer 2016-03-21 23:05 ` Bart Schaefer 0 siblings, 1 reply; 22+ messages in thread From: Bart Schaefer @ 2016-03-06 17:31 UTC (permalink / raw) To: Zsh Users On Mar 5, 9:37am, Bart Schaefer wrote: } } E.g. you should find that } } x=(**/*) } y=(${x:#(|*/)(build|target|node|node_modules|.idea|*.min.js|*.iml|TAGS)(|/*)) } } takes about the same amount of time as your **/*~pat formulation Curiously, the ${x:#...} construct is considerably faster than **/*~(...) for me, even on 5.0.5. I suspected briefly that signal handling changes (so that long-runing recursive globs are interruptible) might be having an effect, but the same code path through the signal queuing should be taken with both the glob exclusion and the parameter substitution. As I've mentioned, on my Linux desktop the majority of time is spent in resolving the (-.D) qualifier, but that's partly because the (-) is pulling in a new set of subdirectories so more files are found. I'm beginning to think we need some kind of C-code-level profiling tool like gprof to tell us where the time is being spent. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-06 17:31 ` Bart Schaefer @ 2016-03-21 23:05 ` Bart Schaefer 0 siblings, 0 replies; 22+ messages in thread From: Bart Schaefer @ 2016-03-21 23:05 UTC (permalink / raw) To: Zsh Users On Mar 6, 9:31am, Bart Schaefer wrote: } } I suspected briefly that signal handling changes (so that long-runing } recursive globs are interruptible) might be having an effect, but the } same code path through the signal queuing should be taken with both the } glob exclusion and the parameter substitution. Just in case ... if workers/38188 has an effect on this (in either direction) it might be useful to know that. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-02 0:03 ` Bart Schaefer 2016-03-02 8:39 ` Jesper Nygårds @ 2016-03-02 9:32 ` Peter Stephenson 2016-03-02 23:46 ` Bart Schaefer 1 sibling, 1 reply; 22+ messages in thread From: Peter Stephenson @ 2016-03-02 9:32 UTC (permalink / raw) To: Zsh Users On Tue, 01 Mar 2016 16:03:56 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > On Mar 1, 8:11pm, Jesper Nygards wrote: > } > } New runs with your suggestion: > > Well, that's interesting. Seems to indicate that building argv to pass > to builtins is even more expensive than an array assignment. > > Still, I can't find anything on my system that causes as dramatic a > difference as yours from 5.0.5 -> 5.2+. The (-.D) qualifiers seem to > have the largest effect for me, for you its the exclusion pattern. >... > I did find a small but observable speedup when compiling without > MULTIBYTE_SUPPORT, too; there were a lot of multibyte repairs to > pattern.c after 5.0.5. There was a fairly recent optimsiation to handle patterns by metafying earlier in the case of parameter substitutions. Globbing shouldn't have been substantially affected although the order of operations is a bit different. I don't see how this could be relevant, though. pws ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Extended globbing seems to have become much slower in recent versions of Zsh 2016-03-02 9:32 ` Peter Stephenson @ 2016-03-02 23:46 ` Bart Schaefer 0 siblings, 0 replies; 22+ messages in thread From: Bart Schaefer @ 2016-03-02 23:46 UTC (permalink / raw) To: Peter Stephenson; +Cc: Zsh Users On Wed, Mar 2, 2016 at 1:32 AM, Peter Stephenson <p.stephenson@samsung.com> wrote: > > There was a fairly recent optimsiation to handle patterns by metafying > earlier in the case of parameter substitutions. Globbing shouldn't have > been substantially affected although the order of operations is a bit > different. I don't see how this could be relevant, though. That change also pre-dates the 5.2 release, so may explain the improvement from 5.1 to 5.2 but not the slowdown from 5.0.5 to 5.0.6. ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2016-03-21 23:05 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-02-28 20:18 Extended globbing seems to have become much slower in recent versions of Zsh Jesper Nygårds 2016-02-29 19:12 ` Bart Schaefer 2016-03-01 11:39 ` Jesper Nygårds 2016-03-01 18:28 ` Bart Schaefer 2016-03-01 19:11 ` Jesper Nygårds 2016-03-02 0:03 ` Bart Schaefer 2016-03-02 8:39 ` Jesper Nygårds 2016-03-03 0:06 ` Bart Schaefer 2016-03-04 8:17 ` Jesper Nygårds 2016-03-04 13:22 ` Jesper Nygårds 2016-03-04 14:03 ` Peter Stephenson 2016-03-04 14:20 ` Peter Stephenson 2016-03-04 21:49 ` Peter Stephenson 2016-03-05 17:47 ` Bart Schaefer 2016-03-06 18:10 ` Peter Stephenson 2016-03-07 9:59 ` Jesper Nygårds 2016-03-07 10:15 ` Peter Stephenson 2016-03-05 17:37 ` Bart Schaefer 2016-03-06 17:31 ` Bart Schaefer 2016-03-21 23:05 ` Bart Schaefer 2016-03-02 9:32 ` Peter Stephenson 2016-03-02 23:46 ` Bart Schaefer
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).