Fwd: Re: Brace expansion performance

zsh-users
 help / color / mirror / code / Atom feed

* Fwd: Re: Brace expansion performance
@ 2010-03-22 20:00 Radoulov, Dimitre
  2010-03-23  4:18 ` Bart Schaefer
  0 siblings, 1 reply; 3+ messages in thread
From: Radoulov, Dimitre @ 2010-03-22 20:00 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 5854 bytes --]


On 22/03/2010 18.03, Nikolai Weibull wrote:
> On Mon, Mar 22, 2010 at 17:24, Radoulov, Dimitre<cichomitiko@gmail.com>  wrote:
>
>    
>> does anybody know why the zsh brace expansion (?) performs so ...
>> differently?
>>      
> Ouch.
>
> Zsh is faster for
>
> for (( i = 1; i<= 300000; i++ )); do :; done,
>
> though.
>    

Yes,
it's not the expansion by itself:

% time bash --norc  -c ': {1..300000}'
bash --norc -c ': {1..300000}'  0.76s user 0.54s system 95% cpu 1.357 total

% time zsh -fc ': {1..300000}'
zsh -fc ': {1..300000}'  0.12s user 0.16s system 96% cpu 0.300 total

It's not the for loop:

% time bash --norc -c 'for ((;++i<=300000;)); do :;done'
bash --norc -c 'for ((;++i<=300000;)); do :;done'  2.62s user 1.99s 
system 89% cpu 5.135 total

% time zsh -fc 'for ((;++i<=300000;)); do :;done'
zsh -fc 'for ((;++i<=300000;)); do :;done'  0.83s user 1.94s system 82% 
cpu 3.345 total

It seams that it's that particular combination with the for loop:

% strace -c zsh -fc 'for i in {1..1000}; do :;done'
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  69.44    0.063650          21      3003           rt_sigprocmask
  15.84    0.014518          15       967       960 fcntl64
   7.21    0.006606          21       317       286 open
   1.69    0.001550          39        40           close
   1.45    0.001328          30        45           mmap2
   1.03    0.000942          32        29           fstat64
   0.95    0.000870          40        22           read
   0.53    0.000483          54         9           munmap
   0.51    0.000472         236         2           readlink
   0.51    0.000470          94         5           dup
   0.44    0.000400         400         1           time
   0.41    0.000378          34        11        11 access
   0.00    0.000000           0         1           execve
   0.00    0.000000           0         1           getpid
   0.00    0.000000           0         3           alarm
   0.00    0.000000           0         1           pipe
   0.00    0.000000           0         3           brk
   0.00    0.000000           0         6         1 ioctl
   0.00    0.000000           0         1           getppid
   0.00    0.000000           0         1           getrusage
   0.00    0.000000           0         1           gettimeofday
   0.00    0.000000           0         2           uname
   0.00    0.000000           0        12           mprotect
   0.00    0.000000           0         4           _llseek
   0.00    0.000000           0         7           rt_sigaction
   0.00    0.000000           0        16           getrlimit
   0.00    0.000000           0        20        18 stat64
   0.00    0.000000           0         2           getuid32
   0.00    0.000000           0         1           getgid32
   0.00    0.000000           0         1           geteuid32
   0.00    0.000000           0         1           getegid32
   0.00    0.000000           0         1           set_thread_area
   0.00    0.000000           0         2           socket
   0.00    0.000000           0         2         2 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.091667                  4540      1278 total


% strace -c bash --norc -c 'for i in {1..1000}; do :;done'
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  42.65    0.006292          43       147       125 open
  24.51    0.003615        3615         1           execve
  12.77    0.001884          31        60           brk
   5.58    0.000823         274         3           munmap
   5.49    0.000810          31        26           mmap2
   4.92    0.000726          33        22           close
   4.07    0.000601          30        20           fstat64
   0.00    0.000000           0         7           read
   0.00    0.000000           0         1           time
   0.00    0.000000           0         1           getpid
   0.00    0.000000           0         5         5 access
   0.00    0.000000           0         1           getppid
   0.00    0.000000           0         1           getpgrp
   0.00    0.000000           0         1           gettimeofday
   0.00    0.000000           0         1           uname
   0.00    0.000000           0         7           mprotect
   0.00    0.000000           0         8           rt_sigaction
   0.00    0.000000           0         6           rt_sigprocmask
   0.00    0.000000           0         1           getrlimit
   0.00    0.000000           0        27        22 stat64
   0.00    0.000000           0         1           getuid32
   0.00    0.000000           0         1           getgid32
   0.00    0.000000           0         1           geteuid32
   0.00    0.000000           0         1           getegid32
   0.00    0.000000           0         1           getgroups32
   0.00    0.000000           0         1           set_thread_area
------ ----------- ----------- --------- --------- ----------------
100.00    0.014751                   352       152 total




On 22/03/2010 18.22, Bart Schaefer wrote:

[...]
> First make sure "zsh -c" isn't reading some expensive init files that
> bash/ksh aren't, but:
>
> I'm pretty sure zsh is actually allocating an array of 300000 integers
> during expansion of that expression, and probably copying it a few times.
>
> schaefer[633] strace bash -c 'echo $BASH_VERSION;echo $BASH_VERSION;for i in {1..300000}; do :;done' |&  wc
>      322    1262   17792
> schaefer[634] strace Src/zsh -fc 'echo $ZSH_VERSION;for i in {1..300000}; do :;done' |&  wc
>   901381 5408181 43590584

Yes, thanks!
Could this explain such a big difference in the timings?



Regards
Dimitre


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Fwd: Re: Brace expansion performance
  2010-03-22 20:00 Fwd: Re: Brace expansion performance Radoulov, Dimitre
@ 2010-03-23  4:18 ` Bart Schaefer
  2010-03-23  8:26   ` Radoulov, Dimitre
  0 siblings, 1 reply; 3+ messages in thread
From: Bart Schaefer @ 2010-03-23  4:18 UTC (permalink / raw)
  To: zsh-users

On Mar 22,  9:00pm, Radoulov, Dimitre wrote:
}
} It seams that it's that particular combination with the for loop:
} 
} % strace -c zsh -fc 'for i in {1..1000}; do :;done'

[...]

} On 22/03/2010 18.22, Bart Schaefer wrote:
} > I'm pretty sure zsh is actually allocating an array of 300000
} > integers during expansion of that expression, and probably copying
} > it a few times.
} 
} Yes, thanks!
} Could this explain such a big difference in the timings?

Turns out that's not quite it.  Do you happen to know whether your zsh
is compiled with --enable-zsh-mem ?  With --enable-zsh-mem-debug ?

If you look at "ltrace -c" instead of "strace" for that same 1000
integers, here's zsh:

% time     seconds  usecs/call     calls      function
------ ----------- ----------- --------- --------------------
 35.83    2.273416         403      5630 strlen
 35.59    2.258366         751      3006 sigprocmask
  6.55    0.415719          48      8594 memset
  4.67    0.296125          48      6055 memcpy
  3.78    0.239621          48      4964 strcmp
  3.57    0.226469          48      4630 mbrtowc
  3.48    0.220989          48      4556 strcpy
  1.55    0.098170          48      2004 fflush
  1.52    0.096682          48      2004 __errno_location
  0.93    0.059187          48      1221 strchr
  0.80    0.050600          50      1001 sprintf
  0.76    0.048399          48      1001 ferror
  0.76    0.048374          48      1000 strcat
(snipped)

And here's bash:

% time     seconds  usecs/call     calls      function
------ ----------- ----------- --------- --------------------
 29.58    3.282781         124     26449 malloc
 29.39    3.261934         125     25933 free
 25.14    2.790494         171     16238 strcpy
  5.14    0.570544          48     11695 __ctype_get_mb_cur_max
  4.83    0.536449          48     11012 memcpy
  2.67    0.295913          48      6101 strcmp
  2.20    0.244207          48      5003 strchr
  0.44    0.048769          48      1002 memset
  0.44    0.048475          48      1000 ferror
  0.08    0.008553        8553         1 qsort
  0.03    0.002880          48        59 mbrtowc
  0.01    0.000828          48        17 sigemptyset
  0.01    0.000698          87         8 sigaction
(snipped)

Note that bash does significantly more calls to malloc/free, but
does much less signal handling -- and doesn't use stdio at all; and
zsh is spending a lot of calls zeroing out memory and doing multibyte
conversion, which bash doesn't do anywhere near as often.  I suspect
this is down to whether the shell stores things in native multibyte
internally or not.

However, for smaller loops I tried, zsh was consistently faster --
it's not until you approach six digits of loop bound that zsh begins
to fall behind, at least on my hardware.  So I think this is a case
of the shells having optimized for different things, and for what I'd
presume are the more common cases, zsh does just fine.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Fwd: Re: Brace expansion performance
  2010-03-23  4:18 ` Bart Schaefer
@ 2010-03-23  8:26   ` Radoulov, Dimitre
  0 siblings, 0 replies; 3+ messages in thread
From: Radoulov, Dimitre @ 2010-03-23  8:26 UTC (permalink / raw)
  To: zsh-users

On 23/03/2010 5.18, Bart Schaefer wrote:
> On Mar 22,  9:00pm, Radoulov, Dimitre wrote:
> }
> } It seams that it's that particular combination with the for loop:
> }
> } % strace -c zsh -fc 'for i in {1..1000}; do :;done'
>
> [...]
>
> } On 22/03/2010 18.22, Bart Schaefer wrote:
> }>  I'm pretty sure zsh is actually allocating an array of 300000
> }>  integers during expansion of that expression, and probably copying
> }>  it a few times.
> }
> } Yes, thanks!
> } Could this explain such a big difference in the timings?
>
> Turns out that's not quite it.  Do you happen to know whether your zsh
> is compiled with --enable-zsh-mem ?  With --enable-zsh-mem-debug ?
>
>    

Zsh-mem was not enabled. Anyway, with enable-zsh-mem (on my notebook, 
Ubuntu on VirtualBox) I get similar timings.


> If you look at "ltrace -c" instead of "strace" for that same 1000
> integers, here's zsh:
>
> % time     seconds  usecs/call     calls      function
> ------ ----------- ----------- --------- --------------------
>   35.83    2.273416         403      5630 strlen
>   35.59    2.258366         751      3006 sigprocmask
>    6.55    0.415719          48      8594 memset
>    4.67    0.296125          48      6055 memcpy
>    3.78    0.239621          48      4964 strcmp
>    3.57    0.226469          48      4630 mbrtowc
>    3.48    0.220989          48      4556 strcpy
>    1.55    0.098170          48      2004 fflush
>    1.52    0.096682          48      2004 __errno_location
>    0.93    0.059187          48      1221 strchr
>    0.80    0.050600          50      1001 sprintf
>    0.76    0.048399          48      1001 ferror
>    0.76    0.048374          48      1000 strcat
> (snipped)
>
> And here's bash:
>
> % time     seconds  usecs/call     calls      function
> ------ ----------- ----------- --------- --------------------
>   29.58    3.282781         124     26449 malloc
>   29.39    3.261934         125     25933 free
>   25.14    2.790494         171     16238 strcpy
>    5.14    0.570544          48     11695 __ctype_get_mb_cur_max
>    4.83    0.536449          48     11012 memcpy
>    2.67    0.295913          48      6101 strcmp
>    2.20    0.244207          48      5003 strchr
>    0.44    0.048769          48      1002 memset
>    0.44    0.048475          48      1000 ferror
>    0.08    0.008553        8553         1 qsort
>    0.03    0.002880          48        59 mbrtowc
>    0.01    0.000828          48        17 sigemptyset
>    0.01    0.000698          87         8 sigaction
> (snipped)
>
> Note that bash does significantly more calls to malloc/free, but
> does much less signal handling -- and doesn't use stdio at all; and
> zsh is spending a lot of calls zeroing out memory and doing multibyte
> conversion, which bash doesn't do anywhere near as often.  I suspect
> this is down to whether the shell stores things in native multibyte
> internally or not.
>
> However, for smaller loops I tried, zsh was consistently faster --
> it's not until you approach six digits of loop bound that zsh begins
> to fall behind, at least on my hardware.  So I think this is a case
> of the shells having optimized for different things, and for what I'd
> presume are the more common cases, zsh does just fine.
>    

Thank you for the explanation.



Regards
Dimitre


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-03-23  8:33 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-22 20:00 Fwd: Re: Brace expansion performance Radoulov, Dimitre
2010-03-23  4:18 ` Bart Schaefer
2010-03-23  8:26   ` Radoulov, Dimitre

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).