* ZSH performance regression in 5.8.1.2-test @ 2022-04-25 18:16 Jordan Patterson 2022-04-25 18:56 ` Bart Schaefer 2022-04-26 1:08 ` Bart Schaefer 0 siblings, 2 replies; 30+ messages in thread From: Jordan Patterson @ 2022-04-25 18:16 UTC (permalink / raw) To: zsh-workers [-- Attachment #1: Type: text/plain, Size: 1369 bytes --] Hi: This recent fix has led to a performance regression in zsh: 49792: Non-interative shell input is line buffered. I had noticed that my shell was loading slower and found this Gentoo bug report (https://bugs.gentoo.org/839900). They had recently picked up this patch to apply to the current 5.8.1 version. I've benchmarked loading zsh with the test release and a couple of previous versions with my zsh configuration: Benchmark 1: prefix/5.8/bin/zsh -i -c exit Time (mean ± σ): 262.6 ms ± 10.7 ms [User: 184.7 ms, System: 66.4 ms] Range (min … max): 246.7 ms … 279.4 ms 11 runs Benchmark 2: prefix/5.8.1/bin/zsh -i -c exit Time (mean ± σ): 226.5 ms ± 6.0 ms [User: 155.7 ms, System: 57.4 ms] Range (min … max): 216.6 ms … 238.5 ms 13 runs Benchmark 3: prefix/5.8.1.2-test/bin/zsh -i -c exit Time (mean ± σ): 2.088 s ± 0.027 s [User: 0.503 s, System: 1.562 s] Range (min … max): 2.056 s … 2.149 s 10 runs Summary 'prefix/5.8.1/bin/zsh -i -c exit' ran 1.16 ± 0.06 times faster than 'prefix/5.8/bin/zsh -i -c exit' 9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit' I've also generated some flamegraphs of the same commands, which I've attached. The test version now spends most of its time doing the reads. Jordan [-- Attachment #2: zsh-5.8.1.svg --] [-- Type: image/svg+xml, Size: 224043 bytes --] [-- Attachment #3: zsh-5.8.1.2-test.svg --] [-- Type: image/svg+xml, Size: 256770 bytes --] [-- Attachment #4: zsh-5.8.svg --] [-- Type: image/svg+xml, Size: 260490 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-25 18:16 ZSH performance regression in 5.8.1.2-test Jordan Patterson @ 2022-04-25 18:56 ` Bart Schaefer 2022-04-25 19:20 ` Stephane Chazelas 2022-04-26 1:08 ` Bart Schaefer 1 sibling, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-25 18:56 UTC (permalink / raw) To: Jordan Patterson; +Cc: Zsh hackers list On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote: > > This recent fix has led to a performance regression in zsh: 49792: > Non-interative shell input is line buffered. Hm. If we use stdio for speed, we have memory management re-entrance issues that can lead to a crash. If we use direct read of more than one character at a time, we can't enforce line buffering. Is there a way we can detect the case where we need to line-buffer and avoid it otherwise? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-25 18:56 ` Bart Schaefer @ 2022-04-25 19:20 ` Stephane Chazelas 2022-04-25 21:27 ` Bart Schaefer 0 siblings, 1 reply; 30+ messages in thread From: Stephane Chazelas @ 2022-04-25 19:20 UTC (permalink / raw) To: Bart Schaefer; +Cc: Jordan Patterson, Zsh hackers list 2022-04-25 11:56:59 -0700, Bart Schaefer: > On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote: > > > > This recent fix has led to a performance regression in zsh: 49792: > > Non-interative shell input is line buffered. > > Hm. If we use stdio for speed, we have memory management re-entrance > issues that can lead to a crash. > > If we use direct read of more than one character at a time, we can't > enforce line buffering. > > Is there a way we can detect the case where we need to line-buffer and > avoid it otherwise? [...] Sorry if I'm beside the point as I don't know the context here, but in cases where it's important that we don't read past the newline character that delimits an input line (like for the read builtin), some other shells, when the input is seekable do read by blocks (instead of one byte at a time), and seek back to just after the newline when they've read too much. That explains why using "read" on regular files can be significantly slower in zsh than in ksh/bash. ksh93 goes even further (but that causes bugs, and I'm not suggesting zsh goes there) in that the second "read" on the same fd reuses information that was retrieved from the previous read. It also explains why ksh93 uses socketpair() instead of pipe() for its pipes on Linux, as you can /peek/ at their contents without consuming them. Again, that comes with its own set of problems. -- Stephane ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-25 19:20 ` Stephane Chazelas @ 2022-04-25 21:27 ` Bart Schaefer 2022-04-26 7:01 ` Bart Schaefer 0 siblings, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-25 21:27 UTC (permalink / raw) To: Bart Schaefer, Jordan Patterson, Zsh hackers list On Mon, Apr 25, 2022 at 12:20 PM Stephane Chazelas <stephane@chazelas.org> wrote: > > 2022-04-25 11:56:59 -0700, Bart Schaefer: > > On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote: > > > > > > This recent fix has led to a performance regression in zsh: 49792: > > > Non-interative shell input is line buffered. Jumping back to this for a moment: > > > Summary > > > 'prefix/5.8.1/bin/zsh -i -c exit' ran > > > 1.16 ± 0.06 times faster than 'prefix/5.8/bin/zsh -i -c exit' > > > 9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit' I hadn't parsed closely before that 5.8.1 is faster than 5.8. That means avoiding stdio is faster if we don't have to do line-buffering. > > Is there a way we can detect the case where we need to line-buffer and > > avoid it otherwise? > > Sorry if I'm beside the point as I don't know the context here, The context is shell command input, that is, any of 1) loading init files 2) source or . command 3) zsh < file 4) command | zsh Have I missed any? The bug from https://bugs.gentoo.org/839900 crops up in a special case of 3 or 4 when the input runs a sub-command that itself wants to read from stdin. In that case the shell is supposed to have stopped reading at a newline, leaving the rest of the original input available to be consumed by the sub-command. >[...] some other shells, when the input is seekable do read by > blocks (instead of one byte at a time), and seek back to just > after the newline when they've read too much. Theoretically we can block-read with impunity in cases 1 and 2 (anyone disagree?). Testing for seek-ability would allow doing the "read too much and back up" trick in case 3. I don't immediately see any way to avoid reading one byte at a time in case 4, does anyone have a suggestion? It does not appear from some quick tests that lseek(SHIN, 0, SEEK_CUR) is guaranteed to succeed in case 3 nor to fail with ESPIPE in case 4. How to distinguish them? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-25 21:27 ` Bart Schaefer @ 2022-04-26 7:01 ` Bart Schaefer 2022-04-26 8:31 ` Peter Stephenson ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Bart Schaefer @ 2022-04-26 7:01 UTC (permalink / raw) To: Jordan Patterson, Zsh hackers list [-- Attachment #1: Type: text/plain, Size: 391 bytes --] On Mon, Apr 25, 2022 at 2:27 PM Bart Schaefer <schaefer@brasslantern.com> wrote: > > Theoretically we can block-read with impunity in cases 1 and 2 (anyone > disagree?). Testing for seek-ability would allow doing the "read too > much and back up" trick in case 3. I don't immediately see any way to > avoid reading one byte at a time in case 4, does anyone have a > suggestion? Try this? [-- Attachment #2: sysread.txt --] [-- Type: text/plain, Size: 1100 bytes --] diff --git a/Src/input.c b/Src/input.c index c59232681..f904427a0 100644 --- a/Src/input.c +++ b/Src/input.c @@ -217,12 +217,36 @@ shinbufrestore(void) static int shingetchar(void) { - int nread; + int nread, rsize = 1; if (shinbufptr < shinbufendptr) return STOUC(*shinbufptr++); - +#ifdef HAVE_FSTAT + else { + struct stat st; + if (fstat(SHIN, &st) == 0 && !S_ISFIFO(st.st_mode)) + rsize = SHINBUFSIZE; + } +#endif shinbufreset(); + if (rsize > 1) { + do { + errno = 0; + nread = read(SHIN, shinbuffer, rsize); + } while (nread < 0 && errno == EINTR); + if (nread <= 0) + return -1; + if (isset(SHINSTDIN) && + (shinbufendptr = memchr(shinbuffer, '\n', nread))) { + shinbufendptr++; + rsize = (shinbufendptr - shinbuffer); + if (nread > rsize && + lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0) + zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno); + } else + shinbufendptr = shinbuffer + nread; + return STOUC(*shinbufptr++); + } for (;;) { errno = 0; nread = read(SHIN, shinbufendptr, 1); ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 7:01 ` Bart Schaefer @ 2022-04-26 8:31 ` Peter Stephenson 2022-04-27 0:33 ` Bart Schaefer 2022-04-27 14:11 ` Stephane Chazelas 2022-04-26 14:31 ` Jun. T 2022-04-27 19:54 ` Jordan Patterson 2 siblings, 2 replies; 30+ messages in thread From: Peter Stephenson @ 2022-04-26 8:31 UTC (permalink / raw) To: Zsh hackers list > On 26 April 2022 at 08:01 Bart Schaefer <schaefer@brasslantern.com> wrote: > On Mon, Apr 25, 2022 at 2:27 PM Bart Schaefer <schaefer@brasslantern.com> wrote: > > > > Theoretically we can block-read with impunity in cases 1 and 2 (anyone > > disagree?). Testing for seek-ability would allow doing the "read too > > much and back up" trick in case 3. I don't immediately see any way to > > avoid reading one byte at a time in case 4, does anyone have a > > suggestion? > > Try this? So the supposition is if it's not a pipe, providing we can detect that, it can be optimised to a block read an a seek. That ought to work most of the time; presumably the thing is to think of nasty edge cases where the test isn't good enough. You probably get away with it most of the time even there, in practice. Not sure how much difference it makes but to avoid additional system calls we could cache the result, invalidating it when we open a new file. (We could even cache whether a seek unexpectedly failed, but that might be getting too paranoid.) pws ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 8:31 ` Peter Stephenson @ 2022-04-27 0:33 ` Bart Schaefer 2022-04-27 14:11 ` Stephane Chazelas 1 sibling, 0 replies; 30+ messages in thread From: Bart Schaefer @ 2022-04-27 0:33 UTC (permalink / raw) To: Peter Stephenson; +Cc: Zsh hackers list On Tue, Apr 26, 2022 at 1:32 AM Peter Stephenson <p.w.stephenson@ntlworld.com> wrote: > > Not sure how much difference it makes but to avoid additional system calls > we could cache the result, invalidating it when we open a new file. I considered that but there's no good way to isolate it to input.c ... SHIN could be the same FD number across open/close/open, for example. However, I think isset(SHINSTDIN) could be tested earlier in shingetchar to reduce the number of system calls. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 8:31 ` Peter Stephenson 2022-04-27 0:33 ` Bart Schaefer @ 2022-04-27 14:11 ` Stephane Chazelas 2022-04-27 15:02 ` Bart Schaefer 1 sibling, 1 reply; 30+ messages in thread From: Stephane Chazelas @ 2022-04-27 14:11 UTC (permalink / raw) To: Peter Stephenson; +Cc: Zsh hackers list 2022-04-26 09:31:55 +0100, Peter Stephenson: [...] > Not sure how much difference it makes but to avoid additional system calls > we could cache the result, invalidating it when we open a new file. (We > could even cache whether a seek unexpectedly failed, but that might be > getting too paranoid.) [...] FYI, see https://github.com/att/ast/issues/15 for the kind of issue you can find with ksh93 caching read data. IMO, if the point is make sure we leave the position within the input past the delimiter of the line we have just read, so that other things can read it if they want, then it's wrong to cache, as those other things could just as well overwrite things. It's also probably not going to get you that much of a benefit as the OS will have cached the data as well. -- Stephane ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 14:11 ` Stephane Chazelas @ 2022-04-27 15:02 ` Bart Schaefer 2022-04-27 15:07 ` Peter Stephenson 0 siblings, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-27 15:02 UTC (permalink / raw) To: Peter Stephenson, Zsh hackers list On Wed, Apr 27, 2022 at 7:12 AM Stephane Chazelas <stephane@chazelas.org> wrote: > > 2022-04-26 09:31:55 +0100, Peter Stephenson: > [...] > > Not sure how much difference it makes but to avoid additional system calls > > we could cache the result, invalidating it when we open a new file. > > FYI, see https://github.com/att/ast/issues/15 for the kind of > issue you can find with ksh93 caching read data. I think the idea was to cache the result of the "can we lseek" test, not the data that was read. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 15:02 ` Bart Schaefer @ 2022-04-27 15:07 ` Peter Stephenson 2022-04-27 15:17 ` Bart Schaefer 0 siblings, 1 reply; 30+ messages in thread From: Peter Stephenson @ 2022-04-27 15:07 UTC (permalink / raw) To: Zsh hackers list > On 27 April 2022 at 16:02 Bart Schaefer <schaefer@brasslantern.com> wrote: > On Wed, Apr 27, 2022 at 7:12 AM Stephane Chazelas <stephane@chazelas.org> wrote: > > > > 2022-04-26 09:31:55 +0100, Peter Stephenson: > > [...] > > > Not sure how much difference it makes but to avoid additional system calls > > > we could cache the result, invalidating it when we open a new file. > > > > FYI, see https://github.com/att/ast/issues/15 for the kind of > > issue you can find with ksh93 caching read data. > > I think the idea was to cache the result of the "can we lseek" test, > not the data that was read. Yes, I wouldn't expect that to change until we reopen the channel, which is detectable locally within input.c as we need to set up the pointers at the same time. pws ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 15:07 ` Peter Stephenson @ 2022-04-27 15:17 ` Bart Schaefer 0 siblings, 0 replies; 30+ messages in thread From: Bart Schaefer @ 2022-04-27 15:17 UTC (permalink / raw) To: Peter Stephenson; +Cc: Zsh hackers list On Wed, Apr 27, 2022 at 8:08 AM Peter Stephenson <p.w.stephenson@ntlworld.com> wrote: > > > I think the idea was to cache the result of the "can we lseek" test, > > Yes, I wouldn't expect that to change until we reopen the channel, > which is detectable locally within input.c as we need to set up > the pointers at the same time. All assignments to SHIN are in init.c, so it'd probably at least be necessary to pass an argument to shinbufalloc() et al. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 7:01 ` Bart Schaefer 2022-04-26 8:31 ` Peter Stephenson @ 2022-04-26 14:31 ` Jun. T 2022-04-26 15:15 ` Peter Stephenson 2022-04-27 0:38 ` Bart Schaefer 2022-04-27 19:54 ` Jordan Patterson 2 siblings, 2 replies; 30+ messages in thread From: Jun. T @ 2022-04-26 14:31 UTC (permalink / raw) To: zsh-workers > 2022/04/26 16:01, Bart Schaefer <schaefer@brasslantern.com> wrote: > > Try this? > <sysread.txt> --- a/Src/input.c +++ b/Src/input.c (snip) +#ifdef HAVE_FSTAT + else { + struct stat st; + if (fstat(SHIN, &st) == 0 && !S_ISFIFO(st.st_mode)) + rsize = SHINBUFSIZE; + } +#endif It works for printf '%s\n' 'echo $$' sh 'echo $$' | zsh But in theory (yes, just in theory), SHIN can be a socket. With two terminals A and B (zsh/socket already zmodload'ed): A% zsocket -l /tmp/tmpsocket A% zsocket -a $REPLY B% zsocket /tmp/tmpsocket A% zsh <&$REPLY B% printf '%s\n' 'echo $$' sh 'echo $$' >&$REPLY then on the terminal A: zsh: lseek(0, -11): illegal seek Adding !S_ISSOCK(st.st_mode) solves this, of course. But at least on my Mac the following seems to work also: if (lseek(SHIN, 0, SEEK_CUR) == 0) rsize = SHINBUFSIZE; # Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails # when it shouldn't fail, or does not fail when it should fail? If you think using fstat() is safer then I have no objection. PS Either fstat()/S_ISFIFO() or lseek() works when SHIN is a FIFO: A% mkfifo fifo A% zsh < fifo B% printf '%s\n' 'echo $$' sh 'echo $$' > fifo ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 14:31 ` Jun. T @ 2022-04-26 15:15 ` Peter Stephenson 2022-04-27 0:55 ` Bart Schaefer 2022-04-27 0:38 ` Bart Schaefer 1 sibling, 1 reply; 30+ messages in thread From: Peter Stephenson @ 2022-04-26 15:15 UTC (permalink / raw) To: zsh-workers > On 26 April 2022 at 15:31 "Jun. T" <takimoto-j@kba.biglobe.ne.jp> wrote: > But at least on my Mac the following seems to work also: > > if (lseek(SHIN, 0, SEEK_CUR) == 0) > rsize = SHINBUFSIZE; > > # Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails > # when it shouldn't fail, or does not fail when it should fail? We could use some variant of the following as a configure test for safety. (Not sure if the socket test is comprehensive enough, but we must be really into edge cases at that point.) If we get the equivalent of lseek on pipe failed lseek on UNIX domain socket failed lseek on regular file succeeded presumably we're safe to use lseek as a test, and if we don't we can decide whether to fall back to fstat or just leave unoptimised. pws #include <stdio.h> #include <sys/types.h> #include <fcntl.h> #include <unistd.h> #include <sys/socket.h> int main(int argc, char **argv) { int pipefd[2], fd; if (pipe(pipefd) < 0) { printf("creating pipe failed\n"); } else if (lseek(pipefd[0], 0, SEEK_CUR) == (off_t)-1) { printf("lseek on pipe failed\n"); } else { printf("lseek on pipe succeeded\n"); } close(pipefd[0]); close(pipefd[1]); fd = socket(AF_UNIX, SOCK_STREAM, 0); if (fd < 0) { printf("creating UNIX domain socket failed\n"); } else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1) { printf("lseek on UNIX domain socket failed\n"); } else { printf("lseek on UNIX domain socket succeeded\n"); } close(fd); fd = open("seekfiletest.tmp", O_CREAT); if (fd < 0) { printf("creating file failed\n"); } else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1) { printf("lseek on regular file failed\n"); } else { printf("lseek on regular file succeeded\n"); } close(fd); return 0; } ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 15:15 ` Peter Stephenson @ 2022-04-27 0:55 ` Bart Schaefer 2022-04-27 9:16 ` Jun T 0 siblings, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-27 0:55 UTC (permalink / raw) To: Zsh hackers list [-- Attachment #1: Type: text/plain, Size: 285 bytes --] On Tue, Apr 26, 2022 at 8:16 AM Peter Stephenson <p.w.stephenson@ntlworld.com> wrote: > > We could use some variant of the following as a configure test for safety. I will be happiest if someone else plugs that in to configure.ac. Here's the patch again, replacing fstat with lseek. [-- Attachment #2: sysread.txt --] [-- Type: text/plain, Size: 1062 bytes --] diff --git a/Src/input.c b/Src/input.c index c59232681..8e0a24f92 100644 --- a/Src/input.c +++ b/Src/input.c @@ -217,12 +217,31 @@ shinbufrestore(void) static int shingetchar(void) { - int nread; + int nread, rsize = isset(SHINSTDIN) ? 1 : SHINBUFSIZE; if (shinbufptr < shinbufendptr) return STOUC(*shinbufptr++); - + else if (rsize == 1 && lseek(SHIN, 0, SEEK_CUR) == 0) + rsize = SHINBUFSIZE; shinbufreset(); + if (rsize > 1) { + do { + errno = 0; + nread = read(SHIN, shinbuffer, rsize); + } while (nread < 0 && errno == EINTR); + if (nread <= 0) + return -1; + if (isset(SHINSTDIN) && + (shinbufendptr = memchr(shinbuffer, '\n', nread))) { + shinbufendptr++; + rsize = (shinbufendptr - shinbuffer); + if (nread > rsize && + lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0) + zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno); + } else + shinbufendptr = shinbuffer + nread; + return STOUC(*shinbufptr++); + } for (;;) { errno = 0; nread = read(SHIN, shinbufendptr, 1); ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 0:55 ` Bart Schaefer @ 2022-04-27 9:16 ` Jun T 0 siblings, 0 replies; 30+ messages in thread From: Jun T @ 2022-04-27 9:16 UTC (permalink / raw) To: zsh-workers > 2022/04/27 9:55, Bart Schaefer <schaefer@brasslantern.com> wrote: > > On Tue, Apr 26, 2022 at 8:16 AM Peter Stephenson > <p.w.stephenson@ntlworld.com> wrote: >> >> We could use some variant of the following as a configure test for safety. > > I will be happiest if someone else plugs that in to configure.ac. Do you mean that trying lseek() for a just opened fd is enough? If so, I will add it to configure.ac. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 14:31 ` Jun. T 2022-04-26 15:15 ` Peter Stephenson @ 2022-04-27 0:38 ` Bart Schaefer 2022-04-27 9:34 ` Peter Stephenson 1 sibling, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-27 0:38 UTC (permalink / raw) To: Jun. T; +Cc: Zsh hackers list On Tue, Apr 26, 2022 at 7:37 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote: > > But at least on my Mac the following seems to work also: > > if (lseek(SHIN, 0, SEEK_CUR) == 0) > rsize = SHINBUFSIZE; > > # Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails > # when it shouldn't fail, or does not fail when it should fail? I was pretty sure I'd found a case (on Ubuntu 20.04) where it succeeded on a pipe when there was already data written to the pipe before the seek was attempted. That was only/exactly for (0, SEEK_CUR). However, I can check that again. If that is the false-success mode, it won't be tested by PWS's little C program in workers/50108 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 0:38 ` Bart Schaefer @ 2022-04-27 9:34 ` Peter Stephenson 2022-04-27 10:28 ` Jun T 0 siblings, 1 reply; 30+ messages in thread From: Peter Stephenson @ 2022-04-27 9:34 UTC (permalink / raw) To: Zsh hackers list > On 27 April 2022 at 01:38 Bart Schaefer <schaefer@brasslantern.com> wrote: > On Tue, Apr 26, 2022 at 7:37 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote: > > > > But at least on my Mac the following seems to work also: > > > > if (lseek(SHIN, 0, SEEK_CUR) == 0) > > rsize = SHINBUFSIZE; > > > > # Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails > > # when it shouldn't fail, or does not fail when it should fail? > > I was pretty sure I'd found a case (on Ubuntu 20.04) where it > succeeded on a pipe when there was already data written to the pipe > before the seek was attempted. That was only/exactly for (0, > SEEK_CUR). A simple test for this still caused the seek on the type to fail here, but I could be missing the key elements. Should still be a safe test with only a few bytes in the pipe, I would think. pws #include <stdio.h> #include <sys/types.h> #include <fcntl.h> #include <unistd.h> #include <sys/socket.h> int main(int argc, char **argv) { int pipefd[2], fd; if (pipe(pipefd) < 0) { printf("creating pipe failed\n"); } else { char stuff[9] = "abcdefgh"; write(pipefd[1], stuff, 8); if (lseek(pipefd[0], 0, SEEK_CUR) == (off_t)-1) { printf("lseek on pipe failed\n"); } else { printf("lseek on pipe succeeded\n"); } } close(pipefd[0]); close(pipefd[1]); fd = socket(AF_UNIX, SOCK_STREAM, 0); if (fd < 0) { printf("creating UNIX domain socket failed\n"); } else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1) { printf("lseek on UNIX domain socket failed\n"); } else { printf("lseek on UNIX domain socket succeeded\n"); } close(fd); fd = open("seekfiletest.tmp", O_CREAT); if (fd < 0) { printf("creating file failed\n"); } else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1) { printf("lseek on regular file failed\n"); } else { printf("lseek on regular file succeeded\n"); } close(fd); return 0; } ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 9:34 ` Peter Stephenson @ 2022-04-27 10:28 ` Jun T 2022-04-27 12:42 ` Jun T 2022-04-27 13:58 ` Jun T 0 siblings, 2 replies; 30+ messages in thread From: Jun T @ 2022-04-27 10:28 UTC (permalink / raw) To: zsh-workers > 2022/04/27 18:34, Peter Stephenson <p.w.stephenson@ntlworld.com> wrote: > >> On 27 April 2022 at 01:38 Bart Schaefer <schaefer@brasslantern.com> wrote: >> >> I was pretty sure I'd found a case (on Ubuntu 20.04) where it >> succeeded on a pipe when there was already data written to the pipe >> before the seek was attempted. That was only/exactly for (0, >> SEEK_CUR). > > A simple test for this still caused the seek on the type to fail > here, but I could be missing the key elements. Should still > be a safe test with only a few bytes in the pipe, I would think. OK, I will add this to configure.ac. > 2022/04/26 23:31, I wrote: > > But at least on my Mac the following seems to work also: > > if (lseek(SHIN, 0, SEEK_CUR) == 0) Sorry, it should have been: if (lseek(SHIN, 0, SEEK_CUR) != (off_t)_1) I will modify Bart's patch and post it with the patch for configure.ac. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 10:28 ` Jun T @ 2022-04-27 12:42 ` Jun T 2022-04-27 13:58 ` Jun T 1 sibling, 0 replies; 30+ messages in thread From: Jun T @ 2022-04-27 12:42 UTC (permalink / raw) To: zsh-workers This is (Bart's patch for input.c, with my error corrected) + (patch for configure.ac, suggested by Peter). I've tested only on macOS. If open()/pipe()/socket() fails (should not happen), I assume we can't use lseek(). diff --git a/Src/input.c b/Src/input.c index c59232681..9898a7177 100644 --- a/Src/input.c +++ b/Src/input.c @@ -217,12 +217,34 @@ shinbufrestore(void) static int shingetchar(void) { - int nread; + int nread, rsize = isset(SHINSTDIN) ? 1 : SHINBUFSIZE; if (shinbufptr < shinbufendptr) return STOUC(*shinbufptr++); shinbufreset(); +#ifdef USE_LSEEK + if (rsize == 1 && lseek(SHIN, 0, SEEK_CUR) != (off_t)-1) + rsize = SHINBUFSIZE; + if (rsize > 1) { + do { + errno = 0; + nread = read(SHIN, shinbuffer, rsize); + } while (nread < 0 && errno == EINTR); + if (nread <= 0) + return -1; + if (isset(SHINSTDIN) && + (shinbufendptr = memchr(shinbuffer, '\n', nread))) { + shinbufendptr++; + rsize = (shinbufendptr - shinbuffer); + if (nread > rsize && + lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0) + zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno); + } else + shinbufendptr = shinbuffer + nread; + return STOUC(*shinbufptr++); + } +#endif for (;;) { errno = 0; nread = read(SHIN, shinbufendptr, 1); diff --git a/configure.ac b/configure.ac index 8bba78c56..fdf1bc86e 100644 --- a/configure.ac +++ b/configure.ac @@ -2209,6 +2209,64 @@ if test x$zsh_cv_sys_fifo = xyes; then AC_DEFINE(HAVE_FIFOS) fi +dnl ----------- +dnl check that lseek() correctly reports seekability. +dnl ----------- +AC_CACHE_CHECK(if lseek() correctly reports seekability, +zsh_cv_sys_lseek, +[AC_RUN_IFELSE([AC_LANG_SOURCE([[ +#include <stdio.h> +#include <sys/types.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/socket.h> +int main() { + int pipefd[2], fd; + off_t ret; + char* tmpfile = "seekfiletest.tmp"; + if ((fd = open(tmpfile, O_CREAT)) < 0) { + fprintf(stderr, "creating file failed\n"); + return 1; + } + ret = lseek(fd, 0, SEEK_CUR); + close(fd); + unlink(tmpfile); + if (ret == (off_t)-1) { + fprintf(stderr, "lseek on regular file failed\n"); + return 1; + } + if (pipe(pipefd) < 0) { + fprintf(stderr, "creating pipe failed\n"); + return 1; + } + write(pipefd[1], "abcdefgh", 8); + ret = lseek(pipefd[0], 0, SEEK_CUR); + close(pipefd[0]); + close(pipefd[1]); + if (ret != (off_t)-1) { + fprintf(stderr, "lseek on pipe succeeded\n"); + return 1; + } + if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { + fprintf(stderr, "creating UNIX domain socket failed\n"); + return 1; + } + ret = lseek(fd, 0, SEEK_CUR); + close(fd); + if (ret != (off_t)-1) { + fprintf(stderr, "lseek on UNIX domain socket succeeded\n"); + return 1; + } + return 0; +} +]])],[zsh_cv_sys_lseek=yes],[zsh_cv_sys_lseek=no],[zsh_cv_sys_lseek=yes]) +]) +AH_TEMPLATE([USE_LSEEK], +[Define to 1 if lseek() can be used for SHIN.]) +if test x$zsh_cv_sys_lseek = xyes; then + AC_DEFINE(USE_LSEEK) +fi + dnl ----------- dnl test for whether link() works dnl for instance, BeOS R4.51 doesn't support hard links yet ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 10:28 ` Jun T 2022-04-27 12:42 ` Jun T @ 2022-04-27 13:58 ` Jun T 2022-04-27 15:25 ` Bart Schaefer 1 sibling, 1 reply; 30+ messages in thread From: Jun T @ 2022-04-27 13:58 UTC (permalink / raw) To: zsh-workers I did a few test to see the performance improvement. mkdir tmpdir && cd tmpdir N=400000 for ((i=0; i<N; ++i)); do echo '# foo bar boo foo bar boo' done > tmp.txt ln -s tmp.txt .zshrc tmp.txt contains 400,000 lines of comments (and nothing else). tmpdir/.zshrc is the symlink to tmp.txt. Then run the following 5 test (in the tmpdir): (1) time ZDOTDIR=. zsh -ic exit (2) time zsh -f tmp.txt (3) time zsh -f < tmp.txt (4) time echo 'source tmp.txt' | zsh -f (5) time cat tmp.txt | zsh -f total (approximate) CPU seconds are (tested on my iMac): (1) (2) (3) (4) (5) zsh-5.8 0.9 0.8 6.6 0.8 8.2 zsh-5.8.1 0.6 0.5 0.7 0.5 0.8 zsh-5.8.1.2-test 5.7 5.5 5.7 5.5 7.3 lseek() patch 0.7 0.5 1.4 0.5 7.8 bash --- 0.5 0.5 0.5 6.8 (bash --norc) (2) and (4) are almost equivalent, as it should be. With lseek() patch we get "reasonable" performance for (1)-(4). (3) is somewhat slower, because it is reading form stdin and need to call lseek() many times. The column (1) in the table above shows that both 5.8.1 and lseek() patch are about 8-10 times faster than 5.8.1.2-test. This ratio (about 8-10), obtained with .zshrc that contains 400,000 lines of comments, is close to the ratio reported by Jordan with his .zshrc: > 2022/04/26 3:16, Jordan Patterson <jordanp@gmail.com> wrote: > > Summary > 'prefix/5.8.1/bin/zsh -i -c exit' ran > 1.16 ± 0.06 times faster than 'prefix/5.8/bin/zsh -i -c exit' > 9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit' Jordan, what do you get by the following? zsh -xic exit 2>>(wc) Can you try the lseek() patch (in my previous post, 50115)? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 13:58 ` Jun T @ 2022-04-27 15:25 ` Bart Schaefer 2022-04-27 16:18 ` Jun. T 0 siblings, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-27 15:25 UTC (permalink / raw) To: Zsh hackers list On Wed, Apr 27, 2022 at 7:00 AM Jun T <takimoto-j@kba.biglobe.ne.jp> wrote: > > Then run the following 5 test (in the tmpdir): > (1) time ZDOTDIR=. zsh -ic exit > (2) time zsh -f tmp.txt > (3) time zsh -f < tmp.txt > (4) time echo 'source tmp.txt' | zsh -f > (5) time cat tmp.txt | zsh -f Does it matter to (3) if "zsh -fs" ? > With lseek() patch we get "reasonable" performance for (1)-(4). > (3) is somewhat slower, because it is reading form stdin and > need to call lseek() many times. I wonder if it would help to read less than SHINBUFSIZE when SHINSTDIN. E.g., make some guess at the "average line length" and read that many bytes. Might cause more calls to memchr() but less I/O (and shorter lseek()s, if that matters). ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 15:25 ` Bart Schaefer @ 2022-04-27 16:18 ` Jun. T 0 siblings, 0 replies; 30+ messages in thread From: Jun. T @ 2022-04-27 16:18 UTC (permalink / raw) To: zsh-workers > 2022/04/28 0:25, Bart Schaefer <schaefer@brasslantern.com> wrote: > > On Wed, Apr 27, 2022 at 7:00 AM Jun T <takimoto-j@kba.biglobe.ne.jp> wrote: >> >> (3) time zsh -f < tmp.txt > > Does it matter to (3) if "zsh -fs" ? No, no noticeable difference. >> With lseek() patch we get "reasonable" performance for (1)-(4). >> (3) is somewhat slower, because it is reading form stdin and >> need to call lseek() many times. > > I wonder if it would help to read less than SHINBUFSIZE when > SHINSTDIN. E.g., make some guess at the "average line length" and > read that many bytes. Might cause more calls to memchr() but less I/O > (and shorter lseek()s, if that matters). I tried setting SHINBUFSIZE to 80 for (3) (with lseek() patch): SHINBUFSIZE=8192 about 1.3 sec SHINBUFSIZE=80 about 1.2 sec (slight improvement) (I'm now using a different Mac from the one used in the previous post) I guess data is cached by the OS and read() is, in most cases, something like memcpy(). I was running the test on a internal SDD of my iMac, but running the test on a external HDD gives almost the same results. If the file is on a "much slower" filesystem then the difference may be somewhat larger. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 7:01 ` Bart Schaefer 2022-04-26 8:31 ` Peter Stephenson 2022-04-26 14:31 ` Jun. T @ 2022-04-27 19:54 ` Jordan Patterson 2022-04-28 9:53 ` Jun T 2022-04-28 18:51 ` Jun. T 2 siblings, 2 replies; 30+ messages in thread From: Jordan Patterson @ 2022-04-27 19:54 UTC (permalink / raw) Cc: Zsh hackers list I'm not subscribed to the mailing list so I'm missing Jun's original email, but I'll reply here. > Jordan, what do you get by the following? > zsh -xic exit 2>>(wc) I get 13007 lines, 1534161 characters. > Can you try the lseek() patch (in my previous post, 50115)? Sure. I needed to add a mode argument to the open in your lseek configure check. I get this error otherwise on my system: error: call to '__open_missing_mode' declared with attribute error: open with o_creat or o_tmpfile in second argument needs 3 arguments My benchmark results: Benchmark 1: prefix/5.8/bin/zsh -i -c exit Time (mean ± σ): 254.6 ms ± 10.0 ms [User: 174.6 ms, System: 68.0 ms] Range (min … max): 240.1 ms … 271.7 ms 10 runs Benchmark 2: prefix/5.8.1/bin/zsh -i -c exit Time (mean ± σ): 251.7 ms ± 8.8 ms [User: 179.2 ms, System: 60.0 ms] Range (min … max): 233.3 ms … 262.5 ms 11 runs Benchmark 3: prefix/5.8.1.2-test/bin/zsh -i -c exit Time (mean ± σ): 2.583 s ± 0.053 s [User: 0.516 s, System: 2.033 s] Range (min … max): 2.540 s … 2.687 s 10 runs Benchmark 4: prefix/5.8.1.2-test-lseek/bin/zsh -i -c exit Time (mean ± σ): 283.8 ms ± 8.0 ms [User: 202.7 ms, System: 70.4 ms] Range (min … max): 273.0 ms … 300.3 ms 10 runs Summary 'prefix/5.8.1/bin/zsh -i -c exit' ran 1.01 ± 0.05 times faster than 'prefix/5.8/bin/zsh -i -c exit' 1.13 ± 0.05 times faster than 'prefix/5.8.1.2-test-lseek/bin/zsh -i -c exit' 10.27 ± 0.41 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit' On Tue, Apr 26, 2022 at 1:01 AM Bart Schaefer <schaefer@brasslantern.com> wrote: > > On Mon, Apr 25, 2022 at 2:27 PM Bart Schaefer <schaefer@brasslantern.com> wrote: > > > > Theoretically we can block-read with impunity in cases 1 and 2 (anyone > > disagree?). Testing for seek-ability would allow doing the "read too > > much and back up" trick in case 3. I don't immediately see any way to > > avoid reading one byte at a time in case 4, does anyone have a > > suggestion? > > Try this? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 19:54 ` Jordan Patterson @ 2022-04-28 9:53 ` Jun T 2022-04-28 14:56 ` Bart Schaefer 2022-04-28 18:51 ` Jun. T 1 sibling, 1 reply; 30+ messages in thread From: Jun T @ 2022-04-28 9:53 UTC (permalink / raw) To: zsh-workers; +Cc: jordanp > 2022/04/28 4:54, Jordan Patterson <jordanp@gmail.com> wrote: > >> zsh -xic exit 2>>(wc) > > I get 13007 lines, 1534161 characters. So your .zshrc (and anything loaded from it) is much smaller than the tmp.txt I used, but: >> Can you try the lseek() patch (in my previous post, 50115)? > Summary > 'prefix/5.8.1/bin/zsh -i -c exit' ran > 1.01 ± 0.05 times faster than 'prefix/5.8/bin/zsh -i -c exit' > 1.13 ± 0.05 times faster than 'prefix/5.8.1.2-test-lseek/bin/zsh -i -c exit' > 10.27 ± 0.41 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit' it was about 10 times slower in 5.8.1.2-test (compared with 5.8), and now with the Bart's lseek patch it is just about 10% slower. I think 10% is acceptable. > I needed to add a mode argument to the open in your lseek > configure check. Thanks! I will use the following: open(tmpfile, O_CREAT, S_IRUSR) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-28 9:53 ` Jun T @ 2022-04-28 14:56 ` Bart Schaefer 0 siblings, 0 replies; 30+ messages in thread From: Bart Schaefer @ 2022-04-28 14:56 UTC (permalink / raw) To: Jun T; +Cc: Zsh hackers list, Jordan Patterson On Thu, Apr 28, 2022 at 2:54 AM Jun T <takimoto-j@kba.biglobe.ne.jp> wrote: > > it was about 10 times slower in 5.8.1.2-test (compared with 5.8), > and now with the Bart's lseek patch it is just about 10% slower. > I think 10% is acceptable. Indeed. Would be interesting to compare to 5.7.something which used fgetc. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-27 19:54 ` Jordan Patterson 2022-04-28 9:53 ` Jun T @ 2022-04-28 18:51 ` Jun. T 2022-04-29 0:28 ` Bart Schaefer 1 sibling, 1 reply; 30+ messages in thread From: Jun. T @ 2022-04-28 18:51 UTC (permalink / raw) To: zsh-workers > 2022/04/28 4:54, Jordan Patterson <jordanp@gmail.com> wrote: > > I needed to add a mode argument to the open in your lseek > configure check. Added S_IRUSR to open(), and include <sys/stat.h> for it. I reproduce the whole patch here for convenience. Tested on Fedora-35, macOS, FreeBSD and Cygwin. diff --git a/Src/input.c b/Src/input.c index c59232681..9898a7177 100644 --- a/Src/input.c +++ b/Src/input.c @@ -217,12 +217,34 @@ shinbufrestore(void) static int shingetchar(void) { - int nread; + int nread, rsize = isset(SHINSTDIN) ? 1 : SHINBUFSIZE; if (shinbufptr < shinbufendptr) return STOUC(*shinbufptr++); shinbufreset(); +#ifdef USE_LSEEK + if (rsize == 1 && lseek(SHIN, 0, SEEK_CUR) != (off_t)-1) + rsize = SHINBUFSIZE; + if (rsize > 1) { + do { + errno = 0; + nread = read(SHIN, shinbuffer, rsize); + } while (nread < 0 && errno == EINTR); + if (nread <= 0) + return -1; + if (isset(SHINSTDIN) && + (shinbufendptr = memchr(shinbuffer, '\n', nread))) { + shinbufendptr++; + rsize = (shinbufendptr - shinbuffer); + if (nread > rsize && + lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0) + zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno); + } else + shinbufendptr = shinbuffer + nread; + return STOUC(*shinbufptr++); + } +#endif for (;;) { errno = 0; nread = read(SHIN, shinbufendptr, 1); diff --git a/configure.ac b/configure.ac index 8bba78c56..c72148d06 100644 --- a/configure.ac +++ b/configure.ac @@ -2209,6 +2209,65 @@ if test x$zsh_cv_sys_fifo = xyes; then AC_DEFINE(HAVE_FIFOS) fi +dnl ----------- +dnl check that lseek() correctly reports seekability. +dnl ----------- +AC_CACHE_CHECK(if lseek() correctly reports seekability, +zsh_cv_sys_lseek, +[AC_RUN_IFELSE([AC_LANG_SOURCE([[ +#include <stdio.h> +#include <sys/types.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/socket.h> +#include <sys/stat.h> +int main() { + int pipefd[2], fd; + off_t ret; + char* tmpfile = "seekfiletest.tmp"; + if ((fd = open(tmpfile, O_CREAT, S_IRUSR)) < 0) { + fprintf(stderr, "creating file failed\n"); + return 1; + } + ret = lseek(fd, 0, SEEK_CUR); + close(fd); + unlink(tmpfile); + if (ret == (off_t)-1) { + fprintf(stderr, "lseek on regular file failed\n"); + return 1; + } + if (pipe(pipefd) < 0) { + fprintf(stderr, "creating pipe failed\n"); + return 1; + } + write(pipefd[1], "abcdefgh", 8); + ret = lseek(pipefd[0], 0, SEEK_CUR); + close(pipefd[0]); + close(pipefd[1]); + if (ret != (off_t)-1) { + fprintf(stderr, "lseek on pipe succeeded\n"); + return 1; + } + if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { + fprintf(stderr, "creating UNIX domain socket failed\n"); + return 1; + } + ret = lseek(fd, 0, SEEK_CUR); + close(fd); + if (ret != (off_t)-1) { + fprintf(stderr, "lseek on UNIX domain socket succeeded\n"); + return 1; + } + return 0; +} +]])],[zsh_cv_sys_lseek=yes],[zsh_cv_sys_lseek=no],[zsh_cv_sys_lseek=yes]) +]) +AH_TEMPLATE([USE_LSEEK], +[Define to 1 if lseek() can be used for SHIN.]) +if test x$zsh_cv_sys_lseek = xyes; then + AC_DEFINE(USE_LSEEK) +fi + dnl ----------- dnl test for whether link() works dnl for instance, BeOS R4.51 doesn't support hard links yet ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-28 18:51 ` Jun. T @ 2022-04-29 0:28 ` Bart Schaefer 2022-04-29 2:25 ` Jun. T 0 siblings, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-29 0:28 UTC (permalink / raw) To: Zsh hackers list On Thu, Apr 28, 2022 at 11:52 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote: > > I reproduce the whole patch here for convenience. You will commit this? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-29 0:28 ` Bart Schaefer @ 2022-04-29 2:25 ` Jun. T 0 siblings, 0 replies; 30+ messages in thread From: Jun. T @ 2022-04-29 2:25 UTC (permalink / raw) To: zsh-workers > 2022/04/29 9:28, Bart Schaefer <schaefer@brasslantern.com> wrote: > > On Thu, Apr 28, 2022 at 11:52 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote: >> >> I reproduce the whole patch here for convenience. > > You will commit this? I think it's better for you to commit. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-25 18:16 ZSH performance regression in 5.8.1.2-test Jordan Patterson 2022-04-25 18:56 ` Bart Schaefer @ 2022-04-26 1:08 ` Bart Schaefer 2022-04-26 3:03 ` Jordan Patterson 1 sibling, 1 reply; 30+ messages in thread From: Bart Schaefer @ 2022-04-26 1:08 UTC (permalink / raw) To: Jordan Patterson; +Cc: Zsh hackers list On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote: > > 'prefix/5.8.1/bin/zsh -i -c exit' ran > 9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit' I'm curious how large is your interactive startup? I just compared 5.8.1.2-test with and without the line-buffering patch running zsh -xfic 'autoload compinit; compinit -D' 2>>(wc) and in order to get any significant difference I need to have compiled with --enable-zsh-debug --enable-zsh-mem-debug. wc gives 55727 lines / 1858616 characters of xtrace output, which ought to be enough for a reasonable comparison? Without debugging: line buffered: 0.57s user 0.31s system 100% cpu 0.883 total block read: 0.56s user 0.30s system 100% cpu 0.862 total With debugging: line buffered: 1.21s user 1.44s system 72% cpu 3.669 total block read: 0.80s user 0.93s system 100% cpu 1.720 total ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: ZSH performance regression in 5.8.1.2-test 2022-04-26 1:08 ` Bart Schaefer @ 2022-04-26 3:03 ` Jordan Patterson 0 siblings, 0 replies; 30+ messages in thread From: Jordan Patterson @ 2022-04-26 3:03 UTC (permalink / raw) To: Bart Schaefer; +Cc: Zsh hackers list On Mon, Apr 25, 2022 at 7:09 PM Bart Schaefer <schaefer@brasslantern.com> wrote: > > I'm curious how large is your interactive startup? I just compared > 5.8.1.2-test with and without the line-buffering patch running > zsh -xfic 'autoload compinit; compinit -D' 2>>(wc) > and in order to get any significant difference I need to have compiled > with --enable-zsh-debug --enable-zsh-mem-debug. > > wc gives 55727 lines / 1858616 characters of xtrace output, which > ought to be enough for a reasonable comparison? Running that gives me 56619 lines / 1866092 characters. If I run a benchmark with your command (without tracing), I do see some slowdown, but not as dramatic a difference as when loading my normal zshrc with plugins. Benchmark 1: prefix/5.8/bin/zsh -fic "autoload compinit; compinit -D" Time (mean ± σ): 295.4 ms ± 4.2 ms [User: 181.9 ms, System: 114.0 ms] Range (min … max): 289.0 ms … 299.8 ms 10 runs Benchmark 2: prefix/5.8.1/bin/zsh -fic "autoload compinit; compinit -D" Time (mean ± σ): 295.1 ms ± 3.4 ms [User: 183.4 ms, System: 112.1 ms] Range (min … max): 289.8 ms … 299.5 ms 10 runs Benchmark 3: prefix/5.8.1.2-test/bin/zsh -fic "autoload compinit; compinit -D" Time (mean ± σ): 365.3 ms ± 4.2 ms [User: 204.8 ms, System: 160.4 ms] Range (min … max): 358.7 ms … 373.1 ms 10 runs Summary 'prefix/5.8.1/bin/zsh -fic "autoload compinit; compinit -D"' ran 1.00 ± 0.02 times faster than 'prefix/5.8/bin/zsh -fic "autoload compinit; compinit -D"' 1.24 ± 0.02 times faster than 'prefix/5.8.1.2-test/bin/zsh -fic "autoload compinit; compinit -D"' ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2022-04-29 2:25 UTC | newest] Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-04-25 18:16 ZSH performance regression in 5.8.1.2-test Jordan Patterson 2022-04-25 18:56 ` Bart Schaefer 2022-04-25 19:20 ` Stephane Chazelas 2022-04-25 21:27 ` Bart Schaefer 2022-04-26 7:01 ` Bart Schaefer 2022-04-26 8:31 ` Peter Stephenson 2022-04-27 0:33 ` Bart Schaefer 2022-04-27 14:11 ` Stephane Chazelas 2022-04-27 15:02 ` Bart Schaefer 2022-04-27 15:07 ` Peter Stephenson 2022-04-27 15:17 ` Bart Schaefer 2022-04-26 14:31 ` Jun. T 2022-04-26 15:15 ` Peter Stephenson 2022-04-27 0:55 ` Bart Schaefer 2022-04-27 9:16 ` Jun T 2022-04-27 0:38 ` Bart Schaefer 2022-04-27 9:34 ` Peter Stephenson 2022-04-27 10:28 ` Jun T 2022-04-27 12:42 ` Jun T 2022-04-27 13:58 ` Jun T 2022-04-27 15:25 ` Bart Schaefer 2022-04-27 16:18 ` Jun. T 2022-04-27 19:54 ` Jordan Patterson 2022-04-28 9:53 ` Jun T 2022-04-28 14:56 ` Bart Schaefer 2022-04-28 18:51 ` Jun. T 2022-04-29 0:28 ` Bart Schaefer 2022-04-29 2:25 ` Jun. T 2022-04-26 1:08 ` Bart Schaefer 2022-04-26 3:03 ` Jordan Patterson
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).