zsh-workers
 help / color / mirror / code / Atom feed
* ZSH performance regression in 5.8.1.2-test
@ 2022-04-25 18:16 Jordan Patterson
  2022-04-25 18:56 ` Bart Schaefer
  2022-04-26  1:08 ` Bart Schaefer
  0 siblings, 2 replies; 30+ messages in thread
From: Jordan Patterson @ 2022-04-25 18:16 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1369 bytes --]

Hi:

This recent fix has led to a performance regression in zsh: 49792:
Non-interative shell input is line buffered.

I had noticed that my shell was loading slower and found this Gentoo
bug report (https://bugs.gentoo.org/839900).  They had recently picked
up this patch to apply to the current 5.8.1 version.

I've benchmarked loading zsh with the test release and a couple of
previous versions with my zsh configuration:

Benchmark 1: prefix/5.8/bin/zsh -i -c exit
  Time (mean ± σ):     262.6 ms ±  10.7 ms    [User: 184.7 ms, System:
66.4 ms]
  Range (min … max):   246.7 ms … 279.4 ms    11 runs

Benchmark 2: prefix/5.8.1/bin/zsh -i -c exit
  Time (mean ± σ):     226.5 ms ±   6.0 ms    [User: 155.7 ms, System:
57.4 ms]
  Range (min … max):   216.6 ms … 238.5 ms    13 runs

Benchmark 3: prefix/5.8.1.2-test/bin/zsh -i -c exit
  Time (mean ± σ):      2.088 s ±  0.027 s    [User: 0.503 s, System: 1.562 s]
  Range (min … max):    2.056 s …  2.149 s    10 runs

Summary
  'prefix/5.8.1/bin/zsh -i -c exit' ran
    1.16 ± 0.06 times faster than 'prefix/5.8/bin/zsh -i -c exit'
    9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit'

I've also generated some flamegraphs of the same commands, which I've
attached.  The test version now spends most of its time doing the
reads.

Jordan

[-- Attachment #2: zsh-5.8.1.svg --]
[-- Type: image/svg+xml, Size: 224043 bytes --]

[-- Attachment #3: zsh-5.8.1.2-test.svg --]
[-- Type: image/svg+xml, Size: 256770 bytes --]

[-- Attachment #4: zsh-5.8.svg --]
[-- Type: image/svg+xml, Size: 260490 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-25 18:16 ZSH performance regression in 5.8.1.2-test Jordan Patterson
@ 2022-04-25 18:56 ` Bart Schaefer
  2022-04-25 19:20   ` Stephane Chazelas
  2022-04-26  1:08 ` Bart Schaefer
  1 sibling, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-25 18:56 UTC (permalink / raw)
  To: Jordan Patterson; +Cc: Zsh hackers list

On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote:
>
> This recent fix has led to a performance regression in zsh: 49792:
> Non-interative shell input is line buffered.

Hm.  If we use stdio for speed, we have memory management re-entrance
issues that can lead to a crash.

If we use direct read of more than one character at a time, we can't
enforce line buffering.

Is there a way we can detect the case where we need to line-buffer and
avoid it otherwise?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-25 18:56 ` Bart Schaefer
@ 2022-04-25 19:20   ` Stephane Chazelas
  2022-04-25 21:27     ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Stephane Chazelas @ 2022-04-25 19:20 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Jordan Patterson, Zsh hackers list

2022-04-25 11:56:59 -0700, Bart Schaefer:
> On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote:
> >
> > This recent fix has led to a performance regression in zsh: 49792:
> > Non-interative shell input is line buffered.
> 
> Hm.  If we use stdio for speed, we have memory management re-entrance
> issues that can lead to a crash.
> 
> If we use direct read of more than one character at a time, we can't
> enforce line buffering.
> 
> Is there a way we can detect the case where we need to line-buffer and
> avoid it otherwise?
[...]

Sorry if I'm beside the point as I don't know the context here,
but in cases where it's important that we don't read past the
newline character that delimits an input line (like for the read
builtin), some other shells, when the input is seekable do read by
blocks (instead of one byte at a time), and seek back to just
after the newline when they've read too much.

That explains why using "read" on regular files can be
significantly slower in zsh than in ksh/bash.

ksh93 goes even further (but that causes bugs, and I'm not
suggesting zsh goes there) in that the
second "read" on the same fd reuses information that was
retrieved from the previous read.

It also explains why ksh93 uses socketpair() instead of pipe()
for its pipes on Linux, as you can /peek/ at their contents
without consuming them. Again, that comes with its own set of
problems.

-- 
Stephane


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-25 19:20   ` Stephane Chazelas
@ 2022-04-25 21:27     ` Bart Schaefer
  2022-04-26  7:01       ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-25 21:27 UTC (permalink / raw)
  To: Bart Schaefer, Jordan Patterson, Zsh hackers list

On Mon, Apr 25, 2022 at 12:20 PM Stephane Chazelas
<stephane@chazelas.org> wrote:
>
> 2022-04-25 11:56:59 -0700, Bart Schaefer:
> > On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote:
> > >
> > > This recent fix has led to a performance regression in zsh: 49792:
> > > Non-interative shell input is line buffered.

Jumping back to this for a moment:
> > > Summary
> > >  'prefix/5.8.1/bin/zsh -i -c exit' ran
> > >    1.16 ± 0.06 times faster than 'prefix/5.8/bin/zsh -i -c exit'
> > >    9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit'

I hadn't parsed closely before that 5.8.1 is faster than 5.8.  That
means avoiding stdio is faster if we don't have to do line-buffering.

> > Is there a way we can detect the case where we need to line-buffer and
> > avoid it otherwise?
>
> Sorry if I'm beside the point as I don't know the context here,

The context is shell command input, that is, any of
1) loading init files
2) source or . command
3) zsh < file
4) command | zsh

Have I missed any?

The bug from https://bugs.gentoo.org/839900 crops up in a special case
of 3 or 4 when the input runs a sub-command that itself wants to read
from stdin.  In that case the shell is supposed to have stopped
reading at a newline, leaving the rest of the original input available
to be consumed by the sub-command.

>[...] some other shells, when the input is seekable do read by
> blocks (instead of one byte at a time), and seek back to just
> after the newline when they've read too much.

Theoretically we can block-read with impunity in cases 1 and 2 (anyone
disagree?).  Testing for seek-ability would allow doing the "read too
much and back up" trick in case 3.  I don't immediately see any way to
avoid reading one byte at a time in case 4, does anyone have a
suggestion?

It does not appear from some quick tests that lseek(SHIN, 0, SEEK_CUR)
is guaranteed to succeed in case 3 nor to fail with ESPIPE in case 4.
How to distinguish them?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-25 18:16 ZSH performance regression in 5.8.1.2-test Jordan Patterson
  2022-04-25 18:56 ` Bart Schaefer
@ 2022-04-26  1:08 ` Bart Schaefer
  2022-04-26  3:03   ` Jordan Patterson
  1 sibling, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-26  1:08 UTC (permalink / raw)
  To: Jordan Patterson; +Cc: Zsh hackers list

On Mon, Apr 25, 2022 at 11:26 AM Jordan Patterson <jordanp@gmail.com> wrote:
>
>   'prefix/5.8.1/bin/zsh -i -c exit' ran
>     9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit'

I'm curious how large is your interactive startup?  I just compared
5.8.1.2-test with and without the line-buffering patch running
  zsh -xfic 'autoload compinit; compinit -D' 2>>(wc)
and in order to get any significant difference I need to have compiled
with --enable-zsh-debug --enable-zsh-mem-debug.

wc gives 55727 lines / 1858616 characters of xtrace output, which
ought to be enough for a reasonable comparison?

Without debugging:
line buffered:  0.57s user 0.31s system 100% cpu 0.883 total
block read:  0.56s user 0.30s system 100% cpu 0.862 total

With debugging:
line buffered:  1.21s user 1.44s system 72% cpu 3.669 total
block read:  0.80s user 0.93s system 100% cpu 1.720 total


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26  1:08 ` Bart Schaefer
@ 2022-04-26  3:03   ` Jordan Patterson
  0 siblings, 0 replies; 30+ messages in thread
From: Jordan Patterson @ 2022-04-26  3:03 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

On Mon, Apr 25, 2022 at 7:09 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> I'm curious how large is your interactive startup?  I just compared
> 5.8.1.2-test with and without the line-buffering patch running
>   zsh -xfic 'autoload compinit; compinit -D' 2>>(wc)
> and in order to get any significant difference I need to have compiled
> with --enable-zsh-debug --enable-zsh-mem-debug.
>
> wc gives 55727 lines / 1858616 characters of xtrace output, which
> ought to be enough for a reasonable comparison?

Running that gives me 56619 lines / 1866092 characters.  If I run a
benchmark with your command (without tracing), I do see some slowdown,
but not as dramatic a difference as when loading my normal zshrc with
plugins.

Benchmark 1: prefix/5.8/bin/zsh -fic "autoload compinit; compinit -D"
  Time (mean ± σ):     295.4 ms ±   4.2 ms    [User: 181.9 ms, System: 114.0 ms]
  Range (min … max):   289.0 ms … 299.8 ms    10 runs

Benchmark 2: prefix/5.8.1/bin/zsh -fic "autoload compinit; compinit -D"
  Time (mean ± σ):     295.1 ms ±   3.4 ms    [User: 183.4 ms, System: 112.1 ms]
  Range (min … max):   289.8 ms … 299.5 ms    10 runs

Benchmark 3: prefix/5.8.1.2-test/bin/zsh -fic "autoload compinit; compinit -D"
  Time (mean ± σ):     365.3 ms ±   4.2 ms    [User: 204.8 ms, System: 160.4 ms]
  Range (min … max):   358.7 ms … 373.1 ms    10 runs

Summary
  'prefix/5.8.1/bin/zsh -fic "autoload compinit; compinit -D"' ran
    1.00 ± 0.02 times faster than 'prefix/5.8/bin/zsh -fic "autoload
compinit; compinit -D"'
    1.24 ± 0.02 times faster than 'prefix/5.8.1.2-test/bin/zsh -fic
"autoload compinit; compinit -D"'


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-25 21:27     ` Bart Schaefer
@ 2022-04-26  7:01       ` Bart Schaefer
  2022-04-26  8:31         ` Peter Stephenson
                           ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Bart Schaefer @ 2022-04-26  7:01 UTC (permalink / raw)
  To: Jordan Patterson, Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 391 bytes --]

On Mon, Apr 25, 2022 at 2:27 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> Theoretically we can block-read with impunity in cases 1 and 2 (anyone
> disagree?).  Testing for seek-ability would allow doing the "read too
> much and back up" trick in case 3.  I don't immediately see any way to
> avoid reading one byte at a time in case 4, does anyone have a
> suggestion?

Try this?

[-- Attachment #2: sysread.txt --]
[-- Type: text/plain, Size: 1100 bytes --]

diff --git a/Src/input.c b/Src/input.c
index c59232681..f904427a0 100644
--- a/Src/input.c
+++ b/Src/input.c
@@ -217,12 +217,36 @@ shinbufrestore(void)
 static int
 shingetchar(void)
 {
-    int nread;
+    int nread, rsize = 1;
 
     if (shinbufptr < shinbufendptr)
 	return STOUC(*shinbufptr++);
-
+#ifdef HAVE_FSTAT
+    else {
+      struct stat st;
+      if (fstat(SHIN, &st) == 0 && !S_ISFIFO(st.st_mode))
+	  rsize = SHINBUFSIZE;
+    }
+#endif
     shinbufreset();
+    if (rsize > 1) {
+	do {
+	    errno = 0;
+	    nread = read(SHIN, shinbuffer, rsize);
+	} while (nread < 0 && errno == EINTR);
+	if (nread <= 0)
+	    return -1;
+	if (isset(SHINSTDIN) &&
+	    (shinbufendptr = memchr(shinbuffer, '\n', nread))) {
+	    shinbufendptr++;
+	    rsize = (shinbufendptr - shinbuffer);
+	    if (nread > rsize &&
+		lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0)
+		zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno);
+	} else
+	    shinbufendptr = shinbuffer + nread;
+	return STOUC(*shinbufptr++);
+    }
     for (;;) {
        errno = 0;
        nread = read(SHIN, shinbufendptr, 1);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26  7:01       ` Bart Schaefer
@ 2022-04-26  8:31         ` Peter Stephenson
  2022-04-27  0:33           ` Bart Schaefer
  2022-04-27 14:11           ` Stephane Chazelas
  2022-04-26 14:31         ` Jun. T
  2022-04-27 19:54         ` Jordan Patterson
  2 siblings, 2 replies; 30+ messages in thread
From: Peter Stephenson @ 2022-04-26  8:31 UTC (permalink / raw)
  To: Zsh hackers list

> On 26 April 2022 at 08:01 Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Mon, Apr 25, 2022 at 2:27 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > Theoretically we can block-read with impunity in cases 1 and 2 (anyone
> > disagree?).  Testing for seek-ability would allow doing the "read too
> > much and back up" trick in case 3.  I don't immediately see any way to
> > avoid reading one byte at a time in case 4, does anyone have a
> > suggestion?
> 
> Try this?

So the supposition is if it's not a pipe, providing we can detect that, it
can be optimised to a block read an a seek.  That ought to work most of the
time; presumably the thing is to think of nasty edge cases where the test
isn't good enough.  You probably get away with it most of the time even there,
in practice.

Not sure how much difference it makes but to avoid additional system calls
we could cache the result, invalidating it when we open a new file.  (We
could even cache whether a seek unexpectedly failed, but that might be
getting too paranoid.)

pws


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26  7:01       ` Bart Schaefer
  2022-04-26  8:31         ` Peter Stephenson
@ 2022-04-26 14:31         ` Jun. T
  2022-04-26 15:15           ` Peter Stephenson
  2022-04-27  0:38           ` Bart Schaefer
  2022-04-27 19:54         ` Jordan Patterson
  2 siblings, 2 replies; 30+ messages in thread
From: Jun. T @ 2022-04-26 14:31 UTC (permalink / raw)
  To: zsh-workers


> 2022/04/26 16:01, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> Try this?
> <sysread.txt>
--- a/Src/input.c
+++ b/Src/input.c
(snip)
+#ifdef HAVE_FSTAT
+    else {
+      struct stat st;
+      if (fstat(SHIN, &st) == 0 && !S_ISFIFO(st.st_mode))
+         rsize = SHINBUFSIZE;
+    }
+#endif

It works for
printf '%s\n' 'echo $$' sh 'echo $$' | zsh

But in theory (yes, just in theory), SHIN can be a socket.
With two terminals A and B (zsh/socket already zmodload'ed):

A% zsocket -l /tmp/tmpsocket
A% zsocket -a $REPLY

B% zsocket /tmp/tmpsocket

A% zsh <&$REPLY

B% printf '%s\n' 'echo $$' sh 'echo $$' >&$REPLY

then on the terminal A:
zsh: lseek(0, -11): illegal seek


Adding !S_ISSOCK(st.st_mode) solves this, of course.
But at least on my Mac the following seems to work also:

    if (lseek(SHIN, 0, SEEK_CUR) == 0)
          rsize = SHINBUFSIZE;

# Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails
# when it shouldn't fail, or does not fail when it should fail?
If you think using fstat() is safer then I have no objection.

PS
Either fstat()/S_ISFIFO() or lseek() works when SHIN is a FIFO:
A% mkfifo fifo
A% zsh < fifo
B% printf '%s\n' 'echo $$' sh 'echo $$' > fifo


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26 14:31         ` Jun. T
@ 2022-04-26 15:15           ` Peter Stephenson
  2022-04-27  0:55             ` Bart Schaefer
  2022-04-27  0:38           ` Bart Schaefer
  1 sibling, 1 reply; 30+ messages in thread
From: Peter Stephenson @ 2022-04-26 15:15 UTC (permalink / raw)
  To: zsh-workers

> On 26 April 2022 at 15:31 "Jun. T" <takimoto-j@kba.biglobe.ne.jp> wrote:
> But at least on my Mac the following seems to work also:
> 
>     if (lseek(SHIN, 0, SEEK_CUR) == 0)
>           rsize = SHINBUFSIZE;
> 
> # Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails
> # when it shouldn't fail, or does not fail when it should fail?

We could use some variant of the following as a configure test for safety.
(Not sure if the socket test is comprehensive enough, but we must be
really into edge cases at that point.)  If we get the equivalent of

lseek on pipe failed
lseek on UNIX domain socket failed
lseek on regular file succeeded

presumably we're safe to use lseek as a test, and if we don't we can
decide whether to fall back to fstat or just leave unoptimised.

pws


#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    int pipefd[2], fd;
    if (pipe(pipefd) < 0)
    {
	printf("creating pipe failed\n");
    }
    else if (lseek(pipefd[0], 0, SEEK_CUR) == (off_t)-1)
    {
	printf("lseek on pipe failed\n");
    }
    else
    {
	printf("lseek on pipe succeeded\n");
    }
    close(pipefd[0]);
    close(pipefd[1]);

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
    {
	printf("creating UNIX domain socket failed\n");
    }
    else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1)
    {
	printf("lseek on UNIX domain socket failed\n");
    }
    else
    {
	printf("lseek on UNIX domain socket succeeded\n");
    }
    close(fd);

    fd = open("seekfiletest.tmp", O_CREAT);
    if (fd < 0)
    {
	printf("creating file failed\n");
    }
    else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1)
    {
	printf("lseek on regular file failed\n");
    }
    else
    {
	printf("lseek on regular file succeeded\n");
    }
    close(fd);

    return 0;
}


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26  8:31         ` Peter Stephenson
@ 2022-04-27  0:33           ` Bart Schaefer
  2022-04-27 14:11           ` Stephane Chazelas
  1 sibling, 0 replies; 30+ messages in thread
From: Bart Schaefer @ 2022-04-27  0:33 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Tue, Apr 26, 2022 at 1:32 AM Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
>
> Not sure how much difference it makes but to avoid additional system calls
> we could cache the result, invalidating it when we open a new file.

I considered that but there's no good way to isolate it to input.c ...
SHIN could be the same FD number across open/close/open, for example.

However, I think isset(SHINSTDIN) could be tested earlier in
shingetchar to reduce the number of system calls.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26 14:31         ` Jun. T
  2022-04-26 15:15           ` Peter Stephenson
@ 2022-04-27  0:38           ` Bart Schaefer
  2022-04-27  9:34             ` Peter Stephenson
  1 sibling, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-27  0:38 UTC (permalink / raw)
  To: Jun. T; +Cc: Zsh hackers list

On Tue, Apr 26, 2022 at 7:37 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>
> But at least on my Mac the following seems to work also:
>
>     if (lseek(SHIN, 0, SEEK_CUR) == 0)
>           rsize = SHINBUFSIZE;
>
> # Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails
> # when it shouldn't fail, or does not fail when it should fail?

I was pretty sure I'd found a case (on Ubuntu 20.04) where it
succeeded on a pipe when there was already data written to the pipe
before the seek was attempted.  That was only/exactly for (0,
SEEK_CUR).

However, I can check that again.  If that is the false-success mode,
it won't be tested by PWS's little C program in workers/50108


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26 15:15           ` Peter Stephenson
@ 2022-04-27  0:55             ` Bart Schaefer
  2022-04-27  9:16               ` Jun T
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-27  0:55 UTC (permalink / raw)
  To: Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 285 bytes --]

On Tue, Apr 26, 2022 at 8:16 AM Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
>
> We could use some variant of the following as a configure test for safety.

I will be happiest if someone else plugs that in to configure.ac.

Here's the patch again, replacing fstat with lseek.

[-- Attachment #2: sysread.txt --]
[-- Type: text/plain, Size: 1062 bytes --]

diff --git a/Src/input.c b/Src/input.c
index c59232681..8e0a24f92 100644
--- a/Src/input.c
+++ b/Src/input.c
@@ -217,12 +217,31 @@ shinbufrestore(void)
 static int
 shingetchar(void)
 {
-    int nread;
+    int nread, rsize = isset(SHINSTDIN) ? 1 : SHINBUFSIZE;
 
     if (shinbufptr < shinbufendptr)
 	return STOUC(*shinbufptr++);
-
+    else if (rsize == 1 && lseek(SHIN, 0, SEEK_CUR) == 0)
+	rsize = SHINBUFSIZE;
     shinbufreset();
+    if (rsize > 1) {
+	do {
+	    errno = 0;
+	    nread = read(SHIN, shinbuffer, rsize);
+	} while (nread < 0 && errno == EINTR);
+	if (nread <= 0)
+	    return -1;
+	if (isset(SHINSTDIN) &&
+	    (shinbufendptr = memchr(shinbuffer, '\n', nread))) {
+	    shinbufendptr++;
+	    rsize = (shinbufendptr - shinbuffer);
+	    if (nread > rsize &&
+		lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0)
+		zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno);
+	} else
+	    shinbufendptr = shinbuffer + nread;
+	return STOUC(*shinbufptr++);
+    }
     for (;;) {
        errno = 0;
        nread = read(SHIN, shinbufendptr, 1);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27  0:55             ` Bart Schaefer
@ 2022-04-27  9:16               ` Jun T
  0 siblings, 0 replies; 30+ messages in thread
From: Jun T @ 2022-04-27  9:16 UTC (permalink / raw)
  To: zsh-workers


> 2022/04/27 9:55, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Tue, Apr 26, 2022 at 8:16 AM Peter Stephenson
> <p.w.stephenson@ntlworld.com> wrote:
>> 
>> We could use some variant of the following as a configure test for safety.
> 
> I will be happiest if someone else plugs that in to configure.ac.

Do you mean that trying lseek() for a just opened fd is enough?
If so, I will add it to configure.ac.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27  0:38           ` Bart Schaefer
@ 2022-04-27  9:34             ` Peter Stephenson
  2022-04-27 10:28               ` Jun T
  0 siblings, 1 reply; 30+ messages in thread
From: Peter Stephenson @ 2022-04-27  9:34 UTC (permalink / raw)
  To: Zsh hackers list

> On 27 April 2022 at 01:38 Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Tue, Apr 26, 2022 at 7:37 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
> >
> > But at least on my Mac the following seems to work also:
> >
> >     if (lseek(SHIN, 0, SEEK_CUR) == 0)
> >           rsize = SHINBUFSIZE;
> >
> > # Have you found a case in which lseek(SHIN, 0, SEEK_CUR) fails
> > # when it shouldn't fail, or does not fail when it should fail?
> 
> I was pretty sure I'd found a case (on Ubuntu 20.04) where it
> succeeded on a pipe when there was already data written to the pipe
> before the seek was attempted.  That was only/exactly for (0,
> SEEK_CUR).

A simple test for this still caused the seek on the type to fail
here, but I could be missing the  key elements.  Should still
be a safe test with only a few bytes in the pipe, I would think.

pws


#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    int pipefd[2], fd;
    if (pipe(pipefd) < 0)
    {
	printf("creating pipe failed\n");
    }
    else
    {
	char stuff[9] = "abcdefgh";
	write(pipefd[1], stuff, 8);
	if (lseek(pipefd[0], 0, SEEK_CUR) == (off_t)-1)
	{
	    printf("lseek on pipe failed\n");
	}
	else
	{
	    printf("lseek on pipe succeeded\n");
	}
    }
    close(pipefd[0]);
    close(pipefd[1]);

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
    {
	printf("creating UNIX domain socket failed\n");
    }
    else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1)
    {
	printf("lseek on UNIX domain socket failed\n");
    }
    else
    {
	printf("lseek on UNIX domain socket succeeded\n");
    }
    close(fd);

    fd = open("seekfiletest.tmp", O_CREAT);
    if (fd < 0)
    {
	printf("creating file failed\n");
    }
    else if (lseek(fd, 0, SEEK_CUR) == (off_t)-1)
    {
	printf("lseek on regular file failed\n");
    }
    else
    {
	printf("lseek on regular file succeeded\n");
    }
    close(fd);

    return 0;
}


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27  9:34             ` Peter Stephenson
@ 2022-04-27 10:28               ` Jun T
  2022-04-27 12:42                 ` Jun T
  2022-04-27 13:58                 ` Jun T
  0 siblings, 2 replies; 30+ messages in thread
From: Jun T @ 2022-04-27 10:28 UTC (permalink / raw)
  To: zsh-workers


> 2022/04/27 18:34, Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> 
>> On 27 April 2022 at 01:38 Bart Schaefer <schaefer@brasslantern.com> wrote:
>> 
>> I was pretty sure I'd found a case (on Ubuntu 20.04) where it
>> succeeded on a pipe when there was already data written to the pipe
>> before the seek was attempted.  That was only/exactly for (0,
>> SEEK_CUR).
> 
> A simple test for this still caused the seek on the type to fail
> here, but I could be missing the  key elements.  Should still
> be a safe test with only a few bytes in the pipe, I would think.

OK, I will add this to configure.ac.


> 2022/04/26 23:31, I wrote:
> 
> But at least on my Mac the following seems to work also:
> 
>    if (lseek(SHIN, 0, SEEK_CUR) == 0)

Sorry, it should have been:
    if (lseek(SHIN, 0, SEEK_CUR) != (off_t)_1)

I will modify Bart's patch and post it with the patch for configure.ac.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 10:28               ` Jun T
@ 2022-04-27 12:42                 ` Jun T
  2022-04-27 13:58                 ` Jun T
  1 sibling, 0 replies; 30+ messages in thread
From: Jun T @ 2022-04-27 12:42 UTC (permalink / raw)
  To: zsh-workers

This is (Bart's patch for input.c, with my error corrected)
  + (patch for configure.ac, suggested by Peter).

I've tested only on macOS.

If open()/pipe()/socket() fails (should not happen), I assume
we can't use lseek().


diff --git a/Src/input.c b/Src/input.c
index c59232681..9898a7177 100644
--- a/Src/input.c
+++ b/Src/input.c
@@ -217,12 +217,34 @@ shinbufrestore(void)
 static int
 shingetchar(void)
 {
-    int nread;
+    int nread, rsize = isset(SHINSTDIN) ? 1 : SHINBUFSIZE;
 
     if (shinbufptr < shinbufendptr)
 	return STOUC(*shinbufptr++);
 
     shinbufreset();
+#ifdef USE_LSEEK
+    if (rsize == 1 && lseek(SHIN, 0, SEEK_CUR) != (off_t)-1)
+	rsize = SHINBUFSIZE;
+    if (rsize > 1) {
+	do {
+	    errno = 0;
+	    nread = read(SHIN, shinbuffer, rsize);
+	} while (nread < 0 && errno == EINTR);
+	if (nread <= 0)
+	    return -1;
+	if (isset(SHINSTDIN) &&
+	    (shinbufendptr = memchr(shinbuffer, '\n', nread))) {
+	    shinbufendptr++;
+	    rsize = (shinbufendptr - shinbuffer);
+	    if (nread > rsize &&
+		lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0)
+		zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno);
+	} else
+	    shinbufendptr = shinbuffer + nread;
+	return STOUC(*shinbufptr++);
+    }
+#endif
     for (;;) {
        errno = 0;
        nread = read(SHIN, shinbufendptr, 1);
diff --git a/configure.ac b/configure.ac
index 8bba78c56..fdf1bc86e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2209,6 +2209,64 @@ if test x$zsh_cv_sys_fifo = xyes; then
   AC_DEFINE(HAVE_FIFOS)
 fi
 
+dnl -----------
+dnl check that lseek() correctly reports seekability.
+dnl -----------
+AC_CACHE_CHECK(if lseek() correctly reports seekability,
+zsh_cv_sys_lseek,
+[AC_RUN_IFELSE([AC_LANG_SOURCE([[
+#include <stdio.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/socket.h>
+int main() {
+    int pipefd[2], fd;
+    off_t ret;
+    char* tmpfile = "seekfiletest.tmp";
+    if ((fd = open(tmpfile, O_CREAT)) < 0) {
+	fprintf(stderr, "creating file failed\n");
+	return 1;
+    }
+    ret = lseek(fd, 0, SEEK_CUR);
+    close(fd);
+    unlink(tmpfile);
+    if (ret == (off_t)-1) {
+	fprintf(stderr, "lseek on regular file failed\n");
+	return 1;
+    }
+    if (pipe(pipefd) < 0) {
+	fprintf(stderr, "creating pipe failed\n");
+	return 1;
+    }
+    write(pipefd[1], "abcdefgh", 8);
+    ret = lseek(pipefd[0], 0, SEEK_CUR);
+    close(pipefd[0]);
+    close(pipefd[1]);
+    if (ret != (off_t)-1) {
+	fprintf(stderr, "lseek on pipe succeeded\n");
+	return 1;
+    }
+    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
+	fprintf(stderr, "creating UNIX domain socket failed\n");
+	return 1;
+    }
+    ret = lseek(fd, 0, SEEK_CUR);
+    close(fd);
+    if (ret != (off_t)-1) {
+	fprintf(stderr, "lseek on UNIX domain socket succeeded\n");
+	return 1;
+    }
+    return 0;
+}
+]])],[zsh_cv_sys_lseek=yes],[zsh_cv_sys_lseek=no],[zsh_cv_sys_lseek=yes])
+])
+AH_TEMPLATE([USE_LSEEK],
+[Define to 1 if lseek() can be used for SHIN.])
+if test x$zsh_cv_sys_lseek = xyes; then
+  AC_DEFINE(USE_LSEEK)
+fi
+
 dnl -----------
 dnl test for whether link() works
 dnl for instance, BeOS R4.51 doesn't support hard links yet





^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 10:28               ` Jun T
  2022-04-27 12:42                 ` Jun T
@ 2022-04-27 13:58                 ` Jun T
  2022-04-27 15:25                   ` Bart Schaefer
  1 sibling, 1 reply; 30+ messages in thread
From: Jun T @ 2022-04-27 13:58 UTC (permalink / raw)
  To: zsh-workers

I did a few test to see the performance improvement.

mkdir tmpdir && cd tmpdir
N=400000
for ((i=0; i<N; ++i)); do
    echo '# foo bar boo foo bar boo'
done > tmp.txt
ln -s tmp.txt .zshrc

tmp.txt contains 400,000 lines of comments (and nothing else).
tmpdir/.zshrc is the symlink to tmp.txt.

Then run the following 5 test (in the tmpdir):
(1) time ZDOTDIR=. zsh -ic exit
(2) time zsh -f tmp.txt
(3) time zsh -f < tmp.txt
(4) time echo 'source tmp.txt' | zsh -f
(5) time cat tmp.txt | zsh -f

total (approximate) CPU seconds are (tested on my iMac):

                     (1)   (2)   (3)   (4)   (5)
zsh-5.8             0.9   0.8   6.6   0.8   8.2
zsh-5.8.1           0.6   0.5   0.7   0.5   0.8
zsh-5.8.1.2-test    5.7   5.5   5.7   5.5   7.3
lseek() patch       0.7   0.5   1.4   0.5   7.8
bash                ---   0.5   0.5   0.5   6.8   (bash --norc)

(2) and (4) are almost equivalent, as it should be.

With lseek() patch we get "reasonable" performance for (1)-(4).
(3) is somewhat slower, because it is reading form stdin and
need to call lseek() many times.

The column (1) in the table above shows that both 5.8.1 and
lseek() patch are about 8-10 times faster than 5.8.1.2-test.
This ratio (about 8-10), obtained with .zshrc that contains
400,000 lines of comments, is close to the ratio reported by
Jordan with his .zshrc:

> 2022/04/26 3:16, Jordan Patterson <jordanp@gmail.com> wrote:
> 
> Summary
>  'prefix/5.8.1/bin/zsh -i -c exit' ran
>    1.16 ± 0.06 times faster than 'prefix/5.8/bin/zsh -i -c exit'
>    9.22 ± 0.27 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit'


Jordan, what do you get by the following?
zsh -xic exit 2>>(wc)
Can you try the lseek() patch (in my previous post, 50115⁩)?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26  8:31         ` Peter Stephenson
  2022-04-27  0:33           ` Bart Schaefer
@ 2022-04-27 14:11           ` Stephane Chazelas
  2022-04-27 15:02             ` Bart Schaefer
  1 sibling, 1 reply; 30+ messages in thread
From: Stephane Chazelas @ 2022-04-27 14:11 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

2022-04-26 09:31:55 +0100, Peter Stephenson:
[...]
> Not sure how much difference it makes but to avoid additional system calls
> we could cache the result, invalidating it when we open a new file.  (We
> could even cache whether a seek unexpectedly failed, but that might be
> getting too paranoid.)
[...]

FYI, see https://github.com/att/ast/issues/15 for the kind of
issue you can find with ksh93 caching read data.

IMO, if the point is make sure we leave the position within the
input past the delimiter of the line we have just read, so that
other things can read it if they want, then it's wrong to cache,
as those other things could just as well overwrite things.

It's also probably not going to get you that much of a benefit
as the OS will have cached the data as well.

-- 
Stephane


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 14:11           ` Stephane Chazelas
@ 2022-04-27 15:02             ` Bart Schaefer
  2022-04-27 15:07               ` Peter Stephenson
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-27 15:02 UTC (permalink / raw)
  To: Peter Stephenson, Zsh hackers list

On Wed, Apr 27, 2022 at 7:12 AM Stephane Chazelas <stephane@chazelas.org> wrote:
>
> 2022-04-26 09:31:55 +0100, Peter Stephenson:
> [...]
> > Not sure how much difference it makes but to avoid additional system calls
> > we could cache the result, invalidating it when we open a new file.
>
> FYI, see https://github.com/att/ast/issues/15 for the kind of
> issue you can find with ksh93 caching read data.

I think the idea was to cache the result of the "can we lseek" test,
not the data that was read.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 15:02             ` Bart Schaefer
@ 2022-04-27 15:07               ` Peter Stephenson
  2022-04-27 15:17                 ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Peter Stephenson @ 2022-04-27 15:07 UTC (permalink / raw)
  To: Zsh hackers list

> On 27 April 2022 at 16:02 Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Wed, Apr 27, 2022 at 7:12 AM Stephane Chazelas <stephane@chazelas.org> wrote:
> >
> > 2022-04-26 09:31:55 +0100, Peter Stephenson:
> > [...]
> > > Not sure how much difference it makes but to avoid additional system calls
> > > we could cache the result, invalidating it when we open a new file.
> >
> > FYI, see https://github.com/att/ast/issues/15 for the kind of
> > issue you can find with ksh93 caching read data.
> 
> I think the idea was to cache the result of the "can we lseek" test,
> not the data that was read.

Yes, I wouldn't expect that to change until we reopen the channel,
which is detectable locally within input.c as we need to set up
the pointers at the same time.

pws


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 15:07               ` Peter Stephenson
@ 2022-04-27 15:17                 ` Bart Schaefer
  0 siblings, 0 replies; 30+ messages in thread
From: Bart Schaefer @ 2022-04-27 15:17 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Wed, Apr 27, 2022 at 8:08 AM Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
>
> > I think the idea was to cache the result of the "can we lseek" test,
>
> Yes, I wouldn't expect that to change until we reopen the channel,
> which is detectable locally within input.c as we need to set up
> the pointers at the same time.

All assignments to SHIN are in init.c, so it'd probably at least be
necessary to pass an argument to shinbufalloc() et al.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 13:58                 ` Jun T
@ 2022-04-27 15:25                   ` Bart Schaefer
  2022-04-27 16:18                     ` Jun. T
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-27 15:25 UTC (permalink / raw)
  To: Zsh hackers list

On Wed, Apr 27, 2022 at 7:00 AM Jun T <takimoto-j@kba.biglobe.ne.jp> wrote:
>
> Then run the following 5 test (in the tmpdir):
> (1) time ZDOTDIR=. zsh -ic exit
> (2) time zsh -f tmp.txt
> (3) time zsh -f < tmp.txt
> (4) time echo 'source tmp.txt' | zsh -f
> (5) time cat tmp.txt | zsh -f

Does it matter to (3) if "zsh -fs" ?

> With lseek() patch we get "reasonable" performance for (1)-(4).
> (3) is somewhat slower, because it is reading form stdin and
> need to call lseek() many times.

I wonder if it would help to read less than SHINBUFSIZE when
SHINSTDIN.  E.g., make some guess at the "average line length" and
read that many bytes.  Might cause more calls to memchr() but less I/O
(and shorter lseek()s, if that matters).


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 15:25                   ` Bart Schaefer
@ 2022-04-27 16:18                     ` Jun. T
  0 siblings, 0 replies; 30+ messages in thread
From: Jun. T @ 2022-04-27 16:18 UTC (permalink / raw)
  To: zsh-workers


> 2022/04/28 0:25, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Wed, Apr 27, 2022 at 7:00 AM Jun T <takimoto-j@kba.biglobe.ne.jp> wrote:
>> 
>> (3) time zsh -f < tmp.txt
> 
> Does it matter to (3) if "zsh -fs" ?

No, no noticeable difference.

>> With lseek() patch we get "reasonable" performance for (1)-(4).
>> (3) is somewhat slower, because it is reading form stdin and
>> need to call lseek() many times.
> 
> I wonder if it would help to read less than SHINBUFSIZE when
> SHINSTDIN.  E.g., make some guess at the "average line length" and
> read that many bytes.  Might cause more calls to memchr() but less I/O
> (and shorter lseek()s, if that matters).

I tried setting SHINBUFSIZE to 80 for (3) (with lseek() patch):
SHINBUFSIZE=8192  about 1.3 sec
SHINBUFSIZE=80    about 1.2 sec    (slight improvement)
(I'm now using a different Mac from the one used in the previous post)

I guess data is cached by the OS and read() is, in most cases, something
like memcpy(). I was running the test on a internal SDD of my iMac, but
running the test on a external HDD gives almost the same results.
If the file is on a "much slower" filesystem then the difference may be
somewhat larger.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-26  7:01       ` Bart Schaefer
  2022-04-26  8:31         ` Peter Stephenson
  2022-04-26 14:31         ` Jun. T
@ 2022-04-27 19:54         ` Jordan Patterson
  2022-04-28  9:53           ` Jun T
  2022-04-28 18:51           ` Jun. T
  2 siblings, 2 replies; 30+ messages in thread
From: Jordan Patterson @ 2022-04-27 19:54 UTC (permalink / raw)
  Cc: Zsh hackers list

I'm not subscribed to the mailing list so I'm missing Jun's original
email, but I'll reply here.

> Jordan, what do you get by the following?
> zsh -xic exit 2>>(wc)

I get 13007 lines, 1534161 characters.

> Can you try the lseek() patch (in my previous post, 50115⁩)?

Sure.  I needed to add a mode argument to the open in your lseek
configure check.  I get this error otherwise on my system:

error: call to '__open_missing_mode' declared with attribute error:
open with o_creat or o_tmpfile in second argument needs 3 arguments

My benchmark results:

Benchmark 1: prefix/5.8/bin/zsh -i -c exit
  Time (mean ± σ):     254.6 ms ±  10.0 ms    [User: 174.6 ms, System: 68.0 ms]
  Range (min … max):   240.1 ms … 271.7 ms    10 runs

Benchmark 2: prefix/5.8.1/bin/zsh -i -c exit
  Time (mean ± σ):     251.7 ms ±   8.8 ms    [User: 179.2 ms, System: 60.0 ms]
  Range (min … max):   233.3 ms … 262.5 ms    11 runs

Benchmark 3: prefix/5.8.1.2-test/bin/zsh -i -c exit
  Time (mean ± σ):      2.583 s ±  0.053 s    [User: 0.516 s, System: 2.033 s]
  Range (min … max):    2.540 s …  2.687 s    10 runs

Benchmark 4: prefix/5.8.1.2-test-lseek/bin/zsh -i -c exit
  Time (mean ± σ):     283.8 ms ±   8.0 ms    [User: 202.7 ms, System: 70.4 ms]
  Range (min … max):   273.0 ms … 300.3 ms    10 runs

Summary
  'prefix/5.8.1/bin/zsh -i -c exit' ran
    1.01 ± 0.05 times faster than 'prefix/5.8/bin/zsh -i -c exit'
    1.13 ± 0.05 times faster than 'prefix/5.8.1.2-test-lseek/bin/zsh -i -c exit'
   10.27 ± 0.41 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit'

On Tue, Apr 26, 2022 at 1:01 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Mon, Apr 25, 2022 at 2:27 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > Theoretically we can block-read with impunity in cases 1 and 2 (anyone
> > disagree?).  Testing for seek-ability would allow doing the "read too
> > much and back up" trick in case 3.  I don't immediately see any way to
> > avoid reading one byte at a time in case 4, does anyone have a
> > suggestion?
>
> Try this?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 19:54         ` Jordan Patterson
@ 2022-04-28  9:53           ` Jun T
  2022-04-28 14:56             ` Bart Schaefer
  2022-04-28 18:51           ` Jun. T
  1 sibling, 1 reply; 30+ messages in thread
From: Jun T @ 2022-04-28  9:53 UTC (permalink / raw)
  To: zsh-workers; +Cc: jordanp


> 2022/04/28 4:54, Jordan Patterson <jordanp@gmail.com> wrote:
> 
>> zsh -xic exit 2>>(wc)
> 
> I get 13007 lines, 1534161 characters.

So your .zshrc (and anything loaded from it) is much smaller than the
tmp.txt I used, but:

>> Can you try the lseek() patch (in my previous post, 50115⁩)?
> Summary
>  'prefix/5.8.1/bin/zsh -i -c exit' ran
>    1.01 ± 0.05 times faster than 'prefix/5.8/bin/zsh -i -c exit'
>    1.13 ± 0.05 times faster than 'prefix/5.8.1.2-test-lseek/bin/zsh -i -c exit'
>   10.27 ± 0.41 times faster than 'prefix/5.8.1.2-test/bin/zsh -i -c exit'

it was about 10 times slower in 5.8.1.2-test (compared with 5.8),
and now with the Bart's lseek patch it is just about 10% slower.
I think 10% is acceptable.

> I needed to add a mode argument to the open in your lseek
> configure check.

Thanks!  I will use the following:
     open(tmpfile, O_CREAT, S_IRUSR)



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-28  9:53           ` Jun T
@ 2022-04-28 14:56             ` Bart Schaefer
  0 siblings, 0 replies; 30+ messages in thread
From: Bart Schaefer @ 2022-04-28 14:56 UTC (permalink / raw)
  To: Jun T; +Cc: Zsh hackers list, Jordan Patterson

On Thu, Apr 28, 2022 at 2:54 AM Jun T <takimoto-j@kba.biglobe.ne.jp> wrote:
>
> it was about 10 times slower in 5.8.1.2-test (compared with 5.8),
> and now with the Bart's lseek patch it is just about 10% slower.
> I think 10% is acceptable.

Indeed.  Would be interesting to compare to 5.7.something which used fgetc.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-27 19:54         ` Jordan Patterson
  2022-04-28  9:53           ` Jun T
@ 2022-04-28 18:51           ` Jun. T
  2022-04-29  0:28             ` Bart Schaefer
  1 sibling, 1 reply; 30+ messages in thread
From: Jun. T @ 2022-04-28 18:51 UTC (permalink / raw)
  To: zsh-workers


> 2022/04/28 4:54, Jordan Patterson <jordanp@gmail.com> wrote:
> 
> I needed to add a mode argument to the open in your lseek
> configure check.

Added S_IRUSR to open(), and include <sys/stat.h> for it.

I reproduce the whole patch here for convenience.
Tested on Fedora-35, macOS, FreeBSD and Cygwin.


diff --git a/Src/input.c b/Src/input.c
index c59232681..9898a7177 100644
--- a/Src/input.c
+++ b/Src/input.c
@@ -217,12 +217,34 @@ shinbufrestore(void)
 static int
 shingetchar(void)
 {
-    int nread;
+    int nread, rsize = isset(SHINSTDIN) ? 1 : SHINBUFSIZE;
 
     if (shinbufptr < shinbufendptr)
 	return STOUC(*shinbufptr++);
 
     shinbufreset();
+#ifdef USE_LSEEK
+    if (rsize == 1 && lseek(SHIN, 0, SEEK_CUR) != (off_t)-1)
+	rsize = SHINBUFSIZE;
+    if (rsize > 1) {
+	do {
+	    errno = 0;
+	    nread = read(SHIN, shinbuffer, rsize);
+	} while (nread < 0 && errno == EINTR);
+	if (nread <= 0)
+	    return -1;
+	if (isset(SHINSTDIN) &&
+	    (shinbufendptr = memchr(shinbuffer, '\n', nread))) {
+	    shinbufendptr++;
+	    rsize = (shinbufendptr - shinbuffer);
+	    if (nread > rsize &&
+		lseek(SHIN, -(nread - rsize), SEEK_CUR) < 0)
+		zerr("lseek(%d, %d): %e", SHIN, -(nread - rsize), errno);
+	} else
+	    shinbufendptr = shinbuffer + nread;
+	return STOUC(*shinbufptr++);
+    }
+#endif
     for (;;) {
        errno = 0;
        nread = read(SHIN, shinbufendptr, 1);
diff --git a/configure.ac b/configure.ac
index 8bba78c56..c72148d06 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2209,6 +2209,65 @@ if test x$zsh_cv_sys_fifo = xyes; then
   AC_DEFINE(HAVE_FIFOS)
 fi
 
+dnl -----------
+dnl check that lseek() correctly reports seekability.
+dnl -----------
+AC_CACHE_CHECK(if lseek() correctly reports seekability,
+zsh_cv_sys_lseek,
+[AC_RUN_IFELSE([AC_LANG_SOURCE([[
+#include <stdio.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/socket.h>
+#include <sys/stat.h>
+int main() {
+    int pipefd[2], fd;
+    off_t ret;
+    char* tmpfile = "seekfiletest.tmp";
+    if ((fd = open(tmpfile, O_CREAT, S_IRUSR)) < 0) {
+	fprintf(stderr, "creating file failed\n");
+	return 1;
+    }
+    ret = lseek(fd, 0, SEEK_CUR);
+    close(fd);
+    unlink(tmpfile);
+    if (ret == (off_t)-1) {
+	fprintf(stderr, "lseek on regular file failed\n");
+	return 1;
+    }
+    if (pipe(pipefd) < 0) {
+	fprintf(stderr, "creating pipe failed\n");
+	return 1;
+    }
+    write(pipefd[1], "abcdefgh", 8);
+    ret = lseek(pipefd[0], 0, SEEK_CUR);
+    close(pipefd[0]);
+    close(pipefd[1]);
+    if (ret != (off_t)-1) {
+	fprintf(stderr, "lseek on pipe succeeded\n");
+	return 1;
+    }
+    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
+	fprintf(stderr, "creating UNIX domain socket failed\n");
+	return 1;
+    }
+    ret = lseek(fd, 0, SEEK_CUR);
+    close(fd);
+    if (ret != (off_t)-1) {
+	fprintf(stderr, "lseek on UNIX domain socket succeeded\n");
+	return 1;
+    }
+    return 0;
+}
+]])],[zsh_cv_sys_lseek=yes],[zsh_cv_sys_lseek=no],[zsh_cv_sys_lseek=yes])
+])
+AH_TEMPLATE([USE_LSEEK],
+[Define to 1 if lseek() can be used for SHIN.])
+if test x$zsh_cv_sys_lseek = xyes; then
+  AC_DEFINE(USE_LSEEK)
+fi
+
 dnl -----------
 dnl test for whether link() works
 dnl for instance, BeOS R4.51 doesn't support hard links yet




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-28 18:51           ` Jun. T
@ 2022-04-29  0:28             ` Bart Schaefer
  2022-04-29  2:25               ` Jun. T
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2022-04-29  0:28 UTC (permalink / raw)
  To: Zsh hackers list

On Thu, Apr 28, 2022 at 11:52 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>
> I reproduce the whole patch here for convenience.

You will commit this?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ZSH performance regression in 5.8.1.2-test
  2022-04-29  0:28             ` Bart Schaefer
@ 2022-04-29  2:25               ` Jun. T
  0 siblings, 0 replies; 30+ messages in thread
From: Jun. T @ 2022-04-29  2:25 UTC (permalink / raw)
  To: zsh-workers


> 2022/04/29 9:28, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Thu, Apr 28, 2022 at 11:52 AM Jun. T <takimoto-j@kba.biglobe.ne.jp> wrote:
>> 
>> I reproduce the whole patch here for convenience.
> 
> You will commit this?

I think it's better for you to commit.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2022-04-29  2:25 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-25 18:16 ZSH performance regression in 5.8.1.2-test Jordan Patterson
2022-04-25 18:56 ` Bart Schaefer
2022-04-25 19:20   ` Stephane Chazelas
2022-04-25 21:27     ` Bart Schaefer
2022-04-26  7:01       ` Bart Schaefer
2022-04-26  8:31         ` Peter Stephenson
2022-04-27  0:33           ` Bart Schaefer
2022-04-27 14:11           ` Stephane Chazelas
2022-04-27 15:02             ` Bart Schaefer
2022-04-27 15:07               ` Peter Stephenson
2022-04-27 15:17                 ` Bart Schaefer
2022-04-26 14:31         ` Jun. T
2022-04-26 15:15           ` Peter Stephenson
2022-04-27  0:55             ` Bart Schaefer
2022-04-27  9:16               ` Jun T
2022-04-27  0:38           ` Bart Schaefer
2022-04-27  9:34             ` Peter Stephenson
2022-04-27 10:28               ` Jun T
2022-04-27 12:42                 ` Jun T
2022-04-27 13:58                 ` Jun T
2022-04-27 15:25                   ` Bart Schaefer
2022-04-27 16:18                     ` Jun. T
2022-04-27 19:54         ` Jordan Patterson
2022-04-28  9:53           ` Jun T
2022-04-28 14:56             ` Bart Schaefer
2022-04-28 18:51           ` Jun. T
2022-04-29  0:28             ` Bart Schaefer
2022-04-29  2:25               ` Jun. T
2022-04-26  1:08 ` Bart Schaefer
2022-04-26  3:03   ` Jordan Patterson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).