zsh-workers
 help / color / mirror / code / Atom feed
* [PATCH] readhistfile: avoid thousands of lseek(2) syscalls via ftell()
@ 2020-02-07  7:47 Michael Stapelberg
  2020-02-07 11:30 ` Mikael Magnusson
  2020-02-08  5:54 ` Bart Schaefer
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Stapelberg @ 2020-02-07  7:47 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 651 bytes --]

[please cc me in replies, as I’m not subscribed to this list]

Before this change, zsh startup time was dominated by lseek(2) system calls on
the history file, as shown by strace -c:

time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 97,35    1,112890           1    697153         1 lseek
  0,99    0,011314           2      5277           read
[…]

This change keeps track of read bytes and the position within the file, removing
all of these system calls.

I verified correctness of the change by comparing fpos with ftell(in) in every
loop iteration.

[-- Attachment #2: 0001-readhistfile-avoid-thousands-of-lseek-2-syscalls-via.patch --]
[-- Type: text/x-patch, Size: 2980 bytes --]

From 55ca0751f11432efab667451cc9b63ec0becc038 Mon Sep 17 00:00:00 2001
From: Michael Stapelberg <stapelberg@google.com>
Date: Fri, 7 Feb 2020 08:41:26 +0100
Subject: [PATCH] readhistfile: avoid thousands of lseek(2) syscalls via
 ftell()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before this change, zsh startup time was dominated by lseek(2) system calls on
the history file, as shown by strace -c:

time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 97,35    1,112890           1    697153         1 lseek
  0,99    0,011314           2      5277           read
[…]

This change keeps track of read bytes and the position within the file, removing
all of these system calls.

I verified correctness of the change by comparing fpos with ftell(in) in every
loop iteration.
---
 Src/hist.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/Src/hist.c b/Src/hist.c
index 5281e8718..12eef5294 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2575,10 +2575,11 @@ resizehistents(void)
 }
 
 static int
-readhistline(int start, char **bufp, int *bufsiz, FILE *in)
+readhistline(int start, char **bufp, int *bufsiz, FILE *in, int *readbytes)
 {
     char *buf = *bufp;
     if (fgets(buf + start, *bufsiz - start, in)) {
+	*readbytes += strlen(buf + start);
 	int len = start + strlen(buf + start);
 	if (len == start)
 	    return -1;
@@ -2588,7 +2589,7 @@ readhistline(int start, char **bufp, int *bufsiz, FILE *in)
 		    return -1;
 		*bufp = zrealloc(buf, 2 * (*bufsiz));
 		*bufsiz = 2 * (*bufsiz);
-		return readhistline(len, bufp, bufsiz, in);
+		return readhistline(len, bufp, bufsiz, in, readbytes);
 	    }
 	}
 	else {
@@ -2596,7 +2597,7 @@ readhistline(int start, char **bufp, int *bufsiz, FILE *in)
 	    if (len > 1 && buf[len - 2] == '\\') {
 		buf[--len - 1] = '\n';
 		if (!feof(in))
-		    return readhistline(len, bufp, bufsiz, in);
+		    return readhistline(len, bufp, bufsiz, in, readbytes);
 	    }
 	}
 	return len;
@@ -2616,7 +2617,7 @@ readhistfile(char *fn, int err, int readflags)
     short *words;
     struct stat sb;
     int nwordpos, nwords, bufsiz;
-    int searching, newflags, l, ret, uselex;
+    int searching, newflags, l, ret, uselex, readbytes;
 
     if (!fn && !(fn = getsparam("HISTFILE")))
 	return;
@@ -2658,13 +2659,15 @@ readhistfile(char *fn, int err, int readflags)
 	} else
 	    searching = 0;
 
+	fpos = ftell(in);
+	readbytes = 0;
 	newflags = HIST_OLD | HIST_READ;
 	if (readflags & HFILE_FAST)
 	    newflags |= HIST_FOREIGN;
 	if (readflags & HFILE_SKIPOLD
 	 || (hist_ignore_all_dups && newflags & hist_skip_flags))
 	    newflags |= HIST_MAKEUNIQUE;
-	while (fpos = ftell(in), (l = readhistline(0, &buf, &bufsiz, in))) {
+	while (fpos += readbytes, readbytes = 0, (l = readhistline(0, &buf, &bufsiz, in, &readbytes))) {
 	    char *pt;
 	    int remeta = 0;
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-04-10 17:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-07  7:47 [PATCH] readhistfile: avoid thousands of lseek(2) syscalls via ftell() Michael Stapelberg
2020-02-07 11:30 ` Mikael Magnusson
2020-02-07 11:33   ` Michael Stapelberg
2021-04-10  0:54     ` Oliver Kiddle
2021-04-10 17:31       ` Bart Schaefer
2020-02-08  5:54 ` Bart Schaefer
2020-02-08  7:47   ` Michael Stapelberg

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).