History file lock contention

zsh-workers
 help / color / mirror / code / Atom feed

* History file lock contention
@ 2015-01-23 11:46 ondra
  2015-01-23 12:13 ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: ondra @ 2015-01-23 11:46 UTC (permalink / raw)
  To: zsh-workers

Hi,
when running zsh from a terminal multiplexer like tmux, it is possible
(and not uncommon) to close several processes at the exact same moment.
In that case, all of them try to sync their history and all but one fail
to acquire the lock. So far so good.

However, all of these call nanosleep({1,0}), and since we might be on a
multicore system, they might just be scheduled at the same time again,
only one of them will progress and the scenario repeats. What the user
sees is the terminals not disappearing immediately, but one after the
other, a second per process.

A common solution to this problem is to randomize the back-off, for
example in the interval (0, 1s). On a decent system without load, this
would make it more responsive (all of them would be likely to finish
within a second of the user's request if possible) while not causing
any undue increase in the load on the machine.

Let me know what you think. I am not (yet) subscribed to this list, so
please copy me in your discussion.

Thanks,
Ondrej

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: History file lock contention
  2015-01-23 11:46 History file lock contention ondra
@ 2015-01-23 12:13 ` Peter Stephenson
  2015-01-23 20:49   ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2015-01-23 12:13 UTC (permalink / raw)
  To: ondra; +Cc: zsh workers

[-- Attachment #1: Type: text/plain, Size: 1986 bytes --]

On 23 January 2015 at 11:46, <ondra@mistotebe.net> wrote:
> when running zsh from a terminal multiplexer like tmux, it is possible
> (and not uncommon) to close several processes at the exact same moment.
> In that case, all of them try to sync their history and all but one fail
> to acquire the lock. So far so good.
>
> However, all of these call nanosleep({1,0}), and since we might be on a
> multicore system, they might just be scheduled at the same time again,
> only one of them will progress and the scenario repeats. What the user
> sees is the terminals not disappearing immediately, but one after the
> other, a second per process.
>
> A common solution to this problem is to randomize the back-off, for
> example in the interval (0, 1s). On a decent system without load, this
> would make it more responsive (all of them would be likely to finish
> within a second of the user's request if possible) while not causing
> any undue increase in the load on the machine.

Yes, this is sensible.  I think the current simplistic algorithm
is mostly there because we aren't making assumptions about the
availability of sleep with a resolution better than 1 second ---
and a random backoff of multiples of seconds isn't very useful.
However, it can be improved in various largely standard ways which
are already used in other parts of the shell, so this is fixable without
needing to rely on system specific features.

A better scheme would be to start with a small wait time, says 0.1 seconds,
with a random back off, with a maximum length of say the same time again,
and then double the base time each time we back off, but keeping the
same maximum time of 10 seconds since we first tried before reporting
a failure.  We can then tweak this as necessary.  Or maybe just
having a random backoff with doubling maximum and no fixed
base time is good enough for typical problems.

pws

P.S. if acronyms like EDCA and SIFS went through anyone else's mind,
too, my commiserations :-).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: History file lock contention
  2015-01-23 12:13 ` Peter Stephenson
@ 2015-01-23 20:49   ` Peter Stephenson
  2015-01-23 21:00     ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2015-01-23 20:49 UTC (permalink / raw)
  To: ondra, zsh workers

On Fri, 23 Jan 2015 12:13:09 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On 23 January 2015 at 11:46, <ondra@mistotebe.net> wrote:
> > when running zsh from a terminal multiplexer like tmux, it is possible
> > (and not uncommon) to close several processes at the exact same moment.
> > In that case, all of them try to sync their history and all but one fail
> > to acquire the lock. So far so good.
> >
> > However, all of these call nanosleep({1,0}), and since we might be on a
> > multicore system, they might just be scheduled at the same time again,
> > only one of them will progress and the scenario repeats. What the user
> > sees is the terminals not disappearing immediately, but one after the
> > other, a second per process.
>
> A better scheme would be to start with a small wait time, says 0.1 seconds,
> with a random back off, with a maximum length of say the same time again,
> and then double the base time each time we back off, but keeping the
> same maximum time of 10 seconds since we first tried before reporting
> a failure.  We can then tweak this as necessary.  Or maybe just
> having a random backoff with doubling maximum and no fixed
> base time is good enough for typical problems.

The following doesn't appear to be too fundamentally broken, however I
don't have a set-up that's particularly likely to trigger problems.  It
would be good if other people who think they do could try it out before
I commit it --- although I don't think it's too far off from being
usable by bleeding edge types.

pws

P.S. Looks like I forgot to "remove formatting" from webmail last time.
Back with proper mail.

diff --git a/Src/hist.c b/Src/hist.c
index 11d9722..05dcdb6 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2610,7 +2610,8 @@ static int
 flockhistfile(char *fn, int keep_trying)
 {
     struct flock lck;
-    int ctr = keep_trying ? 9 : 0;
+    long sleep_us = 0x4000000; /* about 67 ms */
+    time_t end_time;
 
     if (flock_fd >= 0)
 	return 0; /* already locked */
@@ -2623,13 +2624,22 @@ flockhistfile(char *fn, int keep_trying)
     lck.l_start = 0;
     lck.l_len = 0;  /* lock the whole file */
 
+    /*
+     * Timeout is ten seconds.
+     */
+    end_time = time(NULL) + (time_t)10;
     while (fcntl(flock_fd, F_SETLKW, &lck) == -1) {
-	if (--ctr < 0) {
+	if (!keep_trying || time(NULL) >= end_time ||
+	    /*
+	     * Randomise wait to minimise clashes with shells exiting at
+	     * the same time.
+	     */
+	    !zsleep_random(sleep_us, end_time)) {
 	    close(flock_fd);
 	    flock_fd = -1;
 	    return 1;
 	}
-	sleep(1);
+	sleep_us <<= 1;
     }
 
     return 0;
@@ -2865,7 +2875,7 @@ savehistfile(char *fn, int err, int writeflags)
 static int lockhistct;
 
 static int
-checklocktime(char *lockfile, time_t then)
+checklocktime(char *lockfile, long *sleep_usp, time_t then)
 {
     time_t now = time(NULL);
 
@@ -2875,9 +2885,19 @@ checklocktime(char *lockfile, time_t then)
 	return -1;
     }
 
-    if (now - then < 10)
-	sleep(1);
-    else
+    if (now - then < 10) {
+	/*
+	 * To give the effect of a gradually increasing backoff,
+	 * we'll sleep a period based on the time we've spent so far.
+	 */
+	DPUTS(now < then, "time flowing backwards through history");
+	/*
+	 * Randomise to minimise clashes with shells exiting at the same
+	 * time.
+	 */
+	(void)zsleep_random(*sleep_usp, then + 10);
+	*sleep_usp <<= 1;
+    } else
 	unlink(lockfile);
 
     return 0;
@@ -2894,6 +2914,7 @@ lockhistfile(char *fn, int keep_trying)
 {
     int ct = lockhistct;
     int ret = 0;
+    long sleep_us = 0x4000000; /* about 67 ms */
 
     if (!fn && !(fn = getsparam("HISTFILE")))
 	return 1;
@@ -2937,7 +2958,7 @@ lockhistfile(char *fn, int keep_trying)
 		    continue;
 		break;
 	    }
-	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+	    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 		ret = 1;
 		break;
 	    }
@@ -2965,7 +2986,7 @@ lockhistfile(char *fn, int keep_trying)
 			continue;
 		    ret = 2;
 		} else {
-		    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+		    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 			ret = 1;
 			break;
 		    }
@@ -2993,7 +3014,7 @@ lockhistfile(char *fn, int keep_trying)
 		ret = 2;
 		break;
 	    }
-	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+	    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 		ret = 1;
 		break;
 	    }
diff --git a/Src/utils.c b/Src/utils.c
index cf18f12..6ee2ec6 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -2261,6 +2261,10 @@ setblock_stdin(void)
  * Note that apart from setting (and restoring) non-blocking input,
  * this function does not change the input mode.  The calling function
  * should have set cbreak mode if necessary.
+ *
+ * fd may be -1 to sleep until the timeout in microseconds.  This is a
+ * fallback for old systems that don't have nanosleep().  Some very old
+ * systems might not have select: get with it, daddy-o.
  */
 
 /**/
@@ -2282,6 +2286,8 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     struct ttyinfo ti;
 #endif
 
+     if (fd < 0)
+	 polltty = 0;		/* no tty to poll */
 
 #if defined(HAS_TIO) && !defined(__CYGWIN__)
     /*
@@ -2303,7 +2309,7 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
      * as plausible as it sounds, but it seems the right way to guess.
      *		pws 2000/06/26
      */
-    if (polltty) {
+    if (fd >= 0) {
 	gettyinfo(&ti);
 	if ((polltty = ti.tio.c_cc[VMIN])) {
 	    ti.tio.c_cc[VMIN] = 0;
@@ -2319,16 +2325,24 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     expire_tv.tv_sec = (int) (microseconds / (zlong)1000000);
     expire_tv.tv_usec = microseconds % (zlong)1000000;
     FD_ZERO(&foofd);
-    FD_SET(fd, &foofd);
-    ret = select(fd+1, (SELECT_ARG_2_T) &foofd, NULL, NULL, &expire_tv);
+    if (fd > -1) {
+	FD_SET(fd, &foofd);
+	ret = select(fd+1, (SELECT_ARG_2_T) &foofd, NULL, NULL, &expire_tv);
+    } else
+	ret = select(0, NULL, NULL, NULL, &expire_tv);
 #else
+    if (fd < 0) {
+	/* OK, can't do that.  Just quietly sleep for a second. */
+	sleep(1);
+	return 1;
+    }
 #ifdef FIONREAD
     if (ioctl(fd, FIONREAD, (char *) &val) == 0)
 	ret = (val > 0);
 #endif
 #endif
 
-    if (ret < 0) {
+    if (fd >= 0 && ret < 0) {
 	/*
 	 * Final attempt: set non-blocking read and try to read a character.
 	 * Praise Bill, this works under Cygwin (nothing else seems to).
@@ -2350,6 +2364,80 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     return (ret > 0);
 }
 
+/*
+ * Sleep for the given number of microseconds --- must be within
+ * range of a long at the moment, but this is only used for
+ * limited internal purposes.
+ */
+
+/**/
+int
+zsleep(long us)
+{
+#ifdef HAVE_NANOSLEEP
+    struct timespec sleeptime;
+
+    sleeptime.tv_sec = (time_t)us / (time_t)1000000;
+    sleeptime.tv_nsec = (us % 1000000L) * 1000L;
+    for (;;) {
+	struct timespec rem;
+	int ret = nanosleep(&sleeptime, &rem);
+
+	if (ret == 0)
+	    return 1;
+	else if (errno != EINTR)
+	    return 0;
+	sleeptime = rem;
+    }
+#else
+    int dummy;
+    return read_poll(-1, &dummy, 0, us);
+#endif
+}
+
+/**
+ * Sleep for time (fairly) randomly up to max_us microseconds.
+ * Don't let the wallclock time extend beyond end_time.
+ * Return 1 if that seemed to work, else 0.
+ *
+ * For best results max_us should be a multiple of 2**16 or large
+ * enough that it doesn't matter.
+ */
+
+/**/
+int
+zsleep_random(long max_us, time_t end_time)
+{
+    int r;
+    time_t now = time(NULL);
+
+    /*
+     * Randomish backoff.  Doesn't need to be fundamentally
+     * unpredictable, just probably unlike the value another
+     * exiting shell is using.  On some systems the bottom 16
+     * bits aren't that random but the use here doesn't
+     * really care.
+     */
+    r = (long)(rand() & 0xFFFF);
+    /*
+     * Turn this into a fraction of sleep_us.  Again, this
+     * doesn't need to be particularly accurate and the base time
+     * is sufficient that we can do the division first and not
+     * worry about the range.
+     */
+    r = (max_us >> 16) * r;
+    /*
+     * Don't sleep beyond timeout.
+     * Not that important as timeout is ridiculously long, but
+     * if there's an interface, interface to it...
+     */
+    while (now + r > end_time)
+	r >>= 1;
+    if (r) /* pedantry */
+	return zsleep(r);
+    return 0;
+}
+
 /**/
 int
 checkrmall(char *s)
diff --git a/configure.ac b/configure.ac
index 8ea9737..bfc02b2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1299,7 +1299,8 @@ AC_CHECK_FUNCS(strftime strptime mktime timelocal \
 	       gdbm_open getxattr \
 	       realpath canonicalize_file_name \
 	       symlink getcwd \
-	       cygwin_conv_path)
+	       cygwin_conv_path \
+	       nanosleep)
 AC_FUNC_STRCOLL
 
 AH_TEMPLATE([REALPATH_ACCEPTS_NULL],


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: History file lock contention
  2015-01-23 20:49   ` Peter Stephenson
@ 2015-01-23 21:00     ` Peter Stephenson
  2015-01-23 21:28       ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2015-01-23 21:00 UTC (permalink / raw)
  To: ondra, zsh workers

On Fri, 23 Jan 2015 20:49:16 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> +    long sleep_us = 0x4000000; /* about 67 ms */

67 *seconds*... that's probably a bit long.  Sorry.

pws

diff --git a/Src/hist.c b/Src/hist.c
index 11d9722..303c1dd 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2610,7 +2610,8 @@ static int
 flockhistfile(char *fn, int keep_trying)
 {
     struct flock lck;
-    int ctr = keep_trying ? 9 : 0;
+    long sleep_us = 0x10000; /* about 67 ms */
+    time_t end_time;
 
     if (flock_fd >= 0)
 	return 0; /* already locked */
@@ -2623,13 +2624,22 @@ flockhistfile(char *fn, int keep_trying)
     lck.l_start = 0;
     lck.l_len = 0;  /* lock the whole file */
 
+    /*
+     * Timeout is ten seconds.
+     */
+    end_time = time(NULL) + (time_t)10;
     while (fcntl(flock_fd, F_SETLKW, &lck) == -1) {
-	if (--ctr < 0) {
+	if (!keep_trying || time(NULL) >= end_time ||
+	    /*
+	     * Randomise wait to minimise clashes with shells exiting at
+	     * the same time.
+	     */
+	    !zsleep_random(sleep_us, end_time)) {
 	    close(flock_fd);
 	    flock_fd = -1;
 	    return 1;
 	}
-	sleep(1);
+	sleep_us <<= 1;
     }
 
     return 0;
@@ -2865,7 +2875,7 @@ savehistfile(char *fn, int err, int writeflags)
 static int lockhistct;
 
 static int
-checklocktime(char *lockfile, time_t then)
+checklocktime(char *lockfile, long *sleep_usp, time_t then)
 {
     time_t now = time(NULL);
 
@@ -2875,9 +2885,19 @@ checklocktime(char *lockfile, time_t then)
 	return -1;
     }
 
-    if (now - then < 10)
-	sleep(1);
-    else
+    if (now - then < 10) {
+	/*
+	 * To give the effect of a gradually increasing backoff,
+	 * we'll sleep a period based on the time we've spent so far.
+	 */
+	DPUTS(now < then, "time flowing backwards through history");
+	/*
+	 * Randomise to minimise clashes with shells exiting at the same
+	 * time.
+	 */
+	(void)zsleep_random(*sleep_usp, then + 10);
+	*sleep_usp <<= 1;
+    } else
 	unlink(lockfile);
 
     return 0;
@@ -2894,6 +2914,7 @@ lockhistfile(char *fn, int keep_trying)
 {
     int ct = lockhistct;
     int ret = 0;
+    long sleep_us = 0x10000; /* about 67 ms */
 
     if (!fn && !(fn = getsparam("HISTFILE")))
 	return 1;
@@ -2937,7 +2958,7 @@ lockhistfile(char *fn, int keep_trying)
 		    continue;
 		break;
 	    }
-	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+	    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 		ret = 1;
 		break;
 	    }
@@ -2965,7 +2986,7 @@ lockhistfile(char *fn, int keep_trying)
 			continue;
 		    ret = 2;
 		} else {
-		    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+		    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 			ret = 1;
 			break;
 		    }
@@ -2993,7 +3014,7 @@ lockhistfile(char *fn, int keep_trying)
 		ret = 2;
 		break;
 	    }
-	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+	    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 		ret = 1;
 		break;
 	    }
diff --git a/Src/utils.c b/Src/utils.c
index cf18f12..6ee2ec6 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -2261,6 +2261,10 @@ setblock_stdin(void)
  * Note that apart from setting (and restoring) non-blocking input,
  * this function does not change the input mode.  The calling function
  * should have set cbreak mode if necessary.
+ *
+ * fd may be -1 to sleep until the timeout in microseconds.  This is a
+ * fallback for old systems that don't have nanosleep().  Some very old
+ * systems might not have select: get with it, daddy-o.
  */
 
 /**/
@@ -2282,6 +2286,8 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     struct ttyinfo ti;
 #endif
 
+     if (fd < 0)
+	 polltty = 0;		/* no tty to poll */
 
 #if defined(HAS_TIO) && !defined(__CYGWIN__)
     /*
@@ -2303,7 +2309,7 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
      * as plausible as it sounds, but it seems the right way to guess.
      *		pws 2000/06/26
      */
-    if (polltty) {
+    if (fd >= 0) {
 	gettyinfo(&ti);
 	if ((polltty = ti.tio.c_cc[VMIN])) {
 	    ti.tio.c_cc[VMIN] = 0;
@@ -2319,16 +2325,24 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     expire_tv.tv_sec = (int) (microseconds / (zlong)1000000);
     expire_tv.tv_usec = microseconds % (zlong)1000000;
     FD_ZERO(&foofd);
-    FD_SET(fd, &foofd);
-    ret = select(fd+1, (SELECT_ARG_2_T) &foofd, NULL, NULL, &expire_tv);
+    if (fd > -1) {
+	FD_SET(fd, &foofd);
+	ret = select(fd+1, (SELECT_ARG_2_T) &foofd, NULL, NULL, &expire_tv);
+    } else
+	ret = select(0, NULL, NULL, NULL, &expire_tv);
 #else
+    if (fd < 0) {
+	/* OK, can't do that.  Just quietly sleep for a second. */
+	sleep(1);
+	return 1;
+    }
 #ifdef FIONREAD
     if (ioctl(fd, FIONREAD, (char *) &val) == 0)
 	ret = (val > 0);
 #endif
 #endif
 
-    if (ret < 0) {
+    if (fd >= 0 && ret < 0) {
 	/*
 	 * Final attempt: set non-blocking read and try to read a character.
 	 * Praise Bill, this works under Cygwin (nothing else seems to).
@@ -2350,6 +2364,80 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     return (ret > 0);
 }
 
+/*
+ * Sleep for the given number of microseconds --- must be within
+ * range of a long at the moment, but this is only used for
+ * limited internal purposes.
+ */
+
+/**/
+int
+zsleep(long us)
+{
+#ifdef HAVE_NANOSLEEP
+    struct timespec sleeptime;
+
+    sleeptime.tv_sec = (time_t)us / (time_t)1000000;
+    sleeptime.tv_nsec = (us % 1000000L) * 1000L;
+    for (;;) {
+	struct timespec rem;
+	int ret = nanosleep(&sleeptime, &rem);
+
+	if (ret == 0)
+	    return 1;
+	else if (errno != EINTR)
+	    return 0;
+	sleeptime = rem;
+    }
+#else
+    int dummy;
+    return read_poll(-1, &dummy, 0, us);
+#endif
+}
+
+/**
+ * Sleep for time (fairly) randomly up to max_us microseconds.
+ * Don't let the wallclock time extend beyond end_time.
+ * Return 1 if that seemed to work, else 0.
+ *
+ * For best results max_us should be a multiple of 2**16 or large
+ * enough that it doesn't matter.
+ */
+
+/**/
+int
+zsleep_random(long max_us, time_t end_time)
+{
+    int r;
+    time_t now = time(NULL);
+
+    /*
+     * Randomish backoff.  Doesn't need to be fundamentally
+     * unpredictable, just probably unlike the value another
+     * exiting shell is using.  On some systems the bottom 16
+     * bits aren't that random but the use here doesn't
+     * really care.
+     */
+    r = (long)(rand() & 0xFFFF);
+    /*
+     * Turn this into a fraction of sleep_us.  Again, this
+     * doesn't need to be particularly accurate and the base time
+     * is sufficient that we can do the division first and not
+     * worry about the range.
+     */
+    r = (max_us >> 16) * r;
+    /*
+     * Don't sleep beyond timeout.
+     * Not that important as timeout is ridiculously long, but
+     * if there's an interface, interface to it...
+     */
+    while (now + r > end_time)
+	r >>= 1;
+    if (r) /* pedantry */
+	return zsleep(r);
+    return 0;
+}
+
 /**/
 int
 checkrmall(char *s)
diff --git a/configure.ac b/configure.ac
index 8ea9737..bfc02b2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1299,7 +1299,8 @@ AC_CHECK_FUNCS(strftime strptime mktime timelocal \
 	       gdbm_open getxattr \
 	       realpath canonicalize_file_name \
 	       symlink getcwd \
-	       cygwin_conv_path)
+	       cygwin_conv_path \
+	       nanosleep)
 AC_FUNC_STRCOLL
 
 AH_TEMPLATE([REALPATH_ACCEPTS_NULL],


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: History file lock contention
  2015-01-23 21:00     ` Peter Stephenson
@ 2015-01-23 21:28       ` Peter Stephenson
  2015-01-25 20:33         ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2015-01-23 21:28 UTC (permalink / raw)
  To: ondra, zsh workers

On Fri, 23 Jan 2015 21:00:36 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On Fri, 23 Jan 2015 20:49:16 +0000
> Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> > +    long sleep_us = 0x4000000; /* about 67 ms */
> 
> 67 *seconds*... that's probably a bit long.  Sorry.

Apologies, there were more fundamental problems in zsleep_random() that
were cancelling out the problem above.  The following has now had a bit
more testing.  Third time lucky.

I suspect with the randomness we don't need to increase the sleep
period so quickly.

pws

diff --git a/Src/hist.c b/Src/hist.c
index 11d9722..303c1dd 100644
--- a/Src/hist.c
+++ b/Src/hist.c
@@ -2610,7 +2610,8 @@ static int
 flockhistfile(char *fn, int keep_trying)
 {
     struct flock lck;
-    int ctr = keep_trying ? 9 : 0;
+    long sleep_us = 0x10000; /* about 67 ms */
+    time_t end_time;
 
     if (flock_fd >= 0)
 	return 0; /* already locked */
@@ -2623,13 +2624,22 @@ flockhistfile(char *fn, int keep_trying)
     lck.l_start = 0;
     lck.l_len = 0;  /* lock the whole file */
 
+    /*
+     * Timeout is ten seconds.
+     */
+    end_time = time(NULL) + (time_t)10;
     while (fcntl(flock_fd, F_SETLKW, &lck) == -1) {
-	if (--ctr < 0) {
+	if (!keep_trying || time(NULL) >= end_time ||
+	    /*
+	     * Randomise wait to minimise clashes with shells exiting at
+	     * the same time.
+	     */
+	    !zsleep_random(sleep_us, end_time)) {
 	    close(flock_fd);
 	    flock_fd = -1;
 	    return 1;
 	}
-	sleep(1);
+	sleep_us <<= 1;
     }
 
     return 0;
@@ -2865,7 +2875,7 @@ savehistfile(char *fn, int err, int writeflags)
 static int lockhistct;
 
 static int
-checklocktime(char *lockfile, time_t then)
+checklocktime(char *lockfile, long *sleep_usp, time_t then)
 {
     time_t now = time(NULL);
 
@@ -2875,9 +2885,19 @@ checklocktime(char *lockfile, time_t then)
 	return -1;
     }
 
-    if (now - then < 10)
-	sleep(1);
-    else
+    if (now - then < 10) {
+	/*
+	 * To give the effect of a gradually increasing backoff,
+	 * we'll sleep a period based on the time we've spent so far.
+	 */
+	DPUTS(now < then, "time flowing backwards through history");
+	/*
+	 * Randomise to minimise clashes with shells exiting at the same
+	 * time.
+	 */
+	(void)zsleep_random(*sleep_usp, then + 10);
+	*sleep_usp <<= 1;
+    } else
 	unlink(lockfile);
 
     return 0;
@@ -2894,6 +2914,7 @@ lockhistfile(char *fn, int keep_trying)
 {
     int ct = lockhistct;
     int ret = 0;
+    long sleep_us = 0x10000; /* about 67 ms */
 
     if (!fn && !(fn = getsparam("HISTFILE")))
 	return 1;
@@ -2937,7 +2958,7 @@ lockhistfile(char *fn, int keep_trying)
 		    continue;
 		break;
 	    }
-	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+	    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 		ret = 1;
 		break;
 	    }
@@ -2965,7 +2986,7 @@ lockhistfile(char *fn, int keep_trying)
 			continue;
 		    ret = 2;
 		} else {
-		    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+		    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 			ret = 1;
 			break;
 		    }
@@ -2993,7 +3014,7 @@ lockhistfile(char *fn, int keep_trying)
 		ret = 2;
 		break;
 	    }
-	    if (checklocktime(lockfile, sb.st_mtime) < 0) {
+	    if (checklocktime(lockfile, &sleep_us, sb.st_mtime) < 0) {
 		ret = 1;
 		break;
 	    }
diff --git a/Src/utils.c b/Src/utils.c
index cf18f12..cf89f09 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -2261,6 +2261,10 @@ setblock_stdin(void)
  * Note that apart from setting (and restoring) non-blocking input,
  * this function does not change the input mode.  The calling function
  * should have set cbreak mode if necessary.
+ *
+ * fd may be -1 to sleep until the timeout in microseconds.  This is a
+ * fallback for old systems that don't have nanosleep().  Some very old
+ * systems might not have select: get with it, daddy-o.
  */
 
 /**/
@@ -2282,6 +2286,8 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     struct ttyinfo ti;
 #endif
 
+     if (fd < 0)
+	 polltty = 0;		/* no tty to poll */
 
 #if defined(HAS_TIO) && !defined(__CYGWIN__)
     /*
@@ -2303,7 +2309,7 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
      * as plausible as it sounds, but it seems the right way to guess.
      *		pws 2000/06/26
      */
-    if (polltty) {
+    if (fd >= 0) {
 	gettyinfo(&ti);
 	if ((polltty = ti.tio.c_cc[VMIN])) {
 	    ti.tio.c_cc[VMIN] = 0;
@@ -2319,16 +2325,24 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     expire_tv.tv_sec = (int) (microseconds / (zlong)1000000);
     expire_tv.tv_usec = microseconds % (zlong)1000000;
     FD_ZERO(&foofd);
-    FD_SET(fd, &foofd);
-    ret = select(fd+1, (SELECT_ARG_2_T) &foofd, NULL, NULL, &expire_tv);
+    if (fd > -1) {
+	FD_SET(fd, &foofd);
+	ret = select(fd+1, (SELECT_ARG_2_T) &foofd, NULL, NULL, &expire_tv);
+    } else
+	ret = select(0, NULL, NULL, NULL, &expire_tv);
 #else
+    if (fd < 0) {
+	/* OK, can't do that.  Just quietly sleep for a second. */
+	sleep(1);
+	return 1;
+    }
 #ifdef FIONREAD
     if (ioctl(fd, FIONREAD, (char *) &val) == 0)
 	ret = (val > 0);
 #endif
 #endif
 
-    if (ret < 0) {
+    if (fd >= 0 && ret < 0) {
 	/*
 	 * Final attempt: set non-blocking read and try to read a character.
 	 * Praise Bill, this works under Cygwin (nothing else seems to).
@@ -2350,6 +2364,80 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds)
     return (ret > 0);
 }
 
+/*
+ * Sleep for the given number of microseconds --- must be within
+ * range of a long at the moment, but this is only used for
+ * limited internal purposes.
+ */
+
+/**/
+int
+zsleep(long us)
+{
+#ifdef HAVE_NANOSLEEP
+    struct timespec sleeptime;
+
+    sleeptime.tv_sec = (time_t)us / (time_t)1000000;
+    sleeptime.tv_nsec = (us % 1000000L) * 1000L;
+    for (;;) {
+	struct timespec rem;
+	int ret = nanosleep(&sleeptime, &rem);
+
+	if (ret == 0)
+	    return 1;
+	else if (errno != EINTR)
+	    return 0;
+	sleeptime = rem;
+    }
+#else
+    int dummy;
+    return read_poll(-1, &dummy, 0, us);
+#endif
+}
+
+/**
+ * Sleep for time (fairly) randomly up to max_us microseconds.
+ * Don't let the wallclock time extend beyond end_time.
+ * Return 1 if that seemed to work, else 0.
+ *
+ * For best results max_us should be a multiple of 2**16 or large
+ * enough that it doesn't matter.
+ */
+
+/**/
+int
+zsleep_random(long max_us, time_t end_time)
+{
+    long r;
+    time_t now = time(NULL);
+
+    /*
+     * Randomish backoff.  Doesn't need to be fundamentally
+     * unpredictable, just probably unlike the value another
+     * exiting shell is using.  On some systems the bottom 16
+     * bits aren't that random but the use here doesn't
+     * really care.
+     */
+    r = (long)(rand() & 0xFFFF);
+    /*
+     * Turn this into a fraction of sleep_us.  Again, this
+     * doesn't need to be particularly accurate and the base time
+     * is sufficient that we can do the division first and not
+     * worry about the range.
+     */
+    r = (max_us >> 16) * r;
+    /*
+     * Don't sleep beyond timeout.
+     * Not that important as timeout is ridiculously long, but
+     * if there's an interface, interface to it...
+     */
+    while (r && now + (time_t)(r / 1000000) > end_time)
+	r >>= 1;
+    if (r) /* pedantry */
+	return zsleep(r);
+    return 0;
+}
+
 /**/
 int
 checkrmall(char *s)
diff --git a/configure.ac b/configure.ac
index 8ea9737..bfc02b2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1299,7 +1299,8 @@ AC_CHECK_FUNCS(strftime strptime mktime timelocal \
 	       gdbm_open getxattr \
 	       realpath canonicalize_file_name \
 	       symlink getcwd \
-	       cygwin_conv_path)
+	       cygwin_conv_path \
+	       nanosleep)
 AC_FUNC_STRCOLL
 
 AH_TEMPLATE([REALPATH_ACCEPTS_NULL],


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: History file lock contention
  2015-01-23 21:28       ` Peter Stephenson
@ 2015-01-25 20:33         ` Peter Stephenson
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Stephenson @ 2015-01-25 20:33 UTC (permalink / raw)
  To: zsh workers

On Fri, 23 Jan 2015 21:28:34 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On Fri, 23 Jan 2015 21:00:36 +0000
> Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> > On Fri, 23 Jan 2015 20:49:16 +0000
> > Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> > > +    long sleep_us = 0x4000000; /* about 67 ms */
> > 
> > 67 *seconds*... that's probably a bit long.  Sorry.
> 
> Apologies, there were more fundamental problems in zsleep_random() that
> were cancelling out the problem above.  The following has now had a bit
> more testing.  Third time lucky.

I've pushed this to Let the Market Decide.  As far as I understand these
things, this can't possibly be wrong.

pws

P.S. being British, I'm immune to complaints about sarcasm.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-01-25 20:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-23 11:46 History file lock contention ondra
2015-01-23 12:13 ` Peter Stephenson
2015-01-23 20:49   ` Peter Stephenson
2015-01-23 21:00     ` Peter Stephenson
2015-01-23 21:28       ` Peter Stephenson
2015-01-25 20:33         ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).