From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=1.0 required=5.0 tests=LOTS_OF_MONEY, MAILING_LIST_MULTI,PDS_OTHER_BAD_TLD,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 Received: (qmail 23694 invoked from network); 13 Apr 2020 21:35:44 -0000 Received-SPF: pass (primenet.com.au: domain of zsh.org designates 203.24.36.2 as permitted sender) receiver=inbox.vuxu.org; client-ip=203.24.36.2 envelope-from= Received: from ns1.primenet.com.au (HELO primenet.com.au) (203.24.36.2) by inbox.vuxu.org with UTF8ESMTPZ; 13 Apr 2020 21:35:44 -0000 Received: (qmail 16747 invoked by alias); 13 Apr 2020 21:35:33 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: List-Unsubscribe: X-Seq: 45690 Received: (qmail 12224 invoked by uid 1010); 13 Apr 2020 21:35:33 -0000 X-Qmail-Scanner-Diagnostics: from nef2.ens.fr by f.primenet.com.au (envelope-from , uid 7791) with qmail-scanner-2.11 (clamdscan: 0.102.2/25779. spamassassin: 3.4.4. Clear:RC:0(129.199.96.40):SA:0(-2.2/5.0):. Processed in 3.810499 secs); 13 Apr 2020 21:35:33 -0000 X-Envelope-From: ware@phare.normalesup.org X-Qmail-Scanner-Mime-Attachments: | X-Qmail-Scanner-Zip-Files: | Received-SPF: pass (ns1.primenet.com.au: SPF record at ens.fr designates 129.199.96.40 as permitted sender) X-ENS-nef-client: 129.199.129.80 Date: Mon, 13 Apr 2020 23:34:49 +0200 From: Cedric Ware To: zsh-workers@zsh.org Subject: Re: [PATCH] Enable sub-second timeout in zsystem flock Message-ID: <20200413213449.orqym5bqboznancl@phare.normalesup.org> References: <20200111154143.fjtwgfnztqfmkyda@phare.normalesup.org> <20200308183907.mxnhqrr2uflwooax@phare.normalesup.org> <20200314210454.hp562smyqv3ew255@phare.normalesup.org> <20200315005036.45bc846b@tarpaulin.shahaf.local2> <20200315160324.dstgtmajzwxpaccn@phare.normalesup.org> <20200315165410.GA30241@tarpaulin.shahaf.local2> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200315165410.GA30241@tarpaulin.shahaf.local2> User-Agent: NeoMutt/20170113 (1.7.2) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (nef.ens.fr [129.199.96.32]); Mon, 13 Apr 2020 23:34:49 +0200 (CEST) Hello, About the patch I submitted last month enabling fractional seconds for the timeout in zsystem flock, there was the following outstanding issue. Daniel Shahaf (Sunday 2020-03-15): > How likely is zsh 5.9 to be ported to a platform that doesn't have a > 64-bit type? > > If that's a concern, I'd recommend to issue a warning or disable the > feature when we detect that case. > > I'm not sure whether that's a concern. If it is, here is a reimplementation of the patch that doesn't depend on a 64-bit type. Instead of counting time in microseconds, I've used struct timespec and a function similar to zgettime() but that uses CLOCK_MONOTONIC if available instead of CLOCK_REALTIME. (I haven't managed to avoid some code duplication, because a common function would require choosing between the two clocks, thus need a parameter of type clockid_t, which is not always defined. So I don't know how to write the prototype of a common function.) Another difference is that the timeout length is now limited to 2^30-1 seconds (34 years) so as to avoid overflowing time_t on 32-bit systems. There is one case where it may overflow, but it can't be avoided: if time_t is 32-bit signed and CLOCK_MONOTONIC isn't available or returns the wall clock instead of e.g. the uptime. Then, specifying a timeout beyond year 2038 won't work, but then the system has a Y2038 problem anyway. I don't see how to avoid it short of checking the timeout against Y2038, which may actually create a Y2038 problem on systems that wouldn't otherwise have it. Also, the retry interval is now limited to 0.999*LONG_MAX microseconds; the 0.999 avoids a rounding problem that causes an overflow on 64-bit systems (the value is given as a double, but a double can't represent 2^63-1 exactly). I think that's all. I believe I checked all code paths, though I don't know how to do it systematically. Thanks, best regards, Cedric Ware. diff --git a/Doc/Zsh/mod_system.yo b/Doc/Zsh/mod_system.yo index 6292af071..be9a99f62 100644 --- a/Doc/Zsh/mod_system.yo +++ b/Doc/Zsh/mod_system.yo @@ -166,7 +166,7 @@ to the command, or 2 for an error on the write; no error message is printed in the last case, but the parameter tt(ERRNO) will reflect the error that occurred. ) -xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-f) var(var) ] [tt(-er)] var(file)) +xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-i) var(interval) ] [ tt(-f) var(var) ] [tt(-er)] var(file)) item(tt(zsystem flock -u) var(fd_expr))( The builtin tt(zsystem)'s subcommand tt(flock) performs advisory file locking (via the manref(fcntl)(2) system call) over the entire contents @@ -196,9 +196,16 @@ a safety check that the file descriptor is in use for file locking. By default the shell waits indefinitely for the lock to succeed. The option tt(-t) var(timeout) specifies a timeout for the lock in -seconds; currently this must be an integer. The shell will attempt -to lock the file once a second during this period. If the attempt -times out, status 2 is returned. +seconds; fractional seconds are allowed. During this period, the +shell will attempt to lock the file every var(interval) seconds +if the tt(-i) var(interval) option is given, otherwise once a second. +(This var(interval) is shortened before the last attempt if needed, +so that the shell waits only until the var(timeout) and not longer.) +If the attempt times out, status 2 is returned. + +(Note: var(timeout) must be less than 2^30-1 seconds (about 34 years); +and var(interval) must be less than 0.999 * LONG_MAX microseconds +(only about 35 minutes on 32-bit systems).) If the option tt(-e) is given, the file descriptor for the lock is preserved when the shell uses tt(exec) to start a new process; diff --git a/Src/Modules/system.c b/Src/Modules/system.c index fb3d80773..a60c18f5d 100644 --- a/Src/Modules/system.c +++ b/Src/Modules/system.c @@ -29,6 +29,7 @@ #include "system.mdh" #include "system.pro" +#include #ifdef HAVE_POLL_H # include @@ -531,7 +532,9 @@ static int bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) { int cloexec = 1, unlock = 0, readlock = 0; - zlong timeout = -1; + double timeout = -1; + long timeout_interval = 1e6; + mnumber timeout_param; char *fdvar = NULL; #ifdef HAVE_FCNTL_H struct flock lck; @@ -583,7 +586,51 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) } else { optarg = *args++; } - timeout = mathevali(optarg); + timeout_param = matheval(optarg); + timeout = (timeout_param.type & MN_FLOAT)? + timeout_param.u.d : (double)timeout_param.u.l; + + /* + * timeout must not overflow time_t, but little is known + * about this type's limits. Conservatively limit to 2^30-1 + * (34 years). Then it can only overflow if time_t is only + * a 32-bit int and CLOCK_MONOTONIC is not supported, in which + * case there is a Y2038 problem anyway. + */ + if ((timeout < 1e-6) || (timeout > 1073741823.)) { + zwarnnam(nam, "flock: invalid timeout value: '%s'", + optarg); + return 1; + } + break; + + case 'i': + /* retry interval in seconds */ + if (optptr[1]) { + optarg = optptr + 1; + optptr += strlen(optarg) - 1; + } else if (!*args) { + zwarnnam(nam, + "flock: option %c requires " + "a numeric retry interval", + opt); + return 1; + } else { + optarg = *args++; + } + timeout_param = matheval(optarg); + if (!(timeout_param.type & MN_FLOAT)) { + timeout_param.type = MN_FLOAT; + timeout_param.u.d = (double)timeout_param.u.l; + } + timeout_param.u.d *= 1e6; + if ((timeout_param.u.d < 1) + || (timeout_param.u.d > 0.999 * LONG_MAX)) { + zwarnnam(nam, "flock: invalid interval value: '%s'", + optarg); + return 1; + } + timeout_interval = (long)timeout_param.u.d; break; case 'u': @@ -647,7 +694,24 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) lck.l_len = 0; /* lock the whole file */ if (timeout > 0) { - time_t end = time(NULL) + (time_t)timeout; + /* + * Get current time, calculate timeout time. + * No need to check for overflow, already checked above. + */ + struct timespec now, end; + double timeout_s; + long remaining_us; + zgettime_monotonic_if_available(&now); + end.tv_sec = now.tv_sec; + end.tv_nsec = now.tv_nsec; + end.tv_nsec += modf(timeout, &timeout_s) * 1000000000L; + end.tv_sec += timeout_s; + if (end.tv_nsec >= 1000000000L) { + end.tv_nsec -= 1000000000L; + end.tv_sec += 1; + } + + /* Try acquiring lock, loop until timeout. */ while (fcntl(flock_fd, F_SETLK, &lck) < 0) { if (errflag) { zclose(flock_fd); @@ -658,11 +722,16 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) zwarnnam(nam, "failed to lock file %s: %e", args[0], errno); return 1; } - if (time(NULL) >= end) { + zgettime_monotonic_if_available(&now); + remaining_us = timespec_diff_us(&now, &end); + if (remaining_us <= 0) { zclose(flock_fd); return 2; } - sleep(1); + if (remaining_us <= timeout_interval) { + timeout_interval = remaining_us; + } + zsleep(timeout_interval); } } else { while (fcntl(flock_fd, timeout == 0 ? F_SETLK : F_SETLKW, &lck) < 0) { diff --git a/Src/compat.c b/Src/compat.c index 74e426fba..817bb4aaf 100644 --- a/Src/compat.c +++ b/Src/compat.c @@ -126,6 +126,32 @@ zgettime(struct timespec *ts) return ret; } +/* Likewise with CLOCK_MONOTONIC if available. */ + +/**/ +mod_export int +zgettime_monotonic_if_available(struct timespec *ts) +{ + int ret = -1; + +#if defined(HAVE_CLOCK_GETTIME) && defined(CLOCK_MONOTONIC) + struct timespec dts; + if (clock_gettime(CLOCK_MONOTONIC, &dts) < 0) { + zwarn("unable to retrieve CLOCK_MONOTONIC time: %e", errno); + ret--; + } else { + ret++; + ts->tv_sec = (time_t) dts.tv_sec; + ts->tv_nsec = (long) dts.tv_nsec; + } +#endif + + if (ret) { + ret = zgettime(ts); + } + return ret; +} + /* compute the difference between two calendar times */ diff --git a/Src/utils.c b/Src/utils.c index 69885fed3..db5626830 100644 --- a/Src/utils.c +++ b/Src/utils.c @@ -2748,6 +2748,42 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds) return (ret > 0); } +/* + * Return the difference between 2 times, given as struct timespec*, + * expressed in microseconds, as a long. If the difference doesn't fit + * into a long, return LONG_MIN or LONG_MAX so that the times can still + * be compared. + * + * Note: returns a long rather than a zlong because zsleep() below + * takes a long. + */ + +/**/ +long +timespec_diff_us(const struct timespec *t1, const struct timespec *t2) +{ + int reverse = (t1->tv_sec > t2->tv_sec); + time_t diff_sec; + long diff_usec, max_margin, res; + + /* Don't just subtract t2-t1 because time_t might be unsigned. */ + diff_sec = (reverse? t1->tv_sec - t2->tv_sec : t2->tv_sec - t1->tv_sec); + if (diff_sec > LONG_MAX / 1000000L) { + goto overflow; + } + res = diff_sec * 1000000L; + max_margin = LONG_MAX - res; + diff_usec = (reverse? + t1->tv_nsec - t2->tv_nsec : t2->tv_nsec - t1->tv_nsec + ) / 1000; + if (diff_usec <= max_margin) { + res += diff_usec; + return (reverse? -res : res); + } + overflow: + return (reverse? LONG_MIN : LONG_MAX); +} + /* * Sleep for the given number of microseconds --- must be within * range of a long at the moment, but this is only used for diff --git a/Test/V14system.ztst b/Test/V14system.ztst index e69de29bb..79a2c0a77 100644 --- a/Test/V14system.ztst +++ b/Test/V14system.ztst @@ -0,0 +1,150 @@ +# Test zsh/system module + +%prep + + if zmodload -s zsh/system && zmodload -s zsh/zselect; then + tst_dir=V13.tmp + mkdir -p -- $tst_dir + else + ZTST_unimplemented='the zsh/system and zsh/zselect modules are not available' + fi + : > $tst_dir/file # File on which to acquire flock. + +%test + + ( + zsystem flock -t 0.1 -i 0.000001 $tst_dir/file + ) +0:zsystem flock valid time arguments + + ( + zsystem flock -t -1 $tst_dir/file || + zsystem flock -t 0.49e-6 $tst_dir/file || + zsystem flock -t 1073741824 $tst_dir/file || + zsystem flock -t 1e100 $tst_dir/file || + zsystem flock -i -1 $tst_dir/file || + zsystem flock -i 0.49e-6 $tst_dir/file || + zsystem flock -i 1e100 $tst_dir/file + ) +1:zsystem flock invalid time arguments +?(eval):zsystem:2: flock: invalid timeout value: '-1' +?(eval):zsystem:3: flock: invalid timeout value: '0.49e-6' +?(eval):zsystem:4: flock: invalid timeout value: '1073741824' +?(eval):zsystem:5: flock: invalid timeout value: '1e100' +?(eval):zsystem:6: flock: invalid interval value: '-1' +?(eval):zsystem:7: flock: invalid interval value: '0.49e-6' +?(eval):zsystem:8: flock: invalid interval value: '1e100' + + ( + # Lock file for 1 second in the background. + lock_flag=$tst_dir/locked1 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 100 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + else + # Attempt to lock file with 0.5 second timeout: must fail. + zsystem flock -t 0.5 $tst_dir/file + fi + ) +2:zsystem flock unsuccessful wait test +F:This timing test might fail due to process scheduling issues unrelated to zsh. + + ( + # Lock file for 0.5 second in the background. + lock_flag=$tst_dir/locked2 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 50 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + fi + typeset -F SECONDS + start=$SECONDS + # Attempt to lock file without a timeout: + # must succeed after sub-shell above releases it (0.5 second). + if zsystem flock $tst_dir/file; then + elapsed=$[ $SECONDS - $start ] + if [[ $elapsed -ge 0.3 && $elapsed -le 0.7 ]]; then + echo "elapsed time seems OK" 1>&2 + else + echo "elapsed time $elapsed should be ~ 0.5 second" 1>&2 + fi + fi + ) +0:zsystem flock successful wait test, no timeout +?elapsed time seems OK +F:This timing test might fail due to process scheduling issues unrelated to zsh. + + ( + # Lock file for 0.5 second in the background. + lock_flag=$tst_dir/locked3 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 50 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + fi + typeset -F SECONDS + start=$SECONDS + # Attempt to lock file with 1-second timeout: + # must succeed 1 second after start because we retry every 1 second. + if zsystem flock -t 1 $tst_dir/file; then + elapsed=$[ $SECONDS - $start ] + if [[ $elapsed -ge 0.8 && $elapsed -le 1.2 ]]; then + echo "elapsed time seems OK" 1>&2 + else + echo "elapsed time $elapsed should be ~ 1 second" 1>&2 + fi + fi + ) +0:zsystem flock successful wait test, integral seconds +?elapsed time seems OK +F:This timing test might fail due to process scheduling issues unrelated to zsh. + + ( + # Lock file for 0.25 second in the background. + lock_flag=$tst_dir/locked4 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 25 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + fi + typeset -F SECONDS + start=$SECONDS + # Attempt to lock file with 0.4-second timeout, retrying every 0.1 second: + # must succeed 0.3 second after start. + if zsystem flock -t 0.4 -i 0.1 $tst_dir/file; then + elapsed=$[ $SECONDS - $start ] + if [[ $elapsed -ge 0.2 && $elapsed -le 0.5 ]]; then + echo "elapsed time seems OK" 1>&2 + else + echo "elapsed time $elapsed should be ~ 0.3 second" 1>&2 + fi + fi + ) +0:zsystem flock successful wait test, fractional seconds +?elapsed time seems OK +F:This timing test might fail due to process scheduling issues unrelated to zsh.