From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE,RDNS_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 Received: (qmail 28754 invoked from network); 15 Mar 2020 16:04:21 -0000 Received-SPF: pass (primenet.com.au: domain of zsh.org designates 203.24.36.2 as permitted sender) receiver=inbox.vuxu.org; client-ip=203.24.36.2 envelope-from= Received: from unknown (HELO primenet.com.au) (203.24.36.2) by inbox.vuxu.org with ESMTP; 15 Mar 2020 16:04:21 -0000 Received: (qmail 6913 invoked by alias); 15 Mar 2020 16:04:09 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: List-Unsubscribe: X-Seq: 45561 Received: (qmail 26719 invoked by uid 1010); 15 Mar 2020 16:04:09 -0000 X-Qmail-Scanner-Diagnostics: from nef2.ens.fr by f.primenet.com.au (envelope-from , uid 7791) with qmail-scanner-2.11 (clamdscan: 0.102.2/25744. spamassassin: 3.4.2. Clear:RC:0(129.199.96.40):SA:0(-4.2/5.0):. Processed in 3.949502 secs); 15 Mar 2020 16:04:09 -0000 X-Envelope-From: ware@phare.normalesup.org X-Qmail-Scanner-Mime-Attachments: | X-Qmail-Scanner-Zip-Files: | Received-SPF: pass (ns1.primenet.com.au: SPF record at ens.fr designates 129.199.96.40 as permitted sender) X-ENS-nef-client: 129.199.129.80 Date: Sun, 15 Mar 2020 17:03:24 +0100 From: Cedric Ware To: Daniel Shahaf Cc: dana , zsh-workers@zsh.org Subject: Re: [PATCH] Enable sub-second timeout in zsystem flock Message-ID: <20200315160324.dstgtmajzwxpaccn@phare.normalesup.org> References: <43775C64-0254-45AB-81AB-B04AE80C4416@dana.is> <20200106173030.eb2pg4rhhgysh35r@phare.normalesup.org> <20200111154143.fjtwgfnztqfmkyda@phare.normalesup.org> <20200308183907.mxnhqrr2uflwooax@phare.normalesup.org> <20200314210454.hp562smyqv3ew255@phare.normalesup.org> <20200315005036.45bc846b@tarpaulin.shahaf.local2> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200315005036.45bc846b@tarpaulin.shahaf.local2> User-Agent: NeoMutt/20170113 (1.7.2) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (nef.ens.fr [129.199.96.32]); Sun, 15 Mar 2020 17:03:25 +0100 (CET) Hello, Thanks for the feedback. I have tried to address all the issues with the new patch below (against git HEAD), but please keep in mind that I'm not familiar with the test framework, so I may have missed better solutions. Some comments on what I may not have addressed: Daniel Shahaf (Sunday 2020-03-15): > Cedric Ware wrote on Sat, 14 Mar 2020 22:04 +0100: > > +++ zsh-5.8/Doc/Zsh/mod_system.yo 2020-03-08 18:39:32.672683779 +0100 > > @@ -166,7 +166,7 @@ > > printed in the last case, but the parameter tt(ERRNO) will reflect > > the error that occurred. > > ) > > -xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-f) var(var) ] [tt(-er)] var(file)) > > +xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-i) var(interval) ] [ tt(-f) var(var) ] [tt(-er)] var(file)) > > item(tt(zsystem flock -u) var(fd_expr))( > > Please s/var(var)/var(interval)/ in the body paragraphs of the item()(). I don't understand. Option -f's parameter is actually called "var". The only occurrences of "var(var)" here are about -f, I didn't add them. > > + timeout_tmp = timeout_param.u.d * 1e6; > > + if ((timeout_tmp < 1) || (timeout_tmp > ZLONG_MAX / 2)) { > > + zwarnnam(nam, "flock: invalid timeout value"); > > I think the invalid value should be included in the error message, > either as a number, or as a string if needed: "flock: invalid timeout > value: '%s'". Done, but it had to be as a string, because I didn't see a way to use a zlong in zwarnnam(). > dana, I see you advocated the opposite approach in 45283. What's your > rationale? I think that was about my earlier attempt to include the actual limit in the error message, not the input. I reverted to a generic message, because I don't know how to make the test script check for an error message that depends on the system/compiler options. > > + zlong now; > > + zlong end = time_clock_us() + timeout; > > Could this sum overflow (for example, if the -t option is used)? I don't think so: timeout is limited to ZLONG_MAX / 2, which corresponds to 140000 years or so if zlong is 64-bit. For the sum to overflow, the current time as given by either gettimeofday() or clock_gettime(CLOCK_MONOTONIC) would have to also be greater than that. OTOH, if zlong is only 32-bit, there could be a problem, and in fact my function time_clock_us() might not work. However, it was suggested in message 45250 that it shouldn't be my problem. Should it be? Should this whole patch be disabled somehow on systems that don't support any 64-bit type? > > + if (now + timeout_retry > end) { > > Could this sum overflow (for example, if the -i option is used)? It might if long and zlong are the same size, as timeout_retry is limited to LONG_MAX. The program would then wait past the timeout by one interval. But, again, this interval would have to be hundreds of thousands of years; if a user requests that, in my opinion, we might as well just wait forever. Still, would you like it better if I limited the interval to min(LONG_MAX, ZLONG_MAX / 2) instead of LONG_MAX? > Isn't the macro spelled HAVE_CLOCK_GETTIME? It's spelled that way in > Src/compat.c:105. Oops. > Have both codepaths (the #if and the #else) been tested? Well, now they have. Best regards, Cedric Ware. diff --git a/Doc/Zsh/mod_system.yo b/Doc/Zsh/mod_system.yo index 6292af071..715acf45b 100644 --- a/Doc/Zsh/mod_system.yo +++ b/Doc/Zsh/mod_system.yo @@ -166,7 +166,7 @@ to the command, or 2 for an error on the write; no error message is printed in the last case, but the parameter tt(ERRNO) will reflect the error that occurred. ) -xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-f) var(var) ] [tt(-er)] var(file)) +xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-i) var(interval) ] [ tt(-f) var(var) ] [tt(-er)] var(file)) item(tt(zsystem flock -u) var(fd_expr))( The builtin tt(zsystem)'s subcommand tt(flock) performs advisory file locking (via the manref(fcntl)(2) system call) over the entire contents @@ -196,9 +196,16 @@ a safety check that the file descriptor is in use for file locking. By default the shell waits indefinitely for the lock to succeed. The option tt(-t) var(timeout) specifies a timeout for the lock in -seconds; currently this must be an integer. The shell will attempt -to lock the file once a second during this period. If the attempt -times out, status 2 is returned. +seconds; fractional seconds are allowed. During this period, the +shell will attempt to lock the file every var(interval) seconds +if the tt(-i) var(interval) option is given, otherwise once a second. +(This var(interval) is shortened before the last attempt if needed, +so that the shell waits only until the var(timeout) and not longer.) +If the attempt times out, status 2 is returned. + +(Note: var(interval) must be less than LONG_MAX microseconds. +This is many millenia on 64-bit systems, but only about 35 minutes +on 32-bit systems.) If the option tt(-e) is given, the file descriptor for the lock is preserved when the shell uses tt(exec) to start a new process; diff --git a/Src/Modules/system.c b/Src/Modules/system.c index fb3d80773..e0ddd2a2a 100644 --- a/Src/Modules/system.c +++ b/Src/Modules/system.c @@ -532,6 +532,9 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) { int cloexec = 1, unlock = 0, readlock = 0; zlong timeout = -1; + double timeout_tmp; + long timeout_retry = 1e6; + mnumber timeout_param; char *fdvar = NULL; #ifdef HAVE_FCNTL_H struct flock lck; @@ -583,7 +586,46 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) } else { optarg = *args++; } - timeout = mathevali(optarg); + timeout_param = matheval(optarg); + if (!(timeout_param.type & MN_FLOAT)) { + timeout_param.type = MN_FLOAT; + timeout_param.u.d = (double)timeout_param.u.l; + } + timeout_tmp = timeout_param.u.d * 1e6; + if ((timeout_tmp < 1) || (timeout_tmp > ZLONG_MAX / 2)) { + zwarnnam(nam, "flock: invalid timeout value: '%s'", + optarg); + return 1; + } + timeout = (zlong)timeout_tmp; + break; + + case 'i': + /* retry interval in seconds */ + if (optptr[1]) { + optarg = optptr + 1; + optptr += strlen(optarg) - 1; + } else if (!*args) { + zwarnnam(nam, + "flock: option %c requires " + "a numeric retry interval", + opt); + return 1; + } else { + optarg = *args++; + } + timeout_param = matheval(optarg); + if (!(timeout_param.type & MN_FLOAT)) { + timeout_param.type = MN_FLOAT; + timeout_param.u.d = (double)timeout_param.u.l; + } + timeout_tmp = timeout_param.u.d * 1e6; + if ((timeout_tmp < 1) || (timeout_tmp > LONG_MAX)) { + zwarnnam(nam, "flock: invalid interval value: '%s'", + optarg); + return 1; + } + timeout_retry = (long)timeout_tmp; break; case 'u': @@ -647,7 +689,8 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) lck.l_len = 0; /* lock the whole file */ if (timeout > 0) { - time_t end = time(NULL) + (time_t)timeout; + zlong now; + zlong end = time_clock_us() + timeout; while (fcntl(flock_fd, F_SETLK, &lck) < 0) { if (errflag) { zclose(flock_fd); @@ -658,11 +701,15 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(int func)) zwarnnam(nam, "failed to lock file %s: %e", args[0], errno); return 1; } - if (time(NULL) >= end) { + now = time_clock_us(); + if (now >= end) { zclose(flock_fd); return 2; } - sleep(1); + if (now + timeout_retry > end) { + timeout_retry = end - now; + } + zsleep(timeout_retry); } } else { while (fcntl(flock_fd, timeout == 0 ? F_SETLK : F_SETLKW, &lck) < 0) { diff --git a/Src/utils.c b/Src/utils.c index f9c2d4a2b..efa9fe08a 100644 --- a/Src/utils.c +++ b/Src/utils.c @@ -2744,6 +2744,26 @@ read_poll(int fd, int *readchar, int polltty, zlong microseconds) return (ret > 0); } +/* + * Return the current time in microseconds, using the system's + * monotonic clock if supported, the wall clock if not. + */ + +/**/ +zlong +time_clock_us(void) +{ +#if defined(HAVE_CLOCK_GETTIME) && defined(CLOCK_MONOTONIC) + struct timespec ts; + clock_gettime(CLOCK_MONOTONIC, &ts); + return ts.tv_sec * (zlong)1e6 + ts.tv_nsec / 1000; +#else + struct timeval tv; + gettimeofday(&tv, NULL); + return tv.tv_sec * (zlong)1e6 + tv.tv_usec; +#endif +} + /* * Sleep for the given number of microseconds --- must be within * range of a long at the moment, but this is only used for diff --git a/Test/V14system.ztst b/Test/V14system.ztst index e69de29bb..bc01a8fe6 100644 --- a/Test/V14system.ztst +++ b/Test/V14system.ztst @@ -0,0 +1,148 @@ +# Test zsh/system module + +%prep + + if zmodload -s zsh/system && zmodload -s zsh/zselect; then + tst_dir=V13.tmp + mkdir -p -- $tst_dir + else + ZTST_unimplemented='the zsh/system and zsh/zselect modules are not available' + fi + : > $tst_dir/file # File on which to acquire flock. + +%test + + ( + zsystem flock -t 0.1 -i 0.000001 $tst_dir/file + ) +0:zsystem flock valid time arguments + + ( + zsystem flock -t -1 $tst_dir/file || + zsystem flock -t 0.49e-6 $tst_dir/file || + zsystem flock -t 1e100 $tst_dir/file || + zsystem flock -i -1 $tst_dir/file || + zsystem flock -i 0.49e-6 $tst_dir/file || + zsystem flock -i 1e100 $tst_dir/file + ) +1:zsystem flock invalid time arguments +?(eval):zsystem:2: flock: invalid timeout value: '-1' +?(eval):zsystem:3: flock: invalid timeout value: '0.49e-6' +?(eval):zsystem:4: flock: invalid timeout value: '1e100' +?(eval):zsystem:5: flock: invalid interval value: '-1' +?(eval):zsystem:6: flock: invalid interval value: '0.49e-6' +?(eval):zsystem:7: flock: invalid interval value: '1e100' + + ( + # Lock file for 1 second in the background. + lock_flag=$tst_dir/locked1 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 100 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + else + # Attempt to lock file with 0.5 second timeout: must fail. + zsystem flock -t 0.5 $tst_dir/file + fi + ) +2:zsystem flock unsuccessful wait test +F:This timing test might fail due to process scheduling issues unrelated to zsh. + + ( + # Lock file for 0.5 second in the background. + lock_flag=$tst_dir/locked2 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 50 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + fi + typeset -F SECONDS + start=$SECONDS + # Attempt to lock file without a timeout: + # must succeed after sub-shell above releases it (0.5 second). + if zsystem flock $tst_dir/file; then + elapsed=$[ $SECONDS - $start ] + if [[ $elapsed -ge 0.3 && $elapsed -le 0.7 ]]; then + echo "elapsed time seems OK" 1>&2 + else + echo "elapsed time $elapsed should be ~ 0.5 second" 1>&2 + fi + fi + ) +0:zsystem flock successful wait test, no timeout +?elapsed time seems OK +F:This timing test might fail due to process scheduling issues unrelated to zsh. + + ( + # Lock file for 0.5 second in the background. + lock_flag=$tst_dir/locked3 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 50 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + fi + typeset -F SECONDS + start=$SECONDS + # Attempt to lock file with 1-second timeout: + # must succeed 1 second after start because we retry every 1 second. + if zsystem flock -t 1 $tst_dir/file; then + elapsed=$[ $SECONDS - $start ] + if [[ $elapsed -ge 0.8 && $elapsed -le 1.2 ]]; then + echo "elapsed time seems OK" 1>&2 + else + echo "elapsed time $elapsed should be ~ 1 second" 1>&2 + fi + fi + ) +0:zsystem flock successful wait test, integral seconds +?elapsed time seems OK +F:This timing test might fail due to process scheduling issues unrelated to zsh. + + ( + # Lock file for 0.25 second in the background. + lock_flag=$tst_dir/locked4 + (zsystem flock $tst_dir/file \ + && touch $lock_flag \ + && zselect -t 25 + mv $lock_flag $lock_flag.done) & + # Wait until sub-shell above has started. + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do + zselect -t 1 + done + if [[ -f $lock_flag.done ]]; then + echo "Background shell should not have completed already." 1>&2 + fi + typeset -F SECONDS + start=$SECONDS + # Attempt to lock file with 0.4-second timeout, retrying every 0.1 second: + # must succeed 0.3 second after start. + if zsystem flock -t 0.4 -i 0.1 $tst_dir/file; then + elapsed=$[ $SECONDS - $start ] + if [[ $elapsed -ge 0.2 && $elapsed -le 0.5 ]]; then + echo "elapsed time seems OK" 1>&2 + else + echo "elapsed time $elapsed should be ~ 0.3 second" 1>&2 + fi + fi + ) +0:zsystem flock successful wait test, fractional seconds +?elapsed time seems OK +F:This timing test might fail due to process scheduling issues unrelated to zsh.