From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=1.0 required=5.0 tests=LOTS_OF_MONEY, MAILING_LIST_MULTI,PDS_OTHER_BAD_TLD,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 Received: (qmail 31133 invoked from network); 14 Apr 2020 11:48:01 -0000 Received-SPF: pass (primenet.com.au: domain of zsh.org designates 203.24.36.2 as permitted sender) receiver=inbox.vuxu.org; client-ip=203.24.36.2 envelope-from= Received: from ns1.primenet.com.au (HELO primenet.com.au) (203.24.36.2) by inbox.vuxu.org with UTF8ESMTPZ; 14 Apr 2020 11:48:01 -0000 Received: (qmail 15937 invoked by alias); 14 Apr 2020 11:47:49 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: List-Unsubscribe: X-Seq: 45693 Received: (qmail 3423 invoked by uid 1010); 14 Apr 2020 11:47:49 -0000 X-Qmail-Scanner-Diagnostics: from out5-smtp.messagingengine.com by f.primenet.com.au (envelope-from , uid 7791) with qmail-scanner-2.11 (clamdscan: 0.102.2/25779. spamassassin: 3.4.4. Clear:RC:0(66.111.4.29):SA:0(-0.6/5.0):. Processed in 6.127764 secs); 14 Apr 2020 11:47:49 -0000 X-Envelope-From: d.s@daniel.shahaf.name X-Qmail-Scanner-Mime-Attachments: | X-Qmail-Scanner-Zip-Files: | Received-SPF: none (ns1.primenet.com.au: domain at daniel.shahaf.name does not designate permitted sender hosts) X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduhedrfedugdeggecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhepfffhvffukfgjfhfogggtgfesthhqtd dtredtjeenucfhrhhomhepffgrnhhivghlucfuhhgrhhgrfhcuoegurdhssegurghnihgv lhdrshhhrghhrghfrdhnrghmvgeqnecukfhppeejledrudejjedruddugedrudegleenuc evlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegurdhssegu rghnihgvlhdrshhhrghhrghfrdhnrghmvg X-ME-Proxy: Date: Tue, 14 Apr 2020 11:47:00 +0000 From: Daniel Shahaf To: Cedric Ware Cc: zsh-workers@zsh.org Subject: Re: [PATCH] Enable sub-second timeout in zsystem flock Message-ID: <20200414114700.108febf3@tarpaulin.shahaf.local2> In-Reply-To: <20200413213449.orqym5bqboznancl@phare.normalesup.org> References: <20200111154143.fjtwgfnztqfmkyda@phare.normalesup.org> <20200308183907.mxnhqrr2uflwooax@phare.normalesup.org> <20200314210454.hp562smyqv3ew255@phare.normalesup.org> <20200315005036.45bc846b@tarpaulin.shahaf.local2> <20200315160324.dstgtmajzwxpaccn@phare.normalesup.org> <20200315165410.GA30241@tarpaulin.shahaf.local2> <20200413213449.orqym5bqboznancl@phare.normalesup.org> X-Mailer: Claws Mail 3.17.3 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cedric Ware wrote on Mon, 13 Apr 2020 23:34 +0200: > Hello, >=20 > About the patch I submitted last month enabling fractional seconds for > the timeout in zsystem flock, there was the following outstanding issue. >=20 >=20 > Daniel Shahaf (Sunday 2020-03-15): > > How likely is zsh 5.9 to be ported to a platform that doesn't have a > > 64-bit type? > >=20 > > If that's a concern, I'd recommend to issue a warning or disable the > > feature when we detect that case. > >=20 > > I'm not sure whether that's a concern. =20 >=20 > If it is, here is a reimplementation of the patch that doesn't depend > on a 64-bit type. Instead of counting time in microseconds, I've used > struct timespec and a function similar to zgettime() but that uses > CLOCK_MONOTONIC if available instead of CLOCK_REALTIME. >=20 Wow. Thank you very much for the rewrite, Cedric. I'm afraid I wasn't clear, though. All I wanted to say was: (1)=C2=A0_If_ zsh 5.9 is likely to be ported to a 64-bit-type-less platform, we should add a preprocessor check to catch that case (at either configure- or compile-time); (2)=C2=A0I don't know what's the probability of the antecedent. I didn't mean to ask for a rewrite, not in the least =E2=80=94 only for a quick, three-liner =C2=AB#if=C2=BB block =E2=80=94 but that's water under the brid= ge now. Thanks for going the extra mile :-) Cheers, Daniel > (I haven't managed to avoid some code duplication, because a common > function would require choosing between the two clocks, thus need a > parameter of type clockid_t, which is not always defined. So I don't > know how to write the prototype of a common function.) >=20 > Another difference is that the timeout length is now limited to 2^30-1 > seconds (34 years) so as to avoid overflowing time_t on 32-bit systems. > There is one case where it may overflow, but it can't be avoided: if > time_t is 32-bit signed and CLOCK_MONOTONIC isn't available or returns > the wall clock instead of e.g. the uptime. Then, specifying a timeout > beyond year 2038 won't work, but then the system has a Y2038 problem > anyway. I don't see how to avoid it short of checking the timeout > against Y2038, which may actually create a Y2038 problem on systems > that wouldn't otherwise have it. >=20 > Also, the retry interval is now limited to 0.999*LONG_MAX microseconds; > the 0.999 avoids a rounding problem that causes an overflow on 64-bit > systems (the value is given as a double, but a double can't represent > 2^63-1 exactly). >=20 > I think that's all. I believe I checked all code paths, though I don't > know how to do it systematically. >=20 > Thanks, best regards, > Cedric Ware. >=20 >=20 > diff --git a/Doc/Zsh/mod_system.yo b/Doc/Zsh/mod_system.yo > index 6292af071..be9a99f62 100644 > --- a/Doc/Zsh/mod_system.yo > +++ b/Doc/Zsh/mod_system.yo > @@ -166,7 +166,7 @@ to the command, or 2 for an error on the write; no er= ror message is > printed in the last case, but the parameter tt(ERRNO) will reflect > the error that occurred. > ) > -xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-f) var(var) ] [tt(= -er)] var(file)) > +xitem(tt(zsystem flock) [ tt(-t) var(timeout) ] [ tt(-i) var(interval) ]= [ tt(-f) var(var) ] [tt(-er)] var(file)) > item(tt(zsystem flock -u) var(fd_expr))( > The builtin tt(zsystem)'s subcommand tt(flock) performs advisory file > locking (via the manref(fcntl)(2) system call) over the entire contents > @@ -196,9 +196,16 @@ a safety check that the file descriptor is in use fo= r file locking. > =20 > By default the shell waits indefinitely for the lock to succeed. > The option tt(-t) var(timeout) specifies a timeout for the lock in > -seconds; currently this must be an integer. The shell will attempt > -to lock the file once a second during this period. If the attempt > -times out, status 2 is returned. > +seconds; fractional seconds are allowed. During this period, the > +shell will attempt to lock the file every var(interval) seconds > +if the tt(-i) var(interval) option is given, otherwise once a second. > +(This var(interval) is shortened before the last attempt if needed, > +so that the shell waits only until the var(timeout) and not longer.) > +If the attempt times out, status 2 is returned. > + > +(Note: var(timeout) must be less than 2^30-1 seconds (about 34 years); > +and var(interval) must be less than 0.999 * LONG_MAX microseconds > +(only about 35 minutes on 32-bit systems).) > =20 > If the option tt(-e) is given, the file descriptor for the lock is > preserved when the shell uses tt(exec) to start a new process; > diff --git a/Src/Modules/system.c b/Src/Modules/system.c > index fb3d80773..a60c18f5d 100644 > --- a/Src/Modules/system.c > +++ b/Src/Modules/system.c > @@ -29,6 +29,7 @@ > =20 > #include "system.mdh" > #include "system.pro" > +#include > =20 > #ifdef HAVE_POLL_H > # include > @@ -531,7 +532,9 @@ static int > bin_zsystem_flock(char *nam, char **args, UNUSED(Options ops), UNUSED(in= t func)) > { > int cloexec =3D 1, unlock =3D 0, readlock =3D 0; > - zlong timeout =3D -1; > + double timeout =3D -1; > + long timeout_interval =3D 1e6; > + mnumber timeout_param; > char *fdvar =3D NULL; > #ifdef HAVE_FCNTL_H > struct flock lck; > @@ -583,7 +586,51 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Opt= ions ops), UNUSED(int func)) > } else { =20 > optarg =3D *args++; > } =20 > - timeout =3D mathevali(optarg); > + timeout_param =3D matheval(optarg); > + timeout =3D (timeout_param.type & MN_FLOAT)? > + timeout_param.u.d : (double)timeout_param.u.l; > + > + /* > + * timeout must not overflow time_t, but little is known > + * about this type's limits. Conservatively limit to 2^30-1 > + * (34 years). Then it can only overflow if time_t is only > + * a 32-bit int and CLOCK_MONOTONIC is not supported, in which > + * case there is a Y2038 problem anyway. > + */ > + if ((timeout < 1e-6) || (timeout > 1073741823.)) { > + zwarnnam(nam, "flock: invalid timeout value: '%s'", > + optarg); > + return 1; > + } > + break; > + > + case 'i': > + /* retry interval in seconds */ > + if (optptr[1]) { > + optarg =3D optptr + 1; > + optptr +=3D strlen(optarg) - 1; > + } else if (!*args) { > + zwarnnam(nam, > + "flock: option %c requires " > + "a numeric retry interval", > + opt); > + return 1; > + } else { > + optarg =3D *args++; > + } > + timeout_param =3D matheval(optarg); > + if (!(timeout_param.type & MN_FLOAT)) { > + timeout_param.type =3D MN_FLOAT; > + timeout_param.u.d =3D (double)timeout_param.u.l; > + } > + timeout_param.u.d *=3D 1e6; > + if ((timeout_param.u.d < 1) > + || (timeout_param.u.d > 0.999 * LONG_MAX)) { > + zwarnnam(nam, "flock: invalid interval value: '%s'", > + optarg); > + return 1; > + } > + timeout_interval =3D (long)timeout_param.u.d; > break; > =20 > case 'u': > @@ -647,7 +694,24 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Opt= ions ops), UNUSED(int func)) > lck.l_len =3D 0; /* lock the whole file */ > =20 > if (timeout > 0) { > - time_t end =3D time(NULL) + (time_t)timeout; > + /* > + * Get current time, calculate timeout time. > + * No need to check for overflow, already checked above. > + */ > + struct timespec now, end; > + double timeout_s; > + long remaining_us; > + zgettime_monotonic_if_available(&now); > + end.tv_sec =3D now.tv_sec; > + end.tv_nsec =3D now.tv_nsec; > + end.tv_nsec +=3D modf(timeout, &timeout_s) * 1000000000L; > + end.tv_sec +=3D timeout_s; > + if (end.tv_nsec >=3D 1000000000L) { > + end.tv_nsec -=3D 1000000000L; > + end.tv_sec +=3D 1; > + } > + > + /* Try acquiring lock, loop until timeout. */ > while (fcntl(flock_fd, F_SETLK, &lck) < 0) { > if (errflag) { > zclose(flock_fd); > @@ -658,11 +722,16 @@ bin_zsystem_flock(char *nam, char **args, UNUSED(Op= tions ops), UNUSED(int func)) > zwarnnam(nam, "failed to lock file %s: %e", args[0], errno); > return 1; > } =20 > - if (time(NULL) >=3D end) { > + zgettime_monotonic_if_available(&now); > + remaining_us =3D timespec_diff_us(&now, &end); > + if (remaining_us <=3D 0) { > zclose(flock_fd); > return 2; > } =20 > - sleep(1); > + if (remaining_us <=3D timeout_interval) { > + timeout_interval =3D remaining_us; > + } > + zsleep(timeout_interval); > } > } else { =20 > while (fcntl(flock_fd, timeout =3D=3D 0 ? F_SETLK : F_SETLKW, &lck) < 0= ) { > diff --git a/Src/compat.c b/Src/compat.c > index 74e426fba..817bb4aaf 100644 > --- a/Src/compat.c > +++ b/Src/compat.c > @@ -126,6 +126,32 @@ zgettime(struct timespec *ts) > return ret; > } =20 > =20 > +/* Likewise with CLOCK_MONOTONIC if available. */ > + > +/**/ > +mod_export int > +zgettime_monotonic_if_available(struct timespec *ts) > +{ > + int ret =3D -1; > + > +#if defined(HAVE_CLOCK_GETTIME) && defined(CLOCK_MONOTONIC) > + struct timespec dts; > + if (clock_gettime(CLOCK_MONOTONIC, &dts) < 0) { > + zwarn("unable to retrieve CLOCK_MONOTONIC time: %e", errno); > + ret--; > + } else { > + ret++; > + ts->tv_sec =3D (time_t) dts.tv_sec; > + ts->tv_nsec =3D (long) dts.tv_nsec; > + } > +#endif > + > + if (ret) { > + ret =3D zgettime(ts); > + } > + return ret; > +} =20 > + > =20 > /* compute the difference between two calendar times */ > =20 > diff --git a/Src/utils.c b/Src/utils.c > index 69885fed3..db5626830 100644 > --- a/Src/utils.c > +++ b/Src/utils.c > @@ -2748,6 +2748,42 @@ read_poll(int fd, int *readchar, int polltty, zlon= g microseconds) > return (ret > 0); > } =20 > =20 > +/* > + * Return the difference between 2 times, given as struct timespec*, > + * expressed in microseconds, as a long. If the difference doesn't fit > + * into a long, return LONG_MIN or LONG_MAX so that the times can still > + * be compared. > + * > + * Note: returns a long rather than a zlong because zsleep() below > + * takes a long. > + */ > + > +/**/ > +long > +timespec_diff_us(const struct timespec *t1, const struct timespec *t2) > +{ > + int reverse =3D (t1->tv_sec > t2->tv_sec); > + time_t diff_sec; > + long diff_usec, max_margin, res; > + > + /* Don't just subtract t2-t1 because time_t might be unsigned. */ > + diff_sec =3D (reverse? t1->tv_sec - t2->tv_sec : t2->tv_sec - t1->tv= _sec); > + if (diff_sec > LONG_MAX / 1000000L) { > + goto overflow; > + } > + res =3D diff_sec * 1000000L; > + max_margin =3D LONG_MAX - res; > + diff_usec =3D (reverse? > + t1->tv_nsec - t2->tv_nsec : t2->tv_nsec - t1->tv_nsec > + ) / 1000; > + if (diff_usec <=3D max_margin) { > + res +=3D diff_usec; > + return (reverse? -res : res); > + } > + overflow: > + return (reverse? LONG_MIN : LONG_MAX); > +} =20 > + > /* > * Sleep for the given number of microseconds --- must be within > * range of a long at the moment, but this is only used for > diff --git a/Test/V14system.ztst b/Test/V14system.ztst > index e69de29bb..79a2c0a77 100644 > --- a/Test/V14system.ztst > +++ b/Test/V14system.ztst > @@ -0,0 +1,150 @@ > +# Test zsh/system module > + > +%prep > + > + if zmodload -s zsh/system && zmodload -s zsh/zselect; then > + tst_dir=3DV13.tmp > + mkdir -p -- $tst_dir > + else > + ZTST_unimplemented=3D'the zsh/system and zsh/zselect modules are not= available' > + fi > + : > $tst_dir/file # File on which to acquire flock. > + > +%test > + > + ( > + zsystem flock -t 0.1 -i 0.000001 $tst_dir/file > + ) > +0:zsystem flock valid time arguments > + > + ( > + zsystem flock -t -1 $tst_dir/file || > + zsystem flock -t 0.49e-6 $tst_dir/file || > + zsystem flock -t 1073741824 $tst_dir/file || > + zsystem flock -t 1e100 $tst_dir/file || > + zsystem flock -i -1 $tst_dir/file || > + zsystem flock -i 0.49e-6 $tst_dir/file || > + zsystem flock -i 1e100 $tst_dir/file > + ) > +1:zsystem flock invalid time arguments > +?(eval):zsystem:2: flock: invalid timeout value: '-1' > +?(eval):zsystem:3: flock: invalid timeout value: '0.49e-6' > +?(eval):zsystem:4: flock: invalid timeout value: '1073741824' > +?(eval):zsystem:5: flock: invalid timeout value: '1e100' > +?(eval):zsystem:6: flock: invalid interval value: '-1' > +?(eval):zsystem:7: flock: invalid interval value: '0.49e-6' > +?(eval):zsystem:8: flock: invalid interval value: '1e100' > + > + ( > + # Lock file for 1 second in the background. > + lock_flag=3D$tst_dir/locked1 > + (zsystem flock $tst_dir/file \ > + && touch $lock_flag \ > + && zselect -t 100 > + mv $lock_flag $lock_flag.done) & > + # Wait until sub-shell above has started. > + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do > + zselect -t 1 > + done > + if [[ -f $lock_flag.done ]]; then > + echo "Background shell should not have completed already." 1>&2 > + else > + # Attempt to lock file with 0.5 second timeout: must fail. > + zsystem flock -t 0.5 $tst_dir/file > + fi > + ) > +2:zsystem flock unsuccessful wait test > +F:This timing test might fail due to process scheduling issues unrelated= to zsh. > + > + ( > + # Lock file for 0.5 second in the background. > + lock_flag=3D$tst_dir/locked2 > + (zsystem flock $tst_dir/file \ > + && touch $lock_flag \ > + && zselect -t 50 > + mv $lock_flag $lock_flag.done) & > + # Wait until sub-shell above has started. > + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do > + zselect -t 1 > + done > + if [[ -f $lock_flag.done ]]; then > + echo "Background shell should not have completed already." 1>&2 > + fi > + typeset -F SECONDS > + start=3D$SECONDS > + # Attempt to lock file without a timeout: > + # must succeed after sub-shell above releases it (0.5 second). > + if zsystem flock $tst_dir/file; then > + elapsed=3D$[ $SECONDS - $start ] > + if [[ $elapsed -ge 0.3 && $elapsed -le 0.7 ]]; then > + echo "elapsed time seems OK" 1>&2 > + else > + echo "elapsed time $elapsed should be ~ 0.5 second" 1>&2 > + fi > + fi > + ) > +0:zsystem flock successful wait test, no timeout > +?elapsed time seems OK > +F:This timing test might fail due to process scheduling issues unrelated= to zsh. > + > + ( > + # Lock file for 0.5 second in the background. > + lock_flag=3D$tst_dir/locked3 > + (zsystem flock $tst_dir/file \ > + && touch $lock_flag \ > + && zselect -t 50 > + mv $lock_flag $lock_flag.done) & > + # Wait until sub-shell above has started. > + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do > + zselect -t 1 > + done > + if [[ -f $lock_flag.done ]]; then > + echo "Background shell should not have completed already." 1>&2 > + fi > + typeset -F SECONDS > + start=3D$SECONDS > + # Attempt to lock file with 1-second timeout: > + # must succeed 1 second after start because we retry every 1 second. > + if zsystem flock -t 1 $tst_dir/file; then > + elapsed=3D$[ $SECONDS - $start ] > + if [[ $elapsed -ge 0.8 && $elapsed -le 1.2 ]]; then > + echo "elapsed time seems OK" 1>&2 > + else > + echo "elapsed time $elapsed should be ~ 1 second" 1>&2 > + fi > + fi > + ) > +0:zsystem flock successful wait test, integral seconds > +?elapsed time seems OK > +F:This timing test might fail due to process scheduling issues unrelated= to zsh. > + > + ( > + # Lock file for 0.25 second in the background. > + lock_flag=3D$tst_dir/locked4 > + (zsystem flock $tst_dir/file \ > + && touch $lock_flag \ > + && zselect -t 25 > + mv $lock_flag $lock_flag.done) & > + # Wait until sub-shell above has started. > + while ! [[ -f $lock_flag || -f $lock_flag.done ]]; do > + zselect -t 1 > + done > + if [[ -f $lock_flag.done ]]; then > + echo "Background shell should not have completed already." 1>&2 > + fi > + typeset -F SECONDS > + start=3D$SECONDS > + # Attempt to lock file with 0.4-second timeout, retrying every 0.1 s= econd: > + # must succeed 0.3 second after start. > + if zsystem flock -t 0.4 -i 0.1 $tst_dir/file; then > + elapsed=3D$[ $SECONDS - $start ] > + if [[ $elapsed -ge 0.2 && $elapsed -le 0.5 ]]; then > + echo "elapsed time seems OK" 1>&2 > + else > + echo "elapsed time $elapsed should be ~ 0.3 second" 1>&2 > + fi > + fi > + ) > +0:zsystem flock successful wait test, fractional seconds > +?elapsed time seems OK > +F:This timing test might fail due to process scheduling issues unrelated= to zsh.