zsh-workers
 help / color / mirror / code / Atom feed
* [4.2/4.3] Bug with wait and trapped signals
@ 2006-02-17 14:28 Vincent Lefevre
  2006-02-17 18:08 ` Dan Nelson
  2006-02-18 17:31 ` Peter Stephenson
  0 siblings, 2 replies; 6+ messages in thread
From: Vincent Lefevre @ 2006-02-17 14:28 UTC (permalink / raw)
  To: zsh-workers

zsh (all versions?) does not interrupt a "wait" when it receives
a signal for which a trap has been set.

For instance, consider the following script:

#!/usr/bin/env zsh
echo "PID = $$"
sleep 60 &
trap 'echo term; exit 0' TERM
wait

When I send SIGTERM to the shell process, zsh does nothing, waiting
for the child to terminate before executing the trap. POSIX says:

 2.11 Signals and Error Handling

   [...] When the shell is waiting, by means of the wait utility, for
   asynchronous commands to complete, the reception of a signal for
   which a trap has been set shall cause the wait utility to return
   immediately with an exit status >128, immediately after which the
   trap associated with that signal shall be taken.

http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_11

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [4.2/4.3] Bug with wait and trapped signals
  2006-02-17 14:28 [4.2/4.3] Bug with wait and trapped signals Vincent Lefevre
@ 2006-02-17 18:08 ` Dan Nelson
  2006-02-18  0:50   ` Vincent Lefevre
  2006-02-18 17:31 ` Peter Stephenson
  1 sibling, 1 reply; 6+ messages in thread
From: Dan Nelson @ 2006-02-17 18:08 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 652 bytes --]

In the last episode (Feb 17), Vincent Lefevre said:
> zsh (all versions?) does not interrupt a "wait" when it receives a
> signal for which a trap has been set.
> 
> For instance, consider the following script:
> 
> #!/usr/bin/env zsh
> echo "PID = $$"
> sleep 60 &
> trap 'echo term; exit 0' TERM
> wait

Here's the test I've been using.  All shells I've tested except zsh and
FreeBSD's ash interrupt the wait, but only bash returns the correct
value from wait #1.  I tested Solaris 10 /bin/sh and /bin/ksh, FreeBSD
5's /bin/sh, zsh-4.2.6 and ksh93-20060124 ports, and Debian's
bash-2.05b-14 and ash-0.5.2-5.

-- 
	Dan Nelson
	dnelson@allantgroup.com

[-- Attachment #2: waittest --]
[-- Type: text/plain, Size: 779 bytes --]

#! /bin/sh
# 
SIG=15
echo "\
Correct order is:
 parent waiting on \$childpid
 child sending signal to \$parentpid
 parent received signal
 wait #1 finished, gotsig=1, status=a number > 128, possibly $(($SIG+128))
 child exiting
 wait #2 finished, gotsig=0, status=33
"
gotsig=0
signal_handler() {
 echo "parent received signal"
 gotsig=1
}
child() {
 sleep 2
 echo "child sending signal to pid $parentpid"
 kill -$SIG $parentpid
 sleep 2
 echo "child exiting" 
 exit 33
} 
parentpid=$$
echo "parent's pid is $parentpid"
child &
childpid=$!
trap signal_handler $SIG
echo "parent waiting on pid $childpid"
wait $childpid
cstatus=$?
echo "wait #1 finished, gotsig=$gotsig, status=$cstatus"
gotsig=0
wait $childpid
cstatus=$?
echo "wait #2 finished, gotsig=$gotsig, status=$cstatus"

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [4.2/4.3] Bug with wait and trapped signals
  2006-02-17 18:08 ` Dan Nelson
@ 2006-02-18  0:50   ` Vincent Lefevre
  0 siblings, 0 replies; 6+ messages in thread
From: Vincent Lefevre @ 2006-02-18  0:50 UTC (permalink / raw)
  To: zsh-workers

On 2006-02-17 12:08:10 -0600, Dan Nelson wrote:
> Here's the test I've been using.  All shells I've tested except zsh and
> FreeBSD's ash interrupt the wait, but only bash returns the correct
> value from wait #1.  I tested Solaris 10 /bin/sh and /bin/ksh, FreeBSD
> 5's /bin/sh, zsh-4.2.6 and ksh93-20060124 ports, and Debian's
> bash-2.05b-14 and ash-0.5.2-5.

Under Debian/unstable, dash, pdksh and mksh return the correct values,
but not ksh93 (for wait #1).

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [4.2/4.3] Bug with wait and trapped signals
  2006-02-17 14:28 [4.2/4.3] Bug with wait and trapped signals Vincent Lefevre
  2006-02-17 18:08 ` Dan Nelson
@ 2006-02-18 17:31 ` Peter Stephenson
  2006-02-18 17:36   ` Peter Stephenson
  2006-02-21  4:06   ` Bart Schaefer
  1 sibling, 2 replies; 6+ messages in thread
From: Peter Stephenson @ 2006-02-18 17:31 UTC (permalink / raw)
  To: zsh-workers

Vincent Lefevre wrote:
> zsh (all versions?) does not interrupt a "wait" when it receives
> a signal for which a trap has been set.
> 
> For instance, consider the following script:
> 
> #!/usr/bin/env zsh
> echo "PID = $$"
> sleep 60 &
> trap 'echo term; exit 0' TERM
> wait
> 
> When I send SIGTERM to the shell process, zsh does nothing, waiting
> for the child to terminate before executing the trap. POSIX says:
> 
>  2.11 Signals and Error Handling
> 
>    [...] When the shell is waiting, by means of the wait utility, for
>    asynchronous commands to complete, the reception of a signal for
>    which a trap has been set shall cause the wait utility to return
>    immediately with an exit status >128, immediately after which the
>    trap associated with that signal shall be taken.

Here's an attempt at fixing that.  I'm unlikely to check this in before
4.3.1.

I'm reasonably happy with replacing the passing of SIGINT with a flag
indicating special handling for "wait" and allowing trapped signals to
interrupt.  I'm less happy with the way last_signal is set by the signal
handler and examined after sigsuspend() or equivalent returns.  However,
it may be that this is the right way to do it after all.

I would prefer if there was someone more proficient in the dark art of
signals to look at this.

By the way, I think I introduced a bug with TRAPSASYNC with BSD_SIGNALS,
if anyone's still using them...it wouldn't actually wait for a signal.

I also inserted an "else" which seemed to want to be present where it's
marked "HERE" (another CSR habit).

Index: Src/jobs.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/jobs.c,v
retrieving revision 1.43
diff -u -r1.43 jobs.c
--- Src/jobs.c	7 Feb 2006 16:55:58 -0000	1.43
+++ Src/jobs.c	18 Feb 2006 17:21:09 -0000
@@ -1116,11 +1116,16 @@
 
 }
 
-/* wait for a particular process */
+/*
+ * Wait for a particular process.
+ * wait_cmd indicates this is from the interactive wait command,
+ * in which case the behaviour is a little different:  the command
+ * itself can be interrupted by a trapped signal.
+ */
 
 /**/
-void
-waitforpid(pid_t pid)
+int
+waitforpid(pid_t pid, int wait_cmd)
 {
     int first = 1, q = queue_signal_level();
 
@@ -1133,18 +1138,30 @@
 	else
 	    kill(pid, SIGCONT);
 
-	signal_suspend(SIGCHLD, SIGINT);
+	last_signal = -1;
+	signal_suspend(SIGCHLD, wait_cmd);
+	if (last_signal != SIGCHLD && wait_cmd) {
+	    /* wait command interrupted, but no error: return */
+	    restore_queue_signals(q);
+	    return 128 + last_signal;
+	}
 	child_block();
     }
     child_unblock();
     restore_queue_signals(q);
+
+    return 0;
 }
 
-/* wait for a job to finish */
+/*
+ * Wait for a job to finish.
+ * wait_cmd indicates this is from the wait builtin; see
+ * wait_cmd in waitforpid().
+ */
 
 /**/
-static void
-zwaitjob(int job, int sig)
+static int
+zwaitjob(int job, int wait_cmd)
 {
     int q = queue_signal_level();
     Job jn = jobtab + job;
@@ -1158,7 +1175,13 @@
 	while (!errflag && jn->stat &&
 	       !(jn->stat & STAT_DONE) &&
 	       !(interact && (jn->stat & STAT_STOPPED))) {
-	    signal_suspend(SIGCHLD, sig);
+	    signal_suspend(SIGCHLD, wait_cmd);
+	    if (last_signal != SIGCHLD && wait_cmd)
+	    {
+		/* builtin wait interrupted by trapped signal */
+		restore_queue_signals(q);
+		return 128 + last_signal;
+	    }
 	    /* Commenting this out makes ^C-ing a job started by a function
 	       stop the whole function again.  But I guess it will stop
 	       something else from working properly, we have to find out
@@ -1181,6 +1204,8 @@
     }
     child_unblock();
     restore_queue_signals(q);
+
+    return 0;
 }
 
 /* wait for running job to finish */
@@ -1692,9 +1717,9 @@
 	} else {   /* Must be BIN_WAIT, so wait for all jobs */
 	    for (job = 0; job <= maxjob; job++)
 		if (job != thisjob && jobtab[job].stat)
-		    zwaitjob(job, SIGINT);
+		    retval = zwaitjob(job, 1);
 	    unqueue_signals();
-	    return 0;
+	    return retval;
 	}
     }
 
@@ -1712,11 +1737,19 @@
 	    Job j;
 	    Process p;
 
-	    if (findproc(pid, &j, &p, 0))
-		waitforpid(pid);
-	    else
+	    if (findproc(pid, &j, &p, 0)) {
+		/*
+		 * returns 0 for normal exit, else signal+128
+		 * in which case we should return that status.
+		 */
+		retval = waitforpid(pid, 1);
+		if (!retval)
+		    retval = lastval2;
+	    } else {
 		zwarnnam(name, "pid %d is not a child of this shell", 0, pid);
-	    retval = lastval2;
+		/* presumably lastval2 doesn't tell us a heck of a lot? */
+		retval = 1;
+	    }
 	    thisjob = ocj;
 	    continue;
 	}
@@ -1796,8 +1829,18 @@
 		killjb(jobtab + job, SIGCONT);
 	    }
 	    if (func == BIN_WAIT)
-		zwaitjob(job, SIGINT);
-	    if (func != BIN_BG) {
+	    {
+		retval = zwaitjob(job, 1);
+		if (!retval)
+		    retval = lastval2;
+	    }
+	    else if (func != BIN_BG) {
+		/*
+		 * HERE: there used not to be an "else" above.  How
+		 * could it be right to wait for the foreground job
+		 * when we've just been told to wait for another
+		 * job (and done it)?
+		 */
 		waitjobs();
 		retval = lastval2;
 	    } else if (ofunc == BIN_DISOWN)
Index: Src/signals.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/signals.c,v
retrieving revision 1.38
diff -u -r1.38 signals.c
--- Src/signals.c	15 Dec 2005 04:24:04 -0000	1.38
+++ Src/signals.c	18 Feb 2006 17:21:10 -0000
@@ -347,9 +347,38 @@
 static signal_jmp_buf suspend_jmp_buf;
 #endif
  
+#if defined(POSIX_SIGNALS) || defined(BSD_SIGNALS)
+static void
+signal_suspend_setup(sigset_t *set, int sig, int wait_cmd)
+{
+    if (isset(TRAPSASYNC)) {
+	sigemptyset(set);
+    } else {
+	sigfillset(set);
+	sigdelset(set, sig);
+#ifdef POSIX_SIGNALS
+	sigdelset(set, SIGHUP);  /* still don't know why we add this? */
+#endif
+	if (wait_cmd)
+	{
+	    /*
+	     * Using "wait" builtin.  We should allow SIGINT and
+	     * execute traps delivered to the shell.
+	     */
+	    int sigtr;
+	    sigdelset(set, SIGINT);
+	    for (sigtr = 1; sigtr <= NSIG; sigtr++) {
+		if (sigtrapped[sigtr] & ~ZSIG_IGNORED)
+		    sigdelset(set, sigtr);
+	    }
+	}
+    }
+}
+#endif
+
 /**/
 int
-signal_suspend(int sig, int sig2)
+signal_suspend(int sig, int wait_cmd)
 {
     int ret;
  
@@ -359,15 +388,7 @@
     sigset_t oset;
 #endif /* BROKEN_POSIX_SIGSUSPEND */
 
-    if (isset(TRAPSASYNC)) {
-	sigemptyset(&set);
-    } else {
-	sigfillset(&set);
-	sigdelset(&set, sig);
-	sigdelset(&set, SIGHUP);  /* still don't know why we add this? */
-	if (sig2)
-	    sigdelset(&set, sig2);
-    }
+    signal_suspend_setup(&set, sig, wait_cmd);
 #ifdef BROKEN_POSIX_SIGSUSPEND
     sigprocmask(SIG_SETMASK, &set, &oset);
     pause();
@@ -379,15 +400,8 @@
 # ifdef BSD_SIGNALS
     sigset_t set;
 
-    if (isset(TRAPSASYNC)) {
-	sigemptyset(&set);
-    } else {
-	sigfillset(&set);
-	sigdelset(&set, sig);
-	if (sig2)
-	    sigdelset(&set, sig2);
-	ret = sigpause(set);
-    }
+    signal_suspend_setup(&set, sig, wait_cmd);
+    ret = sigpause(set);
 # else
 #  ifdef SYSV_SIGNALS
     ret = sigpause(sig);
@@ -397,7 +411,7 @@
      * between the child_unblock() and pause()           */
     if (signal_setjmp(suspend_jmp_buf) == 0) {
         suspend_longjmp = 1;   /* we want to signal_longjmp after catching signal */
-        child_unblock();       /* do we need to unblock sig2 as well?             */
+        child_unblock();       /* do we need to do wait_cmd stuff as well?             */
         ret = pause();
     }
     suspend_longjmp = 0;       /* turn off using signal_longjmp since we are past *
@@ -409,6 +423,10 @@
     return ret;
 }
 
+/* last signal we handled: race prone, or what? */
+/**/
+int last_signal;
+
 /* the signal handler */
  
 /**/
@@ -422,6 +440,7 @@
     signal_jmp_buf jump_to;
 #endif
  
+    last_signal = sig;
     signal_process(sig);
  
     sigfillset(&newmask);

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page still at http://www.pwstephenson.fsnet.co.uk/


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [4.2/4.3] Bug with wait and trapped signals
  2006-02-18 17:31 ` Peter Stephenson
@ 2006-02-18 17:36   ` Peter Stephenson
  2006-02-21  4:06   ` Bart Schaefer
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Stephenson @ 2006-02-18 17:36 UTC (permalink / raw)
  To: Zsh hackers list

Peter Stephenson wrote:
> Vincent Lefevre wrote:
> > 
> >  2.11 Signals and Error Handling
> > 
> >    [...] When the shell is waiting, by means of the wait utility, for
> >    asynchronous commands to complete, the reception of a signal for
> >    which a trap has been set shall cause the wait utility to return
> >    immediately with an exit status >128, immediately after which the
> >    trap associated with that signal shall be taken.
> 
> Here's an attempt at fixing that.  I'm unlikely to check this in before
> 4.3.1.

By the way, it passes Dan's test... remind me to add that to the trap
tests if this (or improvemed version) gets checked in.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page still at http://www.pwstephenson.fsnet.co.uk/


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [4.2/4.3] Bug with wait and trapped signals
  2006-02-18 17:31 ` Peter Stephenson
  2006-02-18 17:36   ` Peter Stephenson
@ 2006-02-21  4:06   ` Bart Schaefer
  1 sibling, 0 replies; 6+ messages in thread
From: Bart Schaefer @ 2006-02-21  4:06 UTC (permalink / raw)
  To: zsh-workers

On Feb 18,  5:31pm, Peter Stephenson wrote:
}
} I would prefer if there was someone more proficient in the dark art of
} signals to look at this.

I don't know whether I qualify or not, but it looks mostly OK to me.  I
may even be able to answer this old question:

} +#ifdef POSIX_SIGNALS
} +	sigdelset(set, SIGHUP);  /* still don't know why we add this? */
} +#endif

The reason for this is so that the shell exits properly if the
terminal is disconnected (window closed, whatever) during the wait.

} I also inserted an "else" which seemed to want to be present where it's
} marked "HERE" (another CSR habit).

I think the intention there is to reap any other jobs that exited
while we were waiting for the specified job, because we won't get
the normal SIGCHLD signals for those.  However, I'm note entirely
sure that was necessary or that it accomplished what it meant to.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2006-02-21  4:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-17 14:28 [4.2/4.3] Bug with wait and trapped signals Vincent Lefevre
2006-02-17 18:08 ` Dan Nelson
2006-02-18  0:50   ` Vincent Lefevre
2006-02-18 17:31 ` Peter Stephenson
2006-02-18 17:36   ` Peter Stephenson
2006-02-21  4:06   ` Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).