zsh-workers
* [PATCH] Do not send duplicate signals when MONITOR is set
@ 2021-06-07 17:27 Erik Paulson
  2021-06-07 18:45 ` Bart Schaefer
From: Erik Paulson @ 2021-06-07 17:27 UTC
  To: zsh-workers

When job control is enabled, killjb() sends the signal to the job's
process group via killpg(), and then falls into a loop that traverses
the job's process list and sends the signal to each process again.
This causes every signal to be sent twice.

This patch adds a return after the killpg() call to avoid sending the
signal again.
---

I run emacs as a daemon and use the emacsclient program to connect to
it. I noticed that when I suspended the emacsclient program and
resumed it in zsh, the program would sporadically crash. After digging
into the code, I realized that emacsclient was receiving two SIGCONTs,
which caused it to send a malformed command to the daemon. While this
is definitely a problem with emacsclient, it doesn't feel right that
Zsh is sending two SIGCONTs.

I found that this return used to be present, but was removed in
https://www.zsh.org/mla/workers/2018/msg01338.html while addressing
another emacs issue. It looks to me to be an oversight, but I cannot
tell as I am not well versed in the Zsh codebase or job control. I
know that my issue goes away with this patch, and I cannot reproduce
the original issue in the linked mail thread with it either.

Note that on testing with Linux, it seems the kernel will suppress the
second signal; in order to get a test program to detect it, I have to
step through the code with the debugger. On OSX, where I originally
detected this problem, I reliably get two signals delivered each time.
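
For anyone who wants to count deliveries themselves, the core of such a
test program might look like the following. This is a minimal sketch,
not the test program used above: the names install_sigcont_counter()
and sigcont_count() are hypothetical, and it only counts how many times
a SIGCONT handler runs in the current process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Number of SIGCONTs handled so far; sig_atomic_t is safe to update in a handler. */
static volatile sig_atomic_t sigcont_seen = 0;

static void on_sigcont(int signo)
{
    (void)signo;
    sigcont_seen++;
}

/* Hypothetical helper: install the counting handler. Returns 0 on success, -1 on failure. */
int install_sigcont_counter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigcont;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls after the handler */
    return sigaction(SIGCONT, &sa, NULL);
}

/* Hypothetical helper: how many times the handler has run so far. */
int sigcont_count(void)
{
    return (int)sigcont_seen;
}
```

A program that calls install_sigcont_counter() at startup and reports
sigcont_count() after each suspend/resume cycle shows how many times the
handler ran per resume.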

 Src/signals.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Src/signals.c b/Src/signals.c
index 2c540f38f..5c787e2a8 100644
--- a/Src/signals.c
+++ b/Src/signals.c
@@ -810,6 +810,7 @@ killjb(Job jn, int sig)
 	    err = killpg(jn->gleader, sig);
 	    if (sig == SIGCONT && err != -1)
 		makerunning(jn);
+	    return err;
 	}
     }
     for (pn = jn->procs; pn; pn = pn->next) {
-- 
2.31.1




* Re: [PATCH] Do not send duplicate signals when MONITOR is set
  2021-06-07 17:27 [PATCH] Do not send duplicate signals when MONITOR is set Erik Paulson
@ 2021-06-07 18:45 ` Bart Schaefer
From: Bart Schaefer @ 2021-06-07 18:45 UTC
  To: Erik Paulson; +Cc: Zsh hackers list

On Mon, Jun 7, 2021 at 10:28 AM Erik Paulson <epaulson10@gmail.com> wrote:
>
> I run emacs as a daemon and use the emacsclient program to connect to
> it. I noticed that when I suspended the emacsclient program and
> resumed it in zsh, the program would sporadically crash. After digging
> into the code, I realized that emacsclient was receiving two SIGCONTs,
> which caused it to send a malformed command to the daemon.
>
> I found that this return used to be present, but was removed in
> https://www.zsh.org/mla/workers/2018/msg01338.html while addressing
> another emacs issue.

I don't think it was removed ... similar code was added in two
separate places, but the "return" was only added in one of those.

Your patch adds that return in the second case.

The difference is that in the first case, the SIGCONT is received by a
job that is marked STAT_SUPERJOB and in the second case it's received
by a different job.

I believe this means that in the first case the superjob is in the
foreground and in the second case it isn't -- rather one of its
subjobs is.  In the first case zsh sends the signal to all the
subjobs and then to the process group.  In the second case it sends
the signal to the process group first and then falls into the loop
sending the signal to any subjobs that still appear to be stopped.

In any case, I think a potential problem with placing an unconditional
"return" where your patch does is that signals other than SIGCONT
probably still need to be delivered to the subjobs.  PWS, any input
here?
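
To make the control flow under discussion concrete, here is a toy model
of the non-superjob path. It is a sketch under loud assumptions:
model_job, model_proc and model_killjb() are hypothetical stand-ins, not
zsh's Job/Process types or the real killjb(). It assumes every process
in the list belongs to the one process group, and it ignores per-process
status checks. Because it does not model subjobs outside that group, it
cannot settle the question about signals other than SIGCONT.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT zsh's Job/Process types. */
struct model_proc {
    int delivered;               /* how many times this process was signalled */
    struct model_proc *next;
};

struct model_job {
    struct model_proc *procs;    /* assumed: all members share one process group */
};

/* Models killpg(): one signal reaches every member of the group. */
static void model_signal_group(struct model_job *jn)
{
    for (struct model_proc *pn = jn->procs; pn; pn = pn->next)
        pn->delivered++;
}

/*
 * Models the path described in the thread.
 * return_after_group == false: group signal, then fall into the per-process loop.
 * return_after_group == true:  group signal, then return (the patch's behaviour).
 */
void model_killjb(struct model_job *jn, bool job_control, bool return_after_group)
{
    if (job_control) {
        model_signal_group(jn);
        if (return_after_group)
            return;
    }
    for (struct model_proc *pn = jn->procs; pn; pn = pn->next)
        pn->delivered++;         /* models kill(pn->pid, sig) */
}
```

With job control on and no early return, each modelled process is
signalled twice; with the early return, once. With job control off, only
the per-process loop runs.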

> Note that on testing with Linux, it seems the kernel will suppress the
> second signal; in order to get a test program to detect it, I have to
> step through the code with the debugger. On OSX, where I originally
> detected this problem, I reliably get two signals delivered each time.

This is probably a process scheduling difference rather than a signal
being suppressed, e.g., on Linux the order of events is:
1) zsh sends signal to process group
2) process group copies signal to all processes
3) those processes resume
4) zsh proceeds into makerunning() and clears the STAT_STOPPED flag
5) that makes the loop a no-op

Whereas on OSX,
1) zsh sends signal to process group
2) zsh proceeds into makerunning() so STAT_STOPPED is left in place
3) process group copies signal to all processes
4) the loop sends a second SIGCONT
5) those processes resume and get a double SIGCONT

(2 & 3 might be simultaneous or in either order)
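
One way to see how timing affects the count from outside zsh is a small
harness like the one below. It is a rough sketch assuming a POSIX
system; count_sigconts_delivered() is a hypothetical name, and the 200 ms
grace period is an arbitrary choice. Note that standard signals are not
queued, so a SIGCONT sent while an earlier one is still pending is merged
with it. The result for two signals may therefore be 1 or 2 depending on
scheduling, and the harness does not tell which explanation applies to
zsh itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t child_conts = 0;

static void count_cont(int signo)
{
    (void)signo;
    child_conts++;
}

/* Sleep for roughly ms milliseconds, resuming if a signal interrupts the sleep. */
static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000, .tv_nsec = (ms % 1000) * 1000000L };

    while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
        ;   /* interrupted: ts now holds the remaining time */
}

/*
 * Hypothetical harness: fork a child that stops itself, send it nsignals
 * SIGCONTs back to back, and return how many times the child's handler ran.
 * Returns -1 on error or if nsignals < 1.
 */
int count_sigconts_delivered(int nsignals)
{
    int fds[2], status = 0, n = -1;
    pid_t pid;

    if (nsignals < 1 || pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                       /* child */
        struct sigaction sa;
        int seen;
        ssize_t w;

        close(fds[0]);
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = count_cont;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGCONT, &sa, NULL);
        raise(SIGSTOP);                   /* stay stopped until the parent continues us */
        sleep_ms(200);                    /* leave time for a late second signal */
        seen = (int)child_conts;
        w = write(fds[1], &seen, sizeof seen);
        _exit(w == (ssize_t)sizeof seen ? 0 : 1);
    }

    close(fds[1]);
    /* Wait until the child is actually stopped before signalling it. */
    if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status)) {
        for (int i = 0; i < nsignals; i++)
            kill(pid, SIGCONT);
        if (read(fds[0], &n, sizeof n) != (ssize_t)sizeof n)
            n = -1;
    }
    close(fds[0]);
    waitpid(pid, &status, 0);             /* reap the child */
    return n;
}
```

Calling count_sigconts_delivered(2) repeatedly on different systems
gives a feel for how often both signals reach the handler separately.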


