zsh-workers
 help / color / mirror / code / Atom feed
From: Bart Schaefer <schaefer@brasslantern.com>
To: zsh-workers@zsh.org
Subject: Re: Deadlock when receiving kill-signal from child process
Date: Tue, 4 Aug 2015 23:53:59 -0700	[thread overview]
Message-ID: <150804235400.ZM9958@torch.brasslantern.com> (raw)
In-Reply-To: <CA+=GgY7uGzCYEKLBzqrt=ct6q72WFC5w1jMB5RDNe60J-wUz=Q@mail.gmail.com>

On Aug 5, 12:52am, Mathias Fredriksson wrote:
}
} I have however managed to get a dump with strace on Gentoo

Based on this strace plus a GDB stack trace Mathias sent me off-list,
I think the problem may be here:

 1415 zwaitjob(int job, int wait_cmd)
 1416 {
 1417     int q = queue_signal_level();
 1418     Job jn = jobtab + job;
 1419 
 1420     dont_queue_signals();
 1421     child_block();               /* unblocked during signal_suspend() */
 1422     queue_traps(wait_cmd);
...
 1440         while (!errflag && jn->stat &&
 1441                !(jn->stat & STAT_DONE) &&
 1442                !(interact && (jn->stat & STAT_STOPPED))) {
 1443             signal_suspend(SIGCHLD, wait_cmd);

I suspect what's happening is that the child represented by "job" exits
during dont_queue_signals(), which is a macro that expands to a loop
calling zhandler(), which will process TRAPUSR1 (or other traps).

Somehow this results in jn->stat never being marked STAT_DONE.  Perhaps
this happens because the "thisjob" global gets temporarily changed in
the TRAP* function?  Anyway signal_suspend(SIGCHLD, wait_cmd) is then
called when there are no children left, so we never receive another
SIGCHLD to break out of the while-loop, and even if we do come out of
signal_suspend() the while-loop goes around and we block again.

I'm not sure what to do if this is in fact the problem, because it
e.g. calling child_block() is before dont_queue_signals() has other
problems.

However, it's also possible that a child has exited even before its
job table entry has been created.  One way to find out if that has
happened is this patch:

diff --git a/Src/signals.c b/Src/signals.c
index 3950ad1..d72c7d6 100644
--- a/Src/signals.c
+++ b/Src/signals.c
@@ -519,6 +519,7 @@ wait_for_processes(void)
 	     * will get added on to the next found process that
 	     * terminates.
 	     */
+	    zwarn("no job table entry for pid %d", pid);
 	    get_usage();
 	}
 	/*

Mathias, if you could apply that patch and try again to reproduce the
deadlock, it might tell us something.

-- 
Barton E. Schaefer


  parent reply	other threads:[~2015-08-05  6:54 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-03 11:25 Mathias Fredriksson
2015-08-03 15:52 ` Bart Schaefer
2015-08-03 20:36   ` Mathias Fredriksson
2015-08-03 20:58     ` Bart Schaefer
2015-08-04 21:52       ` Mathias Fredriksson
2015-08-05  0:05         ` Mathias Fredriksson
2015-08-05  6:53         ` Bart Schaefer [this message]
2015-08-05 10:37           ` Mathias Fredriksson
2015-08-05 15:52             ` Bart Schaefer
2015-08-05 16:05               ` Mathias Fredriksson
2015-08-05 18:52                 ` Bart Schaefer
2015-08-05 19:11                   ` Mathias Fredriksson
2015-08-05 20:20                     ` Bart Schaefer
2015-08-05 21:49                       ` Mathias Fredriksson
2015-08-06  5:06                         ` Bart Schaefer
2015-08-06  8:24                           ` Mathias Fredriksson
2015-08-06 15:54                             ` Bart Schaefer
2015-08-07  0:45                               ` Mathias Fredriksson
2015-08-07  5:39                                 ` Bart Schaefer
2015-08-09 13:53                                   ` Mathias Fredriksson
2015-08-09 23:42                                     ` Bart Schaefer
2015-08-10  0:02                                       ` Mikael Magnusson
2015-08-10  0:16                                         ` Mathias Fredriksson
2015-08-10  1:47                                           ` Bart Schaefer
2015-08-10  2:02                                             ` Mikael Magnusson
2015-08-10 15:59                                               ` Bart Schaefer
2015-08-10 17:30                                                 ` Mathias Fredriksson
2015-08-10  0:36                                         ` Bart Schaefer
2015-08-10  0:29                                       ` Bart Schaefer
2015-08-10 19:34                                     ` Bart Schaefer
2015-08-10 21:17                                       ` Mathias Fredriksson
2015-08-10 22:53                                         ` Mathias Fredriksson
2015-08-11  0:53                                           ` Bart Schaefer
2015-08-11 12:17                                             ` Mathias Fredriksson
2015-08-11 14:38                                               ` Mathias Fredriksson
2015-08-11 15:07                                               ` Bart Schaefer
2015-08-11 15:22                                                 ` Mathias Fredriksson
2015-08-11  4:06                                           ` Running commands in a zpty worker Bart Schaefer
2015-08-11 13:14                                             ` Mathias Fredriksson
2015-08-11 14:35                                               ` Mathias Fredriksson
2015-08-11 14:37                                               ` Bart Schaefer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=150804235400.ZM9958@torch.brasslantern.com \
    --to=schaefer@brasslantern.com \
    --cc=zsh-workers@zsh.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).