supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
 help / color / mirror / Atom feed
* NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers
       [not found]                     ` <20161208132842.5d7940bd@eto-mona.office.smartweb.sk>
@ 2016-12-08 20:23                       ` Jonathan de Boyne Pollard
  2016-12-11 12:46                         ` Jilles Tjoelker
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan de Boyne Pollard @ 2016-12-08 20:23 UTC (permalink / raw)
  To: FreeBSD Hackers, supervision

Martin "eto" Misuth:
> I think that might be the reason why my PID1 s6-svscan on FreeBSD is 
> accumulating zombies sometimes (seems like it is affected by dead 
> descendants of ssh and my experiments). [...]
> Anyway as you are probably much closer to FreeBSD team than I am, [...]

I'm not.  You have the same access as I and the rest of the world have.

For what it's worth, I've seen similar behaviour with zombies lying 
around.  If we can nail it down you can file a kernel bug report.

Have you checked that you aren't getting a NOTE_EXIT?
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers
  2016-12-08 20:23                       ` NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers Jonathan de Boyne Pollard
@ 2016-12-11 12:46                         ` Jilles Tjoelker
  2016-12-11 17:54                           ` Konstantin Belousov
  2016-12-13 12:39                           ` Jonathan de Boyne Pollard
  0 siblings, 2 replies; 6+ messages in thread
From: Jilles Tjoelker @ 2016-12-11 12:46 UTC (permalink / raw)
  To: Jonathan de Boyne Pollard; +Cc: FreeBSD Hackers, supervision

On Thu, Dec 08, 2016 at 08:23:29PM +0000, Jonathan de Boyne Pollard wrote:
> Martin "eto" Misuth:
> > I think that might be the reason why my PID1 s6-svscan on FreeBSD is 
> > accumulating zombies sometimes (seems like it is affected by dead 
> > descendants of ssh and my experiments). [...]
> > Anyway as you are probably much closer to FreeBSD team than I am, [...]

> I'm not.  You have the same access as I and the rest of the world have.

> For what it's worth, I've seen similar behaviour with zombies lying 
> around.  If we can nail it down you can file a kernel bug report.

> Have you checked that you aren't getting a NOTE_EXIT?

As reported in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213928
reapers do not receive SIGCHLD when inheriting zombies, although they do
receive SIGCHLD when an inherited descendant process later terminates
and are awakened when in a wait call for a matching process. This should
probably be fixed.

NOTE_TRACK has the inherent problem of what to do when NOTE_TRACKERR
happens and has problems with process ID reuse, so it seems better to
use reaper instead of it.

-- 
Jilles Tjoelker
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers
  2016-12-11 12:46                         ` Jilles Tjoelker
@ 2016-12-11 17:54                           ` Konstantin Belousov
  2016-12-11 22:23                             ` Jilles Tjoelker
  2016-12-13 12:39                           ` Jonathan de Boyne Pollard
  1 sibling, 1 reply; 6+ messages in thread
From: Konstantin Belousov @ 2016-12-11 17:54 UTC (permalink / raw)
  To: Jilles Tjoelker; +Cc: FreeBSD Hackers, Jonathan de Boyne Pollard, supervision

On Sun, Dec 11, 2016 at 01:46:38PM +0100, Jilles Tjoelker wrote:
> As reported in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213928
> reapers do not receive SIGCHLD when inheriting zombies, although they do
> receive SIGCHLD when an inherited descendant process later terminates
> and are awakened when in a wait call for a matching process. This should
> probably be fixed.
I agree, in principle.  Could you update your tests/sys/kern/reaper.c to
include this scenario ?

Untested change to the kernel side is below.

diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c
index f4f453c3556..92899740eef 100644
--- a/sys/kern/kern_exit.c
+++ b/sys/kern/kern_exit.c
@@ -455,6 +455,12 @@ exit1(struct thread *td, int rval, int signo)
 
 		if (!(q->p_flag & P_TRACED)) {
 			proc_reparent(q, q->p_reaper);
+			if (q->p_state == PRS_ZOMBIE) {
+				PROC_LOCK(q->p_reaper);
+				pksignal(q->p_reaper, SIGCHLD, q->p_ksi);
+				wakeup(q->p_reaper);
+				PROC_UNLOCK(q->p_reaper);
+			}
 		} else {
 			/*
 			 * Traced processes are killed since their existence
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers
  2016-12-11 17:54                           ` Konstantin Belousov
@ 2016-12-11 22:23                             ` Jilles Tjoelker
  2016-12-12 11:12                               ` Konstantin Belousov
  0 siblings, 1 reply; 6+ messages in thread
From: Jilles Tjoelker @ 2016-12-11 22:23 UTC (permalink / raw)
  To: Konstantin Belousov
  Cc: FreeBSD Hackers, Jonathan de Boyne Pollard, supervision

On Sun, Dec 11, 2016 at 07:54:51PM +0200, Konstantin Belousov wrote:
> On Sun, Dec 11, 2016 at 01:46:38PM +0100, Jilles Tjoelker wrote:
> > As reported in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213928
> > reapers do not receive SIGCHLD when inheriting zombies, although they do
> > receive SIGCHLD when an inherited descendant process later terminates
> > and are awakened when in a wait call for a matching process. This should
> > probably be fixed.
> I agree, in principle.  Could you update your tests/sys/kern/reaper.c to
> include this scenario ?

I tested with the patch below. The new test case
reaper_sigchld_child_first already passes and I have marked
reaper_sigchld_grandchild_first as an expected timeout.

> Untested change to the kernel side is below.

> diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c
> index f4f453c3556..92899740eef 100644
> --- a/sys/kern/kern_exit.c
> +++ b/sys/kern/kern_exit.c
> @@ -455,6 +455,12 @@ exit1(struct thread *td, int rval, int signo)
>  
>  		if (!(q->p_flag & P_TRACED)) {
>  			proc_reparent(q, q->p_reaper);
> +			if (q->p_state == PRS_ZOMBIE) {
> +				PROC_LOCK(q->p_reaper);
> +				pksignal(q->p_reaper, SIGCHLD, q->p_ksi);
> +				wakeup(q->p_reaper);
> +				PROC_UNLOCK(q->p_reaper);
> +			}
>  		} else {
>  			/*
>  			 * Traced processes are killed since their existence

This change makes reaper_sigchld_grandchild_first pass.

The wakeup added here is redundant with the one a few lines above:
        if (q != NULL)          /* only need this if any child is S_ZOMB */
                wakeup(q->p_reaper);

This is also clear because reaper_wait_grandchild_first already passes.

Index: tests/sys/kern/reaper.c
===================================================================
--- tests/sys/kern/reaper.c	(revision 309839)
+++ tests/sys/kern/reaper.c	(working copy)
@@ -35,6 +35,11 @@
 #include <signal.h>
 #include <unistd.h>
 
+static void
+dummy_sighandler(int sig __unused, siginfo_t *info __unused, void *ctx __unused)
+{
+}
+
 ATF_TC_WITHOUT_HEAD(reaper_wait_child_first);
 ATF_TC_BODY(reaper_wait_child_first, tc)
 {
@@ -129,6 +134,163 @@
 	ATF_CHECK_EQ(2, r);
 }
 
+ATF_TC(reaper_sigchld_child_first);
+ATF_TC_HEAD(reaper_sigchld_child_first, tc)
+{
+	atf_tc_set_md_var(tc, "timeout", "2");
+}
+ATF_TC_BODY(reaper_sigchld_child_first, tc)
+{
+	struct sigaction act;
+	sigset_t mask;
+	siginfo_t info;
+	pid_t parent, child, grandchild, pid;
+	int r;
+	int pip[2];
+
+	/* Be paranoid. */
+	pid = waitpid(-1, NULL, WNOHANG);
+	ATF_REQUIRE(pid == -1 && errno == ECHILD);
+
+	act.sa_sigaction = dummy_sighandler;
+	act.sa_flags = SA_SIGINFO | SA_RESTART;
+	r = sigemptyset(&act.sa_mask);
+	ATF_REQUIRE_EQ(0, r);
+	r = sigaction(SIGCHLD, &act, NULL);
+	ATF_REQUIRE_EQ(0, r);
+
+	r = sigemptyset(&mask);
+	ATF_REQUIRE_EQ(0, r);
+	r = sigaddset(&mask, SIGCHLD);
+	ATF_REQUIRE_EQ(0, r);
+	r = sigprocmask(SIG_BLOCK, &mask, NULL);
+	ATF_REQUIRE_EQ(0, r);
+
+	parent = getpid();
+	r = procctl(P_PID, parent, PROC_REAP_ACQUIRE, NULL);
+	ATF_REQUIRE_EQ(0, r);
+
+	r = pipe(pip);
+	ATF_REQUIRE_EQ(0, r);
+
+	child = fork();
+	ATF_REQUIRE(child != -1);
+	if (child == 0) {
+		if (close(pip[1]) != 0)
+			_exit(100);
+		grandchild = fork();
+		if (grandchild == -1)
+			_exit(101);
+		else if (grandchild == 0) {
+			if (read(pip[0], &(uint8_t){ 0 }, 1) != 0)
+				_exit(102);
+			if (getppid() != parent)
+				_exit(103);
+			_exit(2);
+		} else
+			_exit(3);
+	}
+
+	r = sigwaitinfo(&mask, &info);
+	ATF_REQUIRE_EQ(SIGCHLD, r);
+	ATF_CHECK_EQ(SIGCHLD, info.si_signo);
+	ATF_CHECK_EQ(CLD_EXITED, info.si_code);
+	ATF_CHECK_EQ(3, info.si_status);
+	ATF_CHECK_EQ(child, info.si_pid);
+
+	pid = waitpid(child, NULL, 0);
+	ATF_REQUIRE_EQ(child, pid);
+
+	r = close(pip[1]);
+	ATF_REQUIRE_EQ(0, r);
+
+	r = sigwaitinfo(&mask, &info);
+	ATF_REQUIRE_EQ(SIGCHLD, r);
+	ATF_CHECK_EQ(SIGCHLD, info.si_signo);
+	ATF_CHECK_EQ(CLD_EXITED, info.si_code);
+	ATF_CHECK_EQ(2, info.si_status);
+	grandchild = info.si_pid;
+	ATF_REQUIRE(grandchild > 0);
+	ATF_REQUIRE(grandchild != parent);
+	ATF_REQUIRE(grandchild != child);
+
+	pid = waitpid(-1, NULL, 0);
+	ATF_REQUIRE_EQ(grandchild, pid);
+
+	r = close(pip[0]);
+	ATF_REQUIRE_EQ(0, r);
+}
+
+ATF_TC(reaper_sigchld_grandchild_first);
+ATF_TC_HEAD(reaper_sigchld_grandchild_first, tc)
+{
+	atf_tc_set_md_var(tc, "timeout", "2");
+}
+ATF_TC_BODY(reaper_sigchld_grandchild_first, tc)
+{
+	struct sigaction act;
+	sigset_t mask;
+	siginfo_t info;
+	pid_t parent, child, grandchild, pid;
+	int r;
+
+	atf_tc_expect_timeout("bug 213928");
+
+	/* Be paranoid. */
+	pid = waitpid(-1, NULL, WNOHANG);
+	ATF_REQUIRE(pid == -1 && errno == ECHILD);
+
+	act.sa_sigaction = dummy_sighandler;
+	act.sa_flags = SA_SIGINFO | SA_RESTART;
+	r = sigemptyset(&act.sa_mask);
+	ATF_REQUIRE_EQ(0, r);
+	r = sigaction(SIGCHLD, &act, NULL);
+	ATF_REQUIRE_EQ(0, r);
+
+	r = sigemptyset(&mask);
+	ATF_REQUIRE_EQ(0, r);
+	r = sigaddset(&mask, SIGCHLD);
+	ATF_REQUIRE_EQ(0, r);
+	r = sigprocmask(SIG_BLOCK, &mask, NULL);
+	ATF_REQUIRE_EQ(0, r);
+
+	parent = getpid();
+	r = procctl(P_PID, parent, PROC_REAP_ACQUIRE, NULL);
+	ATF_REQUIRE_EQ(0, r);
+
+	child = fork();
+	ATF_REQUIRE(child != -1);
+	if (child == 0) {
+		grandchild = fork();
+		if (grandchild == -1)
+			_exit(101);
+		else if (grandchild == 0)
+			_exit(2);
+		else {
+			if (waitid(P_PID, grandchild, NULL,
+			    WNOWAIT | WEXITED) != 0)
+				_exit(102);
+			_exit(3);
+		}
+	}
+
+	pid = waitpid(child, NULL, 0);
+	ATF_REQUIRE_EQ(child, pid);
+
+	r = sigwaitinfo(&mask, &info);
+	ATF_REQUIRE_EQ(SIGCHLD, r);
+	ATF_CHECK_EQ(SIGCHLD, info.si_signo);
+	ATF_CHECK_EQ(CLD_EXITED, info.si_code);
+	ATF_CHECK_EQ(2, info.si_status);
+	grandchild = info.si_pid;
+	ATF_REQUIRE(grandchild > 0);
+	ATF_REQUIRE(grandchild != parent);
+	ATF_REQUIRE(grandchild != child);
+
+	pid = waitpid(-1, NULL, 0);
+	ATF_REQUIRE_EQ(grandchild, pid);
+}
+
 ATF_TC_WITHOUT_HEAD(reaper_status);
 ATF_TC_BODY(reaper_status, tc)
 {
@@ -484,6 +646,8 @@
 
 	ATF_TP_ADD_TC(tp, reaper_wait_child_first);
 	ATF_TP_ADD_TC(tp, reaper_wait_grandchild_first);
+	ATF_TP_ADD_TC(tp, reaper_sigchld_child_first);
+	ATF_TP_ADD_TC(tp, reaper_sigchld_grandchild_first);
 	ATF_TP_ADD_TC(tp, reaper_status);
 	ATF_TP_ADD_TC(tp, reaper_getpids);
 	ATF_TP_ADD_TC(tp, reaper_kill_badsig);

-- 
Jilles Tjoelker
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers
  2016-12-11 22:23                             ` Jilles Tjoelker
@ 2016-12-12 11:12                               ` Konstantin Belousov
  0 siblings, 0 replies; 6+ messages in thread
From: Konstantin Belousov @ 2016-12-12 11:12 UTC (permalink / raw)
  To: Jilles Tjoelker; +Cc: FreeBSD Hackers, Jonathan de Boyne Pollard, supervision

On Sun, Dec 11, 2016 at 11:23:28PM +0100, Jilles Tjoelker wrote:
> On Sun, Dec 11, 2016 at 07:54:51PM +0200, Konstantin Belousov wrote:
> > On Sun, Dec 11, 2016 at 01:46:38PM +0100, Jilles Tjoelker wrote:
> > > As reported in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213928
> > > reapers do not receive SIGCHLD when inheriting zombies, although they do
> > > receive SIGCHLD when an inherited descendant process later terminates
> > > and are awakened when in a wait call for a matching process. This should
> > > probably be fixed.
> > I agree, in principle.  Could you update your tests/sys/kern/reaper.c to
> > include this scenario ?
> 
> I tested with the patch below. The new test case
> reaper_sigchld_child_first already passes and I have marked
> reaper_sigchld_grandchild_first as an expected timeout.
> 
> > Untested change to the kernel side is below.
> 
> > diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c
> > index f4f453c3556..92899740eef 100644
> > --- a/sys/kern/kern_exit.c
> > +++ b/sys/kern/kern_exit.c
> > @@ -455,6 +455,12 @@ exit1(struct thread *td, int rval, int signo)
> >  
> >  		if (!(q->p_flag & P_TRACED)) {
> >  			proc_reparent(q, q->p_reaper);
> > +			if (q->p_state == PRS_ZOMBIE) {
> > +				PROC_LOCK(q->p_reaper);
> > +				pksignal(q->p_reaper, SIGCHLD, q->p_ksi);
> > +				wakeup(q->p_reaper);
> > +				PROC_UNLOCK(q->p_reaper);
> > +			}
> >  		} else {
> >  			/*
> >  			 * Traced processes are killed since their existence
> 
> This change makes reaper_sigchld_grandchild_first pass.
> 
> The wakeup added here is redundant with the one a few lines above:
>         if (q != NULL)          /* only need this if any child is S_ZOMB */
>                 wakeup(q->p_reaper);
I removed additional wakeup, it should be innocent.  Might be the better
change is to remove the lines you cited above and do wakeups inside the
loop after reparenting.  But I kept it that way for now.

> 
> This is also clear because reaper_wait_grandchild_first already passes.
> 
> Index: tests/sys/kern/reaper.c

Thank you, I verified that all tests pass with the patched kernel and
removed timeouts from your reaper_sigchld_*child_first new tests.
The kernel chunk is committed as r309886, please commit tests.
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers
  2016-12-11 12:46                         ` Jilles Tjoelker
  2016-12-11 17:54                           ` Konstantin Belousov
@ 2016-12-13 12:39                           ` Jonathan de Boyne Pollard
  1 sibling, 0 replies; 6+ messages in thread
From: Jonathan de Boyne Pollard @ 2016-12-13 12:39 UTC (permalink / raw)
  To: FreeBSD Hackers, supervision

Jilles Tjoelker:
> This should probably be fixed. 

There's another more insidious bug hiding inside kevent() somewhere that 
causes a kernel abend complaining about a sleeping thread holding a 
non-sleepable lock.  One needs to make fairly heavy use of kevent() in 
order to trigger it, I believe, as I have only seen it so far on systems 
that do.
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-12-13 12:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20161102185444.GA911@protected.rcdrun.com>
     [not found] ` <20161201081829.GG1487@protected.rcdrun.com>
     [not found]   ` <20161201120531.374588b2@mydesk.domain.cxm>
     [not found]     ` <20161201172846.GP3428@protected.rcdrun.com>
     [not found]       ` <20161201124118.46778e2b@mydesk.domain.cxm>
     [not found]         ` <20161201174837.GR3428@protected.rcdrun.com>
     [not found]           ` <20161201125438.15230317@mydesk.domain.cxm>
     [not found]             ` <20161206104020.6b2ebb30@eto-mona.office.smartweb.sk>
     [not found]               ` <20161206102637.1ddd152a@mydesk.domain.cxm>
     [not found]                 ` <20161207155638.4b2dd629@eto-mona.office.smartweb.sk>
     [not found]                   ` <630ace89-e29b-d0d3-9f15-110d8dc3de08@NTLWorld.com>
     [not found]                     ` <20161208132842.5d7940bd@eto-mona.office.smartweb.sk>
2016-12-08 20:23                       ` NOTE_TRACK, EVFILT_PROC, kqueue, and subreapers Jonathan de Boyne Pollard
2016-12-11 12:46                         ` Jilles Tjoelker
2016-12-11 17:54                           ` Konstantin Belousov
2016-12-11 22:23                             ` Jilles Tjoelker
2016-12-12 11:12                               ` Konstantin Belousov
2016-12-13 12:39                           ` Jonathan de Boyne Pollard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).