supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: Laurent Bercot <ska-supervision@skarnet.org>
To: supervision@list.skarnet.org
Subject: Re: runsv failing when starting up logger - missing pipe - failure of logpipe init?
Date: Tue, 17 Aug 2010 09:57:20 +0200
Message-ID: <20100817075720.GA9754@skarnet.org>
In-Reply-To: <Pine.LNX.4.64.1008161922530.25187@e-smith.charlieb.ott.istop.com>

>> Has anyone else seen this error condition or can posit a situation where 
>> it might be seen?
> 
> The next question to ponder is where the bug lies. The runsv process here 
> has no fd 5 and fd 6 - IOW, logpipe[0] is 5, but isn't a valid fd. Are 
> there circumstances where a pipe can just cease to be? Should runsv have 
> detected this issue (where pipe() did not return -1, but the fds returned 
> were not valid)?
> 
> Is this a linux kernel bug?

 Before accusing the Linux kernel, let's check the runsv code and see
whether there's a possible execution path that leads to the situation
you're describing...

 The pipe creation part looks correct.
 The part where the error occurs looks correct.
 Okay, so is there a place where the pipe might be closed? Sure enough,
there is: right at the end, if svd[0].want == W_EXIT, svd[0].state == DOWN,
svd[1].pid != 0 and svd[1].want != W_EXIT, then logpipe[1] and logpipe[0]
both get closed. And this is the only place where it can happen.
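
 The condition described above can be written down as a small predicate.
This is only a sketch paraphrasing the description in this message, not the
runsv source: the struct layout, the enum values and the names
svdir_sketch, S_DOWN, S_RUN and logpipe_would_close are assumptions made
for illustration.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical stand-ins for runsv's per-service state.
 * Index 0 is the service, index 1 is its logger. */
enum sv_state { S_DOWN, S_RUN };
enum sv_want  { W_UP, W_DOWN, W_EXIT };

struct svdir_sketch {
    pid_t pid;            /* 0 when no process is running */
    enum sv_state state;
    enum sv_want want;
};

/* True when the state matches the condition under which both ends of
 * logpipe get closed: the service wants to exit and is down, while the
 * logger is still running and has not been told to exit. */
static bool logpipe_would_close(const struct svdir_sketch svd[2])
{
    return svd[0].want == W_EXIT
        && svd[0].state == S_DOWN
        && svd[1].pid != 0
        && svd[1].want != W_EXIT;
}
```

 If a later control message changes svd[0].want after this predicate has
been true once, the supervisor would keep running with both pipe ends
already closed, which is the scenario suspected here.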

 My bet is that at some point, your runsv ran through that code but
somehow stayed alive, and the services didn't die: another control
message was sent and processed before the exit condition was reached, so
runsv is still trying to supervise things and runs into trouble with the
closed logpipe. I have no time to investigate further right now, but earlier
in your strace, you should see the control messages arriving, the logpipe
getting closed, and so on.
 If my bet is correct, then the bug is that there's a case where runsv can
close the logpipe and still keep going, whereas it should exit as soon as
the logger dies no matter what (or just exit on the spot and let the logger
die on its own).
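
 On the question of whether a pipe can "cease to be": a descriptor number
that the process has closed simply becomes invalid, and that is detectable
from userspace. The sketch below is not taken from runsv; fd_is_open and
stale_pipe_check are hypothetical helpers that use fcntl(F_GETFD), which
fails with EBADF on a descriptor that is not open.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* True if fd refers to an open descriptor in this process.
 * fcntl(F_GETFD) fails with EBADF when fd is not open. */
static bool fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Creates a pipe, closes both ends, then probes the stale numbers.
 * Returns 0 if both are reported closed, 1 if either still looks open,
 * and -1 if pipe() itself failed. Assumes no other thread opens
 * descriptors in between. */
static int stale_pipe_check(void)
{
    int p[2];

    if (pipe(p) == -1)
        return -1;
    close(p[0]);
    close(p[1]);
    return (fd_is_open(p[0]) || fd_is_open(p[1])) ? 1 : 0;
}
```

 A supervisor that kept running after closing its log pipe would see the
same EBADF on any later use of those numbers, which would be consistent
with the missing fd 5 and fd 6 reported above.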

-- 
 Laurent


Thread overview: 5+ messages
2010-08-16 20:32 Charlie Brady
2010-08-16 23:30 ` Charlie Brady
2010-08-17  7:57   ` Laurent Bercot [this message]
2010-08-17 11:56     ` Charlie Brady
2010-08-17 20:51     ` Charlie Brady
