From: Ryan Woodrum <rwoodrum@avvo.com>
To: supervision@list.skarnet.org
Subject: Re: sv term handling with a slow child
Date: Wed, 16 Jan 2008 16:35:40 -0800
Message-ID: <200801161635.40876.rwoodrum@avvo.com>
In-Reply-To: <200801161604.45554.mike@geekgene.com>
For clarity, I should add that in my first, well-behaved example the
process does indeed exit:
ops1test:/home/rwoodrum/tmp# /etc/init.d/slow_signal start \
> && ps ax | grep slow \
> && sleep 12 \
> && /etc/init.d/slow_signal stop \
> && ps ax | grep slow
ok: run: slow_signal: (pid 31434) 58s
30229 ? Ss 0:00 runsv slow_signal
31434 ? S 0:00 /usr/bin/ruby /home/rwoodrum/tmp/slow_signal.rb
31446 ttyp0 S+ 0:00 grep slow
ok: down: slow_signal: 0s, normally up
30229 ? Ss 0:00 runsv slow_signal
31456 ttyp0 S+ 0:00 grep slow
On Wednesday 16 January 2008 03:04:45 pm Mike Buland wrote:
> Hi
>
> I went ahead and ran a few tests, including your ruby script. Apparently
> I can't reproduce the behaviour you describe.
>
> On linux (and POSIX systems) there is a default signal handler for many of
> the signals. The terminate signal normally ends the process. At least in
> my tests the ruby program is indeed terminated, the process ends, and the
> status in runit is set to 'd' or down. It is set to down, but the program
> is gone.
>
> When I wrote my own test in C:
> ----
> #include <unistd.h>   /* sleep() */
>
> int main(void)
> {
>     sleep( 50000 );   /* no TERM handler installed: default action applies */
>     return 0;
> }
> ----
>
> to test the behaviour of TERM everything works as expected. No term signal
> handler is registered, sending the program a term on the command line
> (kill -15 $pid) terminates the program. Then I tried ignoring term:
>
> ----
> #include <signal.h>   /* signal(), SIGTERM, SIG_IGN */
> #include <unistd.h>   /* sleep() */
>
> int main(void)
> {
>     signal( SIGTERM, SIG_IGN );   /* ignore TERM (signal 15) */
>     sleep( 50000 );
>     return 0;
> }
> ----
>
> And the program kept running. Testing both of these programs with runit
> gave the expected results. The program using the default signal handler
> exited as soon as runit sent it term, and the status of the service was set
> accordingly, for the second program term was ignored and runit went
> into "want down, got TERM" state.
>
> On your system, are you 100% sure that the ruby test program you're using
> isn't just exiting appropriately? I can't find anything that mimics the
> described behaviour, i.e. runit is behaving the way you describe, but the
> process does end.
>
> --Mike
>
> On Wednesday 16 January 2008 03:41:29 pm Ryan Woodrum wrote:
> > Hello!
> >
> > I believe I have found a possible bug/oddity in the behavior of sv
> > using runsv. I happened upon this particular scenario in a test
> > environment, but was actually able to repro it in my production
> > environment as well as in a primitive case. The issue involves slow
> > children or children whose TERM handler isn't registered soon enough.
> >
> > Here's the setup:
> > I create a simplistic base service configuration under which I will
> > run a ruby application. The ruby app looks like so:
> > slow_signal.rb
> > ---
> > sleep(10)
> >
> > puts "registering term handler..."
> > trap("TERM") do
> >   puts "got term"
> >   exit
> > end
> >
> > while(true) do
> >   puts "looping and sleeping..."
> >   sleep 2
> > end
> > ---
> >
> > I run this from the service directory's run script:
> > #!/bin/sh
> > exec 2>&1
> > exec /usr/bin/ruby /home/rwoodrum/tmp/slow_signal.rb
> >
> >
> > The premise of the primitive ruby application is to emulate a slow-ish
> > loading base of code whose TERM handler is registered only once
> > loading has finished.
> >
> > If I invoke:
> > /etc/init.d/slow_signal start
> >
> > followed within the 10 second sleep period by:
> > /etc/init.d/slow_signal stop
> >
> > (/etc/init.d/slow_signal is a symlink to /usr/bin/sv)
> >
> > The process does not handle the signal but its state is set to 'd';
> > down. In subsequent calls to control() within sv.c, it will no longer
> > write to the pipe because it thinks there is no need. With no further
> > writes to the pipe, another TERM will never get sent and so the
> > process cannot be shut down via sv/runsv, at least not with TERM.
> >
> > It took me a while to learn how everything works and to track down
> > just where this check was happening. The source I worked against was
> > the source available via the debian package v1.8.0 (`apt-get source
> > runit` under debian sid). (I looked for a repo but did not find a public
> > one.)
> >
> > I can think of two solutions. The first is not to set svstatus[17]
> > unless you're sure the process actually went down, but this is more
> > complicated (perhaps more correct?) than the second. The second is a
> > modification inside control() in sv.c to always send a TERM, like so:
> > -----
> > 247c247,248
> > < if (svstatus[17] == *a) return(0);
> > ---
> > > /* Write a TERM to the pipe even if we already have. Slow TERM
> > >    handler perhaps? What about other cases? */
> > > if (svstatus[17] == *a && *a != 'd') return(0);
> > -----
> >
> > In this case, we simply decide that, if we want to issue a TERM via sv
> > stop, down etc., we will go ahead and write to the pipe again, even
> > if we think we don't need to. This way, we're not stuck in "want down,
> > got TERM."
> >
> > So with an answer in hand... is this behavior by design? It seems
> > that a particularly slow child shouldn't immunize itself from a TERM
> > because of a slow load time or late signal handler registration.
> >
> > Thoughts appreciated! Thanks!
> >
> > -ryan woodrum