supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
* runit under sysvinit - coping when runsvdir dies
@ 2005-11-14 18:03 Charles Duffy
  2005-11-16  3:50 ` Charlie Brady
  2005-11-17  9:08 ` Gerrit Pape
  0 siblings, 2 replies; 4+ messages in thread
From: Charles Duffy @ 2005-11-14 18:03 UTC


I recently had a situation on a fielded server where runsvdir (from 
runit 1.2.3) apparently died; in any event, all the runsv processes 
which would typically be directly under runsvdir were instead inherited 
by sysvinit and running there. However, since runsvdir was respawned by 
sysvinit, a great deal of CPU time was being spent continuously trying 
to start new runsv instances under the fresh runsvdir -- attempts which 
failed because there were still runsv instances alive and holding open 
the relevant locks. I killed the old runsv instances with "runsvctrl e", 
and fresh children of the new runsvdir took their place -- but there are 
still some questions raised:

- How could this have happened? The system's message log doesn't show 
the OOM killer taking down runsvdir or any segfault on the part of the same.

- How could such situations be more gracefully handled in the future? 
Having the customer call and complain because their server was unusably 
slow was a less-than-ideal way to find out about this issue.
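
The lock contention described above can be illustrated with a minimal
Python sketch. It assumes each supervisor takes an exclusive,
non-blocking flock() on a per-service lock file; the file name "lock",
the helper try_lock, and the choice of flock() are assumptions made for
illustration, not details taken from runsv's source.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on path.

    Returns the open file descriptor on success, or None if another
    open file description already holds the lock.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def demo():
    """Model an orphaned supervisor blocking a freshly started one."""
    with tempfile.TemporaryDirectory() as service_dir:
        lock_path = os.path.join(service_dir, "lock")  # hypothetical name
        old = try_lock(lock_path)    # orphaned supervisor still holds the lock
        new = try_lock(lock_path)    # fresh supervisor cannot acquire it
        os.close(old)                # old supervisor exits, releasing the lock
        retry = try_lock(lock_path)  # a later attempt now succeeds
        result = (old is not None, new is None, retry is not None)
        os.close(retry)
        return result
```

As long as the old holder stays alive, every retry by the new supervisor
fails in the same way, which is consistent with the busy respawn loop
described in the message.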




* Re: runit under sysvinit - coping when runsvdir dies
  2005-11-14 18:03 runit under sysvinit - coping when runsvdir dies Charles Duffy
@ 2005-11-16  3:50 ` Charlie Brady
  2005-11-17  9:08 ` Gerrit Pape
  1 sibling, 0 replies; 4+ messages in thread
From: Charlie Brady @ 2005-11-16  3:50 UTC
  Cc: supervision


On Mon, 14 Nov 2005, Charles Duffy wrote:

> I recently had a situation on a fielded server where runsvdir (from runit 
> 1.2.3) apparently died; in any event, all the runsv processes which would 
> typically be directly under runsvdir were instead inherited by sysvinit and 
> running there. However, since runsvdir was respawned by sysvinit, a great 
> deal of CPU time was being spent continuously trying to start new runsv 
> instances under the fresh runsvdir -- attempts which failed because there 
> were still runsv instances alive and holding open the relevant locks. I 
> killed the old runsv instances with "runsvctrl e", and fresh children of the 
> new runsvdir took their place -- but there are still some questions raised:
>
> - How could this have happened? The system's message log doesn't show the OOM 
> killer taking down runsvdir or any segfault on the part of the same.

If any of the run scripts (or programs exec'd by the run scripts) did not 
create a new process group, but sent a kill or term signal to its own 
process group (pppd used to do this in the past, and maybe still does), 
then I think it could have brought down the whole pack of cards. In order 
to prevent this, I think that each runsv should start a new process group 
for the run script to execute in.
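
A minimal Python sketch of the isolation suggested above; it is not
runsv's code. The helper run_isolated and the stand-in child command are
hypothetical. The sketch uses start_new_session=True, which calls
setsid() in the child and is therefore somewhat stronger than only
creating a new process group, but it has the same effect on group-wide
signals.

```python
import signal
import subprocess
import sys


def run_isolated(argv):
    """Start argv as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


# Stand-in for a misbehaving run script: it sends SIGTERM to every
# process in its own process group (pid 0 means "my process group").
MISBEHAVING_CHILD = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(0, signal.SIGTERM)",
]


def demo():
    """Run the misbehaving child in isolation and return its exit status."""
    proc = run_isolated(MISBEHAVING_CHILD)
    # The child is the only member of its group, so the signal reaches
    # the child alone; the caller of demo() is unaffected.
    return proc.wait()
```

Because the child is the sole member of its process group, the
group-wide SIGTERM terminates only the child. Without the new session,
the same signal would also be delivered to the parent and to any
siblings sharing the parent's process group.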

Note that for any service with a 'down' file, runsv crashing could mean 
that the service changes from an up state to a down state - which is 
almost certainly not what you want to happen (you're running runit to 
prevent such unrequested transitions).



* Re: runit under sysvinit - coping when runsvdir dies
  2005-11-14 18:03 runit under sysvinit - coping when runsvdir dies Charles Duffy
  2005-11-16  3:50 ` Charlie Brady
@ 2005-11-17  9:08 ` Gerrit Pape
  2005-11-17 11:13   ` Charles Duffy
  1 sibling, 1 reply; 4+ messages in thread
From: Gerrit Pape @ 2005-11-17  9:08 UTC


On Mon, Nov 14, 2005 at 12:03:20PM -0600, Charles Duffy wrote:
> I recently had a situation on a fielded server where runsvdir (from 
> runit 1.2.3) apparently died; in any event, all the runsv processes 
> which would typically be directly under runsvdir were instead inherited 
> by sysvinit and running there. However, since runsvdir was respawned by 
> sysvinit, a great deal of CPU time was being spent continuously trying 
> to start new runsv instances under the fresh runsvdir -- attempts which 
> failed because there were still runsv instances alive and holding open 
> the relevant locks. I killed the old runsv instances with "runsvctrl e", 
> and fresh children of the new runsvdir took their place -- but there are 
> still some questions raised:
> 
> - How could this have happened? The system's message log doesn't show 
> the OOM killer taking down runsvdir or any segfault on the part of the same.

Charlie answered that.

> - How could such situations be more gracefully handled in the future? 
> Having the customer call and complain because their server was unusably 
> slow was a less-than-ideal way to find out about this issue.

If runsvdir receives the HUP signal, it sends a term signal to all runsv
processes it manages.  On the TERM signal, it simply exits, leaving the
runsv processes alone.  Maybe I should switch that, so that TERM signals
'by mistake' are handled better.  It then would re-init the complete
system, stopping all services plus supervisors on TERM, and starting
them up again after being re-spawned through inittab.

I'm not yet sure about side-effects of this change though.
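
A minimal Python sketch of the switched TERM behaviour under
discussion: the parent stops every supervisor it started before
exiting, so that a respawned parent begins from a clean state. All names
here (stop_supervisors, install_term_handler, spawn_dummy_supervisors)
are hypothetical, and the sleeping children merely stand in for runsv
processes; this does not reflect how runsvdir is implemented.

```python
import signal
import subprocess
import sys


def spawn_dummy_supervisors(count):
    """Start count long-sleeping children that stand in for supervisors."""
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    return [subprocess.Popen(cmd) for _ in range(count)]


def stop_supervisors(children, timeout=5.0):
    """Send SIGTERM to every running child, then wait for each to exit.

    Returns the list of exit statuses in the same order as children.
    """
    for child in children:
        if child.poll() is None:
            child.send_signal(signal.SIGTERM)
    return [child.wait(timeout=timeout) for child in children]


def install_term_handler(children):
    """On SIGTERM, stop all children first and only then exit."""
    def on_term(signum, frame):
        stop_supervisors(children)
        sys.exit(0)

    signal.signal(signal.SIGTERM, on_term)
```

A parent would call install_term_handler(children) once after spawning.
Whether stopping every service on an accidental TERM is desirable is
exactly the side-effect question raised above.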

Regards, Gerrit.



* Re: runit under sysvinit - coping when runsvdir dies
  2005-11-17  9:08 ` Gerrit Pape
@ 2005-11-17 11:13   ` Charles Duffy
  0 siblings, 0 replies; 4+ messages in thread
From: Charles Duffy @ 2005-11-17 11:13 UTC


Gerrit Pape wrote:
> If runsvdir receives the HUP signal, it sends a term signal to all runsv
> processes it manages.  On the TERM signal, it simply exits, leaving the
> runsv processes alone.  Maybe I should switch that, so that TERM signals
> 'by mistake' are handled better.  It then would re-init the complete
> system, stopping all services plus supervisors on TERM, and starting
> them up again after being re-spawned through inittab.

In cases where runsvdir is going to be automatically respawned on exit, 
this certainly seems to make sense, though you're right inasmuch as 
there's potential for side-effects on the system shutdown procedure... 
how about having a file's (non)existence be used to control this 
behaviour, akin to where runit-init checks for the existence of stopit 
before shutting down?
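
A minimal Python sketch of such a file-based switch. The flag name
"stop-on-term" and the helper term_should_stop_services are invented for
illustration; no such file is defined by runit.

```python
import os
import tempfile

FLAG_NAME = "stop-on-term"  # hypothetical flag file name


def term_should_stop_services(control_dir, flag_name=FLAG_NAME):
    """Return True if the flag file exists in control_dir."""
    return os.path.exists(os.path.join(control_dir, flag_name))


def demo():
    """Check the switch before and after creating the flag file."""
    with tempfile.TemporaryDirectory() as control_dir:
        before = term_should_stop_services(control_dir)
        open(os.path.join(control_dir, FLAG_NAME), "w").close()
        after = term_should_stop_services(control_dir)
        return before, after
```

A TERM handler could consult this check and either stop all supervisors
or exit and leave them running, so the system shutdown procedure can
select the behaviour it needs.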


