* runit under sysvinit - coping when runsvdir dies
@ 2005-11-14 18:03 Charles Duffy
2005-11-16 3:50 ` Charlie Brady
2005-11-17 9:08 ` Gerrit Pape
0 siblings, 2 replies; 4+ messages in thread
From: Charles Duffy @ 2005-11-14 18:03 UTC
I recently had a situation on a fielded server where runsvdir (from
runit 1.2.3) apparently died; in any event, all the runsv processes
which would typically be directly under runsvdir were instead inherited
by sysvinit and running there. However, since runsvdir was respawned by
sysvinit, a great deal of CPU time was being spent continuously trying
to start new runsv instances under the fresh runsvdir -- attempts which
failed because there were still runsv instances alive and holding open
the relevant locks. I killed the old runsv instances with "runsvctrl e",
and fresh children of the new runsvdir took their place -- but there are
still some questions raised:
- How could this have happened? The system's message log doesn't show
the OOM killer taking down runsvdir or any segfault on the part of the same.
- How could such situations be more gracefully handled in the future?
Having the customer call and complain because their server was unusably
slow was a less-than-ideal way to find out about this issue.
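The failed restart attempts described above are what an exclusive, non-blocking lock produces while the previous holder is still alive. A minimal Python sketch of that behaviour, assuming a flock()-style advisory lock on a per-service lock file; the `try_lock` helper and the lock file path are hypothetical and not taken from runit's source:

```python
import fcntl
import os
import tempfile

def try_lock(path):
    """Return an fd holding an exclusive lock on path, or None if it is taken."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail at once instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd

lock_path = os.path.join(tempfile.mkdtemp(), "lock")  # hypothetical lock file
old_supervisor = try_lock(lock_path)  # stands in for the orphaned supervisor
new_supervisor = try_lock(lock_path)  # the fresh attempt fails immediately
os.close(old_supervisor)              # once the old holder goes away ...
retry = try_lock(lock_path)           # ... a new attempt can take the lock
```

A supervisor that retries such a failing attempt in a tight loop would account for the CPU load; killing the old lock holders is what lets the retry succeed.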
* Re: runit under sysvinit - coping when runsvdir dies
From: Charlie Brady @ 2005-11-16 3:50 UTC
Cc: supervision
On Mon, 14 Nov 2005, Charles Duffy wrote:
> I recently had a situation on a fielded server where runsvdir (from runit
> 1.2.3) apparently died; in any event, all the runsv processes which would
> typically be directly under runsvdir were instead inherited by sysvinit and
> running there. However, since runsvdir was respawned by sysvinit, a great
> deal of CPU time was being spent continuously trying to start new runsv
> instances under the fresh runsvdir -- attempts which failed because there
> were still runsv instances alive and holding open the relevant locks. I
> killed the old runsv instances with "runsvctrl e", and fresh children of the
> new runsvdir took their place -- but there are still some questions raised:
>
> - How could this have happened? The system's message log doesn't show the OOM
> killer taking down runsvdir or any segfault on the part of the same.
If any of the run scripts (or programs exec'd by the run scripts) did not
create a new process group, but sent a kill or term signal to its own
process group (pppd used to do this in the past, and maybe still does),
then I think it could have brought down the whole pack of cards. In order
to prevent this, I think that each runsv should start a new process group
for the run script to execute in.
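A minimal Python sketch of the isolation suggested above, not runsv's actual code: the parent starts the child in its own session (and therefore its own process group), so a child that signals its whole group cannot reach the parent. The `run_isolated` helper and the misbehaving child are illustrative assumptions.

```python
import signal
import subprocess
import sys

def run_isolated(argv):
    """Run argv in a new session/process group and return its exit status."""
    # start_new_session=True makes the child call setsid() before exec,
    # so a signal the child sends to its own group stays inside that group.
    child = subprocess.Popen(argv, start_new_session=True)
    return child.wait()

# Stand-in for a misbehaving run script: it sends TERM to its process group.
MISBEHAVING = [
    sys.executable, "-c",
    "import os, signal; os.killpg(os.getpgrp(), signal.SIGTERM)",
]

status = run_isolated(MISBEHAVING)
# A negative status means the child was killed by that signal number;
# the parent is still running because it sits in a different group.
```

Without `start_new_session=True` the child would share the parent's process group, and the same signal would also be delivered to the parent and its siblings.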
Note that for any service with a 'down' file, runsv crashing could mean
that the service changes from an up state to a down state - which is
almost certainly not what you want to happen (you're running runit to
prevent such unrequested transitions).
* Re: runit under sysvinit - coping when runsvdir dies
From: Gerrit Pape @ 2005-11-17 9:08 UTC
On Mon, Nov 14, 2005 at 12:03:20PM -0600, Charles Duffy wrote:
> I recently had a situation on a fielded server where runsvdir (from
> runit 1.2.3) apparently died; in any event, all the runsv processes
> which would typically be directly under runsvdir were instead inherited
> by sysvinit and running there. However, since runsvdir was respawned by
> sysvinit, a great deal of CPU time was being spent continuously trying
> to start new runsv instances under the fresh runsvdir -- attempts which
> failed because there were still runsv instances alive and holding open
> the relevant locks. I killed the old runsv instances with "runsvctrl e",
> and fresh children of the new runsvdir took their place -- but there are
> still some questions raised:
>
> - How could this have happened? The system's message log doesn't show
> the OOM killer taking down runsvdir or any segfault on the part of the same.
Charlie answered that.
> - How could such situations be more gracefully handled in the future?
> Having the customer call and complain because their server was unusably
> slow was a less-than-ideal way to find out about this issue.
If runsvdir receives the HUP signal, it sends a term signal to all runsv
processes it manages. On the TERM signal, it simply exits, leaving the
runsv processes alone. Maybe I should switch that, so that TERM signals
'by mistake' are handled better. It then would re-init the complete
system, stopping all services plus supervisors on TERM, and starting
them up again after being re-spawned through inittab.
I'm not yet sure about side-effects of this change though.
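The two policies can be compared with a toy model in Python. This is only a sketch of the behaviour described above, not runsvdir's implementation; `on_signal`, `stop_supervisors`, and the `swapped` flag are hypothetical names, and the sleeping children merely stand in for runsv processes.

```python
import signal
import subprocess
import sys

def stop_supervisors(children):
    """Send TERM to every supervised child and collect the exit statuses."""
    for child in children:
        child.send_signal(signal.SIGTERM)
    return [child.wait() for child in children]

def on_signal(signum, children, swapped=False):
    """Return the children's exit statuses, or None if they were left alone."""
    # Current behaviour: HUP stops the children, TERM just exits.
    # With swapped=True the roles of the two signals are exchanged.
    stop_all = signal.SIGTERM if swapped else signal.SIGHUP
    if signum == stop_all:
        return stop_supervisors(children)
    return None

SLEEPER = [sys.executable, "-c", "import time; time.sleep(60)"]
children = [subprocess.Popen(SLEEPER) for _ in range(2)]

left_alone = on_signal(signal.SIGTERM, children)             # current policy
still_running = all(c.poll() is None for c in children)
stopped = on_signal(signal.SIGTERM, children, swapped=True)  # proposed policy
```

Under the current policy a stray TERM leaves the children running as orphans; under the swapped policy the same TERM takes them down, so a respawned supervisor starts from a clean state.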
Regards, Gerrit.
* Re: runit under sysvinit - coping when runsvdir dies
From: Charles Duffy @ 2005-11-17 11:13 UTC
Gerrit Pape wrote:
> If runsvdir receives the HUP signal, it sends a term signal to all runsv
> processes it manages. On the TERM signal, it simply exits, leaving the
> runsv processes alone. Maybe I should switch that, so that TERM signals
> 'by mistake' are handled better. It then would re-init the complete
> system, stopping all services plus supervisors on TERM, and starting
> them up again after being re-spawned through inittab.
In cases where runsvdir is going to be automatically respawned on exit,
this certainly seems to make sense, though you're right inasmuch as
there's potential for side-effects on the system shutdown procedure...
how about having a file's (non)existence be used to control this
behaviour, akin to where runit-init checks for the existence of stopit
before shutting down?
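A minimal Python sketch of such a file-based switch, assuming a hypothetical flag file named `stop-on-term` in a control directory; neither the name nor the location is an existing runit convention.

```python
import os
import tempfile

def term_should_stop_services(control_dir, flag_name="stop-on-term"):
    """True if the (hypothetical) flag file exists in control_dir."""
    return os.path.exists(os.path.join(control_dir, flag_name))

control_dir = tempfile.mkdtemp()  # stands in for the real control directory
before = term_should_stop_services(control_dir)   # flag absent: TERM only exits
open(os.path.join(control_dir, "stop-on-term"), "w").close()
after = term_should_stop_services(control_dir)    # flag present: stop everything
```

The administrator would create the file on systems where the supervisor is respawned from inittab and leave it absent elsewhere, so the default behaviour stays unchanged.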