supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
* Race Condition?
From: Lee Hambley @ 2013-03-14  7:03 UTC (permalink / raw)
  To: supervision

I'm using runit in cooperation with Monit. We still use the init.d
scripts that shipped with Ubuntu 12.04 LTS, but use runit for all
application-level processes. Monit also watches over a couple of the
system-level scripts.

We're seeing the following, and I believe Monit is to blame:

root@runitvm:~# ps aux | grep runsvdir | grep -v grep
> root      1079  0.0  0.0    188    32 ?        Ss   15:52   0:00 runsvdir
> -P /etc/service log:
> .........................................................................................................................................................................................................................runsv
> apache2: fatal: unable to setup filedescriptor for ./run: file descriptor
> not open?runsv apache2: fatal: unable to setup filedescriptor for ./run:
> file descriptor not open?


*runsv apache2: fatal: unable to setup filedescriptor for ./run: file
descriptor not open?*

I haven't been able to debug this, and the box only recovers after a
reboot. There's a possible explanation here, describing a kind of race
condition:
http://blog.gmane.org/gmane.comp.sysutils.supervision.general/month=20100801

An extract from the mailing list thread I linked:

> ...is that at some point, your runsv ran through that code, but
> somehow managed to live and the services didn't die, i.e. another control
> message was sent and processed before the exit condition was reached, and
> runsv is still trying to supervise things - but runs into trouble with the
> closed logpipe...
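
One way to probe the closed-logpipe hypothesis from that extract, as a
rough sketch only: count how many pipes a given process still holds open
by reading its entries under /proc. The helper name count_pipe_fds is
hypothetical, and this assumes Linux with permission to read the target
process's /proc entry.

```shell
#!/bin/sh
# Hypothetical helper (not part of runit): count how many of a process's
# open file descriptors are pipes, by reading the symlinks in /proc/PID/fd.
# Assumes Linux and read access to the target's /proc entry.
count_pipe_fds() {
    pid=$1
    count=0
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink; a pipe reads back as "pipe:[inode]".
        # Unreadable or vanished entries are skipped.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            pipe:*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}
```

Something like count_pipe_fds "$(pgrep -f '^runsv apache2$')" could then
be compared against a runsv that is known to be healthy. The pgrep
pattern is an assumption about how the process shows up in the process
list, and it expects exactly one match.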


My supposition is that Monit is signalling the runsv process too often
and leaving it in a broken state, though I haven't been able to verify
this. I wanted to run this by the mailing list before I pour too much
time into debugging something that may already be a known problem with an
obvious (to those wiser than I) workaround.
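
If over-frequent control commands do turn out to be the trigger, one
possible stopgap is to rate-limit them before they reach runsv. The
sketch below is only an illustration under that assumption: the
throttled function and its stamp file are hypothetical, not something
runit or Monit provides, and it relies on date +%s being available.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if at least MIN_INTERVAL seconds
# have passed since this helper last ran one for the same stamp file.
# Usage: throttled STAMP_FILE MIN_INTERVAL COMMAND [ARGS...]
throttled() {
    stamp=$1
    min_interval=$2
    shift 2
    now=$(date +%s)
    last=0
    if [ -f "$stamp" ]; then
        last=$(cat "$stamp")
    fi
    last=${last:-0}   # treat an empty stamp file as "never run"
    if [ $((now - last)) -lt "$min_interval" ]; then
        # Too soon since the last command: skip it and report success.
        echo "throttled: skipping '$*'" >&2
        return 0
    fi
    echo "$now" > "$stamp"
    "$@"
}
```

A script that Monit invokes could then call, for instance,
throttled /var/run/apache2.sv-stamp 30 sv restart apache2, where the
stamp path and the 30-second interval are made-up values. The check and
the stamp update are not atomic, so two simultaneous callers could both
get through; this narrows the window rather than closing it.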

I'm running on:

$ dpkg -s runit

Architecture: amd64
Version: 2.1.1-6.2ubuntu2

$ lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.1 LTS
Release: 12.04
Codename: precise


Thanks in advance for any assistance. In the meantime I'm trying to tell
Monit to be less aggressive.

Lee Hambley
--
http://lee.hambley.name/
