supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: Lee Hambley <lee.hambley@gmail.com>
To: supervision@list.skarnet.org
Subject: Race Condition?
Date: Thu, 14 Mar 2013 08:03:30 +0100
Message-ID: <CAN_+VLUtc+JBAATN55b2o6G8tmH4Y8OKLCf-yA6maBt+c1=gmA@mail.gmail.com>
In-Reply-To: <CAN_+VLUNEEUdJ94r_HUMbuzstK_G4rcJc8d=a8KEN9ph7MzO+A@mail.gmail.com>

I'm using runit together with Monit. We are still using the init.d scripts
that shipped with Ubuntu 12.04 LTS, but use runit for all application-level
processes. Monit also watches over a couple of the system-level scripts.

We're seeing the following, and I believe Monit is to blame:

root@runitvm:~# ps aux | grep runsvdir | grep -v grep
> root      1079  0.0  0.0    188    32 ?        Ss   15:52   0:00 runsvdir
> -P /etc/service log:
> .........................................................................................................................................................................................................................runsv
> apache2: fatal: unable to setup filedescriptor for ./run: file descriptor
> not open?runsv apache2: fatal: unable to setup filedescriptor for ./run:
> file descriptor not open?


*runsv apache2: fatal: unable to setup filedescriptor for ./run: file
descriptor not open?*

I haven't been able to debug this, and the box only recovers after a
reboot. A possible explanation, describing a kind of race condition, is
given here:
http://blog.gmane.org/gmane.comp.sysutils.supervision.general/month=20100801

An extract from the mailing list thread I linked:

> ...is that at some point, your runsv ran through that code, but
> somehow managed to live and the services didn't die, i.e. another control
> message was sent and processed before the exit condition was reached, and
> runsv is still trying to supervise things - but runs into trouble with the
> closed logpipe...
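
The extract attributes the failure to a closed log pipe inside runsv. One
way to look for that is to list the file descriptors the runsv process
currently holds and compare them with a healthy runsv on the same box. A
minimal sketch, assuming a Linux /proc filesystem and root access;
list_fds and count_pipes are hypothetical helper names, not runit tools:

```shell
#!/bin/sh
# Linux-only sketch: inspect the file descriptors a process holds via /proc.
# list_fds and count_pipes are illustrative helpers, not part of runit.

# Print one line per open descriptor of the given PID, as "fd -> target".
list_fds() {
    for fd in "/proc/$1/fd"/*; do
        # Skip the unexpanded glob if the directory is empty or unreadable.
        [ -e "$fd" ] || [ -L "$fd" ] || continue
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# Count the descriptors that point at pipes (a log pipe would be among them).
count_pipes() {
    list_fds "$1" | grep -c -- '-> pipe:'
}
```

With the PID of the "runsv apache2" process taken from ps, running
list_fds and count_pipes against it and against a runsv that still works
would show whether any descriptors have gone missing.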


My supposition is that Monit is signalling the runsv process too often and
leaving it in a broken state, though I haven't been able to verify this.
I wanted to run this by the mailing list before I pour too much time into
debugging something that may already be a known problem with an obvious
(to those wiser than I) workaround.
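
If overlapping control commands really are the trigger, one possible
stopgap is to serialize and rate-limit them before they reach runsv. A
minimal sketch, assuming util-linux flock is installed; throttled_sv,
SV_CMD, LOCK_DIR and MIN_INTERVAL are hypothetical names, not part of
runit or Monit, and whether this actually avoids the race is unverified:

```shell
#!/bin/sh
# Hypothetical wrapper: serialize and rate-limit control commands sent to
# a runit service. All names below are illustrative assumptions.
SV_CMD=${SV_CMD:-sv}              # command used to control the service
LOCK_DIR=${LOCK_DIR:-/tmp}        # where lock and timestamp files live
MIN_INTERVAL=${MIN_INTERVAL:-30}  # minimum seconds between commands

# Usage: throttled_sv ACTION SERVICE   (SERVICE is a plain name, no slashes)
throttled_sv() {
    action=$1
    service=$2
    stamp="$LOCK_DIR/sv-$service.stamp"
    (
        # Hold an exclusive lock so two callers never run at the same time.
        flock -x 9 || exit 1
        now=$(date +%s)
        last=0
        if [ -s "$stamp" ]; then
            last=$(cat "$stamp")
        fi
        # Refuse to send another command inside the minimum interval.
        if [ $((now - last)) -lt "$MIN_INTERVAL" ]; then
            echo "skipped: $action $service (within ${MIN_INTERVAL}s)" >&2
            exit 2
        fi
        echo "$now" > "$stamp"
        $SV_CMD "$action" "$service"
    ) 9>"$LOCK_DIR/sv-$service.lock"
}
```

A call such as "throttled_sv restart apache2" would then stand in for a
direct "sv restart apache2", for example from a small script referenced
by Monit's start program and stop program settings.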

I'm running on:

$ dpkg -s runit

Architecture: amd64
Version: 2.1.1-6.2ubuntu2

$ lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.1 LTS
Release: 12.04
Codename: precise


Thanks in advance for any assistance. In the meantime, I'm trying to tell
Monit to be less aggressive.

Lee Hambley
--
http://lee.hambley.name/
