supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: Jean-Michel Bruenn <jean.bruenn@ip-minds.de>
To: supervision@list.skarnet.org
Subject: Re: hello - hanging services
Date: Wed, 18 Aug 2010 17:06:35 +0200
Message-ID: <20100818170635.a5a24d3f.jean.bruenn@ip-minds.de>
In-Reply-To: <20100818105735.GA13364@skarnet.org>

> >> 
> >> Difficult to implement?
> > 
> > Yes.
> 
>  More precisely, it's not so much "difficult to implement" (I've done
> it for a paying customer's project) as "impossible to do without specific
> support in the service you're trying to manage".
>  In other words, what Jean-Michel wants is a software watchdog; it can
> be done, but it's pretty intrusive. It requires having a library, a daemon,
> and making library calls in the managed process' source, sending messages
> to the daemon by doing so. The daemon is configured with a certain policy
> that decides "the service is running fine" or "the service has hung"
> depending on the frequency of the messages it receives.
> 
>  It's doable, and a watchdog library/daemon may even have its place in
> a supervision suite (I'll think about it), but it certainly has nothing
> to do with purely external process management tools such as runsvdir/runsv
> or svscan/supervise. It's a whole piece of software on its own.
> 
>  I'm certain that a lot of open source software watchdogs already exist
> out there. I'm also certain that none of them is as lightweight and easy
> to use as I'd like, but that's another story.

In fact I was thinking about something simpler. I guess you guys know
Nagios? Similar to Nagios: just run a command and check for output or a
timeout. For example, for apache you write a script called "hangcheck"
which gets run every X seconds by runit. This script contains something
like:

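#!/bin/sh
# No exec here: exec would replace the shell with curl, so the
# "|| sv restart" branch would never run. -m 10 gives up after 10 seconds.
service=apache2
curl -fsS -m 10 -o /dev/null http://127.0.0.1/ || sv restart "$service"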

e.g:
wdp@localhost:~$ curl 127.0.0.1 || echo "didnt work"
curl: (7) couldn't connect to host
didnt work

So the idea is that you can define which command tests a service (you
don't need to use this at all) and runit just periodically runs the
hangcheck script. The hangcheck script itself just runs a command and
decides by exit code whether to do something or not (so this can be used
to mail someone about a hanging/unresponsive service, or to restart the
service).
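
To make the "decide by exit code" part concrete, here is a rough sketch
of a generic hangcheck. The names run_check, hangcheck and on_failure
are made up for illustration and are not part of runit. It assumes the
probe command carries its own timeout (for example curl -m), since the
sketch does not add one.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these exist in runit.

# run_check CMD...: run the probe quietly and return its exit code.
run_check() {
    "$@" >/dev/null 2>&1
}

# on_failure SERVICE: default policy, only report the problem.
# Swap the body for `sv restart "$1"` or a mail command as needed.
on_failure() {
    echo "hangcheck: $1 is not responding"
}

# hangcheck SERVICE CMD...: probe the service, act only if the probe fails.
hangcheck() {
    service=$1
    shift
    if run_check "$@"; then
        return 0
    fi
    on_failure "$service"
    return 1
}
```

Usage would be something like
hangcheck apache2 curl -fsS -m 10 http://127.0.0.1/
and the policy lives entirely in on_failure, so mailing versus
restarting is a one-line change.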

So there's no need for any special scripting or any special algorithm.
And if I'm right there's not much work to be done in runit, just: if
[ -f hangcheck ]; then ./hangcheck; fi (of course with the periodic
timer set, let's say 10 seconds? 1 minute?)
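
The periodic part could be sketched in plain shell like this. This is
not something runit does today; periodic_hangcheck and its ROUNDS
argument are hypothetical, and ROUNDS exists only so the loop can be
bounded when trying it out (0 means loop forever).

```shell
#!/bin/sh
# periodic_hangcheck DIR INTERVAL [ROUNDS]
# Hypothetical helper: every INTERVAL seconds, run DIR/hangcheck if it
# exists and is executable. ROUNDS=0 (the default) loops forever.
periodic_hangcheck() {
    dir=$1
    interval=$2
    rounds=${3:-0}
    n=0
    while [ "$rounds" -eq 0 ] || [ "$n" -lt "$rounds" ]; do
        if [ -x "$dir/hangcheck" ]; then
            # The exit code is ignored on purpose: hangcheck itself
            # decides whether to restart the service or mail someone.
            (cd "$dir" && ./hangcheck) || :
        fi
        n=$((n + 1))
        sleep "$interval"
    done
}
```

For example, periodic_hangcheck /etc/service/apache2 10 would run the
check every 10 seconds (the path is just an example). A service
directory without a hangcheck script is simply skipped, which matches
the "you don't need to use this at all" idea.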

Cheers



Thread overview: 14+ messages
2010-08-17 17:08 Jean-Michel Bruenn
     [not found] ` <Pine.LNX.4.64.1008171311210.4362@e-smith.charlieb.ott.istop.com>
2010-08-17 17:24   ` Jean-Michel Bruenn
2010-08-17 17:38     ` Charlie Brady
2010-08-18 10:57       ` Laurent Bercot
2010-08-18 15:06         ` Jean-Michel Bruenn [this message]
2010-08-18 15:23           ` Charlie Brady
2010-08-18 16:02             ` Jean-Michel Bruenn
2010-08-19  5:46               ` Laurent Bercot
2010-08-20 12:24                 ` Nicolás de la Torre
2010-08-20 14:42                   ` Tobia Conforto
2010-08-20 14:59                     ` Charlie Brady
     [not found]                     ` <BB40BB3F77C4402181674BE975669A4F@HEL.local>
2010-08-20 15:11                       ` Rehan
2010-08-20 15:13                         ` Charlie Brady
2010-08-20 15:40                     ` Laurent Bercot
