From: Laurent Bercot
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: pidsig 0.11 - a fghack like de-daemonisation tool
Date: Fri, 4 Jun 2010 20:43:28 +0200
Message-ID: <20100604184328.GA21893@skarnet.org>
To: supervision@list.skarnet.org

> There is a weakness in this "strong supervision" model. Any service with a
> 'down' file will not be restarted if its supervise/runsv or
> svscan/runsvdir is replaced.

If a branch of the supervision tree dies, the old subtree, including leaves (i.e. services) is still alive.
Manual admin intervention is necessary to kill it off and recreate a new subtree, connected to init. If there are any services with down files, but that need to be alive, the admin can take care of them at that time.

Now, if a service has a down file, and its supervisor dies, *and then* the service dies too, then the service won't be restarted indeed; but we're talking about a double failure, which should be uncommon.

Nevertheless, down files are a decrease in reliability. They're practical for test and manual intervention purposes, but I've never met a real-life case where they are necessary. It's always possible to boot the machine with a nearly-empty svscan directory and populate it during the later initialization phases.

-- 
Laurent
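
The "nearly-empty svscan directory" approach could be sketched roughly as follows. This is only an illustration, not anything from the daemontools or runit sources: it assumes services are enabled by symlinking their directories into the scan directory, and the function names and the "sshd" service name are hypothetical.

```python
import os
import tempfile


def is_marked_down(service_dir):
    """Return True if the service directory contains a 'down' file."""
    return os.path.exists(os.path.join(service_dir, "down"))


def enable_service(service_dir, scan_dir):
    """Make a service visible to the scanner by symlinking it into scan_dir.

    Assumption: the scanner picks up any subdirectory (or symlink to one)
    that appears in scan_dir, so no 'down' file is needed to delay a start.
    """
    name = os.path.basename(os.path.normpath(service_dir))
    link = os.path.join(scan_dir, name)
    if not os.path.islink(link):
        os.symlink(os.path.abspath(service_dir), link)
    return link


def populate(scan_dir, service_dirs):
    """Enable services one by one, e.g. from a later initialization phase."""
    return [enable_service(d, scan_dir) for d in service_dirs]


if __name__ == "__main__":
    # Throwaway directories stand in for the real scan and service dirs.
    root = tempfile.mkdtemp()
    scan_dir = os.path.join(root, "service")
    os.mkdir(scan_dir)
    sshd = os.path.join(root, "sshd")  # hypothetical service directory
    os.mkdir(sshd)

    print("at boot:", os.listdir(scan_dir))
    populate(scan_dir, [sshd])
    print("after late init:", os.listdir(scan_dir))
    print("has down file:", is_marked_down(sshd))
```

The point of the sketch is that a service which is absent from the scan directory until it is wanted never needs a down file, so it is restarted like any other service once its supervisor is recreated.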