From: Charlie Brady
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: pidsig 0.11 - a fghack like de-daemonisation tool
Date: Fri, 4 Jun 2010 12:54:46 -0400 (EDT)
To: Wayne Marshall
Cc: supervision@list.skarnet.org

On Fri, 4 Jun 2010, Wayne Marshall wrote:

> On Thu, 3 Jun 2010 21:25:30 +0200
> Laurent Bercot wrote:
>
> > > These kinds of problems are not that theoretical - just
> > > recently I saw svscan/svscanboot crashing on a >1y uptime
> > > box, taking many of the processes with it, including most of
> > > the supervise infrastructure, very likely not due to any
> > > fault in them - could be oom gone wild, cosmic rays hitting
> > > svscan memory, whatever).
> >
> > That's a typical case of "weak" supervision, as opposed to a
> > "strong" supervision chain. "Strong" supervision makes sure
> > that all the infrastructure is connected to init.
> >
> > * svscan achieves strong supervision *if* svscanboot is
> > flagged as "respawn" in /etc/inittab on System V-style inits,
> > in /etc/event.d/ with Upstart, or in /etc/gettys on BSD. It
> > does *not* achieve it if svscanboot is started via some
> > rc.local script (as the stock daemontools instructions tell
> > you to do, shame on DJB! :))
> > * perp is in the same boat, depending on how you start
> > perpboot.
> > ...
> > Strong supervision makes sure that your supervisor process
> > tree is *always* alive and complete, unless process 1 itself
> > crashes, in which case you're doomed to reboot anyway.

There is a weakness in this "strong supervision" model. Any service
with a 'down' file will not be restarted if its supervise/runsv or
svscan/runsvdir is replaced.
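
To see which services would be affected, here is a minimal sketch (not part of daemontools or runit) that walks a scan directory and lists every service directory containing a 'down' file. The function name and the example path are my own assumptions; the only behaviour relied on is that a freshly started supervise/runsv does not start a service whose directory has a 'down' file, and that svscan skips names beginning with a dot.

```python
import os


def services_not_restarted(scan_dir):
    """List services that a replacement supervisor would leave stopped.

    Hypothetical helper: a service counts as affected when its directory
    under scan_dir contains a file named 'down'. Whether the service had
    been brought up by hand before the crash cannot be known from the
    directory alone, so treat the result as "needs checking".
    """
    affected = []
    for entry in sorted(os.listdir(scan_dir)):
        # Skip dot-directories, which svscan does not scan.
        if entry.startswith("."):
            continue
        service_dir = os.path.join(scan_dir, entry)
        if os.path.isdir(service_dir) and os.path.exists(
            os.path.join(service_dir, "down")
        ):
            affected.append(entry)
    return affected
```

Run against something like /service (adjust for your layout) after the supervisor tree has been respawned, the returned names are the services to check and, if they were meant to be running, to bring back up by hand.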