From: Jeff
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Runit
Date: Fri, 03 May 2019 04:15:00 +0200
Message-ID: <11603811556849700@myt5-262fb1897c00.qloud-c.yandex.net>
References: <15044531556573627@iva6-ff1651a9aa83.qloud-c.yandex.net>
To: supervision

>>  If something kills
>>  runsvdir, then runit immediately enters
>>  stage 3, and reboots the system. This is an acceptable response
>>  to the scanner dying, but is not the same thing as supervising
>>  it. If runsvdir's death is accidental, the system goes through
>>  an unnecessary reboot.
>
> If the /etc/runit/2 process exits with code 111 or gets killed by a
> signal, the runit program is actually supposed to respawn it,
> according to its man page. I believe this counts as supervising at
> least one process, so it would put runit in the "correct init" camp :)
>
> There is code that checks the 'wstat' value returned by a
> wait_nohang(&wstat) call that reaps the /etc/runit/2 process, however,
> it is executed only if wait_exitcode(wstat) != 0. On my computer,
> wait_exitcode() returns 0 if its argument is the wstat of a process
> killed by a signal, so runit indeed spawns /etc/runit/3 instead of
> respawning /etc/runit/2 when, for example, I point a gun at runsvdir
> on purpose and use a kill -int command specifying its PID. Changing
> the condition to wait_crashed(wstat) || (wait_exitcode(wstat) != 0)
> makes things work as intended.

That is yet another of several runit problems. Among them:

- the respawn check described above
- no setsid(2) for child processes by default in "runsv"
- only runsv manages the log pipe
- runit-init requires read-write filesystem access without the slightest
  need: it sets the +x bit of the /etc/runit/(stopit,reboot) files, which
  could just as well reside on a tmpfs in /run with symlinks pointing to
  them (that is what Void Linux does)
- problems with log files while bringing down the system; I never
  encountered that with daemontools-encore, perp(d) and s6

So it is a quite dated project that clearly shows its age. I would
recommend against using it at all (except for its "chpst" and "utmpset"
utilities).