From mboxrd@z Thu Jan 1 00:00:00 1970
From: Guillermo
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: further claims
Date: Wed, 1 May 2019 20:09:58 -0300
References: <15044531556573627@iva6-ff1651a9aa83.qloud-c.yandex.net>
To: supervision

On Tue, 30 Apr 2019 at 5:55, Laurent Bercot wrote:
>
> >haven't you claimed process #1 should supervise long running
> >child processes ? runit fulfils exactly this requirement by
> >supervising the supervisor.
>
> Not exactly, no.
> If something kills runsvdir, then runit immediately enters
> stage 3, and reboots the system. This is an acceptable response
> to the scanner dying, but is not the same thing as supervising
> it. If runsvdir's death is accidental, the system goes through
> an unnecessary reboot.

If the /etc/runit/2 process exits with code 111 or gets killed by a
signal, the runit program is actually supposed to respawn it,
according to its man page. I believe this counts as supervising at
least one process, so it would put runit in the "correct init" camp :)

There is code that checks the 'wstat' value returned by a
wait_nohang(&wstat) call that reaps the /etc/runit/2 process; however,
it is executed only if wait_exitcode(wstat) != 0.
On my computer, wait_exitcode() returns 0 if its argument is the wstat
of a process killed by a signal, so runit indeed spawns /etc/runit/3
instead of respawning /etc/runit/2 when, for example, I point a gun at
runsvdir on purpose and use a kill -int command specifying its PID.

Changing the condition to

    wait_crashed(wstat) || (wait_exitcode(wstat) != 0)

makes things work as intended.

G.