From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.comp.sysutils.supervision.general/2483
Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail
From: "Laurent Bercot" <ska-supervision@skarnet.org>
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: How best to ensure s6-managed services are shut down cleanly?
Date: Fri, 01 Feb 2019 19:46:00 +0000
Message-ID: <em5557b522-9ea3-4170-b228-1988636ba1f5@elzian>
References: <CAGSetNt7=3MkYCka-u48rfQ-Vbb8Xr++yNG7fDHMDXiT3hv7vQ@mail.gmail.com>
Reply-To: "Laurent Bercot" <ska-supervision@skarnet.org>
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226";
	logging-data="35671"; mail-complaints-to="usenet@blaine.gmane.org"
User-Agent: eM_Client/7.2.33939.0
To: "supervision@list.skarnet.org" <supervision@list.skarnet.org>
Original-X-From: supervision-return-2073-gcsg-supervision=m.gmane.org@list.skarnet.org Fri Feb 01 20:46:02 2019
Return-path: <supervision-return-2073-gcsg-supervision=m.gmane.org@list.skarnet.org>
Envelope-to: gcsg-supervision@m.gmane.org
Original-Received: from alyss.skarnet.org ([95.142.172.232])
	by blaine.gmane.org with smtp (Exim 4.89)
	(envelope-from <supervision-return-2073-gcsg-supervision=m.gmane.org@list.skarnet.org>)
	id 1gpelF-00090Z-Pj
	for gcsg-supervision@m.gmane.org; Fri, 01 Feb 2019 20:46:01 +0100
Original-Received: (qmail 13545 invoked by uid 89); 1 Feb 2019 19:46:27 -0000
Mailing-List: contact supervision-help@list.skarnet.org; run by ezmlm
Original-Sender: <supervision@list.skarnet.org>
Precedence: bulk
List-Post: <mailto:supervision@list.skarnet.org>
List-Help: <mailto:supervision-help@list.skarnet.org>
List-Unsubscribe: <mailto:supervision-unsubscribe@list.skarnet.org>
List-Subscribe: <mailto:supervision-subscribe@list.skarnet.org>
Original-Received: (qmail 13538 invoked from network); 1 Feb 2019 19:46:26 -0000
In-Reply-To: <CAGSetNt7=3MkYCka-u48rfQ-Vbb8Xr++yNG7fDHMDXiT3hv7vQ@mail.gmail.com>
X-VR-SPAMSTATE: OK
X-VR-SPAMSCORE: 0
X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedtledrjeekgddufedtucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecupfgfoffgtffkveetuefngfdpqfgfvfenuceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkjghfrhgfgggtgfesthhqredttderjeenucfhrhhomhepfdfnrghurhgvnhhtuceuvghrtghothdfuceoshhkrgdqshhuphgvrhhvihhsihhonhesshhkrghrnhgvthdrohhrgheqnecurfgrrhgrmhepmhhouggvpehsmhhtphhouhhtnecuvehluhhsthgvrhfuihiivgeptd
Xref: news.gmane.org gmane.comp.sysutils.supervision.general:2483
Archived-At: <http://permalink.gmane.org/gmane.comp.sysutils.supervision.general/2483>


>I _think_ that with my naive current setup, what actually happens is:
>
>- systemd sends a SIGTERM to s6-svscan;
>- s6-svscan sends a SIGTERM or SIGHUP to all s6-supervise processes,
>depending on what they are supervising, and then runs the finish program;
>- the s6-supervise for postgresql sends a SIGTERM and a SIGCONT to the main
>database process. It then waits for the postgresql process to terminate,
>runs its finish program if there is one, and exits;
>- because postgresql responds to SIGTERM by disallowing new connections but
>permitting existing ones to keep running, it continues doing that until
>being killed.

That sounds accurate.


>Reviewing the current docs for s6, I see that I can improve this situation
>a bit by using a "down-signal" file to tell s6-supervise to send a SIGINT
>instead of a SIGTERM.

Yes, being able to customize the signal that kills the service was a
highly requested feature. I postponed it for a long time because I
couldn't find a model that didn't jeopardize the supervisor's stability.
(The original implementation of this feature is runit's control/ scripts,
but a bad control script can hang runsv.) down-signal is not as flexible,
but it's safe.
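
For illustration, here is a minimal sketch of creating such a file. The
service directory path is an assumption made up for the example; use
whatever directory your postgresql service actually lives in. The file
just holds the signal name followed by a newline.

```shell
#!/bin/sh
# Hypothetical service directory, chosen for this example only;
# pass the real one as the first argument.
svcdir="${1:-/tmp/s6-example/postgresql}"
mkdir -p "$svcdir"

# down-signal contains only the name (or number) of the signal to send
# when the service is brought down, followed by a newline.
printf 'SIGINT\n' > "$svcdir/down-signal"

# Show what was written.
cat "$svcdir/down-signal"
```

s6-supervise reads this from the service directory, so it probably needs
a service restart to be picked up; I'd check that before relying on it.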


>  That's cool! But what I would really _like_ to do is
>wait for up to a minute to allow the database to shut down cleanly before
>the system shutdown proceeds

The question is, how does systemd decide to proceed with the rest of
the shutdown? If it's just waiting for s6-svscan to die, then it's
easy: don't allow s6-svscan to die before all your services are
properly shut down. That can be done by a single s6-svwait invocation
in .s6-svscan/finish:

#!/bin/sh
exec s6-svwait -D -t 60000 /scandir/*

and s6-svscan's death won't be reported to systemd before all your
services are really down, or one minute, whichever happens sooner.
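
If you also want a note in the logs when the wait fails, something along
these lines might do. It is only a sketch: the SCANDIR and SVWAIT
variables and the wait_all_down function are names invented for the
example, not s6 conventions, and it does not exec into s6-svwait, so the
shell stays around until the wait ends.

```shell
#!/bin/sh
# Sketch of a .s6-svscan/finish script.
# SCANDIR and SVWAIT are hypothetical knobs added for illustration.
scandir="${SCANDIR:-/scandir}"
svwait="${SVWAIT:-s6-svwait}"
timeout_ms=60000   # one minute, in milliseconds

wait_all_down() {
  # -D: wait until every listed service is fully down
  # -t: give up after timeout_ms milliseconds
  if "$svwait" -D -t "$timeout_ms" "$scandir"/*; then
    echo "all services down"
  else
    echo "s6-svwait failed or timed out after ${timeout_ms} ms" >&2
    return 1
  fi
}

# Only run when the wait command is available, so the sketch is
# harmless on a machine without s6 installed.
if command -v "$svwait" >/dev/null 2>&1; then
  wait_all_down
fi
```

Either way s6-svscan's finish script returns after at most a minute, so
the rest of the shutdown is not held up indefinitely.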

--
Laurent