How best to ensure s6-managed services are shut down cleanly?

supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
* How best to ensure s6-managed services are shut down cleanly?
From: Brett Neumeier @ 2019-02-01 18:25 UTC
  To: supervision

I use s6 to supervise userspace services like RabbitMQ and PostgreSQL. The
s6-svscan process is launched and managed by systemd (because it's a CentOS
7 system).

What I would like to do is ensure that PostgreSQL is shut down cleanly when
the system is being powered down or rebooted. Because of the way that
PostgreSQL handles signals, the best way to do that is to send it a SIGINT
and then wait for the main server process to terminate.

I _think_ that with my naive current setup, what actually happens is:

- systemd sends a SIGTERM to s6-svscan;
- s6-svscan sends a SIGTERM or SIGHUP to all s6-supervise processes,
depending on what they are supervising, and then runs the finish program;
- the s6-supervise for postgresql sends a SIGTERM and a SIGCONT to the main
database process. It then waits for the postgresql process to terminate,
runs its finish program if there is one, and exits;
- because postgresql responds to SIGTERM by disallowing new connections but
permitting existing ones to keep running, it continues doing that until
being killed.

Reviewing the current docs for s6, I see that I can improve this situation
a bit by using a "down-signal" file to tell s6-supervise to send a SIGINT
instead of a SIGTERM. That's cool! But what I would really _like_ to do is
wait for up to a minute to allow the database to shut down cleanly before
the system shutdown proceeds -- something more like...

s6-svc -Oic -wd -T60000 /path/to/svcdir || s6-svc -Oq -wd /path/to/svcdir

Is there an elegant way to get that to happen?

It seems like maybe I could do that by running s6-svscan with the -s
option, and writing a .s6-svscan/SIGTERM handler, or by putting the
commands I want to run in the s6-svscan finish script, but if there's a
better option I am really curious to know it!

Cheers,

Brett

-- 
Brett Neumeier (bneumeier@gmail.com)

* Re: How best to ensure s6-managed services are shut down cleanly?
From: Laurent Bercot @ 2019-02-01 19:46 UTC
  To: supervision

>I _think_ that with my naive current setup, what actually happens is:
>
>- systemd sends a SIGTERM to s6-svscan;
>- s6-svscan sends a SIGTERM or SIGHUP to all s6-supervise processes,
>depending on what they are supervising, and then runs the finish program;
>- the s6-supervise for postgresql sends a SIGTERM and a SIGCONT to the main
>database process. It then waits for the postgresql process to terminate,
>runs its finish program if there is one, and exits;
>- because postgresql responds to SIGTERM by disallowing new connections but
>permitting existing ones to keep running, it continues doing that until
>being killed.

That sounds accurate.

>Reviewing the current docs for s6, I see that I can improve this situation
>a bit by using a "down-signal" file to tell s6-supervise to send a SIGINT
>instead of a SIGTERM.

Yes, being able to customize the signal that kills the service was a
highly requested feature. I postponed it for a long time because I
couldn't find a model that didn't jeopardize the supervisor's stability.
(The original implementation of this feature is runit's control/ scripts,
but a bad control script can hang runsv.) down-signal is not as flexible,
but it's safe.
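
For reference, a minimal sketch of creating such a file. The service
directory below is a placeholder (overridable through a hypothetical SVCDIR
variable), not a path from this thread. The file holds a single signal name
or number; whether a given s6 version accepts the name with the SIG prefix
is worth checking against its servicedir documentation, and the number 2
(SIGINT on Linux) is an alternative.

```sh
#!/bin/sh
# Placeholder service directory; point SVCDIR at the real one.
svcdir="${SVCDIR:-/tmp/example-scandir/postgresql}"
mkdir -p "$svcdir"

# down-signal holds one signal name or number followed by a newline.
# s6-supervise sends that signal instead of SIGTERM when it brings
# the service down.
printf '%s\n' SIGINT > "$svcdir/down-signal"
```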

>  That's cool! But what I would really _like_ to do is
>wait for up to a minute to allow the database to shut down cleanly before
>the system shutdown proceeds

The question is, how does systemd decide to proceed with the rest of
the shutdown? If it's just waiting for s6-svscan to die, then it's
easy: don't allow s6-svscan to die before all your services are
properly shut down. That can be done by a single s6-svwait invocation
in .s6-svscan/finish:

#!/bin/sh
exec s6-svwait -D -t 60000 /scandir/*

and s6-svscan's death won't be reported to systemd before all your
services are really down, or one minute, whichever happens sooner.
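
If escalation is also wanted, the s6-svc invocation from the original
question can be wrapped in a small helper. This is only a sketch: the
function name stop_cleanly and the S6_SVC override variable are made up
here for illustration, and the timeout defaults to the one minute asked
for above.

```sh
#!/bin/sh
# stop_cleanly SVCDIR [TIMEOUT_MS]
#   -O   do not restart the service once it dies
#   -i   send SIGINT, -c send SIGCONT, -q send SIGQUIT
#   -wd  wait until the service is reported down
#   -T   give up waiting after TIMEOUT_MS milliseconds
# S6_SVC is a hypothetical hook so the logic can be dry-run with a stub.
stop_cleanly() {
    svcdir=$1
    timeout_ms=${2:-60000}
    # First attempt: SIGINT, wait up to the timeout.
    # If that fails or times out, escalate to SIGQUIT and wait again.
    "${S6_SVC:-s6-svc}" -Oic -wd -T"$timeout_ms" "$svcdir" ||
        "${S6_SVC:-s6-svc}" -Oq -wd "$svcdir"
}
```

It would be called as `stop_cleanly /path/to/svcdir` from whatever triggers
the shutdown, before s6-svscan itself is told to exit.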

--
Laurent

* Re: How best to ensure s6-managed services are shut down cleanly?
From: Jonathan de Boyne Pollard @ 2019-02-02  2:19 UTC
  To: Supervision

Laurent Bercot:

> The question is, how does systemd decide to proceed with the rest of 
> the shutdown?
>
It waits for |s6-svscan| for up to 90s, putting the infamous cylon 
warrior and "A stop job is running for s6" message on the console.  
After 90s, it starts forcibly killing stuff, not necessarily in the 
right order because it does not know that PostgreSQL should be killed 
before |s6-svscan| and that main services are best taken down before log 
services.

No, it will not wait forever for |s6-svscan| to exit.  That is not a way 
to block it.

I arrange things differently for running |service-manager| under systemd 
<http://jdebp.eu./Softwares/nosh/guide/svscan-startup.html#systemd>:

    % grep ExecStop /usr/local/lib/systemd/system/system-control-normal.service
    ExecStop=/bin/system-control start --verbose shutdown
    %

|system-control| 
<http://jdebp.eu./Softwares/nosh/guide/commands/system-control.xml> has 
all of the logic that knows to try harder if a |TERM| signal does not 
stop a service within 60s, and the |start| of |shutdown| stops running 
normal services because the |shutdown| service has |conflicts/| 
relationships with them.  None of this logic is in the service manager 
itself, which does not need to know about timeouts and alternative 
signals, it comprising mechanism not policy.

systemd will still try sending |TERM| signals to the service manager and 
force-killing stuff out of order, but because of an |After=| ordering 
only /after/ the |ExecStop| of |system-control-normal.service| has had 
its chance to shut things down in an orderly fashion.  systemd does not 
even begin taking down the service manager until after |system-control| 
has attempted to shut down all managed services.
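
The same ordering idea can be sketched for an s6-svscan unit with a systemd
drop-in. Everything named here is an assumption rather than something from
this thread: the unit name s6.service, the staging directory (overridable
through a hypothetical DROPIN_DIR variable), and the service directory
path. It also assumes s6-svc is on the PATH that systemd gives the unit.

```sh
#!/bin/sh
# Stage a drop-in for a hypothetical s6.service unit. Install the result
# under /etc/systemd/system/s6.service.d/ and run "systemctl daemon-reload".
dropin_dir="${DROPIN_DIR:-./s6.service.d}"
mkdir -p "$dropin_dir"

cat > "$dropin_dir/clean-stop.conf" <<'EOF'
[Service]
# ExecStop runs first on stop; systemd signals the main process afterwards.
ExecStop=/bin/sh -c 's6-svc -Oic -wd -T60000 /path/to/svcdir || s6-svc -Oq -wd /path/to/svcdir'
# Must exceed the 60 s wait above, or systemd cuts the ExecStop short.
TimeoutStopSec=120
EOF
```

With this in place systemd still falls back to killing things once
TimeoutStopSec expires, but only after the database has had its chance to
stop.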

* Re: How best to ensure s6-managed services are shut down cleanly?
From: Brett Neumeier @ 2019-02-02 14:26 UTC
  To: Laurent Bercot; +Cc: supervision

On Fri, Feb 1, 2019 at 1:46 PM Laurent Bercot <ska-supervision@skarnet.org>
wrote:

> The question is, how does systemd decide to proceed with the rest of
> the shutdown? If it's just waiting for s6-svscan to die, then it's
> easy: don't allow s6-svscan to die before all your services are
> properly shut down. That can be done by a single s6-svwait invocation
> in .s6-svscan/finish:
>
> #!/bin/sh
> exec s6-svwait -D -t 60000 /scandir/*
>
> and s6-svscan's death won't be reported to systemd before all your
> services are really down, or one minute, whichever happens sooner.
>

Perfect! I figured there would be something. Thanks as always for your help.

-- 
Brett Neumeier (bneumeier@gmail.com)
