supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: "Laurent Bercot" <ska-supervision@skarnet.org>
To: supervision <supervision@list.skarnet.org>
Subject: Re: further claims
Date: Tue, 30 Apr 2019 08:56:41 +0000
Message-ID: <em33d79abe-979e-471b-b7a7-63b8d895a8fa@elzian>
In-Reply-To: <15044531556573627@iva6-ff1651a9aa83.qloud-c.yandex.net>


>haven't you claimed process #1 should supervise long running
>child processes ? runit fulfils exactly this requirement by
>supervising the supervisor.

Not exactly, no.
If something kills runsvdir, then runit immediately enters
stage 3, and reboots the system. This is an acceptable response
to the scanner dying, but is not the same thing as supervising
it. If runsvdir's death is accidental, the system goes through
an unnecessary reboot.
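
To illustrate the distinction, here is a toy sketch in Python, not
runit's or s6's actual code. The function names, the DYING_CHILD
command and the max_spawns bound are all made up for the example.
The first function reacts to the child's death by moving on to
shutdown; the second one supervises, i.e. it respawns the child.

```python
import subprocess
import sys

# Hypothetical child that dies right away, standing in for a killed scanner.
DYING_CHILD = [sys.executable, "-c", "raise SystemExit(1)"]


def react_by_shutting_down(cmd):
    """Run the child once; when it exits, move on to shutdown."""
    subprocess.run(cmd)
    # Nothing is respawned: the child's death ends normal operation.
    return "stage 3"


def supervise(cmd, max_spawns):
    """Respawn the child every time it exits.

    A real supervisor would loop forever; max_spawns only bounds the demo.
    """
    spawns = 0
    while spawns < max_spawns:
        subprocess.run(cmd)
        spawns += 1
    return spawns
```

With the first policy an accidental death costs a reboot; with the
second it costs one respawn.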


>this lengthens the supervision chain but also has the additional
>advantage of a supervised supervisor. ;-)

No.


>maybe runsvdir was not made to run as process #1 and this was
>just a hack its author came up with to replace (SysV) init totally.

Gerrit may correct me here, but I think that was the idea, yes.
runit predates s6 and its goal was to provide a daemontools-like
supervision suite that could also be used as an init system. No
more, no less; and I think it succeeded.


>sure, if (s6-)svscan dies one is in deep shit as well, so what is the point
>here ?

If s6-svscan dies, the pipes are still maintained in the
s6-supervise processes. You would need to kill the supervisor *and*
the scanner for the pipe to disappear, whereas with runit, the pipe
disappears and you can lose logs as soon as you kill the supervisor.
And of course, if s6-svscan runs as process 1, you cannot kill it.


>  runsv gets restarted by runsvdir but the pipe is gone (are pipes
>really closed when the opening (parent) process exits without closing
>them itself and subprocesses still use that very pipe ?)

  The problematic case is when the consumer (i.e. the logger) dies
while the producer (i.e. the service) is still outputting logs.
When that happens, you need a process to hold the reading end
of the logging pipe. If you don't have such a process, the pipe
is closed when the consumer dies, and any data that is still
in transit is lost.
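
A minimal sketch of that mechanism in Python, under the assumption
that a single process can stand in for all the roles: the "logger" is
one copy of the reading end, and the hypothetical "holder" is a second
copy made with os.dup(). The function name and return strings are
invented for the example. Python ignores SIGPIPE by default, so a
write to a pipe with no reader shows up as BrokenPipeError.

```python
import os


def write_after_logger_exit(holder_keeps_read_end):
    """Simulate a logger dying while the service still writes logs."""
    read_fd, write_fd = os.pipe()

    # Hypothetical holder: keeps its own copy of the reading end.
    holder_fd = os.dup(read_fd) if holder_keeps_read_end else None

    # The logger dies: its copy of the reading end is closed.
    os.close(read_fd)

    try:
        # The service is still producing output.
        os.write(write_fd, b"log line\n")
    except BrokenPipeError:
        # No reading end left anywhere: the pipe is gone.
        return "lost"
    finally:
        os.close(write_fd)

    # A restarted logger would inherit the holder's descriptor and
    # find the data still buffered in the pipe.
    data = os.read(holder_fd, 1024)
    os.close(holder_fd)
    return "kept: " + data.decode().strip()
```

Calling it with True keeps the line available for the next reader;
calling it with False loses it.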

  When the logging pipe is held by runsv, if runsv dies, then
this situation is possible. Of course nothing goes wrong as
long as the logger stays alive, but when the logger dies, the
service needs to die first, in order for the logging pipe to be
properly recreated without any log loss.

  When the logging pipe is held by s6-svscan and you have one
supervisor per process, then any of the supervisors or the
supervised processes may die at any time, but the logging pipe
is never broken. You'd have to go back and kill s6-svscan in
order to have a chance at ever losing logs.
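
The same idea with real processes, as a hedged sketch rather than a
description of how s6 is implemented: a parent plays the scanner and
keeps both ends of the pipe open, a first logger is killed, a service
writes while no logger is alive, and a replacement logger then reads
what was written. The function name and the second "result" pipe are
only there so the demo can report what the replacement logger saw.

```python
import os
import signal


def scanner_demo():
    """Parent acts as the scanner and never closes the logging pipe."""
    read_fd, write_fd = os.pipe()

    # First logger: inherits the pipe, then gets killed before reading.
    logger = os.fork()
    if logger == 0:
        signal.pause()
        os._exit(0)
    os.kill(logger, signal.SIGKILL)
    os.waitpid(logger, 0)

    # Service: writes while no logger is alive. The write succeeds
    # because the parent still holds the reading end.
    service = os.fork()
    if service == 0:
        try:
            os.write(write_fd, b"written while logger was down\n")
            os._exit(0)
        except BaseException:
            os._exit(1)
    _, status = os.waitpid(service, 0)
    service_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0

    # Replacement logger: inherits the same pipe and drains it.
    result_r, result_w = os.pipe()
    new_logger = os.fork()
    if new_logger == 0:
        try:
            os.write(result_w, os.read(read_fd, 4096))
        finally:
            os._exit(0)
    os.waitpid(new_logger, 0)
    os.close(result_w)
    seen = os.read(result_r, 4096)

    for fd in (read_fd, write_fd, result_r):
        os.close(fd)
    return service_ok, seen
```

The service never notices that the logger was gone, and the
replacement logger receives the line that was written in between.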


>[perpd]
>but from a design perspective it seems as reliable as s6-svscan ?
>or not since it uses a more integrated design/approach ?

I trust Wayne to have written perpd correctly. However, from a
pure design perspective, perpd is unarguably more complex, since
it has to perform the job of the scanner + N supervisors in one
process, so it's naturally more difficult to make sure there are
no bugs in it.
The state machine in s6-supervise is complex enough. I wouldn't
want to maintain N similar constructs in one unique process. It's
doable, of course, but requires more effort to write, debug, and
maintain.


>this design simplifies communication since tasks are not
>implemented in other tools running as its (direct) subprocesses.

  Yes, that is the classic trade-off of multiprocess designs.
It's mostly a question of taste. I tend to favor multiprocess designs
because the cost of having more, and more complex, communication
is usually largely outweighed by the benefits of having significantly
less code and simpler code paths.

--
  Laurent


