supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: "Laurent Bercot" <ska-supervision@skarnet.org>
To: "supervision@list.skarnet.org" <supervision@list.skarnet.org>
Subject: Answers to some GitHub questions
Date: Sat, 26 Dec 2020 00:57:30 +0000
Message-ID: <eme7ca9413-36f8-4300-acab-fba842d53d26@elzian>


  Hi,

  Someone on GitHub asked a series of questions about s6. I decided to
answer them here, in order to have the post in the ML archives and a
place (outside of GitHub) to refer people to if similar questions arise
in the future.


  * Benefits of s6 over runit?

  s6 has multiple benefits over runit. I'd say that its biggest advantage
is that it is actually maintained and actively developed (albeit more
slowly than I would like). Also, it was written after runit, so it
learned from it and expanded on it, and considerable effort has been
put into it.

  From a developer's point of view, there is *a lot* that can be said in
favor of s6: technical details that are mostly invisible to the user but
that add up to significantly cleaner code. An example of this is that
s6's supervisor program, s6-supervise, is implemented as a fully
asynchronous state machine, which means it will always be responsive to
user commands no matter what happens; whereas there are situations in
which runsv, runit's supervisor program, will be unresponsive for up to
5 seconds. These details matter.
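
  To illustrate what "fully asynchronous state machine" means in
practice, here is a minimal Python sketch. It is not s6-supervise's
code: the class name, the states and the restart delay are all
assumptions made for the example. The point it shows is that a restart
delay is stored as a deadline for the event loop instead of being
slept through, so a command that arrives during the delay is handled
immediately.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    DOWN = auto()     # no child, and none wanted
    UP = auto()       # child is running
    WAITING = auto()  # child died; restart scheduled at `deadline`


@dataclass
class ToySupervisor:
    """Hypothetical event-driven supervisor; every handler returns at once."""
    restart_delay: float = 1.0
    state: State = State.DOWN
    want_up: bool = False
    deadline: Optional[float] = None

    def on_command(self, cmd: str, now: float) -> None:
        """Handle a user command; never blocks."""
        if cmd == "up":
            self.want_up = True
            if self.state is not State.UP:
                self._spawn()
        elif cmd == "down":
            self.want_up = False
            # A real supervisor would signal the child here and wait for
            # its death event; the toy version just records the new state.
            self.state = State.DOWN
            self.deadline = None
        else:
            raise ValueError(f"unknown command: {cmd}")

    def on_child_exit(self, now: float) -> None:
        """Child died: schedule a restart instead of sleeping."""
        if self.want_up:
            self.state = State.WAITING
            self.deadline = now + self.restart_delay
        else:
            self.state = State.DOWN

    def on_tick(self, now: float) -> None:
        """Called when the event loop wakes up; restart if the delay elapsed."""
        if self.state is State.WAITING and now >= self.deadline:
            self._spawn()

    def timeout(self, now: float) -> Optional[float]:
        """How long the event loop may block waiting for the next event."""
        if self.deadline is None:
            return None
        return max(0.0, self.deadline - now)

    def _spawn(self) -> None:
        # Stand-in for fork/exec of the service's run script.
        self.state = State.UP
        self.deadline = None
```

  A supervisor that instead called a blocking sleep for the restart
delay could not react to a "down" command until the sleep returned;
here the same command takes effect as soon as it is received.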

  From a user's point of view, the most visible difference is that s6
comes with a lot of additional tools that help make it a powerful
toolbox for the admin, in supervision-related matters and beyond. For
instance, s6 keeps a tally of the latest deaths of a process, and allows
you to declare permanent failure on a configurable pattern of deaths.
Or, s6 provides you with a Unix domain super-server that allows you to
quickly write local services, with access control done by uid/gid.
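
  The death tally idea can be sketched in a few lines of Python. This
is only an illustration of the concept, not how s6 stores or evaluates
its tally: the class, the method names and the shape of the "pattern"
(at least N deaths with given exit codes within a time window) are
assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DeathTally:
    """Hypothetical bounded record of a service's most recent deaths."""
    max_entries: int = 100
    deaths: deque = field(init=False)

    def __post_init__(self) -> None:
        # Oldest entries are dropped automatically once the tally is full.
        self.deaths = deque(maxlen=self.max_entries)

    def record(self, timestamp: float, exit_code: int) -> None:
        """Remember one death: when it happened and with which exit code."""
        self.deaths.append((timestamp, exit_code))

    def should_give_up(self, now: float, window: float,
                       min_deaths: int, codes: set) -> bool:
        """True if at least `min_deaths` deaths with an exit code in
        `codes` happened during the last `window` seconds."""
        matching = [code for (when, code) in self.deaths
                    if now - when <= window and code in codes]
        return len(matching) >= min_deaths
```

  With such a tally, a supervisor could stop restarting a service that
keeps dying with the same error, for instance two deaths with exit
code 1 within five seconds, instead of restarting it forever.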

  A comparison between s6 and other supervision suites, including runit,
is available here: https://skarnet.org/software/s6/why.html
  It's an old page, and now very incomplete, but what is written is still
accurate.


  * Why use a database instead of symlinks?

  This question, I think, comes from a misunderstanding. s6 "uses
symlinks" just like every other supervision suite. s6-svscan has a scan
directory, just like runsvdir; s6-supervise has a service directory,
just like runsv. s6 and runit are very similar in that respect.
  Where the database comes into play is when you're using s6-rc, which is
a related, but distinct, piece of software:
  https://skarnet.org/software/s6-rc/

  s6-rc is not a supervision suite, but a service manager that runs *on
top of* the s6 supervision suite. Those are complementary tools. And
yes, s6-rc uses a database (but not a relational one, and no, there is
no SQL backend!), mostly because dependencies between services need to
be stored somewhere. Note that other service managers, like OpenRC or
systemd (the latter being a supervision suite and an init/shutdown
system as well), *also* use a database; they just do not tell you
upfront, and trying to pretend to the user that there's no database
actually makes them more complex and brittle than necessary.
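
  As a rough picture of what "a database of dependencies" can mean,
here is a minimal Python sketch that compiles per-service dependency
declarations into a resolved lookup table. It does not reproduce
s6-rc's actual database format or compiler; the function name, the
input layout and the service names are assumptions made for the
example.

```python
from graphlib import CycleError, TopologicalSorter


def compile_database(source: dict) -> dict:
    """Resolve direct dependencies into full transitive closures.

    `source` maps each service name to the set of services it directly
    depends on. Unknown services and dependency cycles raise ValueError,
    so mistakes surface when the database is built, not at boot time.
    """
    for name, deps in source.items():
        unknown = deps - set(source)
        if unknown:
            raise ValueError(f"{name} depends on undefined {sorted(unknown)}")
    try:
        # static_order() yields dependencies before their dependents.
        order = list(TopologicalSorter(source).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc

    compiled = {}
    for name in order:
        closure = set(source[name])
        for dep in source[name]:
            closure |= compiled[dep]  # already computed, thanks to the order
        compiled[name] = frozenset(closure)
    return compiled


# Hypothetical service names, for illustration only.
example = {"devfs": set(), "udevd": {"devfs"}, "coldplug": {"udevd"}}
database = compile_database(example)
```

  Once compiled, answering "what must be up before coldplug starts?"
is a single lookup, and an inconsistent set of declarations is rejected
before it can break a boot.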


  * Why don't some command-line options have a long form?

  Because s6 is not a GNU package. Long options need the getopt_long(3)
function in order to be parsed; this is a GNU extension to POSIX, and s6
tries to avoid those and stick to POSIX features as much as possible.
  The upcoming high-level interface to s6, s6-frontend, will accept long
options, but it's still in the distant future.
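
  For readers unfamiliar with the distinction, the short-option style
can be illustrated with Python's standard getopt module, which follows
the same conventions as the C getopt() function. The options -v and -t
below are hypothetical and are not taken from any s6 program.

```python
import getopt


def parse_args(argv):
    """Parse hypothetical short options: -v (a flag) and -t TIMEOUT."""
    # "vt:" declares -v without an argument and -t with one; no long
    # options are declared, so any "--name" argument is rejected.
    opts, rest = getopt.getopt(argv, "vt:")
    verbose = False
    timeout = 0
    for flag, value in opts:
        if flag == "-v":
            verbose = True
        elif flag == "-t":
            timeout = int(value)
    return verbose, timeout, rest
```

  Calling parse_args(["-v", "-t", "5", "svc"]) returns
(True, 5, ["svc"]), whereas parse_args(["--verbose"]) raises
getopt.GetoptError, since only short options are declared.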


  * You said that the concept of runlevels in runit didn't work in case
of udevd, which had to be run unsupervised in runlevel 1. How does s6
address this problem?

  I certainly did not say that, because runit does not use runlevels,
which are a legacy concept from sysvinit, mixing two completely
different things - halt/reboot procedures (0 and 6) and machine states
(1 to 5, and 1 is a special case requiring particularly ugly hacks).
  But I did say something similar with *stages of init*.

  With the runit model, runsvdir, which starts the supervision tree, is
run in stage 2. Before stage 2, there is no supervision tree available.
And once runsvdir runs, there is no possibility of running oneshots.
So all the oneshots have to be run in stage 1. When a oneshot depends
on a longrun (a daemon), it means that that longrun also has to be run
in stage 1, which prevents it from being supervised since the
supervision tree is not running yet.

  s6 addresses this problem in two ways. Rather, more accurately, it
*does not* address this problem, because it's not for a supervision
system to solve: it's about ordering services, which is the job of a
*service manager*, not a supervision suite. (And that is why runit,
which tries to be a supervision suite and also a /sbin/init and pid 1
but not a service manager, does not solve it.)

  Instead, s6 delegates the problem to other packages that were
specifically written for this:
  - s6-linux-init: https://skarnet.org/software/s6-linux-init/
is an early boot package for Linux, that provides sysvinit
compatibility, and that starts a supervision tree as early as
possible, before running basically *any* service. By using this
package, admins can have an s6-svscan process (the equivalent of
runsvdir) running as pid 1, for the whole lifetime of the system,
before starting their regular boot sequence. This addresses the
problem of not being able to supervise longruns before stage 2: when
using s6-linux-init, you can *always* supervise daemons, no matter
how early they have to be started.
  - s6-rc: the aforementioned service manager. This package handles
dependencies between services, so it can run a complete boot sequence
in the proper order. Longruns are always run supervised, because s6-rc
relies on the existing s6 supervision tree. s6-rc supports oneshots as
well as longruns, so it could run the whole contents of /etc/runit/1 as
oneshots, and start udevd as a longrun at the proper time, so that the
parts of /etc/runit/1 that depend on udevd are only started once udevd
is up and running. (And ready, because s6 supports readiness
notifications, and s6-rc relies on that mechanism to only start a
service when all its dependencies are ready.)
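
  The ordering rule described above, "only start a service when all
its dependencies are ready", can be sketched as follows. This is a
simplified model, not s6-rc's algorithm: the function name, the
`start` callback (which stands for "start the service and wait until
it reports readiness") and the service names are assumptions made for
the example.

```python
def bring_up(deps: dict, start) -> list:
    """Start every service once all of its dependencies are ready.

    `deps` maps a service name to the set of names it depends on.
    `start(name)` must return True only when the service is up *and*
    ready. Returns the names of ready services, in start order.
    """
    ready = set()
    pending = dict(deps)
    order = []
    while pending:
        # A service is runnable when its dependencies are all ready.
        runnable = sorted(n for n, d in pending.items() if d <= ready)
        if not runnable:
            # Something never became ready (or the graph has a cycle).
            raise RuntimeError(f"cannot start: {sorted(pending)}")
        for name in runnable:
            del pending[name]
            if start(name):
                ready.add(name)
                order.append(name)
    return order


# Hypothetical boot sequence: coldplug must wait for udevd to be ready.
boot = {
    "hostname": set(),
    "mount-dev": set(),
    "udevd": {"mount-dev"},
    "coldplug": {"udevd"},
}
sequence = bring_up(boot, start=lambda name: True)
```

  In this model it does not matter whether a service is a oneshot or a
longrun: the only thing its dependents wait for is its readiness.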

  So, yes, the full picture of how s6 - or rather, the s6 ecosystem -
does it is more complex, because the problem to address is not a
trivial one and involves more than just process supervision. But when
you have s6-linux-init + s6 + s6-rc, you have a complete init system,
that can accommodate any kind of service ordering you need, without
exceptions, hacks or adhocness.


  * Why is there no man documentation for s6 and its related binaries?

  Because you have not looked carefully enough. :) There *are* s6 man
pages now, thanks to flexibeast:
  https://github.com/flexibeast/s6-man-pages
and they are linked from the s6 main page.

  Now if the question is: why aren't the man pages bundled in the s6
package? then it's a rather complex answer that involves the very sorry
state of rich text format processing tools, and my potent dislike for
the roff/mandoc/... format and for anything related to documentation
formats in general, as well as the discussions surrounding them. My
position about the whole thing is summed up here:
https://skarnet.org/cgi-bin/archive.cgi?2:mss:2583:202008:liednclklokjhnokhdhg
and the hard bottom line is that until a miracle happens and it becomes
simple to write documentation in a clear, humanly understandable format
that compiles easily to both mandoc and clean, acceptable HTML and does
not require the user to install bloated toolchains written in a language
that pulls two kitchen sinks of dependencies, if you need s6 man pages,
the ones maintained by flexibeast are where it's at.


  * Did Gerrit Pape abandon runit? Has development stopped?

  That's not a question for me to answer. But what I know is that
runit is not *actively* developed by Gerrit anymore, even if he pops in
here from time to time and has not formally abandoned the project.
  What he said about it in 2016:
https://skarnet.org/cgi-bin/archive.cgi?2:mss:1417:lodleppmpcafcpjcgidf
  All of Gerrit's posts on the supervision mailing-list:
https://www.mail-archive.com/search?a=1&l=supervision%40list.skarnet.org&haswords=&x=12&y=18&from=Gerrit+Pape&subject=&datewithin=1d&date=&notwords=&o=newest

--
  Laurent

