supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: Jeff <sysinit@yandex.com>
To: supervision@list.skarnet.org
Subject: Re: interesting claims
Date: Thu, 16 May 2019 19:10:50 +0200
Message-ID: <1190281558026650@sas2-c434f6e124b6.qloud-c.yandex.net>
In-Reply-To: <em557be3bb-11b8-4983-9c51-af19675020e8@elzian>

16.05.2019, 10:31, "Laurent Bercot" <ska-supervision@skarnet.org>:
>> The Question: As a newbie outsider I wonder, after following the
>> discussion of supervision and tasks on stages (1,2,3), that there is a
>> restrictive linear progression that prevents reversal. In terms of pid1
>> that I may not totally understand, is there a way that an admin can
>> reduce the system back to pid1 and restart processes instead of taking
>> the system down and restarting? If a glitch is found, usually it is
>> corrected and we find it simple to just do a reboot. What if you can
>> fix the problem and do it on the fly. The question would be why (or why
>> not), and I am not sure I can answer it, but if you theoretically can do
>> so, then can you also kill pid2 while pid10 is still running. With my
>> limited vision I see stages as one-way check valves in a series of fluid
>> linear flow.

take a look at (the now defunct) depinit:
http://sf.net/p/depinit/
http://depinit.sf.net/

it is said to provide very extensive rollback of dependencies
(so extensive that gettys will not work with it, according to the docs).

> Stage 1 isn't reversible; once it's done, you never touch it again,
> you don't need to "reverse" it. It would be akin to also unloading
> the kernel from memory before shutting down - it's just not necessary.

indeed.
and when something fails in that first stage a super-user rescue shell
should be started to fix it instead of any services that depend on it.
(stupid example: sethostname failed for some reason, spawn a rescue
shell for the admin to do something about it ;-).
in such cases it has to be considered whether this failure is important
enough to justify interruption of the boot phase.

if not: start as many other services as possible,
output/log an error message, keep calm, and carry on,
things can be handled when a getty is up.
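
The two policies described above (rescue shell versus log and carry on) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names run_and_wait, stage1_step and enum on_failure are hypothetical and not taken from any init system, and the rescue command is passed in by the caller (on a real system it would presumably be something like /bin/sh).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-step policy: what to do when a stage 1 step fails. */
enum on_failure { CARRY_ON, RESCUE_SHELL };

/* Run argv as a child process and wait for it.
 * Returns 0 if it exited with status 0, -1 otherwise. */
static int run_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* exec itself failed */
    }
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Run one stage 1 step.
 * Returns 0 if the step succeeded, 1 if it failed and the failure was
 * only logged, 2 if it failed and the rescue command was run. */
static int stage1_step(const char *name, char *const step[],
                       enum on_failure policy, char *const rescue[])
{
    if (run_and_wait(step) == 0)
        return 0;
    fprintf(stderr, "stage 1: step \"%s\" failed\n", name);
    if (policy == CARRY_ON)
        return 1;                   /* keep calm and carry on */
    run_and_wait(rescue);           /* admin fixes things, then boot continues */
    return 2;
}
```

Which steps deserve RESCUE_SHELL rather than CARRY_ON is exactly the judgment call mentioned above; the sketch only makes that choice explicit per step.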

> stage 4

i would prefer to call it "stage 3b", since stage 4 would start after
stage 3a + 3b, i.e. when process #1 execs into another executable, which
may be required in connection with an initramfs. anopa provides such a
stage 4 execline script.

> - If you want to kill every process but pid 1 and have the system
> reconstruct itself from there, then yes, it is possible, and that is
> the whole point of having a supervision tree rooted in pid 1. When
> you kill every process, the supervision tree respawns, so you always
> have a certain set of services running, and the system can always
> recover from whatever you throw at it. Try it: grab a machine with
> a supervision tree and a root shell, run "kill -9 -1", see what happens.

i wonder what happens if process #1 reacts to, say, SIGTERM
by starting the shutdown phase and rebooting afterwards.
what if process #1 is signaled "accidentally" by kill -TERM 1
(as we saw in preceding posts, -1 will not reach it)?
nothing is restarted and the system goes down instead, since
it is assumed that the signal was not sent "accidentally".
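
A minimal sketch of why process #1 cannot tell the two cases apart, assuming a design where the handled signals are blocked and collected with sigwait(). The enum, the function names and the signal-to-action mapping are hypothetical; only "SIGTERM starts shutdown and reboot" comes from the scenario above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical actions a process #1 might take on a signal. */
enum action { ACT_NONE, ACT_REAP, ACT_SHUTDOWN_REBOOT };

/* The mapping sees only the signal number: nothing in it says whether
 * the sender meant it or sent it by accident. */
static enum action action_for_signal(int sig)
{
    switch (sig) {
    case SIGTERM: return ACT_SHUTDOWN_REBOOT;
    case SIGCHLD: return ACT_REAP;
    default:      return ACT_NONE;
    }
}

/* Block the handled signals so they stay pending until collected.
 * Returns 0 on success, -1 on error. */
static int block_handled_signals(sigset_t *set)
{
    assert(set != NULL);
    sigemptyset(set);
    sigaddset(set, SIGTERM);
    sigaddset(set, SIGCHLD);
    return sigprocmask(SIG_BLOCK, set, NULL);
}

/* Wait for the next handled signal and translate it into an action. */
static enum action next_action(const sigset_t *set)
{
    int sig;
    if (sigwait(set, &sig) != 0)
        return ACT_NONE;
    return action_for_signal(sig);
}
```

A main loop would call block_handled_signals() once and then act on next_action() repeatedly; a "kill -TERM 1" typed by mistake yields ACT_SHUTDOWN_REBOOT just like a deliberate one.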

in the case of a process #1 that does not supervise anything (the
supervisor runs with PID > 1): when everything is killed "accidentally"
(via kill ( -1, SIGKILL ) for example), the system is bricked and the
reset button has to be used.

only a privileged process can reach everything with PID > 1 that
way, so if this happens something is wrong and should be fixed ASAP.
in the case of process #1 respawning the supervisor:
it restarts everything, maybe the "accident" happens again, and so on ...
this could lead to the system being caught in an "endless" loop.
maybe this can also only be fixed by powering down ...
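
The respawn loop in question can be sketched as below. To keep the example terminating it is given an upper bound on the number of starts; that bound is an assumption added for illustration, not something proposed in this thread, and the name supervise is hypothetical. Without the bound, a child that keeps getting killed is restarted forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start argv and restart it whenever it dies, whether it exited on its
 * own or was killed by a signal, until it has been started max_starts
 * times. Returns the number of times the child was actually started. */
static int supervise(char *const argv[], int max_starts)
{
    assert(argv != NULL && argv[0] != NULL);
    int starts = 0;
    while (starts < max_starts) {
        pid_t pid = fork();
        if (pid < 0)
            break;                  /* cannot fork: give up */
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);             /* exec itself failed */
        }
        starts++;
        int status;
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
            ;                       /* retry if interrupted by a signal */
    }
    return starts;
}
```

A child that kills itself with SIGKILL on every start (for example sh -c 'kill -9 $$') uses up all the allowed starts, which is the loop described above in miniature.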

non-supervising process #1: same, but worse: the reset button has to
be used, state is lost, filesystems are not unmounted cleanly and whatnot.

but even with a supervising process #1 the system can be
prevented from entering the shutdown phase cleanly.


