From: Stefan Karrmann <S.Karrmann@web.de>
To: Laurent Bercot <ska-supervision@skarnet.org>
Cc: supervision@list.skarnet.org
Subject: Re: stage2 as a service
Date: Sun, 31 Jan 2021 21:51:55 +0100
Message-ID: <20210131205155.GA26069@web.de>
In-Reply-To: <em16c497f5-cdc4-49bb-9b61-ae39745c4b6f@elzian>
Hi Laurent,
Laurent Bercot @ 2021-01-31.10:25:22 +0000:
> Hi Stefan,
> Long time no see!
Yes, but still known. I'm impressed!
> A few comments:
>
> > # optional: -- Question: Is this necessary?
> > redirfd -w 0 ${SCANDIR}/service/s6-svscan-log/fifo
> > # now the catch all logger runs
> > fdclose 0
>
> I'm not sure what you're trying to do here. The catch-all logger
> should be automatically unblocked when
> ${SCANDIR}/service/s6-svscan-log/run starts.
Yes, that's the idea.
> The fifo trick should not be visible at all in stage 2: by the time
> stage 2 is running, everything is clean and no trickery should take
> place. The point of the fifo trick is to make the supervision tree
> log to a service that is part of the same supervision tree; but once
> the tree has started, no sleight of hand is required.
For the normal case you are absolutely right. But with stage 2 as a service
you have a race condition between stage 2 and s6-svscan-log. The usual
trick for stage 2 solves this problem.
> > foreground { s6-svc -O . } # don't restart me
>
> If you have to do this, it is the first sign that you're abusing
> the supervision pattern; see below.
Well, running once has been a part of supervise from the very start, by djb.
It was invented for oneshots.
> > foreground { s6-rc -l ${LIVEDIR}/live -t 10000 change ${RCDEFAULT} }
> > # notify s6-supervise:
> > fdmove 1 3
> > foreground { echo "s6-rc ready, stage 2 is up." }
> > fdclose 1 # -- Question: Is this necessary?
>
> It's not strictly necessary to close the fd after notifying readiness,
> but it's a good idea nonetheless since the fd is unusable afterwards.
> However, readiness notification is only useful when your service is
> actually providing a... service once it's ready; here, your "service"
> dies immediately, and is not restarted.
You are right.
> That's because it's really a oneshot
Yes, as implemented since djb's daemontools.
> that you're treating as a longrun, which is abusing the pattern.
>
>
> > # NB: shutdown should create ./down here, to avoid race conditions
>
> And here is the final proof: in order to make your architecture work,
> you have to *fight* supervision features, because they are getting in
> your way instead of helping you.
Well, s6-rc uses ./down, too. Shutdown is a very special case for
supervision.
> This shows that it's really not a good idea to run stage 2 as a
> supervised service. Stage 2 is really a one-time initialization script
> that should be run after the supervision tree is started, but *not*
> supervised.
Stage 2 as a service allows us to restart it if that ever turns out to be
necessary. Obviously, that should very rarely be the case.
> > { # fallback login
> > sulogin --force -t 600 # timeout 600 seconds, i.e. 10 minutes.
> > # kernel panic
> > }
>
> Your need for sulogin here comes from the fact that you're doing quite
> complex operations in stage 1: a user-defined set of hooks, then
> several filesystem mounts, then another user-defined set of hooks.
> And even then, you're running those in foreground blocks, so you're
> not catching the errors; the only time your fallback activates is if
> the cp -a from ${REPO} fails. Was that intended?
No, I should replace foreground with if.
Well, actually I don't use the hooks. But distribution maintainers often
want such things. E.g. they can scan for mapped devices (raid, lvm,
crypt). On the other hand, I know of no distribution that uses Paul Jarc's
/fs/*.
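A minimal sketch of the foreground-to-if change mentioned above, not the
actual stage 1 script. The interpreter path, the mount commands, and the
final echo are placeholders chosen for illustration; only the sulogin line
is taken from the script under discussion.

```execline
#!/bin/execlineb -P
# Hypothetical stage 1 fragment; mount targets and options are placeholders.
#
# "foreground { cmd }" runs cmd and continues whatever its exit code is.
# "if { cmd }" runs cmd and exits non-zero when cmd fails, so a failing
# mount makes the whole first block fail.
# "ifelse -n { block } { fallback } rest" execs into the fallback when the
# first block fails, and into the rest of the script when it succeeds.
ifelse -n
{
  if { mount -t tmpfs tmpfs /run }
  if { mount /var }
  true
}
{
  # fallback login, as in the original script
  sulogin --force -t 600
}
# the remainder of stage 1 would follow here
echo "stage 1 mounts done"
```

With foreground in place of if, a failed mount would be ignored and the
fallback would never be reached for that step.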
> In any case, that's a lot of error-prone work that could be done in
> stage 2 instead. If you keep stage 1 as barebones as possible (and
> only mount one single writable filesystem for the service directories)
> you should be able to do away with sulogin entirely. sulogin is a
> horrible hack that was only written because sysvinit is complex enough
> that it needs a special debugging tool if something breaks in the
> middle.
Reasonable. I mount only /run and /var, because the logs, even the
catch-all log, reside in /var/log/.
> With an s6-based init, it's not the case. Ideally, any failure that
> happens before your early getty is running can only be serious enough
> that you have to init=/bin/sh anyway. And for everything else, you have
> your early getty. No need for special tools.
Okay, that's reasonable and simpler.
> > Also I may switch to s6-linux-init finally.
>
> It should definitely spare you a lot of work. That's what it's for :)
I'm still migrating from systemd to s6{,-rc} with /fs/* step by step.
Therefore, I need more flexibility than s6-linux-init provides.
> --
> Laurent
Kind regards,
--
Stefan Karrmann