From: Kelly Dean
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: s6 bites noob
Date: Thu, 31 Jan 2019 18:32:45 +0000
To: supervision@list.skarnet.org

mkdir test
s6-svscan --help

Well, that was surprising and unpleasant. It ignores unknown arguments, blithely starts a supervision tree in the current dir (my home dir), and spams me with a bunch of supervise errors. Ok, kill it. Next test:

s6-svscan test

It gives errors about supervise being unable to spawn ./run, and the child dying. What? On an empty scan dir? Oh, the previous test's accidental supervision tree ran supervise on all the current dir's subdirs--and each instance of supervise automatically created a supervise subdir of its service dir.
So, now there's test/supervise, which svscan now interprets as a service dir, and starts supervise on it, which barfs.

What purpose is served by supervise automatically creating the supervise and event subdirs if there's no run file? It seems to accomplish nothing but errors and confusion. Instead of creating the subdirs, and then barfing on the absence of a run file, why not just create nothing until a run file appears?

The doc for svscan at least says that it creates the .s6-svscan subdir. The doc for supervise says nothing about creating the supervise subdir, though the doc for servicedir does say it.

Next problem. The doc for s6-svc indicates that

s6-svc -wu serv/foo

will wait until it's up. But that's not what happens. Instead, it exits immediately. It also doesn't even try to start the service unless -u is also given, which is surprising, but technically not in contradiction of the doc. And if -u is given, then -wu waits forever, even after the service is up. In serv/foo/run I have:

#!/bin/bash
echo starting; sleep 2; echo dying

s6-svc -wu -u serv/foo/

will start it, but never exits. Likewise,

s6-svc -wd -d serv/foo/

will stop it, but never exits. supervise itself does do its job though, and perpetually restarts run after run dies while the service is set to be up.

So, I tried s6-rc. Set up service definition dir, compile database, create link, run s6-rc-init, etc, then finally

s6-rc -u change foo

It starts immediately, but rc then waits while foo goes through 12 to 15 start/sleep/die cycles before rc finally exits with code 0. (And foo continues cycling.)

But if I press ^C on rc before it exits on its own, then it kills foo, writes a warning that it was unable to start the service because foo crashed with signal 2, and exits with code 1.

So I tried it again, and this time pressed ^C on rc immediately after running it, before foo had a chance to die for the first time. It reported the same warning!
The prophecy is impressive, but still, shouldn't rc just exit immediately after foo starts, and let the supervision tree independently handle foo's future death?

Next test: I moved run to up, changed type to oneshot, recompiled, created new link, ran s6-rc-update, and tried foo again. This time, rc hangs forever, and up is never executed at all. When I eventually press ^C on rc, though, it doesn't say unable to start foo; it says unable to start s6rc-oneshot-runner.

This is all with default configuration for skalibs, execline, s6, and s6-rc, sourced from GitHub, running on Debian 9, in my home directory as a non-root user (with -c option for rc-init and -l for rc-init, rc, and rc-update, to avoid polluting system dirs while testing).

s6-rc doesn't understand a --version option, but s6-rc/NEWS says 0.4.1.1. And s6/NEWS says 2.8.0.0.

And there appears to be an option missing for s6-rc:

s6-rc -d list         # List all
s6-rc -a list         # List all up
s6-rc -d change foo   # Bring foo down
s6-rc -u change foo   # Bring foo up
s6-rc -da change      # Bring all down

How to bring all up? The examples above suggest it would be

s6-rc -ua change

But that does nothing. (And the doc does indicate that it would do nothing, since there's no selection.)

And a question about the advice in the docs. If svscan's rescan is 0, and /tmp is RAM, what's the advantage of having the scan directory be /tmp/service with symlinks to service directory copies in /tmp/services, instead of simply having /tmp/services directly be the scan directory?

I guess an answer might be that there can be a race between svscan's initial scan at system startup and the populating of /tmp/services, so it sees partially copied service directories. But wouldn't a simpler solution be to either delay svscan's start until the populating is complete, or add an option to disable its initial scan?
With no initial scan, you have to run svscanctl -a after /tmp/services has been populated, but you have to run that anyway even if you're using symlinks from a separate /tmp/service directory.
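
For concreteness, the populate-then-rescan sequence described above could look something like the sketch below. The function name populate_scandir and its two path arguments are made up for illustration, and the check for .s6-svscan/control assumes that is where a running s6-svscan keeps its control FIFO; the rescan itself is the svscanctl -a call already mentioned.

```shell
#!/bin/sh
# Sketch only: copy service directories into the scan directory, then
# ask a running s6-svscan (if any) to rescan. populate_scandir is a
# hypothetical helper, not part of s6.

populate_scandir() {
    src=$1    # directory holding the stored service directories
    scan=$2   # scan directory, e.g. somewhere under /tmp

    mkdir -p "$scan" || return 1

    for d in "$src"/*/; do
        [ -d "$d" ] || continue           # glob matched nothing
        d=${d%/}                          # strip the trailing slash
        name=$(basename "$d")
        [ -e "$scan/$name" ] && continue  # already copied, leave it alone
        cp -Rp "$d" "$scan/$name" || return 1
    done

    # Trigger a rescan only once everything is in place, and only if
    # s6-svscanctl is installed and a control FIFO exists (assumption:
    # s6-svscan is running on this directory).
    if command -v s6-svscanctl >/dev/null 2>&1 \
        && [ -p "$scan/.s6-svscan/control" ]; then
        s6-svscanctl -a "$scan"
    fi
}
```

Called as, say, populate_scandir /etc/services /tmp/services (both paths are placeholders), this only shows the ordering; it does nothing about a scan that svscan starts on its own while the copy is still in progress.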