From: "Laurent Bercot"
To: supervision
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: runit SIGPWR support
Date: Tue, 18 Feb 2020 09:39:14 +0000

>absolutely right, totally agreed.
>i also wondered why he refuses to add this.
>just catch and handle ALL possible signals, including the RT signals
>and leave it to the user how to react.

In the github issue you linked, I explained my exact reasoning.

An additional reason is that signaling init is not a casual operation; instead it's part of a very limited API between the kernel and user space, to be used in very controlled, exhaustively listed, situations.

>sorry Laurent, this is absolutely ridicolous.
>we are talking about using s6 as Linux process #1

No, that's not what we were talking about. We were talking about using runit as pid 1 in a container. I just used s6 and SIGPWR-as-sent-by-lxd as an illustration of why patching software is always more complicated than using configuration switches. And I stand by my point.

Now, *as a separate conversation*, you can say that s6-svscan should be able to handle every signal that the kernel can throw at it, no matter how unportable. And it is a reasonable request: there are good arguments for it. But the case for SIGPWR *is not* "that is the signal sent by lxd when it wants to shut down a container"! The case for SIGPWR is "the kernel may send this in the event of a power failure".

You may find that the difference is asinine, and that I'm splitting hairs; but I'm really not, and the difference is subtle but important.

In the latter case, the kernel takes precedence over init, the kernel decides what the API is and init must adapt.
If the kernel says "when I get a power failure, I send you SIGPWR", init cannot say "uh, no, I wish you'd send SIGUSR2 instead". Shut up and handle SIGPWR.

In the former case, lxd *emulates* a kernel, and is supposed to work with every kind of init that runs in a container, so it should follow existing conventions and be configurable for each of them. And that's exactly why the lxc.signal.halt configuration switch exists!

Now, "stop the machine" is not a signal that a kernel would send on its own. The decision to power off the machine comes from the admin, usually via a "shutdown" command or equivalent. And here's the thing: there is *no universal convention* on the API that a "shutdown" command must follow. None. Some inits use SIGTERM for that. Others use SIGUSR1. Others use SIGUSR2. Others use a totally different mechanism and don't send a signal to init at all. systemd, always being a special snowflake, uses SIGRTMIN+3 and SIGRTMIN+4, because any other choice made way too much sense.

None of them uses SIGPWR, and for a good reason: SIGPWR does not mean "the admin requested a system shutdown", it means "power failure". And it is very possible that the action implemented by the system in case of a power failure is very different from a shutdown: it could be a suspend-to-disk, for instance (which is faster than a full shutdown, and when the power fails you want to save your data *fast*).

So, even for inits that actually understand SIGPWR - and most of them actually do - SIGPWR is a *terrible* default choice of signal to send as a shutdown request. It already has a use, and the use is not a normal shutdown. Arguably, lxc.signal.halt should *always* be set to something else, be it SIGTERM, SIGUSR1, SIGUSR2, or even lolSIGRTMIN+3.

So, if you're asking me to implement SIGPWR support in s6 because that's what lxd sends by default to signal a container shutdown, I will laugh at you, because you are being, uh, "ridicolous".
On the other hand, if you're telling me that s6-svscan needs to understand SIGPWR in case the kernel wants to signal a power failure, you actually have a good point, and yes, I should implement SIGPWR support when this signal exists.

--
Laurent
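
The distinction drawn above (an admin-requested shutdown and a kernel-reported power failure are two different events, and SIGPWR should only be handled where the platform defines it) can be sketched roughly as follows. This is an illustration only, not how s6-svscan or runit implement it; the names `ADMIN_SHUTDOWN`, `POWER_FAILURE`, `build_signal_table` and `classify` are hypothetical, and SIGTERM as the shutdown signal is just one of the conventions mentioned in the message.

```python
import signal

# Hypothetical event names, for illustration only (not s6 or runit identifiers).
ADMIN_SHUTDOWN = "admin-shutdown"
POWER_FAILURE = "power-failure"


def build_signal_table(shutdown_signal=signal.SIGTERM):
    """Map signal numbers to the event each one stands for.

    The shutdown signal is a convention chosen by the init, so it is a
    parameter. SIGPWR has a kernel-defined meaning, so it is assigned last
    and always maps to POWER_FAILURE.
    """
    table = {shutdown_signal: ADMIN_SHUTDOWN}
    # SIGPWR is not portable: register it only where the platform defines it.
    sigpwr = getattr(signal, "SIGPWR", None)
    if sigpwr is not None:
        table[sigpwr] = POWER_FAILURE
    return table


def classify(table, signum):
    """Return the event for a signal number, or "ignored" if none is mapped."""
    return table.get(signum, "ignored")
```

With such a table, a power failure could trigger a different action than a normal shutdown (a fast suspend-to-disk, say), while the container manager would be configured to send whichever signal the init treats as a shutdown request.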