From: "Laurent Bercot"
To: supervision@list.skarnet.org
Subject: Re[2]: Scripting Stage 3 and 4
Date: Sat, 11 Jan 2025 12:21:26 +0000
In-Reply-To: <9ec5e5f1-35f9-4e6b-abd0-79299779b515@sopka.ch>
References: <1dcb4b00-db5d-49bd-8598-fd6b2e5febdd@sopka.ch> <9ec5e5f1-35f9-4e6b-abd0-79299779b515@sopka.ch>

>Since you can get a shell easily in the finish script too:
>
>| if -n { mount -o remount,ro / }
>| sh

With no tty? and not even a redirection from/to /dev/console?
That's going to be a tough recovery ;)

And sure, you could wrap all your commands in a big if block. But my
point is that these are *things you need to think about*. I'd rather
not have to think about new things at shutdown time when I don't have
to. Why change something and duplicate logic when you could just do
nothing and keep what's already working and has worked for the whole
lifetime of the machine?

The key to understanding how to manage a shutdown, I think, is to
realize that boot and shutdown *are not symmetrical*, and so, do not
need to be handled in a symmetrical way.
Just because you set up a supervision tree at boot time does not mean
you need to get rid of it at shutdown time.

When you boot, you're starting from *nothing*, and you need to build
up to a state where the machine is functional and able to run services.
That's why it's incremental and deliberate and elaborate. It is
literally a bootstrap process that needs precise ordering.

When you shut down, you're starting from a state where everything is
already working, so you can rely on many more features, and you're not
trying to build anything, you're just trying to ensure consistency of
permanent state (aka disk) before you pull the plug. It's the only
thing that matters. If you were only ever operating in RAM, starting
from ROM / ro disk, and never had any mutable permanent state, your
shutdown procedure could just be an immediate reboot(), or pressing the
power off button; that is the case for some embedded devices. But with
mutable permanent state, we need to be more careful. That's why we shut
down services in an ordered way, and then make sure we can unmount the
filesystems before powering off. It's the *only* reason; apart from
that, you can do whatever you want. Who cares? The system is going to
be down anyway.

When the Armageddon comes, you want to make sure the time capsules are
well sealed and buried for the next civilization to find, but you
don't have to clean your room.

So the goals of boot and shutdown are very different. I specifically
designed s6-linux-init so it would not store any vital information in
permanent mutable state, and would not hold any writing fd on a
filesystem. That means s6 will not prevent you from unmounting your
filesystems, parking your disks, whatever - and that the supervision
tree can keep running until you pull the plug. It is designed to help
you while the machine is running, and *especially* in delicate
situations where you're killing things and want to be sure you can
recover if something goes wrong rather than brick the system.
That's why I'm saying it's less effort to keep it in place and work
with disabling supervision when it needs to be disabled, than to
dismantle the supervision tree and have to reimplement recovery logic.

>And accidentally killing PID 1 would break the supervision tree based approach too, right?

My bad, "killing pid 1" is the wrong wording, because it's actually
impossible. I meant: pid 1 exiting. *That* is something you need to
worry about when scripting, and that won't happen if you just keep the
supervision tree.

>But getting a more advanced recovery method up,
>e.g. an ssh server when you only have remote access
>or an agetty instead of PID 1 sh on a desktop machine,
>will be more reliable under a supervision tree, I can see that.

Exactly.

>| if -n { mount -o remount,ro / }
>| foreground { s6-svc -U /run/service/recovery }

You just tore down the supervision tree, and you want to start a
recovery... service? :D

>Anyhow, you have convinced me that keeping the supervision tree is the better way.

\o/ This might be the first time someone listens to me \o/

--
Laurent