From: "Laurent Bercot"
To: "Ihor Antonov"
Cc: supervision@list.skarnet.org
Subject: Re: ftrig pipe naming convention
Date: Tue, 04 Oct 2022 11:54:08 +0000

(I was on vacation, sorry for the delayed answer.)

>Could you please elaborate on the possible race condition? This is
>simply for curiosity and educational purposes. It feels like a lot of
>thought was put into s6 codebase, and a lot of ideas are not
>immediately obvious for people not intimately familiar with OS
>interface.

When you want to listen to notifications, one of the first questions
that come to mind is: what happens if a notification comes before I
start listening to it? How do I make sure I don't miss anything?
That's the primary race condition for notifications, and the reason why
a simple tool such as s6-ftrig-wait can only be used in the simplest of
settings: when you run s6-ftrig-wait, what you get is notifications from
the moment you run it, and you don't know what happened before.

The answer is synchronization between the listener and the sender. In
order to make sure the listener misses nothing, *first* the listener
starts listening, *then* the sender is run and can notify the listener.
That's how s6-ftrig-listen1 works: the rest of its command line is
spawned *after* it has made sure there's a fifo listening in the
fifodir, and that command line is supposed to be something that tells
the sender that it's okay to start sending notifications now.

ftrigr_subscribe() is the primitive that makes a program listen to
notifications in a fifodir, and returns control to the client. It is
important because it is asynchronous: notifications will be read and
processed as soon as ftrigr_subscribe() returns, and the client can do
whatever it needs to do, such as read a state or prime the notification
sender, and then handle the notifications in its own time by calling
ftrigr_update(). The fact that you can do something between subscribing
and handling the notifications is fundamental, and what makes the model
strictly more powerful than "cat < $fifo".

Internally, it's the s6-ftrigrd helper program, spawned by the
ftrigr_startf() primitive, that performs the system calls in the
fifodirs the client is subscribed to, filters notifications according to
regexps, and sends the relevant notifications to the client; it is
necessary to have an external program doing that, in order to spare the
client a lot of menial work and avoid forcing it into a given
asynchronous programming model. s6-ftrigrd hides all the low-level
details from the client and allows the ftrigr library to remain usable
in a variety of programming models.
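
The listen-first ordering described above can be illustrated with a
plain fifo. What follows is only a minimal Python sketch, not libftrig
and not how s6-ftrigrd is written: listen_then_run(), notify() and the
fifo path are made-up names for the example. The point is the order of
operations: open the read end, *then* let the sender run, *then* read
what arrived.

```python
import os


def notify(fifo_path, event):
    """Hypothetical sender: write one event to a fifo that already has a reader."""
    wfd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    try:
        os.write(wfd, event)
    finally:
        os.close(wfd)


def listen_then_run(fifo_path, start_sender):
    """Open the read end first, then run the sender, then collect events."""
    os.mkfifo(fifo_path)
    # O_NONBLOCK: opening a fifo read-only returns at once, writer or not.
    rfd = os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # From here on, anything written to the fifo is buffered for us.
        start_sender()
        # Handle the notifications "in our own time".
        return os.read(rfd, 64)
    finally:
        os.close(rfd)
        os.unlink(fifo_path)
```

Called as listen_then_run(path, lambda: notify(path, b"U")), the byte
written by the sender is still sitting in the fifo when the listener
gets around to reading it, because the read end was open before the
sender started.
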
ftrigr_subscribe() simply tells s6-ftrigrd to open a new fifo in a
fifodir, and waits for an answer code. If it gets a successful answer
from s6-ftrigrd, it means the fifo is open and from now on every
notification will be correctly received and processed. The client can
then proceed to the operation that can cause notifications to be sent.
s6-ftrig-listen1 runs ftrigr_subscribe(), *then* spawns the rest of its
command line - that's how race conditions are avoided.

In the case of supervision, this is used to track the state of a
service. When a command such as s6-svwait wants to wait until a service
is in a given state, *first* it runs ftrigr_subscribe(), *then* it looks
at the current registered service state (in the supervise/status file),
and then it computes the new service state according to the data it
receives from s6-ftrigrd. There is no race window during which s6-svwait
would have read the status file but not be reading notifications yet,
which would risk missing state changes.

That is the main race condition that the ftrigr library solves.

Now, in addition to that, there is another, less serious race condition
that is more directly related to what you were asking about, with
directly creating fifos in fifodirs.

The "send a notification" primitive is ftrigw_notify() (with its close
family members for various details). It will open all fifos in a fifodir
that match the hardcoded name pattern in succession, and write a byte to
them. Normally, this write succeeds: there's a s6-ftrigrd reader behind
each one of these fifos - and anything else means there's a problem.
Most likely, it's a benign problem, such as a stale fifo: s6-ftrigrd was
killed before it had the chance to clean up, so there's a useless,
unused fifo lying around.
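
A rough Python sketch of such a notifier loop, on plain fifos, may help
picture it. This is not ftrigw_notify() itself: notify_all() and the
"evt-" prefix are invented for the example, and the real name pattern is
whatever libftrig hardcodes. The sketch relies on a POSIX guarantee:
opening a fifo write-only with O_NONBLOCK fails with ENXIO when no
process has it open for reading.

```python
import errno
import os

PREFIX = "evt-"  # hypothetical name pattern, not the one libftrig uses


def notify_all(fifodir, byte):
    """Write one byte to every matching fifo; return the names with no reader."""
    no_reader = []
    for name in sorted(os.listdir(fifodir)):
        if not name.startswith(PREFIX):
            continue  # names outside the pattern are never touched
        path = os.path.join(fifodir, name)
        try:
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as e:
            if e.errno == errno.ENXIO:
                # The fifo exists but nobody has it open for reading.
                no_reader.append(name)
                continue
            raise
        try:
            os.write(fd, byte)
        finally:
            os.close(fd)
    return no_reader
```

A non-empty return value plays the role of the "there was a problem"
report: every name in it is a fifo that matched the pattern but had
nobody listening.
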
When it runs into such a fifo, ftrigw_notify() reports, via its return
code, that there was a problem; and just in case it has the rights to do
so (which is most of the time), it tries to unlink the stale fifo, which
cleans things up for the next notification, and makes a manual
invocation of s6-cleanfifodir unnecessary. Simple and efficient.

The Unix mkfifo() (or mknod()) system call just creates the fifo in the
filesystem. It does not open it. In order to open a fifo to read on, you
need to mkfifo() then open(). See where I'm going here?

If you "mkfifo $fifo" then "cat < $fifo", and a notification just
happens to arrive in between the two, ftrigw_notify() will see a fifo
that has been created but is without a reader, and assume it's stale.
Best case it will only report that something was wrong, which doesn't
really matter (for instance, s6-supervise ignores the return code); but
worst case, it will yoink that fifo from under your feet, your cat will
fail and you'll have no idea why.

s6-ftrigrd avoids that by creating its fifo under a name that does *not*
match the pattern (starting with a dot, just to be sure), *then* opening
it, *then* renaming it to its final name. So when the fifo becomes
visible to ftrigw_notify(), it already has a reader.

There you go, a small dive into the design and bowels of libftrig. :)

--
Laurent
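
For readers who want to try the create, open, rename sequence described
in the message, here is a minimal Python sketch. It uses plain POSIX
calls and is not s6-ftrigrd's actual code: open_listener(), the "evt-"
prefix and the dot-name scheme are assumptions made for illustration.

```python
import os

PREFIX = "evt-"  # hypothetical visible-name pattern, not libftrig's real one


def open_listener(fifodir, ident):
    """Create a fifo that only becomes visible once it already has a reader."""
    hidden = os.path.join(fifodir, "." + ident)   # does not match PREFIX
    final = os.path.join(fifodir, PREFIX + ident)
    os.mkfifo(hidden)
    try:
        # Non-blocking, so the open returns without waiting for a writer.
        rfd = os.open(hidden, os.O_RDONLY | os.O_NONBLOCK)
    except OSError:
        os.unlink(hidden)
        raise
    try:
        # rename() is atomic: the final name appears with the reader attached.
        os.rename(hidden, final)
    except OSError:
        os.close(rfd)
        os.unlink(hidden)
        raise
    return rfd, final
```

A notifier that scans the directory for names starting with "evt-"
either does not see this fifo at all, or sees it with its read end
already open; there is no moment where it looks stale.
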