From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 7523 invoked from network); 6 Feb 2021 01:19:14 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 6 Feb 2021 01:19:14 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id BB3FD9BA46; Sat, 6 Feb 2021 11:19:12 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id D20079BA45; Sat, 6 Feb 2021 11:18:45 +1000 (AEST) Received: by minnie.tuhs.org (Postfix, from userid 112) id ED0829BA45; Sat, 6 Feb 2021 11:18:41 +1000 (AEST) Received: from hop.toad.com (75-101-100-43.dsl.static.fusionbroadband.com [75.101.100.43]) by minnie.tuhs.org (Postfix) with ESMTPS id ED8609BA3F for ; Sat, 6 Feb 2021 11:18:39 +1000 (AEST) Received: from hop.toad.com (localhost [127.0.0.1]) by hop.toad.com (8.12.9/8.12.9) with ESMTP id 1161IYSb031040; Fri, 5 Feb 2021 17:18:34 -0800 To: Bakul Shah , Larry McVoy , TUHS main list In-reply-to: <20210205141820.GO13701@mcvoy.com> References: <20210205003315.GK13701@mcvoy.com> <26D923CF-5319-4207-BA28-6EFA0E3BB1F8@iitbombay.org> <20210205141820.GO13701@mcvoy.com> Comments: In-reply-to Larry McVoy message dated "Fri, 05 Feb 2021 06:18:20 -0800." Date: Fri, 05 Feb 2021 17:18:34 -0800 Message-ID: <31039.1612574314@hop.toad.com> From: John Gilmore Subject: Re: [TUHS] FreeBSD behind the times? (was: Favorite unix design principles?) X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" On Thu, Feb 04, 2021 at 09:17:54PM -0800, Bakul Shah wrote: > Write(2)ing to a mapped page sounds pretty dodgy. Likely to get you > in trouble in any case. Similarly read(2)ing. Uh, no. You misunderstand completely. The purpose of the kernel is to provide a reliable interface to system facilities, that lets processes NOT DEPEND on what other processes are doing. The decision about whether Tool X uses mmap() versus read() to access a file, or mmap() versus write() to change one, is a decision that DOES NOT DEPEND on what Tool Y is doing. Tools X and Y may have been written by different groups in different decades. Tool X may have been written to use stdio, which used read(). Three years later, stdio got rewritten to use mmap() for speed, but that's invisible to the author of Tool X. And maybe an end user in 2025 decides to use both Tool X and Tool Y on the same file. So only much later will any malign interactions between read/write and mmap actually be noticed by end users. And the fix is not to create new dependencies between Tool X, stdio, and Tool Y. It is to fix the kernel so they do not depend on each other! Here is a real-life example from my own experience. There is a long-standing bug in the Linux kernel, in which the inotify() system call simply didn't work on nested file systems. This caused a long-standing bug in Ubuntu, which I reported in 2012 here: https://bugs.launchpad.net/ubuntu/+source/rpcbind/+bug/977847 The symptom was that after booting from a LiveCD image, "apt-get install" for system services (in my case an NFS client package) wouldn't work. Turned out the system startup scripts used inotify() to notice and start newly installed system services. The root cause was that inotify failed because the root file system was an "overlayfs" that overlaid a RAMdisk on top of the read-only LiveCD file system. The people who implemented "overlayfs" didn't think inotify() was important, or they thought it would be too much work to make it actually meet its specs, so they just made it ignore changes to the files in the overlaid file system. So the startup daemon's inotify() would never report the creation of new files about the new services, because those files were in the overlaying RAM disk, and so it would not start them and the user would notice the error. The underlying overlayfs bug was reported in 2011 here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/882147 As far as I know it has never been fixed. (The bug report was closed in 2019 for one of the usual bogus reasons.) The problem came because real tools (like systemd, or the tail command) actually started using inotify, assuming that as a well documented kernel interface, it would actually meet its specs. And because a completely unrelated other real tool (like the LiveCD installer) actually started using overlayfs, assuming that as a well documented kernel interface, it too would actually meet its specs. And then one day somebody tried to use both those tools together and they failed. That's why telling people "Don't use mmap() on the same file that you use read() on" is an invalid attitude for a Real Kernel Maintainer. Props to Larry McVoy for caring about this. Boos to the Linux maintainers of overlayfs who didn't give a shit. John