When I was working for a big chip testing company, a STDIO vs MMAP problem came up. The tester was at its heart a SPARC VME system running Solaris. The tester read in ‘patterns’ from disk, and it literally took hours to read in all the test patterns. At the scale of the large chip vendors, every minute you can’t test because the tester is booting, etc., means dollars are lost. We wrote a bunch of macros that replaced the STDIO file I/O calls with the equivalent mmap calls. It turns out STDIO does a lot of prefetching and assumes that you’re going to read a file linearly from beginning to end, whereas we wanted to jump around a lot in the pattern files. Pattern loading went from 4 hours to 30 minutes. Our customer was ecstatic.

Joe McGuckin
ViaNet Communications

joe@via.net
650-207-0372 cell
650-213-1302 office
650-969-2124 fax

> On Feb 5, 2021, at 5:18 PM, John Gilmore wrote:
>
> On Thu, Feb 04, 2021 at 09:17:54PM -0800, Bakul Shah wrote:
>> Write(2)ing to a mapped page sounds pretty dodgy. Likely to get you
>> in trouble in any case. Similarly read(2)ing.
>
> Uh, no. You misunderstand completely.
>
> The purpose of the kernel is to provide a reliable interface to system
> facilities, that lets processes NOT DEPEND on what other processes are
> doing.
>
> The decision about whether Tool X uses mmap() versus read() to access a
> file, or mmap() versus write() to change one, is a decision that DOES
> NOT DEPEND on what Tool Y is doing. Tools X and Y may have been written
> by different groups in different decades. Tool X may have been written
> to use stdio, which used read(). Three years later, stdio got rewritten
> to use mmap() for speed, but that's invisible to the author of Tool X.
> And maybe an end user in 2025 decides to use both Tool X and Tool Y on
> the same file. So only much later will any malign interactions between
> read/write and mmap actually be noticed by end users.
> And the fix is not to create new dependencies between Tool X, stdio, and
> Tool Y. It is to fix the kernel so they do not depend on each other!
>
> Here is a real-life example from my own experience.
>
> There is a long-standing bug in the Linux kernel, in which the inotify()
> system call simply didn't work on nested file systems. This caused a
> long-standing bug in Ubuntu, which I reported in 2012 here:
>
> https://bugs.launchpad.net/ubuntu/+source/rpcbind/+bug/977847
>
> The symptom was that after booting from a LiveCD image, "apt-get
> install" for system services (in my case an NFS client package) wouldn't
> work. Turned out the system startup scripts used inotify() to notice
> and start newly installed system services. The root cause was that
> inotify failed because the root file system was an "overlayfs" that
> overlaid a RAMdisk on top of the read-only LiveCD file system. The
> people who implemented "overlayfs" didn't think inotify() was important,
> or they thought it would be too much work to make it actually meet its
> specs, so they just made it ignore changes to the files in the overlaid
> file system. So the startup daemon's inotify() would never report the
> creation of new files about the new services, because those files were
> in the overlaying RAM disk, and so it would not start them and the user
> would notice the error.
>
> The underlying overlayfs bug was reported in 2011 here:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/882147
>
> As far as I know it has never been fixed. (The bug report was
> closed in 2019 for one of the usual bogus reasons.)
>
> The problem came because real tools (like systemd, or the tail command)
> actually started using inotify, assuming that as a well documented
> kernel interface, it would actually meet its specs.
> And because a completely unrelated other real tool (like the LiveCD
> installer) actually started using overlayfs, assuming that as a well
> documented kernel interface, it too would actually meet its specs. And
> then one day somebody tried to use both those tools together and they
> failed.
>
> That's why telling people "Don't use mmap() on the same file that you
> use read() on" is an invalid attitude for a Real Kernel Maintainer.
> Props to Larry McVoy for caring about this. Boos to the Linux
> maintainers of overlayfs who didn't give a shit.
>
> John
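
For anyone who wants to picture the STDIO-to-mmap swap described at the top of this message, here is a minimal sketch of the two access styles. It is not the tester code: that was a set of C macros on Solaris, which are not shown in this thread. This is a Python stand-in, and the fixed `RECORD_SIZE` layout and all function names are hypothetical, chosen only for illustration. It shows that both styles return the same bytes when jumping around a file; it makes no attempt to reproduce the timing difference.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: the file is a sequence of fixed-size records.
RECORD_SIZE = 16


def read_records_buffered(path, indices):
    """Fetch records with seek() + read() on a buffered file object."""
    out = []
    with open(path, "rb") as f:
        for i in indices:
            f.seek(i * RECORD_SIZE)
            out.append(f.read(RECORD_SIZE))
    return out


def read_records_mmap(path, indices):
    """Fetch the same records by slicing a read-only memory mapping."""
    out = []
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for i in indices:
                start = i * RECORD_SIZE
                out.append(m[start:start + RECORD_SIZE])
    return out


def make_pattern_file(n_records):
    """Write a throwaway file of n_records records and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(n_records):
            # Each record is its own index as a little-endian u64, zero-padded.
            f.write(struct.pack("<Q", i).ljust(RECORD_SIZE, b"\0"))
    return path
```

Calling `read_records_buffered(path, [999, 0, 512])` and `read_records_mmap(path, [999, 0, 512])` on a file from `make_pattern_file(1000)` should give identical lists. With the mapping, each lookup is just a slice of memory and the kernel pages data in on demand, with no user-space buffer to refill after every seek.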