* [TUHS] If forking is bad, how about buffering?
@ 2024-05-13 13:34 Douglas McIlroy
2024-05-13 22:01 ` [TUHS] " Andrew Warkentin
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Douglas McIlroy @ 2024-05-13 13:34 UTC (permalink / raw)
To: TUHS main list
[-- Attachment #1: Type: text/plain, Size: 2328 bytes --]
So fork() is a significant nuisance. How about the far more ubiquitous
problem of IO buffering?
On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> But it does come down to the same argument as
>
https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
The Microsoft manifesto says that fork() is an evil hack. One of the cited
evils is that one must remember to flush output buffers before forking, for
fear it will be emitted twice. But buffering is the culprit, not the
victim. Output buffers must be flushed for many other reasons: to avoid
deadlock; to force prompt delivery of urgent output; to keep output from
being lost in case of a subsequent failure. Input buffers can also steal
data by reading ahead into stuff that should go to another consumer. In all
these cases buffering can break compositionality. Yet the manifesto blames
an instance of the hazard on fork()!
To assure compositionality, one must flush output buffers at every possible
point where an unknown downstream consumer might correctly act on the
received data with observable results. And input buffering must never
ingest data that the program will not eventually use. These are tough
criteria to meet in general without sacrificing buffering.
The advent of pipes vividly exposed the non-compositionality of output
buffering. Interactive pipelines froze when users could not provide input
that would force stuff to be flushed until the input was informed by that
very stuff. This phenomenon motivated cat -u, and stdio's convention of
line buffering for stdout. The premier example of input buffering eating
other programs' data was mitigated by "here documents" in the Bourne shell.
These precautions are mere fig leaves that conceal important special cases.
The underlying evil of buffered IO still lurks. The justification is that
it's necessary to match the characteristics of IO devices and to minimize
system-call overhead. The former necessity requires the attention of
hardware designers, but the latter is in the hands of programmers. What can
be done to mitigate the pain of border-crossing into the kernel? L4 and its
ilk have taken a whack. An even more radical approach might flow from the
"whitepaper" at www.codevalley.com.
In any event, the abolition of buffering is a grand challenge.
Doug
[-- Attachment #2: Type: text/html, Size: 3560 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* [TUHS] Re: If forking is bad, how about buffering?
2024-05-13 13:34 [TUHS] If forking is bad, how about buffering? Douglas McIlroy
@ 2024-05-13 22:01 ` Andrew Warkentin
2024-05-14 7:10 ` Rob Pike
` (2 subsequent siblings)
3 siblings, 0 replies; 20+ messages in thread
From: Andrew Warkentin @ 2024-05-13 22:01 UTC (permalink / raw)
To: The Eunuchs Historic Society

On Mon, May 13, 2024 at 7:42 AM Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
>
> These precautions are mere fig leaves that conceal important special cases. The underlying evil of buffered IO still lurks. The justification is that it's necessary to match the characteristics of IO devices and to minimize system-call overhead. The former necessity requires the attention of hardware designers, but the latter is in the hands of programmers. What can be done to mitigate the pain of border-crossing into the kernel? L4 and its ilk have taken a whack. An even more radical approach might flow from the "whitepaper" at www.codevalley.com.
>
QNX copies messages directly between address spaces without any intermediary buffering, similarly to L4-like kernels. However, some of its libraries and servers do still use intermediary buffers.

^ permalink raw reply [flat|nested] 20+ messages in thread
* [TUHS] Re: If forking is bad, how about buffering?
2024-05-13 13:34 [TUHS] If forking is bad, how about buffering? Douglas McIlroy
2024-05-13 22:01 ` [TUHS] " Andrew Warkentin
@ 2024-05-14 7:10 ` Rob Pike
2024-05-14 11:10 ` G. Branden Robinson
2024-05-14 22:08 ` George Michaelson
2024-05-14 22:34 ` Bakul Shah via TUHS
2024-05-19 10:41 ` Ralph Corderoy
3 siblings, 2 replies; 20+ messages in thread
From: Rob Pike @ 2024-05-14 7:10 UTC (permalink / raw)
To: Douglas McIlroy; +Cc: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 3043 bytes --]

I agree with your (as usual) perceptive analysis. Only stopping by to point out that I took the buffering out of cat. I didn't have your perspicacity on why it should happen, just a desire to remove all the damn flags. When I was done, cat.c was 35 lines long. Do a read, do a write, continue until EOF. Guess what? That's all you need if you want to cat files.

Sad to say Bell Labs's cat door was hard to open and most of the world still has a cat with flags. And buffers.

-rob

On Mon, May 13, 2024 at 11:35 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> So fork() is a significant nuisance. How about the far more ubiquitous problem of IO buffering?
>
> On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> > But it does come down to the same argument as
> > https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
>
> The Microsoft manifesto says that fork() is an evil hack. One of the cited evils is that one must remember to flush output buffers before forking, for fear it will be emitted twice. But buffering is the culprit, not the victim. Output buffers must be flushed for many other reasons: to avoid deadlock; to force prompt delivery of urgent output; to keep output from being lost in case of a subsequent failure. Input buffers can also steal data by reading ahead into stuff that should go to another consumer. In all these cases buffering can break compositionality. Yet the manifesto blames an instance of the hazard on fork()!
>
> To assure compositionality, one must flush output buffers at every possible point where an unknown downstream consumer might correctly act on the received data with observable results. And input buffering must never ingest data that the program will not eventually use. These are tough criteria to meet in general without sacrificing buffering.
>
> The advent of pipes vividly exposed the non-compositionality of output buffering. Interactive pipelines froze when users could not provide input that would force stuff to be flushed until the input was informed by that very stuff. This phenomenon motivated cat -u, and stdio's convention of line buffering for stdout. The premier example of input buffering eating other programs' data was mitigated by "here documents" in the Bourne shell.
>
> These precautions are mere fig leaves that conceal important special cases. The underlying evil of buffered IO still lurks. The justification is that it's necessary to match the characteristics of IO devices and to minimize system-call overhead. The former necessity requires the attention of hardware designers, but the latter is in the hands of programmers. What can be done to mitigate the pain of border-crossing into the kernel? L4 and its ilk have taken a whack. An even more radical approach might flow from the "whitepaper" at www.codevalley.com.
>
> In any event, the abolition of buffering is a grand challenge.
>
> Doug

[-- Attachment #2: Type: text/html, Size: 4878 bytes --]

^ permalink raw reply [flat|nested] 20+ messages in thread
* [TUHS] Re: If forking is bad, how about buffering?
2024-05-14 7:10 ` Rob Pike
@ 2024-05-14 11:10 ` G. Branden Robinson
2024-05-15 14:42 ` Dan Cross
2024-05-14 22:08 ` George Michaelson
1 sibling, 1 reply; 20+ messages in thread
From: G. Branden Robinson @ 2024-05-14 11:10 UTC (permalink / raw)
To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 5407 bytes --]

I've wondered about the cat flag war myself, and have a theory. Might as well air it here since the real McCoy (and McIlroy) are available to shoot it down. :)

I'm sure the following attempt at knot-slashing is not novel, but people relentlessly return to this issue as if the presence of _flags_ is the problem. (Plan 9 fans recite this point ritually, like a mantra.) I say it isn't.

At 2024-05-14T17:10:38+1000, Rob Pike wrote:
> I agree with your (as usual) perceptive analysis. Only stopping by to point out that I took the buffering out of cat. I didn't have your perspicacity on why it should happen, just a desire to remove all the damn flags. When I was done, cat.c was 35 lines long. Do a read, do a write, continue until EOF. Guess what? That's all you need if you want to cat files.
>
> Sad to say Bell Labs's cat door was hard to open and most of the world still has a cat with flags. And buffers.

I think this dispute is a proxy fight between two communities, or more precisely two views of what cat(1), and other elementary Unix commands, primarily exist to achieve. In my opinion both perspectives are valid, and it's better to consider what each perspective wants than mandate that either is superior.

Viewpoint 1: Perspective from Pike's Peak

Elementary Unix commands should be elementary. Unix is a kernel. Programs that do simple things with system calls should remain simple. This practice makes the system (the kernel interface) easier to learn, and to motivate and justify to others. Programs therefore test the simplicity and utility of, and can reveal flaws in, the set of primitives that the kernel exposes. This is valuable stuff for a research organization. "Research" was right there in the CSRC's name.

Viewpoint 2: "I Just Want to Serve 5 Terabytes"[1]

cat(1)'s man page did not advertise the traits in the foregoing viewpoint as objectives, and never did.[2] Its avowed purpose was to copy, without interruption or separation, 1..n files from storage to an output channel or stream (which might be redirected).

I don't need to convince you that this is a worthwhile application. But when we think about the many possible ways--and destinations--a person might have in mind for that I/O channel, we have to face the necessity of buffering or performance goes through the floor.

It is 1978. Some VMS or, ugh, CP/M advocate from those piddly little toy machines will come along. "Ha ha," they will say, "our OS is way faster than the storied Unix even at the simple task of dumping files".

Nowhere[citation needed] outside of C tutorials is cat implemented as

    int c;
    while((c = getchar()) != EOF) putchar(c);

or its read()/write() system call equivalent. The output channel might be across a network in a distributed computing environment. Nobody wants to work with one byte at a time in that situation. Ethernet's minimum packet size is 64 bytes. No one wants that kind of overhead.

While composing this mail, I had a look at an early, pre-C version of cat, spelling error in the only comment line and all.

https://minnie.tuhs.org/cgi-bin/utree.pl?file=V2/cmd/cat.s

    putc:
        movb    r0,(r2)+
        cmp     r2,$obuf+512.
        blo     1f
        mov     $1,r0
        sys     write; obuf; 512.
        mov     $obuf,r2

Well, look at that. Buffering. The author of this tool of course knew the kernel well, including the size of its internal disk buffers (on the assumption that I/O would mainly be happening to and from disks).

But that's a "leaky abstraction", or a "layering violation". (That'll be two tickets to the eternal fires of Brogrammer Hell, thanks.) Once you sweep away the break room buzzwords we understand that cat is presuming things that it should not (the size of the kernel's buffers, and the nature of devices serving as source and sink).

And this, as we all know, is one of the reasons the standard I/O library came into existence. Mike Lesk, I surmise, understood that the "applications programmer" having knowledge of kernel internals was in general neither necessary nor desirable.

What _should_ have happened, IMAO, as stdio.h came into existence and the commercialization and USG/PWB-ification of Unix became truly inevitable, is that Viewpoint 1 should have been salvaged for the benefit of continuing operating systems research and kernel development.

But! We should have kept cat(1), and let it grow as many flags as practical use demanded--_except_ for `-u`--and at the _same time_ developed a new kcat(1) command that really was just a thin wrapper around system calls. Then you'd be a lot closer to measuring what the kernel was really doing, what you were paying for it, and you could still boast of your elegance in OS textbooks.

I concede that the name "kcat" would have been twice the length a certain prominent user of the Unix kernel would have tolerated. Maybe "kc" would have been better. The remaining 61 alphanumeric sigils that might follow the 'k' would have been reserved for other exercises of the kernel interface. If your kernel is sufficiently lean,[3] 62 cases exercising it ought to be enough for anybody.

Regards,
Branden

[1] https://news.ycombinator.com/item?id=29082014
[2] https://minnie.tuhs.org/cgi-bin/utree.pl?file=V1/man/man1/cat.1
[3] https://dl.acm.org/doi/10.1145/224056.224075

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply [flat|nested] 20+ messages in thread
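The putc routine from V2 cat.s quoted in the message above might be transliterated into C roughly as follows (the names putbyte/flushbuf and the fd parameter are mine; the original hard-codes file descriptor 1 and a 512-byte obuf, sized to match the kernel's disk buffers):

```c
#include <unistd.h>

static char obuf[512];   /* the 512-byte output buffer, as in cat.s */
static int  opos;        /* next free slot, standing in for register r2 */

/* Append one byte; issue a single write(2) whenever the buffer fills. */
void putbyte(int fd, int c)
{
    obuf[opos++] = c;
    if (opos == (int)sizeof obuf) {
        write(fd, obuf, sizeof obuf);
        opos = 0;
    }
}

/* Drain whatever remains, as cat must do at EOF. */
void flushbuf(int fd)
{
    if (opos > 0) {
        write(fd, obuf, opos);
        opos = 0;
    }
}
```

One system call per 512 bytes instead of one per byte: the whole justification for buffered output, and the "leaky abstraction" in miniature.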
* [TUHS] Re: If forking is bad, how about buffering?
2024-05-14 11:10 ` G. Branden Robinson
@ 2024-05-15 14:42 ` Dan Cross
2024-05-15 16:42 ` G. Branden Robinson
0 siblings, 1 reply; 20+ messages in thread
From: Dan Cross @ 2024-05-15 14:42 UTC (permalink / raw)
To: G. Branden Robinson; +Cc: TUHS main list

On Tue, May 14, 2024 at 7:10 AM G. Branden Robinson <g.branden.robinson@gmail.com> wrote:
> [snip]
> Viewpoint 1: Perspective from Pike's Peak

Clever.

> Elementary Unix commands should be elementary. Unix is a kernel. Programs that do simple things with system calls should remain simple. This practice makes the system (the kernel interface) easier to learn, and to motivate and justify to others. Programs therefore test the simplicity and utility of, and can reveal flaws in, the set of primitives that the kernel exposes. This is valuable stuff for a research organization. "Research" was right there in the CSRC's name.

I believe this is at once making a more complex argument than was proffered, and at the same time misses the contextual essence that Unix was created in.

> Viewpoint 2: "I Just Want to Serve 5 Terabytes"[1]
>
> cat(1)'s man page did not advertise the traits in the foregoing viewpoint as objectives, and never did.[2] Its avowed purpose was to copy, without interruption or separation, 1..n files from storage to an output channel or stream (which might be redirected).
>
> I don't need to convince you that this is a worthwhile application. But when we think about the many possible ways--and destinations--a person might have in mind for that I/O channel, we have to face the necessity of buffering or performance goes through the floor.
>
> It is 1978. Some VMS

I don't know about that; VMS IO is notably slower than Unix IO by default. Unlike VMS, Unix uses the buffer cache to serialize access to the underlying storage device(s). Ironically, caching here is a major win, not just for speed, but to make it relatively easy to reason about the state of a block, since that state is removed from the minutiae of the underlying storage device and instead handled in the bio layer. Treating the block cache as a fixed-size pool yields a relatively simple state machine for synchronizing between the in-memory and on-disk representations of data.

> [snip]
> And this, as we all know, is one of the reasons the standard I/O library came into existence. Mike Lesk, I surmise, understood that the "applications programmer" having knowledge of kernel internals was in general neither necessary nor desirable.

I'm not sure about that. I suspect that the justification _may_ have been more along the lines of noting that many programs implemented their own, largely similar buffering strategies, and that it was preferable to centralize those into a single library, and also noting that building some kinds of programs was inconvenient using raw system calls. For instance, something like `gets` is handy, but is _annoying_ to write using just read(2). It can obviously be done, but if I don't have to, I'd prefer not to.

> [snip]
> We should have kept cat(1), and let it grow as many flags as practical use demanded--_except_ for `-u`--and at the _same time_ developed a new kcat(1) command that really was just a thin wrapper around system calls. Then you'd be a lot closer to measuring what the kernel was really doing, what you were paying for it, and you could still boast of your elegance in OS textbooks.
> [snip]

Here's where I think this misses the mark: this focuses too much on the idea that simple programs exist to be tests for, and exemplars of, the kernel system call interface, but what evidence do you have for that? A simpler explanation is that simple programs are easier to write, easier to read, easier to reason about, test, and examine for correctness. Unix amplified this with Doug's "garden hoses of data" idea and the advent of pipes; here, it was found that small, simple programs could be combined in often surprisingly unanticipated ways.

Unix built up a philosophy about _how_ to write programs that was rooted in the problems that were interesting when Unix was first created. Something we often forget is that research systems are built to address problems that are interesting _to the researchers who build them_. This context can shape a system, and we see that with Unix: a highly synchronous system call interface, because overly elaborate async interfaces were hard to program; a simple file abstraction that was easy to use (open/creat/read/write/close/seek/stat) because files on other contemporary systems were baroque things that were difficult to use; a simple primitive for the creation of processes because, again, on other systems processes were very heavy, complicated things that were difficult to use.

Unix took problems related to IO and processes and made them easy. By the 80s, these were pretty well understood, so focus shifted to other things (languages, networking, etc). Unix is one of those rare beasts that escaped the lab and made it out there in the wild. It became the workhorse that begat a whole two or three generations of commercial work; it's unsurprising that when the web explosion happened, Unix became the basis for it: it was there, it was familiar, and by then it wasn't a research project anymore, but a basis for serious commercial work. That it has retained the original system call interface is almost incidental; perhaps that fits with your broccoli-man analogy.

- Dan C.

^ permalink raw reply [flat|nested] 20+ messages in thread
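Dan's aside about gets and read(2) in the message above is worth making concrete. A sketch of the chore (bounded and fgets-flavored rather than the infamously unbounded gets itself; read_line is an invented helper, not a real API): built on the bare system call, each byte costs a trap into the kernel, which is precisely the overhead stdio's centralized buffering amortizes.

```c
#include <stddef.h>
#include <unistd.h>

/* Read up to len-1 bytes from fd, stopping after '\n'; NUL-terminate.
   Returns the count of bytes stored (0 at EOF), or -1 on error. */
ssize_t read_line(int fd, char *buf, size_t len)
{
    size_t i = 0;

    while (i + 1 < len) {
        char c;
        ssize_t n = read(fd, &c, 1);   /* one system call per byte */
        if (n < 0)
            return -1;                 /* read error */
        if (n == 0)
            break;                     /* EOF */
        buf[i++] = c;
        if (c == '\n')
            break;                     /* end of line */
    }
    buf[i] = '\0';
    return (ssize_t)i;
}
```

Reading ahead in larger blocks would cut the system-call count but would have to hold back bytes belonging to the next consumer — the exact input-stealing hazard Doug describes at the top of the thread.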
* [TUHS] Re: If forking is bad, how about buffering? 2024-05-15 14:42 ` Dan Cross @ 2024-05-15 16:42 ` G. Branden Robinson 2024-05-19 1:04 ` Bakul Shah via TUHS 0 siblings, 1 reply; 20+ messages in thread From: G. Branden Robinson @ 2024-05-15 16:42 UTC (permalink / raw) To: Dan Cross; +Cc: TUHS main list [-- Attachment #1: Type: text/plain, Size: 15473 bytes --] Hi Dan, Thanks for the considered response. I was beginning to fear that my musing was of moronically minimal merit. At 2024-05-15T10:42:33-0400, Dan Cross wrote: > On Tue, May 14, 2024 at 7:10 AM G. Branden Robinson > <g.branden.robinson@gmail.com> wrote: > > [snip] > > Viewpoint 1: Perspective from Pike's Peak > > Clever. If Rob's never heard _that_ one before, I am deeply disappointed. > > Elementary Unix commands should be elementary. Unix is a kernel. > > Programs that do simple things with system calls should remain > > simple. This practices makes the system (the kernel interface) > > easier to learn, and to motivate and justify to others. Programs > > therefore test the simplicity and utility of, and can reveal flaws > > in, the set of primitives that the kernel exposes. This is valuable > > stuff for a research organization. "Research" was right there in > > the CSRC's name. > > I believe this is at once making a more complex argument than was > proffered, and at the same misses the contextual essence that Unix was > created in. My understanding of that context is, "a pleasant environment for software development" (McIlroy)[0]. My notion of software development entails (when not under managerial pressure to bang something together for the exploitation of "market advantage") analysis and reanalysis of software components to make them more efficient and more composable. As a response to the perceived bloat of Multics, the development of the Unix kernel absolutely involved much critical reappraisal of what _needed_ to be in a kernel, and of which services were so essential that they must be offered. 
As a microkernel Kool-Aid drinker, I tend to view Unix's origin in that light, which was reinforced by the severe limitations of the PDP-7 where it was born. Possibly many of the decisions about where to draw the kernel service/userspace service line we made by instinct or seasoned judgment, but the CSRC being a research organization, I'd be surprised if matters of empirical measurement were far from top of mind. It's a shame we don't have more insight into Thompson's development process, especially in those early days. I think we have a tendency to conceive of Unix as having sprung from his fingers already crystallized, like a mineral Athena from the forehead of Zeus. I would wager (and welcome correction if he has the patience) that he made and reversed decisions based on the experience of using the system. Some episodes in McIlroy's "A Research Unix Reader" illustrate that this was a recurring feature of its _later_ development, so why not in the incubation period? That, too, is empirical measurement, even if informal. Many revisions are made in software because we find in testing that something is "too damn slow", or runs the system out of memory too often. So to summarize, I want to push back on your counter here. Making little things to measure system features is a salutary practice in OS development. Stevens's _Advanced Programming in the Unix Environment_ is, shall we say, tricked out with exhibits along these lines. The author's dedication to _measurement_ as opposed to partisan opinion is, I think, a major factor in its status as a landmark work and as nigh-essential reading for the serious Unix developer to this day. Put differently, why would anyone _care_ about making cat(1) simple if one didn't have these objectives in mind? 
> > Viewpoint 2: "I Just Want to Serve 5 Terabytes"[1] > > > > cat(1)'s man page did not advertise the traits in the foregoing > > viewpoint as objectives, and never did.[2] Its avowed purpose was > > to copy, without interruption or separation, 1..n files from storage > > to and output channel or stream (which might be redirected). > > > > I don't need to tell convince that this is a worthwhile application. > > But when we think about the many possible ways--and destinations--a > > person might have in mind for that I/O channel, we have to face the > > necessity of buffering or performance goes through the floor. > > > > It is 1978. Some VMS > > I don't know about that; VMS IO is notably slower than Unix IO by > default. Unlike VMS, Unix uses the buffer cache to serialize access to > the underlying storage device(s). I must confess I have little experience with VMS (and none more recent than 30 years ago) and offered it as an example mainly because it was actually around in 1978 (if still fresh from the foundry). My personal backstory is much more along the lines of my other example, CP/M on toy computers (8-bit data bus pffffffft, right?). > Ironically, caching here is a major win, not just for speed, but to > make it relatively easy to reason about the state of a block, since > that state is removed from the minutiae of the underlying storage > device and instead handled in the bio layer. Treating the block cache > as a fixed-size pool yields a relatively simple state machine for > synchronizing between the in-memory and on-disk representations of > data. I entirely agree with this. I contemplated following up Bakul Shah's post with a mention of Jim Gettys's work on bufferbloat.[1] So let me do that here, and venture the opinion that a "buffer" as popularly conceived and implemented (more or less just a hunk of memory to house data) is too damn dumb a data structure for many of the uses to which it is put. 
If/when people address these problems, they do what the Unix buffer cache did; they elaborate it with state. This is a repeated design pattern: see SIGURG for example. Off the top of my head I perceive three circumstances that buffers often need to manage. 1. Avoidance of underrun. Such were the joys of CD-R burning. But also important in streaming or other real-time applications to avoid interruption. Essentially you want to be able to say, "I'm running out of data at the current rate, please supply more ASAP". 2. Avoidance of overrun. The problems of modem-like flow control are familiar to most. An important insight here, reinforced if not pioneered by Gettys, is that "just making the buffer bigger", the brogrammer solution, is not always the wise choice. 3. Cancellation. Familiar to all as SIGPIPE. Sometimes all of the data in the buffer is invalidated. The sender needs to stop transmitting ASAP, and the receiver can discard whatever it has. I apologize for the armchair approach. I have no doubt that much literature exists that has covered this stuff far more rigorously. And yet much of that knowledge has not made its way down the mountain into practice. That, I think, was at least part of Doug's point. Academics may have considered the topic adequately, but practitioners are too often solving problems as if it's 1972. > >[snip] > > And this, as we all know, is one of the reasons the standard I/O > > library came into existence. Mike Lesk, I surmise, understood that > > the "applications programmer" having knowledge of kernel internals > > was in general neither necessary nor desirable. > > I'm not sure about that. I suspect that the justification _may_ have > been more along the lines of noting that many programs implemented > their own, largely similar buffering strategies, and that it was > preferable to centralize those into a single library, and also noting > that building some kinds of programs was inconvenient using raw system > calls. 
For instance, something like `gets` is handy, An interesting choice given its notoriety as a nuclear landmine of insecurity. ;-) > but is _annoying_ to write using just read(2). It can obviously be > done, but if I don't have to, I'd prefer not to. I think you are justifying why stdio was written _as a library_, as your points seem to be pretty typical examples of why we move code thither from applications. My emphasis is a little different: why was buffered I/O in particular (when it could so easily have been string handling) the nucleus of what would be become a large standard library with its toes in many waters, so huge that projects like uclibc and musl arose for the purpose of (in part) chopping back out the stuff they felt they didn't need? My _claim_ is that stdio.h was the first piece of the library to walk upright because the need for it was most intense. More so than with strings; in fact we've learned that Nelson's original C string library was tricky to use well, was often elaborated by others in unfortunate ways.[7] But there was no I/O at all without going through the kernel, and while there were many ways to get that job done, the best leveraged knowledge of what the kernel had to work with. And yet, the kernel might get redesigned. Could stdio itself have been done better? Korn and Vo tried.[8] > Here's where I think this misses the mark: this focuses too much on > the idea that simple programs exist as to be tests for, and exemplars > of, the kernel system call interface, but what evidence do you have > for that? A little bit of experience, long after the 1970s, of working with automated tests for the seL4 microkernel. > A simpler explanation is that simple programs are easier to > write, easier to read, easier to reason about, test, and examine for > correctness. All certainly true. But these things are just as true of programs that don't directly make system calls at all. 
cat(1), as ideally envisioned by Pike (if I understand the Platonic ideal of his position correctly), not only makes system calls, but dirties its hands with the standard library as little as possible (if you recognize no options, you need neither call nor reimplement getopt(3)) and certainly not for the central task. Again I think we are not so much disagreeing as much as I'm finding out that I didn't adequately emphasize the distinctions I was making. > Unix amplified this with Doug's "garden hoses of data" idea and the > advent of pipes; here, it was found that small, simple programs could > be combined in often surprisingly unanticipated ways. Agreed; but given that pipes-as-a-service are supplied by the _kernel_, we are once again talking about system calls. One of the projects I never got off the ground with seL4 was a reconsideration from first principles of what sorts of more or less POSIXish buffering and piping mechanisms should be offered (in userland of course). For those who are scandalized that a microkernel doesn't offer pipes itself, see this Heiser piece on "IPC" in that system.[2] > Unix built up a philosophy about _how_ to write programs that was > rooted in the problems that were interesting when Unix was first > created. Something we often forget is that research systems are built > to address problems that are interesting _to the researchers who build > them_. I agree. > This context can shape a system, and we see that with Unix: a > highly synchronous system call interface, because overly elaborate > async interfaces were hard to program; And still are, apparently even without the qualifier "overly elaborate". ...though Go (and JavaScript?) fans may disagree. > a simple file abstraction that was easy to use > (open/creat/read/write/close/seek/stat) because files on other > contemporary systems were baroque things that were difficult to use; Absolutely. 
It's a truism in the Unix community that it's possible to simulated record-oriented storage and retrieval on top of a byte stream, but hard to do the converse. Though, being a truism, it might be worthwhile to critically reconsider it and more rigorously establish how we know what we think we know. That's another reason I endorse the microkernel mission. Let's lower the cost of experimentation on parts of the system that of themselves don't demand privilege. It's a highly concurrent, NUMA world out there. > a simple primitive for the creation of processes because, again, on > other systems processes were very heavy, complicated things that were > difficult to use. It is with some dismay that I look at what they are, _on Unix_, today. https://github.com/torvalds/linux/blob/1b294a1f35616977caddaddf3e9d28e576a1adbc/include/linux/sched.h#L748 https://github.com/openbsd/src/blob/master/sys/sys/proc.h#L138 Contrast: https://github.com/jeffallen/xv6/blob/master/proc.h#L65 > Unix took problems related to IO and processes and made them easy. By > the 80s, these were pretty well understood, so focus shifted to other > things (languages, networking, etc). True, but beside my point. Pike's point about cat and its flags was, I think, a call to reconsider more fundamental things. To question what we thought we knew--about how best to design core components of the system, for example. Do we really need the efflorescence of options that perfuses not simply the GNU versions of such components (a popular sink for abuse), but Busybox and *BSD implementations as well? Every developer of such a component should consider the cost/benefit ratio of flags, and then RE-consider them at intervals. Even at the cost of backward compatibility. (Deprecation cycles and mitigation/migration plans are good.) > Unix is one of those rare beasts that escaped the lab and made it out > there in the wild. 
> It became the workhorse that beget a whole two or
> three generations of commercial work; it's unsurprising that when the
> web explosion happened, Unix became the basis for it: it was there, it
> was familiar, and by then it wasn't a research project anymore, but a
> basis for serious commercial work.

Yes, and in a sense this success has cost all of us.[3][4][5]

> That it has retained the original system call interface is almost
> incidental;

In _structure_, sure; in detail, I'm not sure this claim withstands
scrutiny. Just _count_ the system calls we have today vs. V6 or V7.

> perhaps that fits with your brocolli-man analogy.

I'm unfamiliar with this metaphor. It makes me wonder how to place it
in company with the requirements documents that led to the Ada
language: Strawman, Woodenman, Ironman, and Steelman. At least it's
likely better eating than any of those. ;-)

Since no one else ever says it on this list, let me point out what a
terrific and unfairly maligned language Ada is. In reading the minutes
of the latest WG14 meeting[6] I marvel anew at how C has over time
slowly, slowly accreted type- and memory-safety features that Ada had
in 1983 (or even in 1980, before its formal standardization).

Regards,
Branden

[0] https://www.gnu.org/software/groff/manual/groff.html.node/Background.html
[1] https://gettys.wordpress.com/category/bufferbloat/
[2] https://microkerneldude.org/2019/03/07/how-to-and-how-not-to-use-sel4-ipc/
[3] https://tianyin.github.io/misc/irrelevant.pdf (guess who)
[4] https://www.youtube.com/watch?v=36myc8wQhLo (Timothy Roscoe)
[5] https://queue.acm.org/detail.cfm?id=3212479 (David Chisnall)
[6] https://www.open-std.org/JTC1/sc22/wg14/www/docs/n3227.htm
    Skip down to section 5. Note particularly `_Optional`.
[7] https://www.symas.com/post/the-sad-state-of-c-strings
[8] https://www.semanticscholar.org/paper/SFIO%3A-Safe-Fast-String-File-IO-Korn-Vo/8014266693afda38a0a177a9b434fedce98eb7de

^ permalink raw reply [flat|nested] 20+ messages in thread
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-15 16:42         ` G. Branden Robinson
@ 2024-05-19  1:04           ` Bakul Shah via TUHS
  2024-05-19  1:21             ` Larry McVoy
  2024-05-19 16:04             ` Paul Winalski
  0 siblings, 2 replies; 20+ messages in thread
From: Bakul Shah via TUHS @ 2024-05-19  1:04 UTC (permalink / raw)
To: G. Branden Robinson; +Cc: The Unix Heritage Society mailing list

On May 15, 2024, at 9:42 AM, G. Branden Robinson <g.branden.robinson@gmail.com> wrote:
>
> I contemplated following up Bakul Shah's post with a mention of Jim
> Gettys's work on bufferbloat.[1] So let me do that here, and venture
> the opinion that a "buffer" as popularly conceived and implemented
> (more or less just a hunk of memory to house data) is too damn dumb a
> data structure for many of the uses to which it is put.

Note that even if you remove every RAM buffer between the two endpoints
of a TCP connection, you still have a "buffer". Example: if you have a
1Gbps pipe between SF & NYC, the pipe itself can store something like
3.5MB to 4MB in each direction! As the pipe can be lossy, you have to
buffer up N (= bandwidth*latency) bytes at the sending end (until you
see an ack for the previous Nth byte) if you want to utilize the full
bandwidth. Now what happens if the sender program exits right after
sending the last byte? Something on behalf of the sender has to buffer
up and stick around to complete the TCP dance. Even if the sender is
cat -u, the kernel or a network daemon process atop a microkernel has
to buffer this data[1].

Unfortunately you can't abolish latency! But where to put buffers is
certainly an engineering choice that can impact compositionality or
other problems such as bufferbloat.

[1] This brings up a separate point: in a microkernel even a simple
thing like "foo | bar" would require a third process - a "pipe
service" - to buffer up the output of foo! You may have reduced the
overhead of individual syscalls but you will have more cross-domain
calls!
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  1:04           ` Bakul Shah via TUHS
@ 2024-05-19  1:21             ` Larry McVoy
  2024-05-19  1:26               ` Serissa
                                 ` (2 more replies)
  2024-05-19 16:04             ` Paul Winalski
  1 sibling, 3 replies; 20+ messages in thread
From: Larry McVoy @ 2024-05-19  1:21 UTC (permalink / raw)
To: Bakul Shah; +Cc: The Unix Heritage Society mailing list

On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
> [1] This brings up a separate point: in a microkernel even a simple
> thing like "foo | bar" would require a third process - a "pipe
> service", to buffer up the output of foo! You may have reduced
> the overhead of individual syscalls but you will have more of
> cross-domain calls!

Do any micro kernels do address space to address space bcopy()?
--
---
Larry McVoy        Retired to fishing        http://www.mcvoy.com/lm/boat
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  1:21             ` Larry McVoy
@ 2024-05-19  1:26               ` Serissa
  2024-05-19  1:40               ` Bakul Shah via TUHS
  2024-05-19  2:26               ` Andrew Warkentin
  2 siblings, 0 replies; 20+ messages in thread
From: Serissa @ 2024-05-19  1:26 UTC (permalink / raw)
To: Larry McVoy; +Cc: Bakul Shah, The Unix Heritage Society mailing list

MIT's FOS (Factored Operating System) research OS did cross-address-space
copies as part of its messaging machinery. HPC networking does this by
using shared memory (Cross Memory Attach and XPMEM) in a traditional
kernel.

-L

> On May 18, 2024, at 9:21 PM, Larry McVoy <lm@mcvoy.com> wrote:
>
> On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
>> [1] This brings up a separate point: in a microkernel even a simple
>> thing like "foo | bar" would require a third process - a "pipe
>> service", to buffer up the output of foo! You may have reduced
>> the overhead of individual syscalls but you will have more of
>> cross-domain calls!
>
> Do any micro kernels do address space to address space bcopy()?
> --
> ---
> Larry McVoy        Retired to fishing        http://www.mcvoy.com/lm/boat
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  1:21             ` Larry McVoy
  2024-05-19  1:26               ` Serissa
@ 2024-05-19  1:40               ` Bakul Shah via TUHS
  2024-05-19  1:50                 ` Bakul Shah via TUHS
  2024-05-19  2:02                 ` Larry McVoy
  2024-05-19  2:26               ` Andrew Warkentin
  2 siblings, 2 replies; 20+ messages in thread
From: Bakul Shah via TUHS @ 2024-05-19  1:40 UTC (permalink / raw)
To: Larry McVoy; +Cc: The Unix Heritage Society mailing list

On May 18, 2024, at 6:21 PM, Larry McVoy <lm@mcvoy.com> wrote:
>
> On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
>> [1] This brings up a separate point: in a microkernel even a simple
>> thing like "foo | bar" would require a third process - a "pipe
>> service", to buffer up the output of foo! You may have reduced
>> the overhead of individual syscalls but you will have more of
>> cross-domain calls!
>
> Do any micro kernels do address space to address space bcopy()?

mmapping the same page in two processes won't be hard but now
you have complicated cat (or some iolib)!
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  1:40               ` Bakul Shah via TUHS
@ 2024-05-19  1:50                 ` Bakul Shah via TUHS
  2024-05-19  2:02                 ` Larry McVoy
  1 sibling, 0 replies; 20+ messages in thread
From: Bakul Shah via TUHS @ 2024-05-19  1:50 UTC (permalink / raw)
To: Larry McVoy; +Cc: The Unix Heritage Society mailing list

On May 18, 2024, at 6:40 PM, Bakul Shah <bakul@iitbombay.org> wrote:
>
> On May 18, 2024, at 6:21 PM, Larry McVoy <lm@mcvoy.com> wrote:
>>
>> On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
>>> [1] This brings up a separate point: in a microkernel even a simple
>>> thing like "foo | bar" would require a third process - a "pipe
>>> service", to buffer up the output of foo! You may have reduced
>>> the overhead of individual syscalls but you will have more of
>>> cross-domain calls!
>>
>> Do any micro kernels do address space to address space bcopy()?
>
> mmapping the same page in two processes won't be hard but now
> you have complicated cat (or some iolib)!

And there are other issues. As Doug said in his original message in
this thread: "And input buffering must never ingest data that the
program will not eventually use." Consider something like this:

    (echo 1; echo 2)|(read; cat)

This will print 2. Emulating this with mmaped buffers and copying will
not be easy....
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  1:40               ` Bakul Shah via TUHS
  2024-05-19  1:50                 ` Bakul Shah via TUHS
@ 2024-05-19  2:02                 ` Larry McVoy
  2024-05-19  2:28                   ` Bakul Shah via TUHS
  2024-05-19  2:53                   ` Andrew Warkentin
  1 sibling, 2 replies; 20+ messages in thread
From: Larry McVoy @ 2024-05-19  2:02 UTC (permalink / raw)
To: Bakul Shah; +Cc: The Unix Heritage Society mailing list

On Sat, May 18, 2024 at 06:40:42PM -0700, Bakul Shah wrote:
> On May 18, 2024, at 6:21 PM, Larry McVoy <lm@mcvoy.com> wrote:
>>
>> On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
>>> [1] This brings up a separate point: in a microkernel even a simple
>>> thing like "foo | bar" would require a third process - a "pipe
>>> service", to buffer up the output of foo! You may have reduced
>>> the overhead of individual syscalls but you will have more of
>>> cross-domain calls!
>>
>> Do any micro kernels do address space to address space bcopy()?
>
> mmapping the same page in two processes won't be hard but now
> you have complicated cat (or some iolib)!

I recall asking Linus if that could be done to save TLB entries, as in
multiple processes map a portion of their address space (at the same
virtual location) and then they all use the same TLB entries for that
part of their address space. He said it couldn't be done because the
process ID concept was hard wired into the TLB. I don't know if TLB
tech has evolved such that a single process could have multiple
"process" IDs associated with it in the TLB.

I wanted it because if you could share part of your address space with
another process, using the same TLB entries, then the motivation for
threads could go away (I've never been a threads fan but I acknowledge
why you might need them). I was channeling Rob's "If you think you need
threads, your processes are too fat".

The idea of using processes instead of threads falls down when you
consider TLB usage. And TLB usage, when you care about performance, is
an issue. I could craft you some realistic benchmarks, mirroring real
world work loads, that would kill the idea of replacing threads with
processes unless they shared TLB entries. Think of an N-way threaded
application, lots of address space used, that application uses all of
the TLB. Now do that with N processes and your TLB is N times less
effective.

This was a conversation decades ago so maybe TLB tech has solved this
by now. I doubt it; if this were a solved problem I think every OS
would say screw threads, just use processes and mmap(). The nice part
of that model is you can choose what parts of your address space you
want to share. That cuts out a HUGE swath of potential problems where
another thread can go poke in a part of your address space that you
don't want poked.
--
---
Larry McVoy        Retired to fishing        http://www.mcvoy.com/lm/boat
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  2:02                 ` Larry McVoy
@ 2024-05-19  2:28                   ` Bakul Shah via TUHS
  2024-05-19  2:53                   ` Andrew Warkentin
  1 sibling, 0 replies; 20+ messages in thread
From: Bakul Shah via TUHS @ 2024-05-19  2:28 UTC (permalink / raw)
To: Larry McVoy; +Cc: The Unix Heritage Society mailing list

On May 18, 2024, at 7:02 PM, Larry McVoy <lm@mcvoy.com> wrote:
>
> I recall asking Linus if that could be done to save TLB entries, as in
> multiple processes map a portion of their address space (at the same
> virtual location) and then they all use the same TLB entries for that
> part of their address space. He said it couldn't be done because the
> process ID concept was hard wired into the TLB. I don't know if TLB
> tech has evolved such that a single process could have multiple
> "process" IDs associated with it in the TLB.

Two TLB entries can point to the same physical page. Is that not good
enough? One process can give its address space a..b and the kernel (or
the memory daemon) maps a..b to the other process's a'..b'. a..b may be
associated with a file, so any IO would have to be seen by both.

> I wanted it because if you could share part of your address space with
> another process, using the same TLB entries, then motivation for threads
> could go away (I've never been a threads fan but I acknowledge why
> you might need them). I was channeling Rob's "If you think you need
> threads, your processes are too fat".
>
> The idea of using processes instead of threads falls down when you
> consider TLB usage. And TLB usage, when you care about performance, is
> an issue. I could craft you some realistic benchmarks, mirroring real
> world work loads, that would kill the idea of replacing threads with
> processes unless they shared TLB entries. Think of a N-way threaded
> application, lots of address space used, that application uses all of the
> TLB. Now do that with N processes and your TLB is N times less effective.
>
> This was a conversation decades ago so maybe TLB tech now has solved this.
> I doubt it, if this was a solved problem I think every OS would say screw
> threads, just use processes and mmap(). The nice part of that model
> is you can choose what parts of your address space you want to share.
> That cuts out a HUGE swath of potential problems where another thread
> can go poke in a part of your address space that you don't want poked.

You can sort of evolve plan9's rfork to do a partial address share. The
issue with process vs thread is the context switch time. Sharing pages
doesn't change that.
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  2:02                 ` Larry McVoy
  2024-05-19  2:28                   ` Bakul Shah via TUHS
@ 2024-05-19  2:53                   ` Andrew Warkentin
  2024-05-19  8:30                     ` Marc Rochkind
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Warkentin @ 2024-05-19  2:53 UTC (permalink / raw)
To: The Unix Heritage Society mailing list

On Sat, May 18, 2024 at 8:03 PM Larry McVoy <lm@mcvoy.com> wrote:
>
> On Sat, May 18, 2024 at 06:40:42PM -0700, Bakul Shah wrote:
> > On May 18, 2024, at 6:21 PM, Larry McVoy <lm@mcvoy.com> wrote:
> > >
> > > On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
> > >> [1] This brings up a separate point: in a microkernel even a simple
> > >> thing like "foo | bar" would require a third process - a "pipe
> > >> service", to buffer up the output of foo! You may have reduced
> > >> the overhead of individual syscalls but you will have more of
> > >> cross-domain calls!
> > >
> > > Do any micro kernels do address space to address space bcopy()?
> >
> > mmapping the same page in two processes won't be hard but now
> > you have complicated cat (or some iolib)!
>
> I recall asking Linus if that could be done to save TLB entries, as in
> multiple processes map a portion of their address space (at the same
> virtual location) and then they all use the same TLB entries for that
> part of their address space. He said it couldn't be done because the
> process ID concept was hard wired into the TLB. I don't know if TLB
> tech has evolved such that a single process could have multiple "process"
> IDs associated with it in the TLB.
>
> I wanted it because if you could share part of your address space with
> another process, using the same TLB entries, then motivation for threads
> could go away (I've never been a threads fan but I acknowledge why
> you might need them). I was channeling Rob's "If you think you need
> threads, your processes are too fat".
>
> The idea of using processes instead of threads falls down when you
> consider TLB usage. And TLB usage, when you care about performance, is
> an issue. I could craft you some realistic benchmarks, mirroring real
> world work loads, that would kill the idea of replacing threads with
> processes unless they shared TLB entries. Think of a N-way threaded
> application, lots of address space used, that application uses all of the
> TLB. Now do that with N processes and your TLB is N times less effective.
>
> This was a conversation decades ago so maybe TLB tech now has solved this.
> I doubt it, if this was a solved problem I think every OS would say screw
> threads, just use processes and mmap(). The nice part of that model
> is you can choose what parts of your address space you want to share.
> That cuts out a HUGE swath of potential problems where another thread
> can go poke in a part of your address space that you don't want poked.
>

I've never been a fan of the rfork()/clone() model. With the OS I'm
working on, rather than using processes that share state as threads, a
process will more or less just be a collection of threads that share a
command line and get replaced on exec(). All of the state usually
associated with a process (e.g. file descriptor space, filesystem
namespace, virtual address space, memory allocations) will instead be
stored in separate container objects that can be shared between
threads. It will be possible to share any of these containers between
processes, or use different combinations between threads within a
process. This would allow more control over what gets shared between
threads/processes than rfork()/clone() because the state containers
will appear in the filesystem and be explicitly bound to threads
rather than being anonymous and only transferred on rfork()/clone().
Emulating rfork()/clone() on top of this will be easy enough though.
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  2:53                   ` Andrew Warkentin
@ 2024-05-19  8:30                     ` Marc Rochkind
  0 siblings, 0 replies; 20+ messages in thread
From: Marc Rochkind @ 2024-05-19  8:30 UTC (permalink / raw)
Cc: The Unix Heritage Society mailing list

Yes, many classic commands -- cat, cp, and others -- were sleekly and
succinctly written, in part because they were devoid of error checking.
I recall how annoying it was one time in the early 70s to cp a bunch of
files to a file system that was out of space.

As I grew older, my concept of what constituted elegant programming
changed. UNIX was a *research* project, not a production system! At one
of the first UNIX meetings, somebody from an OSS (operations support
system) was talking about the limitations of UNIX when Doug asked,
"Why are you using UNIX?"

Marc

On Sun, May 19, 2024, 5:54 AM Andrew Warkentin <andreww591@gmail.com> wrote:
> On Sat, May 18, 2024 at 8:03 PM Larry McVoy <lm@mcvoy.com> wrote:
> >
> > On Sat, May 18, 2024 at 06:40:42PM -0700, Bakul Shah wrote:
> > > On May 18, 2024, at 6:21 PM, Larry McVoy <lm@mcvoy.com> wrote:
> > > >
> > > > On Sat, May 18, 2024 at 06:04:23PM -0700, Bakul Shah via TUHS wrote:
> > > >> [1] This brings up a separate point: in a microkernel even a simple
> > > >> thing like "foo | bar" would require a third process - a "pipe
> > > >> service", to buffer up the output of foo! You may have reduced
> > > >> the overhead of individual syscalls but you will have more of
> > > >> cross-domain calls!
> > > >
> > > > Do any micro kernels do address space to address space bcopy()?
> > >
> > > mmapping the same page in two processes won't be hard but now
> > > you have complicated cat (or some iolib)!
> >
> > I recall asking Linus if that could be done to save TLB entries, as in
> > multiple processes map a portion of their address space (at the same
> > virtual location) and then they all use the same TLB entries for that
> > part of their address space. He said it couldn't be done because the
> > process ID concept was hard wired into the TLB. I don't know if TLB
> > tech has evolved such that a single process could have multiple "process"
> > IDs associated with it in the TLB.
> >
> > I wanted it because if you could share part of your address space with
> > another process, using the same TLB entries, then motivation for threads
> > could go away (I've never been a threads fan but I acknowledge why
> > you might need them). I was channeling Rob's "If you think you need
> > threads, your processes are too fat".
> >
> > The idea of using processes instead of threads falls down when you
> > consider TLB usage. And TLB usage, when you care about performance, is
> > an issue. I could craft you some realistic benchmarks, mirroring real
> > world work loads, that would kill the idea of replacing threads with
> > processes unless they shared TLB entries. Think of a N-way threaded
> > application, lots of address space used, that application uses all of the
> > TLB. Now do that with N processes and your TLB is N times less effective.
> >
> > This was a conversation decades ago so maybe TLB tech now has solved this.
> > I doubt it, if this was a solved problem I think every OS would say screw
> > threads, just use processes and mmap(). The nice part of that model
> > is you can choose what parts of your address space you want to share.
> > That cuts out a HUGE swath of potential problems where another thread
> > can go poke in a part of your address space that you don't want poked.
> >
>
> I've never been a fan of the rfork()/clone() model. With the OS I'm
> working on, rather than using processes that share state as threads, a
> process will more or less just be a collection of threads that share a
> command line and get replaced on exec(). All of the state usually
> associated with a process (e.g. file descriptor space, filesystem
> namespace, virtual address space, memory allocations) will instead be
> stored in separate container objects that can be shared between
> threads. It will be possible to share any of these containers between
> processes, or use different combinations between threads within a
> process. This would allow more control over what gets shared between
> threads/processes than rfork()/clone() because the state containers
> will appear in the filesystem and be explicitly bound to threads
> rather than being anonymous and only transferred on rfork()/clone().
> Emulating rfork()/clone() on top of this will be easy enough though.
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  1:21             ` Larry McVoy
  2024-05-19  1:26               ` Serissa
  2024-05-19  1:40               ` Bakul Shah via TUHS
@ 2024-05-19  2:26               ` Andrew Warkentin
  2 siblings, 0 replies; 20+ messages in thread
From: Andrew Warkentin @ 2024-05-19  2:26 UTC (permalink / raw)
To: The Unix Heritage Society mailing list

On Sat, May 18, 2024 at 7:27 PM Larry McVoy <lm@mcvoy.com> wrote:
>
> Do any micro kernels do address space to address space bcopy()?
>

QNX and some L4-like kernels copy directly between address spaces. QNX
copies between readv()/writev()-style vectors of arbitrary length.
L4-like kernels have different forms of direct copy; Pistachio
supports copying between a collection of "strings" that are limited to
4M each. seL4 on the other hand is limited to a single page-sized
fixed buffer for each thread (I've been working on an as-yet unnamed
fork of it that supports QNX-like vectors for the OS I'm working on; I
gave up on my previous plan to use async queues and intermediary
buffers to support arbitrary-length messages in user space, since that
was turning out to be rather ugly and would have had a high risk of
priority inversion).
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-19  1:04           ` Bakul Shah via TUHS
  2024-05-19  1:21             ` Larry McVoy
@ 2024-05-19 16:04             ` Paul Winalski
  1 sibling, 0 replies; 20+ messages in thread
From: Paul Winalski @ 2024-05-19 16:04 UTC (permalink / raw)
To: Bakul Shah; +Cc: The Unix Heritage Society mailing list

On Sat, May 18, 2024 at 9:04 PM Bakul Shah via TUHS <tuhs@tuhs.org> wrote:
>
> Note that even if you remove every RAM buffer between the two
> endpoints of a TCP connection, you still have a "buffer".

True, and it's unavoidable. The full name of the virtual circuit
communication protocol is TCP/IP (Transmission Control Protocol over
Internet Protocol). The underlying IP is the protocol used to actually
transfer the data from machine to machine. It provides datagram
service, meaning that messages may be duplicated, lost, delivered out
of order, or delivered with errors. The job of TCP is to provide
virtual circuit service, meaning that messages are delivered once, in
order, without errors, and reliably.

To cope with the underlying datagram service, TCP has to put error
checksums on each message, assign sequence numbers to each message, and
has to send an acknowledgement to the sender when a message is
received. It also has to be prepared to resend messages if there's no
acknowledgement or if the ack says the message was received with
errors. You can't do all that without buffering messages.

-Paul W.
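The sender-side obligation Paul describes -- keep every message until
its acknowledgement arrives, so it can be resent -- is exactly where
the unavoidable buffer lives. A toy stop-and-wait sketch (one
outstanding segment; real TCP keeps a whole window, and none of these
names come from any real stack):

```c
#include <stdbool.h>
#include <string.h>

/* Toy stop-and-wait sender state: the segment stays buffered until
 * its ACK arrives, because a timeout must be able to retransmit it. */
struct segment {
    unsigned seq;
    char     data[512];
    size_t   len;
    bool     acked;
};

static struct segment unacked;     /* real TCP keeps a window of these */

void send_segment(unsigned seq, const char *data, size_t len)
{
    unacked.seq   = seq;
    unacked.len   = len;
    unacked.acked = false;
    memcpy(unacked.data, data, len);   /* the unavoidable copy */
    /* ... hand the bytes (plus checksum and seq) to the wire ... */
}

void on_ack(unsigned ack_seq)
{
    if (ack_seq == unacked.seq)
        unacked.acked = true;          /* only now may the copy be dropped */
}

bool on_timeout(void)                  /* returns true if a resend happened */
{
    if (unacked.acked)
        return false;
    /* retransmit unacked.data -- this is why the sender's data must
     * outlive the write() that produced it */
    return true;
}
```

Note the buffer persists past the application's write; this is Bakul's
point above that something must "stick around to complete the TCP
dance" even after the sender exits.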
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-14  7:10 ` Rob Pike
  2024-05-14 11:10   ` G. Branden Robinson
@ 2024-05-14 22:08 ` George Michaelson
  1 sibling, 0 replies; 20+ messages in thread
From: George Michaelson @ 2024-05-14 22:08 UTC (permalink / raw)
To: TUHS main list

Maybe dd is the right place to decide how to buffer? It appears to
understand that's part of its role.

I use mbuffer, and I have absolutely no idea if its proffered buffer,
scatter/gather, SETSOCKOPT behaviour does or does not improve things,
but I use it, even though netcat exists...

G
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-13 13:34 [TUHS] If forking is bad, how about buffering? Douglas McIlroy
  2024-05-13 22:01 ` [TUHS] " Andrew Warkentin
  2024-05-14  7:10 ` Rob Pike
@ 2024-05-14 22:34 ` Bakul Shah via TUHS
  2024-05-19 10:41 ` Ralph Corderoy
  3 siblings, 0 replies; 20+ messages in thread
From: Bakul Shah via TUHS @ 2024-05-14 22:34 UTC (permalink / raw)
To: Douglas McIlroy; +Cc: The Unix Heritage Society mailing list

Buffering is used all over the place. Even serial devices use a 16-byte
buffer -- all to reduce the cost of per-unit (character, disk block or
packet etc.) processing, or to smooth data flow, or to utilize the
available bandwidth. But in such applications the receiver/sender
usually has a way of getting an alert when the FIFO has data/is empty.
As long as you provide that, you can compose more complex networks of
components. Imagine components connected via FIFOs that provide empty,
almost-empty, almost-full, and full signals -- and maybe more in case
of lossy connections. [Though at a lower level you'd model these FIFOs
as components too, so at that level there'd be *no* buffering! Sort of
like Carl Hewitt's Actor model!] Your complaint seems more about how
buffers are currently used and where the "network" of components is
dynamically formed.

> On May 13, 2024, at 6:34 AM, Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
>
> So fork() is a significant nuisance. How about the far more ubiquitous
> problem of IO buffering?
>
> On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> > But it does come down to the same argument as
> > https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
>
> The Microsoft manifesto says that fork() is an evil hack. One of the
> cited evils is that one must remember to flush output buffers before
> forking, for fear it will be emitted twice. But buffering is the
> culprit, not the victim. Output buffers must be flushed for many other
> reasons: to avoid deadlock; to force prompt delivery of urgent output;
> to keep output from being lost in case of a subsequent failure. Input
> buffers can also steal data by reading ahead into stuff that should go
> to another consumer. In all these cases buffering can break
> compositionality. Yet the manifesto blames an instance of the hazard
> on fork()!
>
> To assure compositionality, one must flush output buffers at every
> possible point where an unknown downstream consumer might correctly
> act on the received data with observable results. And input buffering
> must never ingest data that the program will not eventually use. These
> are tough criteria to meet in general without sacrificing buffering.
>
> The advent of pipes vividly exposed the non-compositionality of output
> buffering. Interactive pipelines froze when users could not provide
> input that would force stuff to be flushed until the input was
> informed by that very stuff. This phenomenon motivated cat -u, and
> stdio's convention of line buffering for stdout. The premier example
> of input buffering eating other programs' data was mitigated by "here
> documents" in the Bourne shell.
>
> These precautions are mere fig leaves that conceal important special
> cases. The underlying evil of buffered IO still lurks. The
> justification is that it's necessary to match the characteristics of
> IO devices and to minimize system-call overhead. The former necessity
> requires the attention of hardware designers, but the latter is in the
> hands of programmers. What can be done to mitigate the pain of
> border-crossing into the kernel? L4 and its ilk have taken a whack. An
> even more radical approach might flow from the "whitepaper" at
> www.codevalley.com.
>
> In any event the abolition of buffering is a grand challenge.
>
> Doug
* [TUHS] Re: If forking is bad, how about buffering?
  2024-05-13 13:34 [TUHS] If forking is bad, how about buffering? Douglas McIlroy
                   ` (2 preceding siblings ...)
  2024-05-14 22:34 ` Bakul Shah via TUHS
@ 2024-05-19 10:41 ` Ralph Corderoy
  3 siblings, 0 replies; 20+ messages in thread
From: Ralph Corderoy @ 2024-05-19 10:41 UTC (permalink / raw)
To: TUHS main list

Hi,

Doug wrote:
> The underlying evil of buffered IO still lurks. The justification is
> that it's necessary to match the characteristics of IO devices and to
> minimize system-call overhead. The former necessity requires the
> attention of hardware designers, but the latter is in the hands of
> programmers. What can be done to mitigate the pain of border-crossing
> into the kernel?

Has there been any system-on-chip experimentation with hardware
‘pipes’? They have FIFOs for UARTs. What about FIFO hardware tracking
the content of shared memory?

Registers can be written to give the base address and buffer size.
Various watermarks can be set: every byte as it arrives versus ‘It's
not worth getting out of bed for less than 64 KiB’. Read-only registers
would allow polling when the buffer is full or empty, or a ‘device’
could be configured to interrupt. Trying to read/write a byte which
wasn't ‘yours’ would trap.

It would be two cores synchronising without the kernel, thanks to
hardware.

--
Cheers, Ralph.
end of thread, other threads:[~2024-05-19 16:05 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-13 13:34 [TUHS] If forking is bad, how about buffering? Douglas McIlroy
2024-05-13 22:01 ` [TUHS] " Andrew Warkentin
2024-05-14  7:10 ` Rob Pike
2024-05-14 11:10   ` G. Branden Robinson
2024-05-15 14:42     ` Dan Cross
2024-05-15 16:42       ` G. Branden Robinson
2024-05-19  1:04         ` Bakul Shah via TUHS
2024-05-19  1:21           ` Larry McVoy
2024-05-19  1:26             ` Serissa
2024-05-19  1:40             ` Bakul Shah via TUHS
2024-05-19  1:50               ` Bakul Shah via TUHS
2024-05-19  2:02               ` Larry McVoy
2024-05-19  2:28                 ` Bakul Shah via TUHS
2024-05-19  2:53                 ` Andrew Warkentin
2024-05-19  8:30                   ` Marc Rochkind
2024-05-19  2:26             ` Andrew Warkentin
2024-05-19 16:04           ` Paul Winalski
2024-05-14 22:08 ` George Michaelson
2024-05-14 22:34 ` Bakul Shah via TUHS
2024-05-19 10:41 ` Ralph Corderoy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).