Buffering is used all over the place. Even serial devices use a 16-byte buffer -- all to reduce per-unit (character, disk block, packet, etc.) processing cost, to smooth data flow, or to make full use of the available bandwidth. But in such applications the receiver or sender usually has a way of being alerted when the FIFO has data or is empty. As long as you provide that, you can compose more complex networks of components. Imagine components connected via FIFOs that provide empty, almost-empty, almost-full, and full signals -- and maybe more in the case of lossy connections; a sketch follows below. [Though at a lower level you'd model these FIFOs as components too, so at that level there'd be *no* buffering! Sort of like Carl Hewitt's Actor model!]
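
For concreteness, here is a minimal sketch in C of a bounded FIFO component that reports such fill-level signals to its neighbors. The names and thresholds are illustrative, not taken from any particular hardware or library:

    /* Minimal sketch of a bounded FIFO that exposes fill-level
       signals to its neighbors (names and thresholds illustrative). */
    #include <stddef.h>

    enum fifo_signal { FIFO_EMPTY, FIFO_ALMOST_EMPTY, FIFO_NORMAL,
                       FIFO_ALMOST_FULL, FIFO_FULL };

    struct fifo {
        unsigned char buf[16];   /* 16-byte buffer, as in a serial UART */
        size_t head, tail, count;
        size_t lo_mark, hi_mark; /* thresholds for the "almost" signals */
    };

    static enum fifo_signal fifo_status(const struct fifo *f)
    {
        if (f->count == 0)             return FIFO_EMPTY;
        if (f->count == sizeof f->buf) return FIFO_FULL;
        if (f->count <= f->lo_mark)    return FIFO_ALMOST_EMPTY;
        if (f->count >= f->hi_mark)    return FIFO_ALMOST_FULL;
        return FIFO_NORMAL;
    }

    /* A producer holds off when fifo_status() == FIFO_FULL; a consumer
       blocks (or is signalled) on FIFO_EMPTY.  The "almost" levels give
       early warning so neighbors can keep data moving without overrun. */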

Your complaint seems to be more about how buffers are currently used, and about cases where the "network" of components is formed dynamically.

On May 13, 2024, at 6:34 AM, Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

So fork() is a significant nuisance. How about the far more ubiquitous problem of IO buffering?

On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> But it does come down to the same argument as
> https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf

The Microsoft manifesto says that fork() is an evil hack. One of the cited evils is that one must remember to flush output buffers before forking, for fear it will be emitted twice. But buffering is the culprit, not the victim. Output buffers must be flushed for many other reasons: to avoid deadlock; to force prompt delivery of urgent output; to keep output from being lost in case of a subsequent failure. Input buffers can also steal data by reading ahead into stuff that should go to another consumer. In all these cases buffering can break compositionality. Yet the manifesto blames an instance of the hazard on fork()! 
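
For concreteness, a minimal sketch of that hazard (assuming stdout is redirected to a file or pipe, so stdio buffers it fully; on a terminal line buffering hides the effect):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        printf("hello\n");     /* stays in the stdio buffer when stdout
                                  is a file or pipe (fully buffered)   */
        /* fflush(stdout);        the precaution one must remember      */
        if (fork() == 0)
            exit(0);           /* child's exit flushes its copy of the buffer */
        wait(NULL);
        return 0;              /* parent's exit flushes the same bytes again:
                                  "./a.out > out" leaves "hello" in out twice */
    }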

To assure compositionality, one must flush output buffers at every possible point where an unknown downstream consumer might correctly act on the received data with observable results. And input buffering must never ingest data that the program will not eventually use. These are tough criteria to meet in general without sacrificing buffering.

The advent of pipes vividly exposed the non-compositionality of output buffering. Interactive pipelines froze when users could not provide input that would force stuff to be flushed until the input was informed by that very stuff. This phenomenon motivated cat -u, and stdio's convention of line buffering for stdout. The premier example of input buffering eating other programs' data was mitigated by "here documents" in the Bourne shell.
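
A minimal sketch of that freeze, in C. The fflush() is exactly the kind of fig-leaf precaution in question: without it, a reader on the other end of a pipe never sees the prompt, never answers, and both sides wait forever:

    #include <stdio.h>

    int main(void)
    {
        char line[256];

        fputs("enter your name: ", stdout);  /* no newline, so even line
                                                buffering won't flush it */
        fflush(stdout);                      /* omit this and a consumer
                                                across a pipe sees nothing */
        if (fgets(line, sizeof line, stdin))
            printf("hello, %s", line);
        return 0;
    }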

These precautions are mere fig leaves that conceal important special cases. The underlying evil of buffered IO still lurks. The justification is that it's necessary to match the characteristics of IO devices and to minimize system-call overhead.  The former necessity requires the attention of hardware designers, but the latter is in the hands of programmers. What can be done to mitigate the pain of border-crossing into the kernel? L4 and its ilk have taken a whack. An even more radical approach might flow from the "whitepaper" at www.codevalley.com.

In any event, the abolition of buffering is a grand challenge.

Doug