The Unix Heritage Society mailing list
From: Rob Pike <robpike@gmail.com>
To: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: TUHS main list <tuhs@tuhs.org>
Subject: [TUHS] Re: If forking is bad, how about buffering?
Date: Tue, 14 May 2024 17:10:38 +1000
Message-ID: <CAKzdPgwr6=vND7vF-3+Amof=WEf6fqCN2gOsPmXB0_9Gy9U_rA@mail.gmail.com>
In-Reply-To: <CAKH6PiWr1YXhHUrgy=NW5gerDGrBnxMYEPtv1x1ho+n4FmFUzw@mail.gmail.com>

I agree with your (as usual) perceptive analysis. Only stopping by to point
out that I took the buffering out of cat. I didn't have your perspicacity
on why it should happen, just a desire to remove all the damn flags. When I
was done, cat.c was 35 lines long. Do a read, do a write, continue until
EOF. Guess what? That's all you need if you want to cat files.

Sad to say Bell Labs's cat door was hard to open and most of the world
still has a cat with flags. And buffers.

-rob


On Mon, May 13, 2024 at 11:35 PM Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> So fork() is a significant nuisance. How about the far more ubiquitous
> problem of IO buffering?
>
> On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> > But it does come down to the same argument as
> >
> https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
>
> The Microsoft manifesto says that fork() is an evil hack. One of the cited
> evils is that one must remember to flush output buffers before forking, for
> fear it will be emitted twice. But buffering is the culprit, not the
> victim. Output buffers must be flushed for many other reasons: to avoid
> deadlock; to force prompt delivery of urgent output; to keep output from
> being lost in case of a subsequent failure. Input buffers can also steal
> data by reading ahead into stuff that should go to another consumer. In all
> these cases buffering can break compositionality. Yet the manifesto blames
> an instance of the hazard on fork()!
>
> To assure compositionality, one must flush output buffers at every
> possible point where an unknown downstream consumer might correctly act on
> the received data with observable results. And input buffering must never
> ingest data that the program will not eventually use. These are tough
> criteria to meet in general without sacrificing buffering.
>
> The advent of pipes vividly exposed the non-compositionality of output
> buffering. Interactive pipelines froze when users could not provide input
> that would force stuff to be flushed until the input was informed by that
> very stuff. This phenomenon motivated cat -u, and stdio's convention of
> line buffering for stdout. The premier example of input buffering eating
> other programs' data was mitigated by "here documents" in the Bourne shell.
>
> These precautions are mere fig leaves that conceal important special
> cases. The underlying evil of buffered IO still lurks. The justification is
> that it's necessary to match the characteristics of IO devices and to
> minimize system-call overhead.  The former necessity requires the attention
> of hardware designers, but the latter is in the hands of programmers. What
> can be done to mitigate the pain of border-crossing into the kernel? L4 and
> its ilk have taken a whack. An even more radical approach might flow from
> the "whitepaper" at www.codevalley.com.
>
> In any event the abolition of buffering is a grand challenge.
>
> Doug
>
