From: Douglas McIlroy
Date: Mon, 13 May 2024 09:34:36 -0400
To: TUHS main list
Subject: [TUHS] If forking is bad, how about buffering?

So fork() is a significant nuisance. How about the far more ubiquitous problem of IO buffering?

On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> But it does come down to the same argument as
> https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf

The Microsoft manifesto says that fork() is an evil hack. One of the cited evils is that one must remember to flush output buffers before forking, for fear their contents will be emitted twice. But buffering is the culprit, not the victim. Output buffers must be flushed for many other reasons: to avoid deadlock; to force prompt delivery of urgent output; to keep output from being lost in case of a subsequent failure. Input buffers can also steal data by reading ahead into stuff that should go to another consumer. In all these cases buffering can break compositionality. Yet the manifesto blames an instance of the hazard on fork()!
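
The double-emission hazard can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's buffered writer as a stand-in for stdio, and the function name fork_with_pending_output, the 4096-byte buffer size, and the "hello" payload are all hypothetical choices, not anything taken from the paper.

```python
import os


def fork_with_pending_output(flush_first: bool) -> bytes:
    """Return everything a parent and its forked child write to one pipe.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()
    # A block-buffered writer, standing in for stdio's buffered stdout.
    out = os.fdopen(write_fd, "wb", buffering=4096)
    out.write(b"hello\n")  # stays in the user-space buffer for now
    if flush_first:
        out.flush()  # the precaution one must remember before forking
    pid = os.fork()
    if pid == 0:
        # Child: it inherited a copy of the buffer, pending bytes included.
        out.flush()
        os._exit(0)  # exit without running any further cleanup
    os.waitpid(pid, 0)
    out.close()  # parent flushes its own copy of the buffer
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read()


if __name__ == "__main__":
    print(fork_with_pending_output(flush_first=False))
    print(fork_with_pending_output(flush_first=True))
```

Without the flush, both processes hold the same pending bytes and each delivers them; with it, the buffer is empty at the moment of the fork and the line appears once. Note that fork() only copies the buffer. The duplication is a property of the data having been held back in user space.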

To assure compositionality, one must flush output buffers at every possible point where an unknown downstream consumer might correctly act on the received data with observable results. And input buffering must never ingest data that the program will not eventually use. These are tough criteria to meet in general without sacrificing buffering.
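
The input-side criterion can be illustrated the same way. The sketch below is an assumption-laden toy, not a description of any particular program: the function name read_one_line, the two-line payload, and the 4096-byte buffer size are hypothetical. One reader takes a single line from a pipe, and a later consumer of the same descriptor then reads whatever is left.

```python
import os


def read_one_line(buffered: bool) -> tuple[bytes, bytes]:
    """Read one line from a pipe, then report what a later consumer sees.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"for me\nfor the next program\n")
    os.close(write_fd)
    if buffered:
        # Buffered reader: one large read fills the buffer past the newline.
        reader = os.fdopen(read_fd, "rb", buffering=4096, closefd=False)
        line = reader.readline()
        reader.close()  # discards the read-ahead; the descriptor stays open
    else:
        # Unbuffered reader: one system call per byte, never past the newline.
        line = b""
        while not line.endswith(b"\n"):
            byte = os.read(read_fd, 1)
            if not byte:
                break
            line += byte
    # What the next consumer of the same descriptor would receive.
    rest = os.read(read_fd, 4096)
    os.close(read_fd)
    return line, rest


if __name__ == "__main__":
    print(read_one_line(buffered=True))
    print(read_one_line(buffered=False))
```

In the buffered case the second line has been ingested and then thrown away, so the next consumer sees end of file. The unbuffered case preserves it, at the price of a system call per byte, which is exactly the trade-off between compositionality and system-call overhead.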

The advent of pipes vividly exposed the non-compositionality of output buffering. Interactive pipelines froze when users could not provide input that would force stuff to be flushed until the input was informed by that very stuff. This phenomenon motivated cat -u, and stdio's convention of line buffering for stdout. The premier example of input buffering eating other programs' data was mitigated by "here documents" in the Bourne shell.

These precautions are mere fig leaves that conceal important special cases. The underlying evil of buffered IO still lurks. The justification is that it's necessary to match the characteristics of IO devices and to minimize system-call overhead. The former necessity requires the attention of hardware designers, but the latter is in the hands of programmers. What can be done to mitigate the pain of border-crossing into the kernel? L4 and its ilk have taken a whack. An even more radical approach might flow from the "whitepaper" at www.codevalley.com.

In any event, the abolition of buffering is a grand challenge.

Doug