From: Bakul Shah via TUHS
Reply-To: Bakul Shah
To: Douglas McIlroy
CC: The Unix Heritage Society mailing list
Subject: [TUHS] Re: If forking is bad, how about buffering?
Date: Tue, 14 May 2024 15:34:37 -0700

Buffering is used all over the place. Even serial devices use a 16-byte buffer -- all to reduce the cost of per-unit (character, disk block, packet, etc.) processing, to smooth data flow, or to utilize the available bandwidth. But in such applications the receiver/sender usually has a way of getting an alert when the FIFO has data or is empty. As long as you provide that, you can compose more complex networks of components. Imagine components connected via FIFOs that provide empty, almost-empty, almost-full, and full signals, and maybe more in the case of lossy connections. [Though at a lower level you'd model these FIFOs as components too, so at that level there'd be *no* buffering! Sort of like Carl Hewitt's Actor model!]
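
A minimal sketch of such a FIFO, assuming a simple in-memory queue rather than any particular device. The class name `SignalingFifo`, the `Level` names, and the `capacity`/`low`/`high` watermark parameters are all hypothetical, chosen only to illustrate the four signals mentioned above:

```python
from collections import deque
from enum import Enum


class Level(Enum):
    EMPTY = "empty"
    ALMOST_EMPTY = "almost empty"
    NORMAL = "normal"
    ALMOST_FULL = "almost full"
    FULL = "full"


class SignalingFifo:
    """Bounded FIFO whose fill level can be queried as one of five signals.

    capacity, low and high are hypothetical parameters for illustration;
    a hardware FIFO would fix them in its design.
    """

    def __init__(self, capacity=16, low=2, high=14):
        if not 0 < low < high < capacity:
            raise ValueError("need 0 < low < high < capacity")
        self.capacity, self.low, self.high = capacity, low, high
        self._items = deque()

    def level(self):
        """Map the current occupancy onto a Level signal."""
        n = len(self._items)
        if n == 0:
            return Level.EMPTY
        if n == self.capacity:
            return Level.FULL
        if n <= self.low:
            return Level.ALMOST_EMPTY
        if n >= self.high:
            return Level.ALMOST_FULL
        return Level.NORMAL

    def put(self, item):
        """Append one item; return False instead of blocking when full."""
        if len(self._items) == self.capacity:
            return False
        self._items.append(item)
        return True

    def get(self):
        """Remove and return the oldest item, or None when empty."""
        return self._items.popleft() if self._items else None
```

A producer would poll `level()` (or be notified of changes) and back off at `ALMOST_FULL`; a consumer would do the mirror image at `ALMOST_EMPTY`. The point is only that the signals make the buffer's state visible to both sides.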

Your complaint seems more about how buffers are currently used and where the "network" of components is dynamically formed.

On May 13, 2024, at 6:34 AM, Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> So fork() is a significant nuisance. How about the far more ubiquitous problem of IO buffering?

> On Sun, May 12, 2024 at 12:34:20PM -0700, Adam Thornton wrote:
> > But it does come down to the same argument as
> > https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf

> The Microsoft manifesto says that fork() is an evil hack. One of the cited evils is that one must remember to flush output buffers before forking, for fear it will be emitted twice. But buffering is the culprit, not the victim. Output buffers must be flushed for many other reasons: to avoid deadlock; to force prompt delivery of urgent output; to keep output from being lost in case of a subsequent failure. Input buffers can also steal data by reading ahead into stuff that should go to another consumer. In all these cases buffering can break compositionality. Yet the manifesto blames an instance of the hazard on fork()!
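
The double-emission hazard can be reproduced in a few lines. This is a sketch assuming a POSIX system (it relies on `os.fork`); the helper name `run_fork_demo` and the use of a pipe as the output device are assumptions made only for this illustration:

```python
import os


def run_fork_demo(flush_before_fork):
    """Return the bytes a pipe receives when a buffered writer is forked.

    Hypothetical helper for illustration; POSIX only (uses os.fork).
    """
    read_fd, write_fd = os.pipe()
    out = os.fdopen(write_fd, "w")      # not a tty, so fully buffered
    out.write("hello\n")                # sits in the user-space buffer
    if flush_before_fork:
        out.flush()                     # empty the buffer before it is copied

    pid = os.fork()                     # the child gets a copy of the buffer
    if pid == 0:
        out.flush()                     # child emits whatever it inherited
        os._exit(0)                     # exit without running cleanup handlers

    os.waitpid(pid, 0)
    out.close()                         # parent flushes its own copy, closes fd
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read()
```

Without the flush, `run_fork_demo(False)` returns the line twice, because parent and child each hold a copy of the same unflushed buffer. With the flush, it returns the line once. fork() copies the process faithfully; the duplicated state is the buffer.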

> To assure compositionality, one must flush output buffers at every possible point where an unknown downstream consumer might correctly act on the received data with observable results. And input buffering must never ingest data that the program will not eventually use. These are tough criteria to meet in general without sacrificing buffering.
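
The input-side criterion can be illustrated the same way. In this sketch (the helper name `first_line_then_rest` and the sample text are hypothetical), a program wants only the first line from a shared descriptor and is supposed to leave the rest for the next consumer:

```python
import os


def first_line_then_rest(buffered):
    """Read one line for "us", then return what is left for the next consumer.

    Hypothetical helper for illustration. Returns (our_line, leftover).
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"for me\nfor the next program\n")
    os.close(write_fd)

    if buffered:
        # closefd=False keeps read_fd usable after the wrapper is discarded.
        reader = os.fdopen(read_fd, "rb", closefd=False)
        line = reader.readline()        # reads ahead, well past the newline
    else:
        line = b""
        while not line.endswith(b"\n"):
            byte = os.read(read_fd, 1)  # one system call per byte
            if not byte:
                break
            line += byte

    leftover = os.read(read_fd, 4096)   # what a second consumer would see
    os.close(read_fd)
    return line, leftover
```

With buffering, the second consumer finds nothing: the read-ahead has swallowed its data. Without buffering, the data survives, at the price of one system call per byte, which is exactly the trade-off described above.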

> The advent of pipes vividly exposed the non-compositionality of output buffering. Interactive pipelines froze when users could not provide input that would force stuff to be flushed until the input was informed by that very stuff. This phenomenon motivated cat -u, and stdio's convention of line buffering for stdout. The premier example of input buffering eating other programs' data was mitigated by "here documents" in the Bourne shell.
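
A small sketch of why line buffering helps an interactive pipeline, using Python's stream buffering as a stand-in for stdio. The helper name `visible_before_flush` is hypothetical; the check is whether a complete line reaches the downstream end of a pipe before anyone calls flush:

```python
import os
import select


def visible_before_flush(line_buffered):
    """Write one line and report whether the reader can already see it.

    Hypothetical helper for illustration.
    """
    read_fd, write_fd = os.pipe()
    # buffering=1 requests line buffering for a text stream;
    # -1 is the default policy, which is full buffering on a pipe.
    out = os.fdopen(write_fd, "w", buffering=1 if line_buffered else -1)
    out.write("ready\n")
    # Poll the read end with a zero timeout: is anything there yet?
    readable, _, _ = select.select([read_fd], [], [], 0)
    visible = bool(readable)
    out.close()
    os.close(read_fd)
    return visible
```

With line buffering the line is visible at once; with full buffering it stays in the writer until the buffer fills or the stream is closed, which is the freeze described above.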

> These precautions are mere fig leaves that conceal important special cases. The underlying evil of buffered IO still lurks. The justification is that it's necessary to match the characteristics of IO devices and to minimize system-call overhead. The former necessity requires the attention of hardware designers, but the latter is in the hands of programmers. What can be done to mitigate the pain of border-crossing into the kernel? L4 and its ilk have taken a whack. An even more radical approach might flow from the "whitepaper" at www.codevalley.com.
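
To put a rough number on the system-call argument, here is a sketch that counts how often the lowest layer is asked to write. `CountingSink` and `count_writes` are hypothetical names, and each `write()` call on the sink merely stands in for one kernel crossing; nothing here measures real system-call cost:

```python
import io


class CountingSink(io.RawIOBase):
    """Raw stream standing in for a file descriptor; counts write() calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1          # each call models one kernel crossing
        self.data += b
        return len(b)


def count_writes(n_records, buffer_size):
    """Write n_records 10-byte records; return how many raw writes occurred.

    A buffer_size of 0 means no buffering layer at all.
    """
    sink = CountingSink()
    if buffer_size:
        stream = io.BufferedWriter(sink, buffer_size=buffer_size)
    else:
        stream = sink
    for _ in range(n_records):
        stream.write(b"x" * 10)
    stream.flush()
    return sink.calls
```

Writing 1000 small records unbuffered costs 1000 raw writes; behind a 4096-byte buffer it costs only a handful. That ratio is the whole justification being questioned.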

> In any event, the abolition of buffering is a grand challenge.

> Doug
