From: Dan Cross
Date: Wed, 15 May 2024 10:42:33 -0400
To: "G. Branden Robinson"
CC: TUHS main list
Subject: [TUHS] Re: If forking is bad, how about buffering?
In-Reply-To: <20240514111032.2kotrrjjv772h5f4@illithid>

On Tue, May 14, 2024 at 7:10 AM G. Branden Robinson wrote:
> [snip]
> Viewpoint 1: Perspective from Pike's Peak Clever.
> Elementary Unix commands should be elementary.
> Unix is a kernel.
> Programs that do simple things with system calls should remain simple.
> This practices makes the system (the kernel interface) easier to learn,
> and to motivate and justify to others.  Programs therefore test the
> simplicity and utility of, and can reveal flaws in, the set of
> primitives that the kernel exposes.  This is valuable stuff for a
> research organization.  "Research" was right there in the CSRC's name.

I believe this is at once making a more complex argument than was
proffered, and at the same time missing the context in which Unix was
created.

> Viewpoint 2: "I Just Want to Serve 5 Terabytes"[1]
>
> cat(1)'s man page did not advertise the traits in the foregoing
> viewpoint as objectives, and never did.[2]  Its avowed purpose was to
> copy, without interruption or separation, 1..n files from storage to and
> output channel or stream (which might be redirected).
>
> I don't need to tell convince that this is a worthwhile application.
> But when we think about the many possible ways--and destinations--a
> person might have in mind for that I/O channel, we have to face the
> necessity of buffering or performance goes through the floor.
>
> It is 1978.  Some VMS

I don't know about that; VMS IO is notably slower than Unix IO by
default. Unlike VMS, Unix uses the buffer cache to serialize access to
the underlying storage device(s). Ironically, caching here is a major
win, not just for speed, but because it makes it relatively easy to
reason about the state of a block: that state is removed from the
minutiae of the underlying storage device and is instead handled in
the bio layer. Treating the block cache as a fixed-size pool yields a
relatively simple state machine for synchronizing between the
in-memory and on-disk representations of data.

>[snip]
> And this, as we all know, is one of the reasons the standard I/O library
> came into existence.
> Mike Lesk, I surmise, understood that the
> "applications programmer" having knowledge of kernel internals was in
> general neither necessary nor desirable.

I'm not sure about that. I suspect that the justification _may_ have
been more along the lines of noting that many programs implemented
their own, largely similar buffering strategies, and that it was
preferable to centralize those into a single library, and also noting
that building some kinds of programs was inconvenient using raw system
calls. For instance, something like `gets` is handy, but is _annoying_
to write using just read(2). It can obviously be done, but if I don't
have to, I'd prefer not to.

> [snip]
> We should have kept cat(1), and let it grow as many flags as practical
> use demanded--_except_ for `-u`--and at the _same time_ developed a new
> kcat(1) command that really was just a thin wrapper around system calls.
> Then you'd be a lot closer to measuring what the kernel was really
> doing, what you were paying for it, and you could still boast of your
> elegance in OS textbooks.
> [snip]

Here's where I think this misses the mark: it focuses too much on the
idea that simple programs exist so as to be tests for, and exemplars
of, the kernel system call interface, but what evidence do you have
for that?

A simpler explanation is that simple programs are easier to write,
easier to read, and easier to reason about, test, and examine for
correctness. Unix amplified this with Doug's "garden hoses of data"
idea and the advent of pipes; here, it was found that small, simple
programs could be combined in often surprising and unanticipated ways.

Unix built up a philosophy about _how_ to write programs that was
rooted in the problems that were interesting when Unix was first
created. Something we often forget is that research systems are built
to address problems that are interesting _to the researchers who build
them_.
This context can shape a system, and we see that with Unix: a highly
synchronous system call interface, because overly elaborate async
interfaces were hard to program; a simple file abstraction that was
easy to use (open/creat/read/write/close/seek/stat) because files on
other contemporary systems were baroque things that were difficult to
use; a simple primitive for the creation of processes because, again,
on other systems processes were very heavy, complicated things that
were difficult to use. Unix took problems related to IO and processes
and made them easy. By the 80s, these were pretty well understood, so
focus shifted to other things (languages, networking, etc).

Unix is one of those rare beasts that escaped the lab and made it out
into the wild. It became the workhorse that begat two or three whole
generations of commercial work; it's unsurprising that when the web
explosion happened, Unix became the basis for it: it was there, it was
familiar, and by then it wasn't a research project anymore, but a
basis for serious commercial work. That it has retained the original
system call interface is almost incidental; perhaps that fits with
your broccoli-man analogy.

        - Dan C.
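
As a rough illustration of the `gets` remark above, here is a minimal sketch of a line reader built on nothing but read(2). It is not taken from stdio or from any historical source; the names `struct linebuf`, `lb_init`, `lb_getc`, and `lb_getline`, and the 512-byte buffer size, are all made up for this example. The point is only that the caller ends up carrying its own buffer and refill logic, which is roughly the boilerplate a shared library can centralize.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names throughout; this is an illustration, not stdio. */
struct linebuf {
    int fd;          /* descriptor to read from */
    size_t pos;      /* index of the next unread byte in buf */
    size_t len;      /* number of valid bytes in buf */
    char buf[512];   /* user-space buffer refilled by read(2) */
};

void lb_init(struct linebuf *lb, int fd)
{
    lb->fd = fd;
    lb->pos = 0;
    lb->len = 0;
}

/* Return the next byte, refilling the buffer with read(2) when it is
 * empty. Returns -1 on end of file or on a read error. */
int lb_getc(struct linebuf *lb)
{
    if (lb->pos == lb->len) {
        ssize_t n;
        do {
            n = read(lb->fd, lb->buf, sizeof lb->buf);
        } while (n < 0 && errno == EINTR);  /* retry if interrupted */
        if (n <= 0)
            return -1;
        lb->pos = 0;
        lb->len = (size_t)n;
    }
    return (unsigned char)lb->buf[lb->pos++];
}

/* Copy one line into dst (at most cap - 1 bytes), drop the newline, and
 * NUL-terminate. A line longer than cap - 1 bytes is returned in pieces
 * across calls. Returns the number of bytes stored, or -1 if end of
 * file (or an error) was hit before any byte was read. */
long lb_getline(struct linebuf *lb, char *dst, size_t cap)
{
    size_t i = 0;
    int c = 0;

    assert(cap > 1);
    while (i + 1 < cap && (c = lb_getc(lb)) != -1 && c != '\n')
        dst[i++] = (char)c;
    dst[i] = '\0';
    if (i == 0 && c == -1)
        return -1;
    return (long)i;
}
```

A caller would do `lb_init(&lb, 0)` and then loop on `lb_getline(&lb, line, sizeof line)` until it returns -1. Reading one byte per read(2) call would avoid the buffer entirely, at the cost of a system call per character, which is the trade-off the buffering discussion above is about.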