From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,MAILING_LIST_MULTI autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 7473 invoked from network); 29 Sep 2021 19:05:37 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 29 Sep 2021 19:05:37 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id DC60C9CAF6; Thu, 30 Sep 2021 05:05:36 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id 788C89CAE4; Thu, 30 Sep 2021 05:05:04 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="aLeUSSqD"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id 8F0429CAE4; Thu, 30 Sep 2021 05:05:01 +1000 (AEST) Received: from mail-oi1-f169.google.com (mail-oi1-f169.google.com [209.85.167.169]) by minnie.tuhs.org (Postfix) with ESMTPS id BF8C49CAE3 for ; Thu, 30 Sep 2021 05:05:00 +1000 (AEST) Received: by mail-oi1-f169.google.com with SMTP id z11so4202132oih.1 for ; Wed, 29 Sep 2021 12:05:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=LQPPDsvzOt2GnNdj2QpvsXQP2OcSGFT5hk8TLR9Ae50=; b=aLeUSSqDMhZdfuHiCbcKlksmSKGQxF282/J1QaclozSvUnxLQXRYvz59Nwdrism0F5 8+45NCeD5Bct3FmKJhtdCFlLQAf7dN/FyHoYxIk80UTJxMofJ2afnxFe2ETKUp/SIx0X isxG3FMLGmkgydNy6r6VfibgcQWMSERANHDyIcpbseBS4833M/R5I8S1A9W5UzA+UV57 GMfNNtZaYCQpUqxzQBa7bm/vdJGA08PwEB5TJR0d0siyOwpRsx2yNIYXFZkdJeyKd8i2 rsG7QJyQuZo7Etf2pDMelTsU58YC9PebKdte8h9fwNfZL0hDl0Lb6GeMu1Znj7fm2qMv Rk6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=LQPPDsvzOt2GnNdj2QpvsXQP2OcSGFT5hk8TLR9Ae50=; b=4pfQXNvmH/SWxwU3FziYcpZNDuKMdTIyoN04RIfskhwerrw+/fa29BTPD1/IoWPgiX Jrex8yUMe5RPXq0+LRmfRsDolL8+KMptmuXxmsDn0eZvVluk1QQIlqUu1Z6ihOhuwqwi 6wMLm13ttCyKcsOfH9Pne7F10W2i+1lbvqSFEwOcEqLnswgVkMd7ZrIjEjevnw0CoM2m DNuMtoaHxPLoAsG22mPqa9o2vu9qNmS2fxWeiefBGB+A4bprxdk7eO8h1a019relIUhk EKYfnx12Ijftiw6d2a0E9G1/oEaw539dl/bZMPpEtK0yvMIZNpyy+vD8BrB7sWrRk/R0 Pq0Q== X-Gm-Message-State: AOAM530Um9cV2dwSNZQpG3jWcd5fwrofUeRqWwgo0UTwJ/iwfAx5Rxbc NvQhCunbb/k2CjfoVlhnFRQOnRTj9SZq5CoHB2M= X-Google-Smtp-Source: ABdhPJxScCoBc6JajUD5qQcu+HlwKZbY+7t2trpVVlEeDkLqqrGDqietfTP7asPegP3gJXuKtb1rS3YoCferYeLIl7o= X-Received: by 2002:a05:6808:2128:: with SMTP id r40mr1238070oiw.24.1632942299893; Wed, 29 Sep 2021 12:04:59 -0700 (PDT) MIME-Version: 1.0 References: <20210929180742.4E63218C0BE@mercury.lcs.mit.edu> In-Reply-To: <20210929180742.4E63218C0BE@mercury.lcs.mit.edu> From: Dan Cross Date: Wed, 29 Sep 2021 15:04:23 -0400 Message-ID: To: Noel Chiappa Content-Type: multipart/alternative; boundary="000000000000b9549005cd27017e" Subject: Re: [TUHS] Systematic approach to command-line interfaces X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: The Eunuchs Hysterical Society Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" --000000000000b9549005cd27017e Content-Type: text/plain; charset="UTF-8" On Wed, Sep 29, 2021 at 2:08 PM Noel Chiappa wrote: > > From: Larry McVoy > > > If you read(2) a page and mmap()ed it and then did a write(2) to the > > page, the mapped page is the same physical memory as the write()ed > > page. Zero coherency issues. > > Now I'm confused; read() and write() semantically include a copy operation > (so there are then two copies of that data chunk, and possible consistency > issues between them), and the copied item is not necessarily page-sized (so > you can't ensure consistency between the original+copy by mapping it in). > So > when one does a read(file, &buffer, 1), one gets a _copy of just that byte_ > in the process' address space (and similar for write()). > > Yes, there's no coherency issue between the contents of an mmap()'d page, > and > the system's idea of what's in that page of the file, but that's a > _different_ coherency issue. > > Or am I confused? > I think that mention of `read` here is a bit of a red-herring; presumably Larry only mentioned it so that the reader would assume that the read blocks are in the buffer cache when discussing the write vs mmap coherency issue with respect to unmerged page and buffer caches (e.g., `*p = 1;` modifies some page in the VM cache, but not the relevant block in the buffer cache, so a subsequent `read` on the mmap'd file descriptor won't necessarily reflect the store). I don't think that is strictly necessary, though; presumably the same problem exists even if the file data isn't in block cache yet. PS: > > From: "Greg A. Woods" > > > I now struggle with liking the the Unix concept of "everything is a > > file" -- especially with respect to actual data files. Multics also > got > > it right to use single-level storage -- that's the right abstraction > > Oh, one other thing that SLS breaks, for data files, is the whole Unix > 'pipe' > abstraction, which is at the heart of the whole Unix tools paradigm. So no > more 'cmd | wc' et al. And since SLS doesn't have the 'make a copy' > semantics of pipe output, it would be hard to trivially work around it. > I don't know about that. One could still model a pipe as an IO device; Multics supported tapes, printers and terminals, after all. It even had pipes! https://web.mit.edu/multics-history/source/Multics/doc/info_segments/pipes.gi.info Yes, one could build up a similar framework, but each command would have to > specify an input file and an output file (no more 'standard in' and 'out'), Why? To continue with the Multics example, it supported `sysin` and `sysprint` from PL/1, referring to the terminal by default. > and then the command interpreter would have to i) take command A's output > file > and feed it to command B, and ii) delete A's output file when the whole > works > was done. Yes, the user could do it manually, but compare: > > cmd aaa | wc > > and > > cmd aaa bbb > wc bbb > rm bbb > > If bbb is huge, one might run out of room, but with today's 'light my cigar > with disk blocks' life, not a problem - but it would involve more disk > traffic, as bbb would have to be written out in its entirety, not just > have a > mall piece kept in the disk cache as with a pipe. > It feels like all you need is a stream abstraction and programs written to use it, which doesn't seem incompatible with SLS. Perhaps the argument is that in a SLS system one wouldn't be motivated to write programs that way, whereas in a program with a Unix-style IO mechanism it's more natural? - Dan C. --000000000000b9549005cd27017e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Wed, Sep 29, 2021 at 2:08 PM Noel Chia= ppa <jnc@mercury.lcs.mit.edu<= /a>> wrote:
=C2=A0 =C2=A0 > From: Larry McVoy

=C2=A0 =C2=A0 > If you read(2) a page and mmap()ed it and then did a wri= te(2) to the
=C2=A0 =C2=A0 > page, the mapped page is the same physical memory as the= write()ed
=C2=A0 =C2=A0 > page. Zero coherency issues.

Now I'm confused; read() and write() semantically include a copy operat= ion
(so there are then two copies of that data chunk, and possible consistency<= br> issues between them), and the copied item is not necessarily page-sized (so=
you can't ensure consistency between the original+copy by mapping it in= ). So
when one does a read(file, &buffer, 1), one gets a _copy of just that b= yte_
in the process' address space (and similar for write()).

Yes, there's no coherency issue between the contents of an mmap()'d= page, and
the system's idea of what's in that page of the file, but that'= s a
_different_ coherency issue.

Or am I confused?



PS:
=C2=A0 =C2=A0 > From: "Greg A. Woods"

=C2=A0 =C2=A0 > I now struggle with liking the the Unix concept of "= ;everything is a
=C2=A0 =C2=A0 > file" -- especially with respect to actual data fil= es.=C2=A0 Multics also got
=C2=A0 =C2=A0 > it right to use single-level storage -- that's the r= ight abstraction

Oh, one other thing that SLS breaks, for data files, is the whole Unix '= ;pipe'
abstraction, which is at the heart of the whole Unix tools paradigm. So no<= br> more 'cmd | wc' et al. And since SLS doesn't have the 'make= a copy'
semantics of pipe output, it would be hard to trivially work around it.
=


Yes, one could build up a similar framework, but each command would have= to
specify an input file and an output file (no more 'standard in' and= 'out'),

and then the command interpreter would have to i) take command A's outp= ut file
and feed it to command B, and ii) delete A's output file when the whole= works
was done. Yes, the user could do it manually, but compare:

=C2=A0 cmd aaa | wc

and

=C2=A0 cmd aaa bbb
=C2=A0 wc bbb
=C2=A0 rm bbb

If bbb is huge, one might run out of room, but with today's 'light = my cigar
with disk blocks' life, not a problem - but it would involve more disk<= br> traffic, as bbb would have to be written out in its entirety, not just have= a
mall piece kept in the disk cache as with a pipe.


--000000000000b9549005cd27017e--