From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_INVALID,DKIM_SIGNED, HTML_MESSAGE,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 4552 invoked from network); 1 Aug 2021 23:37:34 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 1 Aug 2021 23:37:34 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id AF39C9CA92; Mon, 2 Aug 2021 09:37:32 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id 90D2D9CA63; Mon, 2 Aug 2021 09:37:10 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=ccil-org.20150623.gappssmtp.com header.i=@ccil-org.20150623.gappssmtp.com header.b="yrXvPA1j"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id 45F029CA63; Mon, 2 Aug 2021 09:37:08 +1000 (AEST) Received: from mail-qk1-f170.google.com (mail-qk1-f170.google.com [209.85.222.170]) by minnie.tuhs.org (Postfix) with ESMTPS id 281A59CA60 for ; Mon, 2 Aug 2021 09:37:05 +1000 (AEST) Received: by mail-qk1-f170.google.com with SMTP id x3so15069038qkl.6 for ; Sun, 01 Aug 2021 16:37:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ccil-org.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=BstZh/vPepjm0h5KfeqBJdiAooFtQ5kfFSJYr4N3QWU=; b=yrXvPA1jmXvOkBV7iqCxh2lNxxle6d2s8gycdQUJdf6y+DGo5KNXZJLrhD77pdFuJl L7sOJyq09qzTPBPdWC5lbJt8ZQg3ep7LzzCKLdGJTcQtJhLQsWcpHSzc/l2H5Ehq17VE /aVj0E4D5VCH5as2ms69B9sbbH/t+rTrgZ/yQoKD8dF2w2sm+EdwGkwI7P6+da5eGlEC D8Qqd2dbwL2HjwUOE0BfYfHZHNQgMlc2qbqQh5zK/YmmzD0/f452ueYWrKGG+1O+iPB8 TVAZ3jcXdA3eEuVgkzxVb0Ws9ehDo4DTkTJCdk0lejTzUcAh+ZuSsIiUaW8NTFoAcD3f 4RLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=BstZh/vPepjm0h5KfeqBJdiAooFtQ5kfFSJYr4N3QWU=; b=TXrga0yqlS64du91fe6TSi/7EgFqBzCrHaLYe6ZDJZgdYNtsMpibywu8g8DSk64caQ ifIZTxX7l+gQgMzyYNQrbwurcl5j1eJp3xJebkrv1OOHWT6vF7cuLg5teg2AcKxLCS3B phYxwCRw1aNQq4Jua9SFLiUc9KbWAY/x/iW+8uWk1NHDaJn9oyaYQYPld9cgKg/pVGKa nBLbF5DcyxyHjx5XcaFJqHMJ6K/kwgSAN6D34IaUGdILIcTMhu7Vvn5vGZlYS4kMWhFq K0ndSQ5DMszneghztd79qK7avmHej1sdCJ5A/Iwr19wxawyNQrzSSuTCrr5O50My4Bf8 kbow== X-Gm-Message-State: AOAM530KguTdI9P0XSRKwNLu3nODZWC68UUybX/gDcnXeBgahHm0NS1Y LtrN9xt/9uSdyK1meJKw0quGLCdLLjpp/iHX5w7HeQ== X-Google-Smtp-Source: ABdhPJwoJBSclp0YIpjcxBJx5wcQFHKn5Y3JaIcNwtwBLuQSH9Qvhteq3LKc+0iJ8xI3ygEcSfK9+nqFpdq9Q7lfvi4= X-Received: by 2002:a05:620a:16a7:: with SMTP id s7mr11545354qkj.359.1627861024143; Sun, 01 Aug 2021 16:37:04 -0700 (PDT) MIME-Version: 1.0 References: <20210731142533.69caf929@moon> <40763c2d-52ad-eb01-8bf8-85acf6fee700@case.edu> In-Reply-To: From: John Cowan Date: Sun, 1 Aug 2021 19:36:53 -0400 Message-ID: To: Dan Cross Content-Type: multipart/alternative; boundary="000000000000168cf805c887ee55" Subject: Re: [TUHS] Systematic approach to command-line interfaces X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: TUHS main list Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" --000000000000168cf805c887ee55 Content-Type: text/plain; charset="UTF-8" On Sun, Aug 1, 2021 at 5:55 PM Dan Cross wrote: > Looking at other systems that were available roughly around the time of > Unix (TENEX, Multics), it strikes me that the Unix was a bit of an odd-duck > with the way it handled exec in terms of destructively overlaying the > memory of the user portion of a process with a new image; am I wrong here? > See dmr's paper at for details, but in short exec and its equivalents elsewhere have always overlaid the running program with another program. Early versions of PDP-7 Linux used the same process model as Tenex: one process per terminal which alternated between running the shell and a user program. So exec() loaded the user program on top of the shell. Indeed, this wasn't even a syscall; the shell itself wrote a tiny program loader into the top of memory that read the new program, which was open for reading, and jumped to it. Likewise, exit() was a specialized exec() that reloaded the shell. The Tenex and Multics shells had more memory to play with and didn't have to use these self-overlaying tricks[*]: they loaded your program into available memory and called it as a subroutine, which accounts for the name "shell". So it was the introduction of fork(), which came from the Berkeley Genie OS, that made the current process control regime possible. In those days, fork() wrote the current process out to the swapping disk and set up the process table with a new entry. For efficiency, the in-memory version became the child and the swapped-out version became the parent. Instantly the shell was able to run background processes by just not waiting for them, and pipelines (once the syntax was invented) could be handled with N - 1 processes in an N-stage pipeline. Huge new powers landed on the user's head. Nowadays it's a question whether fork() makes sense any more. "A fork() in the road" [Baumann et al. 2019] < https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf> is an interesting argument against fork(): * It doesn't compose. * It is insecure by default. * It is slow (there are about 25 properties a process has in addition to its memory and hardware state, and each of these needs to be copied or not) even using COW (which is itself a Good Thing and can and should be provided separately) * It is incompatible with a single-address-space design. In short, spawn() beats fork() like a drum, and fork() should be deprecated. To be sure, the paper comes out of Microsoft Research, but I find it pretty compelling anyway. [*] My very favorite self-overlaying program was the PDP-8 bootstrap for the DF32 disk drive. You toggled in two instructions at locations 30 and 31 meaning "load disk registers and go" and "jump to self" respectively, hit the Clear key on the front panel, which cleared all registers, and started up at 30. The first instruction told the disk to start reading sector 0 of the disk into location 0 in memory (because all the registers were 0, including the disk instruction register where 0 = READ) and the second instruction kept the CPU busy waiting. As the sector loaded, the two instructions were overwritten by "skip if disk ready" and "jump to previous address", which would wait until the whole sector was loaded. Then the OS could be loaded using the primitive disk driver in block 0. --000000000000168cf805c887ee55 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Sun, Aug 1, 20= 21 at 5:55 PM Dan Cross <crossd@gmai= l.com> wrote:
=C2=A0
Looking = at=C2=A0other systems that were available roughly around the time of Unix (= TENEX, Multics), it strikes me that the Unix was a bit of an odd-duck with = the way it handled exec in terms of destructively overlaying the memory of = the user portion of a process with a new image; am I wrong here?

See dmr's = paper at <ht= tps://www.bell-labs.com/usr/dmr/www/hist.html> for details, but in s= hort exec and its equivalents elsewhere have always overlaid the running pr= ogram with another program.=C2=A0 Early versions of PDP-7 Linux used the sa= me process model as Tenex: one process per terminal which alternated betwee= n running the shell and a user program.=C2=A0 So exec() loaded the user pro= gram on top of the shell.=C2=A0 Indeed, this wasn't even a syscall; the= shell itself wrote a tiny program loader into the top of memory that read = the new program, which was open for reading,=C2=A0and jumped to it. Likewis= e, exit() was a specialized exec() that reloaded the shell.=C2=A0 The Tenex= and Multics shells had more memory to play with and didn't have to use= these self-overlaying tricks[*]: they loaded your program into available m= emory and called it as a subroutine, which accounts for the name "shel= l".

So it was the introduction of fork(), which came from the Berkeley Genie O= S, that made the current process control regime possible.=C2=A0 In those da= ys, fork() wrote the current process out to the swapping disk and set up th= e process table with a new entry.=C2=A0 For efficiency, the in-memory versi= on became the child and the swapped-out version became the parent.=C2=A0 In= stantly the shell was able to run background processes by just not waiting = for them, and pipelines (once the syntax was invented) could be handled wit= h N - 1 processes in an N-stage pipeline.=C2=A0 Huge new powers landed on t= he user's head.

Nowadays it's a question whether fork() makes sense any mor= e.=C2=A0 =C2=A0"A fork() in the road" [Baumann et al. 2019] <<= a href=3D"https://www.microsoft.com/en-us/research/uploads/prod/2019/04/for= k-hotos19.pdf">https://www.microsoft.com/en-us/research/uploads/prod/2019/0= 4/fork-hotos19.pdf> is an interesting argument against fork():
=

* It doesn= 9;t compose.
* It is insecure by default.
* It is slow (there are about 25 properties a = process has in addition to its memory and hardware state, and each of these= needs to be copied or not) even using COW (which is itself a Good Thing an= d can and should be provided separately)
* I= t is incompatible with a single-address-space design.=C2=A0

In short, spawn() beats= fork() like a drum,=C2=A0and fork() should be deprecated. To be sure, the = paper comes out of Microsoft Research, but I find it pretty compelling anyw= ay.

[*] = My very favorite self-overlaying program was the PDP-8 bootstrap for the DF= 32 disk drive.=C2=A0 You toggled in two instructions at locations 30 and 31= meaning "load disk registers and go" and "jump to self"= ; respectively, hit the Clear key on the front panel, which cleared all reg= isters, and started up at 30.=C2=A0

The first instruction told the disk to start re= ading sector 0 of the disk into location 0 in memory (because all the regis= ters were 0, including the disk instruction register where 0 =3D READ) and = the second instruction kept the CPU busy waiting.=C2=A0 As the sector loade= d,=C2=A0 the two instructions were overwritten by "skip if disk ready&= quot; and "jump to previous address", which would wait until the = whole sector was loaded.=C2=A0 Then the OS could be loaded using the primit= ive disk driver in block 0.

=
--000000000000168cf805c887ee55--