From: Warner Losh
Date: Sat, 31 Dec 2022 19:55:28 -0700
To: Paul Ruizendaal
CC: The Unix Historical Society, segaloco
Subject: [TUHS] Re: A few comments on porting the Bourne shell
List-Id: The Unix Heritage Society mailing list

On Sat, Dec 31, 2022 at 5:55 AM Paul Ruizendaal <pnr@planet.nl> wrote:
> The "assembly code in the Bourne shell" comment is in the same London/Reiser paper. The full quote is:
>
> "The (Bourne) shell is the standard user command interpreter. It required by far the largest conversion effort of any supposedly portable program, for the simple reason that it is not portable. Critical portions are coded in assembly language and had to be painstakingly rewritten. The shell uses its own sbrk which is functionally different from the standard routine in libc. The shell wants the routine which fields a signal to be passed a parameter giving the number of the signal being caught; signal was also a private routine. This was handled by having the operating system provide the parameter in the first place, doing away with the private code for signal. The code in fixargs (for constructing the argument list to an exec system call) had to be diddled."
>
> The files in the V7 tree on the TUHS website are dated January 1979, so it would seem that the fixes for 32V were immediately taken back to Research. As you point out, this means that the comments above do not refer to the well-known source code, but to a predecessor of that (which I don’t think survived).

We have ample evidence that V7 was really something more akin to a rolling release. Let me explain: We know from the leaked '50 changes' tape that many of the features were set earlier rather than later. This leaked in 1978 (if my notes are right), but I found references to it from as early as November 1976 in http://www.toad.com/early-usenix-newsletters/197611-unix-news-n11.pdf. This was 18 months after V6 was released, but over 2 years before V7 was released. In addition, we know from the AUUS newsletters in the archive that the V7 release process took a while to get through AT&T's legal department (IIRC a year, but I've not gone back to the AUUS newsletters to refresh my recollection). A big push of V7 was to make it portable as well (with AT&T doing an Interdata 8/32 port themselves, as well as at least looking at the Wollongong Interdata 7/32 port and the Harvard VM/370 port). From talking to Kirk and others who were around at approximately that time, I gather that 32V was widely viewed as V7 for Vaxen. We can see evidence in the surviving 32V files of evolution from 'PDP-11-like swapping' to 'a more sophisticated paging algorithm', since we have the slowsys directory. It's my contention, as someone who coded in the era before good source code control, that it's evidence that somebody got it working, then renamed/copied it to slowsys while they got paging working so they could build either kernel for A/B testing. Kirk has also told me that the 32V port was started well in advance of V7's release, both to be a useful product inside Bell Labs (since Vaxen were starting to appear) and to prove that V7 was portable enough. I'll be the first to admit this is at best conjecture that matches available facts, artifacts and old-timers' recollections (sorry Kirk), but that we have no direct evidence for. The early start on 32V also allowed the 3BSD efforts to get going before the official V7 release, due to the close ties between Bell Labs and Berkeley and the DARPA project around Unix.

I believe that we can conclude that the original 'hard to port' Bourne shell was produced around the time of the 50 changes tape, give or take, and that all the Unix porting efforts that pre-dated the V7 release rolled what appeared in 32V into V7 to reduce the amount of PDP-11 assembler. Those efforts are what we read about in the paper.

It also goes a ways to explain the 32V meme of 'it was pdp11 swapping' because originally, in a version we no longer have, it was. But many of the 32V tapes that we have represent a later version where that had been abandoned in favor of what would evolve into System III's and later paging code.

> Despite all the criticism voiced above, I think it is well understood that the original Bourne shell is an amazing piece of work that managed to fit an enormous amount of functionality into a cramped address space. Its longevity attests to that. That its internals became difficult to understand is par for the course -- the 1980’s in essence needed a Lions commentary on sh.

Totally agree with that...

Warner

> On 30 Dec 2022, at 20:57, segaloco <segaloco@protonmail.com> wrote:
>
> I'll have to double check later but I'm fairly certain the remaining L/R cheats are gone by SysV. From what I can tell much of that portability work may have been done prior to the V7 release code base we're familiar with, as I did some comparison and found only one significant change between V7 and 32V code as I know it at least. Either the claims of portability issues came between 32V and System III (meaning the shell was accepted as "broken"? in 32V) or the code we actually see in V7 has already been tidied up significantly and doesn't represent the "non-portable" version lamented in the famous quote. Does this observation hold with reality? Is there an earlier, more PDP-11 bound version of the Bourne Shell out there? I seem to recall reading something about some bits of it even being in assembly at one point, but can't remember the quote source.
>
> - Matt G.
>
> ------- Original Message -------
> On Friday, December 30th, 2022 at 10:25 AM, Paul Ruizendaal <pnr@planet.nl> wrote:
>
>> London and Reiser report about porting the shell that “it required by far the largest conversion effort of any supposedly portable program, for the simple reason that it is not portable.” By the time of SysIII this is greatly improved, but also in porting the SysIII user land it was the most complex of the set so far.
>>
>> There were three aspects that I found noteworthy:
>>
>> 1. London/Reiser apparently felt strongly about a property of casts. The code argues that casting an l-value should not convert it into an r-value:
>>
>> <quote from "mode.h">
>>
>> /* the following nonsense is required
>>  * because casts turn an Lvalue
>>  * into an Rvalue so two cheats
>>  * are necessary, one for each context.
>>  */
>> union { int _cheat;};
>> #define Lcheat(a) ((a)._cheat)
>> #define Rcheat(a) ((int)(a))
>> <endquote>
>>
>>
>> However, Lcheat is only used in two places (in service.c), to set and to clear a flag in a pointer. Interestingly, the 32V code already replaces one of these instances with a regular r-value cast. So far, I’d never thought about this aspect of casts. I stumbled across it, because the Plan 9 compiler did not accept the Lcheat expansion as valid C.
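
A minimal modern-C sketch of the point above, not the shell's own code: a cast such as `(int)p` yields a value rather than an object, so it cannot stand on the left of an assignment, while a union member can. The names `tagged_ptr`, `FLAG`, `make_tagged`, `set_flag`, `clear_flag` and `has_flag` are hypothetical, and the sketch assumes a platform where `uintptr_t` exists, has the same size and representation as `char *`, and where the low bit of an aligned pointer is free to carry a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration; none of these names come from the shell source. */
typedef union {
    char     *ptr;   /* the pointer as the program normally sees it */
    uintptr_t bits;  /* the same storage viewed as an integer (assumed same size) */
} tagged_ptr;

#define FLAG ((uintptr_t)1)  /* low bit, unused by suitably aligned pointers */

/* Wrap an aligned pointer; the flag starts out clear. */
tagged_ptr make_tagged(void *p)
{
    tagged_ptr t;
    assert(((uintptr_t)p & FLAG) == 0);  /* the flag bit must be free */
    t.ptr = (char *)p;
    return t;
}

/* "((uintptr_t)t->ptr) |= FLAG" would not compile: a cast result is not an
 * lvalue. Going through the union member gives an assignable object instead,
 * which is the role Lcheat plays in the quoted header. */
void set_flag(tagged_ptr *t)   { t->bits |= FLAG; }
void clear_flag(tagged_ptr *t) { t->bits &= ~FLAG; }

/* Reading needs no trick: an ordinary cast (the Rcheat case) is enough. */
int has_flag(const tagged_ptr *t) { return ((uintptr_t)t->ptr & FLAG) != 0; }
```

Only the two writers need the union; the reader gets by with a plain cast, which matches the observation that 32V could replace one use with a regular r-value cast.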
>>
>> 2. On the history of dup2
>>
>> The shell code includes the following:
>>
>> <quote from "io.c">
>>
>> rename(f1,f2)
>>     REG INT f1, f2;
>> {
>> #ifdef RES /* research has different sys calls from TS */
>>     IF f1!=f2
>>     THEN dup(f1|DUPFLG, f2);
>>          close(f1);
>>          IF f2==0 THEN ioset|=1 FI
>>     FI
>> #else
>>     INT fs;
>>     IF f1!=f2
>>     THEN fs = fcntl(f2,1,0);
>>          close(f2);
>>          fcntl(f1,0,f2);
>>          close(f1);
>>          IF fs==1 THEN fcntl(f2,2,1) FI
>>          IF f2==0 THEN ioset|=1 FI
>>     FI
>> #endif
>> }
>> <endquote>
>>
>>
>> I’ve checked the 8th edition source, and indeed it supports using DUPFLG to signal to dup() that it really is dup2(). I had earlier wondered why dup2() did not appear in research until 10th edition, but now that is clear. It would seem that the dup of 8th edition is a direct ancestor to dup() in Plan 9. I wonder why this way of doing things never caught on in the other Unices.
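
The non-Research branch of rename() quoted above can be read as a dup2() emulation built from fcntl(). A minimal sketch in present-day POSIX C follows, assuming the numeric commands 0, 1 and 2 in the quoted code correspond to `F_DUPFD`, `F_GETFD` and `F_SETFD`. The helper name `move_fd` is hypothetical, and the sketch is meant for single-threaded use only.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make descriptor `to` refer to the file open on `from`,
 * then close `from`. Returns `to` on success, -1 on failure. */
int move_fd(int from, int to)
{
    int flags, got;

    assert(from >= 0 && to >= 0);
    if (from == to)
        return to;

    flags = fcntl(to, F_GETFD, 0);   /* -1 if `to` is not currently open */
    close(to);                       /* free the slot; failure is harmless here */

    got = fcntl(from, F_DUPFD, to);  /* lowest free descriptor >= to */
    if (got != to) {
        if (got >= 0)
            close(got);              /* the slot was not free after all */
        return -1;
    }
    close(from);

    if (flags != -1 && (flags & FD_CLOEXEC))
        fcntl(to, F_SETFD, FD_CLOEXEC);  /* carry over close-on-exec */
    return to;
}
```

On current systems `dup2(from, to)` followed by `close(from)` performs the duplication step atomically, which the close-then-`F_DUPFD` sequence does not.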
>>
>> 3. Halfway to demand paging
>>
>> I stumbled across this one because I had a bug in my signal handling. From early days onwards, Unix supported dynamically growing the stack allocation, which arguably is a first step towards building the mechanisms for demand paging. It appears that the Bourne shell made another step, catching page faults and expanding the data/bss allocation dynamically:
>>
>> <quote from "fault.c">
>>
>> VOID fault(sig)
>>     REG INT sig;
>> {
>>     signal(sig, fault);
>>     IF sig==MEMF
>>     THEN IF setbrk(brkincr) == -1
>>          THEN error(nospace);
>>          FI
>>     ELIF ...
>> <endquote>
>>
>>
>> This was already present in 7th edition, so it is by no means new in 32V or SysIII -- it had just escaped my attention as a conceptual step in the development of Unix memory handling.
>>