From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.5 required=5.0 tests=DKIM_ADSP_CUSTOM_MED, DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MAILING_LIST_MULTI autolearn=ham autolearn_force=no version=3.4.4 Received: from minnie.tuhs.org (minnie.tuhs.org [50.116.15.146]) by inbox.vuxu.org (Postfix) with ESMTP id 0935224932 for ; Wed, 8 May 2024 00:08:10 +0200 (CEST) Received: from minnie.tuhs.org (localhost [IPv6:::1]) by minnie.tuhs.org (Postfix) with ESMTP id CB631432C1; Wed, 8 May 2024 08:08:04 +1000 (AEST) Received: from mail-pg1-x52f.google.com (mail-pg1-x52f.google.com [IPv6:2607:f8b0:4864:20::52f]) by minnie.tuhs.org (Postfix) with ESMTPS id 30387432BE for ; Wed, 8 May 2024 08:07:57 +1000 (AEST) Received: by mail-pg1-x52f.google.com with SMTP id 41be03b00d2f7-53fa455cd94so2966502a12.2 for ; Tue, 07 May 2024 15:07:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1715119676; x=1715724476; darn=tuhs.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=tZrKEnNf2BlxfLo8fRO8e+A53+qOGRiIQMgPoxJRl+g=; b=bV9hcwhGrGwhrQ9zOTkUWwIbnfLvvIG49CDPWcb446KapCWjlWWNtEHJzVWolVrnqw 0e18G/PEHtqPpb9ktSf3gDg5Mz3hB5Z9++71zoAp/84ayFhC0s+KxP5Tbq/XKusnrly2 I6mSGPv4vkaNY5mqUwUtbWogmY32Rbc9dazfWGfq5dM6YIrqwfaEED9tbc96cF4lQat9 2+S864s29wUOrrUgaBNwmDjm/ofLcCDJZg3sD0CqtHGzyDCGyMVuDwv2j+olaCKReuTr ey7awMAwF25VjgWQatxxauJqDsdwm07SsH6y5Ja8pw51UlZLRnvrgWZ24QuaB8lrE+8q ILug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715119676; x=1715724476; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=tZrKEnNf2BlxfLo8fRO8e+A53+qOGRiIQMgPoxJRl+g=; b=aHjHnP7zuY4avCseq66gpfTFfmu/2w1Mytij37GVE30XR3tNbq9Dlt+VnUZEOZxjPj 4wj9dWo2oLWFiwYO7ix/P2b77eh6g2Uu+idulzvq2arh+4qSsAPoOzVI6BvS9jIf7/kW aoxQlwE5EG3Cv3H+GipkWiX8Wka/YsZp9X95OZ61vcPViWGCM9B/d6LJxBwHgDHsDupJ 3Po58BtDFcoVkUykfP5Yo9njZsdIWxBeiTVuWQRSsmUoxok8IyBeQB2H4XsQxdep5rxO WRXjMoBxumaMzu5WtTCLnig03P9SrkM1nnH6QhASk7CEjD26emwYcsKm5wTVgvddM51x 6r0g== X-Gm-Message-State: AOJu0YxWyrxoB3p8VLLIla/3PFFsGF5ORutHUMQiun/X2sYxXYdaSBqq +cAHdbnous0gyK7Xk/HIyas+FJ4VTmPsuAEa5qnQOkMmHg1P+IjifUlwcxMgSf+msu63s2E5usq wcug+jdNR5/2sxq5ljPpnNIZUHEs= X-Google-Smtp-Source: AGHT+IHDqG1QGjgyPN0BPYKh7Tog1eB7lBkWiiquMc0Kxb2fYDiJBy+PfceMs6QTTm6GHweBEZlE9mkfesgNvf7y+Fg= X-Received: by 2002:a17:90a:b108:b0:2b2:c278:279a with SMTP id 98e67ed59e1d1-2b6165c4eb8mr947061a91.23.1715119676399; Tue, 07 May 2024 15:07:56 -0700 (PDT) MIME-Version: 1.0 References: <18efd14f-4da6-4771-ad6a-901c6cb6105d@planet.nl> In-Reply-To: <18efd14f-4da6-4771-ad6a-901c6cb6105d@planet.nl> From: Rob Pike Date: Wed, 8 May 2024 08:07:44 +1000 Message-ID: To: Paul Ruizendaal Content-Type: multipart/alternative; boundary="0000000000000f27ab0617e46b52" Message-ID-Hash: YO5PZFB35L7ZLFTR5PO4Z3JWZ2A37LUZ X-Message-ID-Hash: YO5PZFB35L7ZLFTR5PO4Z3JWZ2A37LUZ X-MailFrom: robpike@gmail.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: tuhs@tuhs.org X-Mailman-Version: 3.3.6b1 Precedence: list Subject: [TUHS] Re: On the uniqueness of DMR's C compiler List-Id: The Unix Heritage Society mailing list Archived-At: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --0000000000000f27ab0617e46b52 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable I'm not sure I accept your starting position. There were several compilers for RT-11 and RSX/11-M. RSX (and perhaps RT) Fortran were threaded code, but I don't believe they all were. And of course there was BCPL, which was - and is - tiny; was it on the 11? And there were other small machines from other manufacturers, all of which had some form of Fortran and other bespoke things, such as RPG on the small IBMs. I think the uniqueness was in the set of conditions more than in the Unix C compiler itself. But you may be right. -rob On Wed, May 8, 2024 at 6:59=E2=80=AFAM Paul Ruizendaal wrot= e: > In the last months, I've spent a little time on curating John Walker's > Unix clone and software stack, including an emulator to run it: > https://gitlab.com/marinchip > > After creating a basic tool chain (edit, asm, link and a simple > executive), John set out to find a compiler. Among the first programs we= re > a port of the META 3 compiler-generator (similar to TMG on early Unix) an= d > a port of Birch-Hansen=E2=80=99s Pascal compiler. META was used to create= a > compiler that generated threaded code. He found neither compiler good > enough for his goals and settled on writing his Unix-like OS in assembler= . > As the 9900 architecture withered after 1980, this sealed the fate of thi= s > OS early on -- had he found a good compiler, the code might have competed > alongside Coherent, Idris, and Minix during the 80=E2=80=99s. > > This made me realise once more how unique the Ritchie C compiler was. In > my view its uniqueness combines three aspects: > 1. The C language itself > 2. The ability to run natively on small hardware (even an LSI-11 system) > 3. Generating code with modest overhead versus handwritten assembler (say > 30%) > > As has been observed before, working at a higher abstraction level makes > it easier to work on algorithms and on refactoring, often earning back th= e > efficiency loss. John Walkers work may be case in point: I estimate that > his hand-coded kernel is 10% larger than an equivalent V6 Unix kernel (as > compiled for the 9900 architecture). > > There are three papers on DMR=E2=80=99s website about the history of the = compiler > and a compare-and-contrast with other compilers of the era: > https://www.bell-labs.com/usr/dmr/www/primevalC.html > https://www.bell-labs.com/usr/dmr/www/chist.html > https://www.bell-labs.com/usr/dmr/www/hopl.html > > It seems to me that these papers rather understate the importance of > generating good quality code. As far as I can tell, BCPL and BLISS came > close, but were too large to run on a PDP-11 and only existed as > cross-compilers. PL/M was a cross-compiler and generated poorer code. > Pascal on small machines compiled to a virtual machine. As far as I can > tell, during most of the 70s there was no other compiler that generated > good quality code and ran natively on a small (i.e. PDP-11 class) machine= . > > As far as I can tell the uniqueness was mostly in the =E2=80=9Cc1=E2=80= =9D phase of the > compiler. The front-end code of the =E2=80=9Cc0=E2=80=9D phase seems to u= se more or less > similar techniques as many contemporary compilers. The =E2=80=9Cc1=E2=80= =9D phase seems to > have been unique in that it managed to do register allocation and > instruction selection with a pattern matcher and associated code tables > squeezed into a small address space. On a small machine, other native > compilers of the era typically proceeded to generate threaded code, code > for a virtual machine or poor quality native code that evaluated > expressions using stack operations rather than registers. > > I am not sure why DMR's approach was not more widely used in the 1970=E2= =80=99s. > The algorithms he used do not seem to be new and appear to have their roo= ts > in other (larger) compilers of the 1960=E2=80=99s. The basic design seems= to have > been in place from the very first iterations of his compiler in 1972 (see > V2 tree on TUHS) and he does not mention these algorithms as being specia= l > or innovative in his later papers. > > Any observations / opinions on why DMR=E2=80=99s approach was not more wi= dely used > in the 1970=E2=80=99s? > --0000000000000f27ab0617e46b52 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
I'm not sure I accept your starting position. There were seve= ral compilers for RT-11 and RSX/11-M. RSX (and perhaps RT) Fortran were thr= eaded code, but I don't believe they all were. And of course there was = BCPL, which was - and is - tiny; was it on the 11?

And there were other sm= all machines from other manufacturers, all of which had some form of Fortra= n and other bespoke things, such as RPG on the small IBMs. I think the uniq= ueness was in the set of conditions more than in the Unix C compiler itself= .
=
But you may be right.

-rob




On Wed, May 8, 2024 at 6:59= =E2=80=AFAM Paul Ruizendaal <pnr@planet= .nl> wrote:
=20 =20 =20
In the last= months, I've spent a little time on curating John Walker's Unix cl= one and software stack, including an emulator to run it:
https://gitlab.com/mari= nchip

After c= reating a basic tool chain (edit, asm, link and a simple executive), John= =C2=A0 set out to find a compiler. Among the first programs were a port of = the META 3 compiler-generator (similar to TMG on early Unix) and a port of = Birch-Hansen=E2=80=99s Pascal compiler. META was used to create a compiler = that generated threaded code. He found neither compiler good enough for his= goals and settled on writing his Unix-like OS in assembler. As the 9900 ar= chitecture withered after 1980, this sealed the fate of this OS early on --= had he found a good compiler, the code might have competed alongside Coher= ent, Idris, and Minix during the 80=E2=80=99s.

This ma= de me realise once more how unique the Ritchie C compiler was. In my view i= ts uniqueness combines three aspects:
1. The = C language itself
2. The = ability to run natively on small hardware (even an LSI-11 system)
3. Gene= rating code with modest overhead versus handwritten assembler (say 30%)

As has = been observed before, working at a higher abstraction level makes it easier= to work on algorithms and on refactoring, often earning back the efficienc= y loss. John Walkers work may be case in point: I estimate that his hand-co= ded kernel is 10% larger than an equivalent V6 Unix kernel (as compiled for= the 9900 architecture).

There a= re three papers on DMR=E2=80=99s website about the history of the compiler = and a compare-and-contrast with other compilers of the era:
https://www.bell-labs.com/usr/dmr/www/primevalC.html
htt= ps://www.bell-labs.com/usr/dmr/www/chist.html
http= s://www.bell-labs.com/usr/dmr/www/hopl.html

It seem= s to me that these papers rather understate the importance of generating go= od quality code. As far as I can tell, BCPL and BLISS came close, but were = too large to run on a PDP-11 and only existed as cross-compilers. PL/M was = a cross-compiler and generated poorer code. Pascal on small machines compil= ed to a virtual machine. As far as I can tell, during most of the 70s there= was no other compiler that generated good quality code and ran natively on= a small (i.e. PDP-11 class)=C2=A0machine.

As far = as I can tell the uniqueness was mostly in the =E2=80=9Cc1=E2=80=9D phase o= f the compiler. The front-end code of the =E2=80=9Cc0=E2=80=9D phase seems = to use more or less similar techniques as many contemporary compilers. The = =E2=80=9Cc1=E2=80=9D phase seems to have been unique in that it managed to = do register allocation and instruction selection with a pattern matcher and= associated code tables squeezed into a small address space. On a small mac= hine, other native compilers of the era typically proceeded to generate thr= eaded code, code for a virtual machine or poor quality native code that eva= luated expressions using stack operations rather than registers.

I am no= t sure why DMR's approach was not more widely used in the 1970=E2=80=99= s. The algorithms he used do not seem to be new and appear to have their ro= ots in other (larger) compilers of the 1960=E2=80=99s. The basic design see= ms to have been in place from the very first iterations of his compiler in = 1972 (see V2 tree on TUHS) and he does not mention these algorithms as bein= g special or innovative in his later papers.

Any obs= ervations / opinions on why DMR=E2=80=99s approach was not more widely used= in the 1970=E2=80=99s?
--0000000000000f27ab0617e46b52--