From: Marc Donner
Date: Fri, 9 Dec 2022 16:49:03 -0500
To: The Eunuchs Hysterical Society
CC: Douglas McIlroy
Subject: [TUHS] Bringing a Chainsaw
List-Id: The Unix Heritage Society mailing list

(Recently I mentioned to Doug McIlroy that I had infiltrated IBM East Fishkill, reputedly one of the largest semiconductor fabs in the world, with UNIX back in the 1980s. He suggested that I write it up and share it here, so here it is.)

In 1986 I was working at IBM Research in Yorktown Heights, New York. I had rejoined in 1984 after completing my PhD in computer science at CMU.

One day I got a phone call from Rick Dill. Rick, a distinguished physicist who had, among other things, invented a technique that was key to economically fabricating semiconductor lasers, had been my first boss at IBM Research. While I’d been in Pittsburgh he had taken an assignment at IBM’s big semiconductor fab up in East Fishkill. He was working to make production processes there more efficient. He was about to initiate a major project, with a large capital cost, that involved deploying a bunch of computers and he wanted a certified computer scientist at the project review. He invited me to drive up to Fishkill, about half an hour north of the research lab, to attend a meeting. I agreed.

At the meeting I learned several things. First of all, the chipmaking process involved many steps - perhaps fifty or sixty for each wafer full of chips. The processing steps individually were expensive, and the amount spent on each wafer was substantial. Because processing was imperfect, it was imperative to check the results every few steps to make sure everything was OK. Each wafer included a number of test articles, landing points for test probes, scattered around the surface. Measurements of these test articles were carried out on a special piece of equipment, I think bought from Fairchild Semiconductor. It would take in a boat of wafers (identical wafers were grouped together on special ceramic holders called boats, for automatic handling, and all processed identically), feed each wafer to the test station, and probe each test article in turn. The result was about a megabyte of data covering all of the wafers in the boat.

At this point the data had to be analyzed. The analysis program comprised an interpreter called TAHOE along with a test program, one for each different wafer being fabricated. The results indicated whether the wafers in the boat were good, needed some rework, or had to be discarded.

These were the days before local area networking at IBM, so getting the data from the test machine to the mainframe for analysis involved numerous manual steps and took about six hours. To improve quality control, each boat of wafers was only worked during a single eight-hour shift, so getting the test results generally meant a 24-hour pause in the processing of the boat, even though the analysis only took a couple of seconds of time on the mainframe.

IBM had recently released a physically small mainframe based on customized CPU chips from Motorola. This machine, the size of a large suitcase and priced at about a million dollars, was suitable to locate next to each test machine, thus eliminating the six-hour wait to see results.

Because there were something like 50 of the big test machines at the Fishkill site, the project represented a major capital expenditure. Getting funding of this size approved would take six to twelve months, and this meeting was the first step in seeking this approval.

At the end of the meeting I asked for a copy of the manual for the TAHOE test language. Someone gave me a copy and I took it home over the weekend and read through it.

The following Monday I called Rick up and told him that I thought I could implement an interpreter for the TAHOE language in about a month of work. That was a tiny enough investment that Rick simply wrote a letter to Ralph Gomory, then head of IBM Research, to requisition me for a month. I told the Fishkill folks that I needed a UNIX machine to do this work and they procured an RT PC running AIX 1. AIX 1 was based on System V. The critical thing to me was that it had lex, yacc, vi, and make.

They set me up in an empty lab room with the machine and a work table. Relatively quickly I built a lexical analyzer for the language in lex and got an approximation to the grammar for the TAHOE language working in yacc. The rest was implementing the functions for each of the TAHOE primitives.

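For readers who have not built an interpreter this way, the three stages just described (lexer, grammar, primitives) can be sketched roughly as follows. This is only an illustration: the post does not describe TAHOE's syntax, so the toy expression language, the token set, and the `mean`/`max`/`abs` primitives here are all made up, and Python with a hand-written recursive-descent parser stands in for lex, yacc, and C.

```python
import re

# Hypothetical token set for a toy language; TAHOE's real syntax is not given in the post.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<num>\d+(?:\.\d+)?)|(?P<name>[A-Za-z_]\w*)|(?P<op>[-+*/(),]))"
)

def tokenize(text):
    """Lexer stage (the role lex played): turn text into (kind, value) pairs."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens

# Primitive stage: one function per built-in, looked up by name (all invented here).
PRIMITIVES = {
    "mean": lambda *xs: sum(xs) / len(xs),
    "max": lambda *xs: max(xs),
    "abs": lambda x: abs(x),
}

class Parser:
    """Grammar stage (the role yacc played), evaluating as it parses."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self, value=None):
        kind, val = self.peek()
        if kind is None or (value is not None and val != value):
            raise SyntaxError(f"expected {value!r}, got {val!r}")
        self.i += 1
        return kind, val

    def expr(self):  # expr := term (('+' | '-') term)*
        result = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.take()
            rhs = self.term()
            result = result + rhs if op == "+" else result - rhs
        return result

    def term(self):  # term := factor (('*' | '/') factor)*
        result = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.take()
            rhs = self.factor()
            result = result * rhs if op == "*" else result / rhs
        return result

    def factor(self):  # factor := NUM | NAME '(' expr (',' expr)* ')' | '(' expr ')'
        kind, val = self.take()
        if kind == "num":
            return float(val)
        if kind == "name":
            self.take("(")
            args = [self.expr()]
            while self.peek() == ("op", ","):
                self.take()
                args.append(self.expr())
            self.take(")")
            return PRIMITIVES[val](*args)
        if val == "(":
            result = self.expr()
            self.take(")")
            return result
        raise SyntaxError(f"unexpected token {val!r}")

def evaluate(text):
    """Tokenize, parse, and evaluate one expression of the toy language."""
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek() != (None, None):
        raise SyntaxError("trailing input")
    return result
```

With these definitions, `evaluate("mean(1, 2, 3) + 2 * (4 - 1)")` returns `8.0`. The point of the split is the same as in the lex/yacc version: once the lexer and grammar are in place, adding a language feature is mostly a matter of adding one more entry to the primitives table.
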
I adopted rigorous test automation early, a practice people now call test-driven development. Each time I added a capability to the interpreter I wrote a scrap of TAHOE code to test it along with a piece of reference input. I created a test target in the testing Makefile that ran the interpreter with the test program and the reference input. There were four directories: one for test scripts, one for input data, one for expected outputs, and one for actual outputs. There was a big Makefile that had a target for each test. Running all of the tests was simply a matter of typing ‘make test’ in the root of the testing tree. Only if all of the tests succeeded would I consider a build acceptable.

As I developed the interpreter I learned to also build tests for bugs as I encountered them. This was because I discovered that I would occasionally reintroduce bugs, so these tests, with the same structure (test scrap, input data, reference output, make target), were very useful for catching backsliding before it got away from me.

After a while I had implemented the entire TAHOE language. I named the interpreter MONO after looking at the maps of the area near Lake Tahoe and seeing Mono Lake, a small lake nearby.

[Map: Lake Tahoe and Mono Lake, with walking routes between them. Source: Google Maps]

At this point I asked my handler at Fishkill for a set of real input data, a real test program, and a real set of output data. He got me the files and I set to work.

The only tricky bit at this stage was the difference in floating point between the RT PC machine, which used the recently adopted IEEE 754 floating point standard, and the idiosyncratic floating point implemented in the System/370 mainframes at the time. The problem was that the LSB rounding rules were different in the two machines, resulting in mismatches in results. These mismatches were way below the resolution of the actual data, but deciding how to handle this was tricky.

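The four-directory test layout and the floating-point mismatch described above can be combined in one small sketch. The post does not say how the mismatches were finally handled, so the tolerance-based comparison below is just one plausible approach, not the author's. The directory names (`scripts`, `inputs`, `expected`, `actual`), the file suffixes, and the `interpreter` callable are all assumptions made for illustration, and a Python function stands in for the per-test make targets.

```python
import math
from pathlib import Path

def outputs_match(expected, actual, rel_tol=1e-6, abs_tol=1e-12):
    """Compare two outputs token by token, letting numeric tokens differ slightly.

    Non-numeric tokens must match exactly; numeric tokens must agree within
    the given tolerances (an assumed policy, not one stated in the post).
    """
    exp_tokens, act_tokens = expected.split(), actual.split()
    if len(exp_tokens) != len(act_tokens):
        return False
    for e, a in zip(exp_tokens, act_tokens):
        try:
            same = math.isclose(float(e), float(a), rel_tol=rel_tol, abs_tol=abs_tol)
        except ValueError:
            same = e == a  # at least one token is not a number: compare as text
        if not same:
            return False
    return True

def run_suite(root, interpreter, rel_tol=1e-6):
    """Run every script under root/scripts and return the names of failing tests.

    `interpreter` is any callable (program_text, input_text) -> output_text.
    Hypothetical layout: scripts/NAME.*, inputs/NAME.dat, expected/NAME.out,
    with results written to actual/NAME.out.
    """
    root = Path(root)
    (root / "actual").mkdir(exist_ok=True)
    failures = []
    for script in sorted((root / "scripts").iterdir()):
        name = script.stem
        data = (root / "inputs" / f"{name}.dat").read_text()
        actual = interpreter(script.read_text(), data)
        (root / "actual" / f"{name}.out").write_text(actual)
        expected = (root / "expected" / f"{name}.out").read_text()
        if not outputs_match(expected, actual, rel_tol):
            failures.append(name)
    return failures
```

A build would count as acceptable only when `run_suite(...)` returns an empty list, which mirrors the "all tests must pass" rule. Keeping the tolerance as an explicit parameter makes the policy visible: it should sit well below the resolution of the measured data but above the last-bit differences between the two floating-point formats.
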
At this point I had an interpreter, MONO, for the TAHOE language that took one specific TAHOE program and some real data and produced output that matched the TAHOE output. Almost done.

I asked my handler, a lovely guy whose name I am ashamed I do not remember, to get me the regression test suite for TAHOE. He took me over and introduced me to the woman who managed the team that was developing and maintaining the TAHOE interpreter. The TAHOE interpreter had been under development, I gathered, for about 25 years and was a large amount of assembler code. I asked her for the regression test suite for the TAHOE interpreter. She did not recognize the term, but I was not dismayed - IBM had their own names for everything (disk was DASD and a boot program was IPL) and I figured it would be Polka Dot or something equally evocative. I described what my regression test suite did and her face lit up. “What a great idea!” she exclaimed.

Anyway, at that point I handed the interpreter code over to the Fishkill organization. C compilers were available for the PC by that time, so they were able to deploy it on PC-AT machines that they located at each testing machine. Since a PC-AT could be had for about $5,000 in those days, the savings relative to the original proposal were about $50 million and about a year of elapsed time. The analysis of a boat’s worth of data on the PC-AT took perhaps a minute or two, so quite a bit slower than on the mainframe, but the elimination of the six-hour delay meant that a boat could progress forward in its processing on the same day rather than a day later.

One of my final conversations with my Fishkill handler was about getting them some UNIX training. In those days the only way to get UNIX training was from AT&T. Doing business with AT&T at IBM in those days involved very high-level approvals - I think it required either the CEO or a direct report to the CEO.
He showed me the form he needed to get approved in order to take this course, priced at about $1,500 at the time. It required twelve signatures. When I expressed horror he noted that I shouldn’t worry because the first six were based in the building we were standing in. That’s when I began to grasp how big IBM was in those days.

Anyway, about five years later I left IBM. Just before I resigned the Fishkill folks invited me up to attend a celebratory dinner. Awards were given to many people involved in the project, including me. I learned that there was now a department of more than 30 people dedicated to maintaining the program that had taken me a month to build. Rick Dill noted that one of the side benefits of the approach that I had taken was the production of a formal grammar for the TAHOE language.

At one point near the end of the project I had a long reflective conversation with my Fishkill minder. He spun a metaphor about what I had done with this project. Roughly speaking, he said, “We were a bunch of guys cutting down trees by beating on them with stones. We heard that there was this thing called an axe, and someone sent a guy we thought would show us how to cut down trees with an axe. Imagine our surprise when he whipped out a chainsaw.”

=====
nygeek.net
mindthegapdialogs.com/home