From: Doug McIlroy
Date: Fri, 13 Mar 2020 19:31:36 -0400
To: tuhs@tuhs.org
Subject: [TUHS] The most surprising Unix programs

Once in a while a new program really surprises me. Reminiscing a while
ago, I came up with a list of eye-opening Unix gems. Only a couple of
these programs are indispensable or much used. What singles them out is
their originality. I cannot imagine myself inventing any of them.

What programs have struck you similarly?

PDP-7 Unix

The simplicity and power of the system caused me to turn away from big
iron to a tiny machine. It offered the essence of the hierarchical file
system, separate shell, and user-level process control that Multics had
yet to deliver after hundreds of man-years' effort. Unix's lacks (e.g.
record structure in the file system) were as enlightening and liberating
as its novelties (e.g. shell redirection operators).

dc

The math library for Bob Morris's variable-precision desk calculator
used backward error analysis to determine the precision necessary at
each step to attain the user-specified precision of the result. In my
software-components talk at the 1968 NATO conference on software
engineering, I posited measurement-standard routines, which could
deliver results of any desired precision, but did not know how to design
one. dc still has the only such routines I know of.

typo

Typo ordered the words of a text by their similarity to the rest of the
text. Typographic errors like "hte" tended to the front (dissimilar) end
of the list. Bob Morris proudly said it would work as well on Urdu as it
did on English.
Although typo didn't help with phonetic misspellings, it was a
godsend for amateur typists, and got plenty of use until the advent of a
much less interesting, but more precise, dictionary-based spelling
checker.

Typo was as surprising inside as it was outside. Its similarity measure
was based on trigram frequencies, which it counted in a 26x26x26 array.
The small memory, which had barely room enough for 1-byte counters,
spurred a scheme for squeezing large numbers into small counters. To
avoid overflow, counters were updated probabilistically to maintain an
estimate of the logarithm of the count.

eqn

With the advent of phototypesetting, it became possible, but hideously
tedious, to output classical math notation. Lorinda Cherry set out to
devise a higher-level description language and was soon joined by Brian
Kernighan. Their brilliant stroke was to adapt oral tradition into
written expression, so eqn was remarkably easy to learn. The first of
its kind, eqn has barely been improved upon since.

struct

Brenda Baker undertook her Fortran-to-Ratfor converter against the
advice of her department head--me. I thought it would likely produce an
ad hoc reordering of the original, freed of statement numbers, but
otherwise no more readable than a properly indented Fortran program.
Brenda proved me wrong. She discovered that every Fortran program has a
canonically structured form. Programmers preferred the canonicalized
form to what they had originally written.

pascal

The syntax diagnostics from the compiler made by Sue Graham's group at
Berkeley were the most helpful I have ever seen--and they were generated
automatically. At a syntax error the compiler would suggest a token that
could be inserted that would allow parsing to proceed further. No
attempt was made to explain what was wrong. The compiler taught me
Pascal in an evening, with no manual at hand.
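
The counter scheme described under typo above is what is now usually
called approximate (Morris) counting. A minimal sketch in Python follows.
It is not typo's actual code: the class name, the base-2 logarithm, and
the use of Python's random module are all assumptions made for
illustration. The stored value is an exponent that goes up by one with
probability 2**-exponent, so a single byte can stand in for an
astronomically large count.

```python
import random


class ApproxCounter:
    """Hypothetical small counter that stores roughly log2 of the true count."""

    def __init__(self, rng=None):
        self.exponent = 0                  # the only stored state; stays tiny
        self.rng = rng or random.Random()  # injectable for reproducible runs

    def increment(self):
        # Advance the exponent with probability 2**-exponent. Under this
        # rule 2**exponent - 1 is an unbiased estimate of the number of calls.
        if self.rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self):
        # Recover an estimate of the count from the stored logarithm.
        return 2 ** self.exponent - 1


if __name__ == "__main__":
    counter = ApproxCounter(random.Random(1))
    for _ in range(100_000):
        counter.increment()
    print("stored exponent:", counter.exponent)
    print("estimated count:", counter.estimate())
```

Any one estimate is coarse, since it can only take values of the form
2**k - 1, but the stored exponent grows only logarithmically with the
count, which is the trade the text describes.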

parts

Hidden inside WWB (Writer's Workbench), Lorinda Cherry's Parts annotated
English text with parts of speech, based on only a smidgen of English
vocabulary, orthography, and grammar. From Parts markup, WWB inferred
stylometrics such as the prevalence of adjectives, subordinate clauses,
and compound sentences. The Today show picked up on WWB and interviewed
Lorinda about it in the first TV exposure of anything Unix.

egrep

Al Aho expected his deterministic regular-expression recognizer would
beat Ken's classic nondeterministic recognizer. Unfortunately, for
single-shot use on complex regular expressions, Ken's could finish while
egrep was still busy building a deterministic automaton. To finally gain
the prize, Al sidestepped the curse of the automaton's exponentially big
state table by inventing a way to build on the fly only the table
entries that are actually visited during recognition.

crabs

Luca Cardelli's charming meta-program for the Blit window system
released crabs that wandered around in empty screen space nibbling away
at the ever more ragged edges of active windows.

Some common threads

Theory, though invisible on the surface, played a crucial role in the
majority of these programs: typo, dc, struct, pascal, egrep. In fact
much of their surprise lay in the novelty of the application of theory.

Originators of nearly half the list--pascal, struct, parts, eqn--were
women, well beyond women's demographic share of computer science.

Doug McIlroy
March, 2020
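
The on-the-fly table construction described under egrep above can be
illustrated with a small sketch in Python. It is not Aho's code: the
class and function names are hypothetical, and the nondeterministic
automaton is written out by hand rather than compiled from a regular
expression. The example language, "the n-th character from the end is
a", is a standard case in which the full deterministic automaton needs
2**n states.

```python
class LazyDFA:
    """Hypothetical recognizer that fills in DFA table entries on demand."""

    def __init__(self, nfa, start, accepting):
        self.nfa = nfa                    # {(state, symbol): set of next states}
        self.start = frozenset([start])   # a DFA state is a set of NFA states
        self.accepting = accepting
        self.table = {}                   # (dfa_state, symbol) -> dfa_state

    def step(self, dstate, symbol):
        key = (dstate, symbol)
        if key not in self.table:
            # First visit: compute this one entry of the subset construction.
            nxt = set()
            for s in dstate:
                nxt |= self.nfa.get((s, symbol), set())
            self.table[key] = frozenset(nxt)
        return self.table[key]

    def matches(self, text):
        dstate = self.start
        for ch in text:
            dstate = self.step(dstate, ch)
        return bool(dstate & self.accepting)


def nth_from_end_nfa(n):
    """NFA over {a, b} accepting strings whose n-th character from the end is a."""
    nfa = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        nfa[(i, "a")] = {i + 1}
        nfa[(i, "b")] = {i + 1}
    return nfa, 0, frozenset([n])


if __name__ == "__main__":
    nfa, start, accepting = nth_from_end_nfa(20)
    dfa = LazyDFA(nfa, start, accepting)
    print(dfa.matches("bbbbb" + "a" + "b" * 19))
    print(dfa.matches("b" * 25))
    print("table entries built:", len(dfa.table))
```

Each input character adds at most one table entry, so the work is
bounded by the length of the text rather than by the size of the full
automaton, and entries already built are reused on later visits.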