The Unix Heritage Society mailing list
* Re: [TUHS] The most surprising Unix programs
@ 2020-03-20 14:03 Noel Chiappa
  2020-03-20 14:08 ` Richard Salz
                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Noel Chiappa @ 2020-03-20 14:03 UTC
  To: tuhs; +Cc: jnc

    > From: Paul Guertin

    > I teach math in college ... Sometimes, during an exam, a student who
    > forgot to bring their calculator will ask if they can borrow mine. I
    > always say "sure, but you'll regret it" and hand them the calculator.
    > After wasting one or two minutes, they give it back.

Maybe I'm being clueless/over-asking, but to me it's appalling that any
college student (at least all who have _any_ math requirement at all; not sure
how many that is) doesn't know how an RPN calculator works. It's not exactly
rocket science, and any reasonably intelligent high-schooler should get it
extremely quickly; just tell them it's just a representational thing, number
number operator instead of number operator number. I know it's not a key
intellectual skill, but it does seem to me to be part of common intellectual
heritage that everyone should know, like musical scales or poetry
rhyming. Have you ever considered taking two minutes (literally!) to cover it
briefly, just 'someone tried to borrow my RPN calculator, here's the basic
idea of how they work'?
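
As a rough illustration of the "number number operator" idea, here is a
minimal Python sketch of a stack-based RPN evaluator. The function name
`rpn` and the four-operator table are assumptions made for the example;
it is not modelled on any particular calculator.

```python
import operator

# Binary operators supported by this toy evaluator.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def rpn(expression):
    """Evaluate a whitespace-separated RPN expression and return the result."""
    stack = []
    for token in expression.split():
        if token in OPS:
            right = stack.pop()   # the most recently entered operand
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))   # operands are simply pushed
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]

print(rpn("3 4 +"))       # infix: 3 + 4
print(rpn("3 4 + 2 *"))   # infix: (3 + 4) * 2
```

Because operands are pushed and each operator consumes the top two
entries, no parentheses or precedence rules are needed.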

	Noel


* Re: [TUHS] The most surprising Unix programs
@ 2020-03-21  1:12 Noel Chiappa
  0 siblings, 0 replies; 54+ messages in thread
From: Noel Chiappa @ 2020-03-21  1:12 UTC
  To: tuhs; +Cc: jnc

    > From: Dagobert Michelsen

    > the excellent book "Gödel, Escher, Bach: An Eternal Golden Braid"
    > from Douglas R. Hofstadter which also gives a nice introduction into
    > logic and philosophy.

IIRC, the focus of the book is how systems made out of simple components can
exhibit complex behaviours; in particular, how information-processing systems
can come to develop self-awareness.

    > From: Chet Ramey

    > One of the best books I read in high school. 
    
A book on a very similar topic to GEB, which was _extremely_ important in
developing my understanding of how the universe came to be, is "Recursive
Universe", by William Poundstone, which I recommend very highly to everyone
here. It's still in print, which is really good, because it's not as well
known as it should (IMO) be. It uses an analogy with Conway's Life to explain
how the large-scale structure of the universe can develop from a random
initial state. Buy it now!

	Noel


* Re: [TUHS] The most surprising Unix programs
@ 2020-03-19 20:57 Nelson H. F. Beebe
  2020-03-19 21:18 ` Tomasz Rola
  2020-03-20  7:14 ` arnold
  0 siblings, 2 replies; 54+ messages in thread
From: Nelson H. F. Beebe @ 2020-03-19 20:57 UTC
  To: tuhs

Tomasz Rola writes on Thu, 19 Mar 2020 21:01:20 +0100 about awk:

>> One task I would be afraid to use awk for, is html processing. Most of
>> html sources I look at nowadays seems discouraging. Extracting
>> anything of value from the mess requires something more potent, I
>> think.

If you want to tackle raw HTML from arbitrary sources, then I agree with
you: most HTML on the Web is not grammar conformant, there are
numerous vendor extensions, and the HTML is hideously idiosyncratic
and irregularly formatted.

The solution that I adopted 25 years ago was to write a
grammar-recognizing, but violation-lenient, prettyprinter for HTML.  It has
served well and I use it many times daily for my work in the BibNet
Project and TeX User Group bibliography archives, now approaching 1.55
million entries.  The latest public release is available here:

	http://www.math.utah.edu/pub/sgml/

I notice that the last version there is 1.01; I'll get that updated in
a couple of days to the latest 1.03 [subject to delays due to major
work dislocations due to the virus].  The code should install anywhere
in the Unix family without problems: I build and validate it on more
than 300 O/Ses in our test farm.

With standardized HTML, applying awk is easy, and I have more than 450
awk programs, and 380,000 lines of code, that process publisher
metadata to produce rough BibTeX entries that numerous other tools,
and some manual editing, turn into clean data for free access on the
Web.

For some journals, I run a single command of fewer than 15 characters
to download Web pages for journal issues for which I do not yet have
data, and then a single journal-specific command with no arguments
that runs a large shell script with a long pipeline that outputs
relatively clean BibTeX that then normally takes me only a couple of
minutes to visually validate in an editor session.  The major work
there is bracing of proper nouns in titles that my software did not
already handle, thereby preventing downcasing of those words in the
many bibliography styles that do so.
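
The bracing step can be pictured with a small Python sketch. This is only
an illustration under stated assumptions: the word list `PROPER_NOUNS` and
the function `brace_proper_nouns` are hypothetical names invented here, and
the tools described above are awk programs and shell scripts, not this code.

```python
import re

# Hypothetical word list; a real one would be far larger and hand-curated.
PROPER_NOUNS = ["Unix", "Fortran", "Markov", "TeX"]

def brace_proper_nouns(title, nouns=PROPER_NOUNS):
    """Wrap each known proper noun in braces unless it is already braced."""
    for noun in nouns:
        # Skip matches inside a longer word (e.g. TeX in BibTeX) and
        # matches that already sit directly inside braces.
        pattern = r"(?<![{\w])" + re.escape(noun) + r"(?![}\w])"
        title = re.sub(pattern, "{" + noun + "}", title)
    return title

print(brace_proper_nouns("A Markov model of Unix usage"))
```

A bibliography style that lowercases titles leaves braced words alone, so
the braced nouns keep their capitals.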

I'm on journal announcement lists for many publishers, so I often have
new data released to the Web just 5 to 10 minutes after receiving
e-mail about new issues.

The above-mentioned archives are at
	
	http://www.math.utah.edu/pub/bibnet
	http://www.math.utah.edu/pub/tex/bib
	http://www.math.utah.edu/pub/tex/bib/index-table.html
	http://www.math.utah.edu/pub/tex/bib/idx
	http://www.math.utah.edu/pub/tex/bib/toc	

They are mirrored at Universität Karlsruhe, Oak Ridge National
Laboratory, Sandia National Laboratory, and elsewhere.

Like Al Aho, Doug McIlroy, and Arnold Robbins, I'm a huge fan of awk;
I believe that I was the first to port it to PDP-10 TOPS-20 and VAX
VMS in the mid-1980s, and it is one of the first mandatory tools that
I install on any new computer.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe@math.utah.edu  -
- 155 S 1400 E RM 233                       beebe@acm.org  beebe@computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------

* [TUHS] The most surprising Unix programs
@ 2020-03-13 23:31 Doug McIlroy
  2020-03-14  0:40 ` Dave Horsfall
                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Doug McIlroy @ 2020-03-13 23:31 UTC
  To: tuhs

Once in a while a new program really surprises me. Reminiscing a while
ago, I came up with a list of eye-opening Unix gems. Only a couple of
these programs are indispensable or much used. What singles them out is
their originality. I cannot imagine myself inventing any of them.

What programs have struck you similarly?

PDP-7 Unix

The simplicity and power of the system caused me to turn away from big
iron to a tiny machine. It offered the essence of the hierarchical
file system, separate shell, and user-level process control that Multics
had yet to deliver after hundreds of man-years' effort. Unix's lacks
(e.g. record structure in the file system) were as enlightening and
liberating as its novelties (e.g. shell redirection operators).

dc

The math library for Bob Morris's variable-precision desk calculator
used backward error analysis to determine the precision necessary at
each step to attain the user-specified precision of the result. In
my software-components talk at the 1968 NATO conference on software
engineering, I posited measurement-standard routines, which could deliver
results of any desired precision, but did not know how to design one. dc
still has the only such routines I know of.

typo

Typo ordered the words of a text by their similarity to the rest of the
text. Typographic errors like "hte" tended to the front (dissimilar) end
of the list. Bob Morris proudly said it would work as well on Urdu as it
did on English. Although typo didn't help with phonetic misspellings,
it was a godsend for amateur typists, and got plenty of use until the
advent of a much less interesting, but more precise, dictionary-based
spelling checker.

Typo was as surprising inside as it was outside. Its similarity
measure was based on trigram frequencies, which it counted in a 26x26x26
array. The small memory, which had barely room enough for 1-byte counters,
spurred a scheme for squeezing large numbers into small counters. To
avoid overflow, counters were updated probabilistically to maintain an
estimate of the logarithm of the count.
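
A minimal Python sketch of such a probabilistic counter follows. It is not
typo's actual code: the base-2 update rule, the flat table layout, and the
names `increment`, `estimate`, and `trigram_index` are all assumptions
chosen to illustrate the idea of keeping an approximate logarithm in one
byte.

```python
import random

def increment(exponent, rng=random):
    """Probabilistically bump a one-byte counter holding roughly log2(count).

    With stored value c the counter is raised with probability 2**-c, so
    2**c - 1 is an unbiased estimate of the number of events seen.
    """
    if exponent < 255 and rng.random() < 2.0 ** -exponent:
        return exponent + 1
    return exponent

def estimate(exponent):
    """Convert the stored exponent back to an estimated event count."""
    return 2 ** exponent - 1

def trigram_index(a, b, c):
    """Map three lowercase letters to a slot in a flat 26*26*26 table."""
    return ((ord(a) - 97) * 26 + (ord(b) - 97)) * 26 + (ord(c) - 97)

counters = bytearray(26 * 26 * 26)   # one byte per trigram

rng = random.Random(1)               # seeded only to make runs repeatable
slot = trigram_index("t", "h", "e")
for _ in range(10000):               # pretend "the" was seen 10000 times
    counters[slot] = increment(counters[slot], rng)

print(counters[slot], estimate(counters[slot]))
```

The stored value grows by one only about as often as the true count
doubles, so a single byte cannot realistically overflow, at the price of
a coarse estimate.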

eqn

With the advent of phototypesetting, it became possible, but hideously
tedious, to output classical math notation. Lorinda Cherry set out to
devise a higher-level description language and was soon joined by Brian
Kernighan. Their brilliant stroke was to adapt oral tradition into written
expression, so eqn was remarkably easy to learn. The first of its kind,
eqn has barely been improved upon since.

struct

Brenda Baker undertook her Fortran-to-Ratfor converter against the advice
of her department head--me. I thought it would likely produce an ad hoc
reordering of the original, freed of statement numbers, but otherwise no
more readable than a properly indented Fortran program. Brenda proved
me wrong. She discovered that every Fortran program has a canonically
structured form. Programmers preferred the canonicalized form to what
they had originally written.

pascal

The syntax diagnostics from the compiler made by Sue Graham's group at
Berkeley were the most helpful I have ever seen--and they were generated
automatically. At a syntax error the compiler would suggest a token that
could be inserted that would allow parsing to proceed further. No attempt
was made to explain what was wrong. The compiler taught me Pascal in
an evening, with no manual at hand.

parts

Hidden inside WWB (writer's workbench), Lorinda Cherry's Parts annotated
English text with parts of speech, based on only a smidgen of English
vocabulary, orthography, and grammar. From Parts markup, WWB inferred
stylometrics such as the prevalence of adjectives, subordinate clauses,
and compound sentences. The Today show picked up on WWB and interviewed
Lorinda about it in the first TV exposure of anything Unix.

egrep

Al Aho expected his deterministic regular-expression recognizer would beat
Ken's classic nondeterministic recognizer. Unfortunately, for single-shot
use on complex regular expressions, Ken's could finish while egrep was
still busy building a deterministic automaton. To finally gain the prize,
Al sidestepped the curse of the automaton's exponentially big state table
by inventing a way to build on the fly only the table entries that are
actually visited during recognition.
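
The on-the-fly idea can be sketched in Python as below. This is a toy under
stated assumptions, not egrep's implementation: the class name `LazyDFA`,
the dictionary representation of the nondeterministic automaton, and the
hand-built automaton for `(a|b)*abb` are all invented for the example, and
epsilon moves are left out to keep it short.

```python
class LazyDFA:
    """Simulate an NFA, caching each DFA table entry the first time it is used."""

    def __init__(self, nfa, start, accepting):
        self.nfa = nfa                    # {(state, symbol): set of next states}
        self.start = frozenset([start])   # a DFA state is a set of NFA states
        self.accepting = accepting
        self.table = {}                   # {(dfa_state, symbol): dfa_state}

    def step(self, dfa_state, symbol):
        key = (dfa_state, symbol)
        if key not in self.table:
            # Subset construction, but for this single table entry only.
            nxt = set()
            for s in dfa_state:
                nxt |= self.nfa.get((s, symbol), set())
            self.table[key] = frozenset(nxt)
        return self.table[key]

    def matches(self, text):
        state = self.start
        for ch in text:
            state = self.step(state, ch)
        return bool(state & self.accepting)

# Hand-built NFA for (a|b)*abb.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

dfa = LazyDFA(NFA, start=0, accepting={3})
print(dfa.matches("ababb"))   # True
print(dfa.matches("abab"))    # False
print(len(dfa.table))         # 4 entries were built; the rest never were
```

Only the entries the input actually reaches are ever computed, so the cost
of the full deterministic table is paid only if the input forces it.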

crabs

Luca Cardelli's charming meta-program for the Blit window system released
crabs that wandered around in empty screen space nibbling away at the
ever more ragged edges of active windows.

Some common threads

Theory, though invisible on the surface, played a crucial role in the
majority of these programs: typo, dc, struct, pascal, egrep. In fact
much of their surprise lay in the novelty of the application of theory.

Originators of nearly half the list--pascal, struct, parts, eqn--were
women, well beyond women's demographic share of computer science.

Doug McIlroy 
March, 2020


Thread overview: 54+ messages
2020-03-20 14:03 [TUHS] The most surprising Unix programs Noel Chiappa
2020-03-20 14:08 ` Richard Salz
2020-03-20 14:52   ` Larry McVoy
2020-03-20 14:58     ` Dagobert Michelsen
2020-03-20 15:05       ` Richard Salz
2020-03-20 22:09       ` Mike Markowski
2020-03-20 15:03     ` Gregg Levine
2020-03-20 15:05       ` Chet Ramey
2020-03-20 22:06     ` Dave Horsfall
2020-03-21  4:59     ` Wesley Parish
2020-03-20 21:57   ` Dave Horsfall
2020-03-22 18:05     ` Tony Finch
2020-03-20 15:07 ` Nemo
2020-03-20 19:03   ` Adam Thornton
2020-03-20 16:07 ` Grant Taylor via TUHS
2020-09-13 15:44   ` Juergen Nickelsen
  -- strict thread matches above, loose matches on Subject: below --
2020-03-21  1:12 Noel Chiappa
2020-03-19 20:57 Nelson H. F. Beebe
2020-03-19 21:18 ` Tomasz Rola
2020-03-20  7:14 ` arnold
2020-03-20  7:49   ` Thomas Paulsen
2020-03-20  8:18     ` arnold
2020-03-13 23:31 Doug McIlroy
2020-03-14  0:40 ` Dave Horsfall
2020-03-14 11:30 ` Harald Arnesen
2020-03-14 12:24   ` Clem Cole
2020-03-15 22:01     ` Rob Pike
2020-03-15 22:14       ` Larry McVoy
2020-03-15 23:34         ` Warner Losh
2020-03-16  2:45           ` Anthony Martin
2020-03-15 22:30       ` Clem Cole
2020-03-15 23:20       ` Dave Horsfall
2020-03-16  0:56         ` Rob Pike
2020-03-20 23:20           ` Dave Horsfall
2020-03-20 23:35             ` Toby Thain
2020-03-21  0:34             ` Rob Pike
2020-03-17 13:03 ` ca6c
2020-03-17 13:30   ` Andy Kosela
2020-03-17 14:53     ` Cág
2020-03-17 14:57       ` Larry McVoy
2020-03-17 14:59         ` Arrigo Triulzi
2020-03-17 15:40   ` Steve Nickolas
2020-03-17 22:28   ` Dave Horsfall
2020-03-18  0:17     ` Jon Steinhart
2020-03-18  3:28       ` Dave Horsfall
2020-03-18  8:40     ` arnold
2020-03-19 12:26     ` Mike Markowski
2020-03-19 21:31       ` Dave Horsfall
2020-03-20 11:48         ` paul
2020-03-20 15:40           ` Grant Taylor via TUHS
2020-03-20 16:40             ` Jon Steinhart
2020-03-20 17:23               ` Grant Taylor via TUHS
2020-03-20 18:43               ` Rich Morin
2020-03-19 20:01   ` Tomasz Rola
