This somewhat stale note was sent some time ago, but was ignored
because it was sent from an unregistered email address.
> And if the Unix patriarchs were perhaps mistaken about how useful
> "head" might be and whether or not it should have been considered
> verboten.
Point well taken.
I don't know which of head(1) and sed(1) came first. They appeared in
different places at more or less the same time. We in Research
declined to adopt head because we already knew the idiom "sed 10q".
However, one shouldn't have to do related operations in unrelated ways.
We finally admitted head in v10.
Head was independently invented by Mike Lesk. It was Lesk's
program that was deemed superfluous.
Head might not have been written if tail didn't exist. But, unlike head,
tail strayed from the tao of "do one thing well". Tail -r and tail -f are
as cringeworthy as cat -v.
-f is a strange feature that effectively turns a regular file into a pipe
with memory by polling for new data. A clean, general alternative
might be to provide an open(2) mode that makes reads at the current
file end block if some process has the file open for writing.
-r is weird because it enables backwards reading, but only as
limited by count. Better would be a program, say revfile, that simply
reads backwards by lines. Then tail p has an elegant implementation:
revfile p | head | revfile
Doug
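[Editorial aside: a minimal revfile along the lines Doug sketches might look like the following. This is a hypothetical illustration, not a historical program; it seeks byte-at-a-time for clarity and assumes a seekable input that ends with a newline.]

```c
#include <stdio.h>

/* Copy bytes [start, end) of fp to out. */
static void put_range(FILE *fp, FILE *out, long start, long end)
{
    long i;

    fseek(fp, start, SEEK_SET);
    for (i = start; i < end; i++)
        putc(getc(fp), out);
}

/* revfile: print the lines of a seekable file in reverse order. */
void revfile(FILE *fp, FILE *out)
{
    long size, pos, end;

    fseek(fp, 0L, SEEK_END);
    size = ftell(fp);
    end = size;                 /* one past the end of the current line */
    /* scan backwards; each newline before the final one ends a line */
    for (pos = size - 2; pos >= 0; pos--) {
        fseek(fp, pos, SEEK_SET);
        if (getc(fp) == '\n') {
            put_range(fp, out, pos + 1, end);
            end = pos + 1;
        }
    }
    put_range(fp, out, 0, end); /* the file's first line comes out last */
}
```

With such a filter, tail p is just revfile p | head | revfile. Note that the pipeline's final revfile reads from a pipe, which is not seekable, so a real implementation would also need to buffer non-seekable input.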
Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
> -r is weird because it enables backwards reading, but only as
> limited by count. Better would be a program, say revfile, that simply
> reads backwards by lines. Then tail p has an elegant implementation:
> revfile p | head | revfile
The GNU coreutils provides "tac" (c-a-t backwards) which does that
job. It was adopted from a long-ago posting of same on comp.sources.something.
It should be standard on just about any Linux system.
(It too has too many options, but let's not go there.)
Thanks,
Arnold
> On Jul 14, 2021, at 9:19 PM, arnold@skeeve.com wrote:
>
> The GNU coreutils provides "tac" (c-a-t backwards) ...
> (It too has too many options, but let's not go there.)

GNU pretty much anything has too many options.

I do aesthetically appreciate that tail -f is a bit of an abomination,
but realistically it's also 90% or more of my actual use of tail. There
obviously needs to be SOMETHING that lets you watch the end of a growing
file as it's growing.

Adam
Arnold:

> (It too has too many options, but let's not go there.)

Besides the usual help and version, only 3 options, which is not very
much for a GNU tool. An emulate-cat option is missing ;-)

Thanks,
On Wed, Jul 14, 2021 at 10:38:06PM -0400, Douglas McIlroy wrote:
> Head might not have been written if tail didn't exist. But, unlike head,
> tail strayed from the tao of "do one thing well". Tail -r and tail -f are
> as cringeworthy as cat -v.
>
> -f is a strange feature that effectively turns a regular file into a pipe
> with memory by polling for new data, A clean general alternative
> might be to provide an open(2) mode that makes reads at the current
> file end block if some process has the file open for writing.

OTOH, this would mean adding more functionality (read: complexity) into
the kernel, and there has always been a general desire to avoid pushing
<stuff> into the kernel when it can be done in userspace. Do you really
think using a blocking read(2) is somehow superior to using select(2)
to wait for new data to be appended to the file?

And even if we did this using a new open(2) mode, are you saying we
should have a separate executable in /bin which would then be identical
to cat, except that it uses a different open(2) mode?

> -r is weird because it enables backwards reading, but only as
> limited by count. Better would be a program, say revfile, that simply
> reads backwards by lines. Then tail p has an elegant implementation:
> revfile p | head | revfile

I'll note, with amusement, that -r is one option which is *NOT* in the
GNU version of tail. I see it in FreeBSD, but this looks like a
BSD'ism. So for those who like to claim that the GNU utilities are
laden with useless options, this is one which can't be laid at the feet
of GNU coreutils.

- Ted
Some comments from someone (me) who tends to be pickier than most about
cramming programs together and endless sets of options:

I, too, had always thought sed was older than head. I stand corrected.
I have a long-standing habit of typing sed 10q but don't spend much
time fussing about head.

When I arrived at Bell Labs in late summer 1984, tail -f was in
/usr/bin and in the manual; readslow was only in /usr/bin. readslow
was like tail -f, except it either printed the entire file first or
(option -e) started at the end of the file. I was told readslow had
come first, and had been invented in a hurry because people wanted to
watch in real time the moves logged by one of Belle's chess matches.
Vague memory says it was written by pjw; the name and the code style
seem consistent with that.

Personally I feel like tail -r and tail -f both fit reasonably well
within what tail does, since both have to do with the bottom of the
file, though -r's implementation does make for a special extra code
path in tail, so maybe a separate program is better. What I think is a
bigger deal is that I have frequently missed tail -r on Linux systems,
and somehow hadn't spotted tac; thanks to whoever here (was it Ted?)
pointed it out first!

On the other hand, adding data-processing functions to cat has never
made sense to me. It seems to originate from a mistaken notion that
cat's focus is printing data on terminals, rather than concatenating
data from different places. Here is a test: if cat -v and cat -n and
all that make sense, why shouldn't cat also subsume tr and pr and even
grep? What makes converting control characters and numbering lines so
different from swapping case and adding page headers? I don't see the
distinction, and so I think vis(1) (in later Research) makes more
sense than cat -v, and nl(1) (in Linux for a long time) more sense
than cat -n. (I'd also happily argue that given nl, pr shouldn't
number lines. That a program was in V6 or V7 doesn't make it perfect.)
And all those special options to wc that amounted to doing arithmetic
on the output were always just silly. I'm glad they were retracted.

On the other other hand, why didn't I know about tac? Because there
are so damn many programs in /usr/bin these days. When I started with
UNIX ca. 1980, the manual (even the BSD version) was still short
enough that one could sit down and read it through, section by
section, and keep track of what one had read, and remember what all
the different tools did. That hasn't been true for decades.

This could be an argument for adding to existing programs (which many
people already know about) rather than adding new programs (which many
people will never notice). The real problem is that the system is just
too damn big. On an Ubuntu 18.04 system I run, ls /usr/bin | wc -l
shows 3242 entries. How much of that is redundant? How much is rarely
or never used? Nobody knows, and I suspect few even try to find out.
And because nobody knows, few are brave enough to throw things away,
or even trim out bits of existing things.

One day in the late 1980s, I helped out with an Introduction to UNIX
talk at a DECUS symposium. One of the attendees noticed the `total'
line in the output of ls, and asked: why is that there? Doesn't that
contradict the principles of tools' output you've just been talking
about? I thought about it, and said yes, you're right, that's a bit of
old history and shouldn't be there any more. When I got home to New
Jersey, I took the `total' line out of Research ls.

Good luck doing anything like that today.

Norman Wilson
Toronto ON
On the subject of tac (concatenate and print files in reverse), I can
report that the tool was written by my late friend Jay Lepreau in the
Department of Computer Science (now, School of Computing) at the
University of Utah. The GNU coreutils distribution's src/tac.c carries
a copyright of 1988-2020.

I searched my TOPS-20 PDP-10 archives, and found no source code for
tac, but I did find an older TOPS-20 executable in Jay's personal
directory with a file date of 17-Mar-1987. There isn't much else in
that directory, so I suspect that he just copied over a needed tool
from his Department of Computer Science TOPS-20 system to ours in the
College of Science.

----------------------------------------

P.S. Jay was the first to get Steve Johnson's Portable C Compiler,
pcc, to run on the 36-bit PDP-10, and once we had pcc, we began the
move from writing utilities in Pascal and PDP-10 assembly language to
doing them in C. The oldest C file for pcc in our PDP-10 archives is
dated 17-Mar-1981, with other pcc files dated to mid-1983, and final
compiler executables dated 12-May-1986. Four system header files are
dated as late as 4-Oct-1986, presumably patched after the compiler was
built.

Later, Kok Chen and Ken Harrenstien's kcc provided another C compiler
that added support for byte datatypes, where a byte could be anything
from 1 to 36 bits. The oldest distribution of kcc in our archives is
labeled "Fifth formal distribution snapshot" and dated 20-Apr-1988.
My info-kcc mailing list archives date from the list beginning, with
an initial post from Ken dated 27-Jul-1986 announcing the availability
of kcc at sri-nic.arpa.

By mid-1987, we had a dozen Sun workstations and an NFS fileserver;
they marked the beginning of our move to a Unix workstation
environment, away from large, expensive, and electricity-gulping
PDP-10 and VAX mainframes. By the summer of 1991, those mainframes
were retired.
I recall speaking to a used-equipment vendor about our VAX 8600, which
cost about US$450K (discounted academic pricing) in 1986, and was told
that its value was depreciating about 20% per month.

Although many of us missed TOPS-20 features, I don't think anyone was
sad to say goodbye to VMS. We always felt that the VMS developers
worked in isolation from the PDP-10 folks, and thus learned nothing
from them.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe@math.utah.edu  -
- 155 S 1400 E RM 233                       beebe@acm.org  beebe@computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
Nelson H. F. Beebe:

  P.S. Jay was the first to get Steve Johnson's Portable C Compiler,
  pcc, to run on the 36-bit PDP-10, and once we had pcc, we began the
  move from writing utilities in Pascal and PDP-10 assembly language to
  doing them in C.

======

How did that C implementation handle ASCII text on the DEC-10? Were it
a from-scratch UNIX port it might make sense to store four eight- or
nine-bit bytes to a word, but if (as I sense it was) it was C running
on TOPS-10 or TOPS-20, it would have had to work comfortably with
DEC's convention of five 7-bit characters (plus a spare bit used by
some programs as a flag).

Norman Wilson
Toronto ON
The 'second' C compiler was a PDP-10 and Honeywell (36-bit) target Alan
Synder did for his MIT Thesis. It was originally targeted to ITS for
the PDP-10, but it ran on Tops-20 also.

My >>memory<< is he used a 7-bit character, a la SAIL, with 5 chars
stored in a word with a bit left over.

You can check it out: https://github.com/PDP-10/Snyder-C-compiler

I believe the C compiler Nelson is talking about is actually Synder's,
which Jay either ported from ITS or WAITS. We had some form of the
Synder compiler on the PDP-10's at CMU in the late 1970s.

It was either Mike Accetta or Fil Aleva who wrote a program to read
PDP-10 backup tapes, which I updated to deal with TOPS-20/TENEX
'dumper' format, which was similar/only different.

On Thu, Jul 15, 2021 at 3:03 PM Norman Wilson <norman@oclsc.org> wrote:
> [...]
> How did that C implementation handle ASCII text on the DEC-10?
> Were it a from-scratch UNIX port it might make sense to store
> four eight- or nine-bit bytes to a word, but if (as I sense it
> was) it was C running on TOPS-10 or TOPS-20, it would have had
> to work comfortably with DEC's convention of five 7-bit characters
> (plus a spare bit used by some programs as a flag).
>
> Norman Wilson
> Toronto ON
g/Synder/s//Snyder/ -- sigh....

On Thu, Jul 15, 2021 at 3:27 PM Clem Cole <clemc@ccc.com> wrote:
> The 'second' C compiler was a PDP-10 and Honeywell (36-bit) target Alan
> Synder did for his MIT Thesis. [...]
The C compiler we had at NMT that Greg Titus wrote/rewrote allowed one
to pick a number of different choices for character size (5, 6, 7 or
8). It defaulted to 7 or 8. I recall that the defaults produced OK
results for student work, but that it was a bit slow for pushing the
envelope without some very careful choices.

But it was good enough for me to write my OS group project running
under 'ZAYEF', a DecSystem-20 emulator running on the DecSystem-20
under TOPS-20... My first exposure to virtual machines... It was a
total trip to have 18-bit pointers and weird interrupt semantics....

I rather preferred working on the VAX 11/750 in C, and later on the
Sun3/50s, though. In part because the debugger was better (or at least
more approachable by my poor undergraduate mind).

Warner

On Thu, Jul 15, 2021 at 1:28 PM Clem Cole <clemc@ccc.com> wrote:
> The 'second' C compiler was a PDP-10 and Honeywell (36-bit) target Alan
> Synder did for his MIT Thesis. [...]
> Message: 7
> Date: Thu, 15 Jul 2021 10:28:04 -0400
> From: "Theodore Y. Ts'o"
> Subject: Re: [TUHS] head/sed/tail (was The Unix shell: a 50-year view)
>
> On Wed, Jul 14, 2021 at 10:38:06PM -0400, Douglas McIlroy wrote:
>> Head might not have been written if tail didn't exist. But, unlike head,
>> tail strayed from the tao of "do one thing well". Tail -r and tail -f are
>> as cringeworthy as cat -v.
>>
>> -f is a strange feature that effectively turns a regular file into a pipe
>> with memory by polling for new data, A clean general alternative
>> might be to provide an open(2) mode that makes reads at the current
>> file end block if some process has the file open for writing.
>
> OTOH, this would mean adding more functionality (read: complexity)
> into the kernel, and there has always been a general desire to avoid
> pushing <stuff> into the kernel when it can be done in userspace. Do
> you really think using a blocking read(2) is somehow more superior
> than using select(2) to wait for new data to be appended to the file?
>
> And even if we did this using a new open(2) mode, are you saying we
> should have a separate executable in /bin which would then be
> identical to cat, except that it uses a different open(2) mode?
Yes, it would put more complexity into the kernel, but maybe it is conceptually elegant.
Consider a classic pipe or a socket and the behaviour of read(2) for those objects. The behaviour of read(2) that Doug proposes for a file would bring it in line with that for a classic pipe or a socket. Hence, maybe it should not be a mode, but the standard behaviour.
I often think that around 1981 the Unix community missed an opportunity to really think through how networking should integrate with the foundations of Unix. It seems to me that at that time there was an opportunity to merge files, pipes and sockets into a coherent, simple framework. If the 8th edition file-system-switch had been introduced already in V6 or V7, maybe this would have happened.
On the other hand, the installed base was probably already too large in 1981 to still make breaking changes to core concepts. V7 may have been the last chance saloon for that.
Paul
>> -f is a strange feature that effectively turns a regular file into a pipe
>> with memory by polling for new data, A clean general alternative
>> might be to provide an open(2) mode that makes reads at the current
>> file end block if some process has the file open for writing.
>
> OTOH, this would mean adding more functionality (read: complexity)
> into the kernel, and there has always been a general desire to avoid
> pushing <stuff> into the kernel when it can be done in userspace. Do
> you really think using a blocking read(2) is somehow more superior
> than using select(2) to wait for new data to be appended to the file?

I'm showing my age. tail -f antedated select(2) and was implemented by
alternately sleeping and reading. select(2) indeed overcomes that
clumsiness.

> I'll note, with amusement, that -r is one option which is *NOT* in the
> GNU version of tail. I see it in FreeBSD, but this looks like a
> BSD'ism.

-r came from Bell Labs. This reinforces the point that the ancients
had their imperfections.

Doug
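[Editorial aside: the sleep-and-read idiom Doug describes can be sketched like this. It is a hypothetical reconstruction, not the historical source; drain copies whatever has appeared since the last call, and follow takes a poll count so the sketch terminates, where real tail -f loops forever.]

```c
#include <stdio.h>
#include <unistd.h>

/* Copy everything currently readable from in to out; return bytes copied.
 * Clearing the stream's EOF flag lets a later call pick up data that
 * another process appends in the meantime. */
long drain(FILE *in, FILE *out)
{
    char buf[8192];
    size_t n;
    long total = 0;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        fwrite(buf, 1, n, out);
        total += n;
    }
    clearerr(in);
    return total;
}

/* The pre-select(2) follow loop: print what is there, sleep, try again. */
void follow(FILE *in, FILE *out, int polls)
{
    while (polls-- > 0) {
        drain(in, out);
        fflush(out);
        sleep(1);
    }
}
```

Doug's proposed open(2) mode would let the read itself block at end of file instead of sleeping and re-reading; the polling loop above is the userspace workaround that mode would make unnecessary.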
On Thu, Jul 15, 2021 at 6:00 PM Douglas McIlroy
<douglas.mcilroy@dartmouth.edu> wrote:

> I'm showing my age. tail -f antedated select(2) and was implemented
> by alternately sleeping and reading. select(2) indeed overcomes that
> clumsiness.

A fd at EOF is considered by select and friends to be ready, as it is
possible to read from it without hanging.

> -r came from Bell Labs. This reinforces the point that the ancients
> had their imperfections.

A Unix zealot, having heard that Master Foo was wise in the Great Way,
came to him for instruction. Master Foo said to him:

“When the Patriarch Thompson invented Unix, he did not understand it.
Then he gained in understanding, and no longer invented it.”

“When the Patriarch McIlroy invented the pipe, he knew that it would
transform software, but did not know that it would transform mind.”

“When the Patriarch Ritchie invented C, he condemned programmers to a
thousand hells of buffer overruns, heap corruption, and stale-pointer
bugs.”

“Truly, the Patriarchs were blind and foolish!”

The zealot was greatly angered by the Master's words. “These
enlightened ones,” he protested, “gave us the Great Way of Unix.
Surely, if we mock them we will lose merit and be reborn as beasts or
MCSEs.”

“Is your code ever completely without stain and flaw?” demanded Master
Foo.

“No,” admitted the zealot, “no man's is.”

“The wisdom of the Patriarchs,” said Master Foo, “was that they *knew*
they were fools.”

Upon hearing this, the zealot was enlightened.
Clem Cole asks:

>> Did you know that before PCC the 'second' C compiler was a PDP-10
>> target Alan Snyder did for his MIT Thesis?
>> [https://github.com/PDP-10/Snyder-C-compiler]

I was unaware of that compiler until sometime in the 21st Century,
long after our PDP-10 was retired on 31-Oct-1990.

The site

	https://github.com/PDP-10/Snyder-C-compiler/tree/master/tops20

supplies a list of some of Snyder's files, but they don't match
anything in our TOPS-20 archives of almost 180,000 files.

I then looked into our 1980s-vintage pcc source tree and compared it
with a snapshot of the current pcc source code taken three weeks ago.
The latter has support for these architectures

	aarch64   hppa   m16c   mips64   pdp11     sparc64
	amd64     i386   m68k   nova     pdp7      superh
	arm       i86    mips   pdp10    powerpc   vax

and the pdp10 directory contains these files:

	CVS  README  code.c  local.c  local2.c  macdefs.h  order.c  table.c

All 5 of those *.c files are present in our TOPS-20 archives. I then
grepped those archives for familiar strings:

	% find . -name '*.[ch]' | sort | \
	    xargs egrep -n -i 'scj|feldman|johnson|snyder|bell|at[&]t|mit|m.i.t.'
	./code.c:8:	 * Based on Steve Johnson's pdp-11 version
	./code2.c:19:	 * Based on Steve Johnson's pdp-11 version
	./cpp.c:1678:	stsym("TOPS20");	/* for compatibility with Snyder */
	./local.c:4:	 * Based on Steve Johnson's pdp-11 version
	./local2.c:4:	 * Based on Steve Johnson's pdp-11 version
	./local2.c:209:	case 'A':	/* emit a label */
	./match.c:2:	 * match.c - based on Steve Johnson's pdp11 version
	./optim.c:318:	 * Turn 'em into regular PCONV's
	./order.c:5:	 * Based on Steve Johnson's pdp-11 version
	./pftn.c:967:	 * fill out previous word, to permit pointer
	./pftn.c:1458:	register commflag = 0;	/* flag for labelled common declarations */
	./pftn2.c:1011:	 * fill out previous word, to permit pointer
	./pftn2.c:1502:	register commflag = 0;	/* flag for labelled common declarations */
	./reader.c:632:	p2->op = NOASG p2->op;	/* this was omitted in 11 & /6 !! */
	./table.c:128:	"	movei	A1,1\nZN",	/* ZN = emit branch */
	./xdefs.c:13:	 * symbol table maintainence

Thus, I'm confident that Jay's work was based on Steve Johnson's
compiler, rather than Alan Snyder's.

Norman Wilson asks:

>> ...
>> How did that C implementation handle ASCII text on the DEC-10?
>> Were it a from-scratch UNIX port it might make sense to store
>> four eight- or nine-bit bytes to a word, but if (as I sense it
>> was) it was C running on TOPS-10 or TOPS-20, it would have had
>> to work comfortably with DEC's convention of five 7-bit characters
>> (plus a spare bit used by some programs as a flag).
>> ...

Our pcc compiler treated char* as a pointer to 7-bit ASCII strings,
stored in the top 35 bits of a word, with the low-order bit normally
zero; a 1-bit there meant that the word contained a 5-digit line
number that some compilers and editors would report. Of course, that
low-order non-character bit meant that memset(), memcpy(), and
memmove() had somewhat dicey semantics, but I no longer recall their
specs.

kcc later gave us access to the PDP-10's 1- to 36-bit byte
instructions.

For text processing, 5 x 7b + 1b bits matched the conventions for all
other programming languages on the PDP-10. When it came time to
implement NFS, and exchange files and data with 32-bit-word machines,
we needed the ability to handle files of 4 x 8b + 4b and 9 x 8b (in
two 36-bit words), and kcc provided that.

The one's-complement 36-bit Univac 1108 machines chose instead to
store text in a 4 x 9b format, because that architecture had
quarter-word load/store instructions, but not the general variable
byte instructions of the PDP-10. Our campus had an 1108 at the
University of Utah Computer Center, but I chose to avoid it, because
it was run in batch mode with punched cards, and never got networking.
By contrast, our TOPS-20, BSD, RSX-11, SunOS, and VMS systems all had
interactive serial-line terminals, and there was no punched card
support at all.

- Nelson H. F. Beebe
On Jul 14, 2021, at 7:38 PM, Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
>
> -r is weird because it enables backwards reading, but only as
> limited by count. Better would be a program, say revfile, that simply
> reads backwards by lines. Then tail p has an elegant implementation:
> revfile p | head | revfile
tail -n can be smarter in that it can simply read the last K bytes
and see if there are n lines. If not, it can read back further.
revfile would have to read the whole file, which could be a lot
more than n lines! tail -n < /dev/tty may never terminate but it
will use a small finite amount of memory.
-- Bakul
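[Editorial aside: the strategy Bakul describes, read the last K bytes and back up further if they don't hold n newlines, can be sketched for a seekable file as follows. This is a hypothetical illustration; it uses byte-at-a-time I/O for brevity and assumes the file ends with a newline.]

```c
#include <stdio.h>

/* Return the offset at which the last n lines of fp begin.
 * Only a growing tail window of the file is examined, never the whole
 * file, unless the lines really do reach back that far. */
long tail_offset(FILE *fp, long n)
{
    long size, k = 512;

    fseek(fp, 0L, SEEK_END);
    size = ftell(fp);
    for (;;) {
        long start = size - k > 0 ? size - k : 0;
        long nl = 0, i;

        fseek(fp, start, SEEK_SET);
        for (i = start; i < size; i++)
            if (getc(fp) == '\n')
                nl++;
        if (nl <= n && start > 0) {   /* window too small: back up further */
            k *= 2;
            continue;
        }
        /* skip the first nl - n newlines; what follows is the tail */
        fseek(fp, start, SEEK_SET);
        while (nl > n)
            if (getc(fp) == '\n')
                nl--;
        return ftell(fp);
    }
}
```

revfile has no such shortcut on a pipe: piped input isn't seekable, so it must buffer everything it reads, which is exactly Bakul's point about tail -n < /dev/tty getting by with a small, finite amount of memory.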
On Thu, Jul 15, 2021 at 3:27 PM Nelson H. F. Beebe <beebe@math.utah.edu> wrote:
> Our campus had an 1108 at the
> University of Utah Computer Center, but I chose to avoid it, because
> it was run in batch mode with punched cards, and never got networking.
It was a terrible beast. One place to submit card decks +
undergraduate procrastination was an unhappy combination. Later
there was a crude form of timesharing grafted on to it, with some very
shrill terminals attached. The details are mercifully vague but I
think you basically had a 'session' and whatever files you created in
that session didn't persist once you'd logged out.
I remember trying to use the C compiler on that DEC-20 but never got
very far with it; I thought the SAIL compiler was more interesting.
Shows what foresight I had...
--
Jim (op.davis@science.utah.edu, iirc)
On Thursday, July 15th, 2021 at 4:18 PM, Jim Davis <jim.epost@gmail.com> wrote:
> I remember trying to use the C compiler on that DEC-20 but never got
>
> very far with it; I thought the SAIL compiler was more interesting.
>
> Shows what foresight I had...
Speaking of SAIL (and I suppose further derailing an already derailed discussion), I've occasionally looked for more information about the environment (typically whenever a book or article briefly mentions SAIL as a place with lots of custom hardware and software) but come up with little. Anyone know of a good description of the SAIL computer systems?
john
Nelson, thanks. Excellent bit of snooping.

I wonder why Jay did his version? Maybe he wanted more modern C
features, since the Snyder compiler would have been based on a very
early C dialect. Steve Johnson, do you have any insight? As I
understand it, Alan started his work by rewriting your Honeywell B
compiler to be a C compiler when the C language was quite young and
many features we take for granted were not yet created.

Clem

On Thu, Jul 15, 2021 at 6:26 PM Nelson H. F. Beebe <beebe@math.utah.edu> wrote:
> I then looked into our 1980s-vintage pcc source tree and compared it
> with a snapshot of the current pcc source code taken three weeks ago.
> [...]
> Thus, I'm confident that Jay's work was based on Steve Johnson's
> compiler, rather than Alan Snyder's. [...]

--
Sent from a handheld; expect more typos than usual
Clem Cole asks:

>> I wonder why Jay did his version? Maybe he wanted more modern
>> C features, since the Snyder compiler would have been based on a very
>> early C dialect.

I never talked to Jay about his motivation for working on pcc for
TOPS-20. I visited Ken Harrenstien at SRI, but regrettably, only once.
I have great admiration for his work on kcc, and later, his PDP-10
emulator (written in C, of course).

I suspect that the main reason was that in the early 1980s, we could
still see years of use of our PDP-10 systems in Computer Science and
the College of Science, yet, being Node 4 of the original 5 nodes of
the Arpanet, we were in frequent contact with Berkeley people who were
active in the BSD effort.

Besides our PDP-10s, we had several PDP-11s and VAXes that could run
Unix, so we wanted our software to run on all of those systems, and C
would be the obvious common programming language.

Also, the PC revolution that started in roughly 1980 made it clear
that computers were going to get a lot cheaper, and a lot more
numerous, so in academia, we would phase out our Big Iron machines and
replace them with Unix workstations.

To help our users begin to make the transition to Unix, I wrote this
Rosetta Stone document:

    Unix for TOPS-20 Users [24-June-1987]
    http://www.math.utah.edu/~beebe/publications/1987/t20unix.pdf

I've not looked at it in years, and I might now cringe at parts, but
most of our users in the College of Science were not computer geeks,
but just scientists and mathematicians trying to do their research and
teaching, so they needed help.
John Floren asks:

>> Anyone know of good description of SAIL computer systems?

SAIL was both an operating system at Stanford, and a programming
language at the same site, but the SAIL compiler ran on multiple
operating systems on the PDP-10.

The first edition of William M. Newman and Robert F. Sproull's
``Principles of Interactive Computer Graphics'', McGraw-Hill, 1973,
ISBN 0-07-046337-9, had an appendix on the SAIL language, but that
book is in my campus office, and this week, I'm working at home, so I
cannot check how much they had to say about SAIL. The bitsavers
archive should have programming language manuals for the SAIL
language.

When TeX and METAFONT were first written in 1977--1978, Don Knuth
programmed them both in SAIL, because it had the needed data
structures, recursion, and a good debugger. However, by 1982, despite
the MAINSAIL effort to port the SAIL language to other platforms, it
became clear that a different implementation language was called for,
and the only candidate that offered portability to multiple CPU
architectures and operating systems at the time was Pascal.

That language has a number of syntactic aggravations, including
fixed-length character strings, so Don used his tangle preprocessor to
rewrite strings as lists of integers, and otherwise stuck to a strict
subset of Pascal.

By the late 1980s, the Pascal code was translated, first manually,
then automatically, to C, and that is the language in which it gets
compiled today. Any changes to the source code, however, are done
strictly in the original Pascal subset.

This year, I have built TeX Live 2021 on AMD64, ARM32, ARM64, Alpha,
M68K, MIPS, PowerPC, RISC-V64, S390x, SPARC, and x86 CPUs, under
numerous operating systems, demonstrating that, thanks to C and Unix,
TeX and METAFONT remain widely portable.
Warner Losh <imp@bsdimp.com> wrote:
> ... But it was good enough for me to write my OS group project running
> under 'ZAYEF' a DecSystem-20 emulator running ...
Fascinating. That's the Hebrew word used for "forgery" or "fakery".
Appropriate for an emulator. Whoever named it both knew Hebrew and
had a sense of humor. :-)
Arnold
Clem Cole wrote:

> The 'second' C compiler was a PDP-10 and Honeywell (36-bit) target
> Alan Snyder did for his MIT Thesis. It was originally targeted to ITS
> for the PDP-10, but it ran on Tops-20 also. My >>memory<< is he used
> a 7-bit Character, ala SAIL, with 5 chars stored in a word with a bit
> leftover.

On ITS it only ever stored characters as full 36-bit words! So
sizeof char == 1 == sizeof int. This is allowed per the C standard.
(Maybe it was updated somewhere else, I dunno.)

KCC does support 6/7/8/9 bits per character. I think 9 is the
default, or else things like memcpy wouldn't work.

> I believe that C compiler Nelson is talking about I believe is
> actually Snyder's that Jay either ported from ITS or WAITS.

I think it's a different compiler based on pcc. But I also think code
was moved between various PDP-10 C compilers and libraries, so it's
sometimes hard to tell one from another.

There was also "Sargasso C", but I don't know much about that one.
Maybe its claim to fame is as the original implementation language for
the VT100 test program vttest still in use today.

There was even an attempt to port GCC, and maybe it's still in use
today somewhere around the Seattle area.
John Floren wrote:

> Speaking of SAIL (and I suppose further derailing an already derailed
> discussion), I've occasionally looked for more information about the
> environment (typically whenever a book or article briefly mentions
> SAIL as a place with lots of custom hardware and software) but come up
> with little. Anyone know of good description of SAIL computer systems?

I'm risking the Wrath of the Moderator here, but I really want to
supply some information. Sorry, this is very far from Unix. But hey,
SUDS was used to design the Stanford SUN Unix workstation.

What do you mean by "SAIL computer systems"? I think upthread SAIL
was referencing the Algol compiler written at the Stanford AI lab.
But SAIL was also an acronym for the entire lab, AND also used as a
name for the main timesharing computer hardware. The hardware was
first a PDP-6, then adding a PDP-10 (KA10), then a KL10. The operating
system was eventually named WAITS, but was also sometimes called SAIL
or just SYSTEM. WAITS was also run on two Foonlies at other sites, and
those could also be called SAIL computer systems in some sense.

I gather you probably mean the AI lab and its computers. The best
place for information is saildart.org, and Bruce Baumgart is working
on a tome called "SAILDART_Prolegomenon". This work in progress is
116 pages.

https://github.com/PDP-10/waits/blob/master/doc/SAILDART_Prolegomenon_2016.pdf
Clem Cole asks:

>> I wonder why Jay did his version? Maybe he wanted more modern C
>> features, since the Snyder compiler would have been based on a very
>> early C dialect.

I would guess that was one strong reason. Snyder was at Bell Labs
during the very time B transformed into C, and brought that version
back to MIT. If you think K&R C looks outdated and crufty, you may
balk at this "primeval C". (I find it quite charming myself.) The
compiler is also quite slow, and the emitted code is not very good.

Nelson H. F. Beebe wrote:

> Besides our PDP-10s, we had several PDP-11s and VAXes that could run
> Unix, so we wanted our software to run on all of those systems

I think the Snyder compiler wouldn't be the best for moving code
around these computers. I can see pcc would be a much better choice.
>> -r is weird because it enables backwards reading, but only as
>> limited by count. Better would be a program, say revfile, that simply
>> reads backwards by lines. Then tail p has an elegant implementation:
>>     revfile p | head | revfile

> tail -n can be smarter in that it can simply read the last K bytes
> and see if there are n lines. If not, it can read back further.
> revfile would have to read the whole file, which could be a lot
> more than n lines! tail -n < /dev/tty may never terminate but it
> will use a small finite amount of memory.

Revfile would work the same way. When head has seen enough and
terminates, revfile will get SIGPIPE and stop. I agree that,
depending on scheduling and buffer management, revfile might read
more than tail -n, but it wouldn't read the whole of a humongous
file.

Doug
Doug McIlroy asks about the Rosetta Stone table relating TOPS-20
commands to Unix commands in my ``Unix for TOPS-20 Users'' document:

>> I was puzzled, though, by the Unix command "leave", which is
>> not in the manuals I edited, nor is it in Linux. What does
>> (or did) it do?

I reread that 1987 document this morning, and found a few small
mistakes, but on the whole, I still agree with what I wrote 34 years
ago, and I'm pleased that almost everything there about Unix still
applies today.

I confess that I had forgotten about the TOPS-20 ALERT command and its
Unix equivalent, leave. As Doug noted, leave is not in Linux systems,
but it still exists in the BSD world, in DragonFlyBSD, FreeBSD,
NetBSD, OpenBSD, and their many derivatives. From a bleeding-edge
FreeBSD 14 system, I find

    % man leave

    LEAVE(1)          FreeBSD General Commands Manual          LEAVE(1)

    NAME
         leave -- remind you when you have to leave

    SYNOPSIS
         leave [[+]hhmm]

    DESCRIPTION
         The leave utility waits until the specified time, then reminds
         you that you have to leave.  You are reminded 5 minutes and 1
         minute before the actual time, at the time, and every minute
         thereafter.  When you log off, leave exits just before it would
         have printed the next message.
    ...
On Fri, Jul 16, 2021 at 4:05 AM Lars Brinkhoff <lars@nocrew.org> wrote:

> Clem Cole wrote:
> > The 'second' C compiler was a PDP-10 and Honeywell (36-bit) target
> > Alan Snyder did for his MIT Thesis. It was originally targeted to ITS
> > for the PDP-10, but it ran on Tops-20 also. My >>memory<< is he used
> > a 7-bit Character, ala SAIL, with 5 chars stored in a word with a bit
> > leftover.
>
> On ITS it only ever stored characters as full 36-bit words! So sizeof
> char == 1 == sizeof int. This is allowed per the C standard. (Maybe it
> was updated somewhere else, I dunno.)

Ah - that makes sense. I never programmed the Honeywell in anything
but Dartmouth BASIC (mostly) and early FORTRAN (very little), and the
whole idea of storage size was somewhat lost on me at that point, as I
was a youngster when I did that. Any idea whether the Honeywell
treated chars as 36-bit entities also? Steve, maybe you remember?

Also, please remember that the standard would not exist for a good 10
years; at this point, the 'standard' was the Ritchie compiler for the
PDP-11. At the time, we wanted the program to run on all of the
UNIX/v6 systems and CMU's version of TOPS-10, and later TOPS-20, as an
interchange format. Thus, I have memories of having to use the
"c =& 0177" idiom in the backup/dumper program in a number of places
[remember tar does not yet exist, and tp/stp was a binary program].

Beyond that, I don't remember much about running C on the 10s. I
think we started trying to move Harvard's stp to TOPS-10, but ran into
an issue [maybe the directory size] and stopped. Since backup (dumper)
was heavily used, we were trying to get IUS and SUS to be able to be
backed up and handled the same way the operators did the backups for
the 10s.

In my own case, I had learned SAIL (and BLISS) on the 10s before C on
the PDP-11, plus this was an early C program for me, maybe my second
or third non-trivial one after I worked with Ted on fsck, so coming
from the PDP-10/SAIL/BLISS et al world, 7-bit chars certainly seemed
normal. I also remember having an early 'ah-ha' moment when the
difference between a 7-bit and 8-bit char started to become important.

Clem
On Jul 16, 2021, at 5:09 AM, Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
>
>>> -r is weird because it enables backwards reading, but only as
>>> limited by count. Better would be a program, say revfile, that simply
>>> reads backwards by lines. Then tail p has an elegant implementation:
>>> revfile p | head | revfile
>
>> tail -n can be smarter in that it can simply read the last K bytes
>> and see if there are n lines. If not, it can read back further.
>> revfile would have to read the whole file, which could be a lot
>> more than n lines! tail -n < /dev/tty may never terminate but it
>> will use a small finite amount of memory.
>
> Revfile would work the same way. When head has seen enough
> and terminates, revfile will get SIGPIPE and stop. I agree that,
> depending on scheduling and buffer management, revfile might
> read more than tail -n, but it wouldn't read the whole of a
> humongous file.
Good point! But when the input is from a device it would have to
buffer up everything since it doesn't know how much head would
want. No big deal of course but I was just pointing out that tail
can "behave better" in all cases!
-- Bakul
On Friday, July 16th, 2021 at 1:27 AM, Lars Brinkhoff <lars@nocrew.org> wrote:
> John Floren wrote:
>
> > Speaking of SAIL (and I suppose further derailing an already derailed
> > discussion), I've occasionally looked for more information about the
> > environment (typically whenever a book or article briefly mentions
> > SAIL as a place with lots of custom hardware and software) but come up
> > with little. Anyone know of good description of SAIL computer systems?
>
> I'm risking the Wrath of the Moderator here, but I really want to supply
> some information. Sorry, this is very far from Unix. But hey, SUDS was
> used to design the Stanford SUN Unix workstation.
>
> What do you mean with "SAIL computer systems"? I think upthread SAIL
> was referencing the Algol compiler written at the Stanford AI lab. But
> SAIL was also an acronym for the entire lab, AND also used as a name for
> the main timesharing computer hardware. The hardware was first a PDP-6,
> then adding a PDP-10 (KA10), then a KL10. The operating system was
> eventually named WAITS, but was also sometimes called SAIL or just
> SYSTEM. WAITS was also run on two Foonlies at other sites, and those
> could also be called SAIL computer systems in some sense.
>
> I gather you probably mean the AI lab and its computers. The best place
> for information is saildart.org, and Bruce Baumgart is working on a tome
> called "SAILDART_Prolegomenon". This work in progress is 116 pages.
>
> https://github.com/PDP-10/waits/blob/master/doc/SAILDART_Prolegomenon_2016.pdf
Yes, WAITS is what I was thinking of. As I mentioned in my previous mail,
it feels like the SAIL timesharing systems get mentioned briefly in
a lot of accounts of historical computing, sometimes with mention that
they had some sort of (relatively) advanced video terminals, but no
in-depth descriptions of the actual hardware/software environment.
I will take a look at saildart.org and the Prolegomenon, thanks!
John
On Fri, Jul 16, 2021, 1:38 AM <arnold@skeeve.com> wrote:

> Warner Losh <imp@bsdimp.com> wrote:
>
> > ... But it was good enough for me to write my OS group project running
> > under 'ZAYEF' a DecSystem-20 emulator running ...
>
> Fascinating. That's the Hebrew word used for "forgery" or "fakery".
> Appropriate for an emulator. Whoever named it both knew Hebrew and
> had a sense of humor. :-)

It was billed as not really a DECsystem 20, but a really good fake.
Clearly a nod to that meaning.

Warner
On Fri, Jul 16, 2021 at 08:17:18AM -0600, Nelson H. F. Beebe wrote:
>
> I confess that I had forgotten about the TOPS-20 ALERT command and its
> Unix equivalent, leave. As Doug noted, leave is not in Linux systems,
> but it still exists in the BSD world, in DragonFlyBSD, FreeBSD,
> NetBSD, OpenBSD, and their many derivatives.
The leave program isn't installed by default, but it is available in
many if not most distributions (I checked Debian, Fedora and Ubuntu).
You just have to install it ("apt-get install leave" on Debian and
Ubuntu). The upstream source, at least for the Debian package, is from
NetBSD.
- Ted
On Fri, Jul 16, 2021 at 7:20 AM Clem Cole <clemc@ccc.com> wrote:

> Ah - that makes sense. I never programmed the Honeywell in anything
> but Dartmouth BASIC (mostly) and early FORTRAN (very little) ...
> Any idea whether the Honeywell treated chars as 36-bit entities also?
> Steve, maybe you remember?

The Honeywell 6000 machines ran GCOS; the system standard was six
six-bit characters per word. The Honeywell 6180 machines ran Multics;
the system standard was four nine-bit characters per word.

For Multics C, sizeof (*) != sizeof (int) and NULL != 0, so a lot of
"portable" C code wasn't.

-- Charles
> For Multics C, ... NULL != 0
I know what you mean, but the formulation is paradoxical,
as the expression NULL==0 is always true in C :)
Doug