The Unix Heritage Society mailing list
* [TUHS] PDP-10 UNIX?
@ 2017-09-18 17:24 Nelson H. F. Beebe
  2017-09-18 18:10 ` Lars Brinkhoff
  0 siblings, 1 reply; 26+ messages in thread
From: Nelson H. F. Beebe @ 2017-09-18 17:24 UTC (permalink / raw)


I worked on, and co-managed, TOPS-20 on DECsystem 20/40 and 20/60
systems with the PDP-10 KL-10 CPU from September 1978 to 31 October
1990, when our 20/60 was retired.  (A second 20/60 on our campus in
the Department of Computer Science had been retired a year or two
earlier.)

There were two C compilers on the system, Ken Harrenstien's kcc, and
Steve Johnson's pcc, the latter ported to TOPS-20 by my late friend
Jay Lepreau (1952--2008).

pcc was a straightforward port intended to make C programming, and
porting of C software, fairly easy on the PDP-10, but without
addressing many of the architectural features of that CPU.

kcc was written by Ken Harrenstien from scratch, and designed
explicitly for support of the PDP-10 architecture.  In particular, it
included an O/S system call interface (the JSYS instruction), and
support for pointers to all byte sizes from 1 to 36.  Normal
addressing on the PDP-10 is by word, with an 18-bit address space.
Thus, two 18-bit fields fit in a 36-bit word, ideally suited for
Lisp's CAR and CDR (contents of address/decrement register, used for
first and rest addressing of lists).  However, PDP-10 byte pointers
encode the byte size and position in the left half of a word, next to
the 18-bit word address in the right half.
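
The two-halfword layout described above can be sketched in C on a
modern host.  This is only an illustrative model that keeps a 36-bit
word in the low bits of a uint64_t; the names word36, make_word,
left_half, and right_half are invented here and are not taken from kcc
or pcc.

```c
#include <stdint.h>

/* A 36-bit PDP-10 word modeled in the low 36 bits of a uint64_t.
   Hypothetical host-side model, not code from kcc or pcc. */
typedef uint64_t word36;

#define HALF_MASK UINT64_C(0777777)        /* 18 one-bits */
#define WORD_MASK UINT64_C(0777777777777)  /* 36 one-bits */

/* Combine a left and a right 18-bit halfword into one 36-bit word. */
static word36 make_word(uint32_t left, uint32_t right)
{
    return (((word36)left & HALF_MASK) << 18) | ((word36)right & HALF_MASK);
}

/* Extract the left halfword (the high-order 18 bits). */
static uint32_t left_half(word36 w)
{
    return (uint32_t)((w >> 18) & HALF_MASK);
}

/* Extract the right halfword (the low-order 18 bits). */
static uint32_t right_half(word36 w)
{
    return (uint32_t)(w & HALF_MASK);
}
```

A Lisp cell in this model is simply make_word(a, b), with the two list
pointers recovered by left_half() and right_half().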

Pointer words could contain an indirect bit, which caused the CPU to
automatically load a memory word at that address, and repeat if that
word was found to be an indirect pointer.  That processing was handled
by the LOAD instructions, so it worked for all programming languages.
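
The chain-following behaviour of the indirect bit can be modeled
roughly as below.  This is a simplified sketch: it ignores index
registers, and the max_hops guard and the function name
resolve_address are assumptions added for the host-side model, not
features of the hardware or of kcc.

```c
#include <stdint.h>
#include <stddef.h>

typedef uint64_t word36;  /* 36-bit word in the low bits of a uint64_t */

#define ADDR_MASK    UINT64_C(0777777)      /* right halfword: 18-bit address */
#define INDIRECT_BIT (UINT64_C(1) << 22)    /* bit 13 when bit 0 is the leftmost */

/* Follow indirect words until one without the indirect bit is found.
   Returns the final 18-bit address, or -1 if the chain leaves the
   modeled memory or exceeds max_hops (a guard added for this sketch). */
static long resolve_address(const word36 *mem, size_t mem_words,
                            word36 ptr, int max_hops)
{
    for (int hop = 0; hop <= max_hops; hop++) {
        uint32_t addr = (uint32_t)(ptr & ADDR_MASK);
        if ((ptr & INDIRECT_BIT) == 0)
            return (long)addr;      /* direct address: done */
        if (addr >= mem_words)
            return -1;              /* outside the modeled memory */
        ptr = mem[addr];            /* fetch the next word in the chain */
    }
    return -1;                      /* chain too long, possibly a loop */
}
```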

Characters on the ten-or-so different PDP-10 operating systems were
normally 7-bit ASCII, stored left to right in a word, with the
right-most low-order bit set to 0, UNLESS the word was intended to be
a 5-decimal-digit line number, in which case, that bit was set to 1.
Compilers and some other tools ignored line-number words.
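
As a rough illustration of that layout, the following C sketch packs
five 7-bit characters left to right into a 36-bit word and tests the
low-order flag bit.  The helper names are hypothetical, and padding
short strings with NUL is an assumption of the sketch.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

typedef uint64_t word36;  /* 36-bit word in the low bits of a uint64_t */

/* Pack up to five 7-bit characters, left to right, into the top 35 bits.
   Shorter strings are padded with NUL (an assumption of this sketch).
   The rightmost bit is left 0, marking an ordinary text word. */
static word36 pack_ascii7(const char *s)
{
    size_t n = strlen(s);
    word36 w = 0;
    for (size_t i = 0; i < 5; i++) {
        word36 c = (i < n) ? (word36)((unsigned char)s[i] & 0177) : 0;
        w = (w << 7) | c;
    }
    return w << 1;
}

/* Recover the five characters; out must have room for 6 chars. */
static void unpack_ascii7(word36 w, char out[6])
{
    for (int i = 0; i < 5; i++)
        out[i] = (char)((w >> (29 - 7 * i)) & 0177);
    out[5] = '\0';
}

/* A word whose rightmost bit is 1 holds a line number, not text. */
static int is_line_number_word(word36 w)
{
    return (int)(w & 1);
}
```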

As the need to communicate with other systems with 8-, 16-, and 32-bit
words grew, we had to accommodate files with 8-bit characters, which
could be stored as four left-adjusted characters with 4 rightmost zero
bits, or handled as 9 consecutive 8-bit characters in two adjacent
36-bit words.  That was convenient for binary file transfer, but I
don't recall ever seeing 9-bit characters used for text files.
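
The two 8-bit layouts can be sketched as follows.  The function names
are invented for illustration, and the assumption that the nine-byte
form is one continuous left-to-right bit stream across the two words
is mine, not something stated in the kcc documentation.

```c
#include <stdint.h>

typedef uint64_t word36;  /* 36-bit word in the low bits of a uint64_t */

/* Four 8-bit bytes left-adjusted in one word; the 4 rightmost bits are 0. */
static word36 pack4x8(const uint8_t b[4])
{
    word36 w = 0;
    for (int i = 0; i < 4; i++)
        w = (w << 8) | b[i];
    return w << 4;
}

/* Nine 8-bit bytes spread over two adjacent words (72 bits, none wasted).
   Byte 4 straddles the boundary: its high 4 bits end the first word and
   its low 4 bits start the second (assumed bit-stream ordering). */
static void pack9x8(const uint8_t b[9], word36 out[2])
{
    word36 first = 0, second = 0;
    for (int i = 0; i < 4; i++)
        first = (first << 8) | b[i];
    first = (first << 4) | (word36)(b[4] >> 4);
    second = (word36)(b[4] & 0x0F);
    for (int i = 5; i < 9; i++)
        second = (second << 8) | b[i];
    out[0] = first;
    out[1] = second;
}
```

The first form wastes 4 of every 36 bits; the second wastes none, but
bytes no longer start at fixed positions within a word.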

By contrast, on the contemporary 36-bit Univac 11xx systems running
EXEC-8, the O/S was extended from 6 six-bit Fieldata characters per
word to 9-bit extended ASCII (and ISO 8859-n Latin-n) characters: the
reason was that the Univac CPU had quarterword access instructions,
but not arbitrary byte-size instructions like the PDP-10.  I don't
think that there ever was a C compiler on those Univac systems.

On the PDP-10, memory locations 0--15 are mapped to machine registers
of those numbers: short loops could be copied into those locations and
would then run about 3x faster, if there weren't too many memory
references.  Register 0 was not hardwired to a zero value, so
dereferencing a NULL pointer could return any address, and could even
be legitimate in some code.  The kcc documentation reports:

>> ...
>> 	The "NULL" pointer is represented internally as a zero word,
>> i.e. the same representation as the integer value 0, regardless of
>> the type of the pointer.  The PDP-10 address 0 (AC 0) is zeroed and
>> never used by KCC, in order to help catch any use of NULL pointers.
>> ...

In kcc, the C fopen() call second argument was extended with extra
flag letters:

>> ...
>>          The user can override either the bytesize or the conversion
>>  by adding explicit specification characters, which should come after
>>  any regular specification characters:
>>          "C"     Force LF-conversion.
>>          "C-"    Force NO LF-conversion.
>>          "7"     Force 7-bit bytesize.
>>          "8"     Force 8-bit bytesize.
>>          "9"     Force 9-bit bytesize.
>>          "T"     Open for thawed access (TOPS-10/TENEX only)
>> 
>>          These are KCC-specific however, and are not portable to other
>>  systems.  Note that the actual LF conversion is done by the USYS (Unix
>>  simulation) level calls (read() and write()) rather than STDIO.
>> ...

As the PDP-10 evolved, addressing was extended from 18 bits to 22
bits, and kcc had support for such extended addresses.

Inside the kcc compiler,

>> ...
>> 	Chars are aligned on 9-bit byte boundaries, shorts on halfword
>> boundaries, and all other data types on word boundaries (with the
>> exception of bitfields and the _KCCtype_charN types).  Converting any
>> pointer to a (char *) and back is always possible, as a char is the
>> smallest possible object.  If the original object was larger than a
>> char, the char pointer will point to the first byte of the object; this
>> is the leftmost 9-bit byte in a word (if word-aligned) or in the halfword
>> (if a short).
>> ...

That design choice meant that the common assumption that a word holds
4 characters, familiar from 32-bit machines, remained true on the
PDP-10.  The _KCCtype_charN
types could have N from 1 to 36.  The case N = 6 was special: it
handled the SIXBIT character representation used by compilers,
linkers, and the O/S to encode external function names mapped to a
6-bit character set unique to the PDP-10, allowing 6-character unique
names for symbols.
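
For readers unfamiliar with SIXBIT, here is a small C sketch of how a
symbol name fits into one 36-bit word.  It assumes the usual DEC
SIXBIT mapping (ASCII codes 040 through 0137, minus 040, with lower
case folded to upper case); the function name sixbit_encode and the
choice to map out-of-range characters to a space are assumptions of
the sketch, not kcc behaviour.

```c
#include <stdint.h>
#include <ctype.h>

typedef uint64_t word36;  /* 36-bit word in the low bits of a uint64_t */

/* Encode up to six characters of name as SIXBIT, left-justified and
   padded with spaces (code 0).  Characters beyond the sixth are dropped,
   which is why only the first six characters of a name are significant. */
static word36 sixbit_encode(const char *name)
{
    word36 w = 0;
    int i = 0;
    for (; i < 6 && name[i] != '\0'; i++) {
        int c = toupper((unsigned char)name[i]);
        if (c < 040 || c > 0137)
            c = ' ';                /* out of range: use space (assumption) */
        w = (w << 6) | (word36)(c - 040);
    }
    for (; i < 6; i++)
        w <<= 6;                    /* pad the remaining positions with spaces */
    return w;
}
```

With this mapping, "printf" and "PRINTFX" encode to the same word,
which shows why longer external names could collide.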

I didn't readily find documentation of kcc features on the Web, so for
those who would like to learn more about support of C and Unix code on
the PDP-10, I created this FTP/Web site today:

	http://www.math.utah.edu/pub/kcc
	 ftp://ftp.math.utah.edu/pub/kcc

It supplies several *.doc files; the user.doc file is likely the one
of most interest for this discussion.

Getting C onto TOPS-20 was hugely important for us, because it gave us
access to many Unix tools (I was the first to port Brian Kernighan's
awk language to the PDP-10, and also to the VAX VMS native C
compiler), and eased the transition from TOPS-20 to Unix that began
for our users about 1984, and continued until our complete move in
summer 1991, when we retired our last VAX VMS systems.

Finally, here is a pointer to a document that I wrote about that
transition:

	http://www.math.utah.edu/~beebe/reports/1987/t20unix.pdf

P.S. I'll be happy to entertain further questions about these two C
compilers on the PDP-10, offline if you prefer, or on this list.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe at math.utah.edu  -
- 155 S 1400 E RM 233                       beebe at acm.org  beebe at computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------


* [TUHS] PDP-10 UNIX?
@ 2017-09-20 18:40 Nelson H. F. Beebe
  2017-09-24 18:29 ` arnold
  0 siblings, 1 reply; 26+ messages in thread
From: Nelson H. F. Beebe @ 2017-09-20 18:40 UTC (permalink / raw)


Warner Losh <imp at bsdimp.com> kindly corrected my statement that the
kcc compiler on the PDP-10 was done by Ken Harrenstien, pointing out
that it was actually begun by Kok Chen (whence the name kcc).

I've just dug into the source tree for the compiler, and found this
leading paragraph in kcc5.vmshelp (filesystem date of 3-Sep-1988) that
provides proper credits:

>> ...
>>          KCC is a compiler for the C language on the PDP-10.  It was
>>  originally begun by Kok Chen of Stanford University around 1981 (hence
>>  the name "KCC"), improved by a number of people at Stanford and Columbia
>>  (primarily David Eppstein, KRONJ), and then adopted by Ken Harrenstien
>>  and Ian Macky of SRI International as the starting point for what is now
>>  a complete and supported implementation of C.  KCC implements C as
>>  described by the following references:
>> 
>>          H&S: Harbison and Steele, "C: A Reference Manual",
>>           HS1: (1st edition) Prentice-Hall, 1984, ISBN 0-13-110008-4
>>           HS2: (2nd edition) Prentice-Hall, 1987, ISBN 0-13-109802-0
>>          K&R: Kernighan and Ritchie, "The C Programming Language",
>>                  Prentice-Hall, 1978, ISBN 0-13-110163-3
>> 
>>          Currently KCC is only supported for TOPS-20, although there is
>>  no reason it cannot be used for other PDP-10 systems or processors.
>>  The remaining discussion assumes you are on a TOPS-20 system.
>> ...

I met Ken only once, in his office at SRI, but back in our TOPS-20
days, we had several e-mail contacts.

----------------------------------------

P.S. In these days of multi-million line compilers, it is interesting
to inspect the kcc source code line count:

	% find . -name '*.[ch]' | xargs cat | wc -l
	80298

A similar check on a 10-Oct-2016 snapshot of the actively-maintained
pcc compiler for Unix systems found 155896 lines.



* [TUHS] PDP-10 UNIX?
@ 2017-09-18 21:14 Noel Chiappa
  2017-09-19  6:46 ` Lars Brinkhoff
  0 siblings, 1 reply; 26+ messages in thread
From: Noel Chiappa @ 2017-09-18 21:14 UTC (permalink / raw)


    > That makes sense if it's '73.  That would be the Ritchie front end and
    > v5/v6 syntax as I remember

Here:

  http://publications.csail.mit.edu/lcs/specpub.php?id=717

is the TR describing it (well, this report covers one by him for the Honeywell
6000 series, but IIRC it's the same compiler). I didn't read the whole thing
slowly, but glancing quickly at it, it sounds like it's possibly a 'from
scratch' thing?

	 Noel


[parent not found: <mailman.1031.1505666037.3779.tuhs@minnie.tuhs.org>]
* [TUHS] PDP-10 UNIX?
@ 2017-09-17 14:28 Arthur Krewat
  2017-09-17 14:31 ` Warner Losh
  0 siblings, 1 reply; 26+ messages in thread
From: Arthur Krewat @ 2017-09-17 14:28 UTC (permalink / raw)


Was there ever a UNIX or even the thought of porting one to a PDP-10?

36-bit machine, 18-bit addresses (more on KL10 and KS10), and:

*0 would return register 0 instead of a SIGSEGV ;)

8-bit bytes would have been a wasteful exercise, but you never know. 
(losing 4 bits of every 36-bit word)

thanks!
art k.




end of thread, other threads:[~2017-09-24 18:29 UTC | newest]

Thread overview: 26+ messages
2017-09-18 17:24 [TUHS] PDP-10 UNIX? Nelson H. F. Beebe
2017-09-18 18:10 ` Lars Brinkhoff
2017-09-20  4:55   ` Warner Losh
  -- strict thread matches above, loose matches on Subject: below --
2017-09-20 18:40 Nelson H. F. Beebe
2017-09-24 18:29 ` arnold
2017-09-18 21:14 Noel Chiappa
2017-09-19  6:46 ` Lars Brinkhoff
     [not found] <mailman.1031.1505666037.3779.tuhs@minnie.tuhs.org>
2017-09-18  2:34 ` Johnny Billquist
2017-09-18 15:30   ` Arthur Krewat
2017-09-18 16:46     ` Steve Johnson
2017-09-18 17:25       ` Clem Cole
2017-09-18 17:40         ` Larry McVoy
2017-09-18 20:08         ` Chris Torek
2017-09-17 14:28 Arthur Krewat
2017-09-17 14:31 ` Warner Losh
2017-09-17 15:01   ` Lars Brinkhoff
2017-09-17 15:22     ` Arthur Krewat
2017-09-18 13:50       ` Clem Cole
2017-09-18 16:42         ` Arthur Krewat
2017-09-18 19:58         ` Lars Brinkhoff
2017-09-18 20:10           ` Clem Cole
2017-09-18 20:22             ` Lars Brinkhoff
2017-09-18 20:43               ` Clem cole
2017-09-19  9:06           ` Mutiny 
2017-09-19  9:56             ` Lars Brinkhoff
2017-09-17 16:33     ` Warner Losh
