The Unix Heritage Society mailing list
* [TUHS] speaking of early C compilers
@ 2014-10-27 10:32 Jason Stevens
  2014-10-27 13:03 ` Brantley Coile
  2014-10-27 17:09 ` scj
  0 siblings, 2 replies; 28+ messages in thread
From: Jason Stevens @ 2014-10-27 10:32 UTC (permalink / raw)


has anyone ever tried to compile any of the old C compilers with a 'modern'
C compiler?

I tried a few from the 80's (Microsoft/Borland) and there is a bunch of
weird stuff where integers suddenly become structs, and structures reference
fields that aren't in that struct:

c01.c
        register int t1;
....
                t1->type = UNSIGN;


And my favorite, which is closing a bunch of file handles for the heck of it,
and redirecting stdin/out/err from within the program instead of just
opening the file and using fread/fwrite:

c00.c
	if (freopen(argv[2], "w", stdout)==NULL ||
	    (sbufp=fopen(argv[3],"w"))==NULL)


How did any of this compile?  How did this stuff run without clobbering
each other?

I don't know why but I started to look at this stuff with some half-hearted
attempt at getting Apout running on Windows.  Naturally there is no fork, so
when a child process dies, the whole thing crashes out.  I guess I could
simulate a fork with threads, keeping all the CPU variables in a
structure for each thread, but that sounds like a lot of work for a limited
audience.

But there really is some weird stuff in v7's c compiler.



* [TUHS] speaking of early C compilers
@ 2014-10-27 13:46 Noel Chiappa
  0 siblings, 0 replies; 28+ messages in thread
From: Noel Chiappa @ 2014-10-27 13:46 UTC (permalink / raw)


    > From: random832 at fastmail.us

    > Did casting not exist back then?

No, not in the early V6 compiler. It was only added as of the Typesetter
compiler. (I think if you look in those 'Recent C Changes' things I sent in
recently {Oct 17}, you'll find mention of it.)

	Noel



* [TUHS] speaking of early C compilers
@ 2014-10-27 13:54 Jason Stevens
  0 siblings, 0 replies; 28+ messages in thread
From: Jason Stevens @ 2014-10-27 13:54 UTC (permalink / raw)


Thanks for clearing up the whole members-out-of-nowhere thing.

I had thought (ha ha) that since I don't have a working fork, I could just
rebuild CC as a native executable, and then just call apout for each stage,
but I never realized how interdependent they all are, at least C0 to C1.

It's crazy to think of how much this stuff cost once upon a time.  

And now we live in the era of javascript pdp-11's
http://pdp11.aiju.de/

-----Original Message-----
From: Brantley Coile
To: Jason Stevens
Cc: tuhs at minnie.tuhs.org
Sent: 10/27/14 9:03 PM
Subject: Re: [TUHS] speaking of early C compilers

Early C allowed you to use the '->' operator with any scalar.  See early
C reference manuals.  This is the reason there is one operator to access
a member of a structure using a pointer and another, '.', to access a
member in a static structure.  The B language had no types, everything
was a word, and dmr evolved C from B.  At first it made sense to use the
'->' operator to mean add a constant to whatever is on the left and use
as an l-value.  
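
A minimal sketch of that reading of '->' in modern C, for illustration only.
The node layout, the UNSIGN value, and the store_type helper are assumptions
made up here, not taken from c01.c. It treats the left operand as a bare
address, adds the member's offset, and stores through the result:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout; the real c01.c structures differ. */
struct tnode {
    int op;
    int type;
};

enum { UNSIGN = 7 };    /* placeholder value, not the compiler's real constant */

/*
 * Roughly what `t1->type = v` meant when t1 was a plain integer:
 * add the constant offset of `type` to t1 and use the result as an l-value.
 */
static void store_type(uintptr_t t1, int v)
{
    *(int *)(t1 + offsetof(struct tnode, type)) = v;
}
```

A modern compiler wants the same intent spelled out with a cast, something
like ((struct tnode *)t1)->type = UNSIGN.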

You will also find that member names share a single name space.   The
simple symbol table had a bit in each entry to delineate members from
normal variables.  You could only use the same member name in two
different structs if the members had the same offsets.  In other words,
it was legal to add a member name to the symbol table that was already
there if the value of the symbol was the same as the existing entry. 
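
The rule above can be sketched as a tiny table lookup. This is only an
illustration under assumed names (declare_member, MAXMEMBERS, and the table
layout are hypothetical), not the early compiler's actual symbol table code:

```c
#include <string.h>

#define MAXMEMBERS 32   /* arbitrary size for the sketch */

struct member {
    const char *name;
    int offset;
};

static struct member members[MAXMEMBERS];
static int nmembers;

/*
 * One name space for all member names: a name may be declared again
 * only if it lands at the same offset as the existing entry.
 * Returns 0 if accepted, -1 on a conflicting offset or a full table.
 */
static int declare_member(const char *name, int offset)
{
    int i;

    for (i = 0; i < nmembers; i++) {
        if (strcmp(members[i].name, name) == 0)
            return members[i].offset == offset ? 0 : -1;
    }
    if (nmembers == MAXMEMBERS)
        return -1;
    members[nmembers].name = name;
    members[nmembers].offset = offset;
    nmembers++;
    return 0;
}
```

With this model, declaring b_flags at offset 0 in two different structs is
accepted, while declaring it at offset 0 in one and offset 4 in another is
rejected.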

Dennis' compilers kept some backward compatibility even after the
language evolved away from them. 

This really shows the value of evolving software instead of thinking one
has all the answers going into development.  If one follows the
development of C one sees the insights learned as they went.  The study
of these early Unix systems has a great deal to teach that will be
valuable in the post Moore's law age.  Much of the world's software will
need a re-evolution.

By the way, did you notice the compiler overwrites itself?   We used to
have to work in tiny spaces.  Four megabytes was four million dollars. 




* [TUHS] speaking of early C compilers
@ 2014-10-27 14:48 Noel Chiappa
  2014-10-27 15:09 ` Ronald Natalie
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Noel Chiappa @ 2014-10-27 14:48 UTC (permalink / raw)


    > From: Jason Stevens

    > has anyone ever tried to compile any of the old C compilers with a
    > 'modern' C compiler?
    > ...
    > How did any of this compile? How did this stuff run without clobbering
    > each-other?

As Ron Natalie said, the early kernels are absolutely littered with all sorts
of stuff that, by today's standards, is totally unacceptable. Using a
variable declared as an int as a pointer, using a variable declared as a
'foo' pointer as a 'bar' pointer, yadda-yadda.

I ran (tripped, actually :-) across several of these while trying to get my
pipe-splicing code to work. (I used Version 6 since i) I am _totally_
familiar with it, and ii) it's what I had running.)

For example, I tried to be all nice and modern and declared my pointer
variables to be the correct type. The problem is that Unix generated unique
ID's to sleep on with code like "sleep(p+1, PPIPE)", and the value generated
by "p+1" depends on what type "p" is declared as - and if you look in pipe.c,
you'll see it's often declared as an int pointer. So when _I_ wrote
"sleep((p + 1), PPIPE)", with "p" declared as a "struct file pointer", I got
the wrong number.
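
A small sketch of why the declared type matters for "p+1". The struct file
layout below is hypothetical (not the V6 one) and the two helper names are
made up; the point is only that pointer arithmetic scales by the size of the
pointed-to type:

```c
#include <stddef.h>

/* Hypothetical layout, just big enough to differ from a single int. */
struct file {
    int f_flag;
    int f_count;
    int f_inode;
};

/* Distance in bytes from base to p+1 when p is declared "int *". */
static size_t step_as_int_ptr(struct file *base)
{
    int *p = (int *)base;
    return (size_t)((char *)(p + 1) - (char *)base);
}

/* Distance in bytes from base to p+1 when p is declared "struct file *". */
static size_t step_as_file_ptr(struct file *base)
{
    struct file *p = base;
    return (size_t)((char *)(p + 1) - (char *)base);
}
```

The first helper steps by sizeof(int), the second by sizeof(struct file), so
the two declarations hand sleep() different addresses for the same "p+1".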

I can only speculate as to why they wrote code like this. I think part of it
is, as Brantley Coile points out, historical artifacts due to the evolution
of C from (originally) BCPL. That may have gotten them used to writing code
in a certain way - I don't know. I also expect the modern mindset (of being
really strict about types, and formal about converting data from one to
another) was still evolving back then - partly because they often didn't
_have_ the tools (e.g. casts) to do it right. Another possibility is that
they were using line editors, and maintaining more extensive source is a pain
with an editor like that. Why write "struct file *p" when you can just write
"*p"? And of course everyone was so space-conscious back then, with those tiny
disks (an RK05 pack is, after all, only 2.5MB - only slightly larger than a
3.5" floppy!) every byte counted.


I have to say, though, that it's really kind of jarring to read this stuff.

I have so much respect for their overall structure (the way the kernel is
broken down into sub-systems, and the sub-systems into routines), how they
managed to get a very powerful (by anyone's standards, even today's) OS into
such a small amount of code... And the _logic_ of any given routine is
usually quite nice, too: clear and efficient. And I love their commenting
style - no cluttering up the code with comments unless there's something that
really needs elucidation, just a short header to say, at a high level, what
the routine does (and sometimes how and why).

So when I see these funky declarations (e.g. "int *p" for something that's
_only_ going to be used to point to a "struct file"), I just cringe - even
though I sort of understand (see above) why it's like that. It's probably the
thing I would most change, if I could.

	Noel



* [TUHS] speaking of early C compilers
@ 2014-10-27 15:48 Noel Chiappa
  2014-10-27 16:25 ` Dave Horsfall
  0 siblings, 1 reply; 28+ messages in thread
From: Noel Chiappa @ 2014-10-27 15:48 UTC (permalink / raw)


    > From: Dave Horsfall <dave at horsfall.org>

    > What, as opposed to spelling creat() with an "e"?

Actually, that one never bothered me at all!

I tended to be more annoyed by _extra_ characters; e.g. the fact that 'change
directory' was (in standard V6) "chdir" (as opposed to just plain "cd") I
found far more irritating! Why make that one _five_ characters, when most
common commands are two?! (cc, ld, mv, rm, cp, etc, etc, etc...)

	Noel



* [TUHS] speaking of early C compilers
@ 2014-10-27 16:50 Norman Wilson
  0 siblings, 0 replies; 28+ messages in thread
From: Norman Wilson @ 2014-10-27 16:50 UTC (permalink / raw)


Noel Chiappa:

> I tended to be more annoyed by _extra_ characters; e.g. the fact that 
> 'change directory' was (in standard V6) "chdir" (as opposed to just 
> plain "cd") I found far more irritating! Why make that one _five_ 
> characters, when most common commands are two?! (cc, ld, mv, rm, cp, 
> etc, etc, etc...)

In the earliest systems, e.g. that on the PDP-7, the change-directory
command was just `ch'.

Two vague memories about the change:

-- Dennis, in one of his retrospective papers (possibly that
in the 1984 all-UNIX BLTJ issue, but I don't have it handy at
the moment) remarked about ch becoming chdir but couldn't
remember why that happened.

-- Someone else, possibly Tom Duff, once suggested to me that
in the earliest systems, the working directory was the only
thing that could be changed: no chown, no chmod.  Hence just
ch for chdir.  I don't know offhand whether that's true, but
it makes a good story.

Personally I'd rather have to type chdir and leav off th
trailing e on many other words than creat if it let me off
dealing with pieces of key system infrastructure that insist
on printing colour-change ANSI escape sequences (with, so far
as I can tell, no way to disable them) and give important files
names beginning with - so that grep pattern * produces an error.
But that happens in Linux, not UNIX.

Norman Wilson
Toronto ON



* [TUHS] speaking of early C compilers
@ 2014-10-27 18:16 Nelson H. F. Beebe
  0 siblings, 0 replies; 28+ messages in thread
From: Nelson H. F. Beebe @ 2014-10-27 18:16 UTC (permalink / raw)


Norman Wilson writes today:

>> ...
>> -- Dennis, in one of his retrospective papers (possibly that
>> in the 1984 all-UNIX BLTJ issue, but I don't have it handy at
>> the moment) remarked about ch becoming chdir but couldn't
>> remember why that happened.
>> ...

The reference below contains on page 5 this comment by Dennis:

>> (Incidentally, chdir was spelled ch; why this was expanded when we
>>  went to the PDP-11 I don't remember)

@String{pub-PH                  = "Pren{\-}tice-Hall"}
@String{pub-PH:adr              = "Upper Saddle River, NJ 07458, USA"}

@Book{ATT:AUS86-2,
  author =       "AT{\&T}",
  key =          "ATT",
  title =        "{AT}{\&T UNIX} System Readings and Applications",
  volume =       "II",
  publisher =    pub-PH,
  address =      pub-PH:adr,
  pages =        "xii + 324",
  year =         "1986",
  ISBN =         "0-13-939845-7",
  ISBN-13 =      "978-0-13-939845-2",
  LCCN =         "QA76.76.O63 U553 1986",
  bibdate =      "Sat Oct 28 08:25:58 2000",
  bibsource =    "http://www.math.utah.edu/pub/tex/bib/master.bib",
  acknowledgement = ack-nhfb,
  xxnote =       "NB: special form AT{\&T} required to get correct
                 alpha-style labels.",
}

That chapter of that book comes from this paper:

@String{j-ATT-BELL-LAB-TECH-J   = "AT\&T Bell Laboratories Technical Journal"}

@Article{Ritchie:1984:EUT,
  author =       "Dennis M. Ritchie",
  title =        "Evolution of the {UNIX} time-sharing system",
  journal =      j-ATT-BELL-LAB-TECH-J,
  volume =       "63",
  number =       "8 part 2",
  pages =        "1577--1593",
  month =        oct,
  year =         "1984",
  CODEN =        "ABLJER",
  DOI =          "http://dx.doi.org/10.1002/j.1538-7305.1984.tb00054.x",
  ISSN =         "0748-612X",
  ISSN-L =       "0748-612X",
  bibdate =      "Fri Nov 12 09:17:39 2010",
  bibsource =    "Compendex database;
                 http://www.math.utah.edu/pub/tex/bib/bstj1980.bib",
  abstract =     "This paper presents a brief history of the early
                 development of the UNIX operating system. It
                 concentrates on the evolution of the file system, the
                 process-control mechanism, and the idea of pipelined
                 commands. Some attention is paid to social conditions
                 during the development of the system.",
  acknowledgement = ack-nhfb,
  fjournal =     "AT\&T Bell Laboratories Technical Journal",
  topic =        "computer systems programming",
}

Incidentally, on modern systems with tcsh and csh, I use both chdir
and cd; the long form does the bare directory change, whereas the
short form is an alias that also updates the shell prompt string and
the terminal window title.

I also have a personal alias "xd" (eXchange Directory) that is short
for the tcsh & bash sequence "pushd !*; cd .", allowing easy jumping
back and forth between pairs of directories, with updating of prompts
and window titles.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe at math.utah.edu  -
- 155 S 1400 E RM 233                       beebe at acm.org  beebe at computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------



* [TUHS] speaking of early C compilers
@ 2014-10-28  1:55 Jason Stevens
  2014-10-28 12:52 ` Ronald Natalie
  0 siblings, 1 reply; 28+ messages in thread
From: Jason Stevens @ 2014-10-28  1:55 UTC (permalink / raw)


Wow BSD on a supercomputer!  That sounds pretty cool! 

http://web.ornl.gov/info/reports/1986/3445600639931.pdf

From here it mentions it could scale to 16 process execution modules
(CPU's?) 

while here http://ftp.arl.mil/mike/comphist/hist.html it mentions 4 PEMs
which each could run 8 processes.

It still looks like an amazing machine.

-----Original Message-----
From: Ronald Natalie
To: Noel Chiappa
Cc: tuhs at minnie.tuhs.org
Sent: 10/27/14 11:09 PM
Subject: Re: [TUHS] speaking of early C compilers

We thought the kernels got cleaned up a lot by the time we got to the
BSD releases.    We were wrong.
When porting our variant of 4BSD to the Denelcor HEP supercomputer
we found a rather amusing failure.

The HEP was a 64 bit word machine but it had partial words of 16 and 32
bits.   The way it handled these was to encode the word size in the
lower bits of the address (since the bottom three weren't used in word
addressing anyhow).    If the bottom three were zero, then it was the
full word.  If it was 2 or 6, it was the left or right half word, and
1, 3, 5, and 7 yielded the four quarter words.  (Byte operations used
different instructions so they directly addressed the memory).

Now Mike Muuss who did the C compiler port made sure that all the casts
did the right thing.   If you cast "int *" to "short *" it would tweak
the low order bits to make things work.    However the BSD kernel in
several places did what I call conversion by union:  essentially this:

union carbide {
     char*  c;
     short* s;
     int*   i;
} u;

u.s  = ...some valid short* ...
int* ip = u.i;

Note the compiler has no way of telling that you are storing and
retrieving through different union members and hence the low order bits
ended up reflecting the wrong word size and this led to some flamboyant
failures.     I then spent a week running around the kernel making these
void* and fixing up all the access points to properly cast the accesses
to it.
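
A rough model of the failure, assuming only what is described above: the low
three address bits select the access width, a compiler-generated cast rewrites
those bits, and a union read hands the bits back untouched. The function names
are hypothetical, addresses are modelled as plain 64-bit integers, and the
cast target is assumed to be a full-word pointer; this is not HEP or compiler
code:

```c
#include <stdint.h>

/*
 * Access width in bits selected by the low three address bits:
 * 0 -> full word, 2 or 6 -> half word, 1/3/5/7 -> quarter word.
 * Tag 4 is not covered by the description, so the sketch returns 0 for it.
 */
static int access_width(uint64_t addr)
{
    switch (addr & 7) {
    case 0:
        return 64;
    case 2: case 6:
        return 32;
    case 1: case 3: case 5: case 7:
        return 16;
    default:
        return 0;
    }
}

/* Model of a cast to a full-word pointer: the tag bits are cleared. */
static uint64_t cast_to_word(uint64_t addr)
{
    return addr & ~(uint64_t)7;
}

/* Model of "conversion by union": the same bits come back unchanged. */
static uint64_t pun_via_union(uint64_t addr)
{
    return addr;
}
```

In this model a quarter-word address pushed through cast_to_word is accessed
as a full word, while the same address read back through the union is still
tagged as a quarter word, which is the mismatch described above.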

The other amusing thing was what to call the data types.     Since this
was a word machine, there was a real predisposition to call the 64 bit
sized thing "int" but that meant we needed another typename for the 32
bit thing (since we decided to leave short for the 16 bit integer).
I lobbied hard for "medium" but we ended up using int32.   Of course,
this is long before the C standards ended up reserving the _ prefix for
the implementation.

The aforementioned fact that all the structure members shared the same
namespace in the original implementation is why the practice of using
letter prefixes on them (like b_flags and b_next etc... rather than just
flags or next) persisted long after the C compiler got this issue
resolved.

Frankly, I really wish they'd have fixed arrays in C to be properly
functioning types at the same time they fixed structs to be proper types
as well.     Around the time of the typesetter or V7 releases we could
assign and return structs but arrays still had the silly "devolve into
pointers" behavior that persists unto this day and still causes problems
among the newbies.
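
A short illustration of that asymmetry in present-day C, with made-up names
(struct triple, make_triple, sum). A struct, even one that contains an array,
can be returned and assigned by value, while a bare array passed to a function
arrives as a pointer with its length lost:

```c
#include <stddef.h>

struct triple {
    int v[3];
};

/* The whole struct, array included, is returned by value. */
static struct triple make_triple(int a, int b, int c)
{
    struct triple t;

    t.v[0] = a;
    t.v[1] = b;
    t.v[2] = c;
    return t;
}

/* An array argument decays to a pointer, so the length travels separately. */
static int sum(const int *p, size_t n)
{
    int total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

Assigning one struct triple to another copies all three elements, whereas
writing a = b for two int[3] arrays does not compile.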




