The Unix Heritage Society mailing list
 help / color / mirror / Atom feed
From: scj@yaccman.com (scj@yaccman.com)
Subject: [TUHS] yacc for Unix v5
Date: Sat, 27 Sep 2014 21:48:27 -0700	[thread overview]
Message-ID: <d112b61d87fe83c74684b66eb4fd1a4a.squirrel@webmail.yaccman.com> (raw)
In-Reply-To: <CADxT5N68F8Xnkn4LvbmWGB0pebz9gZqzPP2CYYr0dyK-+0NCrw@mail.gmail.com>


> I've been looking into the history of yacc (yet another compiler
> compiler).    ....


The Yacc history was not particularly clean in the early days.  When Unix
came along, I inherited a B compiler from Dennis for the GE (later
Honeywell) mainframe at Bell Labs.  It was a recursive descent compiler
for statements with a bolted on precedence mechanism for expressions.  I
wanted to add some features (notably, some way to do exclusive OR). 
Dennis and I settled on the ^ character and after a lot of sweat I got it
to work on the GE, and Dennis put the change in the Unix B as well.  But
it was very painful.

There was a compiler/compiler in use at the Labs, imported I think by Doug
McIlroy, called TMG.  It was a backtracking recursive descent compiler
that could compile almost anything but was unbearably slow and, if it
encountered a syntax error would backtrack back to the first character of
the input file and say, in effect, "something's wrong in here somewhere".

Al Aho overheard some of the discussions about adding the operator and the
problems with TMG, and suggested that we explore this new LR parsing that
Knuth had invented a few yuears earlier.  Al offered to build a table to
handle the expressions in B, and I took him up on the offer and gave him
an expression grammar with about 35 rules.  Al labored by hand for two
days, and gave me a table.  I typed it in and it didn't work.  We debugged
it, and Al spent another handful of hours and gave me another table.  It
was better, but still didn't work.  After a couple of more go-arounds, I
said "maybe you could show me what you are doing, and I could get the
computer to do it."  And Yacc was born.

The original version of Yacc, in B, was a huge CPU hog, and couldn't do
very large grammars.  I remember it once taking 20 minutes to process a
grammar.  And it brought the Unix system to its knees.  So after the first
version, most of the activity was in improving the algorithm and data
structure, and overlaying temporary arrays so we could do bigger grammars
on the PDP-11.  I estimated that I sped Yacc up by a factor of 10,000 in
its first several years.

In the middle of this work, I spent a 9-month sabbatical at the University
of Waterloo in Canada.  I did some work on Yacc while there, but when I
returned to Bell Labs, I discovered that B had been supplanted by C, and
an MIT co-op student had converted the B compiler to a C compiler, and had
also translated Yacc from B to C.  He had also made some substantial
changes in the code generation algorithm, improving much of the code but
also introducing a bunch of fairly obscure bugs.

One thing about the early Unix releases. Most of the work of preparing the
releases was done by a very small number of people (1-3).  Unix in those
days operated in a state of continuous development and integration.  For
example, Dennis did most of his compiler work between midnight and 4 AM. 
We would come in almost every day to find a new C compiler, with
yesterday's compiler tucked away in a known place.  Most of the time we
never noticed.  But if there was a problem we sent email to Dennis and
used the previous compiler.  When Dennis came in after lunch, the bug was
usually fixed in an hour or two.  The rest of us operated in a similar
fashion.

So the decision of what source code for Yacc or PCC was copied into a
given release was often invisible to the author of the code.  The only
discussion I remember relating to the code I had written was a lunchtime
discussion about whether the lawyers would let us distribute PCC.  This
was resolved by putting it in the release and not saying anything and
nothing happened...





      reply	other threads:[~2014-09-28  4:48 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-27 19:03 Mark Longridge
2014-09-28  4:48 ` scj [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d112b61d87fe83c74684b66eb4fd1a4a.squirrel@webmail.yaccman.com \
    --to=scj@yaccman.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).