The Unix Heritage Society mailing list
 help / color / mirror / Atom feed
From: rochkind@basepath.com (Marc Rochkind)
Subject: [TUHS] About SCCS (and PWB)
Date: Mon, 9 Nov 2015 16:49:32 -0700	[thread overview]
Message-ID: <CAOkr1zVwzuHNjarRxSuX9b8uuPwCqecf7rKyd5TOO4sgroEQUA@mail.gmail.com> (raw)
In-Reply-To: <20151109230509.GC24474@mcvoy.com>

Since you asked, here's the true story of how I came up with the delta
encoding, a story never before told.

I was living in a garden apartment in Sayreville, NJ, and at night would
walk my girlfriend's dog along a hillside just outside our front door. It
was usually cold, I didn't like the dog (still don't like dogs), and hated
dodging the piles of dog shit while he tugged on the leash. So, as a coping
mechanism, I used to let my mind wander, and one evening it was wandering
and wondering about a problem I was struggling with, which was how to store
the source and the deltas all in the same file. (It was a "data set," on
the IBM OS/360 system we were using--we weren't on UNIX yet.)

Anyway, no doubt simultaneously with this unpleasant animal taking a shit,
I came up with idea of surrounding pieces of text with markers. (The
algorithm itself is documented in my original 1975 paper, which you can
read about here: http://basepath.com/aup/talks/SCCS-Slideshow.pdf.)

(Wouldn't this be an even better story if I said that the little piles of
dog poop on the hillside looked like markers in the soft glow of a full
moon? It's not true, but perhaps I'll tell it that way if the occasion
arises in the future.)

When I got inside, I started to sketch out how the markers might work, and
came up with interesting observation that insertion start/end markers
obviously nested, but deletion start/end markers did not nest with insert
start/end markers. This is obvious if you think about it the right way:
When you delete, the text you're deleting could have been added at various
times, but when you insert, the inserted text is always added at the same
time.

I didn't have replacement markers; insert and delete were enough, I thought.

I kept fooling around with the idea until I had an algorithm that I thought
would work to retrieve any version with a single pass. (It's in the paper,
referenced above.)

To prove the algorithm to be correct, I enumerated all possible cases of
insertions mixed in with deletions. I don't recall how many cases I had,
but I think it was around 20 or 30. Then I painstakingly went though every
case, making sure the algorithm produced the right answer. This was a rare
example of me doing actual work.

Coding it up, as I remember, was very easy, as the scheme is pretty simple.
I'm sure I had it running in SNOBOL4 in a day or two. Redesigning SCCS in C
for UNIX came maybe a year or so later, but the algorithm remained the same.

Larry very kindly says: "SCCS has interleaved deltas. It's a brilliant
design that has far far better performance than anything else out there."

Maybe it was brilliant, but I can tell you that I was just trying to pass
the time while that stupid dog did his business.

--Marc

On Mon, Nov 9, 2015 at 4:05 PM, Larry McVoy <lm at mcvoy.com> wrote:

> On Mon, Nov 09, 2015 at 04:02:44PM -0700, Marc Rochkind wrote:
> > I just got on this list today, and I see that Larry McVoy asks:
> >
> > "I wish Marc was on this list, be fun to chat."
> >
> > I'd be happy to chime in on SCCS or early PWB questions, to the extent I
> > remember anything.
>
> Awesome!  How about a start of how you came up with the SCCS design,
> in particular the interleaved delta format (we internally call it
> "the weave")?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://minnie.tuhs.org/pipermail/tuhs/attachments/20151109/48873390/attachment.html>


  reply	other threads:[~2015-11-09 23:49 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-09 23:02 Marc Rochkind
2015-11-09 23:05 ` Larry McVoy
2015-11-09 23:49   ` Marc Rochkind [this message]
2015-11-10  1:09     ` Clem Cole
2015-11-10  1:27       ` Marc Rochkind

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOkr1zVwzuHNjarRxSuX9b8uuPwCqecf7rKyd5TOO4sgroEQUA@mail.gmail.com \
    --to=rochkind@basepath.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).