From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.1 required=5.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE, SUBJ_ALL_CAPS autolearn=ham autolearn_force=no version=3.4.2 Received: from minnie.tuhs.org (minnie.tuhs.org [45.79.103.53]) by inbox.vuxu.org (OpenSMTPD) with ESMTP id 1ba12395 for ; Thu, 12 Sep 2019 04:21:14 +0000 (UTC) Received: by minnie.tuhs.org (Postfix, from userid 112) id AB04E9487F; Thu, 12 Sep 2019 14:21:13 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id BC8C194790; Thu, 12 Sep 2019 14:20:27 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=algebras-org.20150623.gappssmtp.com header.i=@algebras-org.20150623.gappssmtp.com header.b="tiuaor0V"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id D774194790; Thu, 12 Sep 2019 14:20:22 +1000 (AEST) Received: from mail-io1-f41.google.com (mail-io1-f41.google.com [209.85.166.41]) by minnie.tuhs.org (Postfix) with ESMTPS id 31A3193D35 for ; Thu, 12 Sep 2019 14:20:22 +1000 (AEST) Received: by mail-io1-f41.google.com with SMTP id n197so51359100iod.9 for ; Wed, 11 Sep 2019 21:20:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=algebras-org.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=OrJhtXrp3s2QepkWAA0e9A21L+Vdap/8SmRANSUolRM=; b=tiuaor0VHJTVCilMIApiJ9CRaPTJjKPvIwZ4K7WCNrsUej4i7f8Vdyzs4yD6qDVbhS RoXe5rleU6HTCuKPWR04I/Xc8SSmqclkFGJrzQ0LEDEJ2gfeCNP9QeJuaZq62NHmKirY GY3oeRXo2mQCQ+FuaeIJI9RsfiUScmw+4xEi5ivqgsD7W3mvr+Bupa9IC4srj6tw48PB 48HEH5XJnHDdyh1raL/UTnbzH3cKVkIKmgTWydRix6U8WCwDS/ixPJvzIzelnwWGc4RC 54pIjG9UL4aCOOxcPXr734YcC4nENtSBb+vSdpbaxPnOtTboOJOVX3ax8XYdtpm09GU3 U/3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=OrJhtXrp3s2QepkWAA0e9A21L+Vdap/8SmRANSUolRM=; b=NEdDX3Ng3PCSIzlat+1ImbZF1Ia0YFpo0l333ytrEjKOOX63qEZnm/ggi63kpFLqTf hxCTXxhldiXfb4rEUAHZIMzY3dvjVl71RN1u56dnliWxtW9Iyv1/4+4ZJsWvOsdkC8/f eg+coIQZYbLX2PHYPeD4A/H1r54UrCoYzRbyHF2PQHQO4ZLnHm+dOgKUlq0Loyj7Oflf ikim1jMnVrQpsgUbALDtQi+n2M8S/eaqrQDeLo3bcMNTi10eQwPl0fPbNUZp3WAbRssJ 8todSU6q3ZBSflON2x5yvRW+ZQHjid/EhNhhhwXFYBN5FI7davx1xGjFG53dURPvQqTp KVww== X-Gm-Message-State: APjAAAU3KWj7jFzRjUcYIAkrM773Y4mSLye0Ap5m4MSK2szbX2VNzhR9 n+BLBaK215gWzSGSmZTtMVsaHrfC140//5PKMyMRHw== X-Google-Smtp-Source: APXvYqw1qUdjbZqMSl9T2MNlMtniInlmsXqE4kjVUA16wE8NabX5+8sd1GijINBE94erQU9j1eyz+tyoEudhNxNmTec= X-Received: by 2002:a5e:c74d:: with SMTP id g13mr1909790iop.65.1568262021476; Wed, 11 Sep 2019 21:20:21 -0700 (PDT) MIME-Version: 1.0 References: <20190911181101.GF3143@mcvoy.com> <20190912034346.GJ2046@mcvoy.com> In-Reply-To: <20190912034346.GJ2046@mcvoy.com> From: George Michaelson Date: Thu, 12 Sep 2019 11:20:08 +0700 Message-ID: To: Larry McVoy Content-Type: text/plain; charset="UTF-8" Subject: Re: [TUHS] SCCS X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: The Eunuchs Hysterical Society , rick@rbsmith.com Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" If you want to be trendy, call this an event stream, and say its eventually consistent. What Larry and the other RCS haters forget is that back in the day, when we all had more hair, RCS was --->FAST<--- and SCCS was S.L.O.W. because running forward edits on a base state of 1000 edits is slow. Since the majority action is increment +1 on the head state the RCS model, whilst broken in many ways was FAST -G On Thu, Sep 12, 2019 at 10:44 AM Larry McVoy wrote: > > On Thu, Sep 12, 2019 at 08:49:35AM +1000, Dave Horsfall wrote: > > On Wed, 11 Sep 2019, Larry McVoy wrote: > > > > >SCCS is awesome, I'll post why in a different thread. It is light years > > >better than RCS, Tichy lied. > > > > Agreed; I used it for years, then was forced to use RCS in my next job :-( > > Marc Rochkind is on the list, I believe I invited him, but he can spell > check what I'm about to say. > > SCCS is awesome. People have no idea how good it is as a version control > system because Walter Tichy got his PhD for writing RCS and claiming it > was better (don't get me started on how wrong that was). > > Let me start with how RCS works. You all know about diff and patch, > someone does a diff, produces a patch, and Larry Wall wrote patch(1) > that takes the patch and file and applies it. In a straight line > history here is how RCS works. The most recent version of the file > is stored in whole, the delta behind that is a reverse patch to get > to that, same for the next delta behind that and so on. > > Branches in RCS are forward patches. Sort of. You start with the > whole file that is the most recent version, reverse patch your way > back to the branch point, and then forward patch your way down to > the most recent version on the branch. > > Yeah, branches in RCS suck, very expensive. > > So why is SCCS awesome? It is an entirely different approach. > SCCS is a set based system, for any version, there is a set > of deltas that are in that version and you apply them to the > file part of the data. > > Huh? What does that mean? OK, you've all seen SCCS revisions, 1.1, > 1.2, 1.3, 1.3.1.1, etc. Yeah, that's for humans (and truth be told those > revs are not that great). For SCCS internally there are serial numbers. > All those are are a monotonically increasing number, doesn't matter > if you are on the trunk or on a branch, each time you add a delta the > internal number, or serial, is the last serial plus 1. > > When you go to check out a version of the file, that's the set. > It's the set of serials that make up that file. If you wanted > 1.3 and there are no branches, the set is the serial of 1.3 (3) > and the parent of 1.3 which is 1.2 (2) and 1.1 (1). So your set > is 1,2,3. > > Here is the awesome part. The way the data is stored in SCCS > is nothing like patches, it's what we call a weave. All versions > of the file are woven together in the following way. There are > 3 operators: > > insert: ^AI > delete: ^AD > end: ^AE > > So if you checked in a file that looked like > > I > love > TUHS > > The weave would be > > ^AI 1 > I > love > TUHS > ^AE 1 > > Lets say that Clem changed that to > > I > really > love > TUHS > > The new weave would look like: > > ^AI 1 > I > ^AI 2 > really > ^AE 2 > love > TUHS > ^AE 1 > > and if I changed it to > > I > *really* > love > TUHS > > the weave looks like > > ^AI 1 > I > ^AD 3 > ^AI 2 > really > ^AE 2 > ^AE 3 > ^AI 3 > *really* > ^AE 3 > love > TUHS > ^AE 1 > > So a checkout is 2 things, build up the set of serials that need to be > active for this checkout, and walk the weave. For each serial you see > you are either in this set and I need to do what it says, or this is > not in my set and I need to do the opposite of what it says. > > So that is really really fast compared to RCS. RCS reads the whole > file and has to do compute, SCCS reads the whole file and does a > very tiny fraction of that compute, so tiny that you can't measure > it compared to reading the file. > > But wait, it gets better, much better. Lets talk about branches > and merges. RCS is copy by value across a merge, SCCS is copy by > reference. Marc thought about the set stuff enough to realize > wouldn't be cool if you could manipulate the set. He added include > and exclude operators. > > Imagine if you and I were having an argument about some video being > checked in. You checked it in, I deleted it, you checked it in, I deleted > it. Suppose that was a 1GB video. In RCS, each time we argued that is > another GB, we did that 4 times, there 4GB of diffs in the history. > > In SCCS, you could do that with includes and excludes, those 4 times > would be about 30 bytes because all they are doing is saying "In the > set", "Not in the set". > > Cool I guess but what is the real life win? Merges. In a weave based > system like SCCS you can add 1GB on a branch and I can merge that onto > the trunk and all I did was say "include your serials". I didn't copy > your work, I referenced it. Tiny meta data to do so. > > That has more implications than you might think. Think annotations. > git blame, know that? It shows who did what? Yeah, well git is > copy by value just like RCS. It's not just about the space used, > it is also about who did what. If bwk did one thing and dmr did > another thing and little old lm merged dmr's stuff into the bwk > trunk, in a copy by value system, all of dmr's work will look like > I wrote it in bwk's trunk. > > SCCS avoids that. If I merged dmr's work into bwk's tree, if it > all automerged, annotations would show it all as dmr's work, yeah, > I did the merge but I didn't do anything so I don't show up in > annotations. If there was a conflict and I had to resolve that > conflict, then that, and that alone, would show up as my work. > > For Marc, I worked with Rick Smith, he found me after I had done a > reimplentation of SCCS. He has forgotten more about weaves than I will > ever know. But he was really impressed with my code (which you can > go see at bitkeeper.org, and it is not my code, it is my teams code, > Rick bugfixed all my mistakes) because I embraced the set thing and the > way I wrote the code you could have N of my data structures and pulled > out N versions of the file in one pass. He had not seen that before, > to me it just seemed the most natural way to do it. > > SCCS is awesome, Marc should be held up as a hero for that. Most people > have no idea how much better it is as a format, to this day people still > do it wrong. The hg people at facebook sort of got it, they did an > import of SCCS (it was BitKeeper which is SCCS on super steriods). > But it is rare that someone gets it. I wanted Marc to know we got it. > > --lm