From: "Greg A. Woods" <woods@robohack.ca>
To: The Unix Heritage Society mailing list <tuhs@tuhs.org>
Subject: Re: [TUHS] another conversion of the CSRG BSD SCCS archives to Git
Date: Sat, 30 Nov 2019 17:25:22 -0800
Message-ID: <m1ibDzG-0036tPC@more.local>
In-Reply-To: <20191129215258.Vgu-C%steffen@sdaoden.eu>

At Fri, 29 Nov 2019 22:52:58 +0100, Steffen Nurpmeso <steffen@sdaoden.eu> wrote:
>
> Greg A. Woods wrote in <m1iVoBV-0036tPC@more.local>:
>  |I've been fixing and enhancing James Youngman's git-sccsimport to use
>  |with some of my SCCS archives, and I thought it might be the ultimate
>  |stress test of it to convert the CSRG BSD SCCS archives.
>  |
>  |The conversion takes about an hour to run on my old-ish Dell server.
>  |
>  |This conversion is unlike others -- there is some mechanical compression
>  |of related deltas into a single Git commit.
>  |
>  |https://github.com/robohack/ucb-csrg-bsd
>
> Thanks for taking the time to produce a CSRG repo that seems to
> mimic changesets as they really happened.  As i never made it
> there on my own, i have switched to yours some weeks ago.  (Mind
> you, after doing "gc --aggressive --prune=all" the repository size
> has more than halved, it was the final reason to prepare new
> repositories on a vhost with good internet connection before
> getting this through my flaky wifi here.  Storage and internet
> bandwidth and their cost really do not seem to bother anyone
> anymore.  I have no offense in mind, i only recognized it (the
> hard way).)

Ah!  I did indeed forget the "git gc" step that many conversion guides
recommend.  I might change the import script to do that automatically,
particularly if it has also initialised the repository in the same run.

Apparently github themselves run it regularly:

	https://stackoverflow.com/a/56020315/816536

Probably they do this by configuring "gc.auto" in each repository,
though I've not found any reference to what they might configure it to.

However it seems that without the "--aggressive" option, nothing will be
done in this repository.  With it though I go from 316M down to just 71M.
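
A rough way to reproduce that before/after comparison, as a sketch only:
the function names pack_kib and repack_report are made up for
illustration, and "--prune=now" is my assumption about what one would
want on a freshly converted repository.  The numbers will of course
differ from the 316M and 71M quoted above.

```shell
# Size of all packfiles in KiB, as reported by "git count-objects -v".
pack_kib() {
    git -C "$1" count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Repack aggressively and print the pack size before and after.
# $1 is the path to the repository (hypothetical; supply your own).
repack_report() {
    repo=$1
    before=$(pack_kib "$repo")
    git -C "$repo" gc --aggressive --prune=now --quiet || return 1
    after=$(pack_kib "$repo")
    echo "size-pack: ${before} KiB -> ${after} KiB"
}
```

Usage would be something like "repack_report ./ucb-csrg-bsd" on a local
clone.  Note that "size-pack" ignores loose objects, so the "before"
figure understates a repository that has never been packed at all.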

I don't see any way to force/tell/ask github to run "git gc --aggressive".

Perhaps I can just delete it from github and immediately re-create it
with the re-packed repository, and in theory all the hashes should stay
the same and any existing clones should be unaffected.  What do you think?
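
The "hashes should stay the same" part can at least be checked locally
before anything is deleted.  Repacking only changes how objects are
stored, not their IDs, so every ref should name the same object
afterwards.  A minimal sketch, with ref_snapshot and refs_survive_gc
being hypothetical helper names:

```shell
# List every ref together with the object ID it points at.
ref_snapshot() {
    git -C "$1" for-each-ref --format='%(objectname) %(refname)'
}

# Succeeds (exit status 0) if an aggressive gc left all refs unchanged.
# $1 is the path to the repository.
refs_survive_gc() {
    repo=$1
    before=$(ref_snapshot "$repo")
    git -C "$repo" gc --aggressive --quiet || return 1
    after=$(ref_snapshot "$repo")
    [ "$before" = "$after" ]
}
```

This says nothing about how the hosting side treats a deleted and
re-created repository; it only confirms that the re-packed copy still
carries the same refs and object IDs.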

Note I have some thoughts of re-doing the whole conversion anyway, with
more ideas on how to deal with "removed" files (SCCS files renamed to
the likes of "S.foo"), and also on including the many files that were
never checked into SCCS, perhaps even on a per-release basis, thus being
able to create release tags that can be checked out to match the actual
releases on the CDs.  But this will not happen very soon.

--
					Greg A. Woods <gwoods@acm.org>

Kelowna, BC     +1 250 762-7675           RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>     Avoncote Farms <woods@avoncote.ca>
