From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <20090211014302.GP22259@masters6.cs.jhu.edu>
References: <1234305943.4957.189.camel@goose.sun.com> <20090211014302.GP22259@masters6.cs.jhu.edu>
Date: Wed, 11 Feb 2009 19:07:48 +0100
Message-ID: <5d375e920902111007i7812a94av44827c592800f85f@mail.gmail.com>
From: Uriel
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] Plan 9 source history (was: Re: source browsing via http is back)
Topicbox-Message-UUID: 9ead2246-ead4-11e9-9d60-3106f5b1d025

Oh, glad that somebody found my partial git port useful, I might give
it another push some time.

Having a git/hg repo of the plan9 history is something I have been
thinking about for a while, really cool that you got something going
already.

Will you provide a standard git web interface (and a 'native' git
interface for more efficient cloning)?

Peace

uriel

On Wed, Feb 11, 2009 at 2:43 AM, Nathaniel W Filardo wrote:
> On Tue, Feb 10, 2009 at 02:45:43PM -0800, Roman V. Shaposhnik wrote:
>> On Tue, 2009-02-10 at 17:28 -0500, erik quanstrom wrote:
>> > what leads you to believe that that amount of sharing will be
>> > significant?
>>
>> Just a hunch so far. I don't have hard data to prove anything.
>> On the other hand, I'd be surprised if massive updates (not pulling
>> in a couple of months) didn't benefit from the sharing.
>>
>> Thanks,
>> Roman.
>
> I have mirrored, with vac -f, every sources dump from 2002 to
> yesterday with
>   -e acme/acid/386 -e acme/acid/alpha -e acme/acid/arm \
>   -e acme/acid/mips -e acme/acid/power -e acme/bin/386 \
>   -e acme/bin/alpha -e acme/bin/arm -e acme/bin/mips \
>   -e acme/bin/power -e acme/mail/386 -e acme/mail/alpha \
>   -e acme/mail/arm -e acme/mail/mips -e acme/mail/power \
>   -e sys/man/vol1.ps -e sys/man/vol1.ps.gz -e sys/man/vol1.pdf \
>   LICENSE* NOTICE acme lib rc sys ;
> intending to get all the source and not the binaries. I patched my vac to
> ignore atimes (replacing the vac metadata field with the mtime) to increase
> metadata block sharing. As of 2009/0205 (a convenient snapshot to du), this
> represents about 140.7 MB of data per dump. The entire copy takes 550 MB
> (240 MB actual storage in Venti). (With no sharing whatsoever, this would
> be approx. 310 GB.) I would like to re-archive this with the Rabin
> fingerprinting vac for comparison.
>
> (In case anybody wants to rush out and recreate the results, it took
> roughly 10 to 15 minutes per dump to dispatch all the Tstat requests to
> sources.)
>
> Incidentally, a git repository of the crawls, from 2002/1212 to 2009/0205,
> is available at http://mirrors.acm.jhu.edu/trees/plan9native/ . Git gets
> the data down to 165M after a gc run, so perhaps it's a better idea than a
> venti-based mirror. I haven't managed to make my version of Uriel's port
> (thanks for the start! :) ) of git do the right thing in enough cases yet,
> so the git repo may not be updated for a while, but I figured somebody might
> want to play with it in the interim.
>
> --nwf;
>
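
The storage figures quoted above can be sanity-checked with a few lines
of Python. This is only a rough sketch: it assumes decimal units
(1 GB = 1000 MB) and treats the 140.7 MB size of the 2009/0205 dump as
representative of every dump, which the quoted message does not claim.
All variable names are made up for illustration.

```python
from datetime import date

# Figures quoted in the message (assumption: decimal units, 1 GB = 1000 MB).
per_dump_mb = 140.7       # data in the 2009/0205 dump
unshared_gb = 310.0       # estimated total with no sharing whatsoever
vac_total_mb = 550.0      # size of the entire vac copy
venti_stored_mb = 240.0   # actual storage used in Venti
git_repo_mb = 165.0       # git repository after a gc run

unshared_mb = unshared_gb * 1000

# Assumption: every dump is about as large as the 2009/0205 one,
# so this is only an order-of-magnitude estimate of the dump count.
implied_dumps = unshared_mb / per_dump_mb

# Days covered by the git crawl range, for comparison with the estimate.
span_days = (date(2009, 2, 5) - date(2002, 12, 12)).days

print(f"implied number of dumps:      {implied_dumps:.0f}")
print(f"days from 2002/1212-2009/0205: {span_days}")

# How much smaller each stored form is than the unshared total.
for label, size_mb in [
    ("vac copy", vac_total_mb),
    ("Venti storage", venti_stored_mb),
    ("git after gc", git_repo_mb),
]:
    print(f"{label:14s} {size_mb:6.1f} MB  ~{unshared_mb / size_mb:.0f}x smaller")
```

Under these assumptions the implied dump count lands close to the number
of days in the crawl range, which would fit roughly one dump per day, and
every stored form comes out hundreds of times smaller than the unshared
estimate.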