9fans - fans of the OS Plan 9 from Bell Labs
From: ori@eigenstate.org
To: staal1978@gmail.com, 9fans@9fans.net
Subject: Re: [9fans] Software preservation in the post-hg era
Date: Sun, 10 May 2020 12:04:23 -0700
Message-ID: <AAB31498F8B459E93F85E0988364045E@eigenstate.org>
In-Reply-To: <CAK8RtFrPJ4B0SqD_aC2zHQYL4ki2ednwWtqbpv6V03FRSmLmiQ@mail.gmail.com>

> On Thu, 7 May 2020 at 16:17, Dave MacFarlane <driusan@gmail.com> wrote:
> 
>>
>> On Mon, Mar 30, 2020 at 9:12 PM Sean Hinchee <henesy.dev@gmail.com> wrote:
>>
>>> As a footnote, there's a decent git client written in Go that works
>>> alright on plan9 [4], but it's slow and memory intensive at the
>>> moment.
>>>
>>>
>> [...]
>>
>> [4] https://github.com/driusan/dgit
>>
>> This (and the fact that the speed of Go on Plan9/amd64 seems to finally
>> be useable enough to do development again as of 1.14..) finally gave me the
>> kick I needed to fix some of the hacks that were causing performance
>> problems on clone. The self-clone time went from ~160s to ~13s on my
>> machine (compared to ~8s with "real" git) If there's other parts that you
>> were referring to as being slow and memory intensive let me know (or if you
>> still find it's memory intensive, I didn't benchmark that part..)
>>
>> - Dave
>>
> 
> 
> How does it compare performance wise with git9 ?
> 
> https://github.com/oridb/git9

I'll be honest, I'm using git9 because of the improved
interactions, rather than performance -- it's fast enough
for most of my usage. Still, this got me curious enough to
test a bit. Here are the results:

It's close for cloning dgit -- I'm seeing about 3 seconds
for dgit with git/clone, 4.5 using dgit to clone itself.

	% time git/clone https://github.com/driusan/dgit
	0.81u 1.08s 2.70r 

(Looking closer, about 1.5 seconds of that comes
from the dircp to pull data out of /mnt/git/ and
into the working directory.)

When testing dgit, I redirected output to /dev/null,
since it printed enough that it affected the time.
It's *really* chatty -- for the larger test, it
produced more than 50 megabytes of status text.

	cpu% time rc -c 'dgit clone https://github.com/driusan/dgit >[2]/dev/null'
	0.47u 0.55s 4.32r

It seems like there's something accidentally
quadratic, though. Cloning a larger repository
-- in this case, perl5 -- takes 160s on git9,
and 1200 seconds on dgit. For comparison, git
on OpenBSD with different (but comparable)
hardware takes about 90 seconds.

	cpu% time git/clone https://github.com/Perl/perl5.git
	94.40u 14.16s 159.30r

	cpu% time ./dgit clone https://github.com/Perl/perl5.git >[2]/tmp/dgit.log
	121.93u 22.16s 1211.30r

I only skimmed the dgit code quickly, and didn't see
an obvious answer: do you cache objects that you've
decompressed, or do you iterate over full delta chains
every time?
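
To make the question concrete, here is a minimal Go sketch of the
difference a cache makes when resolving delta chains. It is not taken
from dgit or git9: the object type, the resolver type, and the toy
"delta" (which just appends text to its base) are all hypothetical
stand-ins for real pack parsing and delta application.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a packed object: either a full
// object (base == "") or a delta against another object.
type object struct {
	base string // hash of the delta base; empty for a full object
	data string // full content, or the text this toy delta appends
}

// resolver reconstructs objects and memoizes each result, so every
// object in a delta chain is rebuilt at most once.
type resolver struct {
	pack  map[string]object
	cache map[string]string
	steps int // number of objects actually reconstructed
}

func (r *resolver) resolve(hash string) (string, error) {
	if s, ok := r.cache[hash]; ok {
		return s, nil // already rebuilt; no need to walk the chain again
	}
	o, ok := r.pack[hash]
	if !ok {
		return "", fmt.Errorf("missing object %s", hash)
	}
	r.steps++
	content := o.data
	if o.base != "" {
		b, err := r.resolve(o.base)
		if err != nil {
			return "", err
		}
		content = b + o.data // toy delta: append to the base
	}
	r.cache[hash] = content
	return content, nil
}

func main() {
	r := &resolver{
		pack: map[string]object{
			"a": {data: "v1"},
			"b": {base: "a", data: "+v2"},
			"c": {base: "b", data: "+v3"},
		},
		cache: map[string]string{},
	}
	for _, h := range []string{"a", "b", "c"} {
		s, err := r.resolve(h)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(h, s)
	}
	fmt.Println("reconstructions:", r.steps)
}
```

With the cache, resolving every object in a chain of length n costs n
reconstructions. Without it, each object re-walks its whole chain, so
the total grows roughly with n squared, which is the kind of behaviour
that would match the numbers above. A real implementation would also
need to bound the cache size.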

One other bug report -- it seems that dgit hard-codes the
default branch as origin/master, but perl uses 'origin/blead',
so the checkout fails with 'Could not find origin/master'.
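
One way to avoid hard-coding the branch is to read it from the server's
ref advertisement: git servers can announce where HEAD points through a
symref capability of the form symref=HEAD:refs/heads/NAME. The Go
sketch below only parses such a capability string. The function name
headTarget and the sample capability list are made up for illustration,
and this is not how dgit is known to be structured.

```go
package main

import (
	"fmt"
	"strings"
)

// headTarget scans a space-separated capability list, as sent on the
// first line of a ref advertisement, and returns the ref that HEAD
// points at according to the symref capability. The second result is
// false when the server did not advertise one.
func headTarget(caps string) (string, bool) {
	const prefix = "symref=HEAD:"
	for _, c := range strings.Fields(caps) {
		if strings.HasPrefix(c, prefix) {
			return strings.TrimPrefix(c, prefix), true
		}
	}
	return "", false
}

func main() {
	// Illustrative capability list; real servers send more entries.
	caps := "multi_ack side-band-64k symref=HEAD:refs/heads/blead"
	if ref, ok := headTarget(caps); ok {
		fmt.Println("default branch ref:", ref)
	} else {
		fmt.Println("no symref advertised; fall back to a configured default")
	}
}
```

When the capability is absent, falling back to a configurable default
rather than a fixed master would still cover older servers.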

There are still places where git9 is very slow. Sending lots
of commits at once in big repositories stands out.

Two reasons for this: we don't deltify, and we walk too much
data deciding what should go into the pack. There's also a
bug that causes certain kinds of merge to push the whole
history spuriously, which is only wasteful rather than
incorrect -- but wasteful isn't good.
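
For illustration, the "what should go into the pack" decision can be
sketched as a graph walk that stops at anything the remote already has.
This is a generic Go sketch, not git9's code: the toSend function, the
parents map, and the have set are hypothetical, and it assumes that a
remote holding a commit also holds all of that commit's ancestors.

```go
package main

import "fmt"

// toSend walks parent links from tip and collects the commits the
// remote lacks. It does not descend past any commit in have, assuming
// the remote also holds every ancestor of a commit it has.
func toSend(parents map[string][]string, tip string, have map[string]bool) []string {
	var out []string
	seen := map[string]bool{}
	stack := []string{tip}
	for len(stack) > 0 {
		c := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[c] || have[c] {
			continue // already handled, or the remote has it
		}
		seen[c] = true
		out = append(out, c)
		stack = append(stack, parents[c]...)
	}
	return out
}

func main() {
	// Toy history: a <- b <- c <- d, with a merge e of d and x (x forks from b).
	parents := map[string][]string{
		"b": {"a"},
		"c": {"b"},
		"d": {"c"},
		"x": {"b"},
		"e": {"d", "x"},
	}
	have := map[string]bool{"b": true} // the remote's branch tip
	fmt.Println(toSend(parents, "e", have))
}
```

The walk touches only commits newer than what the remote has, so its
cost scales with the size of the push rather than the size of the
repository. A merge that brings in a side branch still stops as soon as
that branch reaches a commit the remote knows about.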

Pushing all perl commits to an empty repository, for example:

	# this is the size of the packfile git gives us
	cpu% du -sh .git
	297.043M	.git

	# pushing to git is slow
	cpu% time git/push -u git+ssh://192.168.1.10/tmp/p5.git
	1783.08u 444.15s 2835.86r

	# and our undeltified packfiles are 10x the size
	# that they should be
	$ du -sh p5.git
	4.2G	p5.git

I can't compare with dgit, since dgit doesn't support ssh
pushes, and I'm not going to set up http pushes right now.

