From: ori@eigenstate.org
To: staal1978@gmail.com, 9fans@9fans.net
Subject: Re: [9fans] Software preservation in the post-hg era
Date: Sun, 10 May 2020 12:04:23 -0700
Message-ID: <AAB31498F8B459E93F85E0988364045E@eigenstate.org>
In-Reply-To: <CAK8RtFrPJ4B0SqD_aC2zHQYL4ki2ednwWtqbpv6V03FRSmLmiQ@mail.gmail.com>
> On Thu, 7 May 2020 at 16:17, Dave MacFarlane <driusan@gmail.com> wrote:
>
>>
>> On Mon, Mar 30, 2020 at 9:12 PM Sean Hinchee <henesy.dev@gmail.com> wrote:
>>
>>> As a footnote, there's a decent git client written in Go that works
>>> alright on plan9 [4], but it's slow and memory intensive at the
>>> moment.
>>>
>>>
>> [...]
>>
>> [4] https://github.com/driusan/dgit
>>
>> This (and the fact that the speed of Go on Plan9/amd64 seems to finally
>> be usable enough to do development again as of 1.14..) finally gave me the
>> kick I needed to fix some of the hacks that were causing performance
>> problems on clone. The self-clone time went from ~160s to ~13s on my
>> machine (compared to ~8s with "real" git) If there's other parts that you
>> were referring to as being slow and memory intensive let me know (or if you
>> still find it's memory intensive, I didn't benchmark that part..)
>>
>> - Dave
>>
>
>
> How does it compare performance wise with git9 ?
>
> https://github.com/oridb/git9
I'll be honest, I'm using git9 because of the improved
interactions, rather than performance -- it's fast enough
for most of my usage. Still, this got me curious enough to
test a bit. Here are the results:
It's close for cloning dgit -- I'm seeing about 3 seconds
for dgit with git/clone, 4.5 using dgit to clone itself.
% time git/clone https://github.com/driusan/dgit
0.81u 1.08s 2.70r
(Looking closer, about 1.5 seconds of that comes
from the dircp to pull data out of /mnt/git/ and
into the working directory.)
When testing dgit, I redirected output to /dev/null,
since it printed enough that it affected the time.
It's *really* chatty -- for the larger test, it
produced more than 50 megabytes of status text.
cpu% time rc -c 'dgit clone https://github.com/driusan/dgit >[2]/dev/null'
0.47u 0.55s 4.32r
It seems like there's something accidentally
quadratic, though. Cloning a larger repository
-- in this case, perl5 -- takes 160s on git9,
and 1200 seconds on dgit. For comparison, git
on OpenBSD with different (but comparable)
hardware takes about 90 seconds.
cpu% time git/clone https://github.com/Perl/perl5.git
94.40u 14.16s 159.30r
cpu% time ./dgit clone https://github.com/Perl/perl5.git >[2]/tmp/dgit.log
121.93u 22.16s 1211.30r
I only skimmed the dgit code quickly, and didn't see
an obvious answer: do you cache objects that you've
decompressed, or do you iterate over full delta chains
every time?
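To make the question concrete, here is a toy sketch of the two strategies.
It is not dgit's code and I haven't confirmed this is what dgit does; the
object, resolver, chain and resolveAll names are made up for illustration,
and the "delta" is just string concatenation rather than git's real
copy/insert instructions. It only counts how many delta applications each
strategy needs when every object in a chain is resolved once.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a packed git object: either a full
// object (base == -1) or a delta against another object in the same pack.
type object struct {
	base int    // index of the base object, or -1 for a non-delta object
	data string // full contents for a base, appended suffix for a delta
}

// resolver reconstructs objects and counts the delta applications it
// performs, so the cost of the two strategies can be compared.
type resolver struct {
	pack    []object
	cache   map[int]string // nil disables caching
	applied int
}

// resolve returns the full contents of object i, walking its delta chain
// back to the base unless an intermediate result is already cached.
func (r *resolver) resolve(i int) string {
	if r.cache != nil {
		if s, ok := r.cache[i]; ok {
			return s
		}
	}
	o := r.pack[i]
	s := o.data
	if o.base >= 0 {
		// Toy delta: append to the resolved base.
		s = r.resolve(o.base) + o.data
		r.applied++
	}
	if r.cache != nil {
		r.cache[i] = s
	}
	return s
}

// chain builds a pack in which object i is a delta against object i-1.
func chain(n int) []object {
	pack := []object{{base: -1, data: "v0"}}
	for i := 1; i < n; i++ {
		pack = append(pack, object{base: i - 1, data: fmt.Sprintf(",v%d", i)})
	}
	return pack
}

// resolveAll resolves every object in the pack once and returns the total
// number of delta applications that required.
func resolveAll(pack []object, useCache bool) int {
	r := &resolver{pack: pack}
	if useCache {
		r.cache = make(map[int]string)
	}
	for i := range pack {
		r.resolve(i)
	}
	return r.applied
}

func main() {
	pack := chain(50)
	fmt.Println("without cache:", resolveAll(pack, false))
	fmt.Println("with cache:   ", resolveAll(pack, true))
}
```

For a 50-object chain this does 1225 delta applications without the cache
and 49 with it, which is the sort of gap that would only show up on larger
repositories.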
One other bug report -- it seems that dgit hard-codes the
default branch as origin/master, but perl uses 'origin/blead',
so the checkout fails with 'Could not find origin/master'.
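One possible way to avoid the hard-coded name, sketched under assumptions:
git servers can advertise where HEAD points through a
"symref=HEAD:refs/heads/<name>" capability on the first line of the ref
advertisement. The headTarget and defaultBranch helpers below are
hypothetical, not part of dgit or git9, and the capability string in main
is an illustrative sample rather than captured server output.

```go
package main

import (
	"fmt"
	"strings"
)

// headTarget scans a space-separated capability list (the text after the
// NUL byte on the first ref advertisement line) for "symref=HEAD:<ref>"
// and returns the ref that HEAD points to.
func headTarget(caps string) (string, bool) {
	const prefix = "symref=HEAD:"
	for _, c := range strings.Fields(caps) {
		if strings.HasPrefix(c, prefix) && len(c) > len(prefix) {
			return strings.TrimPrefix(c, prefix), true
		}
	}
	return "", false
}

// defaultBranch maps the advertised HEAD target to a remote-tracking
// branch name such as "origin/blead". It falls back to "<remote>/master"
// when the server does not advertise a usable symref.
func defaultBranch(caps, remote string) string {
	const heads = "refs/heads/"
	ref, ok := headTarget(caps)
	if !ok || !strings.HasPrefix(ref, heads) {
		return remote + "/master"
	}
	return remote + "/" + strings.TrimPrefix(ref, heads)
}

func main() {
	// Illustrative capability list, not real output from any server.
	caps := "multi_ack side-band-64k symref=HEAD:refs/heads/blead agent=git/2.26.2"
	fmt.Println(defaultBranch(caps, "origin"))
}
```

With the sample capability list this prints origin/blead; without a symref
capability it falls back to origin/master, which matches the current
behaviour.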
There are still places where git9 is very slow. Sending lots
of commits at once in big repositories stands out.
Two reasons for this: we don't deltify, and we walk too much
data deciding what should go into the pack. There's also a
bug that causes certain kinds of merge to push the whole
history spuriously, which is.. only wasteful rather than
incorrect -- but wasteful isn't good.
Pushing all perl commits to an empty repository, for example:
# this is the size of the packfile git gives us
cpu% du -sh .git
297.043M .git
# pushing to git is slow
cpu% git/push -u git+ssh://192.168.1.10/tmp/p5.git
1783.08u 444.15s 2835.86r
# and our undeltified packfiles are 10x the size
# that they should be
$ du -sh p5.git;
4.2G p5.git
I can't compare with dgit, since dgit doesn't support ssh
pushes, and I'm not going to set up http pushes right now.