List for cgit developers and users
* Cgit cache and slownesses
       [not found]     ` <CAHmME9pL0irbwoxK3y-RmHQ7mf9LMERSMUnS4z52r8m98anA4g@mail.gmail.com>
@ 2014-03-19 14:30       ` esajine
  2014-03-19 18:33         ` esajine
  0 siblings, 1 reply; 2+ messages in thread
From: esajine @ 2014-03-19 14:30 UTC (permalink / raw)


On 8/23/2013 8:17 AM, Jason A. Donenfeld wrote:
> The list is now opened up. Would you pass this message along to it?
> Sorry for the delay in this. Back from vacation now, getting things
> organized finally.
>

Hi!

Unfortunately, my message didn't seem to get any traction, so I have to
repeat it:

http://lists.zx2c4.com/pipermail/cgit/2013-August/001526.html

I have 1600+ git repos that cgit is working with, and this number is
growing steadily.
Cgit is slow at times, and I'm trying to find ways to improve its
performance.

1. I have set the cache size to 2000. Sometimes it takes up to a minute
or more to load a page when drilling down into the tree or snapshots.
I'm wondering whether I should increase the TTL for the repos (I'm
currently using the default values). Is there anything else that could
make Cgit faster? Increasing the TTL is probably one way, but it would
show some outdated info...

Considering this, I thought about the following solution (by analogy
with Jenkins):
We serve repositories over the Git protocol. Since some 1.8.* version,
git daemon supports --access-hook. Our access hook script gets the name
of the Git service being used, and if it is receive-pack, it uses curl
to call a Jenkins URL with the URL of the affected repository, which
schedules a Jenkins poll immediately. This poll picks up the most recent
changes and builds the code.
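
For readers who want to picture the hook described above, here is a minimal sketch in Python rather than shell plus curl. It assumes the argument order documented for git daemon's --access-hook (service name, repository path, hostname, canonical hostname, IP address, TCP port) and the Jenkins Git plugin's /git/notifyCommit endpoint. JENKINS_URL, CLONE_URL_PREFIX and BASE_PATH are placeholders, not values from our setup.

```python
#!/usr/bin/env python3
"""Sketch of a git daemon access hook that pings Jenkins on pushes."""
import sys
import urllib.parse
import urllib.request

# Assumptions (placeholders): Jenkins base URL, public clone URL prefix,
# and the directory that git daemon serves repositories from.
JENKINS_URL = "http://jenkins.example.com"
CLONE_URL_PREFIX = "git://git.example.com"
BASE_PATH = "/srv/git"


def build_notify_url(jenkins_url, repo_url):
    """Return the notifyCommit URL that asks Jenkins to poll repo_url."""
    query = urllib.parse.urlencode({"url": repo_url})
    return f"{jenkins_url}/git/notifyCommit?{query}"


def handle(service, repo_path, base_path=BASE_PATH):
    """Return the URL to call for this request, or None if it is not a push."""
    if service != "receive-pack":
        return None
    if repo_path.startswith(base_path):
        repo_path = repo_path[len(base_path):]
    repo_url = CLONE_URL_PREFIX + "/" + repo_path.lstrip("/")
    return build_notify_url(JENKINS_URL, repo_url)


def main(argv):
    # argv[1] is the service name, argv[2] the repository path.
    url = handle(argv[1], argv[2])
    if url is not None:
        try:
            urllib.request.urlopen(url, timeout=5).close()
        except OSError:
            pass  # never refuse a push because Jenkins is unreachable
    return 0  # exit status 0 lets git daemon serve the request


# Only act when git daemon actually supplied the hook arguments.
if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(main(sys.argv))
```

Note that git daemon runs the hook before it executes the service, so the notification can arrive before the push has finished; this sketch does nothing to close that gap.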

Now, it would probably improve performance a lot if, instead of
expiring the Cgit cache every 5 min, we could expire it upon a push.
But when I checked how many pushes per day we have, we averaged about
200. That means that with this approach we would sometimes rescan even
more often than with the hardcoded value.
So maybe the solution would be either to
a) rescan only the repo that was touched (that's preferred, of course)
and combine that with a full rescan every TTL interval, which in this
case could be something like daily, or
b) introduce logic that rescans only upon a push, but no more often
than once per TTL period, i.e. the TTL would be the minimum interval
between cache expirations (if last_push_time - last_scan_time > ttl,
then scan; else wait for TTL expiration).
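
To make option b) concrete, here is a small Python sketch of that rule. It is only an illustration of the proposed logic, not anything Cgit implements; the function names and the use of seconds for all times are assumptions.

```python
def should_rescan(last_push_time, last_scan_time, ttl):
    """Option b): scan right away only if the push came more than ttl
    seconds after the last scan."""
    return last_push_time - last_scan_time > ttl


def plan_scans(push_times, ttl, last_scan_time=0.0):
    """Return the times at which scans would run for the given pushes."""
    scans = []
    for push in sorted(push_times):
        if push <= last_scan_time:
            continue  # a scan at or after this push already covers it
        if should_rescan(push, last_scan_time, ttl):
            scan_at = push  # enough time has passed, scan immediately
        else:
            scan_at = last_scan_time + ttl  # wait for TTL expiration
        scans.append(scan_at)
        last_scan_time = scan_at
    return scans


# Five pushes with a 300 second TTL collapse into three scans.
print(plan_scans([10, 20, 400, 410, 1000], ttl=300))
```

With this rule, bursts of pushes cost at most one scan per TTL period, and quiet periods cost no scans at all.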

So eventually I'm looking for Cgit to have a URL like this:
http://server/cgi-bin/cgit.cgi/myrepo?do_scan  that would update this
particular repo's info in the Cgit cache.

2. It seems that the caching mechanism doesn't work properly for me:
I have changed the cache-root, and I don't see any files being saved in
the new location. The folder is always empty.
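
Two possible causes worth ruling out for an empty cache directory are a cache-size of 0 (the documented default in cgitrc, which disables caching) and a cache-root that the web server user cannot write to. The Python sketch below checks both. The helper names are made up for illustration, it has to be run as the same user as the CGI process to be meaningful, and it is not a diagnosis of this particular setup.

```python
import os


def parse_cgitrc(text):
    """Parse 'name=value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def cache_problems(settings):
    """Return a list of reasons why no cache files would be written."""
    problems = []
    # cgitrc documents cache-size=0 as the default, meaning no caching.
    if int(settings.get("cache-size", "0")) == 0:
        problems.append("cache-size is 0, so caching is disabled")
    # cgitrc documents /var/cache/cgit as the default cache-root.
    root = settings.get("cache-root", "/var/cache/cgit")
    if not os.path.isdir(root):
        problems.append(f"cache-root {root} is not a directory")
    elif not os.access(root, os.W_OK):
        problems.append(f"cache-root {root} is not writable by this user")
    return problems
```

Usage would be along the lines of `cache_problems(parse_cgitrc(open("/etc/cgitrc").read()))`, where the path to the configuration file is again an assumption.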

Thanks,
Eugene






* Cgit cache and slownesses
  2014-03-19 14:30       ` Cgit cache and slownesses esajine
@ 2014-03-19 18:33         ` esajine
  0 siblings, 0 replies; 2+ messages in thread
From: esajine @ 2014-03-19 18:33 UTC (permalink / raw)


On 3/19/2014 10:30 AM, Eugene Sajine wrote:
> 2. It seems that the caching mechanism doesn't work properly for me
> because I have changed the cache-root and i don't see any files being
> saved in this new location - the folder is always empty.
>
> Thanks,
> Eugene
>
>
>
I didn't see that that the new version has some chaching fixes - I'll 
try to use the newest one and come back with the results


Thanks,
Eugene

