List for cgit developers and users
 help / color / mirror / Atom feed
* cache issue
@ 2014-03-23 14:06 beber
  2015-02-28 12:06 ` bertrand
  0 siblings, 1 reply; 10+ messages in thread
From: beber @ 2014-03-23 14:06 UTC (permalink / raw)


Hi,

I'm getting some trouble with cgit on enlightenment platforms and cache
since some time, but at it seems to be reproductable with cgit 0.10 here
is a report.

The cache configuration look like this :

cache-root=../cache
cache-size=10000
cache-static-ttl=1
cache-dynamic-ttl=1
cache-repo-ttl=1
cache-root-ttl=1
cache-scanrc-ttl=5

* Main page

$ curl -sD - -o /dev/null https://git.enlightenment.org/ \
  | grep -E '^(Date|Expires|Last-Modified): '
Date: Sun, 23 Mar 2014 14:02:08 GMT
Expires: Sun, 23 Mar 2014 14:02:52 GMT
Last-Modified: Sun, 23 Mar 2014 14:01:52 GMT

In this page, core/elementary.git is shown as last modified '58 min.'
ago.

$ curl -s https://git.enlightenment.org/ \
  | sed -e 's;<html xmlns=.*>;<html>;' \
	| xmlstarlet fo -o -D -R --html 2> /dev/null \
	| xmlstarlet sel -T -t \
      -m "html/body/div/div/table/tr/td/a[@title='core/elementary.git']" \
			-v "../..//span[@class='age-mins']" -n
58 min.

* Repo page

$ curl -sD - -o /dev/null https://git.enlightenment.org/core/elementary.git/ \
  | grep -E '^(Date|Expires|Last-Modified): '
Date: Sun, 23 Mar 2014 14:02:14 GMT
Expires: Mon, 10 Mar 2014 20:49:55 GMT
Last-Modified: Mon, 10 Mar 2014 20:48:55 GMT

As you see, the Expires header is wrong as the configuration state it
should not be older than 1 minute (cache-repo-ttl).

Here, master is the last modified branch and is shown as last modified
'3 hours' ago.

$ curl -s https://git.enlightenment.org/core/elementary.git/ \
  | sed -e 's;<html xmlns=.*>;<html>;' \
	| xmlstarlet fo -o -D -R --html 2> /dev/null \
	| xmlstarlet sel -T -t \
      -m "html/body/div/div/table/tr/td/a[@href='/core/elementary.git/log/']" \
      -v "../..//span[@class='age-hours']" -n
3 hours

* All branch page

$ curl -sD - -o /dev/null https://git.enlightenment.org/core/elementary.git/refs/heads \
  | grep -E '^(Date|Expires|Last-Modified): '
Date: Sun, 23 Mar 2014 14:02:22 GMT
Expires: Sun, 23 Mar 2014 14:03:22 GMT
Last-Modified: Sun, 23 Mar 2014 14:02:22 GMT

In this page, the master is showned as last modified '61 min.' ago.

$ curl -s https://git.enlightenment.org/core/elementary.git/refs/heads \
  | sed -e 's;<html xmlns=.*>;<html>;' \
	| xmlstarlet fo -o -D -R --html 2> /dev/null \
	| xmlstarlet sel -t \
      -m "html/body/div/div/table/tr/td/a[@href='/core/elementary.git/log/']" \
      -v "../..//span[@class='age-mins']" -n
61 min.

How can we fix this ? I have some .lock files in the cache directory, is
there any way to flush the lock files after some period ? Also, to help
debugging this, it should be nice to have a X-Cgit-Cache: header
containing the cache file used given to user.

Once I remove '*.lock' in cache directory, 'Repo page' and 'All branch
page' are equal, but 'Main page' is not OK (ordered in the same as
before) :

62 min.
65 min.
65 min.

-- 
Beber


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2014-03-23 14:06 cache issue beber
@ 2015-02-28 12:06 ` bertrand
  2015-02-28 12:37   ` john
  0 siblings, 1 reply; 10+ messages in thread
From: bertrand @ 2015-02-28 12:06 UTC (permalink / raw)


Hi,

We are still experiencing the issue. Is there any fixes with newer 
releases ?

Cheers,
Bertrand

On 23/03/2014 14:06, Bertrand Jacquin wrote:
> Hi,
> 
> I'm getting some trouble with cgit on enlightenment platforms and cache
> since some time, but at it seems to be reproductable with cgit 0.10 
> here
> is a report.
> 
> The cache configuration look like this :
> 
> cache-root=../cache
> cache-size=10000
> cache-static-ttl=1
> cache-dynamic-ttl=1
> cache-repo-ttl=1
> cache-root-ttl=1
> cache-scanrc-ttl=5
> 
> * Main page
> 
> $ curl -sD - -o /dev/null https://git.enlightenment.org/ \
>   | grep -E '^(Date|Expires|Last-Modified): '
> Date: Sun, 23 Mar 2014 14:02:08 GMT
> Expires: Sun, 23 Mar 2014 14:02:52 GMT
> Last-Modified: Sun, 23 Mar 2014 14:01:52 GMT
> 
> In this page, core/elementary.git is shown as last modified '58 min.'
> ago.
> 
> $ curl -s https://git.enlightenment.org/ \
>   | sed -e 's;<html xmlns=.*>;<html>;' \
> 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
> 	| xmlstarlet sel -T -t \
>       -m 
> "html/body/div/div/table/tr/td/a[@title='core/elementary.git']" \
> 			-v "../..//span[@class='age-mins']" -n
> 58 min.
> 
> * Repo page
> 
> $ curl -sD - -o /dev/null 
> https://git.enlightenment.org/core/elementary.git/ \
>   | grep -E '^(Date|Expires|Last-Modified): '
> Date: Sun, 23 Mar 2014 14:02:14 GMT
> Expires: Mon, 10 Mar 2014 20:49:55 GMT
> Last-Modified: Mon, 10 Mar 2014 20:48:55 GMT
> 
> As you see, the Expires header is wrong as the configuration state it
> should not be older than 1 minute (cache-repo-ttl).
> 
> Here, master is the last modified branch and is shown as last modified
> '3 hours' ago.
> 
> $ curl -s https://git.enlightenment.org/core/elementary.git/ \
>   | sed -e 's;<html xmlns=.*>;<html>;' \
> 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
> 	| xmlstarlet sel -T -t \
>       -m 
> "html/body/div/div/table/tr/td/a[@href='skins/larry/core/elementary.git/log/']" 
> \
>       -v "../..//span[@class='age-hours']" -n
> 3 hours
> 
> * All branch page
> 
> $ curl -sD - -o /dev/null
> https://git.enlightenment.org/core/elementary.git/refs/heads \
>   | grep -E '^(Date|Expires|Last-Modified): '
> Date: Sun, 23 Mar 2014 14:02:22 GMT
> Expires: Sun, 23 Mar 2014 14:03:22 GMT
> Last-Modified: Sun, 23 Mar 2014 14:02:22 GMT
> 
> In this page, the master is showned as last modified '61 min.' ago.
> 
> $ curl -s https://git.enlightenment.org/core/elementary.git/refs/heads 
> \
>   | sed -e 's;<html xmlns=.*>;<html>;' \
> 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
> 	| xmlstarlet sel -t \
>       -m 
> "html/body/div/div/table/tr/td/a[@href='skins/larry/core/elementary.git/log/']" 
> \
>       -v "../..//span[@class='age-mins']" -n
> 61 min.
> 
> How can we fix this ? I have some .lock files in the cache directory, 
> is
> there any way to flush the lock files after some period ? Also, to help
> debugging this, it should be nice to have a X-Cgit-Cache: header
> containing the cache file used given to user.
> 
> Once I remove '*.lock' in cache directory, 'Repo page' and 'All branch
> page' are equal, but 'Main page' is not OK (ordered in the same as
> before) :
> 
> 62 min.
> 65 min.
> 65 min.

-- 
Bertrand


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-02-28 12:06 ` bertrand
@ 2015-02-28 12:37   ` john
  2015-03-01 18:43     ` bertrand
  0 siblings, 1 reply; 10+ messages in thread
From: john @ 2015-02-28 12:37 UTC (permalink / raw)


On Sat, Feb 28, 2015 at 12:06:41PM +0000, Bertrand Jacquin wrote:
> We are still experiencing the issue. Is there any fixes with newer 
> releases ?

I have just tried to reproduce this with the latest version and have not
been able to do so, but I'm not aware of any changes that should have an
effect on this (there is one cache change, 6ceba45 Skip cache slot when
time-to-live is zero, but that only applies if you set one of the *-ttl
values to zero).

The cache timeout logic relies on the mtime of the cache file, so this
could be affected by your filesystem, but it sounds like the problem is
that the .lock files are not being cleaned up.  When CGit finds a .lock
file for a cache slot it is trying to use it will just serve the stale
content, on the assumption that is has only just been replaced.

I can't see many ways that you can end up with stale lock files; the
only options are:

1) CGit crashes, in which case there should be some evidence in the
   system log.
2) rename(2) fails, presumably because the destination file exists.


> On 23/03/2014 14:06, Bertrand Jacquin wrote:
> > Hi,
> > 
> > I'm getting some trouble with cgit on enlightenment platforms and cache
> > since some time, but at it seems to be reproductable with cgit 0.10 
> > here
> > is a report.
> > 
> > The cache configuration look like this :
> > 
> > cache-root=../cache
> > cache-size=10000
> > cache-static-ttl=1
> > cache-dynamic-ttl=1
> > cache-repo-ttl=1
> > cache-root-ttl=1
> > cache-scanrc-ttl=5
> > 
> > * Main page
> > 
> > $ curl -sD - -o /dev/null https://git.enlightenment.org/ \
> >   | grep -E '^(Date|Expires|Last-Modified): '
> > Date: Sun, 23 Mar 2014 14:02:08 GMT
> > Expires: Sun, 23 Mar 2014 14:02:52 GMT
> > Last-Modified: Sun, 23 Mar 2014 14:01:52 GMT
> > 
> > In this page, core/elementary.git is shown as last modified '58 min.'
> > ago.
> > 
> > $ curl -s https://git.enlightenment.org/ \
> >   | sed -e 's;<html xmlns=.*>;<html>;' \
> > 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
> > 	| xmlstarlet sel -T -t \
> >       -m 
> > "html/body/div/div/table/tr/td/a[@title='core/elementary.git']" \
> > 			-v "../..//span[@class='age-mins']" -n
> > 58 min.
> > 
> > * Repo page
> > 
> > $ curl -sD - -o /dev/null 
> > https://git.enlightenment.org/core/elementary.git/ \
> >   | grep -E '^(Date|Expires|Last-Modified): '
> > Date: Sun, 23 Mar 2014 14:02:14 GMT
> > Expires: Mon, 10 Mar 2014 20:49:55 GMT
> > Last-Modified: Mon, 10 Mar 2014 20:48:55 GMT
> > 
> > As you see, the Expires header is wrong as the configuration state it
> > should not be older than 1 minute (cache-repo-ttl).
> > 
> > Here, master is the last modified branch and is shown as last modified
> > '3 hours' ago.
> > 
> > $ curl -s https://git.enlightenment.org/core/elementary.git/ \
> >   | sed -e 's;<html xmlns=.*>;<html>;' \
> > 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
> > 	| xmlstarlet sel -T -t \
> >       -m 
> > "html/body/div/div/table/tr/td/a[@href='skins/larry/core/elementary.git/log/']" 
> > \
> >       -v "../..//span[@class='age-hours']" -n
> > 3 hours
> > 
> > * All branch page
> > 
> > $ curl -sD - -o /dev/null
> > https://git.enlightenment.org/core/elementary.git/refs/heads \
> >   | grep -E '^(Date|Expires|Last-Modified): '
> > Date: Sun, 23 Mar 2014 14:02:22 GMT
> > Expires: Sun, 23 Mar 2014 14:03:22 GMT
> > Last-Modified: Sun, 23 Mar 2014 14:02:22 GMT
> > 
> > In this page, the master is showned as last modified '61 min.' ago.
> > 
> > $ curl -s https://git.enlightenment.org/core/elementary.git/refs/heads 
> > \
> >   | sed -e 's;<html xmlns=.*>;<html>;' \
> > 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
> > 	| xmlstarlet sel -t \
> >       -m 
> > "html/body/div/div/table/tr/td/a[@href='skins/larry/core/elementary.git/log/']" 
> > \
> >       -v "../..//span[@class='age-mins']" -n
> > 61 min.
> > 
> > How can we fix this ? I have some .lock files in the cache directory, 
> > is
> > there any way to flush the lock files after some period ? Also, to help
> > debugging this, it should be nice to have a X-Cgit-Cache: header
> > containing the cache file used given to user.
> > 
> > Once I remove '*.lock' in cache directory, 'Repo page' and 'All branch
> > page' are equal, but 'Main page' is not OK (ordered in the same as
> > before) :
> > 
> > 62 min.
> > 65 min.
> > 65 min.
> 
> -- 
> Bertrand
> _______________________________________________
> CGit mailing list
> CGit at lists.zx2c4.com
> http://lists.zx2c4.com/mailman/listinfo/cgit


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-02-28 12:37   ` john
@ 2015-03-01 18:43     ` bertrand
  2015-03-01 19:36       ` john
  0 siblings, 1 reply; 10+ messages in thread
From: bertrand @ 2015-03-01 18:43 UTC (permalink / raw)


Hi John,

On 28/02/2015 12:37, John Keeping wrote:
> On Sat, Feb 28, 2015 at 12:06:41PM +0000, Bertrand Jacquin wrote:
>> We are still experiencing the issue. Is there any fixes with newer
>> releases ?
> 
> I have just tried to reproduce this with the latest version and have 
> not
> been able to do so, but I'm not aware of any changes that should have 
> an
> effect on this (there is one cache change, 6ceba45 Skip cache slot when
> time-to-live is zero, but that only applies if you set one of the *-ttl
> values to zero).
> 
> The cache timeout logic relies on the mtime of the cache file, so this
> could be affected by your filesystem, but it sounds like the problem is
> that the .lock files are not being cleaned up.

The filesystem here is a ext4 with no specific option except noatime 
which quiet common.

> When CGit finds a .lock
> file for a cache slot it is trying to use it will just serve the stale
> content, on the assumption that is has only just been replaced.

So there is so assumption the .lock can be obsolete ?

> I can't see many ways that you can end up with stale lock files; the
> only options are:
> 
> 1) CGit crashes, in which case there should be some evidence in the
>    system log.

That might happend, the cgi can in this case be killed after 60 seconds.

> 2) rename(2) fails, presumably because the destination file exists.

I'll try an update in any case.

Thanks

>> On 23/03/2014 14:06, Bertrand Jacquin wrote:
>> > Hi,
>> >
>> > I'm getting some trouble with cgit on enlightenment platforms and cache
>> > since some time, but at it seems to be reproductable with cgit 0.10
>> > here
>> > is a report.
>> >
>> > The cache configuration look like this :
>> >
>> > cache-root=../cache
>> > cache-size=10000
>> > cache-static-ttl=1
>> > cache-dynamic-ttl=1
>> > cache-repo-ttl=1
>> > cache-root-ttl=1
>> > cache-scanrc-ttl=5
>> >
>> > * Main page
>> >
>> > $ curl -sD - -o /dev/null https://git.enlightenment.org/ \
>> >   | grep -E '^(Date|Expires|Last-Modified): '
>> > Date: Sun, 23 Mar 2014 14:02:08 GMT
>> > Expires: Sun, 23 Mar 2014 14:02:52 GMT
>> > Last-Modified: Sun, 23 Mar 2014 14:01:52 GMT
>> >
>> > In this page, core/elementary.git is shown as last modified '58 min.'
>> > ago.
>> >
>> > $ curl -s https://git.enlightenment.org/ \
>> >   | sed -e 's;<html xmlns=.*>;<html>;' \
>> > 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
>> > 	| xmlstarlet sel -T -t \
>> >       -m
>> > "html/body/div/div/table/tr/td/a[@title='core/elementary.git']" \
>> > 			-v "../..//span[@class='age-mins']" -n
>> > 58 min.
>> >
>> > * Repo page
>> >
>> > $ curl -sD - -o /dev/null
>> > https://git.enlightenment.org/core/elementary.git/ \
>> >   | grep -E '^(Date|Expires|Last-Modified): '
>> > Date: Sun, 23 Mar 2014 14:02:14 GMT
>> > Expires: Mon, 10 Mar 2014 20:49:55 GMT
>> > Last-Modified: Mon, 10 Mar 2014 20:48:55 GMT
>> >
>> > As you see, the Expires header is wrong as the configuration state it
>> > should not be older than 1 minute (cache-repo-ttl).
>> >
>> > Here, master is the last modified branch and is shown as last modified
>> > '3 hours' ago.
>> >
>> > $ curl -s https://git.enlightenment.org/core/elementary.git/ \
>> >   | sed -e 's;<html xmlns=.*>;<html>;' \
>> > 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
>> > 	| xmlstarlet sel -T -t \
>> >       -m
>> > "html/body/div/div/table/tr/td/a[@href='skins/larry/core/elementary.git/log/']"
>> > \
>> >       -v "../..//span[@class='age-hours']" -n
>> > 3 hours
>> >
>> > * All branch page
>> >
>> > $ curl -sD - -o /dev/null
>> > https://git.enlightenment.org/core/elementary.git/refs/heads \
>> >   | grep -E '^(Date|Expires|Last-Modified): '
>> > Date: Sun, 23 Mar 2014 14:02:22 GMT
>> > Expires: Sun, 23 Mar 2014 14:03:22 GMT
>> > Last-Modified: Sun, 23 Mar 2014 14:02:22 GMT
>> >
>> > In this page, the master is showned as last modified '61 min.' ago.
>> >
>> > $ curl -s https://git.enlightenment.org/core/elementary.git/refs/heads
>> > \
>> >   | sed -e 's;<html xmlns=.*>;<html>;' \
>> > 	| xmlstarlet fo -o -D -R --html 2> /dev/null \
>> > 	| xmlstarlet sel -t \
>> >       -m
>> > "html/body/div/div/table/tr/td/a[@href='skins/larry/core/elementary.git/log/']"
>> > \
>> >       -v "../..//span[@class='age-mins']" -n
>> > 61 min.
>> >
>> > How can we fix this ? I have some .lock files in the cache directory,
>> > is
>> > there any way to flush the lock files after some period ? Also, to help
>> > debugging this, it should be nice to have a X-Cgit-Cache: header
>> > containing the cache file used given to user.
>> >
>> > Once I remove '*.lock' in cache directory, 'Repo page' and 'All branch
>> > page' are equal, but 'Main page' is not OK (ordered in the same as
>> > before) :
>> >
>> > 62 min.
>> > 65 min.
>> > 65 min.
>> 
>> --
>> Bertrand
>> _______________________________________________
>> CGit mailing list
>> CGit at lists.zx2c4.com
>> http://lists.zx2c4.com/mailman/listinfo/cgit

-- 
Bertrand


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-03-01 18:43     ` bertrand
@ 2015-03-01 19:36       ` john
  2015-03-03  9:31         ` john
  0 siblings, 1 reply; 10+ messages in thread
From: john @ 2015-03-01 19:36 UTC (permalink / raw)


On Sun, Mar 01, 2015 at 06:43:17PM +0000, Bertrand Jacquin wrote:
> On 28/02/2015 12:37, John Keeping wrote:
> > On Sat, Feb 28, 2015 at 12:06:41PM +0000, Bertrand Jacquin wrote:
> >> We are still experiencing the issue. Is there any fixes with newer
> >> releases ?
> > 
> > I have just tried to reproduce this with the latest version and have 
> > not
> > been able to do so, but I'm not aware of any changes that should have 
> > an
> > effect on this (there is one cache change, 6ceba45 Skip cache slot when
> > time-to-live is zero, but that only applies if you set one of the *-ttl
> > values to zero).
> > 
> > The cache timeout logic relies on the mtime of the cache file, so this
> > could be affected by your filesystem, but it sounds like the problem is
> > that the .lock files are not being cleaned up.
> 
> The filesystem here is a ext4 with no specific option except noatime 
> which quiet common.
> 
> > When CGit finds a .lock
> > file for a cache slot it is trying to use it will just serve the stale
> > content, on the assumption that is has only just been replaced.
> 
> So there is so assumption the .lock can be obsolete ?
> 
> > I can't see many ways that you can end up with stale lock files; the
> > only options are:
> > 
> > 1) CGit crashes, in which case there should be some evidence in the
> >    system log.
> 
> That might happend, the cgi can in this case be killed after 60 seconds.

I wonder if we should be doing something like this (which is probably 3
patches if cleaned up, but shows the idea):

-- >8 --
diff --git a/cache.c b/cache.c
index e0bb47a..e7649ad 100644
--- a/cache.c
+++ b/cache.c
@@ -195,6 +195,7 @@ static int unlock_slot(struct cache_slot *slot, int replace_old_slot)
 	else
 		err = unlink(slot->lock_name);
 
+	slot->lock_name = NULL;
 	if (err)
 		return errno;
 
@@ -343,6 +344,14 @@ static int process_slot(struct cache_slot *slot)
 	return err;
 }
 
+static struct cache_slot the_slot;
+
+void cache_cleanup_locks(void)
+{
+	if (the_slot.lock_name)
+		unlock_slot(&the_slot, 0);
+}
+
 /* Print cached content to stdout, generate the content if necessary. */
 int cache_process(int size, const char *path, const char *key, int ttl,
 		  cache_fill_fn fn)
@@ -351,7 +360,6 @@ int cache_process(int size, const char *path, const char *key, int ttl,
 	int i;
 	struct strbuf filename = STRBUF_INIT;
 	struct strbuf lockname = STRBUF_INIT;
-	struct cache_slot slot;
 	int result;
 
 	/* If the cache is disabled, just generate the content */
@@ -377,13 +385,13 @@ int cache_process(int size, const char *path, const char *key, int ttl,
 	}
 	strbuf_addbuf(&lockname, &filename);
 	strbuf_addstr(&lockname, ".lock");
-	slot.fn = fn;
-	slot.ttl = ttl;
-	slot.cache_name = filename.buf;
-	slot.lock_name = lockname.buf;
-	slot.key = key;
-	slot.keylen = strlen(key);
-	result = process_slot(&slot);
+	the_slot.fn = fn;
+	the_slot.ttl = ttl;
+	the_slot.cache_name = filename.buf;
+	the_slot.lock_name = lockname.buf;
+	the_slot.key = key;
+	the_slot.keylen = strlen(key);
+	result = process_slot(&the_slot);
 
 	strbuf_release(&filename);
 	strbuf_release(&lockname);
diff --git a/cache.h b/cache.h
index 9392836..f0d1c75 100644
--- a/cache.h
+++ b/cache.h
@@ -28,6 +28,9 @@ extern int cache_process(int size, const char *path, const char *key, int ttl,
 /* List info about all cache entries on stdout */
 extern int cache_ls(const char *path);
 
+/* Cleanup open cache lock files on abnormal exit */
+extern void cache_cleanup_locks(void);
+
 /* Print a message to stdout */
 __attribute__((format (printf,1,2)))
 extern void cache_log(const char *format, ...);
diff --git a/cgit.c b/cgit.c
index d9a8b1f..0912a3d 100644
--- a/cgit.c
+++ b/cgit.c
@@ -1031,6 +1031,26 @@ static int calc_ttl()
 	return ctx.cfg.cache_repo_ttl;
 }
 
+static void cleanup_handler(int signum)
+{
+	cache_cleanup_locks();
+}
+
+static void register_signal_handlers(void)
+{
+	struct sigaction sa = {
+		.sa_handler = cleanup_handler,
+		.sa_flags = SA_RESETHAND,
+	};
+	sigemptyset(&sa.sa_mask);
+
+	sigaction(SIGHUP, &sa, NULL);
+	sigaction(SIGINT, &sa, NULL);
+	sigaction(SIGQUIT, &sa, NULL);
+	sigaction(SIGPIPE, &sa, NULL);
+	sigaction(SIGTERM, &sa, NULL);
+}
+
 int main(int argc, const char **argv)
 {
 	const char *path;
@@ -1039,6 +1059,8 @@ int main(int argc, const char **argv)
 	cgit_init_filters();
 	atexit(cgit_cleanup_filters);
 
+	register_signal_handlers();
+
 	prepare_context();
 	cgit_repolist.length = 0;
 	cgit_repolist.count = 0;


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-03-01 19:36       ` john
@ 2015-03-03  9:31         ` john
  2015-03-03 15:40           ` Jason
  0 siblings, 1 reply; 10+ messages in thread
From: john @ 2015-03-03  9:31 UTC (permalink / raw)


On Sun, Mar 01, 2015 at 07:36:00PM +0000, John Keeping wrote:
> On Sun, Mar 01, 2015 at 06:43:17PM +0000, Bertrand Jacquin wrote:
> > On 28/02/2015 12:37, John Keeping wrote:
> > > On Sat, Feb 28, 2015 at 12:06:41PM +0000, Bertrand Jacquin wrote:
> > >> We are still experiencing the issue. Is there any fixes with newer
> > >> releases ?
> > > 
> > > I have just tried to reproduce this with the latest version and have 
> > > not
> > > been able to do so, but I'm not aware of any changes that should have 
> > > an
> > > effect on this (there is one cache change, 6ceba45 Skip cache slot when
> > > time-to-live is zero, but that only applies if you set one of the *-ttl
> > > values to zero).
> > > 
> > > The cache timeout logic relies on the mtime of the cache file, so this
> > > could be affected by your filesystem, but it sounds like the problem is
> > > that the .lock files are not being cleaned up.
> > 
> > The filesystem here is a ext4 with no specific option except noatime 
> > which quiet common.
> > 
> > > When CGit finds a .lock
> > > file for a cache slot it is trying to use it will just serve the stale
> > > content, on the assumption that is has only just been replaced.
> > 
> > So there is so assumption the .lock can be obsolete ?
> > 
> > > I can't see many ways that you can end up with stale lock files; the
> > > only options are:
> > > 
> > > 1) CGit crashes, in which case there should be some evidence in the
> > >    system log.
> > 
> > That might happend, the cgi can in this case be killed after 60 seconds.
> 
> I wonder if we should be doing something like this (which is probably 3
> patches if cleaned up, but shows the idea):

Having thought about this a bit more, maybe this is the safer way to do
it, since this will detect stale .lock files no matter how they're
caused.

-- >8 --
diff --git a/cache.c b/cache.c
index 801e63f..dbfa6a9 100644
--- a/cache.c
+++ b/cache.c
@@ -161,10 +161,24 @@ static int close_lock(struct cache_slot *slot)
  */
 static int lock_slot(struct cache_slot *slot)
 {
-	slot->lock_fd = open(slot->lock_name, O_RDWR | O_CREAT | O_EXCL,
+	struct flock lock = {
+		.l_type = F_WRLCK,
+		.l_whence = SEEK_SET,
+		.l_start = 0,
+		.l_len = 0,
+	};
+
+	slot->lock_fd = open(slot->lock_name, O_RDWR | O_CREAT,
 			     S_IRUSR | S_IWUSR);
 	if (slot->lock_fd == -1)
 		return errno;
+	if (fcntl(slot->lock_fd, F_SETLK, &lock) < 0) {
+		int saved_errno = errno;
+		fprintf(stderr, "failed to lock slot!");
+		close(slot->lock_fd);
+		slot->lock_fd = -1;
+		return saved_errno;
+	}
 	if (xwrite(slot->lock_fd, slot->key, slot->keylen + 1) < 0)
 		return errno;
 	return 0;


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-03-03  9:31         ` john
@ 2015-03-03 15:40           ` Jason
  2015-03-03 19:22             ` john
  0 siblings, 1 reply; 10+ messages in thread
From: Jason @ 2015-03-03 15:40 UTC (permalink / raw)


Feel free to send a patch, John.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.zx2c4.com/pipermail/cgit/attachments/20150303/954cad2f/attachment.html>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-03-03 15:40           ` Jason
@ 2015-03-03 19:22             ` john
  2015-03-03 22:56               ` Jason
  0 siblings, 1 reply; 10+ messages in thread
From: john @ 2015-03-03 19:22 UTC (permalink / raw)


From eb741581ec16d3249e7d207c3dbc4a433a8f329b Mon Sep 17 00:00:00 2001
Message-Id: <eb741581ec16d3249e7d207c3dbc4a433a8f329b.1425410489.git.john at keeping.me.uk>
From: John Keeping <john at keeping.me.uk>
Date: Tue, 3 Mar 2015 19:01:24 +0000
Subject: [PATCH] cache: use F_SETLK to avoid stale lock files

If CGit is killed while it holds a lock on a cache slot (for example
because it is taking too long to generate a page), the lock file will be
left in place.  This prevents any future attempt to use the same slot
since it will fail to exclusively create the lock file.

Since CGit is the only program that should be manipulating lock files,
we can use advisory locking to detect whether another process is
actually using the lock file or if it is now stale.

I have confirmed that this works on Linux by setting a short TTL in a
custom cgitrc and running the following with CGit patched to print a
message to stderr if the fcntl(2) fails:

	$ export CGIT_CONFIG=$PWD/cgitrc
	$ export QUERY_STRING=url=cgit/tree/ui-shared.c
	$ ./cgit |
		grep -v -e '^<div class=.footer.>' \
			-e '^Last-Modified: ' \
			-e ^'Expires: ' >expect
	$ seq 50000 | dd bs=8192 |
		parallel -j200 "diff -u expect <(./cgit |
			grep -v -e '^<div class=.footer.>' \
				-e '^Last-Modified: ' \
				-e ^'Expires: ') || echo BAD"

This printed the fail message several times without ever printing "BAD".

Signed-off-by: John Keeping <john at keeping.me.uk>
---
On Tue, Mar 03, 2015 at 04:40:01PM +0100, Jason A. Donenfeld wrote:
> Feel free to send a patch, John.

Here it is!

 cache.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/cache.c b/cache.c
index 801e63f..900b161 100644
--- a/cache.c
+++ b/cache.c
@@ -161,10 +161,23 @@ static int close_lock(struct cache_slot *slot)
  */
 static int lock_slot(struct cache_slot *slot)
 {
-	slot->lock_fd = open(slot->lock_name, O_RDWR | O_CREAT | O_EXCL,
+	struct flock lock = {
+		.l_type = F_WRLCK,
+		.l_whence = SEEK_SET,
+		.l_start = 0,
+		.l_len = 0,
+	};
+
+	slot->lock_fd = open(slot->lock_name, O_RDWR | O_CREAT,
 			     S_IRUSR | S_IWUSR);
 	if (slot->lock_fd == -1)
 		return errno;
+	if (fcntl(slot->lock_fd, F_SETLK, &lock) < 0) {
+		int saved_errno = errno;
+		close(slot->lock_fd);
+		slot->lock_fd = -1;
+		return saved_errno;
+	}
 	if (xwrite(slot->lock_fd, slot->key, slot->keylen + 1) < 0)
 		return errno;
 	return 0;
-- 
2.3.1.308.g754cd77



^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-03-03 19:22             ` john
@ 2015-03-03 22:56               ` Jason
  2015-03-03 23:43                 ` john
  0 siblings, 1 reply; 10+ messages in thread
From: Jason @ 2015-03-03 22:56 UTC (permalink / raw)


Thanks! Merged. It would be nice to see a test case built out of the
example you gave in the commit message.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.zx2c4.com/pipermail/cgit/attachments/20150303/b8b0d25f/attachment.html>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* cache issue
  2015-03-03 22:56               ` Jason
@ 2015-03-03 23:43                 ` john
  0 siblings, 0 replies; 10+ messages in thread
From: john @ 2015-03-03 23:43 UTC (permalink / raw)


On Tue, Mar 03, 2015 at 11:56:11PM +0100, Jason A. Donenfeld wrote:
> Thanks! Merged. It would be nice to see a test case built out of the
> example you gave in the commit message.

It has to run for a couple of minutes to get sensible results.  I guess
we could introduce an EXPENSIVE prerequisite in the same way Git does,
but I'm not sure how often anyone would run the test suite with that
turned on.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-03-03 23:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-23 14:06 cache issue beber
2015-02-28 12:06 ` bertrand
2015-02-28 12:37   ` john
2015-03-01 18:43     ` bertrand
2015-03-01 19:36       ` john
2015-03-03  9:31         ` john
2015-03-03 15:40           ` Jason
2015-03-03 19:22             ` john
2015-03-03 22:56               ` Jason
2015-03-03 23:43                 ` john

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).