cache-size implementation downsides

List for cgit developers and users
 help / color / mirror / Atom feed

* cache-size implementation downsides
@ 2018-06-13 19:02 konstantin
  2018-06-16 15:46 ` john
  0 siblings, 1 reply; 6+ messages in thread
From: konstantin @ 2018-06-13 19:02 UTC (permalink / raw)


Hi, all:

TL;DR: The way cache-size is implemented is pretty clever and efficient,
but has important downsides due to filename collisions.

Long:

The way caching is implemented in cgit, it uses the path of the URL
request (plus any query strings) as the cache key. That key is then
passed through a fast one-way hashing function to arrive to a
deterministic integer between 0 and the value of cache-size. That
integer is then used to generate the name of the file that will be
stored in the cache-root directory.

This is clever and fast, because this means cgit doesn't need to list
the contents of the cache-root directory to make sure there are no more
than cache-size number of files in it (directory listing is expensive).
However, this also means that if the value of cache-size is set
sufficiently low, there will inevitably be collisions. To make sure that
we don't serve the wrong cache, cgit stores the full key (path+query
string) inside the cache file at offset 0. When there is a cache file in
place matching the request, cgit makes sure that it is for the correct
key before serving it. If the keys differ, the old cache value is
thrown out and the new response is generated and cached.

Like I said, this is clever and fast because we get enforcement of the
maximum number of files basically for free. However, collisions have
important downsides on a busy site like git.kernel.org:

1. If two popular but expensive URLs have the same cache-file object,
they can negate a lot of cache savings. E.g. if a very expensive diff
has the same cache-file as another very expensive diff, and both are
alternately hit with sufficient frequency, it results in a lot of cache
misses and high load.

2. I have witnessed cache corruption due to collisions (which is
a bug in itself). One of our frontends was hit by a lot of agressive
crawling of snapshots that raised the load to 60+ (many, many gzip
processes). After we blackholed the bot, some of the cache objects for
non-snapshot URLs had trailing gzip junk in them, meaning that either
two instances were writing to the same file, or something else resulted
in cache corruption. This is probably a race condition somewhere in the
locking code.

3. If we raise the value of cache-size higher in hopes to avoid
collisions, we risk running out of disk space due to large snapshots
accumulating in cache-root, since nothing cleans them up.

#3 is our interim solution, and we use a cronjob [1] that runs
every 5 minutes to clean up old snapshots static URLs, attempting to
respect the cache-ttl values. However, I wonder if a more robust cache
backend implementation can help resolve some of these problems,
especially the high chance of collisions when cache-size is set
sufficiently low.

.. [1] https://gist.github.com/mricon/28dd36fa093f9cb0748f63756f0f0a8d

-K
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.zx2c4.com/pipermail/cgit/attachments/20180613/ce149cb0/attachment.asc>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* cache-size implementation downsides
  2018-06-13 19:02 cache-size implementation downsides konstantin
@ 2018-06-16 15:46 ` john
  2018-06-19 23:02   ` john
  2018-06-20  6:01   ` list
  0 siblings, 2 replies; 6+ messages in thread
From: john @ 2018-06-16 15:46 UTC (permalink / raw)


On Wed, Jun 13, 2018 at 03:02:42PM -0400, Konstantin Ryabitsev wrote:
> 2. I have witnessed cache corruption due to collisions (which is
> a bug in itself). One of our frontends was hit by a lot of agressive
> crawling of snapshots that raised the load to 60+ (many, many gzip
> processes). After we blackholed the bot, some of the cache objects for
> non-snapshot URLs had trailing gzip junk in them, meaning that either
> two instances were writing to the same file, or something else resulted
> in cache corruption. This is probably a race condition somewhere in the
> locking code.

I've had a look at this, and I think we might end up dropping our lock
too early thanks to this code (in fill_slot()):

	/* Restore stdout */
	if (dup2(tmp, STDOUT_FILENO) == -1) {

Before this line, STDOUT_FILENO refers to lock_fd which has a POSIX
advisory record lock on the entire file.  However, the documentation for
that says:

       *  If a process closes any file descriptor referring to a file, then all
          of the process's locks on that file are released, regardless of the
          file descriptor(s)  on  which  the  locks  were  obtained.  This is
          bad: it means that a process can lose its locks on a file such as
          /etc/passwd or  /etc/mtab  when for some reason a library function
          decides to open, read, and close the same file.

I haven't verified this, but I suspect that dup'ing the original stdout
over STDOUT_FILENO is equivalent to closing a file descriptor referring
to our lock file.  And thus the lock is released at this point, which is
before we rename the lock file over the cache file.

If that is correct, then there is a window during which a new process
can open the lock file to write new content and successfully acquire the
lock on that file even though it is still being used by another process.

-- >8 --
Subject: [PATCH] cache: close race window when unlocking slots

We use POSIX advisory record locks to control access to cache slots, but
these have an unhelpful behaviour in that they are released when any
file descriptor referencing the file is closed by this process.

Mostly this is okay, since we know we won't be opening the lock file
anywhere else, but there is one place that it does matter: when we
restore stdout we dup2() over a file descriptor referring to the file,
thus closing that descriptor.

Since we restore stdout before unlocking the slot, this creates a window
during which the slot content can be overwritten.  The fix is reasonably
straightforward: simply restore stdout after unlocking the slot, but the
diff is a bit bigger because this requires us to move the temporary
stdout FD into struct cache_slot.

Signed-off-by: John Keeping <john at keeping.me.uk>
---
 cache.c | 37 ++++++++++++++-----------------------
 1 file changed, 14 insertions(+), 23 deletions(-)

diff --git a/cache.c b/cache.c
index 0901e6e..2c70be7 100644
--- a/cache.c
+++ b/cache.c
@@ -29,6 +29,7 @@ struct cache_slot {
 	cache_fill_fn fn;
 	int cache_fd;
 	int lock_fd;
+	int stdout_fd;
 	const char *cache_name;
 	const char *lock_name;
 	int match;
@@ -197,6 +198,13 @@ static int unlock_slot(struct cache_slot *slot, int replace_old_slot)
 	else
 		err = unlink(slot->lock_name);
 
+	/* Restore stdout and close the temporary FD. */
+	if (slot->stdout_fd >= 0) {
+		dup2(slot->stdout_fd, STDOUT_FILENO);
+		close(slot->stdout_fd);
+		slot->stdout_fd = -1;
+	}
+
 	if (err)
 		return errno;
 
@@ -208,42 +216,24 @@ static int unlock_slot(struct cache_slot *slot, int replace_old_slot)
  */
 static int fill_slot(struct cache_slot *slot)
 {
-	int tmp;
-
 	/* Preserve stdout */
-	tmp = dup(STDOUT_FILENO);
-	if (tmp == -1)
+	slot->stdout_fd = dup(STDOUT_FILENO);
+	if (slot->stdout_fd == -1)
 		return errno;
 
 	/* Redirect stdout to lockfile */
-	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1) {
-		close(tmp);
+	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1)
 		return errno;
-	}
 
 	/* Generate cache content */
 	slot->fn();
 
 	/* Make sure any buffered data is flushed to the file */
-	if (fflush(stdout)) {
-		close(tmp);
+	if (fflush(stdout))
 		return errno;
-	}
 
 	/* update stat info */
-	if (fstat(slot->lock_fd, &slot->cache_st)) {
-		close(tmp);
-		return errno;
-	}
-
-	/* Restore stdout */
-	if (dup2(tmp, STDOUT_FILENO) == -1) {
-		close(tmp);
-		return errno;
-	}
-
-	/* Close the temporary filedescriptor */
-	if (close(tmp))
+	if (fstat(slot->lock_fd, &slot->cache_st))
 		return errno;
 
 	return 0;
@@ -393,6 +383,7 @@ int cache_process(int size, const char *path, const char *key, int ttl,
 	strbuf_addstr(&lockname, ".lock");
 	slot.fn = fn;
 	slot.ttl = ttl;
+	slot.stdout_fd = -1;
 	slot.cache_name = filename.buf;
 	slot.lock_name = lockname.buf;
 	slot.key = key;
-- 
2.17.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* cache-size implementation downsides
  2018-06-16 15:46 ` john
@ 2018-06-19 23:02   ` john
  2018-06-20  6:01   ` list
  1 sibling, 0 replies; 6+ messages in thread
From: john @ 2018-06-19 23:02 UTC (permalink / raw)


On Sat, Jun 16, 2018 at 04:46:21PM +0100, John Keeping wrote:
> On Wed, Jun 13, 2018 at 03:02:42PM -0400, Konstantin Ryabitsev wrote:
> > 2. I have witnessed cache corruption due to collisions (which is
> > a bug in itself). One of our frontends was hit by a lot of agressive
> > crawling of snapshots that raised the load to 60+ (many, many gzip
> > processes). After we blackholed the bot, some of the cache objects for
> > non-snapshot URLs had trailing gzip junk in them, meaning that either
> > two instances were writing to the same file, or something else resulted
> > in cache corruption. This is probably a race condition somewhere in the
> > locking code.
> 
> I've had a look at this, and I think we might end up dropping our lock
> too early thanks to this code (in fill_slot()):
> 
> 	/* Restore stdout */
> 	if (dup2(tmp, STDOUT_FILENO) == -1) {
> 
> Before this line, STDOUT_FILENO refers to lock_fd which has a POSIX
> advisory record lock on the entire file.  However, the documentation for
> that says:
> 
>        *  If a process closes any file descriptor referring to a file, then all
>           of the process's locks on that file are released, regardless of the
>           file descriptor(s)  on  which  the  locks  were  obtained.  This is
>           bad: it means that a process can lose its locks on a file such as
>           /etc/passwd or  /etc/mtab  when for some reason a library function
>           decides to open, read, and close the same file.
> 
> I haven't verified this, but I suspect that dup'ing the original stdout
> over STDOUT_FILENO is equivalent to closing a file descriptor referring
> to our lock file.  And thus the lock is released at this point, which is
> before we rename the lock file over the cache file.
> 
> If that is correct, then there is a window during which a new process
> can open the lock file to write new content and successfully acquire the
> lock on that file even though it is still being used by another process.

I confirmed this behaviour with trace-cmd.

Before:
            cgit-7291  : posix_lock_inode:     fl=0x0xffff88020faa6258 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x(nil) fl_owner=0x0xffff88007a8f2680 fl_pid=7291 fl_flags=FL_POSIX fl_type=F_WRLCK fl_start=0 fl_end=9223372036854775807 ret=0
            cgit-7291  : fcntl_setlk:          fl=0x0xffff88020faa6258 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x(nil) fl_owner=0x0xffff88007a8f2680 fl_pid=7291 fl_flags=FL_POSIX fl_type=F_WRLCK fl_start=0 fl_end=9223372036854775807 ret=0
            cgit-7291  : locks_get_lock_context: dev=0x8:0x14 ino=0x4e3e13 type=F_UNLCK ctx=0xffff8801beeb7930
            cgit-7291  : posix_lock_inode:     fl=0x0xffffc90003627da8 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x(nil) fl_owner=0x0xffff88007a8f2680 fl_pid=7291 fl_flags=FL_POSIX|FL_CLOSE fl_type=F_UNLCK fl_start=0 fl_end=9223372036854775807 ret=0
            cgit-7291  : locks_remove_posix:   fl=0x0xffffc90003627da8 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x(nil) fl_owner=0x0xffff88007a8f2680 fl_pid=7291 fl_flags=FL_POSIX|FL_CLOSE fl_type=F_UNLCK fl_start=0 fl_end=9223372036854775807 ret=0
            cgit-7291  : sys_enter_rename:     oldname: 0x559bd0bd5830, newname: 0x559bd0bd57d0
            cgit-7291  : sys_exit_rename:      0x0

After:
            cgit-7488  : posix_lock_inode:     fl=0x0xffff8802122c43e8 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x(nil) fl_owner=0x0xffff8800310958c0 fl_pid=7488 fl_flags=FL_POSIX fl_type=F_WRLCK fl_start=0 fl_end=9223372036854775807 ret=0
            cgit-7488  : fcntl_setlk:          fl=0x0xffff8802122c43e8 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x(nil) fl_owner=0x0xffff8800310958c0 fl_pid=7488 fl_flags=FL_POSIX fl_type=F_WRLCK fl_start=0 fl_end=9223372036854775807 ret=0
            cgit-7488  : sys_enter_rename:     oldname: 0x56512cd7f830, newname: 0x56512cd7f7d0
            cgit-7488  : sys_exit_rename:      0x0
            cgit-7488  : locks_get_lock_context: dev=0x8:0x14 ino=0x4e3e13 type=F_UNLCK ctx=0xffff8802144daa10
            cgit-7488  : posix_lock_inode:     fl=0x0xffffc900038d7da8 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x0xffff880006f5b780 fl_owner=0x0xffff8800310958c0 fl_pid=7488 fl_flags=FL_POSIX|FL_CLOSE fl_type=F_UNLCK fl_start=0 fl_end=9223372036854775807 ret=0
            cgit-7488  : locks_remove_posix:   fl=0x0xffffc900038d7da8 dev=0x8:0x14 ino=0x4e3e13 fl_next=0x0xffff880006f5b780 fl_owner=0x0xffff8800310958c0 fl_pid=7488 fl_flags=FL_POSIX|FL_CLOSE fl_type=F_UNLCK fl_start=0 fl_end=9223372036854775807 ret=0


I'm planning to queue the patch below on jk/for-jason and send a PR in
the next day or two, but it would be nice to get a reviewed-by before I
do that.

> -- >8 --
> Subject: [PATCH] cache: close race window when unlocking slots
> 
> We use POSIX advisory record locks to control access to cache slots, but
> these have an unhelpful behaviour in that they are released when any
> file descriptor referencing the file is closed by this process.
> 
> Mostly this is okay, since we know we won't be opening the lock file
> anywhere else, but there is one place that it does matter: when we
> restore stdout we dup2() over a file descriptor referring to the file,
> thus closing that descriptor.
> 
> Since we restore stdout before unlocking the slot, this creates a window
> during which the slot content can be overwritten.  The fix is reasonably
> straightforward: simply restore stdout after unlocking the slot, but the
> diff is a bit bigger because this requires us to move the temporary
> stdout FD into struct cache_slot.
> 
> Signed-off-by: John Keeping <john at keeping.me.uk>
> ---
>  cache.c | 37 ++++++++++++++-----------------------
>  1 file changed, 14 insertions(+), 23 deletions(-)
> 
> diff --git a/cache.c b/cache.c
> index 0901e6e..2c70be7 100644
> --- a/cache.c
> +++ b/cache.c
> @@ -29,6 +29,7 @@ struct cache_slot {
>  	cache_fill_fn fn;
>  	int cache_fd;
>  	int lock_fd;
> +	int stdout_fd;
>  	const char *cache_name;
>  	const char *lock_name;
>  	int match;
> @@ -197,6 +198,13 @@ static int unlock_slot(struct cache_slot *slot, int replace_old_slot)
>  	else
>  		err = unlink(slot->lock_name);
>  
> +	/* Restore stdout and close the temporary FD. */
> +	if (slot->stdout_fd >= 0) {
> +		dup2(slot->stdout_fd, STDOUT_FILENO);
> +		close(slot->stdout_fd);
> +		slot->stdout_fd = -1;
> +	}
> +
>  	if (err)
>  		return errno;
>  
> @@ -208,42 +216,24 @@ static int unlock_slot(struct cache_slot *slot, int replace_old_slot)
>   */
>  static int fill_slot(struct cache_slot *slot)
>  {
> -	int tmp;
> -
>  	/* Preserve stdout */
> -	tmp = dup(STDOUT_FILENO);
> -	if (tmp == -1)
> +	slot->stdout_fd = dup(STDOUT_FILENO);
> +	if (slot->stdout_fd == -1)
>  		return errno;
>  
>  	/* Redirect stdout to lockfile */
> -	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1) {
> -		close(tmp);
> +	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1)
>  		return errno;
> -	}
>  
>  	/* Generate cache content */
>  	slot->fn();
>  
>  	/* Make sure any buffered data is flushed to the file */
> -	if (fflush(stdout)) {
> -		close(tmp);
> +	if (fflush(stdout))
>  		return errno;
> -	}
>  
>  	/* update stat info */
> -	if (fstat(slot->lock_fd, &slot->cache_st)) {
> -		close(tmp);
> -		return errno;
> -	}
> -
> -	/* Restore stdout */
> -	if (dup2(tmp, STDOUT_FILENO) == -1) {
> -		close(tmp);
> -		return errno;
> -	}
> -
> -	/* Close the temporary filedescriptor */
> -	if (close(tmp))
> +	if (fstat(slot->lock_fd, &slot->cache_st))
>  		return errno;
>  
>  	return 0;
> @@ -393,6 +383,7 @@ int cache_process(int size, const char *path, const char *key, int ttl,
>  	strbuf_addstr(&lockname, ".lock");
>  	slot.fn = fn;
>  	slot.ttl = ttl;
> +	slot.stdout_fd = -1;
>  	slot.cache_name = filename.buf;
>  	slot.lock_name = lockname.buf;
>  	slot.key = key;
> -- 
> 2.17.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* cache-size implementation downsides
  2018-06-16 15:46 ` john
  2018-06-19 23:02   ` john
@ 2018-06-20  6:01   ` list
  2018-06-20  7:44     ` john
  1 sibling, 1 reply; 6+ messages in thread
From: list @ 2018-06-20  6:01 UTC (permalink / raw)


John Keeping <john at keeping.me.uk> on Sat, 2018/06/16 16:46:
> -- >8 --  
> Subject: [PATCH] cache: close race window when unlocking slots

You should add a "From:" line for easy git-am. ;)

> We use POSIX advisory record locks to control access to cache slots, but
> these have an unhelpful behaviour in that they are released when any
> file descriptor referencing the file is closed by this process.
> 
> Mostly this is okay, since we know we won't be opening the lock file
> anywhere else, but there is one place that it does matter: when we
> restore stdout we dup2() over a file descriptor referring to the file,
> thus closing that descriptor.
> 
> Since we restore stdout before unlocking the slot, this creates a window
> during which the slot content can be overwritten.  The fix is reasonably
> straightforward: simply restore stdout after unlocking the slot, but the
> diff is a bit bigger because this requires us to move the temporary
> stdout FD into struct cache_slot.
> 
> Signed-off-by: John Keeping <john at keeping.me.uk>

Reviewed-by: Christian Hesse <mail at eworm.de>

> ---
>  cache.c | 37 ++++++++++++++-----------------------
>  1 file changed, 14 insertions(+), 23 deletions(-)
> 
> diff --git a/cache.c b/cache.c
> index 0901e6e..2c70be7 100644
> --- a/cache.c
> +++ b/cache.c
> @@ -29,6 +29,7 @@ struct cache_slot {
>  	cache_fill_fn fn;
>  	int cache_fd;
>  	int lock_fd;
> +	int stdout_fd;
>  	const char *cache_name;
>  	const char *lock_name;
>  	int match;

Not relevant for this commit, but... Is there a reason that struct cache_slot
lives in cache.c, not cache.h?

> @@ -197,6 +198,13 @@ static int unlock_slot(struct cache_slot *slot, int
> replace_old_slot) else
>  		err = unlink(slot->lock_name);
>  
> +	/* Restore stdout and close the temporary FD. */
> +	if (slot->stdout_fd >= 0) {
> +		dup2(slot->stdout_fd, STDOUT_FILENO);
> +		close(slot->stdout_fd);
> +		slot->stdout_fd = -1;
> +	}
> +
>  	if (err)
>  		return errno;
>  
> @@ -208,42 +216,24 @@ static int unlock_slot(struct cache_slot *slot, int
> replace_old_slot) */
>  static int fill_slot(struct cache_slot *slot)
>  {
> -	int tmp;
> -
>  	/* Preserve stdout */
> -	tmp = dup(STDOUT_FILENO);
> -	if (tmp == -1)
> +	slot->stdout_fd = dup(STDOUT_FILENO);
> +	if (slot->stdout_fd == -1)
>  		return errno;
>  
>  	/* Redirect stdout to lockfile */
> -	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1) {
> -		close(tmp);
> +	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1)
>  		return errno;
> -	}
>  
>  	/* Generate cache content */
>  	slot->fn();
>  
>  	/* Make sure any buffered data is flushed to the file */
> -	if (fflush(stdout)) {
> -		close(tmp);
> +	if (fflush(stdout))
>  		return errno;
> -	}
>  
>  	/* update stat info */
> -	if (fstat(slot->lock_fd, &slot->cache_st)) {
> -		close(tmp);
> -		return errno;
> -	}
> -
> -	/* Restore stdout */
> -	if (dup2(tmp, STDOUT_FILENO) == -1) {
> -		close(tmp);
> -		return errno;
> -	}
> -
> -	/* Close the temporary filedescriptor */
> -	if (close(tmp))
> +	if (fstat(slot->lock_fd, &slot->cache_st))
>  		return errno;
>  
>  	return 0;
> @@ -393,6 +383,7 @@ int cache_process(int size, const char *path, const
> char *key, int ttl, strbuf_addstr(&lockname, ".lock");
>  	slot.fn = fn;
>  	slot.ttl = ttl;
> +	slot.stdout_fd = -1;
>  	slot.cache_name = filename.buf;
>  	slot.lock_name = lockname.buf;
>  	slot.key = key;


-- 
main(a){char*c=/*    Schoene Gruesse                         */"B?IJj;MEH"
"CX:;",b;for(a/*    Best regards             my address:    */=0;b=c[a++];)
putchar(b-1/(/*    Chris            cc -ox -xc - && ./x    */b/42*2-3)*42);}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://lists.zx2c4.com/pipermail/cgit/attachments/20180620/d1537a98/attachment.asc>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* cache-size implementation downsides
  2018-06-20  6:01   ` list
@ 2018-06-20  7:44     ` john
  2018-06-20  7:55       ` list
  0 siblings, 1 reply; 6+ messages in thread
From: john @ 2018-06-20  7:44 UTC (permalink / raw)


On Wed, Jun 20, 2018 at 08:01:11AM +0200, Christian Hesse wrote:
> John Keeping <john at keeping.me.uk> on Sat, 2018/06/16 16:46:
> > -- >8 --  
> > Subject: [PATCH] cache: close race window when unlocking slots
> 
> You should add a "From:" line for easy git-am. ;)

"git am --scissors" will pick it up from the message headers, won't it?

> > We use POSIX advisory record locks to control access to cache slots, but
> > these have an unhelpful behaviour in that they are released when any
> > file descriptor referencing the file is closed by this process.
> > 
> > Mostly this is okay, since we know we won't be opening the lock file
> > anywhere else, but there is one place that it does matter: when we
> > restore stdout we dup2() over a file descriptor referring to the file,
> > thus closing that descriptor.
> > 
> > Since we restore stdout before unlocking the slot, this creates a window
> > during which the slot content can be overwritten.  The fix is reasonably
> > straightforward: simply restore stdout after unlocking the slot, but the
> > diff is a bit bigger because this requires us to move the temporary
> > stdout FD into struct cache_slot.
> > 
> > Signed-off-by: John Keeping <john at keeping.me.uk>
> 
> Reviewed-by: Christian Hesse <mail at eworm.de>

Thanks for the review!

> > ---
> >  cache.c | 37 ++++++++++++++-----------------------
> >  1 file changed, 14 insertions(+), 23 deletions(-)
> > 
> > diff --git a/cache.c b/cache.c
> > index 0901e6e..2c70be7 100644
> > --- a/cache.c
> > +++ b/cache.c
> > @@ -29,6 +29,7 @@ struct cache_slot {
> >  	cache_fill_fn fn;
> >  	int cache_fd;
> >  	int lock_fd;
> > +	int stdout_fd;
> >  	const char *cache_name;
> >  	const char *lock_name;
> >  	int match;
> 
> Not relevant for this commit, but... Is there a reason that struct cache_slot
> lives in cache.c, not cache.h?

I guess because it's only needed in this file, so we can encapsulate it
here and avoid leaking the implementation details to other parts of the
code.

But this predates my involvement with CGit by a few years.

> > @@ -197,6 +198,13 @@ static int unlock_slot(struct cache_slot *slot, int
> > replace_old_slot) else
> >  		err = unlink(slot->lock_name);
> >  
> > +	/* Restore stdout and close the temporary FD. */
> > +	if (slot->stdout_fd >= 0) {
> > +		dup2(slot->stdout_fd, STDOUT_FILENO);
> > +		close(slot->stdout_fd);
> > +		slot->stdout_fd = -1;
> > +	}
> > +
> >  	if (err)
> >  		return errno;
> >  
> > @@ -208,42 +216,24 @@ static int unlock_slot(struct cache_slot *slot, int
> > replace_old_slot) */
> >  static int fill_slot(struct cache_slot *slot)
> >  {
> > -	int tmp;
> > -
> >  	/* Preserve stdout */
> > -	tmp = dup(STDOUT_FILENO);
> > -	if (tmp == -1)
> > +	slot->stdout_fd = dup(STDOUT_FILENO);
> > +	if (slot->stdout_fd == -1)
> >  		return errno;
> >  
> >  	/* Redirect stdout to lockfile */
> > -	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1) {
> > -		close(tmp);
> > +	if (dup2(slot->lock_fd, STDOUT_FILENO) == -1)
> >  		return errno;
> > -	}
> >  
> >  	/* Generate cache content */
> >  	slot->fn();
> >  
> >  	/* Make sure any buffered data is flushed to the file */
> > -	if (fflush(stdout)) {
> > -		close(tmp);
> > +	if (fflush(stdout))
> >  		return errno;
> > -	}
> >  
> >  	/* update stat info */
> > -	if (fstat(slot->lock_fd, &slot->cache_st)) {
> > -		close(tmp);
> > -		return errno;
> > -	}
> > -
> > -	/* Restore stdout */
> > -	if (dup2(tmp, STDOUT_FILENO) == -1) {
> > -		close(tmp);
> > -		return errno;
> > -	}
> > -
> > -	/* Close the temporary filedescriptor */
> > -	if (close(tmp))
> > +	if (fstat(slot->lock_fd, &slot->cache_st))
> >  		return errno;
> >  
> >  	return 0;
> > @@ -393,6 +383,7 @@ int cache_process(int size, const char *path, const
> > char *key, int ttl, strbuf_addstr(&lockname, ".lock");
> >  	slot.fn = fn;
> >  	slot.ttl = ttl;
> > +	slot.stdout_fd = -1;
> >  	slot.cache_name = filename.buf;
> >  	slot.lock_name = lockname.buf;
> >  	slot.key = key;


^ permalink raw reply	[flat|nested] 6+ messages in thread

* cache-size implementation downsides
  2018-06-20  7:44     ` john
@ 2018-06-20  7:55       ` list
  0 siblings, 0 replies; 6+ messages in thread
From: list @ 2018-06-20  7:55 UTC (permalink / raw)


John Keeping <john at keeping.me.uk> on Wed, 2018/06/20 08:44:
> On Wed, Jun 20, 2018 at 08:01:11AM +0200, Christian Hesse wrote:
> > John Keeping <john at keeping.me.uk> on Sat, 2018/06/16 16:46:  
> > > -- >8 --    
> > > Subject: [PATCH] cache: close race window when unlocking slots  
> > 
> > You should add a "From:" line for easy git-am. ;)  
> 
> "git am --scissors" will pick it up from the message headers, won't it?

Ah, nice... Learning about new features all the time. :-p
Thanks for the hint!
-- 
main(a){char*c=/*    Schoene Gruesse                         */"B?IJj;MEH"
"CX:;",b;for(a/*    Best regards             my address:    */=0;b=c[a++];)
putchar(b-1/(/*    Chris            cc -ox -xc - && ./x    */b/42*2-3)*42);}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://lists.zx2c4.com/pipermail/cgit/attachments/20180620/4f1c504e/attachment.asc>


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-06-20  7:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-13 19:02 cache-size implementation downsides konstantin
2018-06-16 15:46 ` john
2018-06-19 23:02   ` john
2018-06-20  6:01   ` list
2018-06-20  7:44     ` john
2018-06-20  7:55       ` list

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).