9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] vtcache exhaustion
@ 2010-02-22 21:03 Anthony Sorace
  2010-02-22 22:20 ` erik quanstrom
  2010-02-23  0:15 ` Russ Cox
  0 siblings, 2 replies; 4+ messages in thread
From: Anthony Sorace @ 2010-02-22 21:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


I've been running a vac-based backup on a few unix systems for a while now.
A bit over a week ago, one of them started failing with this error:

	vac: vtcachelocal: asked for block #6289076; only 6288808 blocks

Adding -v to the vac invocation, it seems to be failing in the same place
each night. The last file listed is 12k, the next is 13k.

I'm running vac with -a to get the dump-like hierarchy. While the size of my
current backup set is not much larger than in the months of successful runs,
there were a few days right before the failure when part of a large video
archive was being accidentally included. The total backup size for those days
was about 10GB, but with low churn. The currently-failing backups have passed
the part of the filesystem where that data is stored and have correctly
omitted it.

The last paragraph of venti-cache(3) (on p9p, although venti-cache(2) reads
the same here) includes a few interesting statements. First:

	If a new cache block must be allocated... but the cache is filled...
	the library prints the score and reference count of every block in
	the cache and then aborts.

This isn't happening; I get only the above "asked for" line. I don't see
anywhere in the source where this *ought* to be happening, either. Is the
man page simply out of date? Was this pulled at some point? It seems like
it'd be helpful.

The same paragraph also notes:

	A full cache indicates either that the cache is too small, or, more
	commonly, that cache blocks are being leaked.

I've had a hard time tracking blocks through their local vs. global states.
Any tips for tracking down any potential leaks? To get limping along again,
ought it be safe to simply bump up the size of the cache in the vacfsopen
call?

Anthony Sorace
Strand 1



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [9fans] vtcache exhaustion
  2010-02-22 21:03 [9fans] vtcache exhaustion Anthony Sorace
@ 2010-02-22 22:20 ` erik quanstrom
  2010-02-23  0:15 ` Russ Cox
  1 sibling, 0 replies; 4+ messages in thread
From: erik quanstrom @ 2010-02-22 22:20 UTC (permalink / raw)
  To: 9fans

> I've been running a vac-based backup on a few unix systems for a while
> now.  A bit over a week ago, one of them started failing with this
> error:
>
> 	vac: vtcachelocal: asked for block #6289076; only 6288808
> 	blocks
>
[...]
>
> 	If a new cache block must be allocated...  but the cache is
> 	filled...  the library prints the score and reference count
> 	of every block in the cache and then aborts.

i mention this because it sounds so similar to a bug i found
in ken's fs.  i hope this is useful, but it probably isn't.

if you change an indirect block but not a direct block, it's
possible to miss dumping the direct block if that block doesn't
happen to be cached.  this is because the test was

from cw.c:/^isdirty

	if(tag >= Tind1 && tag <= Tmaxind)
		/* botch, get these modified */
		if(s != Cnone)
			return 1;

but i found that this is safer:

	/*
	 * botch: we should mark the parents of modified
	 * indirect blocks as split-on-dump.
	 */
	if(tag >= Tind1 && tag <= Tmaxind)
		return 2;

a better solution would be to do as the comment suggests.

i think that vac cache.c:/^lumpWalk makes the same sort
of determination about these lines (from the plan 9 version)

cache.c:623,626
	if(0)fprint(2, "lumpWalk: %V:%s %d:%d-> %V:%d\n", u->score,
		lumpState(u->state), u->type, offset, score, type);
	v = cacheGetLump(c, score, type, size);
	if(v == nil)
		return nil;


- erik




* Re: [9fans] vtcache exhaustion
  2010-02-22 21:03 [9fans] vtcache exhaustion Anthony Sorace
  2010-02-22 22:20 ` erik quanstrom
@ 2010-02-23  0:15 ` Russ Cox
  2010-02-23  0:54   ` Venkatesh Srinivas
  1 sibling, 1 reply; 4+ messages in thread
From: Russ Cox @ 2010-02-23  0:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I am not sure, but I have seen things like this since
I tried to expand the venti protocol to >16-bit blocks.
I have since decided that was a mistake, too intrusive
a change, but I have not yet gotten around to rolling
back to the old versions of the files.

That might make the bug go away, whatever it is.
Somehow it seems unlikely you have that many blocks
in your cache.

Russ



* Re: [9fans] vtcache exhaustion
  2010-02-23  0:15 ` Russ Cox
@ 2010-02-23  0:54   ` Venkatesh Srinivas
  0 siblings, 0 replies; 4+ messages in thread
From: Venkatesh Srinivas @ 2010-02-23  0:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, Feb 22, 2010 at 7:15 PM, Russ Cox <rsc@swtch.com> wrote:
> I am not sure, but I have seen things like this since
> I tried to expand the venti protocol to >16-bit blocks.
> I have since decided that was a mistake, too intrusive
> a change, but I have not yet gotten around to rolling
> back to the old versions of the files.
>
> That might make the bug go away, whatever it is.
> Somehow it seems unlikely you have that many blocks
> in your cache.

I doubt it. I saw this bug in 2/2009, before the v4 protocol was deployed...

-- vs



