9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] could not write super block; waiting 10 seconds
@ 2012-03-26 10:18 Richard Miller
  2012-03-26 12:04 ` Russ Cox
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Miller @ 2012-03-26 10:18 UTC (permalink / raw)
  To: 9fans

Has anyone else been unsettled by the occasional messages from
fossil saying (1) "could not write super block; waiting 10 seconds"
and (2) "blistAlloc: called on clean block"?

Patch fossil-superblock-write gets rid of them.

(1) When taking a snapshot, blockWrite in cache.c is called to write
an updated super block S, which has a pointer to the root block R
for the new epoch.  To maintain consistency on the disk, R must be
written before S, so blockWrite checks whether R is still in the
cache and marked dirty.  Very rarely, blockWrite finds R locked (eg
because the flush thread is just now writing it), so it gives up and
returns zero.  The zero return is OK when blockWrite is called by
the flush thread, because the flush thread can get on with writing
out other blocks before coming back to try the failed block again.
But when blockWrite is called by superWrite, there's nothing else to
do; hence the 10 second sleep and warning message.  The solution is
to add a waitlock parameter to blockWrite, so superWrite can tell it
to wait for a locked dependent block.
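
The waitlock idea in (1) can be sketched in a few lines of C.  This is
only an illustration, not the code from the patch: the Block struct,
block_trylock, block_lock, block_unlock and block_write are hypothetical
names, the lock is a C11 atomic_flag spinlock standing in for the real
block lock, and "writing" a block just clears its dirty flag.  The
message does not say what happens when R is unlocked but still dirty;
the sketch assumes R is simply written first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for a fossil cache block. */
typedef struct Block Block;
struct Block {
	atomic_flag lock;	/* stand-in for the real block lock */
	int dirty;		/* 1 while the block still has to reach the disk */
	Block *dep;		/* block that must be written first, or NULL */
};

/* Try once to take the lock: 1 on success, 0 if someone else holds it. */
static int
block_trylock(Block *b)
{
	return !atomic_flag_test_and_set(&b->lock);
}

/* Spin until the lock is free (a real cache would sleep, not spin). */
static void
block_lock(Block *b)
{
	while(atomic_flag_test_and_set(&b->lock))
		;
}

static void
block_unlock(Block *b)
{
	atomic_flag_clear(&b->lock);
}

/*
 * Write b, making sure its dependency goes out first.
 * Returns 1 if b was written, 0 if the dependency was locked and
 * waitlock was not set, in which case the caller must retry later.
 */
static int
block_write(Block *b, int waitlock)
{
	Block *d;

	assert(b != NULL);
	d = b->dep;
	if(d != NULL){
		if(waitlock)
			block_lock(d);		/* superWrite-style caller: wait */
		else if(!block_trylock(d))
			return 0;		/* flush-style caller: give up for now */
		if(d->dirty)
			d->dirty = 0;		/* assumption: write the dependency first */
		block_unlock(d);
	}
	b->dirty = 0;				/* pretend to write b itself */
	return 1;
}
```

With waitlock 0 a caller that has other work to do gets 0 back and can
move on; with waitlock 1 a caller that has nothing else to do blocks
until R is free, so no sleep-and-retry loop is needed.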

(2) After the new super block S is sent to the disk write queue,
superWrite removes the previous epoch's root block R' from the
active file system.  This is normally done by attaching a BList
entry to S in the cache, noting that R' must be marked closed after
S actually goes to the disk.  Rarely, S has already been written by
the time blistAlloc is called.  In this case the correct thing was
being done (just close R' immediately), but a spurious warning was
produced.
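
The two cases in (2) can be modelled the same way.  Again this is a
sketch under stated assumptions rather than the fossil source: Super,
Root, remove_old_root and super_written are hypothetical names, and a
single pending pointer stands in for the BList entry attached to S.

```c
#include <assert.h>
#include <stddef.h>

enum { BlockClean, BlockDirty };	/* has the cached super block reached the disk? */
enum { RootActive, RootClosed };	/* state of a previous epoch's root block */

typedef struct Root Root;
struct Root {
	int state;
};

typedef struct Super Super;
struct Super {
	int iostate;
	Root *pending;	/* root to close once S is on disk; stand-in for a BList entry */
};

/* Called after S has been handed to the disk write queue. */
static void
remove_old_root(Super *s, Root *old)
{
	assert(s != NULL && old != NULL);
	if(s->iostate == BlockClean){
		/* S is already on disk: close R' now, and say nothing. */
		old->state = RootClosed;
		return;
	}
	/* Usual case: remember to close R' when the write of S completes. */
	s->pending = old;
}

/* Called when the disk write of S completes. */
static void
super_written(Super *s)
{
	assert(s != NULL);
	s->iostate = BlockClean;
	if(s->pending != NULL){
		s->pending->state = RootClosed;
		s->pending = NULL;
	}
}
```

Either path ends with R' closed only after S is on disk; the clean-block
path just gets there without needing a deferred entry.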





* Re: [9fans] could not write super block; waiting 10 seconds
  2012-03-26 10:18 [9fans] could not write super block; waiting 10 seconds Richard Miller
@ 2012-03-26 12:04 ` Russ Cox
  0 siblings, 0 replies; 4+ messages in thread
From: Russ Cox @ 2012-03-26 12:04 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, Mar 26, 2012 at 6:18 AM, Richard Miller <9fans@hamnavoe.com> wrote:
> (1) When taking a snapshot, blockWrite in cache.c is called to write
> an updated super block S, which has a pointer to the root block R
> for the new epoch.  To maintain consistency on the disk, R must be
> written before S, so blockWrite checks whether R is still in the
> cache and marked dirty.  Very rarely, blockWrite finds R locked (eg
> because the flush thread is just now writing it), so it gives up and
> returns zero.  The zero return is OK when blockWrite is called by
> the flush thread, because the flush thread can get on with writing
> out other blocks before coming back to try the failed block again.
> But when blockWrite is called by superWrite, there's nothing else to
> do; hence the 10 second sleep and warning message.  The solution is
> to add a waitlock parameter to blockWrite, so superWrite can tell it
> to wait for a locked dependent block.
>
> (2) After the new super block S is sent to the disk write queue,
> superWrite removes the previous epoch's root block R' from the
> active file system.  This is normally done by attaching a BList
> entry to S in the cache, noting that R' must be marked closed after
> S actually goes to the disk.  Rarely, S has already been written by
> the time blistAlloc is called.  In this case the correct thing was
> being done (just close R' immediately), but a spurious warning was
> produced.

Thank you for cleaning these up.  These are both things that
I meant to come back to some day, but I never did.

Russ



* Re: [9fans] could not write super block; waiting 10 seconds
  2005-01-17 15:25 Steve Simon
@ 2005-01-17 18:34 ` Christopher Nielsen
  0 siblings, 0 replies; 4+ messages in thread
From: Christopher Nielsen @ 2005-01-17 18:34 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, Jan 17, 2005 at 03:25:54PM +0000, Steve Simon wrote:
> 
> I am getting a message from fossil:
> 
> 	could not write super block; waiting 10 seconds
> 			[system hangs for 10 secs]
> 	blistAlloc: called on clean block
> 
> a few times a day recently, both at home and at work
> (very different hardware - though both have SCSI disks).
> 
> I guess these are just warnings and I can happily ignore them?

FWIW, I've been seeing this too on ATA disks.

-- 
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin



* [9fans] could not write super block; waiting 10 seconds
@ 2005-01-17 15:25 Steve Simon
  2005-01-17 18:34 ` Christopher Nielsen
  0 siblings, 1 reply; 4+ messages in thread
From: Steve Simon @ 2005-01-17 15:25 UTC (permalink / raw)
  To: 9fans

Hi,

I am getting a message from fossil:

	could not write super block; waiting 10 seconds
			[system hangs for 10 secs]
	blistAlloc: called on clean block

a few times a day recently, both at home and at work
(very different hardware - though both have SCSI disks).

I guess these are just warnings and I can happily ignore them?

-Steve


