9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] kfs locks
@ 2004-03-07 17:55 David Tolpin
  2004-03-07 19:56 ` Charles Forsyth
  0 siblings, 1 reply; 7+ messages in thread
From: David Tolpin @ 2004-03-07 17:55 UTC (permalink / raw)
  To: 9fans

Hi,

cmd_sync sets 

rlock(&mainlock); 

...
syncall();
...
runlock(&mainlock);

...
syncproc, the background process forked by kfs, runs 
syncblock()
without setting rlock(&mainlock)

Can this be the cause of the wrenwrite errors I am getting after
'check f' on startup, if cmd_sync is not called after cmd_check
and before forking 'sync' and 'serve'?

I am still struggling to understand the problem. I've changed
the disk to be sure and can still reproduce the error if cmd_sync
is not called after the list of free blocks is rebuilt after a blackout.

David Tolpin



* Re: [9fans] kfs locks
  2004-03-07 17:55 [9fans] kfs locks David Tolpin
@ 2004-03-07 19:56 ` Charles Forsyth
  2004-03-07 20:03   ` David Tolpin
  0 siblings, 1 reply; 7+ messages in thread
From: Charles Forsyth @ 2004-03-07 19:56 UTC (permalink / raw)
  To: 9fans

the intent is that the sync command ensures that all current IO has gone
to disc without allowing file system activity
meanwhile--that's what the rlock of mainlock does--
but syncproc is allowed to run
concurrently with file system activity
to encourage the disc to be reasonably up to date
but without ensuring it.
there are interlocks on the buffers themselves (for instance)
to prevent confusion between several processes active
at once, whether for file system activity or syncproc.
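
The per-buffer interlock described above can be illustrated with a minimal sketch. It is not kfs code: the Buf structure, bufinit, bufmodify, bufsync and fakewrite are hypothetical names invented for illustration, and a POSIX mutex stands in for whatever lock kfs actually uses on its buffers. It shows only the idea that a background sync and ordinary file system activity each take the same buffer's lock before touching its contents or its dirty flag.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

enum { BUFSIZE = 512 };

/* Hypothetical cache entry; kfs's real buffer structure differs. */
typedef struct Buf Buf;
struct Buf {
	pthread_mutex_t lock;	/* per-buffer interlock */
	long addr;		/* block address on disc */
	int dirty;		/* contents not yet written to disc */
	char data[BUFSIZE];
};

static int nwrites;	/* number of simulated disc writes */

/* Stand-in for the real disc write (an assumption); always succeeds. */
static int
fakewrite(long addr, const char *data)
{
	(void)addr;
	(void)data;
	nwrites++;
	return 0;
}

static void
bufinit(Buf *b, long addr)
{
	pthread_mutex_init(&b->lock, NULL);
	b->addr = addr;
	b->dirty = 0;
	memset(b->data, 0, sizeof b->data);
}

/* File system activity: change the contents and mark the buffer dirty. */
static void
bufmodify(Buf *b, const char *src, size_t n)
{
	if(n > sizeof b->data)
		n = sizeof b->data;
	pthread_mutex_lock(&b->lock);
	memcpy(b->data, src, n);
	b->dirty = 1;
	pthread_mutex_unlock(&b->lock);
}

/*
 * Background sync: write the buffer out if it is dirty.
 * Returns 1 if written, 0 if it was clean, -1 if the write failed.
 */
static int
bufsync(Buf *b, int (*wr)(long, const char*))
{
	int r = 0;

	pthread_mutex_lock(&b->lock);
	if(b->dirty){
		if(wr(b->addr, b->data) < 0)
			r = -1;
		else{
			b->dirty = 0;
			r = 1;
		}
	}
	pthread_mutex_unlock(&b->lock);
	return r;
}
```

A lock of this kind keeps one buffer consistent, but it does not order updates across several buffers. That ordering is what holding mainlock around a full sync provides.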


* Re: [9fans] kfs locks
  2004-03-07 19:56 ` Charles Forsyth
@ 2004-03-07 20:03   ` David Tolpin
  2004-03-07 20:12     ` Charles Forsyth
  0 siblings, 1 reply; 7+ messages in thread
From: David Tolpin @ 2004-03-07 20:03 UTC (permalink / raw)
  To: 9fans

> there are interlocks on the buffers themselves (for instance)
> to prevent confusion between several processes active
> at once, whether for file system activity or syncproc.
>

But still,

If I sync immediately after rebuilding the freelist,
kfs works fine. If I let syncproc do that, wrenwrite is eventually
passed offsets not present in the file system.

After a lot of investigation, I can only imagine insufficient
interlocking as a possible cause, but fail to understand
how this happens.

This is not a hardware problem: I am able to reproduce it
on a different, sufficiently fast computer.

David



* Re: [9fans] kfs locks
  2004-03-07 20:03   ` David Tolpin
@ 2004-03-07 20:12     ` Charles Forsyth
  2004-03-07 20:20       ` David Tolpin
  0 siblings, 1 reply; 7+ messages in thread
From: Charles Forsyth @ 2004-03-07 20:12 UTC (permalink / raw)
  To: 9fans

in a previous message, with various debugging prints turned on
and inserted, there seemed to be driver requests to abort a command.
you don't seem to see the problem unless you power cycle,
and although i don't use it as much as i did, i still use kfs a
fair bit (and i do sometimes just switch off that machine),
and i don't see similar problems.
now, it's quite likely that i'm running an older version of kfs,
and there were changes in chk in november (for instance), although
they look benign.  i thought i'd like to know more about the
actual cause of wrenwrite errors before trying to guess what and
where the problem might be!



* Re: [9fans] kfs locks
  2004-03-07 20:12     ` Charles Forsyth
@ 2004-03-07 20:20       ` David Tolpin
  2004-03-08  3:27         ` jmk
  0 siblings, 1 reply; 7+ messages in thread
From: David Tolpin @ 2004-03-07 20:20 UTC (permalink / raw)
  To: 9fans

> in a previous message, with various debugging prints turned on
> and inserted, there seemed to be driver requests to abort a command.

The write is being aborted because it is asked to write at an offset
not present on the disk. It looks like the in-memory structure
is corrupt.

> you don't seem to see the problem unless you power cycle,
> and although i don't use it as much as i did, i still use kfs a
> fair bit (and i do sometimes just switch off that machine),
> and i don't see similar problems.
> now, it's quite likely that i'm running an older version of kfs,

It is quite likely that your machine is slower. If I slow down
the kfs or kernel code by inserting more debugging, the problem
disappears because the structure is flushed before being overwritten.

> actual cause of wrenwrite errors before trying to guess what and
> where the problem might be!

offsets are negative vlongs. 



* Re: [9fans] kfs locks
  2004-03-07 20:20       ` David Tolpin
@ 2004-03-08  3:27         ` jmk
  2004-03-08  5:26           ` David Tolpin
  0 siblings, 1 reply; 7+ messages in thread
From: jmk @ 2004-03-08  3:27 UTC (permalink / raw)
  To: 9fans

On Sun Mar  7 15:22:37 EST 2004, dvd@davidashen.net wrote:
> > in a previous message, with various debugging prints turned on
> > and inserted, there seemed to be driver requests to abort a command.
> 
> write is being aborted, because it is asked to write at offset 
> not present on the disk. It looks like the in-memory structure 
> is corrupt.
> 
> > you don't seem to see the problem unless you power cycle,
> > and although i don't use it as much as i did, i still use kfs a
> > fair bit (and i do sometimes just switch off that machine),
> > and i don't see similar problems.
> > now, it's quite likely that i'm running an older version of kfs,
> 
> It is quite likely that your machine is slower. If I slow down
> the kfs or kernel code by inserting more debugging, the problem
> disappears because the structure is flushed before being overwritten.
> 
> > actual cause of wrenwrite errors before trying to guess what and
> > where the problem might be!
> 
> offsets are negative vlongs. 

The offset is supposed to be checked way before getting to the
ATA driver:
1) the write and seek system calls check the offset is not negative;
2) the higher level of the disc driver checks the address requested
   is within the partition.

What does your partition table look like?



* Re: [9fans] kfs locks
  2004-03-08  3:27         ` jmk
@ 2004-03-08  5:26           ` David Tolpin
  0 siblings, 0 replies; 7+ messages in thread
From: David Tolpin @ 2004-03-08  5:26 UTC (permalink / raw)
  To: 9fans

> The offset is supposed to be checked way before getting to the
> ATA driver:

Yes, I see. 
Not all offsets are negative, only some. I've added a debugging print (("%lld",...))
to wrenwrite, not to the ATA driver. I am getting offsets beyond the range of the
filesystem, some of them negative.

> What does your partition table look like?

geometry 29336832 512 16383 16 63
part data 0 29336832
part plan9 63 8225280
part 9fat 63 102463
part fs 102463 7442495
part nvram 7442495 7442496
part swap 7442496 8225280
part plan9.1 8225280 16450560
part swap.cpu 8225343 10322495
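
As a worked check on the table above, here is a small sketch of the arithmetic. It assumes the start and end columns are in 512-byte sectors with the end exclusive. The helper names partbytes and inrange are hypothetical, not taken from kfs or the driver. Under that assumption the fs partition is (7442495 - 102463) * 512 = 3758096384 bytes, which is larger than a signed 32-bit value can hold, so byte offsets into it need 64-bit arithmetic throughout. Whether any code path here actually truncates an offset is not established in this thread. The sketch shows only the size calculation and the kind of range check described earlier.

```c
#include <stdint.h>

enum { SECTORSIZE = 512 };

/*
 * Size in bytes of a partition running from sector start up to,
 * but not including, sector end. Computed in 64 bits.
 */
static int64_t
partbytes(int64_t start, int64_t end)
{
	return (end - start) * SECTORSIZE;
}

/*
 * Return 1 if the byte range [off, off+len) lies wholly inside a
 * partition of size bytes, 0 otherwise. Negative offsets and
 * lengths are rejected.
 */
static int
inrange(int64_t off, int64_t len, int64_t size)
{
	return off >= 0 && len >= 0 && off <= size - len;
}
```

For example, inrange(-512, 512, partbytes(102463, 7442495)) is 0, and so is any offset at or beyond 3758096384.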


