9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] KFS Crash
@ 2001-02-14 23:06 rob pike
  0 siblings, 0 replies; 6+ messages in thread
From: rob pike @ 2001-02-14 23:06 UTC (permalink / raw)
  To: 9fans

> 	1. If rio crashes is there a way to kill
> 	   rio and get back to the shell?
>
> Not really.

Yes really, although it's not pretty.

On bad rio days (I have them more than most people, I suspect,
although still not very often) I need to debug rio after it's crashed.
If you have a shell prompt, you're golden but it takes a little work.
Hit carriage return until every one gets you a shell prompt; that
means the rio process holding the keyboard open has filled its
buffer.  Then, although echo will still be off, the shell is all yours.
Type
	kill rio|rc
and you'll get echo back and you can restart rio or whatever else
you want to do.

There, isn't that disgusting?  But I helped track down a corrupt
network interface on an IRIX machine a little while ago so I'm in
a debugging, sharing mood.

-rob




* Re: [9fans] KFS Crash
  2001-02-14 22:01 Russ Cox
@ 2001-02-15 14:31 ` Mark C. Otto
  0 siblings, 0 replies; 6+ messages in thread
From: Mark C. Otto @ 2001-02-15 14:31 UTC (permalink / raw)
  To: 9fans

Booting up with the install disk, I was able to get a window under rio.  I could
not start kfs though:

% disk/kfs -f /dev/sdC0/fs
File system main inconsistent
Would you like to ream it (y/n)?  hell no. tag = <badtag>; expected Tdir
kfs init 496: FID1 attach to root

# The root of the problem is at the root.  Russ is right:  at least it asks
# before wiping out your file system.

mount /srv/kfs /n/kfs
mount: sys: write on closed pipe pc=0x00003abe

I can get to the 9fat file system.  I can probably reconstruct most of what I
lost except for a few mail messages and my plan9 wish list.  I'm sure all the
important things will come back to me. :)

For the autopsy, I have always run disk/kfscmd halt to shut the system down
except for the few times the system locked up.  I have never run disk/kfscmd
check.  I have a 500 MB disk and had just installed wiki and pq.  (I didn't install
TeX for space reasons.)  That may have nearly filled the disk.  I haven't run
Russ's df in a while, so I won't be able to confirm that.  It did get some 9p
and bad address errors before it locked up.  I should have had a plan before I
rebooted the final time.  Next time I might use Rob's rio crash fallback
procedure.


To debug rio after it's crashed:
1. If you have a shell prompt, hit carriage return until every one gets you a
shell prompt; that means the rio process holding the keyboard open has filled
its buffer.  Then, although echo will still be off, the shell is all yours.
2. Type
     kill rio|rc
and you'll get echo back and you can restart rio or whatever else
you want to do.

3. After that run the disk diagnostics and rebuild the disk

 disk/kfscmd -r check
 disk/kfscmd -fdtw check  # I'm guessing at the options from kfscmd(8)


Guess this is the kick to usurp another old pc for a cpu/auth server, set up
dhcpd(8), and u9fs on my linux box.  Unfortunately, I have JMK's disliked
Adaptec SCSI controllers (AHA-2940U2/AHA-2940U2W PCI and AIC-7880 PCI), so I
can't set up a real file server yet.

Russ, Rob, and Scott, thanks for helping me at least find out what state my
system is in.

Mark



* Re: [9fans] KFS Crash
@ 2001-02-14 22:02 Russ Cox
  0 siblings, 0 replies; 6+ messages in thread
From: Russ Cox @ 2001-02-14 22:02 UTC (permalink / raw)
  To: 9fans

	It's kinda bad that it doesn't give you a chance to answer the question.
	A kfs on a rescue floppy would at least let you check the filesystem
	and maybe recover something.

At least it assumes a negative answer rather
than reformatting your file system right then
and there.

Russ



* Re: [9fans] KFS Crash
@ 2001-02-14 22:01 Russ Cox
  2001-02-15 14:31 ` Mark C. Otto
  0 siblings, 1 reply; 6+ messages in thread
From: Russ Cox @ 2001-02-14 22:01 UTC (permalink / raw)
  To: 9fans

	1. If rio crashes is there a way to kill
	   rio and get back to the shell?

Not really.

	2. Given that I could, what diagnostics
	   could I run to identify the problem?
	3. Is there a way to repair the disk at this point?

I rarely see kfs get that hosed.  Are you sure
the "root is from:" prompt got the right file system?

I'd try booting an install floppy and using it
as a rescue disk: ignore the install process,
draw yourself a new window, and try to start kfs
manually:

	disk/kfs -f /dev/sdC0/fs
	mount /srv/kfs /n/kfs

and maybe you'll get a bit farther.

There's almost always a way to repair the disk, depending
on how much energy you're willing to devote to it.
I have a clumsy C program that tries to pull
out textual data from broken kfs file systems
if you need something that wasn't backed up.
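
A rough sketch of the general idea of pulling text out of a damaged
file system, for anyone who wants to try it on a raw copy of the disk.
This is not the C program mentioned above, and it knows nothing about
the kfs on-disk layout: it only scans bytes for runs of printable ASCII,
much like strings(1).  The function names, the 8-byte minimum, and the
idea of working from an image file are all assumptions made for
illustration.

```python
import re


def extract_text_runs(data: bytes, min_len: int = 8) -> list[str]:
    """Return every run of printable ASCII (plus tab and newline)
    that is at least min_len bytes long, in order of appearance."""
    # 0x20-0x7e is printable ASCII; tab and newline are kept so that
    # multi-line text files come out as a single run.
    pattern = re.compile(rb"[\x20-\x7e\t\n]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def salvage(image_path: str, min_len: int = 8) -> list[str]:
    """Read a raw image (hypothetical path) and return its text runs.
    Reads the whole file into memory, which is fine for small disks."""
    with open(image_path, "rb") as f:
        return extract_text_runs(f.read(), min_len)
```

Calling salvage on a copy of the partition, rather than on the live
device, leaves the damaged disk untouched while you look through what
comes out.  Expect plenty of noise: any binary data that happens to
look like text will show up too.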

	4. Given that you don't have much faith
	   in kfs and some of us are using it
	   exclusively on our standalone terminals,
	   what sort of maintenance, such as
	   disk/kfscmd check, should we be doing?

One method is to set up two file systems and run
check say once a month.  When you start getting
things like bad tags, ream the other file system
and copy your data over; repeat.

For the most part, kfs is stable.  It gets unstable
fairly fast if you frequently don't "disk/kfscmd halt"
before shutting down, or if you crash your kernels a lot
(implies the first, but usually a bigger problem
since you can die during heavy disk i/o).

Nothing's set in stone but I think one hope for the
fabled file server rewrite is to have kfs build
from the same sources, which may at least exercise it
more.

Russ



* Re: [9fans] KFS Crash
  2001-02-14 21:43 ` [9fans] KFS Crash Mark C. Otto
@ 2001-02-14 21:57   ` Scott Schwartz
  0 siblings, 0 replies; 6+ messages in thread
From: Scott Schwartz @ 2001-02-14 21:57 UTC (permalink / raw)
  To: 9fans

> Would you like to ream it (y/n)?        tag = <badtag>;Tdir

It's kinda bad that it doesn't give you a chance to answer the question.
A kfs on a rescue floppy would at least let you check the filesystem
and maybe recover something.




* [9fans] KFS Crash
  2001-02-01  8:08 [9fans] plan 9 wiki experiment Quinn Dunkan
@ 2001-02-14 21:43 ` Mark C. Otto
  2001-02-14 21:57   ` Scott Schwartz
  0 siblings, 1 reply; 6+ messages in thread
From: Mark C. Otto @ 2001-02-14 21:43 UTC (permalink / raw)
  To: 9fans

I had bad address and 9p messages start filling the screen.  Rio locked up, and
I didn't see any way to get out of it, so I rebooted.  Unfortunately, I couldn't
even boot up, not even with my boot floppy.


kfs...boot: nop...File system main inconsistent
Would you like to ream it (y/n)?        tag = <badtag>;Tdir
kfs init 5: FID1 attach to root
boot: read nop: file does not exist
panic: boot process died: unknown
ktrace /kernel/path 8010649e 8026bdb4
8026bd54=801063a3 8026hd64=80171915
...
cpu0: exiting

Looks like I am going to have to reinstall the system, unless my terminal is
just trying to be sympathetic with Russ's. :-(  I have the following questions:

1. If rio crashes is there a way to kill rio and get back to the shell?
2. Given that I could, what diagnostics could I run to identify the problem?
3. Is there a way to repair the disk at this point?
4. Given that you don't have much faith in kfs and some of us are using it
exclusively on our standalone terminals, what sort of maintenance, such as
disk/kfscmd check, should we be doing?

Mark


