9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] How to get the diagnostics of fs(3)
@ 2008-10-15  8:28 Christian Kellermann
  2008-10-15  8:39 ` Fco. J. Ballesteros
  2008-10-15 12:42 ` erik quanstrom
  0 siblings, 2 replies; 11+ messages in thread
From: Christian Kellermann @ 2008-10-15  8:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Dear List,

I have experienced a disk crash in a mirrored fs(3). It turned out
that the mirroring has not been successful since December 2007 which
is quite a loss for me now.

To prevent a case like this, it would have helped if I had seen the
error messages from fs(3) earlier, or at all. Browsing through the
code with the intention of adding something there, I found several
prints that do issue the right warnings. What I don't see at the
moment: where do they go? How is #k started, and how can I redirect
the stdout of #k to a file that I can monitor?

Thanks for your insights,

Christian

-- 
You may use my gpg key for replies:
pub  1024D/47F79788 2005/02/02 Christian Kellermann (C-Keen)


* [9fans]  How to get the diagnostics of fs(3)
  2008-10-15  8:28 [9fans] How to get the diagnostics of fs(3) Christian Kellermann
@ 2008-10-15  8:39 ` Fco. J. Ballesteros
  2008-10-15  8:47   ` Christian Kellermann
  2008-10-15 12:42 ` erik quanstrom
  1 sibling, 1 reply; 11+ messages in thread
From: Fco. J. Ballesteros @ 2008-10-15  8:39 UTC (permalink / raw)
  To: 9fans

We use aux/clog to send the contents of /dev/kprint to /sys/log/$sysname.
We bind '#k' by hand after booting our server, but how you do it depends
on the particular config for your machine.


* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15  8:39 ` Fco. J. Ballesteros
@ 2008-10-15  8:47   ` Christian Kellermann
  2008-10-15  9:13     ` Fco. J. Ballesteros
  2008-10-15 15:48     ` ron minnich
  0 siblings, 2 replies; 11+ messages in thread
From: Christian Kellermann @ 2008-10-15  8:47 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* Fco. J. Ballesteros <nemo@lsub.org> [081015 10:42]:
> We use aux/clog to send the contents of /dev/kprint to /sys/log/$sysname.
> We bind '#k' by hand after booting our server, but how you do it depends
> on the particular config for your machine.
> 

So far I have had the fs= line in plan9.ini, because my fossil/venti
start off the fs(3) server. This might not be a good idea any more.
Still, if I want to keep it this way, can I still get to the output?



* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15  8:47   ` Christian Kellermann
@ 2008-10-15  9:13     ` Fco. J. Ballesteros
  2008-10-15 15:48     ` ron minnich
  1 sibling, 0 replies; 11+ messages in thread
From: Fco. J. Ballesteros @ 2008-10-15  9:13 UTC (permalink / raw)
  To: 9fans

All the output sent to the console is available via /dev/kprint.
If you copy that file somewhere, e.g. using aux/clog, all your messages
should end up in the copy.
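
A minimal sketch of the above, assuming it is run from cpurc or a similar
startup script on the server. The log file name follows the
/sys/log/$sysname convention mentioned earlier in the thread. Per cons(3),
kprint only delivers data written after it has been opened, so the logger
should probably be started as early as possible.

```rc
# copy everything printed to the console into a log file;
# aux/clog prefixes each line with a timestamp
aux/clog /dev/kprint /sys/log/$sysname &
```

The log can then be watched with tail -f /sys/log/$sysname.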


* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15  8:28 [9fans] How to get the diagnostics of fs(3) Christian Kellermann
  2008-10-15  8:39 ` Fco. J. Ballesteros
@ 2008-10-15 12:42 ` erik quanstrom
  1 sibling, 0 replies; 11+ messages in thread
From: erik quanstrom @ 2008-10-15 12:42 UTC (permalink / raw)
  To: 9fans

we have three different console servers at coraid.
so i've changed how consoles and console logging
works.  maybe this will be useful to other people.

here are the changes that make this work
1. instead of consoledb use /lib/ndb/consoledb.$sysname.
and have a general test for this new file in cpurc.
	if(test -f /lib/ndb/consoledb.$sysname){
		aux/consolefs -c /lib/ndb/consoledb.$sysname
		startclog
	}
2. startclog (attached) uses a new consoledb tuple
clog=$sysname, e.g. from /lib/ndb/consoledb.baron
	console=ila dev=/dev/eia5
		gid=sys
		clog=baron

(the -r flag is part of a very sneaky trick involving booting
from a skeleton kfs and binding large bits of the namespace
back via aan so that the console logger can survive losing
its connection to the fs without having console logging
end up someplace other than the main fs.  the way to start
it is startclog; startclog -r.  this setup survived a switch gone
wild a few weeks ago.)

3. add the directory /sys/log/clog & create the console
files you wish.
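
A minimal sketch of step 3, assuming the console is named ila as in the
consoledb example above. The ndb/query line is the same lookup that
startclog performs and is only shown as a way to check the clog= tuple by
hand.

```rc
# create the log directory and an empty log file for the console
mkdir -p /sys/log/clog
touch /sys/log/clog/ila
# print the clog value recorded for console ila
ndb/query -f /lib/ndb/consoledb.$sysname console ila clog
```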

- erik

[-- Attachment #2: startclog --]
[-- Type: text/plain, Size: 663 bytes --]

#!/bin/rc
rfork e
rflag=0
if(~ $1 -r)
	rflag=1
cd /mnt/consoles
for(i in *){
	x = `{ndb/query -f/lib/ndb/consoledb.$sysname console $i clog}
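	# follow every console whose clog= tuple names this machine;
	# testing rflag here as well would make the -r branch below unreachable
	if(~ $x $sysname){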
		log=/sys/log/clog/$i
		if(~ $rflag 1){
			mkdir -p /tmp/clog
			log=/tmp/clog/$i
		}
		while(~ 1 1){
			echo aux/clog /mnt/consoles/$i $log
			aux/clog /mnt/consoles/$i $log; echo clog exits $status
			sleep 10;
		}
	}&
}
fn rotate{
	rm /tmp/oclog/*
	cp /tmp/clog/* /tmp/oclog
	rm /tmp/clog/*
}
if(~ $rflag 1){
	mkdir -p /tmp/oclog /tmp/clog
	while(~ 1 1){
		sleep 600
		# du -s gives a single total, so test sees one number
		s=`{du -s /tmp/clog| sed 's/[ 	].*//g'}
		if(test $s -gt 10000)
			rotate
	}
}&


* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15  8:47   ` Christian Kellermann
  2008-10-15  9:13     ` Fco. J. Ballesteros
@ 2008-10-15 15:48     ` ron minnich
  2008-10-15 15:54       ` andrey mirtchovski
  1 sibling, 1 reply; 11+ messages in thread
From: ron minnich @ 2008-10-15 15:48 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

similar story here, lost disk, lost data. In this case because I made
a mistake.

I've been wondering -- how much of your file systems are you using out there?

Seems to me that one could build a fanless wonder with a small via
embedded system and USB sticks -- 3 of them -- as the disks, running
in the fs redundant mode. But that means you're limited in size to
64GB. Performance of the corsair voyagers is actually quite good, so
on a 100bt network you probably would not feel too much pain.

It just depends on whether 64G is enough. But the fanless wonder
removes all motors, which seems a good idea.

ron




* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15 15:48     ` ron minnich
@ 2008-10-15 15:54       ` andrey mirtchovski
  2008-10-15 16:19         ` erik quanstrom
  0 siblings, 1 reply; 11+ messages in thread
From: andrey mirtchovski @ 2008-10-15 15:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

back when i had mirroring via devfs (some three years ago now) i used
'cmp' to verify that the disks were being correctly written to and
that no errors had occurred. i ran cmp from the nightly log at least
a couple of times a week.
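
A minimal sketch of such a periodic check, assuming the two halves of the
mirror are the hypothetical partitions /dev/sdC0/fs and /dev/sdD0/fs and
that little is being written while it runs (writes in flight may show up
as transient differences). The log file /sys/log/mirror is also an
assumption.

```rc
#!/bin/rc
# a and b are placeholder names for the two halves of the mirror
a=/dev/sdC0/fs
b=/dev/sdD0/fs
# cmp -s prints nothing and only sets the exit status
if(! cmp -s $a $b)
	echo `{date} mirror mismatch: $a $b >>/sys/log/mirror
```

Running cmp -l $a $b | sed 10q by hand afterwards lists the first few
differing byte offsets, which shows whether the differences cluster in one
place on the partition.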




* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15 15:54       ` andrey mirtchovski
@ 2008-10-15 16:19         ` erik quanstrom
  2008-10-15 18:37           ` Christian Kellermann
  0 siblings, 1 reply; 11+ messages in thread
From: erik quanstrom @ 2008-10-15 16:19 UTC (permalink / raw)
  To: mirtchovski, 9fans

> back when i had mirroring via devfs (some three years ago now) i used
> 'cmp' to verify that the disks were being correctly written to and
> that no errors had occurred. i ran cmp from the nightly log at least
> a couple of times a week.

that's a good idea.

when using the mirror device, no write error is issued unless all mirrored
copies are unwritable.  this would make me nervous since without a
checksum, it's hard to know which copy is good.

(/n/sources/plan9/sys/src/9/port/devfs.c:697)

- erik




* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15 16:19         ` erik quanstrom
@ 2008-10-15 18:37           ` Christian Kellermann
  2008-10-16  0:57             ` erik quanstrom
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Kellermann @ 2008-10-15 18:37 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: mirtchovski

* erik quanstrom <quanstro@coraid.com> [081015 18:23]:
> > back when i had mirroring via devfs (some three years ago now) i used
> > 'cmp' to verify that the disks were being correctly written to and
> > that no errors had occurred. i ran cmp from the nightly log at least
> > a couple of times a week.
>
> that's a good idea.
>
> when using the mirror device, no write error is issued unless all mirrored
> copies are unwritable.  this would make me nervous since without a
> checksum, it's hard to know which copy is good.
>
> (/n/sources/plan9/sys/src/9/port/devfs.c:697)

Also it means that while the writes may be ok, you will find out
that some sectors are corrupt at the wrong time. For this a cmp
would not be sufficient, would it? An interesting thing in my case
is that I got cmp errors from the start, even with freshly nulled,
identical partitions on identical disks. I still cannot figure
out why.

Cheers,

Christian




* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-15 18:37           ` Christian Kellermann
@ 2008-10-16  0:57             ` erik quanstrom
  2008-10-16  7:38               ` Christian Kellermann
  0 siblings, 1 reply; 11+ messages in thread
From: erik quanstrom @ 2008-10-16  0:57 UTC (permalink / raw)
  To: 9fans

On Wed Oct 15 14:38:47 EDT 2008, Christian.Kellermann@nefkom.net wrote:
> [...] An interesting thing in my case
> is that I got cmp errors from the start, even with freshly nulled
> partitions and identical partitions and disks. I still cannot figure
> out why.

were these cmp errors in any particular place on the partition?

- erik




* Re: [9fans] How to get the diagnostics of fs(3)
  2008-10-16  0:57             ` erik quanstrom
@ 2008-10-16  7:38               ` Christian Kellermann
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Kellermann @ 2008-10-16  7:38 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* erik quanstrom <quanstro@quanstro.net> [081016 03:00]:
> On Wed Oct 15 14:38:47 EDT 2008, Christian.Kellermann@nefkom.net wrote:
> > [...] An interesting thing in my case
> > is that I got cmp errors from the start, even with freshly nulled
> > partitions and identical partitions and disks. I still cannot figure
> > out why.
> 
> were these cmp errors in any particular place on the partition?

unfortunately this data got lost in the noise...


-- 
You may use my gpg key for replies:
pub  1024D/47F79788 2005/02/02 Christian Kellermann (C-Keen)
