9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] i/o error: wrenwrite
@ 2004-03-03  7:57 David Tolpin
  2004-03-03  8:06 ` Fco.J.Ballesteros
  2004-03-03 12:31 ` boyd, rounin
  0 siblings, 2 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03  7:57 UTC (permalink / raw)
  To: 9fans


Hi,

after a power failure, I'm getting repeating lines with
I/O error: wrenwrite

The file system is kfs. 

After reboot from the installation CD, fshalt and c-a-d,
everything is working.

Is it normal?

David



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [9fans] i/o error: wrenwrite
  2004-03-03  7:57 [9fans] i/o error: wrenwrite David Tolpin
@ 2004-03-03  8:06 ` Fco.J.Ballesteros
  2004-03-03  8:10   ` David Tolpin
  2004-03-03  9:15   ` Richard Miller
  2004-03-03 12:31 ` boyd, rounin
  1 sibling, 2 replies; 41+ messages in thread
From: Fco.J.Ballesteros @ 2004-03-03  8:06 UTC (permalink / raw)
  To: 9fans

When I've seen wrenwrite complaints, it has always been due to
about-to-fail disks. But don't know if that's your case. It was long
ago.


* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:06 ` Fco.J.Ballesteros
@ 2004-03-03  8:10   ` David Tolpin
  2004-03-03  8:15     ` Fco.J.Ballesteros
  2004-03-03  9:55     ` Bruce Ellis
  2004-03-03  9:15   ` Richard Miller
  1 sibling, 2 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03  8:10 UTC (permalink / raw)
  To: 9fans

> When I've seen wrenwrite complaints, it has always been due to
> about-to-fail disks. But don't know if that's your case. It was long
> ago.
>

Unlikely. FreeBSD runs for several hours under an fs stress test
without the slightest sign of a problem. On the same hardware.





* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:10   ` David Tolpin
@ 2004-03-03  8:15     ` Fco.J.Ballesteros
  2004-03-03  8:22       ` David Tolpin
  2004-03-03  9:55     ` Bruce Ellis
  1 sibling, 1 reply; 41+ messages in thread
From: Fco.J.Ballesteros @ 2004-03-03  8:15 UTC (permalink / raw)
  To: 9fans

Perhaps your disk recovered your bad blocks using spare ones and
now there's no problem at all.


* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:15     ` Fco.J.Ballesteros
@ 2004-03-03  8:22       ` David Tolpin
  2004-03-03  8:34         ` Fco.J.Ballesteros
  0 siblings, 1 reply; 41+ messages in thread
From: David Tolpin @ 2004-03-03  8:22 UTC (permalink / raw)
  To: 9fans

>
> Perhaps your disk recovered your bad blocks using spare ones and
> now there's no problem at all.
>

1) What exactly does the word "wrenwrite" mean?
2) How does it depend on rebooting from different media?
3) Among the messages displayed last time there was one saying that
it cannot open /adm/timezone/local. After the reboot, /adm/timezone/local
is where it should be and unchanged (that is, my timezone as I put
it there).

Can it be something to do with controller state not being properly initialized?
How exactly should I report my hardware configuration?
Why does it only happen after a power failure and not during normal work,
if it is a hardware problem?

I am not a hardware expert. I am just trying to port some programs
to Plan9, and use it as a platform -- it was said to be 'finished',
that is, usable for work, and I hope it is mature indeed.

David Tolpin




* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:22       ` David Tolpin
@ 2004-03-03  8:34         ` Fco.J.Ballesteros
  2004-03-03  8:59           ` David Tolpin
  2004-03-03  9:51           ` Geoff Collyer
  0 siblings, 2 replies; 41+ messages in thread
From: Fco.J.Ballesteros @ 2004-03-03  8:34 UTC (permalink / raw)
  To: 9fans

I am not a hardware expert either. Wren was the name for the disk
device in fs(8), and I think kfs inherited the name. It refers to a rw
disk or partition. The messages can be understood AFAIK as I/O errors,
which usually correspond to broken hardware.

Regarding 2, I don't know how that may be.
Regarding 3, I know of disks that do automatically what some time ago you
did by hand (declaring some blocks defective and instructing the disk to use
spare ones instead). The only explanation I can find is that your disk
detected an error and was able to correct it; but I'm not a hardware expert
either.

A power failure may cause all this depending on the disk you use, because it
may lead to broken disks (although I admit I've not seen this for a long time).

We use Plan 9 here for daily work: it runs a lab for students and our accounts;
we write programs and documents on it, read mail, etc. I can indeed say
that it's more reliable than Linux, in my experience (which may be
different for others, of course). So I'd not be scared to use Plan 9 for daily
work; I'd be scared to switch back to what I used before.

If I may help somehow, let me know.




* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:34         ` Fco.J.Ballesteros
@ 2004-03-03  8:59           ` David Tolpin
  2004-03-03  9:51           ` Geoff Collyer
  1 sibling, 0 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03  8:59 UTC (permalink / raw)
  To: 9fans

How do I boot from the CD and check the filesystem? That is, how do I
mount it? I've tried but somehow cannot figure out how.



* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:06 ` Fco.J.Ballesteros
  2004-03-03  8:10   ` David Tolpin
@ 2004-03-03  9:15   ` Richard Miller
  2004-03-03  9:18     ` David Tolpin
  1 sibling, 1 reply; 41+ messages in thread
From: Richard Miller @ 2004-03-03  9:15 UTC (permalink / raw)
  To: 9fans

> When I've seen wrenwrite complaints, it has always been due to
> about-to-fail disks.

An "i/o error" report from wrenwrite is not necessarily a hardware
problem.  It may be a failed seek because a block address in the
filesystem has been corrupted.  This should be repairable (with some
loss of data) by kfscmd chk.

-- Richard




* Re: [9fans] i/o error: wrenwrite
  2004-03-03  9:15   ` Richard Miller
@ 2004-03-03  9:18     ` David Tolpin
  0 siblings, 0 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03  9:18 UTC (permalink / raw)
  To: 9fans

> > When I've seen wrenwrite complaints, it has always been due to
> > about-to-fail disks.
>
> An "i/o error" report from wrenwrite is not necessarily a hardware
> problem.  It may be a failed seek because a block address in the
> filesystem has been corrupted.  This should be repairable (with some
> loss of data) by kfscmd chk.

I don't get a chance to run kfscmd check, since the wrenwrite messages
repeat endlessly each time I reboot the system.

And when I just reboot from CD, and then c-a-d to boot from the
hard disk, everything is normal. I can reproduce it easily
by pressing the power button.

It has something to do with the hardware state not being properly
initialized.

Is there a way to get more debugging information?

David Tolpin




* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:34         ` Fco.J.Ballesteros
  2004-03-03  8:59           ` David Tolpin
@ 2004-03-03  9:51           ` Geoff Collyer
  2004-03-03  9:54             ` David Tolpin
  2004-03-03 20:35             ` splite
  1 sibling, 2 replies; 41+ messages in thread
From: Geoff Collyer @ 2004-03-03  9:51 UTC (permalink / raw)
  To: 9fans

There's no great mystery to the name `wren'; there was a line of SCSI
disks at the time (ca.  1990) called Wren I, Wren II, Wren III, etc.,
I think made by Fujitsu, who earlier made the Eagle, Super-Eagle
(a.k.a.  the Super-Turkey), Swallow, etc.  So `wren' just denotes a
SCSI disk.

Is your machine on a UPS?  If not, all bets are off after a power
failure; if so, it shouldn't even notice a short power failure.




* Re: [9fans] i/o error: wrenwrite
  2004-03-03  9:51           ` Geoff Collyer
@ 2004-03-03  9:54             ` David Tolpin
  2004-03-03 10:39               ` matt
  2004-03-03 20:35             ` splite
  1 sibling, 1 reply; 41+ messages in thread
From: David Tolpin @ 2004-03-03  9:54 UTC (permalink / raw)
  To: 9fans

>
> Is your machine on a UPS?  If not, all bets are off after a power
> failure; if so, it shouldn't even notice a short power failure.
>

My machine is on a UPS, but the power was off for longer than the UPS
could hold. Here in Armenia, that is not a rare occurrence.

But if I just turn it off with the power button, FreeBSD comes up
just fine after intensive disk writes just before poweroff;
and Plan9 always shows the same behaviour -- endless wrenwrite
messages on the console, until I reboot from CD and do nothing.

This has something to do with the hardware state not being properly
initialized.



* Re: [9fans] i/o error: wrenwrite
  2004-03-03  8:10   ` David Tolpin
  2004-03-03  8:15     ` Fco.J.Ballesteros
@ 2004-03-03  9:55     ` Bruce Ellis
  2004-03-03 10:00       ` David Tolpin
  1 sibling, 1 reply; 41+ messages in thread
From: Bruce Ellis @ 2004-03-03  9:55 UTC (permalink / raw)
  To: 9fans

i would love to have an OS that doesn't tell you that
the disk is crook as rookie.  but call me young
and willing.

brucee
----- Original Message ----- 
From: "David Tolpin" <dvd@davidashen.net>
To: <9fans@cse.psu.edu>
Sent: Wednesday, March 03, 2004 7:10 PM
Subject: Re: [9fans] i/o error: wrenwrite


> > When I've seen wrenwrite complaints, it has always been due to
> > about-to-fail disks. But don't know if that's your case. It was long
> > ago.
> >
> 
> Unlikely. FreeBSD runs for several hours under fs stress test
> without any slightest sign of a problem. On the same hardware



* Re: [9fans] i/o error: wrenwrite
  2004-03-03  9:55     ` Bruce Ellis
@ 2004-03-03 10:00       ` David Tolpin
  2004-03-03 10:47         ` Richard Miller
  0 siblings, 1 reply; 41+ messages in thread
From: David Tolpin @ 2004-03-03 10:00 UTC (permalink / raw)
  To: 9fans

> i would love to have an OS that doesn't tell you that
> the disk is crook as rookie.  but call me young
> and willing.

FreeBSD does report any failures with the disk, but all
reports are about filesystem inconsistencies, none about
hw failures. 

Plan9 is not telling me that the disk has a failure. It falls
into an endless loop, leaving me no chance to take action.

I am just asking what I can do to pin the bug down.



* Re: [9fans] i/o error: wrenwrite
  2004-03-03  9:54             ` David Tolpin
@ 2004-03-03 10:39               ` matt
  2004-03-03 10:43                 ` David Tolpin
  2004-03-03 12:44                 ` boyd, rounin
  0 siblings, 2 replies; 41+ messages in thread
From: matt @ 2004-03-03 10:39 UTC (permalink / raw)
  To: 9fans

> But if I just turn it off with power button, FreeBSD comes up just fine after intensive disk writes just before poweroff;

I wouldn't rely on that feature if I were you.

m




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 10:39               ` matt
@ 2004-03-03 10:43                 ` David Tolpin
  2004-03-03 11:37                   ` Charles Forsyth
  2004-03-03 12:46                   ` boyd, rounin
  2004-03-03 12:44                 ` boyd, rounin
  1 sibling, 2 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03 10:43 UTC (permalink / raw)
  To: 9fans

> > But if I just turn it off with power button, FreeBSD comes up just fine after intensive disk writes just before poweroff;
>
> I wouldn't rely on that feature if I were you.
>

I do not rely on this feature. I shut down all my servers and terminals
properly, and my file server's uptime at home is more than a year now.

But I cannot allocate a 2KWh UPS to every computer, and sometimes
computers lose power. After such an accident, I must be able to repair
the file system, either automatically or manually.

Plan9 has a bug that puts it into an endless loop on my hardware
after a reboot without fshalt. I am asking where to look to
fix the bug.

David Tolpin



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 10:00       ` David Tolpin
@ 2004-03-03 10:47         ` Richard Miller
  2004-03-03 11:19           ` David Tolpin
  0 siblings, 1 reply; 41+ messages in thread
From: Richard Miller @ 2004-03-03 10:47 UTC (permalink / raw)
  To: 9fans

> How to boot from CD and check filesystem?

If you've booted Plan 9 from the Plan 9 distribution CD, you should
be able to do:

disk/kfs -f/dev/sdC0/fs
disk/kfscmd 'check r'

Replace "sdC0" with the name of your kfs disk device.




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 10:47         ` Richard Miller
@ 2004-03-03 11:19           ` David Tolpin
  2004-03-03 11:25             ` lucio
  2004-03-03 13:41             ` jmk
  0 siblings, 2 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03 11:19 UTC (permalink / raw)
  To: 9fans

> > How to boot from CD and check filesystem?
>
> If you've booted Plan 9 from the Plan 9 distribution CD, you should
> be able to do:
>
> disk/kfs -f/dev/sdC0/fs
> disk/kfscmd 'check r'

I had tried that before, and probably misspelled something. Worked
this time. Does not report any errors, just finishes smoothly.

Powercycling -> endless wrenwrite i/o error messages ->
booting from CD -> 'check r' no errors
rebooting from sdC0 normal.

Just booting from CD and immediately rebooting helps as well.

What can be the cause?




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 11:19           ` David Tolpin
@ 2004-03-03 11:25             ` lucio
  2004-03-03 11:34               ` David Tolpin
  2004-03-03 13:41             ` jmk
  1 sibling, 1 reply; 41+ messages in thread
From: lucio @ 2004-03-03 11:25 UTC (permalink / raw)
  To: 9fans

> What can be the cause?

You'll have to tell us something about your hardware.  As you have no
doubt gathered by now, no one here has experienced the exact problem
you encountered.  In my experience, the best person to deal with it is
Jim McKie (sp?) <jmk@plan9.bell-labs.com>.  But it may be something
simple or something, once investigated, that others with knowledge of
the hardware understand better.

Would it be SCSI drives you're dealing with?

++L




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 11:25             ` lucio
@ 2004-03-03 11:34               ` David Tolpin
  2004-03-03 20:11                 ` splite
  0 siblings, 1 reply; 41+ messages in thread
From: David Tolpin @ 2004-03-03 11:34 UTC (permalink / raw)
  To: 9fans

> Jim McKie (sp?) <jmk@plan9.bell-labs.com>.  But it may be something
> simple or something, once investigated, that others with knowledge of
> the hardware understand better.
>
> Would it be SCSI drives you're dealing with?

No. It is IDE.

cpu% cat /dev/sdC0/ctl
inquiry IBM-DTLA-307015                         
config 045A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 16
geometry 29336832 512 16383 16 63

cpu% cat /dev/kmesg
cpu0: 798MHz GenuineIntel PentiumIII/Xeon (cpuid: AX 0x0683 DX 0x383FBFF)
ELCR: 0C20
#l0: i82557: 10Mbps port 0x1400 irq 5: 0002A5561D54

Turning dma or rwm on or off before powercycling does not change the
picture.
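
For reference, the geometry line in the ctl output above appears to list
total sectors, bytes per sector, and then the legacy cylinders/heads/sectors
values; that field order is an assumption to check against sd(3). A minimal
C sketch of turning the line into a capacity follows. The names parsegeom,
lbabytes and chssectors are illustrative, not Plan 9 functions.

```c
#include <stdio.h>

/* Fields of a "geometry" line as printed by /dev/sdC0/ctl (assumed order). */
typedef struct Geom Geom;
struct Geom {
	unsigned long long nsec;	/* total sectors (LBA) */
	unsigned secsize;		/* bytes per sector */
	unsigned cyl, head, spt;	/* legacy CHS values */
};

/* Parse one geometry line; return 0 on success, -1 on bad input. */
int
parsegeom(const char *line, Geom *g)
{
	if(sscanf(line, "geometry %llu %u %u %u %u",
	    &g->nsec, &g->secsize, &g->cyl, &g->head, &g->spt) != 5)
		return -1;
	return 0;
}

/* Capacity in bytes according to the LBA sector count. */
unsigned long long
lbabytes(const Geom *g)
{
	return g->nsec * g->secsize;
}

/* Number of sectors addressable through the CHS triple alone. */
unsigned long long
chssectors(const Geom *g)
{
	return (unsigned long long)g->cyl * g->head * g->spt;
}
```

For the line above, lbabytes gives 15020457984 bytes (about 15 GB), while
the CHS triple covers only 16514064 sectors, so the CHS fields look like
the usual capped legacy values rather than the real size.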

David



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 10:43                 ` David Tolpin
@ 2004-03-03 11:37                   ` Charles Forsyth
  2004-03-03 12:19                     ` David Tolpin
  2004-03-03 12:46                   ` boyd, rounin
  1 sibling, 1 reply; 41+ messages in thread
From: Charles Forsyth @ 2004-03-03 11:37 UTC (permalink / raw)
  To: 9fans

>>Plan9 has a bug that puts it into endless loop on my hardware
>>after reboot without fshalt. I am asking where to look at to
>>fix the bug.

quite right.

if you can recompile a kernel while the system is up, and run that,
you could try adding code to /sys/src/9/pc/sdata.c to
atainterrupt (say) to help diagnose the i/o error more precisely.
"i/o error" isn't much help as you've discovered.
there might already be debugging code that can be switched on.
(i did have a very quick look at it myself but i didn't see
anything just right for this case, but i might easily have missed it.)

round about here, say:
	if(status & Err)
		drive->error = inb(cmdport+Error);

i suspect the error might happen much earlier, before it gets to
generate an interrupt, but that should become apparent if you
see no error output from that point.  just a print() of the status
might do.  i suspect the controller ends up in a state where its
registers don't look right to the driver, or indeed where the
controller needs a bit of a whack from the driver after the power cycling,
in which case it probably can't generate IO at all and it's (perhaps)
timing out the command.  it's interesting that it's apparently only
wrenwrite not wrenread that generates errors (or i missed something).
that in itself might help pinpoint the problem.
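
As an illustration of the timing-out idea only: a disk driver typically
polls the status register until BSY clears and DRDY sets, and gives up
after a bounded number of reads. The sketch below is generic C, not code
from sdata.c. waitready and fakestatus are made-up names, and the status
port is simulated by a callback so the logic can be tried in user space.
Only the Bsy and Drdy bit values are the standard ATA ones.

```c
enum {
	Drdy = 0x40,	/* ATA status: device ready */
	Bsy  = 0x80,	/* ATA status: busy */
};

/*
 * Poll a status source until BSY is clear and DRDY is set, giving up
 * after maxtries reads.  readstatus stands in for an inb() of the
 * status port.  Returns the number of reads used, or -1 on timeout.
 */
int
waitready(unsigned char (*readstatus)(void*), void *arg, int maxtries)
{
	int i;
	unsigned char s;

	for(i = 1; i <= maxtries; i++){
		s = readstatus(arg);
		if((s & (Bsy|Drdy)) == Drdy)
			return i;
	}
	return -1;
}

/* Simulated drive: reports BSY for *left more reads, then DRDY. */
unsigned char
fakestatus(void *arg)
{
	int *left = arg;

	if(*left > 0){
		(*left)--;
		return Bsy;
	}
	return Drdy;
}
```

If a drive needs longer to settle after a power cycle than the retry
budget allows, a loop like this reports a timeout even though the drive
would have become ready a little later.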



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 11:37                   ` Charles Forsyth
@ 2004-03-03 12:19                     ` David Tolpin
  0 siblings, 0 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03 12:19 UTC (permalink / raw)
  To: 9fans

>
> round about here, say:
> 	if(status & Err)
> 		drive->error = inb(cmdport+Error);
>
> i suspect the error might happen much earlier, before it gets to
> generate an interrupt, but that should become apparent if you
> see no error output from that point.  just a print() of the status
> might do.  i suspect the controller ends up in a state where its
> registers don't look right to the driver, or indeed where the
> controller needs a bit of a whack from the driver after the power cycling,
> in which case it probably can't generate IO at all and it's (perhaps)

1) I have added print("status=%d\n",status)

which during normal boot displays status=81 occasionally.

During the wrenwrite loop that point is not reached.

2) I then added a call to print("atainterrupt\n") at the top
of atainterrupt(). The result is that the computer boots
with this kernel (after a few minutes, of course) and does not
boot with the kernel without that 'brake'.

Does it mean it is something to do with timeouts?

I'll continue investigating.
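
A note on reading the status value: %d prints decimal, so status=81 is
0x51, which in the standard ATA status register is DRDY, DSC and ERR set
together. A small C sketch of a decoder follows. statusnames and the St*
names are hypothetical helpers for illustration, not identifiers from
sdata.c; only the bit values are the standard ATA ones.

```c
#include <stdio.h>
#include <string.h>

/* ATA status register bits (standard values; names are illustrative). */
enum {
	StErr  = 0x01,	/* ERR: error, details in the error register */
	StIdx  = 0x02,	/* IDX: index mark (obsolete) */
	StCorr = 0x04,	/* CORR: corrected data (obsolete) */
	StDrq  = 0x08,	/* DRQ: data request */
	StDsc  = 0x10,	/* DSC: seek complete */
	StDf   = 0x20,	/* DF: device fault */
	StDrdy = 0x40,	/* DRDY: device ready */
	StBsy  = 0x80,	/* BSY: busy */
};

/*
 * Write the names of the bits set in status into buf, separated by
 * spaces, highest bit first, and return buf.
 */
char*
statusnames(unsigned char status, char *buf, size_t n)
{
	static const struct { int bit; const char *name; } tab[] = {
		{StBsy, "BSY"}, {StDrdy, "DRDY"}, {StDf, "DF"}, {StDsc, "DSC"},
		{StDrq, "DRQ"}, {StCorr, "CORR"}, {StIdx, "IDX"}, {StErr, "ERR"},
	};
	size_t i, len;

	if(n == 0)
		return buf;
	buf[0] = '\0';
	for(i = 0; i < sizeof tab / sizeof tab[0]; i++){
		if(!(status & tab[i].bit))
			continue;
		len = strlen(buf);
		snprintf(buf+len, n-len, "%s%s", len ? " " : "", tab[i].name);
	}
	return buf;
}
```

Calling statusnames(81, buf, sizeof buf) yields "DRDY DSC ERR". Had the
value been hexadecimal 0x81 it would instead read "BSY ERR", so printing
the status in hex in the kernel would avoid the ambiguity.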



* Re: [9fans] i/o error: wrenwrite
  2004-03-03  7:57 [9fans] i/o error: wrenwrite David Tolpin
  2004-03-03  8:06 ` Fco.J.Ballesteros
@ 2004-03-03 12:31 ` boyd, rounin
  1 sibling, 0 replies; 41+ messages in thread
From: boyd, rounin @ 2004-03-03 12:31 UTC (permalink / raw)
  To: 9fans

> after a power failure, I'm getting repeating lines with
> I/O error: wrenwrite

    http://www.insultant.net/uk/BP2004/DSC00558.JPG

btw: the BP shots are un-retouched, just rotated.




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 10:39               ` matt
  2004-03-03 10:43                 ` David Tolpin
@ 2004-03-03 12:44                 ` boyd, rounin
  1 sibling, 0 replies; 41+ messages in thread
From: boyd, rounin @ 2004-03-03 12:44 UTC (permalink / raw)
  To: 9fans

> I wouldn't rely on that feature if I were you.

if it's the FFS (apart from the appalling design) there's
redundancy and no doubt bad block mapping.  it only
took 'em 10 years to get it right ... for some value of 'right'.





* Re: [9fans] i/o error: wrenwrite
  2004-03-03 10:43                 ` David Tolpin
  2004-03-03 11:37                   ` Charles Forsyth
@ 2004-03-03 12:46                   ` boyd, rounin
  1 sibling, 0 replies; 41+ messages in thread
From: boyd, rounin @ 2004-03-03 12:46 UTC (permalink / raw)
  To: 9fans

i have a cunning plan, my lord:

   a screwdriver, sticky tape, adb, dd and a magtape




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 11:19           ` David Tolpin
  2004-03-03 11:25             ` lucio
@ 2004-03-03 13:41             ` jmk
  2004-03-03 13:45               ` David Tolpin
  1 sibling, 1 reply; 41+ messages in thread
From: jmk @ 2004-03-03 13:41 UTC (permalink / raw)
  To: 9fans

On Wed Mar  3 06:22:36 EST 2004, dvd@davidashen.net wrote:
> > > How to boot from CD and check filesystem?
> >
> > If you've booted Plan 9 from the Plan 9 distribution CD, you should
> > be able to do:
> >
> > disk/kfs -f/dev/sdC0/fs
> > disk/kfscmd 'check r'
> 
> I had tried that before, and probably misspelled something. Worked
> this time. Does not report any errors, just finishes smoothly.
> 
> Powercycling -> endless wrenwrite i/o error messages ->
> booting from CD -> 'check r' no errors
> rebooting from sdC0 normal.
> 
> Just booting from CD and immediately rebooting helps as well.
> 
> What can be the cause?

What type of discs are they, ATA, SCSI, etc?
Are they set to spin up automatically when power is applied?



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 13:41             ` jmk
@ 2004-03-03 13:45               ` David Tolpin
  2004-03-03 13:58                 ` C H Forsyth
  0 siblings, 1 reply; 41+ messages in thread
From: David Tolpin @ 2004-03-03 13:45 UTC (permalink / raw)
  To: 9fans

>
> What type of discs are they, ATA, SCSI, etc?

ATA.

cpu% cat /dev/sdC0/ctl
inquiry IBM-DTLA-307015                         
config 045A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 16
geometry 29336832 512 16383 16 63


> Are they set to spin up automatically when power is applied?

I do not know. What are other options if I boot from it?

I am booting, see
kfs...version...time...

and then 

wrenwrite: i/o error

many times.

If I put enough debugging output into the kernel's atainterrupt,
the kernel is able to boot.

David




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 13:58                 ` C H Forsyth
@ 2004-03-03 13:58                   ` lucio
  2004-03-03 14:07                     ` C H Forsyth
  0 siblings, 1 reply; 41+ messages in thread
From: lucio @ 2004-03-03 13:58 UTC (permalink / raw)
  To: 9fans

>>> Are they set to spin up automatically when power is applied?
>>I do not know. What are other options if I boot from it?
> 
> it's usually selected by a jumper on the drive itself.

But not the average ATA, at least not the ones I'm familiar with.
Plus David makes it clear the system has booted at the time the error
occurs, so unless the drive is actually _spun_down_...

I imagine though it's the first write, although, again, you won't get
the progress messages in that sequence unless the disk has been
checked and, in the case of a power failure, it usually takes a while
to get past the "check".

++L




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 13:45               ` David Tolpin
@ 2004-03-03 13:58                 ` C H Forsyth
  2004-03-03 13:58                   ` lucio
  0 siblings, 1 reply; 41+ messages in thread
From: C H Forsyth @ 2004-03-03 13:58 UTC (permalink / raw)
  To: 9fans


>> Are they set to spin up automatically when power is applied?
>I do not know. What are other options if I boot from it?

it's usually selected by a jumper on the drive itself.




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 14:07                     ` C H Forsyth
@ 2004-03-03 14:04                       ` David Tolpin
  2004-03-03 14:14                         ` lucio
  2004-03-03 14:41                         ` Derek Fawcus
  0 siblings, 2 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03 14:04 UTC (permalink / raw)
  To: 9fans

> >>But not the average ATA, at least not the ones I'm familiar with.
>
> sorry, my mistake.  i work too much with SCSI drives ...
>

That was my thought too. SCSI drives can be started by a command;
but ATA drives start when powered, don't they? But on the assumption
that I am wrong, I've opened the box, unplugged the data cable and
turned the power on. It started instantly.

David



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 13:58                   ` lucio
@ 2004-03-03 14:07                     ` C H Forsyth
  2004-03-03 14:04                       ` David Tolpin
  0 siblings, 1 reply; 41+ messages in thread
From: C H Forsyth @ 2004-03-03 14:07 UTC (permalink / raw)
  To: 9fans

>>But not the average ATA, at least not the ones I'm familiar with.

sorry, my mistake.  i work too much with SCSI drives ...




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 14:04                       ` David Tolpin
@ 2004-03-03 14:14                         ` lucio
  2004-03-03 14:23                           ` jmk
  2004-03-03 14:41                         ` Derek Fawcus
  1 sibling, 1 reply; 41+ messages in thread
From: lucio @ 2004-03-03 14:14 UTC (permalink / raw)
  To: 9fans

> That was my thought too. SCSI drives can be started by a command;
> but ATA drives start when powered, don't they? But on the assumption
> that I am wrong, I've opened the box, unplugged the data cable and
> turned power on. It started instantly.

There is power-saving mode, and they do stop turning.  So I presume
there is a command to start them up again.

But I don't think that's it.  I think you should dump as much of the
status of the drive controller as the device driver allows at the
first encounter with the error; somebody here (Christopher
Nielsen, I seem to remember, did the 48-bit extensions to ATA) may be
able to identify what is going out of kilter.

What might also help would be to try an old kernel, say a year old, and
see if it has the same symptoms.

++L




* Re: [9fans] i/o error: wrenwrite
  2004-03-03 14:14                         ` lucio
@ 2004-03-03 14:23                           ` jmk
  2004-03-03 14:24                             ` David Tolpin
  2004-03-03 21:34                             ` Exact Eios " David Tolpin
  0 siblings, 2 replies; 41+ messages in thread
From: jmk @ 2004-03-03 14:23 UTC (permalink / raw)
  To: 9fans

If you're putting in debugging code, try putting it in port/devsd.c:sdbio()
to figure out which of the Eio's is the problem. There's code there for dealing
with initial drive conditions, etc.



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 14:23                           ` jmk
@ 2004-03-03 14:24                             ` David Tolpin
  2004-03-03 21:34                             ` Exact Eios " David Tolpin
  1 sibling, 0 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03 14:24 UTC (permalink / raw)
  To: 9fans

> If you're putting in debugging code, try putting it in port/devsd.c:sdbio()
> to figure out which of the Eio's is the problem. There's code there for dealing
> with initial drive conditions, etc.
>

I will, many thanks. In a few hours, though; I have to go now. The
problem is that I need more time to understand how the parts are
interrelated in the kernel. The problem goes away when debugging slows
down the process, so I cannot just dump everything.
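
One common way around a bug that disappears when print() slows things
down is to record events in memory and dump them later. The sketch below
is a generic user-space C illustration of that idea, not Plan 9 kernel
code. trace, tracedump and Ntrace are made-up names, and it ignores the
locking that interrupt-level code would need.

```c
#include <stdio.h>

enum { Ntrace = 64 };	/* ring size; must be a power of two */

typedef struct Trace Trace;
struct Trace {
	const char *where;	/* static string naming the call site */
	unsigned long val;	/* e.g. a status byte or error code */
};

static Trace tracebuf[Ntrace];
static unsigned long ntrace;	/* total events recorded so far */

/* Record one event: two stores and an increment, no console I/O. */
void
trace(const char *where, unsigned long val)
{
	Trace *t = &tracebuf[ntrace++ & (Ntrace-1)];

	t->where = where;
	t->val = val;
}

/* Print the surviving events, oldest first; return how many were printed. */
unsigned long
tracedump(FILE *f)
{
	unsigned long i, n;

	n = ntrace < Ntrace ? ntrace : Ntrace;
	for(i = ntrace - n; i < ntrace; i++){
		Trace *t = &tracebuf[i & (Ntrace-1)];
		fprintf(f, "%lu %s %#lx\n", i, t->where, t->val);
	}
	return n;
}
```

Each error return in a routine such as sdbio() could call trace() with a
distinct label, and tracedump() could run once after the loop of errors
has started, so that the timing of the failing path is barely disturbed.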

David



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 14:04                       ` David Tolpin
  2004-03-03 14:14                         ` lucio
@ 2004-03-03 14:41                         ` Derek Fawcus
  1 sibling, 0 replies; 41+ messages in thread
From: Derek Fawcus @ 2004-03-03 14:41 UTC (permalink / raw)
  To: 9fans

On Wed, Mar 03, 2004 at 06:04:38PM +0400, David Tolpin wrote:
> 
> That was my thought too. SCSI drives can be started by a command;
> but ATA drives start when powered, don't they?

I believe there is a config option (command) that will change ATA disks
such that they subsequently require a command to start.  i.e. they power
up in the powersave mode,  and need the normal 'leave power save mode'
command to get them going.

Though I also believe the command (feature set subset if I recall) is
optional anyway.

(I spent too long working on ATA disk drivers ~ 8 years ago)

DF



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 11:34               ` David Tolpin
@ 2004-03-03 20:11                 ` splite
  2004-03-03 20:25                   ` David Tolpin
  0 siblings, 1 reply; 41+ messages in thread
From: splite @ 2004-03-03 20:11 UTC (permalink / raw)
  To: 9fans

On Wed, Mar 03, 2004 at 03:34:38PM +0400, David Tolpin wrote:
> 
> cpu% cat /dev/sdC0/ctl
> inquiry IBM-DTLA-307015                         
> config 045A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 16
> geometry 29336832 512 16383 16 63

Oh my, the dread Deathstar 75GXP.

Before you go any further, I'd recommend downloading Hitachi's (formerly
IBM's) Drive Fitness Test floppy image and running its "advanced" drive test
a few times.  Your disk may be about to go toes-up.

http://www.hgst.com/hdd/support/download.htm

(I understand that it's not giving you trouble under FreeBSD, but it could
be that FreeBSD is masking or oblivious to a drive problem.)
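
As an aside, the geometry line quoted above gives the capacity directly. The
sketch below assumes the fields are total sectors, sector size, then
cylinders, heads and sectors per track; Geom, parsegeom and geombytes are
made-up names for illustration, in portable C rather than Plan 9 C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fields of a "geometry" line, in the order assumed above. */
typedef struct Geom Geom;
struct Geom {
	long long sectors;	/* total addressable sectors */
	long secsize;		/* bytes per sector */
	int c, h, s;		/* cylinders, heads, sectors per track */
};

/* Parse one geometry line; return 0 on success, -1 on a malformed line. */
int
parsegeom(const char *line, Geom *g)
{
	assert(line != NULL && g != NULL);
	if(sscanf(line, "geometry %lld %ld %d %d %d",
	    &g->sectors, &g->secsize, &g->c, &g->h, &g->s) != 5)
		return -1;
	return 0;
}

/* Capacity in bytes, from the total sector count rather than C*H*S. */
long long
geombytes(const Geom *g)
{
	return g->sectors * g->secsize;
}
```

For the line above this works out to 29336832 * 512 = 15020457984 bytes,
roughly 15 GB in decimal units. The C*H*S product (16383 * 16 * 63 =
16514064 sectors) is smaller than the total sector count, so it is the
latter that reflects the real size.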



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 20:11                 ` splite
@ 2004-03-03 20:25                   ` David Tolpin
  0 siblings, 0 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03 20:25 UTC (permalink / raw)
  To: 9fans

>
> Oh my, the dread Deathstar 75GXP.
>
> Before you go any further, I'd recommend downloading Hitachi's (formerly
> IBM's) Drive Fitness Test floppy image and running its "advanced" drive test
> a few times.  Your disk may be about to go toes-up.
>
> http://www.hgst.com/hdd/support/download.htm

Thank you, of course I will. But for me the problem is not that the disk
fails (if it does), but that the diagnostics are problematic.
Anyway, I am going to test the disk and debug the kernel.



* Re: [9fans] i/o error: wrenwrite
  2004-03-03  9:51           ` Geoff Collyer
  2004-03-03  9:54             ` David Tolpin
@ 2004-03-03 20:35             ` splite
  2004-03-03 21:25               ` Geoff Collyer
  1 sibling, 1 reply; 41+ messages in thread
From: splite @ 2004-03-03 20:35 UTC (permalink / raw)
  To: 9fans

On Wed, Mar 03, 2004 at 01:51:52AM -0800, Geoff Collyer wrote:
> There's no great mystery to the name `wren'; there was a line of SCSI
> disks at the time (ca.  1990) called Wren I, Wren II, Wren III, etc.,
> I think made by Fujitsu

Nope, CDC.  Later spun off into Imprimis, which was still later bought by
Seagate.

At one point my home box (OEM 12-slot VME crate that my wife painted a
lovely green) had three Wren IVs, one from each company.  Hell of a
way to get 1GB of storage, but cheap.  Later replaced the lot with one
Wren 7.

None of them ever gave me a day's trouble (unlike the pile of Deathstars
we've replaced this year), so I appreciate Plan 9's little "homage".



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 20:35             ` splite
@ 2004-03-03 21:25               ` Geoff Collyer
  2004-03-03 21:42                 ` splite
  0 siblings, 1 reply; 41+ messages in thread
From: Geoff Collyer @ 2004-03-03 21:25 UTC (permalink / raw)
  To: 9fans

CDC, right!  Thanks for the history; I knew Fujitsu didn't sound
right, but it's been a long time.




* Exact Eios Re: [9fans] i/o error: wrenwrite
  2004-03-03 14:23                           ` jmk
  2004-03-03 14:24                             ` David Tolpin
@ 2004-03-03 21:34                             ` David Tolpin
  1 sibling, 0 replies; 41+ messages in thread
From: David Tolpin @ 2004-03-03 21:34 UTC (permalink / raw)
  To: 9fans; +Cc: jmk


The exact places where Eio is reported in sdbio are as follows.
Almost all occurrences:

devsd.c:800
			l = unit->dev->ifc->bio(unit, 0, 0, b, nb, bno);
			if(l < 0) {
					error(Eio);

A few occurrences:

devsd.c:820
			l = unit->dev->ifc->bio(unit, 0, 1, b, nb, bno);
			if(l < 0) {
					error(Eio);

Is it a disk failure? Hitachi's test suite says the disk is good.

Would it make sense to make error reporting more detailed?
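
To make the question concrete, here is a rough sketch of the kind of message
I have in mind, written as portable C rather than kernel code. IoFail and
iofailfmt are hypothetical names, not anything in devsd.c, and in the kernel
one would format with snprint rather than snprintf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of one failed transfer; not a Plan 9 structure. */
typedef struct IoFail IoFail;
struct IoFail {
	const char *unit;	/* unit name, e.g. "sdC0" */
	int write;		/* 0 = read, 1 = write */
	long long bno;		/* first block of the transfer */
	long nb;		/* number of blocks requested */
	long ret;		/* value returned by the driver's bio routine */
};

/*
 * Format a description of the failure into buf.
 * Returns what snprintf returns: the length the full message needs.
 */
int
iofailfmt(char *buf, size_t n, const IoFail *f)
{
	assert(buf != NULL && f != NULL);
	return snprintf(buf, n, "i/o error: %s %s bno %lld nb %ld ret %ld",
		f->unit, f->write ? "write" : "read", f->bno, f->nb, f->ret);
}
```

With something like this, a bare "I/O error" would at least say which unit,
which direction and which block range failed.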

David Tolpin



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 21:25               ` Geoff Collyer
@ 2004-03-03 21:42                 ` splite
  2004-03-03 22:48                   ` boyd, rounin
  0 siblings, 1 reply; 41+ messages in thread
From: splite @ 2004-03-03 21:42 UTC (permalink / raw)
  To: 9fans

On Wed, Mar 03, 2004 at 01:25:29PM -0800, Geoff Collyer wrote:
> CDC, right!  Thanks for the history; I knew Fujitsu didn't sound
> right, but it's been a long time.

Yeah, I was only, ummm, three or four back then.  It was actually my dad's
computer.  He got it from his dad.  I just, like, googled on "wren" while
jamming to Eamon on my iPod and doing my, like, social studies homework so
I'd sound all like l33t and stuff.  CUL8R



* Re: [9fans] i/o error: wrenwrite
  2004-03-03 21:42                 ` splite
@ 2004-03-03 22:48                   ` boyd, rounin
  0 siblings, 0 replies; 41+ messages in thread
From: boyd, rounin @ 2004-03-03 22:48 UTC (permalink / raw)
  To: 9fans

> Yeah, I was only, ummm, three or four back then.  It was actually my dad's
> computer.  ...

    http://www.insultant.net/uk/BP2004/DSC00556.JPG





Thread overview: 41+ messages
2004-03-03  7:57 [9fans] i/o error: wrenwrite David Tolpin
2004-03-03  8:06 ` Fco.J.Ballesteros
2004-03-03  8:10   ` David Tolpin
2004-03-03  8:15     ` Fco.J.Ballesteros
2004-03-03  8:22       ` David Tolpin
2004-03-03  8:34         ` Fco.J.Ballesteros
2004-03-03  8:59           ` David Tolpin
2004-03-03  9:51           ` Geoff Collyer
2004-03-03  9:54             ` David Tolpin
2004-03-03 10:39               ` matt
2004-03-03 10:43                 ` David Tolpin
2004-03-03 11:37                   ` Charles Forsyth
2004-03-03 12:19                     ` David Tolpin
2004-03-03 12:46                   ` boyd, rounin
2004-03-03 12:44                 ` boyd, rounin
2004-03-03 20:35             ` splite
2004-03-03 21:25               ` Geoff Collyer
2004-03-03 21:42                 ` splite
2004-03-03 22:48                   ` boyd, rounin
2004-03-03  9:55     ` Bruce Ellis
2004-03-03 10:00       ` David Tolpin
2004-03-03 10:47         ` Richard Miller
2004-03-03 11:19           ` David Tolpin
2004-03-03 11:25             ` lucio
2004-03-03 11:34               ` David Tolpin
2004-03-03 20:11                 ` splite
2004-03-03 20:25                   ` David Tolpin
2004-03-03 13:41             ` jmk
2004-03-03 13:45               ` David Tolpin
2004-03-03 13:58                 ` C H Forsyth
2004-03-03 13:58                   ` lucio
2004-03-03 14:07                     ` C H Forsyth
2004-03-03 14:04                       ` David Tolpin
2004-03-03 14:14                         ` lucio
2004-03-03 14:23                           ` jmk
2004-03-03 14:24                             ` David Tolpin
2004-03-03 21:34                             ` Exact Eios " David Tolpin
2004-03-03 14:41                         ` Derek Fawcus
2004-03-03  9:15   ` Richard Miller
2004-03-03  9:18     ` David Tolpin
2004-03-03 12:31 ` boyd, rounin
