* [9fans] i/o error: wrenwrite @ 2004-03-03 7:57 David Tolpin 2004-03-03 8:06 ` Fco.J.Ballesteros 2004-03-03 12:31 ` boyd, rounin 0 siblings, 2 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 7:57 UTC (permalink / raw) To: 9fans Hi, after a power failure, I'm getting repeating lines with I/O error: wrenwrite The file system is kfs. After reboot from the installation CD, fshalt and c-a-d, everything is working. Is it normal? David ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 7:57 [9fans] i/o error: wrenwrite David Tolpin @ 2004-03-03 8:06 ` Fco.J.Ballesteros 2004-03-03 8:10 ` David Tolpin 2004-03-03 9:15 ` Richard Miller 2004-03-03 12:31 ` boyd, rounin 1 sibling, 2 replies; 41+ messages in thread From: Fco.J.Ballesteros @ 2004-03-03 8:06 UTC (permalink / raw) To: 9fans When I've seen wrenwrite complaints, it has always been due to about-to-fail disks. But don't know if that's your case. It was long ago. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:06 ` Fco.J.Ballesteros @ 2004-03-03 8:10 ` David Tolpin 2004-03-03 8:15 ` Fco.J.Ballesteros 2004-03-03 9:55 ` Bruce Ellis 2004-03-03 9:15 ` Richard Miller 1 sibling, 2 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 8:10 UTC (permalink / raw) To: 9fans > Content-Type: text/plain; charset="US-ASCII" > Content-Transfer-Encoding: 7bit > > When I've seen wrenwrite complaints, it has always been due to > about-to-fail disks. But don't know if that's your case. It was long > ago. > Unlikely. FreeBSD runs for several hours under fs stress test without any slightest sign of a problem. On the same hardware. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:10 ` David Tolpin @ 2004-03-03 8:15 ` Fco.J.Ballesteros 2004-03-03 8:22 ` David Tolpin 2004-03-03 9:55 ` Bruce Ellis 1 sibling, 1 reply; 41+ messages in thread From: Fco.J.Ballesteros @ 2004-03-03 8:15 UTC (permalink / raw) To: 9fans Perhaps your disk recovered your bad blocks using spare ones and now there's no problem at all. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:15 ` Fco.J.Ballesteros @ 2004-03-03 8:22 ` David Tolpin 2004-03-03 8:34 ` Fco.J.Ballesteros 0 siblings, 1 reply; 41+ messages in thread From: David Tolpin @ 2004-03-03 8:22 UTC (permalink / raw) To: 9fans > > Perhaps your disk recovered your bad blocks using spare ones and > now there's no problem at all. > 1) what exactly does the word mean? 2) how it depends on rebooting it from a different media? 3) among the messages displayed last time there was one that it cannot open /adm/timezone/local. After the reboot, /adm/timezone/local is where it should and unchanged (that is, my timezone as I put it there). Can it be something with controller state not properly initialized? How exactly should I report my hardware configuration? Why it only happens after a power failure and not during normal work, if it is a hardware problem? I am not a hardware expert. I am just trying to port some programs to Plan9, and use it as a platform -- it was said to be 'finished', that is, usable for work, and I hope it is mature indeed. David Tolpin ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:22 ` David Tolpin @ 2004-03-03 8:34 ` Fco.J.Ballesteros 2004-03-03 8:59 ` David Tolpin 2004-03-03 9:51 ` Geoff Collyer 0 siblings, 2 replies; 41+ messages in thread From: Fco.J.Ballesteros @ 2004-03-03 8:34 UTC (permalink / raw) To: 9fans I am not a hardware expert either. Wren was the name for the disk device in fs(8), and I think kfs inherited the name. It refers to a rw disk or partition. The messages can be understood AFAIK as I/O errors, which usually correspond to broken hardware. Regarding 2, I don't know how that may be. Regarding 3, I know disks that do automatically what you once had to do by hand (declaring some blocks as defective and instructing the disk to use spare ones instead). The only explanation I can find is that your disk detected an error and was able to correct it; but I'm not a hardware expert either. A power failure may cause all this depending on the disk you use, because it may lead to broken disks (although I admit I haven't seen this in a long time). We use Plan 9 here for daily work: it runs a lab for students and our accounts; we write programs and documents on it, read mail, etc. I can indeed say that it's more reliable than Linux, in my experience (which may be different for others, of course). So I'd not be scared to use Plan 9 for daily work; I would be scared to switch back to what I used before. If I may help somehow, let me know. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:34 ` Fco.J.Ballesteros @ 2004-03-03 8:59 ` David Tolpin 2004-03-03 9:51 ` Geoff Collyer 1 sibling, 0 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 8:59 UTC (permalink / raw) To: 9fans How to boot from CD and check filesystem? That is, to mount? I've tried but somehow cannot figure out how to. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:34 ` Fco.J.Ballesteros 2004-03-03 8:59 ` David Tolpin @ 2004-03-03 9:51 ` Geoff Collyer 2004-03-03 9:54 ` David Tolpin 2004-03-03 20:35 ` splite 1 sibling, 2 replies; 41+ messages in thread From: Geoff Collyer @ 2004-03-03 9:51 UTC (permalink / raw) To: 9fans There's no great mystery to the name `wren'; there was a line of SCSI disks at the time (ca. 1990) called Wren I, Wren II, Wren III, etc., I think made by Fujitsu, who earlier made the Eagle, Super-Eagle (a.k.a. the Super-Turkey), Swallow, etc. So `wren' just denotes a SCSI disk. Is your machine on a UPS? If not, all bets are off after a power failure; if so, it shouldn't even notice a short power failure. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 9:51 ` Geoff Collyer @ 2004-03-03 9:54 ` David Tolpin 2004-03-03 10:39 ` matt 2004-03-03 20:35 ` splite 1 sibling, 1 reply; 41+ messages in thread From: David Tolpin @ 2004-03-03 9:54 UTC (permalink / raw) To: 9fans > > Is your machine on a UPS? If not, all bets are off after a power > failure; if so, it shouldn't even notice a short power failure. > My machine is on a UPS, but the power was off for longer than the UPS could hold. Here in Armenia, that is not a rare occurrence. But if I just turn it off with the power button, FreeBSD comes up just fine, even after intensive disk writes just before poweroff; and Plan9 always shows the same behaviour: endless wrenwrite messages on the console, until I reboot from CD and do nothing. This has something to do with the hardware state not being properly initialized. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 9:54 ` David Tolpin @ 2004-03-03 10:39 ` matt 2004-03-03 10:43 ` David Tolpin 2004-03-03 12:44 ` boyd, rounin 0 siblings, 2 replies; 41+ messages in thread From: matt @ 2004-03-03 10:39 UTC (permalink / raw) To: 9fans > But if I just turn it off with power button, FreeBSD comes up just fine after intensive disk writes just before poweroff; I wouldn't rely on that feature if I were you. m ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 10:39 ` matt @ 2004-03-03 10:43 ` David Tolpin 2004-03-03 11:37 ` Charles Forsyth 2004-03-03 12:46 ` boyd, rounin 2004-03-03 12:44 ` boyd, rounin 1 sibling, 2 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 10:43 UTC (permalink / raw) To: 9fans > > But if I just turn it off with power button, FreeBSD comes up just fine after intensive disk writes just before poweroff; > > I wouldn't rely on that feature if I were you. > I do not rely on this feature. I properly shut down all my servers and terminals, and my file server at home has been up for more than a year now. But I cannot allocate a 2KWh UPS to every computer, and sometimes computers lose power. After such an accident, I must be able to repair the file system, either automatically or manually. Plan9 has a bug that puts it into an endless loop on my hardware after a reboot without fshalt. I am asking where to look to fix the bug. David Tolpin ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 10:43 ` David Tolpin @ 2004-03-03 11:37 ` Charles Forsyth 2004-03-03 12:19 ` David Tolpin 2004-03-03 12:46 ` boyd, rounin 1 sibling, 1 reply; 41+ messages in thread From: Charles Forsyth @ 2004-03-03 11:37 UTC (permalink / raw) To: 9fans
>>Plan9 has a bug that puts it into endless loop on my hardware
>>after reboot without fshalt. I am asking where to look at to
>>fix the bug.

quite right. if you can recompile a kernel while the system is up, and run that, you could try adding code to /sys/src/9/pc/sdata.c to atainterrupt (say) to help diagnose the i/o error more precisely. "i/o error" isn't much help as you've discovered. there might already be debugging code that can be switched on. (i did have a very quick look at it myself but i didn't see anything just right for this case, but i might easily have missed it.)

round about here, say:

	if(status & Err)
		drive->error = inb(cmdport+Error);

i suspect the error might happen much earlier, before it gets to generate an interrupt, but that should become apparent if you see no error output from that point. just a print() of the status might do. i suspect the controller ends up in a state where its registers don't look right to the driver, or indeed where the controller needs a bit of a whack from the driver after the power cycling, in which case it probably can't generate IO at all and it's (perhaps) timing out the command.

it's interesting that it's apparently only wrenwrite not wrenread that generates errors (or i missed something). that in itself might help pinpoint the problem.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 11:37 ` Charles Forsyth @ 2004-03-03 12:19 ` David Tolpin 0 siblings, 0 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 12:19 UTC (permalink / raw) To: 9fans
>
> round about here, say:
> 	if(status & Err)
> 		drive->error = inb(cmdport+Error);
>
> i suspect the error might happen much earlier, before it gets to
> generate an interrupt, but that should become apparent if you
> see no error output from that point. just a print() of the status
> might do. i suspect the controller ends up in a state where its
> registers don't look right to the driver, or indeed where the
> controller needs a bit of a whack from the driver after the power cycling,
> in which case it probably can't generate IO at all and it's (perhaps)

1) I have added print("status=%d\n",status), which during normal boot occasionally displays status=81. During the wrenwrite loop that point is not reached.

2) I then added a call to print("atainterrupt\n") at the top of atainterrupt(). The result is that the computer boots with this kernel (after a few minutes, of course) and does not boot with the kernel without that 'brake'.

Does this mean it has something to do with timeouts? I'll continue investigating.
^ permalink raw reply [flat|nested] 41+ messages in thread
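A note on the value reported above: if the 81 printed by %d is decimal, it is 0x51, which under the standard ATA status register layout reads as device ready, seek complete and error. The following is a minimal standalone sketch in ordinary C (not Plan 9 kernel code; the function name ata_status_str and the short bit names are made up for illustration) for turning such a value into bit names:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Standard ATA status register bits (names chosen for this sketch). */
enum {
	Err  = 0x01,	/* error; details are in the error register */
	Idx  = 0x02,	/* index (obsolete) */
	Corr = 0x04,	/* corrected data (obsolete) */
	Drq  = 0x08,	/* data request */
	Dsc  = 0x10,	/* seek complete */
	Df   = 0x20,	/* device fault */
	Drdy = 0x40,	/* device ready */
	Bsy  = 0x80,	/* busy */
};

/* Write the names of the bits set in status into buf, space separated. */
char*
ata_status_str(unsigned status, char *buf, size_t n)
{
	static const struct { unsigned bit; const char *name; } tab[] = {
		{Bsy, "Bsy"}, {Drdy, "Drdy"}, {Df, "Df"}, {Dsc, "Dsc"},
		{Drq, "Drq"}, {Corr, "Corr"}, {Idx, "Idx"}, {Err, "Err"},
	};
	size_t i, len = 0;

	assert(buf != NULL && n > 0);
	buf[0] = '\0';
	for(i = 0; i < sizeof tab / sizeof tab[0]; i++){
		if(!(status & tab[i].bit))
			continue;
		len += snprintf(buf + len, n - len, "%s%s", len ? " " : "", tab[i].name);
		if(len >= n)
			break;	/* output was truncated; stop here */
	}
	return buf;
}
```

For example, ata_status_str(81, buf, sizeof buf) fills buf with "Drdy Dsc Err".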
* Re: [9fans] i/o error: wrenwrite 2004-03-03 10:43 ` David Tolpin 2004-03-03 11:37 ` Charles Forsyth @ 2004-03-03 12:46 ` boyd, rounin 1 sibling, 0 replies; 41+ messages in thread From: boyd, rounin @ 2004-03-03 12:46 UTC (permalink / raw) To: 9fans i have a cunning plan, my lord: a screwdriver, sticky tape, adb, dd and a magtape ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 10:39 ` matt 2004-03-03 10:43 ` David Tolpin @ 2004-03-03 12:44 ` boyd, rounin 1 sibling, 0 replies; 41+ messages in thread From: boyd, rounin @ 2004-03-03 12:44 UTC (permalink / raw) To: 9fans > I wouldn't rely on that feature if I were you. if it's the FFS (apart from the appalling design) there's redundancy and no doubt bad block mapping. it only took 'em 10 years to get it right ... for some value of 'right'. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 9:51 ` Geoff Collyer 2004-03-03 9:54 ` David Tolpin @ 2004-03-03 20:35 ` splite 2004-03-03 21:25 ` Geoff Collyer 1 sibling, 1 reply; 41+ messages in thread From: splite @ 2004-03-03 20:35 UTC (permalink / raw) To: 9fans On Wed, Mar 03, 2004 at 01:51:52AM -0800, Geoff Collyer wrote: > There's no great mystery to the name `wren'; there was a line of SCSI > disks at the time (ca. 1990) called Wren I, Wren II, Wren III, etc., > I think made by Fujitsu Nope, CDC. Later spun off into Imprimis, which was still later bought by Seagate. At one point my home box (OEM 12-slot VME crate that my wife painted a lovely green) had three Wren IVs, one from each company. Hell of a way to get 1GB of storage, but cheap. Later replaced the lot with one Wren 7. None of them ever gave me a day's trouble (unlike the pile of Deathstars we've replaced this year), so I appreciate Plan 9's little "homage". ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 20:35 ` splite @ 2004-03-03 21:25 ` Geoff Collyer 2004-03-03 21:42 ` splite 0 siblings, 1 reply; 41+ messages in thread From: Geoff Collyer @ 2004-03-03 21:25 UTC (permalink / raw) To: 9fans CDC, right! Thanks for the history; I knew Fujitsu didn't sound right, but it's been a long time. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 21:25 ` Geoff Collyer @ 2004-03-03 21:42 ` splite 2004-03-03 22:48 ` boyd, rounin 0 siblings, 1 reply; 41+ messages in thread From: splite @ 2004-03-03 21:42 UTC (permalink / raw) To: 9fans On Wed, Mar 03, 2004 at 01:25:29PM -0800, Geoff Collyer wrote: > CDC, right! Thanks for the history; I knew Fujitsu didn't sound > right, but it's been a long time. Yeah, I was only, ummm, three or four back then. It was actually my dad's computer. He got it from his dad. I just, like, googled on "wren" while jamming to Eamon on my iPod and doing my, like, social studies homework so I'd sound all like l33t and stuff. CUL8R ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 21:42 ` splite @ 2004-03-03 22:48 ` boyd, rounin 0 siblings, 0 replies; 41+ messages in thread From: boyd, rounin @ 2004-03-03 22:48 UTC (permalink / raw) To: 9fans > Yeah, I was only, ummm, three or four back then. It was actually my dad's > computer. ... http://www.insultant.net/uk/BP2004/DSC00556.JPG ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:10 ` David Tolpin 2004-03-03 8:15 ` Fco.J.Ballesteros @ 2004-03-03 9:55 ` Bruce Ellis 2004-03-03 10:00 ` David Tolpin 1 sibling, 1 reply; 41+ messages in thread From: Bruce Ellis @ 2004-03-03 9:55 UTC (permalink / raw) To: 9fans i would love to have an OS that doesn't tell you that the disk is crook as rookie. but call me young and willing. brucee ----- Original Message ----- From: "David Tolpin" <dvd@davidashen.net> To: <9fans@cse.psu.edu> Sent: Wednesday, March 03, 2004 7:10 PM Subject: Re: [9fans] i/o error: wrenwrite > > Content-Type: text/plain; charset="US-ASCII" > > Content-Transfer-Encoding: 7bit > > > > When I've seen wrenwrite complaints, it has always been due to > > about-to-fail disks. But don't know if that's your case. It was long > > ago. > > > > Unlikely. FreeBSD runs for several hours under fs stress test > without any slightest sign of a problem. On the same hardware ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 9:55 ` Bruce Ellis @ 2004-03-03 10:00 ` David Tolpin 2004-03-03 10:47 ` Richard Miller 0 siblings, 1 reply; 41+ messages in thread From: David Tolpin @ 2004-03-03 10:00 UTC (permalink / raw) To: 9fans > i would love to have an OS that doesn't tell you that > the disk is crook as rookie. but call me young > and willing. FreeBSD does report any failures with the disk, but all reports are about filesystem inconsistencies, none about hw failures. Plan9 is not telling that the disk has a failure. It falls in an endless loop leaving me no chance to take an action. I am just asking what can I do to pin the bug down. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 10:00 ` David Tolpin @ 2004-03-03 10:47 ` Richard Miller 2004-03-03 11:19 ` David Tolpin 0 siblings, 1 reply; 41+ messages in thread From: Richard Miller @ 2004-03-03 10:47 UTC (permalink / raw) To: 9fans
> How to boot from CD and check filesystem?

If you've booted Plan 9 from the Plan 9 distribution CD, you should be able to do:

	disk/kfs -f/dev/sdC0/fs
	disk/kfscmd 'check r'

Replace "sdC0" with the name of your kfs disk device.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 10:47 ` Richard Miller @ 2004-03-03 11:19 ` David Tolpin 2004-03-03 11:25 ` lucio 2004-03-03 13:41 ` jmk 0 siblings, 2 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 11:19 UTC (permalink / raw) To: 9fans > > How to boot from CD and check filesystem? > > If you've booted Plan 9 from the Plan 9 distribution CD, you should > be able to do: > > disk/kfs -f/dev/sdC0/fs > disk/kfscmd 'check r' I had tried that before, and probably misspelled something. Worked this time. Does not report any errors, just finishes smoothly. Powercycling -> endless wrenwrite i/o error messages -> booting from CD -> 'check r' no errors rebooting from sdC0 normal. Just booting from CD and immediately rebooting helps as well. What can be the cause? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 11:19 ` David Tolpin @ 2004-03-03 11:25 ` lucio 2004-03-03 11:34 ` David Tolpin 2004-03-03 13:41 ` jmk 1 sibling, 1 reply; 41+ messages in thread From: lucio @ 2004-03-03 11:25 UTC (permalink / raw) To: 9fans > What can be the cause? You'll have to tell us something about your hardware. As you have no doubt gathered by now, no one here has experienced the exact problem you encountered. In my experience, the best person to deal with it is Jim McKie (sp?) <jmk@plan9.bell-labs.com>. But it may be something simple or something, once investigated, that others with knowledge of the hardware understand better. Would it be SCSI drives you're dealing with? ++L ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 11:25 ` lucio @ 2004-03-03 11:34 ` David Tolpin 2004-03-03 20:11 ` splite 0 siblings, 1 reply; 41+ messages in thread From: David Tolpin @ 2004-03-03 11:34 UTC (permalink / raw) To: 9fans
> Jim McKie (sp?) <jmk@plan9.bell-labs.com>. But it may be something
> simple or something, once investigated, that others with knowledge of
> the hardware understand better.
>
> Would it be SCSI drives you're dealing with?

No. It is IDE.

cpu% cat /dev/sdC0/ctl
inquiry IBM-DTLA-307015
config 045A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 16
geometry 29336832 512 16383 16 63
cpu% cat /dev/kmesg
cpu0: 798MHz GenuineIntel PentiumIII/Xeon (cpuid: AX 0x0683 DX 0x383FBFF)
ELCR: 0C20
#l0: i82557: 10Mbps port 0x1400 irq 5: 0002A5561D54

Turning dma or rwm on or off before powercycling does not change the picture.

David
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 11:34 ` David Tolpin @ 2004-03-03 20:11 ` splite 2004-03-03 20:25 ` David Tolpin 0 siblings, 1 reply; 41+ messages in thread From: splite @ 2004-03-03 20:11 UTC (permalink / raw) To: 9fans On Wed, Mar 03, 2004 at 03:34:38PM +0400, David Tolpin wrote: > > cpu% cat /dev/sdC0/ctl > inquiry IBM-DTLA-307015 > config 045A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 16 > geometry 29336832 512 16383 16 63 Oh my, the dread Deathstar 75GXP. Before you go any further, I'd recommend downloading Hitachi's (formerly IBM's) Drive Fitness Test floppy image and run its "advanced" drive test a few times. Your disk may be about to go toes-up. http://www.hgst.com/hdd/support/download.htm (I understand that it's not giving you trouble under FreeBSD, but it could be that FreeBSD is masking or oblivious to a drive problem.) ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 20:11 ` splite @ 2004-03-03 20:25 ` David Tolpin 0 siblings, 0 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 20:25 UTC (permalink / raw) To: 9fans > > Oh my, the dread Deathstar 75GXP. > > Before you go any further, I'd recommend downloading Hitachi's (formerly > IBM's) Drive Fitness Test floppy image and run its "advanced" drive test > a few times. Your disk may be about to go toes-up. > > http://www.hgst.com/hdd/support/download.htm Thank you, of course I will. But for me the problem is not the disk fails (in case it does); but that the diagnostics is problematic. Anyway, I am going to test the disk and debug the kernel. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 11:19 ` David Tolpin 2004-03-03 11:25 ` lucio @ 2004-03-03 13:41 ` jmk 2004-03-03 13:45 ` David Tolpin 1 sibling, 1 reply; 41+ messages in thread From: jmk @ 2004-03-03 13:41 UTC (permalink / raw) To: 9fans On Wed Mar 3 06:22:36 EST 2004, dvd@davidashen.net wrote: > > > How to boot from CD and check filesystem? > > > > If you've booted Plan 9 from the Plan 9 distribution CD, you should > > be able to do: > > > > disk/kfs -f/dev/sdC0/fs > > disk/kfscmd 'check r' > > I had tried that before, and probably misspelled something. Worked > this time. Does not report any errors, just finishes smoothly. > > Powercycling -> endless wrenwrite i/o error messages -> > booting from CD -> 'check r' no errors > rebooting from sdC0 normal. > > Just booting from CD and immediately rebooting helps as well. > > What can be the cause? What type of discs are they, ATA, SCSI, etc? Are they set to spin up automatically when power is applied? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 13:41 ` jmk @ 2004-03-03 13:45 ` David Tolpin 2004-03-03 13:58 ` C H Forsyth 0 siblings, 1 reply; 41+ messages in thread From: David Tolpin @ 2004-03-03 13:45 UTC (permalink / raw) To: 9fans
> What type of discs are they, ATA, SCSI, etc?

ATA.

cpu% cat /dev/sdC0/ctl
inquiry IBM-DTLA-307015
config 045A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 16
geometry 29336832 512 16383 16 63

> Are they set to spin up automatically when power is applied?

I do not know. What are other options if I boot from it? I am booting, see kfs...version...time... and then wrenwrite: i/o error many times. If I put enough debugging output into the kernel's atainterrupt, the kernel is able to boot.

David
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 13:45 ` David Tolpin @ 2004-03-03 13:58 ` C H Forsyth 2004-03-03 13:58 ` lucio 0 siblings, 1 reply; 41+ messages in thread From: C H Forsyth @ 2004-03-03 13:58 UTC (permalink / raw) To: 9fans >> Are they set to spin up automatically when power is applied? >I do not know. What are other options if I boot from it? it's usually selected by a jumper on the drive itself. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 13:58 ` C H Forsyth @ 2004-03-03 13:58 ` lucio 2004-03-03 14:07 ` C H Forsyth 0 siblings, 1 reply; 41+ messages in thread From: lucio @ 2004-03-03 13:58 UTC (permalink / raw) To: 9fans >>> Are they set to spin up automatically when power is applied? >>I do not know. What are other options if I boot from it? > > it's usually selected by a jumper on the drive itself. But not the average ATA, at least not the ones I'm familiar with. Plus David makes it clear the system has booted at the time the error occurs, so unless the drive is actually _spun_down_... I imagine though it's the first write, although, again, you won't get the progress messages in that sequence unless the disk has been checked and, in the case of a power failure, it usually takes a while to get past the "check". ++L ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 13:58 ` lucio @ 2004-03-03 14:07 ` C H Forsyth 2004-03-03 14:04 ` David Tolpin 0 siblings, 1 reply; 41+ messages in thread From: C H Forsyth @ 2004-03-03 14:07 UTC (permalink / raw) To: 9fans >>But not the average ATA, at least not the ones I'm familiar with. sorry, my mistake. i work too much with SCSI drives ... ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 14:07 ` C H Forsyth @ 2004-03-03 14:04 ` David Tolpin 2004-03-03 14:14 ` lucio 2004-03-03 14:41 ` Derek Fawcus 0 siblings, 2 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 14:04 UTC (permalink / raw) To: 9fans > >>But not the average ATA, at least not the ones I'm familiar with. > > sorry, my mistake. i work too much with SCSI drives ... > That was my thought too. SCSI drives can be started by a command; but ATA drives start when powered, don't they? But on the assumption that I am wrong, I've opened the box, unplugged the data cable and turned the power on. It started instantly. David ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 14:04 ` David Tolpin @ 2004-03-03 14:14 ` lucio 2004-03-03 14:23 ` jmk 2004-03-03 14:41 ` Derek Fawcus 1 sibling, 1 reply; 41+ messages in thread From: lucio @ 2004-03-03 14:14 UTC (permalink / raw) To: 9fans > That was my thought too. SCSI drives can be started by a command; > but ATA drives start when powered, aren't way? But in assumption > that I am wrong I've opened the box, unplugged the data cable and > turned power on. It started instantly. There is a power-saving mode, and they do stop turning. So I presume there is a command to start them up again. But I don't think that's it. I think you should dump as much of the status of the drive controller as the device driver allows at the first encounter with the error; somebody here (Christopher Nielsen, I seem to remember, did the 48-bit extensions to ATA) may be able to identify what is going out of kilter. What might also help would be to try an old kernel, say a year old, and see if it has the same symptoms. ++L ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 14:14 ` lucio @ 2004-03-03 14:23 ` jmk 2004-03-03 14:24 ` David Tolpin 2004-03-03 21:34 ` Exact Eios " David Tolpin 0 siblings, 2 replies; 41+ messages in thread From: jmk @ 2004-03-03 14:23 UTC (permalink / raw) To: 9fans If you're putting in debugging code, try putting it in port/devsd.c:sdbio() to figure out which of the Eio's is the problem. There's code there for dealing with initial drive conditions, etc. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 14:23 ` jmk @ 2004-03-03 14:24 ` David Tolpin 2004-03-03 21:34 ` Exact Eios " David Tolpin 1 sibling, 0 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 14:24 UTC (permalink / raw) To: 9fans > If you're putting in debugging code, try putting it in port/devsd.c:sdbio() > to figure out which of the Eio's is the problem. There's code there for dealing > with initial drive conditions, etc. > I will, many thanks. In a few hours; I have to go now. The problem is that I need more time to understand how the parts are interrelated in the kernel. The problem goes away when debugging slows down the process, so I cannot just dump everything. David ^ permalink raw reply [flat|nested] 41+ messages in thread
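One common way around debugging output that changes the timing is to record events in memory and print them only afterwards. A minimal standalone sketch in ordinary C, assuming a single writer and no interrupt safety (the names trace, tracedump and Ntrace are hypothetical, not existing kernel functions):

```c
#include <assert.h>
#include <stdio.h>

enum { Ntrace = 64 };	/* number of records kept; must be a power of two */

typedef struct Trace Trace;
struct Trace {
	unsigned tag;	/* caller-chosen number saying where the record was made */
	unsigned val;	/* value of interest, e.g. a status register */
};

static Trace tracebuf[Ntrace];
static unsigned ntrace;	/* total number of records ever written */

/* Record one event: two stores and an increment, no printing. */
void
trace(unsigned tag, unsigned val)
{
	Trace *t = &tracebuf[ntrace++ & (Ntrace-1)];

	t->tag = tag;
	t->val = val;
}

/* Print the surviving records, oldest first; return how many were printed. */
unsigned
tracedump(FILE *f)
{
	unsigned i, first, n;

	assert(f != NULL);
	n = ntrace < Ntrace ? ntrace : Ntrace;
	first = ntrace - n;
	for(i = 0; i < n; i++){
		Trace *t = &tracebuf[(first + i) & (Ntrace-1)];
		fprintf(f, "%u: tag=%u val=%#x\n", first + i, t->tag, t->val);
	}
	return n;
}
```

The idea is to call trace() at the points of interest and tracedump() once, after the failure has been seen, so that the slow output happens outside the timing-sensitive path. Older records are overwritten once more than Ntrace have been written.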
* Exact Eios Re: [9fans] i/o error: wrenwrite 2004-03-03 14:23 ` jmk 2004-03-03 14:24 ` David Tolpin @ 2004-03-03 21:34 ` David Tolpin 1 sibling, 0 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 21:34 UTC (permalink / raw) To: 9fans; +Cc: jmk
The exact places where Eio is reported in sdbio are:

almost all occurrences: devsd.c:800

	l = unit->dev->ifc->bio(unit, 0, 0, b, nb, bno);
	if(l < 0) {
		error(Eio);

few occurrences: devsd.c:820

	l = unit->dev->ifc->bio(unit, 0, 1, b, nb, bno);
	if(l < 0) {
		error(Eio);

Is it a disk failure? Hitachi's test suite says the disk is good. Would it make sense to make error reporting more detailed?

David Tolpin
^ permalink raw reply [flat|nested] 41+ messages in thread
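On making the report more detailed: one option would be to include the direction, block number and count in the message. A minimal sketch in ordinary C, assuming the third argument of bio() in the calls quoted above is the write flag, as the 0 and 1 suggest; bio_errmsg is a hypothetical helper and not part of devsd.c:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a message that says which transfer
 * failed instead of a bare "i/o error".
 * write: 0 for a read, non-zero for a write (assumed meaning).
 * bno, nb: starting block and block count of the transfer.
 * ret: the negative value returned by the driver.
 * Returns the length snprintf reports for the full message.
 */
int
bio_errmsg(char *buf, size_t n, int write, unsigned long long bno, long nb, long ret)
{
	assert(buf != NULL && n > 0);
	return snprintf(buf, n, "i/o error: %s bno=%llu nb=%ld ret=%ld",
		write ? "write" : "read", bno, nb, ret);
}
```

With such a message, the two sites could be told apart from the console output alone, and the failing block number could be compared across reboots.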
* Re: [9fans] i/o error: wrenwrite 2004-03-03 14:04 ` David Tolpin 2004-03-03 14:14 ` lucio @ 2004-03-03 14:41 ` Derek Fawcus 1 sibling, 0 replies; 41+ messages in thread From: Derek Fawcus @ 2004-03-03 14:41 UTC (permalink / raw) To: 9fans On Wed, Mar 03, 2004 at 06:04:38PM +0400, David Tolpin wrote: > > That was my thought too. SCSI drives can be started by a command; > but ATA drives start when powered, aren't way? I believe there is a config option (command) that will change ATA disks such that they subsequently require a command to start. i.e. they power up in the powersave mode, and need the normal 'leave power save mode' command to get them going. Though I also believe the command (feature set subset if I recall) is optional anyway. (I spent too long working on ATA disk drivers ~ 8 years ago) DF ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 8:06 ` Fco.J.Ballesteros 2004-03-03 8:10 ` David Tolpin @ 2004-03-03 9:15 ` Richard Miller 2004-03-03 9:18 ` David Tolpin 1 sibling, 1 reply; 41+ messages in thread From: Richard Miller @ 2004-03-03 9:15 UTC (permalink / raw) To: 9fans > When I've seen wrenwrite complaints, it has always been due to > about-to-fail disks. An "i/o error" report from wrenwrite is not necessarily a hardware problem. It may be a failed seek because a block address in the filesystem has been corrupted. This should be repairable (with some loss of data) by kfscmd chk. -- Richard ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 9:15 ` Richard Miller @ 2004-03-03 9:18 ` David Tolpin 0 siblings, 0 replies; 41+ messages in thread From: David Tolpin @ 2004-03-03 9:18 UTC (permalink / raw) To: 9fans > > When I've seen wrenwrite complaints, it has always been due to > > about-to-fail disks. > > An "i/o error" report from wrenwrite is not necessarily a hardware > problem. It may be a failed seek because a block address in the > filesystem has been corrupted. This should be repairable (with some > loss of data) by kfscmd chk. I don't get a chance to run kfscmd check since wrenwrite are endlessly repeating each time I reboot system. And when I just reboot from CD, and then c-a-d to boot from the hard disk, everything is normal. I can reproduce it easily by pressing power button. It is something to do with the state of hardware not properly initialized. Is there a way to get more debugging information? David Tolpin ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [9fans] i/o error: wrenwrite 2004-03-03 7:57 [9fans] i/o error: wrenwrite David Tolpin 2004-03-03 8:06 ` Fco.J.Ballesteros @ 2004-03-03 12:31 ` boyd, rounin 1 sibling, 0 replies; 41+ messages in thread From: boyd, rounin @ 2004-03-03 12:31 UTC (permalink / raw) To: 9fans > after a power failure, I'm getting repeating lines with > I/O error: wrenwrite http://www.insultant.net/uk/BP2004/DSC00558.JPG btw: the BP shots are un-retouched, just rotated. ^ permalink raw reply [flat|nested] 41+ messages in thread