9fans - fans of the OS Plan 9 from Bell Labs
From: Christopher Nielsen <cnielsen@pobox.com>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] venti+fossil woes
Date: Tue, 18 Nov 2003 04:40:23 -0800
Message-ID: <20031118124023.GF65844@cassie.foobarbaz.net>
In-Reply-To: <20031116013757.GO834@cassie.foobarbaz.net>

Here's an update for anyone interested, since I can't
manage to get to sleep for some reason.

I bought some better-quality ATA cables yesterday. That
helped to the point that I thought my troubles were over.
No such luck.

Now, whenever a venti arena fills up and is in the
process of being sealed, the screen fills with IBsy+
repeated ad infinitum, which I know comes from the ata
driver. Eventually, fossil gives an error from
diskReadRaw() saying something like:

archive(0, <block addr>): cannot find block: i/o error

followed by a dump that I presume could be useful for
diagnostics.

My guess is that there is so much contention in the
controller that reads, and sometimes writes, time out.
This eventually causes fossil to just fall over dead. At
that point, I reboot from a CD, run venti/checkarenas -vf
on the arena partition, and then reboot so that fossil can
continue where it left off with the snapshot.

Wash, rinse, repeat.

Anyway, the saga continues. We'll see if I end up losing
data. I'm still guessing not. My only comment is that it
would be nice if fossil would handle such error conditions
more gracefully.
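
As an illustration of what more graceful handling could
look like, here is a minimal sketch of retrying a transient
read error a bounded number of times, with a growing delay,
before giving up. It is written in Python rather than
fossil's C, and it is not how fossil works: the names
read_with_retry and read_block, the attempt count, and the
delays are all hypothetical choices made for the example.

```python
import time


def read_with_retry(read_block, addr, attempts=4, base_delay=0.05,
                    sleep=time.sleep):
    """Call read_block(addr), retrying on OSError with exponential backoff.

    Hypothetical helper: read_block stands in for a raw disk read such as
    the one behind fossil's diskReadRaw(); it is not fossil's interface.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return read_block(addr)
        except OSError:
            if attempt == attempts:
                # Out of retries: let the caller see the i/o error.
                raise
            # Give a contended controller time to settle, then try again.
            sleep(delay)
            delay *= 2
```

A caller would wrap its raw read in this helper and still
has to deal with the final OSError; the retry only papers
over timeouts that clear on their own, not a failing disk.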

Regardless, I am going to dig around for another ata
controller to spread the disks across.

On Sat, Nov 15, 2003 at 05:37:57PM -0800, Christopher Nielsen wrote:
> this is looking more and more like it was a hardware
> problem. reseating all the connections eliminated most
> of the errors i was seeing. now i am getting errors
> from diskRawWrite, which leads me to believe that one
> of the disks is going bad. i can't really tell which
> one, though. the error message from diskRawWrite gives
> some diagnostic info, but i don't know how to interpret
> it. admittedly, i haven't dived into the source as much
> as i could, but maybe someone can provide some insight
> before i go ahead and do that.
>
> thanks to everyone that has provided input so far.
>
> i have to say, it doesn't look like i'm going to lose
> any data. it's not certain yet, but it's looking good.
> the paranoia in fossil and venti are good.
>
> On Fri, Nov 14, 2003 at 03:18:42PM -0800, Christopher Nielsen wrote:
> > fossil crashed in the middle of an archival snapshot.
> > now, i'm getting
> >
> > err 4: no space left in arenas
> > failed to write lump for <vac score>: no space left in arenas
> >
> > there's plenty of space left in the arenas. a whole other
> > 167G disc, in fact.
> >
> > i've run venti/checkarenas and venti/checkindex to fix any
> > inconsistencies. they were both successful according to the
> > output.
> >
> > any ideas about what is going on and how to fix it?
> >
> > also, is there any way to tell fossil to stop trying to do
> > the snapshot?
>
> --
> Christopher Nielsen
> "They who can give up essential liberty for temporary
> safety, deserve neither liberty nor safety." --Benjamin Franklin

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin

