From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <5a828837c6470dfc75b135b56c038ebe@plan9.bell-labs.com>
From: jmk@plan9.bell-labs.com
To: 9fans@cse.psu.edu
Subject: Re: [9fans] more fossil woes
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Date: Fri, 31 Oct 2003 23:18:12 -0500
Topicbox-Message-UUID: 7b868e18-eacc-11e9-9e20-41e7f4b1d025

I'd say you had something more fundamental wrong, or else you're not
telling the whole story. If you do the 2nd flfmt as described below you
should get a message like

    fs header block already exists; are you sure? [y/n]:

unless you have the '-y' option.

On Fri Oct 31 19:25:36 EST 2003, mirtchov@cpsc.ucalgary.ca wrote:
> I never thought I'd get to that point, but here it is:
>
> Fossil is unable to initialize a partition with flfmt.
>
> Here's the whole story:
>
>
> This morning after successfully checking my email from home I arrived at
> school just to find that fossil has died with the familiar:
>
> assert failed: b->nlock == 1
> fossil 44: suicide: sys: trap: fault read addr=0x0 pc=0x0002b6b7
>
> It was the first crash in a long time, but unfortunately I had no way of
> finding out who/what had caused it, because Plan 9 does not allow me to
> examine process' activity based on utilization of a particular resource.
> (Interestingly enough, when I suggested such "features" are added to the
> system there was an outrage, especially from people who never use Plan 9,
> telling me I'm just polluting the beautiful system :)...
>
> I didn't give much thought to the problem and ran fossil/flchk, which
> surprisingly discovered many more errors than I had thought I had. Here's
> how many blocks it couldn't access anymore (I run a 3-day wide epoch
> window) and had suggested that I bfree:
>
> mirtchov@fbsd$ cat flchk | sed '/^[^b]/d' | wc -l
> 365357
> mirtchov@fbsd$
>
>
> that's 3 gigs of broken data... For comparison my entire venti archive
> weighs in at 1.3GB.
>
> I examined the blocks for any obvious errors and cat them to the fossil
> console, which immediately came back with the somewhat new:
>
> cacheLocalData: addr=7840 type got 16 exp 8: tag got 0 exp 65afd613
> fossil 94: suicide: sys: trap: fault read addr=0x0 pc=0x0002b6b7
>
> A reboot or two later, and I had a running system that was good for checking
> email. Only much later, when I needed to do some real work with Plan 9 did I
> find out that /acme/bin/* was corrupted! It was showing binaries as
> existing, but no file operations could be done on them. At this point I
> decided that it's best to reinit fossil with the latest venti score and just
> forget about it, but fossil thought differently:
>
> plan9# fossil/flfmt -v ff96c3967c7815e15a8a4c09196221b01a8bba3d /dev/sdD0/fossil
> cacheLocalData: addr=7841 type got 16 exp 0: tag got 0 exp 6669fe74
> fossil 90: suicide: sys: trap: fault read addr=0x0 pc=0x0002b6b7
>
> exactly the same happens if I try to format the drive:
>
> plan9# fossil/flfmt /dev/sdD0/fossil
> cacheLocalData: addr=7841 type got 16 exp 0: tag got 0 exp 6669fe74
> fossil 89: suicide: sys: trap: fault read addr=0x0 pc=0x0002b807
>
> for what it's worth, reading and writing from sdD0 work fine...
>
> Anyway, I have another fossil disk that I can boot and with venti's help
> will reinitialize the system. Others may not be that lucky :)
>
> andrey
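
A rough cross-check of the block count and the "3 gigs" figure quoted
above, as a sketch only. It mirrors the sed filter from the quoted
pipeline (which keeps lines starting with 'b' and, as a side effect,
empty lines) and multiplies the count by a block size. The 8192-byte
block size is an assumption; the thread does not state it. The function
names are made up for illustration.

```python
# Sketch: reproduce "sed '/^[^b]/d' | wc -l" and estimate the data size.
# ASSUMPTION: 8192-byte fossil blocks; the block size is not given in the thread.
BLOCK_SIZE = 8192


def count_kept_lines(lines):
    """Count the lines that sed '/^[^b]/d' would keep.

    The sed expression deletes every line whose first character is not 'b',
    so lines starting with 'b' survive, and so do empty lines.
    """
    kept = 0
    for line in lines:
        text = line.rstrip("\n")
        if text == "" or text.startswith("b"):
            kept += 1
    return kept


def estimated_bytes(blocks, block_size=BLOCK_SIZE):
    """Estimate how many bytes the given number of blocks covers."""
    return blocks * block_size


if __name__ == "__main__":
    blocks = 365357  # the count reported in the quoted message
    size = estimated_bytes(blocks)
    print(f"{blocks} blocks x {BLOCK_SIZE} bytes = {size} bytes "
          f"({size / 1e9:.2f} GB)")
```

Under that assumption, 365357 blocks come to 2993004544 bytes, which is
in line with the "3 gigs" in the quoted message.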