From mboxrd@z Thu Jan 1 00:00:00 1970
From: andrey mirtchovski
To: 9fans@cse.psu.edu
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [9fans] more fossil woes
Date: Fri, 31 Oct 2003 17:24:25 -0700
Topicbox-Message-UUID: 7b7db7ca-eacc-11e9-9e20-41e7f4b1d025

I never thought I'd get to that point, but here it is: Fossil is unable to initialize a partition with flfmt. Here's the whole story:

This morning, after successfully checking my email from home, I arrived at school only to find that fossil had died with the familiar:

    assert failed: b->nlock == 1
    fossil 44: suicide: sys: trap: fault read addr=0x0 pc=0x0002b6b7

It was the first crash in a long time, but unfortunately I had no way of finding out who/what had caused it, because Plan 9 does not allow me to examine processes' activity based on utilization of a particular resource. (Interestingly enough, when I suggested such "features" be added to the system there was an outrage, especially from people who never use Plan 9, telling me I'm just polluting the beautiful system :)...

I didn't give much thought to the problem and ran fossil/flchk, which, surprisingly, discovered many more errors than I had expected. Here's how many blocks it could no longer access (I run a 3-day wide epoch window) and suggested that I bfree:

    mirtchov@fbsd$ cat flchk | sed '/^[^b]/d' | wc -l
    365357
    mirtchov@fbsd$

that's 3 gigs of broken data... For comparison, my entire venti archive weighs in at 1.3GB.

I examined the blocks for any obvious errors and cat'ed them to the fossil console, which immediately came back with the somewhat new:

    cacheLocalData: addr=7840 type got 16 exp 8: tag got 0 exp 65afd613
    fossil 94: suicide: sys: trap: fault read addr=0x0 pc=0x0002b6b7

A reboot or two later, I had a running system that was good for checking email. Only much later, when I needed to do some real work with Plan 9, did I find out that /acme/bin/* was corrupted!
It was showing binaries as existing, but no file operations could be done on them.

At this point I decided that it was best to reinit fossil with the latest venti score and just forget about it, but fossil thought differently:

    plan9# fossil/flfmt -v ff96c3967c7815e15a8a4c09196221b01a8bba3d /dev/sdD0/fossil
    cacheLocalData: addr=7841 type got 16 exp 0: tag got 0 exp 6669fe74
    fossil 90: suicide: sys: trap: fault read addr=0x0 pc=0x0002b6b7

Exactly the same happens if I try to format the drive:

    plan9# fossil/flfmt /dev/sdD0/fossil
    cacheLocalData: addr=7841 type got 16 exp 0: tag got 0 exp 6669fe74
    fossil 89: suicide: sys: trap: fault read addr=0x0 pc=0x0002b807

For what it's worth, reading from and writing to sdD0 work fine...

Anyway, I have another fossil disk that I can boot, and with venti's help I will reinitialize the system. Others may not be that lucky :)

andrey