From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <46FCC2CF.1060501@gmx.de>
Date: Fri, 28 Sep 2007 11:01:03 +0200
From: Kernel Panic
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] More venti sync woes.
References: <509071940709261232t1046ecv2ce6800d549c180c@mail.gmail.com>
	<509071940709271549l57a778e5k9d0f70e3d3a670be@mail.gmail.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: c69d0124-ead2-11e9-9d60-3106f5b1d025

Russ Cox wrote:

>dma is worth around 10x, certainly less than 50.
>i agree that your venti server is taking a very long
>time to come back. i reboot mine all the time
>and don't have this problem.
>
>i am at a loss for what could be taking it so long.
>it's probably not going to hurt any to stop it.
>it could take forever -- maybe it's looping!

It is...

while(1){
	proc main: kick icache
	work icachewritecoord: start
	proc icachewritecoord: icachewritecoord kick dcache
	work flushproc: start
	proc flushproc: build t=131
	proc flushproc: writeblocks t=991
	proc flushproc: writeblocks.1 t=1632
	proc flushproc: writeblocks.2 t=2296
	proc flushproc: writeblocks.3 t=2944
	proc flushproc: undirty.4 t=3564
	work flushproc: finish
	proc icachewritecoord: kick dcache
	proc icachewritecoord: icachewritecoord kicked dcache
	proc icachewritecoord: icachewritecoord start flush
	proc icachewritecoord: icachedirty enter
	proc icachewritecoord: icachedirty exit
	proc icachewritecoord: icachewritecoord sleep
	proc main: kick icache
}

the main proc loops in icachealloc():

	while(icache.ndirty == icache.entries){
		/*
		 * This is a bit suspect. Kickicache will wake up the
		 * icachewritecoord, but if all the index entries are for
		 * unflushed disk blocks, icachewritecoord won't be
		 * able to do much. It always rewakes everyone when
		 * it thinks it is done, though, so at least we'll go around
		 * the while loop again. Also, if icachewritecoord sees
		 * that the disk state hasn't change at all since the last
		 * time around, it kicks the disk. This needs to be
		 * rethought, but it shouldn't deadlock anymore.
		 */
		kickicache();
		rsleep(&icache.full);
	}

but icache.ndirty never changes... so it hangs forever in "sync..."
because it can't allocate ientries.

>when you manage to boot in other means,
>it would be nice to see what ps -a|grep venti
>says. venti sets its proc args that show up in ps -a
>to tell you what each proc does.
>
>the new venti is very careful both about the
>consistency of what is stored on disk and about
>recovering quickly after a disk failure
>(there's not a lot to do -- just pick up the unindexed
>arena entries from the arena tocs and toss them
>back into the index write buffer where they were
>when you restarted the system).
>
>what you're describing could happen if you were
>running a new venti (which buffers index updates
>quite aggressively) and then on reboot managed
>to start an old venti (which would then process the
>unindexed new blocks one at a time instead of
>buffering the updates, with about 3 seeks per block).
>
>without more information i'm afraid i have no good answers.
>
>russ
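
To make the icachealloc() wait easier to reason about, here is a toy
model of it. This is only a sketch, not venti's code: ToyEntry, ToyCache,
toyfill, toyflush and toyalloc are hypothetical names, and the rule that
an index entry can only be cleaned once its data block is on disk is an
assumption taken from the comment in icachealloc(). The maxrounds limit
exists only so the sketch terminates; the real loop has no such bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in venti. */
enum { NENTRIES = 8 };

typedef struct ToyEntry ToyEntry;
struct ToyEntry {
	bool dirty;		/* index entry not yet written out */
	bool blockondisk;	/* the data block it points at has been flushed */
};

typedef struct ToyCache ToyCache;
struct ToyCache {
	ToyEntry e[NENTRIES];
	int ndirty;
};

/* Mark every entry dirty; blockondisk says whether its data block was flushed. */
void
toyfill(ToyCache *c, bool blockondisk)
{
	int i;

	for(i = 0; i < NENTRIES; i++){
		c->e[i].dirty = true;
		c->e[i].blockondisk = blockondisk;
	}
	c->ndirty = NENTRIES;
}

/*
 * One flush round (assumed rule): only entries whose data block
 * is already on disk can be cleaned.  Returns how many were cleaned.
 */
int
toyflush(ToyCache *c)
{
	int i, n;

	n = 0;
	for(i = 0; i < NENTRIES; i++){
		if(c->e[i].dirty && c->e[i].blockondisk){
			c->e[i].dirty = false;
			c->ndirty--;
			n++;
		}
	}
	return n;
}

/*
 * Same shape as the wait in icachealloc(): while every entry is dirty,
 * kick the flusher and go around again.  Returns the number of rounds
 * needed, or -1 if ndirty is still stuck after maxrounds rounds.
 */
int
toyalloc(ToyCache *c, int maxrounds)
{
	int rounds;

	assert(maxrounds > 0);
	for(rounds = 0; c->ndirty == NENTRIES; rounds++){
		if(rounds == maxrounds)
			return -1;	/* no progress: the real loop would spin here */
		toyflush(c);
	}
	return rounds;
}
```

With every entry dirty and no block on disk, toyalloc(&c, 100) returns -1
and ndirty is unchanged, which is the shape of the trace above. Once the
blocks are marked as on disk, a single round is enough.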