From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Mon, 15 Nov 2010 23:40:46 -0500
To: 9fans@9fans.net
Message-ID: <51e87437b774890c36956be747be653c@brasstown.quanstro.net>
In-Reply-To:
References:
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] That deadlock, again
Topicbox-Message-UUID: 8158784c-ead6-11e9-9d60-3106f5b1d025

On Mon Nov 15 23:23:12 EST 2010, lucio@proxima.alt.za wrote:
> Regarding the "deadlock" report that I occasionally see on my CPU
> server console, I won't bore anyone with PC addresses or anything like
> that, but I will recommend something I believe to be a possible
> trigger: the failure always seems to occur within "exportfs", which in
> this case is used exclusively to run stats(1) remotely from my
> workstation. So the recommendation is that somebody like Erik, who is
> infinitely more clued up than I am in the kernel arcana should run one
> or more stats sessions into a cpu server (I happen to be running
> fossil, so maybe Erik won't see this) and see if he can also trigger
> this behaviour. I'm hoping that it is not platform specific.
>
> Right now, I'm short of skills as well as a serial console :-(

i run stats all the time. i've never seen a lock loop caused
by stats.

exportfs gets blamed all the time for the sins of others.
possible culprits are the tcp/ip stack, the kernel devices
that stats accesses and, of course, the channel code itself.

it would be a good idea for you to track down all the pcs
involved and send them along. i can't think of another way
of narrowing down the list of potential suspects. not all of
our usual suspects have an alibi.

i assume you've fixed this? (not yet fixed on sources.)

/n/sources/plan9//sys/src/9/port/chan.c:1012,1018 - chan.c:1012,1020
  			/*
  			 * mh->mount->to == c, so start at mh->mount->next
  			 */
+ 			f = nil;
  			rlock(&mh->lock);
+ 			if(mh->mount)
  			for(f = mh->mount->next; f; f = f->next)
  				if((wq = ewalk(f->to, nil, names+nhave, ntry)) != nil)
  					break;

- erik