From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Fri, 17 Sep 2010 18:12:08 -0400
To: 9fans@9fans.net
Message-ID:
In-Reply-To:
References: <48038affccd3ef00676d162365448522@brasstown.quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Subject: Re: [9fans] interesting deadlock
Topicbox-Message-UUID: 587d1bf8-ead6-11e9-9d60-3106f5b1d025

On Thu, Sep 16, 2010 at 11:02 PM, erik quanstrom wrote:
> i have these processes all deadlocked.  8.out
> is serving /n/mntpt.
>
> xxx        11921346    0:00   0:00      436K Create   8.out
> xxx        11921785    0:00   0:00       24K Open     cat /n/mntpt/sos
> xxx        11921786    0:00   0:00       24K Unmount  unmount /n/mntpt
> xxx        11921787    0:00   0:00       44K Pwrite   echo x y

okay, it's not.  sorry for the confusion.

the reason for looking at that case was that it seemed at the
time related to a crash here on the indirection of mh->mount,
which is nil.  the code in question was a fileserver which was
crashing, getting killed while concurrent io was being done to
the fs.

/sys/src/9/port/chan.c:1009,1019
		if((wq = ewalk(c, nil, names+nhave, ntry)) == nil){
			/* try a union mount, if any */
			if(mh && !nomount){
				/*
				 * mh->mount->to == c, so start at mh->mount->next
				 */
				rlock(&mh->lock);
				for(f = mh->mount->next; f; f = f->next)
					if((wq = ewalk(f->to, nil, names+nhave, ntry)) != nil)
						break;

i don't know why it can't be nil, since the code here doesn't
have a lock.  i think this might be the solution, but i haven't
done a careful lock audit to be sure

/usr/quanstro/src/ysk/port/chan.c:1009,1021
		if((wq = ewalk(c, nil, names+nhave, ntry)) == nil){
			/* try a union mount, if any */
			if(mh && !nomount){
				/*
				 * mh->mount->to == c, so start at mh->mount->next
				 */
>>				f = nil;
				rlock(&mh->lock);
>>				if(mh->mount)
				for(f = mh->mount->next; f; f = f->next)
					if((wq = ewalk(f->to, nil, names+nhave, ntry)) != nil)
						break;
				runlock(&mh->lock);

- erik