From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Tue, 16 Nov 2010 23:18:09 -0500
To: lucio@proxima.alt.za, 9fans@9fans.net
Message-ID: <1e1e3d7c4781c86aa3a270cecdbaadbb@coraid.com>
In-Reply-To: <10e606b8715d8e2c9fda5768466036ca@proxima.alt.za>
References: <10e606b8715d8e2c9fda5768466036ca@proxima.alt.za>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] That deadlock, again
Topicbox-Message-UUID: 83880dc6-ead6-11e9-9d60-3106f5b1d025

> > acid: src(0xf0148c8a)
> > /sys/src/9/ip/tcp.c:2096
> >  2091		if(waserror()){
> >  2092			qunlock(s);
> >  2093			nexterror();
> >  2094		}
> >  2095		qlock(s);
> >>2096		qunlock(tcp);
> >  2097
> >  2098		/* fix up window */
> >  2099		seg.wnd <<= tcb->rcv.scale;
> >  2100
> >  2101		/* every input packet in puts off the keep alive time out */
>
> The source actually says (to be pedantic):
>
> 	/* The rest of the input state machine is run with the control block
> 	 * locked and implements the state machine directly out of the RFC.
> 	 * Out-of-band data is ignored - it was always a bad idea.
> 	 */
> 	tcb = (Tcpctl*)s->ptcl;
> 	if(waserror()){
> 		qunlock(s);
> 		nexterror();
> 	}
> 	qlock(s);
> 	qunlock(tcp);
>
> Now, the qunlock(s) should not precede the qlock(s), this is the first
> case in this procedure:

it doesn't.  waserror() can't be executed before the code
following it.  perhaps it could be more carefully written
as

> >  2095		qlock(s);
> >  2091		if(waserror()){
> >  2092			qunlock(s);
> >  2093			nexterror();
> >  2094		}
> >>2096		qunlock(tcp);

but it really wouldn't make any difference.

i'm not completely convinced that tcp's to blame.
and if it is, i think the problem is probably tcp
timers.

- erik
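
The ordering point above can be illustrated with a small user-space sketch. It assumes waserror() behaves like setjmp(): it returns zero when the label is first set, and the body of the if runs only after a later error. The names fake_qlock, fake_qunlock, fake_error and input are hypothetical stand-ins, not the Plan 9 kernel routines, and the single jmp_buf replaces the kernel's per-process error label stack.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-ins for the kernel primitives; not the Plan 9 code. */
static jmp_buf errlab;		/* one error label; the kernel keeps a stack */
static int locked;		/* 1 while the fake lock is held */
static int bad_unlocks;		/* unlocks of a lock that was not held */

static void
fake_qlock(void)
{
	assert(!locked);	/* a second lock here would block forever */
	locked = 1;
}

static void
fake_qunlock(void)
{
	if(!locked)
		bad_unlocks++;
	locked = 0;
}

/* Assumed analogue of error(): jump back to the saved label. */
static void
fake_error(void)
{
	longjmp(errlab, 1);
}

/*
 * Same ordering as the quoted tcp.c fragment: the handler is set up
 * first, the lock is taken second.  Returns -1 if the error path ran.
 */
static int
input(int fail)
{
	if(setjmp(errlab)){	/* stands in for waserror() */
		fake_qunlock();	/* reached only via fake_error() below */
		return -1;	/* the kernel code calls nexterror() here */
	}
	fake_qlock();
	if(fail)
		fake_error();
	fake_qunlock();
	return 0;
}
```

Calling input(0) and then input(1) from a test harness returns 0 and -1, and bad_unlocks stays at 0: in this sketch the unlock in the handler never runs before the lock has been taken, whichever of the two is written first.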