9fans - fans of the OS Plan 9 from Bell Labs
From: cinap_lenrek@gmx.de
To: 9fans@9fans.net
Subject: Re: [9fans] That deadlock, again
Date: Wed, 17 Nov 2010 06:22:33 +0100
Message-ID: <4ee4b7c2dd207e953599fc618df2c456@gmx.de>
In-Reply-To: <1e1e3d7c4781c86aa3a270cecdbaadbb@coraid.com>

[-- Attachment #1: Type: text/plain, Size: 780 bytes --]

qpc is just the caller of the last successfully *acquired* qlock.
what we know is that the exportfs proc spins in the q->use taslock
called by qlock(), right?  this already seems weird...  q->use is held
just long enough to test q->locked and manipulate the queue.  also
sched() will avoid switching to another proc while we are holding tas
locks.

i would like to know which qlock the kernel is trying to acquire
on behalf of exportfs that is also reachable from the etherread4
code.

one could move:

	up->qpc = getcallerpc(&q);

to before the lock(&q->use) in qlock(), so we can see where the
qlock that hangs the exportfs call is called from, or add another magic
debug pointer (qpctry) to the proc structure and print it in dumpaproc().
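
a rough user-space sketch of the second option, to show what the extra
pointer would buy.  everything here is a stand-in: the QLock and Proc
structs are cut down to the fields mentioned above, qpctry is the
hypothetical new field, qlocksketch() is not the kernel's qlock(), and
the caller pc is passed in as an argument instead of coming from
getcallerpc().

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-ins, not the real Plan 9 definitions */
typedef struct QLock QLock;
struct QLock {
	int use;	/* stand-in for the tas lock guarding the queue */
	int locked;
};

typedef struct Proc Proc;
struct Proc {
	void *qpc;	/* caller of the last qlock that was acquired */
	void *qpctry;	/* hypothetical: caller of the last qlock attempted */
};

static Proc proc0;
static Proc *up = &proc0;

/*
 * returns 1 if the lock was acquired, 0 if the caller would have
 * had to queue and sleep (or, in the hang discussed here, spin).
 */
static int
qlocksketch(QLock *q, void *callerpc)
{
	up->qpctry = callerpc;	/* recorded before touching q->use */
	/* the kernel would do lock(&q->use) here and might spin */
	if(q->locked)
		return 0;
	q->locked = 1;
	up->qpc = callerpc;	/* only updated on success */
	return 1;
}

/* small self-check: two fake call sites contend for one lock */
static void
qpctrydemo(void)
{
	QLock q = {0, 0};
	int site1, site2;	/* addresses used as fake caller pcs */

	assert(qlocksketch(&q, &site1) == 1);
	assert(qlocksketch(&q, &site2) == 0);
	assert(up->qpc == &site1);	/* still the old, successful caller */
	assert(up->qpctry == &site2);	/* names the call that got stuck */
}
```

after the failed attempt qpc still names the earlier successful caller,
while qpctry names the call site that got stuck, which is the one
dumpaproc() would need to print.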

--
cinap

[-- Attachment #2: Type: message/rfc822, Size: 3429 bytes --]

From: erik quanstrom <quanstro@labs.coraid.com>
To: lucio@proxima.alt.za, 9fans@9fans.net
Subject: Re: [9fans] That deadlock, again
Date: Tue, 16 Nov 2010 23:18:09 -0500
Message-ID: <1e1e3d7c4781c86aa3a270cecdbaadbb@coraid.com>

> > acid: src(0xf0148c8a)
> > /sys/src/9/ip/tcp.c:2096
> >  2091		if(waserror()){
> >  2092			qunlock(s);
> >  2093			nexterror();
> >  2094		}
> >  2095		qlock(s);
> >>2096		qunlock(tcp);
> >  2097
> >  2098		/* fix up window */
> >  2099		seg.wnd <<= tcb->rcv.scale;
> >  2100
> >  2101		/* every input packet in puts off the keep alive time out */
>
> The source actually says (to be pedantic):
>
> 	/* The rest of the input state machine is run with the control block
> 	 * locked and implements the state machine directly out of the RFC.
> 	 * Out-of-band data is ignored - it was always a bad idea.
> 	 */
> 	tcb = (Tcpctl*)s->ptcl;
> 	if(waserror()){
> 		qunlock(s);
> 		nexterror();
> 	}
> 	qlock(s);
> 	qunlock(tcp);
>
> Now, the qunlock(s) should not precede the qlock(s), this is the first
> case in this procedure:

it doesn't.  the body of the waserror() block can't be executed
before the code following it.  perhaps it could be more carefully
written as

> >  2095		qlock(s);
> >  2091		if(waserror()){
> >  2092			qunlock(s);
> >  2093			nexterror();
> >  2094		}
> >>2096		qunlock(tcp);

but it really wouldn't make any difference.
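
a small user-space sketch of why the order doesn't matter, using
setjmp/longjmp as a rough analogue of waserror()/error().  the names
(errlab, mockqlock, mockqunlock, inputsketch) are made up for
illustration and the kernel's error-label stack is reduced to a single
jmp_buf, so treat it as a model of the control flow only.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf errlab;		/* stand-in for the per-proc error labels */
static int held;		/* 1 while the mock lock is held */
static int handlerran;		/* how many times the error branch ran */
static int heldathandler;	/* value of held when the branch was entered */

static void
mockqlock(void)
{
	held = 1;
}

static void
mockqunlock(void)
{
	held = 0;
}

/* returns the number of times the error branch ran */
static int
inputsketch(int fail)
{
	held = 0;
	handlerran = 0;
	heldathandler = 0;
	if(setjmp(errlab)){	/* plays the role of waserror() */
		/* only reachable through longjmp, i.e. after mockqlock() */
		handlerran++;
		heldathandler = held;
		mockqunlock();
		return handlerran;	/* the kernel would call nexterror() */
	}
	mockqlock();
	if(fail)
		longjmp(errlab, 1);	/* plays the role of error() */
	mockqunlock();
	return handlerran;
}
```

on the first pass setjmp returns 0 and the branch is skipped; the branch
runs only if something after mockqlock() jumps back, so the unlock in it
never precedes the lock.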

i'm not completely convinced that tcp's to blame.
and if it is, i think the problem is probably tcp
timers.

- erik

