[9fans] /net panic

9fans - fans of the OS Plan 9 from Bell Labs

* [9fans] /net panic
@ 2008-02-15  9:20 sqweek
  2008-02-15 15:19 ` Iruata Souza
  2008-02-15 16:23 ` erik quanstrom
  0 siblings, 2 replies; 6+ messages in thread
From: sqweek @ 2008-02-15  9:20 UTC
  To: 9fans

 muzgo (from irc) was playing around with /net in qemu and came across this gem:

on drawterm:
cpu% cd /net/tcp
cpu% cat clone
23cpu% cd 23
cpu% echo connect 10.0.2.1!12345 >ctl
cpu% cat status
Finwait2 qin 0 qout 0 srtt 0 mdev 0 cwin 1461 swin 32850>>0 rwin 65535>>0 timer.start 10 timer.count 10 rerecv 0 katimer.start 200 katimer.count 159
cpu% echo connect 10.0.2.1!12345 >ctl
cpu%

this causes CPU server to reboot with:
panic: timerstate1
panic: timerstate1
dumpstack disabled
cpu0 exiting

 The usage of /net is invalid, but you wouldn't really expect that to
reboot the machine (or maybe it's a holdover from before /dev/reboot
existed? ;) ).
 Just tried it on my cpu server and got the same panic, so we can rule
qemu out. I adjusted the ip!port to something that would accept my
connection, and my status was something like Timedwait rather than
Finwait2. Second time around I skipped the cat status and still hit
the panic so the status read isn't affecting things (which is probably
blindingly obvious to anyone familiar with the /net code, but oh
well).

 I appear to be running a realtek 8169 nic:
#l0: rtl8169: 100Mbps port 0xE400 irq 11: 000aeb2ff32c

 Don't know what muzgo was using in qemu, but let me know if I can
provide any useful information.
-sqweek


* Re: [9fans] /net panic
  2008-02-15  9:20 [9fans] /net panic sqweek
@ 2008-02-15 15:19 ` Iruata Souza
  2008-02-15 16:10   ` Iruata Souza
  2008-02-15 16:23 ` erik quanstrom
  1 sibling, 1 reply; 6+ messages in thread
From: Iruata Souza @ 2008-02-15 15:19 UTC
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Feb 15, 2008 at 7:20 AM, sqweek <sqweek@gmail.com> wrote:
>   Don't know what muzgo was using in qemu, but let me know if I can
>  provide any useful information.
>  -sqweek
>

no *strange* things running, i guess.

iru



* Re: [9fans] /net panic
  2008-02-15 15:19 ` Iruata Souza
@ 2008-02-15 16:10   ` Iruata Souza
  0 siblings, 0 replies; 6+ messages in thread
From: Iruata Souza @ 2008-02-15 16:10 UTC
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Feb 15, 2008 at 1:19 PM, Iruata Souza <iru.muzgo@gmail.com> wrote:
>  no *strange* things running, i guess.
>
>  iru
>

some info about my environment is at http://iru.oitobits.net/9netpanic/ where:
QEMU_Plan9 - sh script to run the emulated Plan 9
cpuemu_config.tgz - CPU server's /cfg/cpuemu
qemu-ifup - sh script to bring up the host<->guest tun interfaces; Plan 9 gets tun0
ifconfig.tun0 - tun interface configuration
listen.c - listener running on the host
panic - complete scenario of the snippet pasted by sqweek

iru



* Re: [9fans] /net panic
  2008-02-15  9:20 [9fans] /net panic sqweek
  2008-02-15 15:19 ` Iruata Souza
@ 2008-02-15 16:23 ` erik quanstrom
  2008-02-15 17:29   ` Iruata Souza
  1 sibling, 1 reply; 6+ messages in thread
From: erik quanstrom @ 2008-02-15 16:23 UTC
  To: 9fans

i'm not sure this is a perfect solution.  i just don't have enough
of the plan 9 ip stack loaded into cache to be sure nothing's
been forgotten.  but give this patch a whirl.  basically, i think
the problem is that inittcpctl() was stepping on timers that might
have been active.  these timers need to be shut down.  unfortunately,
tcpclose() and localclose() are too aggressive.  cleanupconnection()
is a chopped-down version of localclose().

- erik

/n/sources/plan9//sys/src/9/ip/tcp.c:782,787 - tcp.c:782,813
  	return mtu;
  }

+ static void
+ cleanupconnection(Conv *s)
+ {
+ 	Tcpctl *tcb;
+ 	Reseq *rp,*rp1;
+ 	Tcppriv *tpriv;
+
+ 	tpriv = s->p->priv;
+ 	tcb = (Tcpctl*)s->ptcl;
+
+ 	iphtrem(&tpriv->ht, s);
+
+ 	tcphalt(tpriv, &tcb->timer);
+ 	tcphalt(tpriv, &tcb->rtt_timer);
+ 	tcphalt(tpriv, &tcb->acktimer);
+ 	tcphalt(tpriv, &tcb->katimer);
+
+ 	/* Flush reassembly queue; nothing more can arrive */
+ 	for(rp = tcb->reseq; rp != nil; rp = rp1) {
+ 		rp1 = rp->next;
+ 		freeblist(rp->bp);
+ 		free(rp);
+ 	}
+ 	tcb->reseq = nil;
+ }
+
  void
  inittcpctl(Conv *s, int mode)
  {
/n/sources/plan9//sys/src/9/ip/tcp.c:792,798 - tcp.c:818,827

  	tcb = (Tcpctl*)s->ptcl;

- 	memset(tcb, 0, sizeof(Tcpctl));
+ 	if(tcb->timer.arg)		// c->state != Idle?
+ 		cleanupconnection(s);
+ 	else
+ 		memset(tcb, 0, sizeof(Tcpctl));

  	tcb->ssthresh = 65535;
  	tcb->srtt = tcp_irtt<<LOGAGAIN;
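
To see why overwriting a live timer is dangerous, here is a minimal, portable C model of the failure. It is not the Plan 9 kernel code: Timer, timergo, timerhalt, listok and scenario are hypothetical names invented for this sketch, and the list handling is only assumed to resemble what ip/tcp.c does. It illustrates one point: zeroing a structure that is still linked into an active-timer list leaves the list head pointing at it, so re-arming it corrupts the list, whereas halting it first keeps the list consistent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified timer; not the kernel's Tcptimer. */
typedef struct Timer Timer;
struct Timer {
	Timer	*next;
	Timer	*prev;
	int	active;
};

static Timer *timers;	/* head of the active-timer list */

/* Link t at the head of the active list (no-op if already active). */
static void
timergo(Timer *t)
{
	assert(t != NULL);
	if(t->active)
		return;
	t->prev = NULL;
	t->next = timers;
	if(t->next != NULL)
		t->next->prev = t;
	timers = t;
	t->active = 1;
}

/* Unlink t from the active list (no-op if not active). */
static void
timerhalt(Timer *t)
{
	assert(t != NULL);
	if(!t->active)
		return;
	if(timers == t)
		timers = t->next;
	if(t->next != NULL)
		t->next->prev = t->prev;
	if(t->prev != NULL)
		t->prev->next = t->next;
	t->next = t->prev = NULL;
	t->active = 0;
}

/* Return 1 if every node's prev matches its predecessor, else 0. */
static int
listok(void)
{
	Timer *t, *prev;

	prev = NULL;
	for(t = timers; t != NULL; t = t->next){
		if(t->prev != prev)
			return 0;
		prev = t;
	}
	return 1;
}

/*
 * Arm a timer, then reinitialise it.  With halt_first == 0 the
 * structure is zeroed while still linked (the pattern blamed for
 * the panic); otherwise it is unlinked before being zeroed.
 * Returns the result of listok().
 */
int
scenario(int halt_first)
{
	Timer t;
	int ok;

	timers = NULL;
	memset(&t, 0, sizeof t);
	timergo(&t);

	if(halt_first)
		timerhalt(&t);
	memset(&t, 0, sizeof t);	/* stands in for the memset in inittcpctl() */
	timergo(&t);			/* re-arm, as a second connect would */

	ok = listok();
	timers = NULL;			/* do not leave a pointer to a dead local */
	return ok;
}
```

In this model scenario(0) returns 0, because the re-armed timer ends up as its own predecessor, while scenario(1) returns 1. A kernel consistency check tripping over a list in that state would fit the timerstate1 panic, though the thread does not show the check itself.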


* Re: [9fans] /net panic
  2008-02-15 16:23 ` erik quanstrom
@ 2008-02-15 17:29   ` Iruata Souza
  2008-02-15 18:34     ` erik quanstrom
  0 siblings, 1 reply; 6+ messages in thread
From: Iruata Souza @ 2008-02-15 17:29 UTC
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Feb 15, 2008 at 2:23 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>
>i'm not sure this is a perfect solution.  i just don't have enough
>of the plan 9 ip stack loaded into cache to be sure nothing's
>been forgotten.  but give this patch a whirl.  basically, i think
>the problem is that inittcpctl() was stepping on timers that might
>have been active.  these timers need to be shut down.  unfortunately,
>tcpclose() and localclose() are too aggressive.  cleanupconnection()
>is a chopped-down version of localclose().
>
>- erik

works for me.
I don't know the internal workings of the Plan 9 IP stack, so at the
risk of being silly: could it be that the bug is not TCP-only?
iru



* Re: [9fans] /net panic
  2008-02-15 17:29   ` Iruata Souza
@ 2008-02-15 18:34     ` erik quanstrom
  0 siblings, 0 replies; 6+ messages in thread
From: erik quanstrom @ 2008-02-15 18:34 UTC
  To: 9fans

> works for me.
> I don't know the internal workings of the Plan 9 IP stack, so at the
> risk of being silly: could it be that the bug is not TCP-only?
> iru

no.  the problem is that active tcp timers are overwritten.
all the tcp timer code is contained within ip/tcp.c

- erik

