* [9fans] /net panic
@ 2008-02-15  9:20 sqweek
  2008-02-15 15:19 ` Iruata Souza
  2008-02-15 16:23 ` erik quanstrom
  0 siblings, 2 replies; 6+ messages in thread
From: sqweek @ 2008-02-15  9:20 UTC
To: 9fans

muzgo (from irc) was playing around with /net in qemu and came across this gem:

on drawterm:
cpu% cd /net/tcp
cpu% cat clone
23cpu% cd 23
cpu% echo connect 10.0.2.1!12345 >ctl
cpu% cat status
Finwait2 qin 0 qout 0 srtt 0 mdev 0 cwin 1461 swin 32850>>0 rwin
65535>>0 timer.start 10 timer.count 10 rerecv 0 katimer.start 200
katimer.count 159
cpu% echo connect 10.0.2.1!12345 >ctl
cpu%

this causes CPU server to reboot with:
panic: timerstate1
panic: timerstate1
dumpstack disabled
cpu0 exiting


The usage of /net is invalid, but you wouldn't really expect that to
reboot the machine (or maybe it's a holdover from before /dev/reboot
existed? ;) ).
Just tried it on my cpu server and got the same panic, so we can rule
qemu out. I adjusted the ip!port to something that would accept my
connection, and my status was something like Timedwait rather than
Finwait2. Second time around I skipped the cat status and still hit
the panic so the status read isn't affecting things (which is probably
blindingly obvious to anyone familiar with the /net code, but oh
well).

I appear to be running a realtek 8169 nic:
#l0: rtl8169: 100Mbps port 0xE400 irq 11: 000aeb2ff32c

Don't know what muzgo was using in qemu, but let me know if I can
provide any useful information.
-sqweek
* Re: [9fans] /net panic
  2008-02-15  9:20 [9fans] /net panic sqweek
@ 2008-02-15 15:19 ` Iruata Souza
  2008-02-15 16:10   ` Iruata Souza
  2008-02-15 16:23 ` erik quanstrom
  1 sibling, 1 reply; 6+ messages in thread
From: Iruata Souza @ 2008-02-15 15:19 UTC
To: Fans of the OS Plan 9 from Bell Labs

On Fri, Feb 15, 2008 at 7:20 AM, sqweek <sqweek@gmail.com> wrote:
> muzgo (from irc) was playing around with /net in qemu and came across this gem:
>
> on drawterm:
> cpu% cd /net/tcp
> cpu% cat clone
> 23cpu% cd 23
> cpu% echo connect 10.0.2.1!12345 >ctl
> cpu% cat status
> Finwait2 qin 0 qout 0 srtt 0 mdev 0 cwin 1461 swin 32850>>0 rwin
> 65535>>0 timer.start 10 timer.count 10 rerecv 0 katimer.start 200
> katimer.count 159
> cpu% echo connect 10.0.2.1!12345 >ctl
> cpu%
>
> this causes CPU server to reboot with:
> panic: timerstate1
> panic: timerstate1
> dumpstack disabled
> cpu0 exiting
>
>
> The usage of /net is invalid, but you wouldn't really expect that to
> reboot the machine (or maybe it's a holdover from before /dev/reboot
> existed? ;) ).
> Just tried it on my cpu server and got the same panic, so we can rule
> qemu out. I adjusted the ip!port to something that would accept my
> connection, and my status was something like Timedwait rather than
> Finwait2. Second time around I skipped the cat status and still hit
> the panic so the status read isn't affecting things (which is probably
> blindingly obvious to anyone familiar with the /net code, but oh
> well).
>
> I appear to be running a realtek 8169 nic:
> #l0: rtl8169: 100Mbps port 0xE400 irq 11: 000aeb2ff32c
>
> Don't know what muzgo was using in qemu, but let me know if I can
> provide any useful information.
> -sqweek
>

no *strange* things running, i guess.

iru
* Re: [9fans] /net panic
  2008-02-15 15:19 ` Iruata Souza
@ 2008-02-15 16:10   ` Iruata Souza
  0 siblings, 0 replies; 6+ messages in thread
From: Iruata Souza @ 2008-02-15 16:10 UTC
To: Fans of the OS Plan 9 from Bell Labs

On Fri, Feb 15, 2008 at 1:19 PM, Iruata Souza <iru.muzgo@gmail.com> wrote:
>
> On Fri, Feb 15, 2008 at 7:20 AM, sqweek <sqweek@gmail.com> wrote:
> > muzgo (from irc) was playing around with /net in qemu and came across this gem:
> >
> > on drawterm:
> > cpu% cd /net/tcp
> > cpu% cat clone
> > 23cpu% cd 23
> > cpu% echo connect 10.0.2.1!12345 >ctl
> > cpu% cat status
> > Finwait2 qin 0 qout 0 srtt 0 mdev 0 cwin 1461 swin 32850>>0 rwin
> > 65535>>0 timer.start 10 timer.count 10 rerecv 0 katimer.start 200
> > katimer.count 159
> > cpu% echo connect 10.0.2.1!12345 >ctl
> > cpu%
> >
> > this causes CPU server to reboot with:
> > panic: timerstate1
> > panic: timerstate1
> > dumpstack disabled
> > cpu0 exiting
> >
> >
> > The usage of /net is invalid, but you wouldn't really expect that to
> > reboot the machine (or maybe it's a holdover from before /dev/reboot
> > existed? ;) ).
> > Just tried it on my cpu server and got the same panic, so we can rule
> > qemu out. I adjusted the ip!port to something that would accept my
> > connection, and my status was something like Timedwait rather than
> > Finwait2. Second time around I skipped the cat status and still hit
> > the panic so the status read isn't affecting things (which is probably
> > blindingly obvious to anyone familiar with the /net code, but oh
> > well).
> >
> > I appear to be running a realtek 8169 nic:
> > #l0: rtl8169: 100Mbps port 0xE400 irq 11: 000aeb2ff32c
> >
> > Don't know what muzgo was using in qemu, but let me know if I can
> > provide any useful information.
> > -sqweek
> >
>
> no *strange* things running, i guess.
>
> iru
>

some info about my environment is in
http://iru.oitobits.net/9netpanic/

where:
QEMU_Plan9 - sh script to run emulated Plan 9
cpuemu_config.tgz - CPU server's /cfg/cpuemu
qemu-ifup - sh script to up host<->guest tun interfaces, Plan 9 gets tun0
ifconfig.tun0 - tun interface configuration
listen.c - listener running on host
panic - complete scenario of snap pasted by sqweek

iru
* Re: [9fans] /net panic
  2008-02-15  9:20 [9fans] /net panic sqweek
  2008-02-15 15:19 ` Iruata Souza
@ 2008-02-15 16:23 ` erik quanstrom
  2008-02-15 17:29   ` Iruata Souza
  1 sibling, 1 reply; 6+ messages in thread
From: erik quanstrom @ 2008-02-15 16:23 UTC
To: 9fans

[-- Attachment #1: Type: text/plain, Size: 1447 bytes --]

i'm not sure this is a perfect solution.  i just don't have enough
of the plan 9 ip stack loaded into cache to be sure nothing's
been forgotten.  but give this patch a whirl.  basically, i think
the problem is that inittcpctl() was stepping on timers that might
have been active.  these timers need to be shut down.  unfortunately,
tcpclose() and localclose() are too aggressive.  cleanupconnection()
is a chopped-down version of localclose.

- erik


/n/sources/plan9//sys/src/9/ip/tcp.c:782,787 - tcp.c:782,813
  	return mtu;
  }

+ static void
+ cleanupconnection(Conv *s)
+ {
+ 	Tcpctl *tcb;
+ 	Reseq *rp,*rp1;
+ 	Tcppriv *tpriv;
+
+ 	tpriv = s->p->priv;
+ 	tcb = (Tcpctl*)s->ptcl;
+
+ 	iphtrem(&tpriv->ht, s);
+
+ 	tcphalt(tpriv, &tcb->timer);
+ 	tcphalt(tpriv, &tcb->rtt_timer);
+ 	tcphalt(tpriv, &tcb->acktimer);
+ 	tcphalt(tpriv, &tcb->katimer);
+
+ 	/* Flush reassembly queue; nothing more can arrive */
+ 	for(rp = tcb->reseq; rp != nil; rp = rp1) {
+ 		rp1 = rp->next;
+ 		freeblist(rp->bp);
+ 		free(rp);
+ 	}
+ 	tcb->reseq = nil;
+ }
+
  void
  inittcpctl(Conv *s, int mode)
  {
/n/sources/plan9//sys/src/9/ip/tcp.c:792,798 - tcp.c:818,827

  	tcb = (Tcpctl*)s->ptcl;

- 	memset(tcb, 0, sizeof(Tcpctl));
+ 	if(tcb->timer.arg)		// c->state != Idle?
+ 		cleanupconnection(s);
+ 	else
+ 		memset(tcb, 0, sizeof(Tcpctl));

  	tcb->ssthresh = 65535;
  	tcb->srtt = tcp_irtt<<LOGAGAIN;

[-- Attachment #2: Type: message/rfc822, Size: 4485 bytes --]

From: sqweek <sqweek@gmail.com>
To: 9fans@cse.psu.edu
Subject: [9fans] /net panic
Date: Fri, 15 Feb 2008 18:20:13 +0900
Message-ID: <140e7ec30802150120y1b0e4b33lf76787aaee84edd2@mail.gmail.com>

muzgo (from irc) was playing around with /net in qemu and came across this gem:

on drawterm:
cpu% cd /net/tcp
cpu% cat clone
23cpu% cd 23
cpu% echo connect 10.0.2.1!12345 >ctl
cpu% cat status
Finwait2 qin 0 qout 0 srtt 0 mdev 0 cwin 1461 swin 32850>>0 rwin
65535>>0 timer.start 10 timer.count 10 rerecv 0 katimer.start 200
katimer.count 159
cpu% echo connect 10.0.2.1!12345 >ctl
cpu%

this causes CPU server to reboot with:
panic: timerstate1
panic: timerstate1
dumpstack disabled
cpu0 exiting


The usage of /net is invalid, but you wouldn't really expect that to
reboot the machine (or maybe it's a holdover from before /dev/reboot
existed? ;) ).
Just tried it on my cpu server and got the same panic, so we can rule
qemu out. I adjusted the ip!port to something that would accept my
connection, and my status was something like Timedwait rather than
Finwait2. Second time around I skipped the cat status and still hit
the panic so the status read isn't affecting things (which is probably
blindingly obvious to anyone familiar with the /net code, but oh
well).

I appear to be running a realtek 8169 nic:
#l0: rtl8169: 100Mbps port 0xE400 irq 11: 000aeb2ff32c

Don't know what muzgo was using in qemu, but let me know if I can
provide any useful information.
-sqweek
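The reassembly-queue loop in the patch above reads rp->next into rp1 before calling free(rp), so the walk never touches a freed node. The same idiom can be sketched on its own in portable C. This is only an illustration: Node, push() and flush() are made-up names, and the toy list carries an int where the kernel's Reseq carries a block list.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a queue entry; not the kernel's Reseq. */
typedef struct Node Node;
struct Node {
	Node *next;
	int val;
};

/* Prepend a node and return the new head. */
Node*
push(Node *head, int val)
{
	Node *n;

	n = malloc(sizeof *n);
	if(n == NULL)
		return head;	/* allocation failed: leave the list unchanged */
	n->val = val;
	n->next = head;
	return n;
}

/* Free every node, reset the head, and return how many nodes were freed. */
int
flush(Node **head)
{
	Node *n, *next;
	int freed;

	freed = 0;
	for(n = *head; n != NULL; n = next){
		next = n->next;	/* save the link before the node goes away */
		free(n);
		freed++;
	}
	*head = NULL;	/* same role as tcb->reseq = nil in the patch */
	return freed;
}
```

Resetting the head after the loop matters as much as the saved pointer: a second flush() on the same list then finds nothing to free instead of walking freed memory.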
* Re: [9fans] /net panic
  2008-02-15 16:23 ` erik quanstrom
@ 2008-02-15 17:29   ` Iruata Souza
  2008-02-15 18:34     ` erik quanstrom
  0 siblings, 1 reply; 6+ messages in thread
From: Iruata Souza @ 2008-02-15 17:29 UTC
To: Fans of the OS Plan 9 from Bell Labs

On Fri, Feb 15, 2008 at 2:23 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>
>i'm not sure this is a perfect solution.  i just don't have enough
>of the plan 9 ip stack loaded into cache to be sure nothing's
>been forgotten.  but give this patch a whirl.  basically, i think
>the problem is that inittcpctl() was stepping on timers that might
>have been active.  these timers need to be shut down.  unfortunately,
>tcpclose() and localclose() are too aggressive.  cleanupconnection()
>is a chopped-down version of localclose.
>
>- erik
>
>
>/n/sources/plan9//sys/src/9/ip/tcp.c:782,787 - tcp.c:782,813
>  	return mtu;
>  }
>
>+ static void
>+ cleanupconnection(Conv *s)
>+ {
>+ 	Tcpctl *tcb;
>+ 	Reseq *rp,*rp1;
>+ 	Tcppriv *tpriv;
>+
>+ 	tpriv = s->p->priv;
>+ 	tcb = (Tcpctl*)s->ptcl;
>+
>+ 	iphtrem(&tpriv->ht, s);
>+
>+ 	tcphalt(tpriv, &tcb->timer);
>+ 	tcphalt(tpriv, &tcb->rtt_timer);
>+ 	tcphalt(tpriv, &tcb->acktimer);
>+ 	tcphalt(tpriv, &tcb->katimer);
>+
>+ 	/* Flush reassembly queue; nothing more can arrive */
>+ 	for(rp = tcb->reseq; rp != nil; rp = rp1) {
>+ 		rp1 = rp->next;
>+ 		freeblist(rp->bp);
>+ 		free(rp);
>+ 	}
>+ 	tcb->reseq = nil;
>+ }
>+
>  void
>  inittcpctl(Conv *s, int mode)
>  {
>/n/sources/plan9//sys/src/9/ip/tcp.c:792,798 - tcp.c:818,827
>
>  	tcb = (Tcpctl*)s->ptcl;
>
>- 	memset(tcb, 0, sizeof(Tcpctl));
>+ 	if(tcb->timer.arg)		// c->state != Idle?
>+ 		cleanupconnection(s);
>+ 	else
>+ 		memset(tcb, 0, sizeof(Tcpctl));
>
>  	tcb->ssthresh = 65535;
>  	tcb->srtt = tcp_irtt<<LOGAGAIN;
>

works for me.
I don't know the internal workings of the plan 9 ip stack so I take
the risk of being silly: could be that the bug is not tcp only?
iru
* Re: [9fans] /net panic
  2008-02-15 17:29   ` Iruata Souza
@ 2008-02-15 18:34     ` erik quanstrom
  0 siblings, 0 replies; 6+ messages in thread
From: erik quanstrom @ 2008-02-15 18:34 UTC
To: 9fans

> works for me.
> I don't know the internal workings of the plan 9 ip stack so I take
> the risk of being silly: could be that the bug is not tcp only?
> iru

no.  the problem is that active tcp timers are overwritten.
all the tcp timer code is contained within ip/tcp.c

- erik
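To make the "active timers are overwritten" explanation concrete, here is a toy model in portable C. It is a sketch under stated assumptions, not the code in ip/tcp.c: the Timer and Timers types and the timergo(), timerhalt() and consistent() helpers are all hypothetical, and the only assumption carried over is that running timers sit on a linked list that the timer code walks. Zeroing a timer while it is still on that list leaves the list pointing at a record that now claims to be off; halting it first keeps the list sane.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: every name here is hypothetical, not the Plan 9 kernel's. */
enum { TimerOff, TimerOn };

typedef struct Timer Timer;
struct Timer {
	Timer *next;
	Timer *prev;
	int state;
};

typedef struct Timers Timers;
struct Timers {
	Timer *head;	/* doubly linked list of running timers */
};

/* Link t at the head of the running list. */
static void
timergo(Timers *ts, Timer *t)
{
	if(t->state == TimerOn)
		return;
	t->prev = NULL;
	t->next = ts->head;
	if(t->next != NULL)
		t->next->prev = t;
	ts->head = t;
	t->state = TimerOn;
}

/* Unlink t from the running list, if it is on it. */
static void
timerhalt(Timers *ts, Timer *t)
{
	if(t->state != TimerOn)
		return;
	if(ts->head == t)
		ts->head = t->next;
	if(t->next != NULL)
		t->next->prev = t->prev;
	if(t->prev != NULL)
		t->prev->next = t->next;
	t->next = t->prev = NULL;
	t->state = TimerOff;
}

/* Return 1 if every timer reachable from the head is marked running
 * and its back link matches the node before it, 0 otherwise. */
static int
consistent(Timers *ts)
{
	Timer *t, *prev;

	prev = NULL;
	for(t = ts->head; t != NULL; t = t->next){
		if(t->state != TimerOn || t->prev != prev)
			return 0;
		prev = t;
	}
	return 1;
}

/* Reinitialise a running timer by zeroing it in place. */
int
reinitunsafe(void)
{
	Timers ts = { NULL };
	Timer a = { NULL, NULL, TimerOff }, b = { NULL, NULL, TimerOff };

	timergo(&ts, &a);
	timergo(&ts, &b);		/* list is now b -> a */
	memset(&a, 0, sizeof a);	/* a is still linked, but claims to be off */
	return consistent(&ts);
}

/* Halt the timer first, then zero it. */
int
reinitsafe(void)
{
	Timers ts = { NULL };
	Timer a = { NULL, NULL, TimerOff }, b = { NULL, NULL, TimerOff };

	timergo(&ts, &a);
	timergo(&ts, &b);
	timerhalt(&ts, &a);		/* unlink before wiping */
	memset(&a, 0, sizeof a);
	return consistent(&ts);
}
```

In this model reinitunsafe() returns 0 and reinitsafe() returns 1. A consistency check of this kind failing is the sort of condition a kernel would turn into a panic, which is why halting the timers before reusing the control block is the safer order.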