From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
From: andrey mirtchovski
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Message-Id: <20020316035113.A000F19AA1@mail.cse.psu.edu>
Subject: [9fans] tcp problems
Date: Fri, 15 Mar 2002 20:54:38 -0700
Topicbox-Message-UUID: 68357790-eaca-11e9-9e20-41e7f4b1d025

Hi,

We've reported before the problems that plan9 is experiencing with tcp -- when the data transmitted is more than, let's say, 32Kbytes the performance drops somewhat on 100baseT and by a large amount on Gigabit Ethernet networks.

Dean has been digging in the tcp.c code trying to find what the reason is and discovered that the amount of packet retransmissions increases by a factor of 1000+ when the size of the data sent goes above 32kbytes. Note however that this is not a fixed size that we can establish scientifically, it's just in the vicinity (playing with the load on the machine could influence it, sometimes by a large amount).

A kernel profile during execution shows that the machine is idle some 80% of the time during peak transmissions so it is definitely (umm?) not a hardware issue.

Turning on debugging messages we discovered that the retransmits are caused by tcptrim() in tcp.c which calculates that a packet is out of the receive window for tcp and triggers its retransmit.
The actual code dealing with this is in tcpiput():

	/* Cut the data to fit the receive window */
	if(tcptrim(tcb, &seg, &bp, &length, f) == -1) {
		netlog(f, Logtcp, "tcp len < 0, %lux %d\n", seg.seq,length);
		update(s, &seg);
		if(qlen(s->wq)+tcb->flgcnt == 0 && tcb->state == Closing) {
			netlog(f, Logtcp, "tcp len error halt\n");
			tcphalt(tpriv, &tcb->rtt_timer);
			tcphalt(tpriv, &tcb->acktimer);
			tcphalt(tpriv, &tcb->katimer);
			tcpsetstate(s, Time_wait);
			tcb->timer.start = MSL2*(1000 / MSPTICK);
			tcpgo(tpriv, &tcb->timer);
		}
		if(!(seg.flags & RST)) {
			tcb->flags |= FORCE;
			netlog(f, Logtcp, "tcp len error output\n");
			goto output;
		}
		qunlock(s);
		poperror();
		return;
	}

For the packets we think are causing the problem, tcptrim returns -1, we log an error message (incorrect, but I'll come to that later), skip the first 'if' and enter if(!(seg.flags & RST))....

So we did a huge dump of all packets received during a single run of presotto's netpipe clone (Dean modified it locally so it'd compile) and found out that the packets that tcptrim rejects are actually not bad packets, but a repeat of a packet already acknowledged as received (they fall outside the receive window, but in front of it).

It sort of looks like this (I know most of you are familiar with this, I'm just trying to explain the situation better):

	received:  size:  window_start:  window_end:
	100        5      100            200
	105        5      105            205
	110        5      110            210
	120        5      115            215    # 115 did not arrive in order
	125        5      115            215
	130        5      115            215
	115        5      115            215    # 115 arrives late
	135        5      135            235
	115        5      140            240    # a second 115 arrives, retransmitted from the other side

So at this point tcptrim() thinks (correctly) that the packet is out of the window, returns -1 and the tcp stack prints that the packet length is negative (incorrectly -- the packet length is reported fine all the time). It's a bit of a mystery what the stack decides to do with the packet after that.
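
To make the window check concrete, here is a minimal sketch of the kind of test described above. It is not the Plan 9 tcptrim(): the names trim(), seq_lt(), trim_demo() and WINDOW are made up for illustration, the window size of 100 is taken from the table, and the sketch only considers data-bearing segments (no SYN/FIN, no zero-length ACKs).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; this is NOT the Plan 9 tcptrim(). */
enum { WINDOW = 100 };	/* window size used in the table above */

/* Wraparound-safe "a < b" for 32-bit sequence numbers. */
static int
seq_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Trim the segment [*seq, *seq + *len) so that it fits the receive
 * window [rcv_nxt, rcv_nxt + wnd).  Returns 0 if some data is left,
 * -1 if no byte of the segment falls inside the window.
 * Assumes *len > 0 (data-bearing segments only).
 */
static int
trim(uint32_t rcv_nxt, uint32_t wnd, uint32_t *seq, uint32_t *len)
{
	uint32_t end = *seq + *len;	/* first byte after the segment */
	uint32_t wend = rcv_nxt + wnd;	/* first byte after the window */

	if(!seq_lt(rcv_nxt, end))	/* ends at or before rcv_nxt: old data */
		return -1;
	if(!seq_lt(*seq, wend))		/* starts at or after the window end */
		return -1;
	if(seq_lt(*seq, rcv_nxt)) {	/* cut the already received prefix */
		*len -= rcv_nxt - *seq;
		*seq = rcv_nxt;
	}
	if(seq_lt(wend, *seq + *len))	/* cut the tail beyond the window */
		*len = wend - *seq;
	return 0;
}

/* Last row of the table: window is 140..240 and 115 arrives again. */
static void
trim_demo(void)
{
	uint32_t seq = 115, len = 5;
	int r = trim(140, WINDOW, &seq, &len);

	assert(r == -1);	/* duplicate lies entirely before the window */
	printf("seq %u len %u -> %d\n", (unsigned)seq, (unsigned)len, r);
}
```

In this sketch the second copy of 115 is rejected because all of its bytes lie before window_start, not because its length is wrong, which matches what the dump shows. A segment that straddles the left edge (say 113 with size 5 against a window starting at 115) would instead be cut down to the part inside the window and accepted.
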
Normally the first thing that comes to mind is to throw it away, since we're ack-ing a higher sequence already, but...

Can anyone spot the problem with this code immediately? Is there a problem at all?

andrey