From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
From: presotto@plan9.bell-labs.com
To: 9fans@cse.psu.edu
Subject: Re: [9fans] tcp problems
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="upas-uftxanaerjoyqyxpbwjatteuan"
Date: Sat, 16 Mar 2002 11:30:48 -0500
Topicbox-Message-UUID: 683d2468-eaca-11e9-9e20-41e7f4b1d025

This is a multi-part message in MIME format.
--upas-uftxanaerjoyqyxpbwjatteuan
Content-Disposition: inline
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

The -1 means throw away the packet; we've seen it before. It's the
correct response: tcpiput then ignores the packet (modulo looking at
the ack).

From your description it sounds like you're going into retransmission
at the send side. That's because it is timing out waiting for an ack.
That in turn is either because the acks are not getting back, are
getting back too late, or we've got some terrible bug. We might have a
screwed up timer. I'd gotten that far myself before I got interrupted
to justify my continued existence to my boss.

When one side starts retransmitting, its congestion window closes down,
so things go pretty slow till it grows open again. If we keep dropping
into retransmission mode, things stay really slow, which is what you're
seeing.

Keep looking. Your next step might be to check the acks arriving at the
sender and see if they make sense.
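
To make the congestion window remark concrete, here is a minimal sketch of the generic Reno-style rule, not the Plan 9 code. The struct, its field names, and both function names are hypothetical; the numbers in the usage note assume a 1460-byte segment size.

```c
#include <stdint.h>

/* Hypothetical per-connection state; the field names are illustrative
 * and are not taken from the Plan 9 tcp.c control block. */
typedef struct {
    uint32_t mss;       /* maximum segment size, bytes */
    uint32_t cwnd;      /* congestion window, bytes */
    uint32_t ssthresh;  /* slow-start threshold, bytes */
} CongState;

/* Retransmission timeout: set the threshold to half the data in flight
 * (but at least two segments) and collapse the window to one segment.
 * This is the assumed Reno-style rule. */
void on_rto(CongState *c, uint32_t flight)
{
    uint32_t half = flight / 2;
    uint32_t min_thresh = 2 * c->mss;

    c->ssthresh = half > min_thresh ? half : min_thresh;
    c->cwnd = c->mss;
}

/* Ack of new data: below the threshold grow by one segment per ack
 * (slow start); above it grow by about one segment per round trip. */
void on_new_ack(CongState *c)
{
    uint32_t inc;

    if (c->cwnd < c->ssthresh) {
        c->cwnd += c->mss;
    } else {
        inc = (c->mss * c->mss) / c->cwnd;
        if (inc == 0)
            inc = 1;    /* always make some progress */
        c->cwnd += inc;
    }
}
```

Under these assumptions, a timeout with 32768 bytes in flight leaves the sender with a window of a single 1460-byte segment, and each further timeout resets it again before it has reopened. That fits the "stays really slow" behaviour described above.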
--upas-uftxanaerjoyqyxpbwjatteuan
Content-Type: message/rfc822
Content-Disposition: inline

Received: from plan9.cs.bell-labs.com ([135.104.9.2]) by plan9; Fri Mar 15 22:52:20 EST 2002
Received: from mail.cse.psu.edu ([130.203.4.6]) by plan9; Fri Mar 15 22:52:19 EST 2002
Received: from psuvax1.cse.psu.edu (psuvax1.cse.psu.edu [130.203.16.6])
	by mail.cse.psu.edu (CSE Mail Server) with ESMTP
	id ED4D419AAA; Fri, 15 Mar 2002 22:52:08 -0500 (EST)
Delivered-To: 9fans@cse.psu.edu
Received: from acl.lanl.gov (plan9.acl.lanl.gov [128.165.147.177])
	by mail.cse.psu.edu (CSE Mail Server) with SMTP id A000F19AA1
	for <9fans@cse.psu.edu>; Fri, 15 Mar 2002 22:51:13 -0500 (EST)
To: 9fans@cse.psu.edu
From: andrey mirtchovski
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Message-Id: <20020316035113.A000F19AA1@mail.cse.psu.edu>
Subject: [9fans] tcp problems
Sender: 9fans-admin@cse.psu.edu
Errors-To: 9fans-admin@cse.psu.edu
X-BeenThere: 9fans@cse.psu.edu
X-Mailman-Version: 2.0.8
Precedence: bulk
Reply-To: 9fans@cse.psu.edu
List-Help:
List-Id: Fans of the OS Plan 9 from Bell Labs <9fans.cse.psu.edu>
List-Archive:
Date: Fri, 15 Mar 2002 20:54:38 -0700

Hi,

We've reported before the problems that plan9 is experiencing with
tcp: when the data transmitted is more than, let's say, 32 Kbytes,
performance drops somewhat on 100baseT and by a large amount on
Gigabit Ethernet networks.

Dean has been digging in the tcp.c code trying to find the reason and
discovered that the number of packet retransmissions increases by a
factor of 1000+ when the size of the data sent goes above 32 Kbytes.
Note, however, that this is not a fixed threshold that we can establish
scientifically; it's just in that vicinity (playing with the load on
the machine can influence it, sometimes by a large amount). A kernel
profile during execution shows that the machine is idle some 80% of the
time during peak transmissions, so it is definitely (umm?) not a
hardware issue.

Turning on debugging messages, we discovered that the retransmits are
caused by tcptrim() in tcp.c, which calculates that a packet is out of
the receive window for tcp and triggers its retransmit. The actual code
dealing with this is in tcpiput():

    /* Cut the data to fit the receive window */
    if(tcptrim(tcb, &seg, &bp, &length, f) == -1) {
        netlog(f, Logtcp, "tcp len < 0, %lux %d\n", seg.seq,length);
        update(s, &seg);
        if(qlen(s->wq)+tcb->flgcnt == 0 && tcb->state == Closing) {
            netlog(f, Logtcp, "tcp len error halt\n");
            tcphalt(tpriv, &tcb->rtt_timer);
            tcphalt(tpriv, &tcb->acktimer);
            tcphalt(tpriv, &tcb->katimer);
            tcpsetstate(s, Time_wait);
            tcb->timer.start = MSL2*(1000 / MSPTICK);
            tcpgo(tpriv, &tcb->timer);
        }
        if(!(seg.flags & RST)) {
            tcb->flags |= FORCE;
            netlog(f, Logtcp, "tcp len error output\n");
            goto output;
        }
        qunlock(s);
        poperror();
        return;
    }

For the packets we think are causing the problem, tcptrim returns -1,
we log an error message (incorrect, but I'll come to that later), skip
the first 'if' and enter if(!(seg.flags & RST))....

So we did a huge dump of all packets received during a single run of
presotto's netpipe clone (Dean modified it locally so it'd compile) and
found out that the packets that tcptrim rejects are actually not bad
packets, but a repeat of a packet already acknowledged as received
(they fall out of the receive window, but are in front of it).

It sort of looks like this (I know most of you are familiar with this,
I'm just trying to explain the situation better):

    received:  size:  window_start:  window_end:
    100        5      100            200
    105        5      105            205
    110        5      110            210
    120        5      115            215    # 115 did not arrive in order
    125        5      115            215
    130        5      115            215
    115        5      115            215    # 115 arrives late
    135        5      135            235
    115        5      140            240    # a second 115 arrives, retransmitted from the other side

So at this point tcptrim() thinks (correctly) that the packet is out of
the window and returns -1, and the tcp stack prints that the packet
length is negative (incorrectly: the packet length is reported fine all
the time). It's a bit of a mystery what the stack decides to do with
the packet after that. Normally the first thing that comes to mind is
to throw it away, since we're ack-ing a higher sequence already, but...

Can anyone spot the problem with this code immediately? Is there a
problem at all?

andrey
--upas-uftxanaerjoyqyxpbwjatteuan--
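
The trace above can be checked against a small window-trimming routine. This is only a sketch of the general technique, not the Plan 9 tcptrim(): the function name, the return constants, and the reduced argument list are all hypothetical. It handles payload bytes only and ignores SYN/FIN flags and zero-length segments.

```c
#include <stdint.h>

/* Hypothetical return values, mirroring the 0 / -1 convention above. */
enum { SEG_OK = 0, SEG_DROP = -1 };

/* Sequence-number comparisons modulo 2^32. */
static int seq_lt(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static int seq_le(uint32_t a, uint32_t b) { return (int32_t)(a - b) <= 0; }

/*
 * Trim the segment [*seq, *seq + *len) so that it fits the receive
 * window [rcv_nxt, rcv_nxt + rcv_wnd).  Returns SEG_DROP when no byte
 * of the segment falls inside the window, for example a duplicate of
 * data that has already been acknowledged.
 */
int trim_to_window(uint32_t rcv_nxt, uint32_t rcv_wnd,
                   uint32_t *seq, uint32_t *len)
{
    uint32_t end = *seq + *len;            /* one past the last byte */
    uint32_t wend = rcv_nxt + rcv_wnd;     /* one past the window */

    if (seq_le(end, rcv_nxt))              /* entirely old data */
        return SEG_DROP;
    if (seq_le(wend, *seq))                /* entirely beyond the window */
        return SEG_DROP;

    if (seq_lt(*seq, rcv_nxt)) {           /* cut already-received bytes */
        *len -= rcv_nxt - *seq;
        *seq = rcv_nxt;
    }
    end = *seq + *len;
    if (seq_lt(wend, end))                 /* cut bytes past the window */
        *len -= end - wend;
    return SEG_OK;
}
```

With the last row of the trace (segment 115, size 5, window starting at 140 and 100 bytes wide), this sketch returns SEG_DROP, because every byte of the segment lies before the window. The late first copy of 115, arriving while the window still starts at 115, is accepted untouched.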