[9fans] tcp!

9fans - fans of the OS Plan 9 from Bell Labs

* [9fans] tcp!
@ 2012-08-18 20:11 erik quanstrom
  2012-08-18 20:26 ` erik quanstrom
  2012-08-19 13:17 ` Richard Miller
  0 siblings, 2 replies; 13+ messages in thread
From: erik quanstrom @ 2012-08-18 20:11 UTC (permalink / raw)
  To: 9fans

since it came up, i put my working copy of tcp along with some testing
scripts in /n/sources/contrib/quanstro/tcp.

there are a number of fixes rolled into this, but the main fixes are
- add support for new reno,
- properly handle zero-window probes (on both ends),
- don't confuse the cwind with the receiver's advertised window.  this
particular condition can lead to livelock.
- don't confuse the window scale with the amount of local buffering
we'd like to do.
- and, don't queue tcp infinitely, which can crash kernels.  :-)
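
a minimal sketch of the cwind point in the list above.  this is not taken
from the posted tcp.c; the function and parameter names are hypothetical.
the idea it illustrates is only that the congestion window and the
receiver's advertised window are two separate limits, and the sender is
bound by whichever is smaller.

```c
#include <stdint.h>

/*
 * hypothetical helper, not from the posted source: how many more
 * bytes a sender may put on the wire right now.
 *   cwind    - congestion window kept by the sender
 *   rcvwind  - window most recently advertised by the receiver
 *   inflight - bytes sent but not yet acknowledged
 */
uint32_t
usable(uint32_t cwind, uint32_t rcvwind, uint32_t inflight)
{
	uint32_t lim;

	/* two independent limits; the smaller one governs */
	lim = cwind < rcvwind ? cwind : rcvwind;
	if(inflight >= lim)
		return 0;	/* nothing may be sent, e.g. when rcvwind is 0 */
	return lim - inflight;
}
```

with a zero advertised window the result is 0 however large cwind is,
which is the case where a zero-window probe is needed instead of data.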

i don't have the numbers for the old tcp handy, but i think you'll
be surprised at how much difference there can be.  i saw differences
of 20x when the sender was limited in how fast it could read by the
read rate from user space.

i've included "testscript."  for the two machines i have handy, i get
the following results with new and old tcp.

machine         stack   kernel  0ms delay   1ms delay
ideal           -       386     unlimited   8.19mb/s

xeon x5550      old     386     138mb/s     0.49mb/s  (!)
intel atom      old     386     37.2mb/s    0.10mb/s

amd x4 964      new     386     145mb/s     8.03mb/s
intel e31220    new     amd64   303mb/s     8.15mb/s
intel atom      new     386     67mb/s      8.03mb/s
	# note: i can get up to 80mb/s using forsyth's qmalloc.
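
one possible reading of the "ideal" row, offered as an assumption since
the post does not say how it was derived: if at most 8192 bytes move per
1ms round trip, the ceiling is 8192000 bytes/s, which matches 8.19mb/s.
the helper name below is hypothetical.

```c
#include <stdint.h>

/*
 * hypothetical helper: upper bound in bytes per second when at most
 * `window` bytes can move per round trip of `rttus` microseconds.
 * rttus must be nonzero; with no delay the bound is "unlimited".
 */
uint64_t
windowrate(uint64_t window, uint64_t rttus)
{
	return window * 1000000 / rttus;
}
```

windowrate(8192, 1000) gives 8192000.  whether testscript really moves
one 8192-byte unit per round trip is a guess.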

- erik


* Re: [9fans] tcp!
  2012-08-18 20:11 [9fans] tcp! erik quanstrom
@ 2012-08-18 20:26 ` erik quanstrom
  2012-08-19 13:37   ` Richard Miller
  2012-08-19 13:17 ` Richard Miller
  1 sibling, 1 reply; 13+ messages in thread
From: erik quanstrom @ 2012-08-18 20:26 UTC (permalink / raw)
  To: 9fans

> - add support for new reno,

i apologize for not mentioning that the new reno work
was part of the nix/9k tcp.  i'm not sure who wrote it.

sorry!

also i forgot to mention that this version of qread can
potentially cut the number of reads on tcp channels by up
to 1/2.  one might as well completely satisfy the read,
if possible.  especially since typical iounits (8192) do not
divide up into typical mss-sized (1460) packets evenly.

[...]
	/* if we get here, there's at least one block in the queue */
	if(q->state & Qcoalesce){
		/* when coalescing, 0 length blocks just go away */
		b = q->bfirst;
		if(BLEN(b) <= 0){
			freeb(qremove(q));
			goto again;
		}

		/*
		 * grab the first block and as many following
		 * blocks as will partially fit in the read
		 */
		n = 0;
		l = &first;
		for(;;) {
			*l = qremove(q);
			l = &b->next;
			n += BLEN(b);
			if(n >= len || (b = q->bfirst) == nil)
				break;
		}
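
a toy model of the difference, not kernel code.  8192 = 5*1460 + 892, so
a policy that only takes whole blocks returns 7300 bytes per read, while
a policy that fills the read also takes the first 892 bytes of the sixth
block and leaves its tail queued.  the function name and the array-of-
lengths queue are assumptions made for illustration.

```c
enum {
	Mss	= 1460,		/* typical tcp payload per packet, from the post */
	Iounit	= 8192,		/* typical read size, from the post */
};

/*
 * the queue is an array of positive block lengths; len must be positive.
 * greedy=0: take the first block plus following blocks that fit whole.
 * greedy=1: keep going until the read is full, leaving the unread
 *           tail of the last block at the head of the queue.
 * returns the number of reads needed to drain the queue.
 * note: blen is modified in place.
 */
int
countreads(int *blen, int nblk, int len, int greedy)
{
	int i, n, reads;

	reads = 0;
	i = 0;
	while(i < nblk){
		reads++;
		/* the first block is always taken; split it if it is too big */
		if(blen[i] > len){
			blen[i] -= len;
			continue;
		}
		n = blen[i++];
		while(i < nblk && n < len){
			if(n + blen[i] <= len)
				n += blen[i++];		/* fits whole */
			else if(greedy){
				blen[i] -= len - n;	/* take the head, keep the tail */
				n = len;
			}else
				break;			/* stop short of a full read */
		}
	}
	return reads;
}
```

for 28 queued blocks of Mss bytes and reads of Iounit bytes, the whole-
block policy needs 6 reads and the greedy one needs 5.  the saving grows
when blocks are just over half the read size, since then the whole-block
policy returns one block per read.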

- erik


* Re: [9fans] tcp!
  2012-08-18 20:11 [9fans] tcp! erik quanstrom
  2012-08-18 20:26 ` erik quanstrom
@ 2012-08-19 13:17 ` Richard Miller
  2012-08-19 15:43   ` erik quanstrom
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Miller @ 2012-08-19 13:17 UTC (permalink / raw)
  To: 9fans

Within the last month or so I've been having trouble copying large
files to remote servers e.g. sources.  The cp process hangs for
many minutes and eventually ends in 'mount rpc error'.  I was
hoping this tcp patch might solve it, but alas no.

Has anyone else been observing this?


* Re: [9fans] tcp!
  2012-08-18 20:26 ` erik quanstrom
@ 2012-08-19 13:37   ` Richard Miller
  2012-08-19 13:55     ` cinap_lenrek
  2012-08-19 14:48     ` erik quanstrom
  0 siblings, 2 replies; 13+ messages in thread
From: Richard Miller @ 2012-08-19 13:37 UTC (permalink / raw)
  To: 9fans

> also i forgot to mention that this version of qread can
> potentially cut the number of reads on tcp channels by up
> to 1/2.  one might as well completely satisfy the read,
> if possible.

This looks like a good idea for tcp.  But there are other
users of qread, with stricter assumptions.  Aren't you in danger
of breaking the contract of pipe(3) which uses qwrite/qread:

          Writes are atomic up to a certain size, typically 32768
          bytes, that is, each write will be delivered in a single
          read by the recipient, provided the receiving buffer is
          large enough.

To preserve the atomicity of qread/qwrite, maybe tcp should be
coalescing the blocks itself by multiple calls to qread.





* Re: [9fans] tcp!
  2012-08-19 13:37   ` Richard Miller
@ 2012-08-19 13:55     ` cinap_lenrek
  2012-08-19 14:05       ` Richard Miller
  2012-08-19 14:48     ` erik quanstrom
  1 sibling, 1 reply; 13+ messages in thread
From: cinap_lenrek @ 2012-08-19 13:55 UTC (permalink / raw)
  To: 9fans

it's only done on queues that have this flag set i think:

Qcoalesce	= (1<<4),	/* coalesce packets on read */

--
cinap




* Re: [9fans] tcp!
  2012-08-19 13:55     ` cinap_lenrek
@ 2012-08-19 14:05       ` Richard Miller
  2012-08-19 15:07         ` erik quanstrom
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Miller @ 2012-08-19 14:05 UTC (permalink / raw)
  To: 9fans

> it's only done on queues that have this flag set i think:

... and it won't be set for pipes, of course.  Sorry Erik, I should
have studied this more carefully.

I'll try it.






* Re: [9fans] tcp!
  2012-08-19 13:37   ` Richard Miller
  2012-08-19 13:55     ` cinap_lenrek
@ 2012-08-19 14:48     ` erik quanstrom
  1 sibling, 0 replies; 13+ messages in thread
From: erik quanstrom @ 2012-08-19 14:48 UTC (permalink / raw)
  To: 9fans

> This looks like a good idea for tcp.  But there are other
> users of qread, with stricter assumptions.  Aren't you in danger
> of breaking the contract of pipe(3) which uses qwrite/qread:
>
>           Writes are atomic up to a certain size, typically 32768
>           bytes, that is, each write will be delivered in a single
>           read by the recipient, provided the receiving buffer is
>           large enough.

this change only applies to Qcoalesce queues.

the only users of Qcoalesce are the kprintoq and tcp.  both
should be okay with this change.

; g qopen port/devpipe.c
port/devpipe.c:68: 	p->q[0] = qopen(conf.pipeqsize, 0, 0, 0);
port/devpipe.c:73: 	p->q[1] = qopen(conf.pipeqsize, 0, 0, 0);
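
a small model of why the grep output settles the question, assuming (as
the output suggests) that the second qopen argument seeds the queue's
flag word.  Toyq, toyopen and coalesces are hypothetical stand-ins, not
the kernel's Queue or qopen; only the Qcoalesce value comes from the
thread.

```c
enum {
	Qcoalesce = 1<<4,	/* value quoted earlier in the thread */
};

/* hypothetical stand-in for the kernel's Queue; only the flags matter here */
typedef struct Toyq Toyq;
struct Toyq {
	int state;
};

/* models a qopen whose flag argument becomes the initial state */
Toyq
toyopen(int flags)
{
	Toyq q;

	q.state = flags;
	return q;
}

/* nonzero if a read on q would take the coalescing path */
int
coalesces(Toyq *q)
{
	return (q->state & Qcoalesce) != 0;
}
```

a queue opened with 0, as the pipe queues are, never takes the
coalescing branch, so the pipe(3) atomicity guarantee is untouched.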

- erik




* Re: [9fans] tcp!
  2012-08-19 14:05       ` Richard Miller
@ 2012-08-19 15:07         ` erik quanstrom
  0 siblings, 0 replies; 13+ messages in thread
From: erik quanstrom @ 2012-08-19 15:07 UTC (permalink / raw)
  To: 9fans

> ... and it won't be set for pipes, of course.  Sorry Erik, I should
> have studied this more carefully.
> 
> I'll try it.

no problems.  i'm glad you're double-checking.  nobody i know is immune
from error.  and there's me, myself and i.  so i am 3x as likely to screw up.

i'd be curious to know if this makes a noticeable difference on slower machines
like the π with tcptest to self.

- erik


* Re: [9fans] tcp!
  2012-08-19 13:17 ` Richard Miller
@ 2012-08-19 15:43   ` erik quanstrom
  0 siblings, 0 replies; 13+ messages in thread
From: erik quanstrom @ 2012-08-19 15:43 UTC (permalink / raw)
  To: 9fans

On Sun Aug 19 09:19:23 EDT 2012, 9fans@hamnavoe.com wrote:
> Within the last month or so I've been having trouble copying large
> files to remote servers e.g. sources.  The cp process hangs for
> many minutes and eventually ends in 'mount rpc error'.  I was
> hoping this tcp patch might solve it, but alas no.

could you send a snoopy capture?  -M100 and just the tail should
be good enough.  also a capture of /net/log with 'set tcp' during the
issue could be helpful. also, could you point to a particular large
file on sources?  i'd like to try to replicate.

- erik




* Re: [9fans] tcp!
       [not found]   ` <CAGGHmKFokgrbH1P_j+31teoO9ubEvLu2Ti7_QP1+WPLXQ291Mg@mail.gmail.c>
@ 2012-08-22 18:46     ` erik quanstrom
  0 siblings, 0 replies; 13+ messages in thread
From: erik quanstrom @ 2012-08-22 18:46 UTC (permalink / raw)
  To: 9fans

On Wed Aug 22 14:31:13 EDT 2012, sstallion@gmail.com wrote:
> On Wed, Aug 22, 2012 at 10:18 AM, Gorka Guardiola <paurea@gmail.com>
> wrote:
> > I had this problem several years ago with an adsl router (9fans
> > archive may know about this).  There was a bug in my adsl router
> > (which seems to be common, I have seen it since more than once) that
> > dropped ethernet frames of size greater than 1480 (someone counted a
> > header twice probably).  Linux adapts the mss to 1480 if there are
> > problems so it works in this case.
> >
> Not so much a bug as ATM overhead.

atm overhead is 5 bytes per 48 bytes transmitted.

the original problem is a limit of 1496 bytes,
not 1460, which is more consistent with mpls
than l2tp (1460) or pppoe (1492).  but all that's
guesswork.

the "bug" here, if there is one, is that there's
neither an icmp message nor fragmentation
nor mss rewriting at the local gateway, which
should (since it's ethernet) not silently drop
mtu-sized frames that it's responsible for
gatewaying.
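
a sketch of the cell arithmetic behind "5 bytes per 48 bytes": each atm
cell is 53 bytes on the wire and carries 48 bytes of payload.  the helper
name is hypothetical, and the sketch deliberately ignores any adaptation-
layer trailer or encapsulation header, so it is a lower bound on wire
bytes rather than an exact figure for a given dsl line.

```c
enum {
	Cellpayload	= 48,	/* payload bytes per atm cell */
	Cellsize	= 53,	/* payload plus 5-byte cell header */
};

/*
 * hypothetical helper: wire bytes needed to carry n payload bytes
 * (n >= 0) when the last cell is padded out to a full cell.
 */
int
atmwire(int n)
{
	int cells;

	cells = (n + Cellpayload - 1) / Cellpayload;	/* round up */
	return cells * Cellsize;
}
```

the overhead is a per-cell cost on the wire; by itself it does not
shrink the largest frame the link can carry.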

- erik


* Re: [9fans] tcp!
  2012-08-22 17:18 ` Gorka Guardiola
@ 2012-08-22 18:29   ` Steven Stallion
       [not found]   ` <CAGGHmKFokgrbH1P_j+31teoO9ubEvLu2Ti7_QP1+WPLXQ291Mg@mail.gmail.c>
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Stallion @ 2012-08-22 18:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Aug 22, 2012 at 10:18 AM, Gorka Guardiola <paurea@gmail.com> wrote:
> I had this problem several years ago
> with an adsl router (9fans archive may know about this). There was a bug in my adsl router (which seems to be common, I have seen it since more than once) that dropped ethernet frames of size greater than 1480 (someone counted a header twice probably). Linux adapts the
> mss to 1480 if there are problems so it works in this case.

Not so much a bug as ATM overhead.




* Re: [9fans] tcp!
  2012-08-21 18:32 Richard Miller
@ 2012-08-22 17:18 ` Gorka Guardiola
  2012-08-22 18:29   ` Steven Stallion
       [not found]   ` <CAGGHmKFokgrbH1P_j+31teoO9ubEvLu2Ti7_QP1+WPLXQ291Mg@mail.gmail.c>
  0 siblings, 2 replies; 13+ messages in thread
From: Gorka Guardiola @ 2012-08-22 17:18 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I had this problem several years ago
with an adsl router (9fans archive may know about this). There was a bug in my adsl router (which seems to be common, I have seen it since more than once) that dropped ethernet frames of size greater than 1480 (someone counted a header twice probably). Linux adapts the
mss to 1480 if there are problems so it works in this case. 

G.

On Aug 21, 2012, at 8:32 PM, Richard Miller <9fans@hamnavoe.com> wrote:

> I reported:
> 
>> Within the last month or so I've been having trouble copying large
>> files to remote servers e.g. sources.  The cp process hangs for
>> many minutes and eventually ends in 'mount rpc error'.
> 
> Thanks to a hint from Erik ("... an mss problem of some sort"), I've
> managed to make the problem go away, by doing
>  echo mtu 1496 >/net/ipifc/1/ctl
> 
> I hope to come back to this when I have more time, because I don't
> like not understanding why this works.  As nobody else has said they
> have the same trouble, there may be something amiss in my adsl gateway.
> 
> 




* Re: [9fans] tcp!
@ 2012-08-21 18:32 Richard Miller
  2012-08-22 17:18 ` Gorka Guardiola
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Miller @ 2012-08-21 18:32 UTC (permalink / raw)
  To: 9fans

I reported:

> Within the last month or so I've been having trouble copying large
> files to remote servers e.g. sources.  The cp process hangs for
> many minutes and eventually ends in 'mount rpc error'.

Thanks to a hint from Erik ("... an mss problem of some sort"), I've
managed to make the problem go away, by doing
  echo mtu 1496 >/net/ipifc/1/ctl

I hope to come back to this when I have more time, because I don't
like not understanding why this works.  As nobody else has said they
have the same trouble, there may be something amiss in my adsl gateway.
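
A minimal sketch of what the mtu change does to segment size, assuming
IPv4 and TCP headers of 20 bytes each with no options.  The helper name
is hypothetical.

```c
enum {
	Iphdr	= 20,	/* IPv4 header without options */
	Tcphdr	= 20,	/* TCP header without options */
};

/* hypothetical helper: largest TCP payload that fits in one packet of size mtu */
int
tcpmss(int mtu)
{
	return mtu - Iphdr - Tcphdr;
}
```

With the usual ethernet mtu of 1500 this gives 1460; with mtu 1496 it
gives 1456, so every full-sized packet is 4 bytes shorter than before.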

