From mboxrd@z Thu Jan 1 00:00:00 1970
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-reply-to: Your message of "Thu, 29 Apr 2010 13:23:00 EDT."
References: <5fa9fbfe115a9cd5a81d0feefe413192@quintile.net> <4fa1305e0f56a0ef89c2e05320fa5997@coraid.com> <40cf59cfc2735e232f0fd67df725e65d@kw.quanstro.net> <046b9c874815d108c6ffb7a857e847a7@kw.quanstro.net> <20100429170859.2EBE05B78@mail.bitblocks.com>
From: Bakul Shah
Date: Thu, 29 Apr 2010 20:47:37 -0700
Message-Id: <20100430034737.577235B81@mail.bitblocks.com>
Subject: Re: [9fans] A simple experiment
Topicbox-Message-UUID: 143fd174-ead6-11e9-9d60-3106f5b1d025

On Thu, 29 Apr 2010 13:23:00 EDT erik quanstrom wrote:
> > > 9p, like aoe, is a ping-pong protocol. each message requires an ack.
> > > therefore, the transport layer doesn't need flow control.
> >
> > Therefore, it is also not able to utilise bandwidth
> > effectively over longhaul links. As an example, US coast
> > to coast round trip time latency is about 100ms. Now consider
> > fcp. Each worker thread of fcp does 8K read/writes. Due to
> > pingponging, the *most* a thread can xfer coast to coast is
> > 80KBps (for 16 threads, 1.28MBps). It is actually much worse
> > since each thread doesn't even overlap reads with writes.
>
> i think you are conflating a ping-pong protocol with the
> limitation of a single outstanding message. there is no
> reason that one needs a 1:1 correspondence between threads
> and outstanding messages. nor does the protocol inherently
> prohibit 1 write from becoming multiple messages.

What I am saying is that RPC (which is how 9p is mostly used)
has this inherent performance limitation due to latency. If,
to get around that, you use multiple outstanding messages
(regardless of whether it is one per thread or many per
thread), you *will* require flow control or you run into
problems -- imagine having thousands of outstanding messages.
Or imagine having a low-bandwidth link for fcp.
Fcp will hog the link, to the detriment of other programs. So
on the high end fcp throughput is limited to 1.28MBps, and on
the low end it will hog the link completely (unless your
kernel implements some sort of fair queuing)! If you want to
make optimum use of the available bandwidth while adjusting to
changing conditions and not completely clog up the pipe, you
need a feedback mechanism and a window-opening algorithm a la
TCP or something. And anyway, why do all this in fcp or every
other program that needs streaming? It should be factored out.

> > Short of a sliding window that is as large as the capacity of
> > the pipe, you can't expect to keep it full. As usual one has
> > to trade off simplicity vs performance.
>
> i don't think so. the only thing a sliding window gives you
> is < 1 ack per message sent.

If the sender->receiver pipe can hold N bytes and the sender
is streaming (that is, keeping the pipe full), the sender
*will* be ahead of the receiver by N bytes. So a *streaming*
protocol has to allow it to be N bytes ahead. Even if you have
one ack per message, in a sliding window of N messages (or
bytes), the sender is allowed to get ahead of the receiver by
up to N more messages (or bytes). Here I am not concerned
about in-order delivery (though typically people assume this
when they talk about sliding windows). If you assume in-order
delivery, you can coalesce multiple acks, but that can be seen
as an ack optimization (and then you either throw away
out-of-order deliveries or have to add selective acks or
something).
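The arithmetic in the thread is easy to check. Below is a
small sketch (Python, not part of the original exchange) of
the stop-and-wait throughput limit quoted above, and of the
bandwidth-delay product that the sliding-window argument says
the sender must be allowed to stay ahead by; the 1 Gb/s link
speed in the last example is purely an assumed figure for
illustration:

```python
def stop_and_wait_throughput(msg_bytes, rtt_s):
    """Best-case bytes/sec when only one message may be in flight:
    each msg_bytes transfer costs one full round trip."""
    return msg_bytes / rtt_s

rtt = 0.100      # coast-to-coast round trip, ~100 ms
msg = 8 * 1024   # fcp's 8K read/write size

per_thread = stop_and_wait_throughput(msg, rtt)
print(per_thread)       # 81920.0 B/s, i.e. the 80KBps in the thread
print(16 * per_thread)  # 1310720.0 B/s, i.e. ~1.28MBps for 16 threads

def window_needed(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: how far (in bytes) the sender must be
    allowed to run ahead of the receiver to keep the pipe full."""
    return bandwidth_bps * rtt_s

# Assumed example: a 1 Gb/s (125,000,000 B/s) coast-to-coast link
# needs ~12.5 MB of unacked data in flight to stay full.
print(window_needed(125_000_000, rtt))  # 12500000.0
```

The point of the second function is the "N bytes ahead"
argument: any window smaller than the bandwidth-delay product
caps throughput below the link rate, no matter how acks are
batched.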