From mboxrd@z Thu Jan  1 00:00:00 1970
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-reply-to: Your message of "Sun, 09 Jan 2011 09:29:04 PST."
References: <16094d5a594bfa72dd0e9ac6f3f8b31c@plug.quanstro.net>
From: Bakul Shah
Date: Sun, 9 Jan 2011 11:54:26 -0800
Message-Id: <20110109195426.D1ED35B42@mail.bitblocks.com>
Subject: Re: [9fans] fs performance
Topicbox-Message-UUID: 92df6972-ead6-11e9-9d60-3106f5b1d025

On Sun, 09 Jan 2011 09:29:04 PST ron minnich wrote:
>
> Those are interesting numbers. Actually, however, changing a program
> to use the stream stuff is trivial. I would expect the streaming to be
> a real loser in a site with 10GE but we can try it. As John has
> pointed out the streaming only makes sense where the inherent network
> latency is pretty high (10s of milliseconds), i.e. the wide area.

For "streaming" at wirespeed you want to keep the pipe full at all times. For that to happen, by the time the ack for byte# N arrives, you would have sent at least RTT*bandwidth more bytes. So it is the latency*bandwidth product and not just latency that matters -- that is, streaming makes sense at lower latencies for higher speed networks.

Ideally on a 10GbE you should get something like 1.1GBps of TCP bandwidth. If the throughput you get from local filesystems is not in this neighborhood, that would mean there are bigger bottlenecks than dealing with latency. In this case streaming won't help much.

On FreeBSD (on a 3.2GHz w3550) dd /dev/null yields 3.5GBps (@128k blocksize). md0 is a ramdisk. Writes to it run at about 1/4th of the read speed. dd /dev/null gives 15+GBps -- basically system overhead for IO. dd from an uncached file on ZFS is about 350MBps. Once the file is cached, the throughput increases to 2.5GBps. None of these use any streaming (though there *is* readahead at the FS level).

[GBps == 10^9 bytes/second. MBps == 10^6 bytes/second]

The point of mentioning FreeBSD numbers is to show what is possible.
To really improve plan9 fs performance one would have to look at things like syscall overhead, number of data copies made, number of syscalls and context switches etc. and tune each component. [Aside: has ttcp been ported to plan9?]
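
To put rough numbers on the latency*bandwidth argument above, here is a minimal sketch, not a measurement. The link speeds and RTTs are illustrative assumptions, as is the 8192-byte request size used for the one-request-per-round-trip case; the function names are made up for this example. With these inputs, a 10GbE link at 0.1 ms RTT needs about 125 KB in flight to stay full, and at 50 ms it needs about 62.5 MB.

```python
# Sketch of the latency*bandwidth (bandwidth-delay product) arithmetic.
# All link speeds, RTTs and the request size below are assumed values
# chosen for illustration, not figures from any real measurement.

def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the pipe full for one RTT."""
    return bandwidth_bits_per_s / 8 * rtt_s


def stop_and_wait_bytes_per_s(request_bytes: float, rtt_s: float) -> float:
    """Throughput ceiling when only one request is outstanding per RTT."""
    return request_bytes / rtt_s


LINKS = {"1GbE": 1e9, "10GbE": 10e9}               # bits/second
RTTS = {"LAN 0.1 ms": 1e-4, "WAN 50 ms": 50e-3}    # seconds
REQUEST_BYTES = 8192                               # hypothetical request size

if __name__ == "__main__":
    for link, bps in LINKS.items():
        for label, rtt in RTTS.items():
            print(f"{link:6s} {label:11s} "
                  f"bytes in flight to fill pipe: {bdp_bytes(bps, rtt):>12,.0f}")
    for label, rtt in RTTS.items():
        ceiling = stop_and_wait_bytes_per_s(REQUEST_BYTES, rtt) / 1e6
        print(f"one {REQUEST_BYTES}-byte request per RTT, {label}: "
              f"{ceiling:.2f} MBps ceiling")
```

The second function shows why the product matters even on a LAN: if only one small request is outstanding per round trip, the ceiling is set by request size over RTT, independent of how fast the wire is.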