From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <01be01c2ce04$8c6cb240$654cb2cc@kds>
From: "David Butler"
To: <9fans@cse.psu.edu>
References:
Subject: Re: [9fans] GCC3.0 [Was; Webbrowser]
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Date: Thu, 6 Feb 2003 11:23:09 -0600
Topicbox-Message-UUID: 51d79bc6-eacb-11e9-9e20-41e7f4b1d025

Forgive the length of this response, but I've been thinking about this
for some time.

> The real question for me is whether the speed difference is fundamental or
> not.  I don't know enough to know.

My opinion is that the problem is fundamental but fixable.

The fundamental problem is the synchronous nature of the API. This is
where I think Plan 9 has not gone far enough in breaking with tradition.
Even though ANSI and POSIX are given the finger, it is only slightly so.
The underlying system is very asynchronous (9P messages carry tags to
correlate replies with requests) and a lot of overlap is allowed across
multiple processes, but any one application is slow.

Take for example the command

	sed -e script <input >output

It is not hard to imagine the flow of 9P messages that stream through
the system to fulfill this command. Each message is issued
synchronously: the next request does not go out until the previous
reply has come back. This is why most of the execution time of this
command is spent in "halt". But this is not an OS or protocol issue,
because if 100 of these commands are executed in parallel, the real
time to completion is quite a bit less than 100 times the real time
for one.

There is an optimization in the file server code that tries to assist
the single-thread case but has to be removed to get good performance in
the multi-thread case: the file server's attempt to fill its cache with
some number of blocks ahead of the point where a file is being read.
This, like any cache policy, should not live in the file server but in
the application. LRU is not always the best way; this is why database
systems bypass the UNIX buffer cache and use raw disks. I have changed
my file server to use the cache only for its own needs and to let
applications implement their own policy in their local address space.

To get performance with the current API, sed would need to be rewritten
with multiple threads that launch parallel reads and writes (a rough
sketch is appended below). Since it does not reuse the data after it
has modified it, it has no need for a cache.

Because of these thoughts and efforts, my system runs a little slower
than the average Plan 9 system in casual use, but the applications I
have written for performance are very fast and can use every resource
I can throw at them.

So I know the API needs to allow more I/O concurrency and cache policy
management. I do not yet know what that should look like. I'm
interested in any ideas.

David
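
P.S. For concreteness, here is a minimal sketch (not from anything I
have measured; the program name, Nproc, and Blk are made up for
illustration) of the kind of restructuring I mean, in Plan 9 C with
libthread. It fans one batch of preads on a file out across procs so
the reads are outstanding concurrently, then waits for them all.

#include <u.h>
#include <libc.h>
#include <thread.h>

/*
 * Sketch only: dispatch one batch of concurrent reads.
 * Nproc and Blk are arbitrary.  A real filter would transform each
 * block, write it out at the matching offset, and keep issuing new
 * requests as replies come back, preserving output order.
 */
enum {
	Nproc	= 8,
	Blk	= 64*1024,
};

typedef struct Req Req;
struct Req {
	int	fd;
	vlong	off;
	Channel	*done;	/* reader sends its byte count here when finished */
};

static void
reader(void *v)
{
	Req *r;
	char *buf;
	long n;

	r = v;
	buf = malloc(Blk);
	if(buf == nil)
		sysfatal("malloc: %r");
	n = pread(r->fd, buf, Blk, r->off);	/* one outstanding read per proc */
	/* ... transform buf and pwrite it to the output here ... */
	free(buf);
	sendul(r->done, n);
}

void
threadmain(int argc, char *argv[])
{
	int fd, i;
	Req req[Nproc];
	Channel *done;

	if(argc != 2)
		sysfatal("usage: preads file");
	fd = open(argv[1], OREAD);
	if(fd < 0)
		sysfatal("open: %r");
	done = chancreate(sizeof(ulong), Nproc);
	for(i = 0; i < Nproc; i++){
		req[i].fd = fd;
		req[i].off = (vlong)i * Blk;
		req[i].done = done;
		proccreate(reader, &req[i], 8192);	/* separate procs so the preads overlap */
	}
	for(i = 0; i < Nproc; i++)
		recvul(done);
	threadexits(nil);
}

The only point is that with one proc per outstanding request the kernel
and the file server see Nproc reads at once instead of one, which the
single-threaded read loop can never give them.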