From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <01be01c2ce04$8c6cb240$654cb2cc@kds>
From: "David Butler"
To: <9fans@cse.psu.edu>
References:
Subject: Re: [9fans] GCC3.0 [Was; Webbrowser]
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Date: Thu, 6 Feb 2003 11:23:09 -0600
Topicbox-Message-UUID: 51d79bc6-eacb-11e9-9e20-41e7f4b1d025

Forgive the length of this response, but I've been thinking about this
for some time.

> The real question for me is whether the speed difference is fundamental or
> not.  I don't know enough to know.

My opinion is that the problem is fundamental but fixable.

The fundamental problem is the synchronous nature of the API. This is
where I think Plan 9 has not gone far enough in breaking with tradition.
Even though ANSI and POSIX are given the finger, it is only slightly so.
The underlying system is very asynchronous (9P messages carry tags to
correlate replies with requests) and a lot of overlap is allowed across
multiple processes, but any one application is slow.

Take for example the command

	sed -e script <input >output

It is not hard to imagine the flow of 9P messages that stream through
the system to fulfill this command. Each message is issued
synchronously: the next request does not go out until the previous
reply has come back. This is why most of the execution time of this
command is spent in "halt". But this is not an OS or protocol issue,
because if 100 of these commands are executed in parallel, the real
time to completion is quite a bit less than 100 times the real time
for one.

There is an optimization in the file server code that tries to assist
the single-thread case but has to be removed to get good performance in
the multi-thread case: the file server's attempt to fill its cache with
some number of blocks ahead of the point where a file is being read.
This, like any cache policy, should not live in the file server but in
the application. LRU is not always the best way; this is why database
systems bypass the UNIX buffer cache and use raw disks. I have changed
my file server to use the cache only for its own needs and to let
applications implement their own policy in their local address space.

To get performance with the current API, sed would need to be rewritten
with multiple threads that launch parallel reads and writes (a rough
sketch is appended below). Since it does not reuse the data after it
has modified it, it has no need for a cache.

Because of these thoughts and efforts, my system runs a little slower
than the average Plan 9 system in casual use, but the applications I
have written for performance are very fast and can use every resource
I can throw at them.

So I know the API needs to allow more I/O concurrency and cache policy
management. I do not yet know what that should look like. I'm
interested in any ideas.

David
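
P.S. For concreteness, here is a minimal sketch (not from anything I
have measured; the program name, Nproc, and Blk are made up for
illustration) of the kind of restructuring I mean, in Plan 9 C with
libthread. It fans one batch of preads on a file out across procs so
the reads are outstanding concurrently, then waits for them all.

#include <u.h>
#include <libc.h>
#include <thread.h>

/*
 * Sketch only: dispatch one batch of concurrent reads.
 * Nproc and Blk are arbitrary.  A real filter would transform each
 * block, write it out at the matching offset, and keep issuing new
 * requests as replies come back, preserving output order.
 */
enum {
	Nproc	= 8,
	Blk	= 64*1024,
};

typedef struct Req Req;
struct Req {
	int	fd;
	vlong	off;
	Channel	*done;	/* reader sends its byte count here when finished */
};

static void
reader(void *v)
{
	Req *r;
	char *buf;
	long n;

	r = v;
	buf = malloc(Blk);
	if(buf == nil)
		sysfatal("malloc: %r");
	n = pread(r->fd, buf, Blk, r->off);	/* one outstanding read per proc */
	/* ... transform buf and pwrite it to the output here ... */
	free(buf);
	sendul(r->done, n);
}

void
threadmain(int argc, char *argv[])
{
	int fd, i;
	Req req[Nproc];
	Channel *done;

	if(argc != 2)
		sysfatal("usage: preads file");
	fd = open(argv[1], OREAD);
	if(fd < 0)
		sysfatal("open: %r");
	done = chancreate(sizeof(ulong), Nproc);
	for(i = 0; i < Nproc; i++){
		req[i].fd = fd;
		req[i].off = (vlong)i * Blk;
		req[i].done = done;
		proccreate(reader, &req[i], 8192);	/* separate procs so the preads overlap */
	}
	for(i = 0; i < Nproc; i++)
		recvul(done);
	threadexits(nil);
}

The only point is that with one proc per outstanding request the kernel
and the file server see Nproc reads at once instead of one, which the
single-threaded read loop can never give them.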