For the operations that matter in this context (read and write), 9P
allows multiple outstanding tags. A while back rsc implemented fcp
partly to prove this point.
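
The idea is easy to sketch. Below is a minimal illustration in C
using POSIX threads and pread/pwrite (plan9port has equivalents);
the worker count and chunk size are made up for the example and are
not fcp's actual parameters, and error handling is trimmed. The
point is that several reads and writes are in flight at once, each
of which can ride on its own tagged 9P request:

/* sketch: N workers each pread/pwrite disjoint chunks so that
 * several operations are outstanding at once (the idea behind
 * fcp); illustrative only, error handling omitted. */
#include <pthread.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>

enum { NWORKER = 4, CHUNK = 128*1024 };

static int srcfd, dstfd;
static off_t fsize;
static off_t nextoff;                 /* next unclaimed offset */
static pthread_mutex_t lk = PTHREAD_MUTEX_INITIALIZER;

static void *
worker(void *arg)
{
	char *buf = malloc(CHUNK);

	for(;;){
		pthread_mutex_lock(&lk);
		off_t off = nextoff;
		nextoff += CHUNK;
		pthread_mutex_unlock(&lk);
		if(off >= fsize)
			break;
		ssize_t n = pread(srcfd, buf, CHUNK, off);
		if(n <= 0)
			break;
		pwrite(dstfd, buf, n, off);   /* short writes ignored here */
	}
	free(buf);
	return NULL;
}

int
main(int argc, char **argv)
{
	struct stat st;
	pthread_t t[NWORKER];
	int i;

	srcfd = open(argv[1], O_RDONLY);
	dstfd = open(argv[2], O_WRONLY|O_CREAT|O_TRUNC, 0666);
	fstat(srcfd, &st);
	fsize = st.st_size;
	for(i = 0; i < NWORKER; i++)
		pthread_create(&t[i], NULL, worker, NULL);
	for(i = 0; i < NWORKER; i++)
		pthread_join(t[i], NULL);
	return 0;
}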

On Wed, Oct 10, 2018 at 2:54 PM Steven Stallion <sstallion@gmail.com> wrote:
As the guy who wrote the majority of the code that pushed those 1M 4K
random IOPS erik mentioned, this thread annoys the shit out of me. You
don't get an award for writing a driver. In fact, it's probably better
not to be known at all, considering the bloody murder one has to commit
to marry hardware and software together.

Let's be frank: the I/O handling in the kernel is anachronistic. To
hit those rates, I had to add support for asynchronous and vectored
I/O, not to mention a sizable bit of work by a co-worker to properly
handle NUMA on our appliances. As I recall, we had to rewrite the
scheduler and re-implement locking, which even Charles Forsyth had a
hand in. Had we the time and resources to implement something like
zero-copy, we'd have done it in a heartbeat.
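
For anyone who has not run into vectored I/O: one call scatters data
into several discontiguous buffers, so the kernel can fill them
without an intermediate copy. Plan 9's read takes a single buffer;
the sketch below uses the POSIX analogue (readv) purely to
illustrate the concept, not to claim it matches the interface
described above:

/* sketch: one readv() call fills two separate buffers, avoiding a
 * copy through a single staging buffer (POSIX analogue of vectored
 * I/O; illustrative only). */
#include <sys/uio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

int
main(void)
{
	char hdr[16], body[4096];
	struct iovec iov[2] = {
		{ .iov_base = hdr,  .iov_len = sizeof hdr  },
		{ .iov_base = body, .iov_len = sizeof body },
	};
	int fd = open("/tmp/data", O_RDONLY);
	if(fd < 0)
		return 1;
	ssize_t n = readv(fd, iov, 2);   /* one call, two buffers */
	printf("read %zd bytes\n", n);
	close(fd);
	return 0;
}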

In the end, it doesn't matter how "fast" a storage driver is in Plan 9
- as soon as you put a 9P-based filesystem on it, it's going to be
limited to a single outstanding operation. This is the tyranny of 9P.
We (Coraid) got around this by avoiding filesystems altogether.

Go solve that problem first.
On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
>
> > But the reason I want this is to reduce latency to the first
> > access, especially for very large files. With read() I have
> > to wait until the read completes. With mmap() processing can
> > start much earlier and can be interleaved with background
> > data fetch or prefetch. With read() a lot more resources
> > are tied down. If I need random access and don't need to
> > read all of the data, the application has to do pread() and
> > pwrite() a lot, thus complicating it. With mmap() I can just
> > map in the whole file and excess reading (beyond what the
> > app needs) will not be a large fraction.
>
> you think doing single 4K page-sized reads in the pagefault
> handler is better than doing precise >4K reads from your
> application? possibly in a background thread, so you can
> overlap processing with data fetching?
>
> the advantage of mmap is not prefetch. it's about not doing
> any I/O at all when the data is already in the *SHARED* buffer
> cache! which plan9 does not have (except the mntcache, but
> that is optional and only works for the disk fileservers that
> maintain their file qid version info consistently). it *IS*
> really a linux thing, where all block device i/o goes through
> the buffer cache.
>
> --
> cinap
>
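
To make cinap's suggestion above concrete (a background thread so
that processing overlaps with data fetching), here is a minimal
double-buffering sketch in C with POSIX threads and semaphores.
Names and sizes are illustrative, it assumes a regular file, and a
real program would check errors:

/* sketch: double-buffered background reading -- the reader thread
 * preads the next chunk while the main thread processes the current
 * one, overlapping I/O with computation, no mmap involved. */
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

enum { CHUNK = 1024*1024 };

static char slot[2][CHUNK];
static ssize_t len[2];
static sem_t empty[2], full[2];
static int fd;

static void *
reader(void *arg)
{
	off_t off = 0;
	int i;

	for(i = 0;; i ^= 1){
		sem_wait(&empty[i]);          /* wait until slot i is free */
		len[i] = pread(fd, slot[i], CHUNK, off);
		off += CHUNK;
		sem_post(&full[i]);           /* slot i now holds data */
		if(len[i] <= 0)
			return NULL;              /* EOF or error: stop */
	}
}

int
main(int argc, char **argv)
{
	pthread_t t;
	long total = 0;
	int i;

	fd = open(argv[1], O_RDONLY);
	for(i = 0; i < 2; i++){
		sem_init(&empty[i], 0, 1);
		sem_init(&full[i], 0, 0);
	}
	pthread_create(&t, NULL, reader, NULL);
	for(i = 0;; i ^= 1){
		sem_wait(&full[i]);           /* wait for data in slot i */
		if(len[i] <= 0)
			break;
		total += len[i];              /* "process" the chunk here */
		sem_post(&empty[i]);          /* hand the slot back */
	}
	pthread_join(t, NULL);
	printf("processed %ld bytes\n", total);
	return 0;
}

With two slots, the pread for chunk i+1 proceeds while the main
thread works on chunk i. The overlap comes from the application
doing precise, large reads, not from a pagefault handler faulting
in one 4K page at a time.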