On Tue, Oct 9, 2018 at 5:28 PM hiro <23hiro@gmail.com> wrote:
> E.g. right now Plan 9 suffers from a *lot* of data copying between
> the kernel and processes, and between processes themselves.

Huh? What exactly do you mean?

The current plan9 architecture relies heavily on copying data within a process between userspace and the kernel for e.g. IO. This should be well known to anyone who's rummaged around in the kernel, as it's pretty evident. Because of the simple system call and VM interfaces, things like scatter/gather IO or memory-mapped direct hardware access aren't really options. `iowritev`, for example, coalesces its arguments into a single buffer that it then pwrite()'s to its destination.
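
To make the coalescing concrete, here is a minimal sketch in POSIX C (not Plan 9 C, and not the actual library source). The name `coalesced_pwritev` is hypothetical; it only illustrates the pattern described above, where the segments are copied into one temporary buffer before a single pwrite(), so each byte is copied once in userspace and then again into the kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: gather the iovec
 * segments into one temporary buffer and issue a single pwrite().
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t
coalesced_pwritev(int fd, const struct iovec *iov, int niov, off_t off)
{
    size_t total = 0, pos = 0;
    char *buf;
    ssize_t n;
    int i;

    assert(niov >= 0);
    for(i = 0; i < niov; i++)
        total += iov[i].iov_len;
    if(total == 0)
        return 0;

    buf = malloc(total);
    if(buf == NULL)
        return -1;

    /* The extra copy: every segment is duplicated into buf. */
    for(i = 0; i < niov; i++){
        memcpy(buf + pos, iov[i].iov_base, iov[i].iov_len);
        pos += iov[i].iov_len;
    }

    n = pwrite(fd, buf, total, off);
    free(buf);
    return n;
}
```

A true scatter/gather interface would hand the segment list to the kernel and skip the memcpy loop entirely.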

Can you describe the scenario and the
measurements you made?

This is a different issue. I don't know if copying is as significant an overhead as Lyndon suggested, but there are plenty of slow code paths in plan9. For example, when we ported the plan9 network stack to Akaros, we made a number of enhancements that together sped things up by 50% or more. Most of these were pretty simple: optimizing checksum calculations, aligning IP and TCP headers on natural word boundaries so that we could read an IP address with a single 32-bit load (I think that one netted a gigabit increase in throughput), using an optimized memcpy instead of memmove in performance-critical code paths, etc. We went from about 7Gbps on a 10Gbps interface to saturating the NIC. Those measurements were made between dedicated test machines on a dedicated network using netperf. Drew Gallatin, now at Netflix working on FreeBSD's network stack, did most of the optimization work.
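
To illustrate the alignment point and the checksum work, here is a small sketch in POSIX C. It is not the Akaros or plan9 code; the function names are made up for the example. The first function assembles an IPv4 address a byte at a time, which works at any alignment. The second assumes the field sits on a 4-byte boundary and does one 32-bit load. The third is a plain reference implementation of the RFC 1071 Internet checksum, i.e. the kind of routine one would start from before optimizing.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Alignment-agnostic: four byte loads plus shifts. Host byte order. */
uint32_t
ipv4_load_bytes(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/*
 * Assumes p is 4-byte aligned (hypothetical precondition for this
 * sketch). Compilers typically lower the memcpy to one 32-bit load.
 */
uint32_t
ipv4_load_aligned(const void *p)
{
    uint32_t v;

    assert(((uintptr_t)p & 3) == 0);
    memcpy(&v, p, sizeof v);
    return ntohl(v);
}

/* Reference RFC 1071 checksum over len bytes, one 16-bit word at a time. */
uint16_t
inet_cksum(const uint8_t *p, size_t len)
{
    uint64_t sum = 0;

    while(len > 1){
        sum += (uint32_t)p[0] << 8 | p[1];
        p += 2;
        len -= 2;
    }
    if(len == 1)
        sum += (uint32_t)p[0] << 8;   /* pad odd trailing byte with zero */
    while(sum >> 16)                  /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Typical checksum optimizations sum wider words per iteration and fold once at the end; the byte-pair loop above is the unoptimized baseline.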

If the experience in that one section of the kernel is any indicator, plan9 undoubtedly has lots of room for optimization in other parts of the system. Many aspects of the system were optimized for much smaller machines than are common now, and many of those optimizations no longer make much sense on modern machines; the allocator is slow, for example, though very good at not wasting RAM. Compare that to a vmem-style allocator, which can allocate any requested size in constant time, but with up to a factor of two waste of memory.
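
Where the "factor of two" comes from can be shown with a toy sketch in C. This is a simplification, not vmem itself: it assumes requests are rounded up to a power-of-two size class and served from a per-class freelist, and it ignores everything else a real allocator needs. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round n up to the next power of two. The loop runs at most once
 * per bit in size_t, so the cost is bounded by a constant.
 */
size_t
size_class(size_t n)
{
    size_t c = 1;

    assert(n >= 1 && n <= ((size_t)-1 >> 1) + 1);
    while(c < n)
        c <<= 1;
    return c;
}

/* Index of the freelist that serves requests of size n (log2 of the class). */
unsigned
freelist_index(size_t n)
{
    size_t c = size_class(n);
    unsigned i = 0;

    while(c > 1){
        c >>= 1;
        i++;
    }
    return i;
}
```

A request of 4097 bytes lands in the 8192-byte class, so just under half of the block is wasted; that is the worst case, since the class is always less than twice the request.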

Lots of plan9 code is also buggy, or at least racy: consider the seemingly arbitrary timeouts to "give other threads 5 seconds to get out" in ipselffree() and iplinkfree() before "deallocating" an Iplink/Ipself. Something like RCU, even a naive RCU, would be more robust here, particularly under heavy load. Device drivers are atrophied and often buggy, or at least susceptible to hardware bugs that are fixed by the vendor-provided drivers. When I put in the plan9 networks to support Akaros development, we ran into a bug in the i218 ethernet controller that caused the NIC to wedge. We got Geoff Collyer to fix the i82563 driver and we sent a patch to 9legacy, but it's symptomatic of an aging code base with a shrinking developer population.

> If we could eliminate most of that copying, things would get a lot faster.

Which things would get faster?

Presumably bulk data transfer between devices and the user portion of an address space. If copying were eliminated (or just reduced) these would certainly get fast*er*. Whether they would be sufficiently faster to make a perceptible performance difference to a real workload is another matter.

> Dealing with the security issues isn't trivial

what security issues?

Presumably the bread-and-butter security issues that arise whenever the user portion of an address space is being concurrently accessed by hardware. As a trivial example, imagine scheduling a DMA transfer from some device into a buffer in the user portion of an address space and then exit()'ing the process. What do you do with the pages the device was writing into? They had better be pinned in some way until the IO operation completes, before they're reallocated to something else that isn't expecting its memory to be clobbered.
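
One way to picture "pinned in some way" is a per-page pin count. The following is a toy, single-threaded model in C, not kernel code: the `Page` structure and function names are invented for the example, and a real implementation would need locking and integration with the VM system.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct Page Page;
struct Page {
    int  pins;      /* outstanding DMA transfers targeting this page */
    bool orphaned;  /* owner exited while the page was still pinned */
    bool free;      /* page is back on the free list */
};

/* Called when a DMA transfer into the page is scheduled. */
void
page_pin(Page *p)
{
    assert(!p->free);
    p->pins++;
}

/*
 * Called at process exit. If a device may still write into the
 * page, defer the free instead of handing the page to someone else.
 */
void
page_release(Page *p)
{
    if(p->pins > 0)
        p->orphaned = true;
    else
        p->free = true;
}

/* Called on DMA completion; the last unpin frees an orphaned page. */
void
page_unpin(Page *p)
{
    assert(p->pins > 0);
    if(--p->pins == 0 && p->orphaned){
        p->orphaned = false;
        p->free = true;
    }
}
```

The point is only the ordering: the page cannot reach the free list between exit and IO completion.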

I wouldn't be surprised if the raft of currently popular speculative execution bugs could be exacerbated by the kernel playing around with data in the user address space in a naive way. It doesn't look like plan9 has any serious mitigations for those.

        - Dan C.