Basically, Plan 9 (kernel and applications) was designed and written for multiprocessors,
and the kernel is written with pre-emptive concurrency in mind (rather than, say, retrofitting
it all). That extends to the drivers and most platform-specific kernel code (except where someone slipped up,
which is rare). Even on platforms that are currently uniprocessor, the discipline is to write the
mutual exclusion code as required. Cache control is less well-developed,
since most platforms so far have offered some adequate form of coherency. In particular,
explicit cache flushing and invalidation are missing from some x86 drivers, because
the architecture did the work, so that DOS would run. In practice, most embedded platforms
have had custom SoC devices, or different devices from x86 at any rate,
so the cache flushing was included when a new driver
was written (once we understood the problem). Unfortunately, it was done
using different primitives, or at least primitive names, for different architectures.
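
To make that last point concrete, here is a minimal sketch in C of one way a driver could be written against a single pair of names, with each architecture mapping them onto whatever its own primitives happen to be called. None of this is taken from a Plan 9 port: dmaflush, dmainval, archa_dcflush, archa_dcinval, archb_cachewb, archb_cacheinv and the ARCH_B switch are all hypothetical names invented for the example, and the "machine" is a toy model (two arrays) rather than real cache hardware.

```c
#include <assert.h>
#include <string.h>

enum { Memsize = 64 };

/*
 * Toy model of a non-coherent machine: CPU loads and stores see only
 * `cache`, the DMA device sees only `mem`. Real hardware tracks lines
 * and dirty bits; this model just copies whole ranges.
 */
static unsigned char mem[Memsize];
static unsigned char cache[Memsize];

/* Hypothetical per-architecture primitives; the names are invented here. */
#ifdef ARCH_B
static void archb_cachewb(int off, int n) { memmove(mem+off, cache+off, n); }
static void archb_cacheinv(int off, int n) { memmove(cache+off, mem+off, n); }
#define dmaflush(off, n)	archb_cachewb((off), (n))
#define dmainval(off, n)	archb_cacheinv((off), (n))
#else
static void archa_dcflush(int off, int n) { memmove(mem+off, cache+off, n); }
static void archa_dcinval(int off, int n) { memmove(cache+off, mem+off, n); }
#define dmaflush(off, n)	archa_dcflush((off), (n))
#define dmainval(off, n)	archa_dcinval((off), (n))
#endif

/* True if [off, off+n) lies inside the modelled memory. */
int
inrange(int off, int n)
{
	return off >= 0 && n >= 0 && n <= Memsize - off;
}

/* A plain CPU store: it lands in the cache and goes no further. */
void
cpuwrite(int off, const void *data, int n)
{
	assert(inrange(off, n));
	memmove(cache+off, data, n);
}

/* The device side of a DMA transfer: it touches memory only. */
void
devread(int off, void *data, int n)
{
	assert(inrange(off, n));
	memmove(data, mem+off, n);
}

void
devwrite(int off, const void *data, int n)
{
	assert(inrange(off, n));
	memmove(mem+off, data, n);
}

/* Transmit path: fill the buffer, then write it back before the device looks. */
void
txprepare(int off, const void *data, int n)
{
	cpuwrite(off, data, n);
	dmaflush(off, n);
}

/* Receive path: discard the stale cached copy before the CPU reads. */
void
rxcomplete(int off, void *data, int n)
{
	assert(inrange(off, n));
	dmainval(off, n);	/* modelled as a refetch from memory */
	memmove(data, cache+off, n);
}
```

The driver-level code (txprepare and rxcomplete) only ever uses dmaflush and dmainval, so building with or without ARCH_B changes which primitive is called but not the driver source. On a coherent platform the two macros could simply expand to nothing, which is roughly the situation the x86 drivers were relying on implicitly.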