Wasn't that the whole point of RISC?

It could be, but after having looked briefly at the size of the designs for RISC-V Rocket and especially BOOM, I wonder if it's all overly complicated. They even built their own high-level hardware language (Chisel), embedded in Scala, that generates Verilog. Yuck.

Also, there appear to be quite a lot of compiler optimizations in gcc for RISC-based chips.

Could you get away with a much simpler, smaller hardware design and still run Plan 9 in a reasonable way? Maybe one side of the software/hardware divide has to take on more complexity to help simplify the other side?