The Unix Heritage Society mailing list
From: Dan Cross <>
To: Warner Losh <>
Cc: Paul Ruizendaal <>,
	The Unix Historical Society <>
Subject: [TUHS] Re: Porting the SysIII kernel: boot, config & device drivers
Date: Sat, 31 Dec 2022 23:40:44 -0500
Message-ID: <>
In-Reply-To: <>

On Sat, Dec 31, 2022 at 10:09 PM Warner Losh <> wrote:
> On Sat, Dec 31, 2022 at 1:03 PM Paul Ruizendaal <> wrote:
> There's been much nasty said about FDT and ACPI, but they do solve real problems: how to enumerate this diversity to the OS in a way that's sane and might not always be as simple as returning a specific number but that requires hardware access to answer even basic questions (because, say, the CPUs were wired this way or that and you have to read those wirings). Linux, even in Linuxboot environments, still uses ACPI, FDT and UEFI to get the job done, and the code there isn't horrific.

This is sort of the issue I have with ACPI+UEFI et al. If we stopped
at what you say (a small nucleus of primordial software that runs once
at the beginning of time intended only to provide essentially static
information to the OS in some relatively sane format, but then dies
and is never consulted again) then either is fine: the ACPI table
formats where this sort of information is encoded are well-defined and
not horrible; ripping through the MADT to find all of your CPUs is
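
As a rough illustration of how little is involved in walking the MADT,
here is a minimal sketch in Python. It assumes the layout given in the
ACPI specification: a 36-byte table header, 8 bytes of MADT-specific
fields, then a sequence of (type, length) entries. The names madt_cpus
and fake_madt are hypothetical, the fixture values are made up, and the
table checksum is deliberately not verified.

```python
import struct

SDT_HEADER_LEN = 36                    # standard ACPI table header
MADT_FIXED_LEN = SDT_HEADER_LEN + 8    # + local APIC address (u32) + flags (u32)
TYPE_LOCAL_APIC = 0                    # Processor Local APIC entry, 8 bytes
TYPE_LOCAL_X2APIC = 9                  # Processor Local x2APIC entry, 16 bytes
FLAG_ENABLED = 1 << 0


def madt_cpus(table: bytes) -> list[int]:
    """Return APIC IDs of enabled CPUs in a raw MADT (checksum not verified)."""
    if len(table) < MADT_FIXED_LEN or table[:4] != b"APIC":
        raise ValueError("not a MADT")
    (length,) = struct.unpack_from("<I", table, 4)
    if not MADT_FIXED_LEN <= length <= len(table):
        raise ValueError("bad MADT length")
    ids = []
    off = MADT_FIXED_LEN
    while off + 2 <= length:
        etype, elen = struct.unpack_from("<BB", table, off)
        if elen < 2 or off + elen > length:
            raise ValueError("malformed MADT entry")
        if etype == TYPE_LOCAL_APIC and elen >= 8:
            _uid, apic_id, flags = struct.unpack_from("<BBI", table, off + 2)
            if flags & FLAG_ENABLED:
                ids.append(apic_id)
        elif etype == TYPE_LOCAL_X2APIC and elen >= 16:
            apic_id, flags, _uid = struct.unpack_from("<III", table, off + 4)
            if flags & FLAG_ENABLED:
                ids.append(apic_id)
        off += elen                    # skip entry types we do not care about
    return ids


def fake_madt(entries: list[bytes]) -> bytes:
    """Hypothetical test fixture: wrap entries in a MADT header."""
    body = struct.pack("<II", 0xFEE00000, 1) + b"".join(entries)
    header = struct.pack("<4sIBB6s8sI4sI", b"APIC", SDT_HEADER_LEN + len(body),
                         5, 0, b"OEMID ", b"TABLEID ", 1, b"TEST", 1)
    return header + body


if __name__ == "__main__":
    table = fake_madt([
        struct.pack("<BBBBI", 0, 8, 0, 0, 1),         # CPU 0, enabled
        struct.pack("<BBBBI", 0, 8, 1, 1, 0),         # CPU 1, disabled
        struct.pack("<BBHIII", 9, 16, 0, 256, 1, 2),  # x2APIC ID 256, enabled
    ])
    print(madt_cpus(table))
```

The point of the sketch is only that this is a one-pass scan over static
data: nothing here requires calling back into firmware.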

But that's not all that either does, and the amount of functionality
being shoved into UEFI/ACPI in particular seems to show no sign of
slowing down. I get that having this sort of parallel OS that exposes
functionality in a manner transparent to what we normally consider the
operating system coupled with CPU "features" like SMM means that
vendors can write clever software to hide the fact that you've got a
USB keyboard from an OS that doesn't understand USB; this touches on
the thing Ted mentioned about needing to support old OSes like Windows
95 or whatever. But we're so far beyond compatibility crutches and
into the land of magical black boxes running opaque blobs that do all
sorts of stuff well hidden from the OS and over which, indeed, the OS
has no control, by design. THAT's the problem.

> V7 unix for the PDP-11 shipped with maybe 25 drivers total for the whole system, and many of them were quite niche...
>> Together this might be a usable Unix BIOS that could have worked in the early 80’s. One could also think of it as a simple hypervisor for only one client. The remaining BBL functionality is not all that different from the content in mch.s on 16-bit Unix (e.g. floating point emulation for CPU’s that don’t have it in hardware). A virtio device is not all that different from the interface presented by e.g. PDP-11 ethernet devices (e.g. DELUA), the MMU abstraction was contemporary.
> virtio solves a different problem, though: Its goal is to provide THE interface for mass storage, THE interface for networking, etc., so that hypervisor clients can limit their drivers substantially and not have to deal with the thousands of drivers normally needed.

Generally speaking, the hypervisor won't expose hardware devices
directly to the guest. Even with SR-IOV and the like, the HV
necessarily synthesizes, say, PCI config space as seen by the guest
and tightly controls what virtual functions the guest sees, as
anything else allows the guest to usurp the host. This implies that
the HV is providing the guest with virtualized devices anyway, and
once you're doing that the question becomes: what devices to
parameterize the guest with? The HV could emulate things it is fairly
sure the guest already knows about because they're relatively common:
say, an e1000 for a NIC, or AHCI for storage, a 16550 for a console
UART, etc, but what's the relative cost of doing so? It turns out,
most of these devices aren't super great fits for virtualization; they
generate too many exits. Enter virtio, designed for the use case. That
said, hypervisor bypass for virtual devices is a big deal, and not
something that's easily done with virtio. Even offloading virtio
handling to dedicated cores is hard, because there's no easy way for
the guest to generate a doorbell interrupt in the host (kicking a
virtio queue involves a guest exit, which implies some local
processing on the processor running the VCPU: that may be as simple as
kicking off an IPI to another CPU, but you're still exiting, which is
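
To make the exit-count argument concrete, here is a toy model in Python.
It is not a measurement of any real device: ExitCounter, send_emulated,
send_virtio, regs_per_packet and batch are all hypothetical names, and
the only assumptions are that every trapped register access costs one
guest exit and that a ring-based device needs one exit per kick.

```python
from dataclasses import dataclass


@dataclass
class ExitCounter:
    """Counts guest exits taken by the (imaginary) hypervisor."""
    exits: int = 0

    def trap(self) -> None:
        self.exits += 1


def send_emulated(packets: int, regs_per_packet: int, hv: ExitCounter) -> None:
    """Register-driven emulated NIC: each trapped register access is an exit."""
    for _ in range(packets):
        for _ in range(regs_per_packet):
            hv.trap()


def send_virtio(packets: int, batch: int, hv: ExitCounter) -> None:
    """Ring-based device: descriptors go into shared memory, only kicks exit."""
    queued = 0
    for _ in range(packets):
        queued += 1            # plain memory write into the ring, no exit
        if queued == batch:
            hv.trap()          # kick the queue once per batch
            queued = 0
    if queued:
        hv.trap()              # kick for the final partial batch


if __name__ == "__main__":
    emulated, virtio = ExitCounter(), ExitCounter()
    send_emulated(packets=64, regs_per_packet=3, hv=emulated)
    send_virtio(packets=64, batch=16, hv=virtio)
    print(emulated.exits, virtio.exits)
```

Even in this idealized model the ring-based path never reaches zero
exits, since each kick still traps.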

> ACPI/FDT just try to make the non-self-describing aspects of the hardware described.

If that were all they did, I think a lot of the complaints would melt away.

> Now, I don't disagree with the org chart args for why they are so large outside of linuxboot, but they do fill a vacuum that would otherwise exist.

The data formats and the general concept of providing that data to the
OS in a semi-portable manner, yes. The real problem addressed here is
the decoupling of hardware and systems software at scale; the problem
was simply smaller on the PDP-11 and VAX, but is orders of magnitude
larger now. You need _something_ unless you're in the downright
luxurious position that, say, we're in at Oxide. UEFI+ACPI serve in
that capacity, but it's important to note that that doesn't make them
good. Lots of things that are very useful kind of suck.

        - Dan C.
