9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] 386 interrupts
@ 2011-03-21  2:17 erik quanstrom
From: erik quanstrom @ 2011-03-21  2:17 UTC


i've been taking a careful look at interrupts recently
because i had a number of machines that just didn't
work.  that problem turned out not to be too hard,
and was easily fixed.  it turns out that one needs to
do a dance for local apic ids > 7.  (this code is already
in 9atom.)

along the way, i found that it was quite easy to turn
on msi interrupts, but that has not been as problem-free.
in case you're wondering why that would be interesting,
there are three problems with i/o apics: they're slow,
chained and error-prone[1].

while most devices/machines just work, there are a few that
just make the machine go nuts.  by go nuts, i mean
panic after taking a number of machine check interrupts.
wiring the interrupt to 1 ap fixes the problem.  and by
"fixes," i'm pretty sure i mean papers over.

here's what i think may be happening.  in trap(),
valid vectors are processed and then the eoi is processed.
after the eoi, a new trap may be taken before we exit
trap().  normally this isn't a problem since we're splhi().

unfortunately there are 3 places where we spllo()
from trap().

- fault386 does this to handle page faults in line
(this might be a reason that swapping on x86 is not
reliable)

- if the vector is the clock handler and up->delaysched
is set

- notify to acquire the debug qlock().

so if you're keeping score at home, i think the reason
the machine "goes nuts" is that trap never returns.
it does an spllo() and eventually calls sched().  i think this
comment in the notify/kexit case may be telling

		/* moderately dangerous; can stack arbitrarily */
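
to make the stacking argument concrete, here is a toy model in
plain c.  it is not plan 9 code: Sim, simtrap and simulate are
made-up names, and the only assumption is the one described above,
namely that once a handler path has done spllo() after the eoi,
any pending interrupt is delivered on top of the current trap() frame.

```c
/*
 * toy model of nested traps.  not kernel code; all names are
 * hypothetical.  depth counts trap() frames on the stack.
 */
typedef struct Sim Sim;
struct Sim {
	int pending;	/* interrupts not yet delivered */
	int depth;	/* live trap() frames */
	int maxdepth;	/* deepest nesting seen */
	int lowered;	/* nonzero: handler path does spllo() after eoi */
};

static void
simtrap(Sim *s)
{
	s->depth++;
	if(s->depth > s->maxdepth)
		s->maxdepth = s->depth;
	/* the handler runs and the eoi is sent at this point */
	if(s->lowered){
		/* spllo(): pending interrupts land on top of this frame */
		while(s->pending > 0){
			s->pending--;
			simtrap(s);
		}
	}
	s->depth--;
}

/* deliver n back-to-back interrupts; return the deepest nesting */
int
simulate(int n, int lowered)
{
	Sim s = { n, 0, 0, lowered };

	while(s.pending > 0){
		s.pending--;
		simtrap(&s);
	}
	return s.maxdepth;
}
```

in this model simulate(5, 0) never nests deeper than 1 frame, while
simulate(5, 1) reaches 5: the depth is bounded only by how fast
interrupts arrive, which is the "can stack arbitrarily" case.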

clearly, if you believe intel, the reason this is much more likely
with msi interrupts is that they're faster, and don't require an
eoi cycle to the i/o apic to rearm the interrupt.

so is this analysis mistaken?  and if not, what to do about it?
wake a cleaner process that can run in non-trap context?
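
one possible shape for the cleaner idea, as a sketch only.  the
assumptions: one producer (the interrupt handler) and one consumer
(the cleaner), no locking or memory ordering shown, and all names
(Workq, defer, drain, bump) are made up.  in the kernel the cleaner
would presumably be a kproc() that sleep()s on a Rendez and is
woken by wakeup() from the handler; here that is left as a comment
so the block stays plain c.

```c
enum { Nwork = 8 };	/* ring size; arbitrary, a power of 2 */

typedef struct Work Work;
struct Work {
	void (*f)(void*);
	void *arg;
};

typedef struct Workq Workq;
struct Workq {
	Work q[Nwork];
	unsigned int ri;	/* next slot the cleaner reads */
	unsigned int wi;	/* next slot the handler writes */
};

/*
 * called at interrupt level: never blocks and never lowers
 * priority.  returns -1 if the ring is full, 0 otherwise.
 */
int
defer(Workq *w, void (*f)(void*), void *arg)
{
	if(w->wi - w->ri == Nwork)
		return -1;
	w->q[w->wi % Nwork].f = f;
	w->q[w->wi % Nwork].arg = arg;
	w->wi++;
	/* a kernel version would wakeup() the cleaner here */
	return 0;
}

/*
 * body of the cleaner: runs queued work outside trap context.
 * returns the number of items run.
 */
int
drain(Workq *w)
{
	int n;
	Work t;

	n = 0;
	while(w->ri != w->wi){
		t = w->q[w->ri % Nwork];	/* copy before freeing the slot */
		w->ri++;
		t.f(t.arg);
		n++;
	}
	return n;
}

/* example work item: count how many times it ran */
void
bump(void *arg)
{
	++*(int*)arg;
}
```

the point of the split is that the part which may block or
reschedule runs from drain(), on the cleaner's own stack, so trap()
can return without ever doing spllo().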

- erik

[1]
- you need to get a correct interrupt mapping for the
i/o apic to set up routing.  this is one extra step to go wrong.
- many bus interrupts map to the same output vector.  worse,
the bus irq table can't tell you which one belongs to your device.
- intel claims a 3µs advantage on atom machines
(ftp://download.intel.com/design/intarch/PAPERS/321070.pdf)
due to cutting out the i/o apic from the interrupt path, and
eliminating the eoi message to the i/o apic.
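
for comparison, a sketch of what an msi message looks like on x86:
the device writes the data word to the address itself, so no i/o apic
routing entry is involved.  the bit layout follows the intel sdm
(fixed delivery, edge triggered, physical destination); the function
names msiaddr and msidata are made up and this is not the 9atom code.

```c
#include <stdint.h>

#define MSIBASE	0xFEE00000u	/* bits 31-20 of every msi address */

/* message address: destination local apic id goes in bits 19-12 */
uint32_t
msiaddr(uint8_t apicid)
{
	return MSIBASE | (uint32_t)apicid<<12;
}

/*
 * message data: vector in bits 7-0.  delivery mode (bits 10-8)
 * and trigger mode (bit 15) are left 0: fixed, edge.
 */
uint16_t
msidata(uint8_t vector)
{
	return vector;
}
```

these two values are what would be written into the msi capability
of the device's pci config space before setting the enable bit.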


