Implementing multiple interrupts


Aleksandar Kuktin

Guest
Hi all.

Some time ago, I designed a small and simple CPU to go into a project I
am sort-of working on (when I can steal the time to do so). Now I have
added an interrupt mechanism to it, and I ran into a problem when I tried
to turn the single-interrupt design into a multiple-interrupt one - the
kind all other CPUs have these days.

Here's my gripe: suppose the CPU enters interrupt 1. While it is
interrupted, IRQ2 comes in. And then IRQ3 comes in, followed by IRQ4. WTF
am I supposed to do now? I want all IRQs to be handled. So, no ignoring.

My first idea was to make a stack and push the interrupt contexts onto
the stack as new IRQs come in and pop them as the software interrupt
handlers return. But this is far too elaborate to be implemented in
reasonably little resources.

My second idea was to have every IRQ assert a bit in a special register
before asserting the unified IRQ signal. Then I would mandate that the
software interrupt handler iterates over the register and handles IRQs
that it finds asserted. And when it is done, the interrupt handler would
reiterate over the register to catch any interrupts that came in during
the handling of the previous ones. I currently plan to use a variant of
this method.
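
In rough C (the register name, its address and the write-1-to-clear
behaviour are all invented, purely to illustrate the idea), the handler
for this scheme would look something like:

    /* Hypothetical memory-mapped register holding one pending bit per
       IRQ source.  Writing a 1 to a bit position acknowledges it. */
    #define IRQ_PENDING (*(volatile unsigned int *)0xFFFF0000u)

    void (*irq_handlers[32])(void);      /* one handler per source */

    void irq_top_level(void)
    {
        unsigned int pending;

        /* Rescan so IRQs that arrive while we are busy get handled
           before we return from the interrupt. */
        while ((pending = IRQ_PENDING) != 0) {
            for (unsigned int i = 0; i < 32; i++) {
                if (pending & (1u << i)) {
                    IRQ_PENDING = 1u << i;          /* acknowledge it  */
                    if (irq_handlers[i])
                        irq_handlers[i]();          /* run its handler */
                }
            }
        }
    }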

While writing this I also came up with the option of deferring IRQs until
the previous one's handling returns. Probably with some sort of a stack.
If done correctly, it should require few resources.

What do other CPUs usually do in these situations?
 
On 12/7/2013 7:04 AM, Aleksandar Kuktin wrote:
(snip)

Normally you have an instruction to mask / unmask interrupts. When the
CPU gets interrupted, the source that caused the interrupt gets cleared,
but others may be active. In some CPUs, there is one guaranteed
instruction before another can interrupt the CPU, and you must use
the general "mask interrupts" instruction as the first instruction
of the service routine. Others automatically set the global interrupt
mask when the interrupt occurs, and the service routine only needs
to unmask them at the end of the routine, just before the return. Note
that this means you need to allow one more instruction after the
unmask to make sure the return happens before the next interrupt,
or you can have stack overruns. Another option is to have a
special RTI instruction that enables interrupts and returns.

In addition to this simple global mask, there is usually a mask
per IRQ, which allows you to have prioritized interrupts. An
interrupt service routine in this case will first save the
current state of the per-IRQ mask, then mask all lower level
IRQs and unmask interrupts globally. Thus only higher-level
IRQs can interrupt this service routine. At the end, it must
restore the state of the per-IRQ masks.
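
In C-like pseudocode (the register name and the mask/unmask helpers are
made up for illustration), a service routine for priority level N
following that scheme looks roughly like this:

    /* Hypothetical per-IRQ enable mask; bit i enables source i, and
       lower bit numbers are lower priority. */
    #define IRQ_MASK (*(volatile unsigned int *)0xFFFF0004u)

    /* Stand-ins for the CPU's global mask / unmask instructions. */
    static void cpu_enable_irq(void)  { /* global unmask */ }
    static void cpu_disable_irq(void) { /* global mask   */ }

    void level_n_isr(unsigned int n)
    {
        unsigned int saved = IRQ_MASK;         /* save per-IRQ mask state   */

        IRQ_MASK = saved & ~((1u << n) - 1u);  /* mask all lower levels     */
        cpu_enable_irq();                      /* higher levels may preempt */

        /* ... service the device here ... */

        cpu_disable_irq();                     /* stay blocked until return */
        IRQ_MASK = saved;                      /* restore per-IRQ masks     */
    }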

All this implies a machine with a stack to save current state.
Here again there are differences in implementation. Some
CPUs will automatically push registers onto the stack when
the interrupt occurs, others require the service routine
to save any state that it may change during operation. In
the first case, you definitely need a separate RTI instruction
that unwinds the stack.

--
Gabor
 
In comp.arch.fpga Aleksandar Kuktin <akuktin@gmail.com> wrote:

(snip)

Here's my gripe: suppose the CPU enters interrupt 1. While it is
interrupted, IRQ2 comes in. And then IRQ3 comes in, followed by IRQ4. WTF
am I supposed to do now? I want all IRQs to be handled. So, no ignoring.

First, you want level triggered interrupts, such that the interrupt
line stays active until acknowledged. This was a problem with the ISA
bus, fixed in PCI.

Q-bus, and probably the somewhat similar Unibus, use a daisy-chained
interrupt acknowledge system. You should probably look up the details,
but, more or less, it chains through the backplane slots, such that
ones nearer the CPU have higher priority. When not interrupting,
the board in each slot (or a jumper board if no actual board) passes
the ACK down the line. If a board has requested an interrupt, it doesn't
pass it along, and the CPU will then process that one. When it clears,
the next one will get its chance. Note the requirement for level
triggering, and that the interrupt routine (or hardware) must reset
the interrupt line at the appropriate time.

-- glen
 
On 12/7/2013 6:44 AM, Gabor wrote:
(snip)

Yes. All those will work. Another alternative is to have another,
probably smaller, register set that is used by the first level interrupt
handler. When an interrupt comes in, if it is allowed, further
interrupts are disabled, the system is "switched" to use the special
register set, and control is transferred to the ISR. A minimal amount
of code is executed to do whatever is needed to save the details of the
interrupt to memory, set flags, etc. Then a special instruction is
executed (e.g. something like the RTI mentioned above) that returns to
the instruction that would have been executed had the interrupt not
occurred, further interrupts are enabled and the mode is switched back
to the normal register set. The big advantage of this is faster
interrupt response, since there is no need to save/restore registers and
no need to size a dedicated interrupt stack.
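
A behavioural model of the switch, in C with invented names (real
hardware would of course do this in a single cycle, not in code):

    #include <stdint.h>

    struct regfile { uint32_t r[8]; uint32_t pc; };

    static struct regfile normal_regs, shadow_regs;
    static struct regfile *active = &normal_regs;  /* bank the CPU is using */
    static int irq_enabled = 1;

    /* On an allowed interrupt: disable further interrupts, switch to the
       special register set, and enter the ISR - nothing is pushed. */
    void enter_interrupt(uint32_t isr_entry)
    {
        irq_enabled = 0;
        active = &shadow_regs;
        active->pc = isr_entry;
    }

    /* The special return instruction: switch back and re-enable in one
       step; the interrupted code resumes exactly where it left off. */
    void return_from_interrupt(void)
    {
        active = &normal_regs;      /* normal_regs.pc was never disturbed */
        irq_enabled = 1;
    }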

There are lots of alternatives depending on your specific requirements
and available resources.

The key requirement for all of them is the ability to defer interrupts
for a short time until the information about an interrupt is saved.



--
- Stephen Fuld
(e-mail address disguised to prevent spam)
 
In comp.arch.fpga Tom Gardner <spamjunk@blueyonder.co.uk> wrote:

(snip)

Consider "level triggered" interrupts rather than edge
triggered interrupts...

Yes.

Every peripheral (etc) generates an interrupt which pulls a
single signal high until that interrupt has been cleared
by the relevant interrupt service routine (ISR) and the return
from interrupt (RTI) instruction executed.

Normally low, at least in the TTL days. Many buses are based on TTL
level signals, even if implemented in MOS logic. TTL outputs pull
down much harder than up.

-- glen
 
On 07/12/13 12:04, Aleksandar Kuktin wrote:
(snip)

Consider "level triggered" interrupts rather than edge
triggered interrupts...

Every peripheral (etc) generates an interrupt which pulls a
single signal high until that interrupt has been cleared
by the relevant interrupt service routine (ISR) and the return
from interrupt (RTI) instruction executed.

If a second interrupt arrives before the first one has been
cleared, then it too pulls the same signal high - which has
zero effect. When the ISR has cleared the first interrupt
and the RTI executed, the processor immediately enters
the ISR for the second interrupt.

Hence all the interrupts are dealt with serially without
preemption. If you want preemption then you have to design
a hierarchy of interrupts and be prepared to stack one
context for each level in the hierarchy. Many processors
manage with two levels, typically a maskable interrupt
and a non-maskable interrupt.
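
A behavioural sketch in C (device count and names invented) of why the
second and later requests cost nothing extra:

    #include <stdbool.h>

    #define NUM_DEVICES 8

    /* Each device holds its request active until its ISR clears it. */
    static bool device_pending[NUM_DEVICES];

    /* The shared IRQ line is simply the wired-OR of all the requests. */
    static bool irq_line_asserted(void)
    {
        for (int i = 0; i < NUM_DEVICES; i++)
            if (device_pending[i])
                return true;
        return false;
    }

    /* Crude model of the CPU: after every RTI it samples the line again,
       so queued requests are serviced back to back, without preemption. */
    static void cpu_step(void)
    {
        if (irq_line_asserted()) {
            for (int i = 0; i < NUM_DEVICES; i++) {
                if (device_pending[i]) {
                    device_pending[i] = false;  /* ISR clears the source */
                    break;                      /* then RTI; re-sample   */
                }
            }
        } else {
            /* ... execute one ordinary instruction ... */
        }
    }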

Have a look at early microprocessors, e.g. the Z80 or 6800.
 
On 07/12/13 22:55, glen herrmannsfeldt wrote:
In comp.arch.fpga Tom Gardner <spamjunk@blueyonder.co.uk> wrote:
Every peripheral (etc) generates an interrupt which pulls a
single signal high until that interrupt has been cleared
by the relevant interrupt service routine (ISR) and the return
from interrupt (RTI) instruction executed.

Normally low, at least in the TTL days. Many buses are based on TTL
level signals, even if implemented in MOS logic. TTL outputs pull
down much harder than up.

Yes indeed; I originally wrote "low". Given the modern
tendency for all i/o (and less obviously internal) signals
to be active-high, I suspected the OP wouldn't be familiar
with how to cheaply implement a wired-OR function with
active-low signals. And I didn't see any benefit in explaining
the concept!

Active high i/o signals /still/ feel somewhat unnatural
to me. Must be too old :(
 
On 12/7/2013 3:39 PM, glen herrmannsfeldt wrote:
In comp.arch.fpga Aleksandar Kuktin <akuktin@gmail.com> wrote:

(snip)

First, you want level triggered interrupts, such that the interrupt
line stays active until acknowledged. This was a problem with the ISA
bus, fixed in PCI.

Q-bus, and probably the somewhat similar Unibus, use a daisy-chained
interrupt acknowledge system. You should probably look up the details,
but, more or less, it chains through the backplane slots, such that
ones nearer the CPU have higher priority. When not interrupting,
the board in each slot (or a jumper board if no actual board) passes
the ACK down the line. If a board has requested an interrupt, it doesn't
pass it along, and the CPU will then process that one. When it clears,
the next one will get its chance. Note the requirement for level
triggering, and that the interrupt routine (or hardware) must reset
the interrupt line at the appropriate time.

-- glen

The last time I designed a Unibus board, I needed to get the
bus pinout from an insider at DEC. The documentation was
all written about the interface at the ends of a backplane
where the bus extenders hook to the A and B slots. Plug-
in cards all used the C D E and F slots. This pinout was
a (sort-of) secret. If you look at the documentation
for Unibus, you can't actually figure out how the IRQs
are daisy-chained when there's only one pin assigned for each.
That's of course only on the A B slots. Qbus is a better
documented bus.

IIRC Qbus had four or five of these interrupt chains, one for each
available interrupt priority level. So the CPU still had to
deal with the interrupt priority. Also I believe there were
many more soft interrupt levels on those CPUs. Probably
not very interesting for a small compact CPU as mentioned
in the OP. My original comments were based on my limited
memory of the 8085 and 6800, which I worked with in the early
days of my career, before VHDL, when design was pencil on
vellum....

--
Gabor
 
On Sun, 08 Dec 2013 11:00:35 -0800, Andy (Super) Glew wrote:

---++ Threads

I can't quite work this into my taxonomy, so let me just say: one of the
most common unusual or alternative interrupt schemes is thread based.
Instead of logically switching threads, you start off with a
multithreaded processor, and interrupt delivery conceptually just
unblocks a stalled thread, the thread for the interrupt handler.
Conceptually letting the interrupted thread continue execution.

I *really* like this idea.

---+ Queuing interrupts

[...]

Much less common: deeper queues, less combining. Some event driven
RTOSes simplified by keeping interrupt requests separate, unmerged.
Inevitably must handle possibility of interrupt overflow. But this can
become an error condition, rather than common.

I think I will implement a variant of queuing. Or, as I have begun to
call it, "deferring". After posting the original post yesterday, I gave
the idea some thought and figured out I can make a sufficiently small
mechanism that does this.

Originally I was going to use a stack, but a stack is a FILO construct and
I need a FIFO, so I'll probably use a ring buffer with two pointers (read
and write) pointing into it. So an interrupt request makes the IRQ buffer
unit write into the appropriate place that the IRQ has been received, as
well as which IRQ it was. Actually, it will just write and increment the
write pointer. Logic can figure out that an interrupt is pending because
there is a used memory slot between the read and write pointers. Then, if
the CPU is not in an interrupt, it gets interrupted. If it is in an
interrupt, nothing happens until the interrupt handler returns, when the
CPU immediately gets pushed back into an interrupt.

With a sufficiently big ring buffer, sufficiently fast interrupt handlers
and sufficiently interspersed interrupt requests, this scheme should
handle all of my needs for the conceivable future.
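
In C, the buffer unit I have in mind behaves roughly like this (sizes and
names are provisional):

    #define IRQ_QUEUE_SIZE 16u           /* must be a power of two */

    static unsigned char irq_queue[IRQ_QUEUE_SIZE];
    static unsigned int  rd_ptr, wr_ptr;  /* buffer unit owns wr_ptr,
                                             the handler owns rd_ptr */

    /* Buffer-unit side: called when IRQ number n is requested. */
    void irq_post(unsigned char n)
    {
        if (wr_ptr - rd_ptr < IRQ_QUEUE_SIZE) {  /* else drop / flag error */
            irq_queue[wr_ptr % IRQ_QUEUE_SIZE] = n;
            wr_ptr++;
        }
    }

    /* CPU side: non-zero while there is a used slot between the two
       pointers, i.e. an interrupt is pending. */
    int irq_pending(void)
    {
        return wr_ptr != rd_ptr;
    }

    /* CPU side: on interrupt entry, read which IRQ to service next. */
    unsigned char irq_take(void)
    {
        unsigned char n = irq_queue[rd_ptr % IRQ_QUEUE_SIZE];
        rd_ptr++;
        return n;
    }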

> [...]
 
On Sat, 07 Dec 2013 20:26:26 +0000, Tom Gardner wrote:

On 07/12/13 12:04, Aleksandar Kuktin wrote:
[snip]

Consider "level triggered" interrupts rather than edge triggered
interrupts...

Every peripheral (etc) generates an interrupt which pulls a single
signal high until that interrupt has been cleared by the relevant
interrupt service routine (ISR) and the return from interrupt (RTI)
instruction executed.

If a second interrupt arrives before the first one has been cleared,
then it too pulls the same signal high - which has zero effect. When the
ISR has cleared the first interrupt and the RTI executed, the processor
immediately enters the ISR for the second interrupt.

Interesting concept. I suppose I'll check out the Z80 and the Motorola. I
always had a suspicion I'd end up doing something with the Z80.

Hence all the interrupts are dealt with serially without preemption. If
you want preemption then you have to design a hierarchy of interrupts
and be prepared to stack one context for each level in the hierarchy.
Many processors manage with two levels, typically a maskable interrupt
and a non-maskable interrupt.

Have a look at early microprocessors, e.g. the Z80 or 6800.
 
On 12/7/2013 4:04 AM, Aleksandar Kuktin wrote:
(snip)

Fun stuff, eh? I keep thinking that interrupt architecture is one of
the areas of computer architecture that is most in need of cleanup. I
want to do a writeup, a survey, for my comp-arch.net wiki --
understanding the state of the art is what I usually do before trying to
advance it.

Anyway - what do other CPUs do?

Most CPUs allow interrupt preemption. They save the state of the
preempted thread / interrupt handler somewhere. Where they save the
state varies. I'll summarize that later, but this thread inspires me to
start from the simplest thing.

---+ Interrupt Source

In the beginning there was an interrupt pin... (oooo, this will be a
cool slideset. complementary to my tutorial on memory, which begins
"This is a bit..." with a big dot in the middle of the page, which is
still my favorite.)

Inevitably, people wanted multiple interrupt sources, multiple interrupt
pins. Sometimes they treated them uniformly: INT0, INT1, INT2 ... .
Sometimes they created different mechanisms for different pins: INT,
NMI, SMI, FERR, MCERR ... Fundamentally, even if different mechanisms,
the same issues arise - often it is as if each interrupt source takes a
different combination of the options we are about to list.

But let's deal with interrupt sources. Pins, easy to understand. Level
triggered. Edge triggered => edge detector. Detect on rising edge,
falling edge, both. Glitches - level triggered interrupt pins that rise
and then fall too quickly to actually trigger an interrupt. Similarly
for edge triggered.

Non-pin interrupt sources: ECC on memory. Internal exceptions, not from
outside world: cache or register parity.

Interrupt delivery mechanisms: interrupt controller outside in the
system, that delivers and routes interrupts to various targets. E.g.
different CPUs. Dedicated interrupt fabrics / busses, like the original
Intel MPIC serial bus.

More non-pin: bus messages on the memory bus, like Intel XAPIC. Makes
it easier to reason about ordering of interrupts and memory requests.
Makes "level triggered" less meaningful, edge triggered more natural:
level triggered requires messages to be sent on both high-to-low and
low-to-high transitions. Edge triggered only requires one message.

I still think that interrupt "pins" are the fundamental primitive, and
will be so long as we are working with wires. Even when you have bus
message based interrupts, inevitably the messages are received by logic
at the target, and then assert signals - the good old INT, NMI, etc, -
into the CPU core or cores.

---+ Detecting interrupts

Pre-beginning: polling. Read the pin, possibly compare it to a
previous value. Decide what to do.

Interrupts really begin when software doesn't have to poll. Dedicated
logic does. And dedicated logic then tries to get software to do something.

---+ Interrupt delivery

OK, so what do you do when you detect an interrupt?

Pre-beginning: unconditionally force the PC to specific value.
Throwing away whatever the old PC was.

This is limited, so we start thinking about saving the PC at point of
interrupt somewhere.

0) dedicated save register

Interrupted_PC := PC
PC := interrupt_handler_PC

1) switch PC

current_PC
non_interrupt_PC
interrupt_handler_active_PC,
interrupt_handler_start_PC (constant)

on new interrupt:
interrupt_handler_active_PC := interrupt_handler_start_PC
(may be preloaded)
current_PC = interrupt_handler_active_PC (points to)

on interrupt return
current_PC = non_interrupt_PC

Note: fundamentally delivering an interrupt is an edge triggered action.

I think dedicated "save register" and "switch PC" are isomorphic, dual.
Although the details matter. I will try to say "save/switch" interrupt
state.

Many systems have a mix: some interrupt sources switch, others save.

Oftentimes, switching is suggested as the solution to all of the
problems that saving has. Or vice versa. So we end up with inconsistent
interrupt handling mechanisms. :-(

Either can be scaled to apply to multiple interrupt sources and
save/switch PCs.

Nearly all systems save/switch not just PC, but also some other
registers. Typically some register that can be used in addressing
memory. Often this is called an "interrupt handler stack pointer", but
it doesn't have to be a stack.

Execution mode / privilege (although not applicable to simplest systems
that do not have privilege).

More:

2) Save/switch sets of registers.

3) Save/switch registers to blocks of memory. Basically, to avoid
having to build lots of hardware registers.

E.g. Gould SEL machines had one of the most elegant save/switch
mechanisms I have encountered. Perhaps not so much elegant, but very
simple, instructive. Each of the many interrupt "vectors" had a memory
block. Interrupt delivery consisted of saving (a subset of) current
register state in the memory block, and loading new register state from
a different part of the memory block. Delivery of new interrupts using
that same memory block was prevented, until software had moved the state
somewhere else, out of the memory block, at which point that interrupt
source was unblocked.

By the way: note "save old state / load new state". Basically a state
exchange. CDC "exchange jump". Could be same memory, although IMHO
that complicates state management.

Typically the old state save area is coupled with, related to, the
interrupt vector, and the area from which new state for the interrupt
handler is loaded. But it doesn't have to be this way. In some
systems, you want the saved state of the interrupted thread/process to
be protected from the interrupt handler - i.e. you do not necessarily
assume that the interrupt handler is higher privilege than what got
interrupted. The interrupt handler often does not need to know what
got interrupted. Think - I/O interrupts. But many other uses need
handler to know what got interrupted - think timer interrupts for
profiling, OS context switching. Anyway, what I am trying to say is
that some sort of linkage to what got interrupted needs to be saved, but
not necessarily directly accessible.

Some systems with "Processor Control Blocks" or "Processor State Blocks"
in memory save the interrupt state into that.
But if every interrupt saves into the same PSB, then you cannot nest
until that state has been moved somewhere else - unless the PSB itself
is changed as a result of each interrupt.

Basically, the save area is something chosen from the cross product of
Privilege_Domains X Interrupt_Sources.

(Of course, can always emulate complicated interrupt architectures like
this on top of simpler architectures - like assuming interrupt handler
is more privileged. And then let the privileged SW int handler switch
to a less privileged actual interrupt handler. Similarly, we often say
"interrupt return", as in "return to that which got interrupted, and
resume". In reality, it is "end of interrupt handler", and then a
scheduling mechanism (HW or SW) picks up whatever is most appropriate.
Possibly return to that which got interrupted. But possibly switch.)

4) Save/switch onto stacks

It is quite common to push interrupts onto a stack - possibly THE stack.
Avoids need for so many dedicated save areas. Typically, SW then moves
stuff off the stack, ...

But once you start doing this, you quickly realize that you want
multiple stacks. E.g. separate user / kernel stacks, separate by
privilege. Which typically involves switching or saving registers, to
change the stack pointer used, and then the memory accesses.

Basically, we can see that you have a dedicated stack per privilege
domain (thread/process/user/supervisor). Nested interrupts / higher
priority interrupts that run in the same privilege domain may be able to
save on the same stack, pushing, incrementing stack pointer. Avoids the
need to allocate per interrupt source, i.e. to have the full
cross product of Privilege_Domains X Interrupt_Sources.

GLEW OPINION / BKM:
0) save/switch register state for the simplest systems.
1) If things get more complicated, save/switch to memory
1.1) Old/new state per interrupt source
1.2) Or: save old state to PSB, load new state, including PC and PSB
Leaving TBD the issue of where to link the saved PSB.


Although switching stacks is common, IMHO it is complicated. For a
simple implementation, I suggest one of the above. Save/switch through
registers or memory.

---++ How much state to save?

Lots of brainpower has been spent on minimizing the amount of state to
be loaded / saved / switched.

Obviously, must save/switch PC. Typically privilege. And usually a
register used in memory addressing - stack pointer, or PSB.

You can leave other state in registers to be saved/restored by software,
right?

Well...

a) Yes if int handler privileged... big issues if not

b) Yes if int handler knows about and how to save state.

Problems arise with long lived computer architectures that have legacy
concerns. E.g. an int handler that knows about 32-bit GPRs, but
doesn't know about 64-bit GPRs. Or an int handler that knows about 128-bit
XMM registers, but not about 256-bit YMM or 512-bit ZMM registers. Where
accessing a 128-bit register zeroes out the upper bits of the 512-bit registers.

This assumes that the OS software in an interrupt handler knows about the
state. But oftentimes the reason we are doing virtual machines is to
keep legacy OSes running.

These concerns are often coupled, related, to overlapped versus
non-overlapped, partial registers, etc. I.e. if you want the simpler
for hardware approach of zero or sign extending small register writes to
full register width, to avoid partial register issues, you may require
that your privileged resource architecture save more registers,
either in hardware or software.

Or, you can simplify interrupts by supporting partial register writes.

Or... various flavors of lazy save/restore (after an interrupt, trap if
tricky registers are accessed. But then the trap handler just needs to
know what to do about the state. There have been examples where lazy
save/restore was not kept consistent with OS policies, producing
security holes.)

Or... state saving instructions like Intel XSAVE.

---++ Threads

I can't quite work this into my taxonomy, so let me just say: one of the
most common unusual or alternative interrupt schemes is thread based.
Instead of logically switching threads, you start off with a
multithreaded processor, and interrupt delivery conceptually just
unblocks a stalled thread, the thread for the interrupt handler.
Conceptually letting the interrupted thread continue execution.

Elegant...

If the threads are loaded into registers, scaling issues. But a nice
approach for a dedicated system.

If the thread state is in memory - but then hardware is doing CPU
scheduling function. I like it, but that's a whole other topic.

Issue: what about profiling interrupts, where the interrupt handler
wants to inspect the thread that was interrupted.

Dedicating CPUs to interrupts - same thing. Same elegance, same
scalability.

Up to the point where hardware is free, we will always have scaling
issues. The issues of how to switch state will always be there - but
techniques like saving/switching register sets, or activating interrupt
handler threads or CPUs, put the issue off. Possibly off far enough
that it can be deferred to software, and hardware doesn't have to worry.

But there will always be the need for an interrupt handler thread / CPU
to preempt or access state another thread/CPU. That will always need a
hardware mechanism - inter-thread or inter-processor interrupts. It
just can be reduced in frequency.

And, I think on general purpose CPUs, we will probably need to multiplex
multiple threads/processes for 10-20 years more. I suspect forever.

I.e. dedicated interrupt handling threads/CPUs are something for the
toolbox. But are not general purpose. Not universal.

---+ Blocking or Masking interrupts

Must always be able to block interrupts, even with a single pin.
Basically a bit that says "interrupt handler in progress, can't
save/switch state". Inevitably extended to also be used to block
interrupts when an interrupt handler is not in progress.

Blocking bit.

Multiple: blocking bitmask. Or, level - accepting ints of higher priority.

GLEW: I like bitmasks. One bit per save area.
Levels are more compact. Allows you to have many more save areas
in memory, without excessive bitmask storage.
Levels can tie you up in knots - because actually interrupt
handlers change their priority during their lifetime.
Bitmasks, though, add to state that must be saved / restored.

Basically, mixing up, entwining, two concepts:
a) interrupts that must be blocked because the save area for them is
in use. (Statically save/switched, regs or memory; not so much
dynamically allocated, stacks.)
b) interrupts that must be blocked because of simplistic mutual
exclusion policies.

I suppose could have both.

Levels are always simplistic for mutual exclusion. Level/bitmask?

---++ What about NMIs?

What about NMIs? Non-Maskable Interrupts.

GLEW OBSERVATION:
a) first you add ordinary interrupts
b) then you add NMIs. NMIs are supposed to never return.
c) but then you want to have NMIs that return...
e.g. profiling all code.
e.g. hypervisor wants to context switch an OS that uses NMIs
NOTE: must then guarantee that NMI state doesn't collide with
ordinary ints

d) ... eventually, it seems that most mature architectures in fact have
a mask bit for non-maskable interrupts. A mask bit separate from the
mask bit for ordinary interrupts.

---++ Masking interrupts at start of handler

Simplicity can supposedly be obtained by masking interrupts at the start
of an interrupt handler - to give the OS a chance to save state, before
allowing others.

Can be automatic - 1 or N-instruction window.

Or int handler may be trusted.

But... when you start virtualizing, or when you have NMIs that you want
to guarantee do not get masked.

I.e. automatic masking is an optimization. Does not work in general, but
in specific cases.


---+ Queuing interrupts

With a single pin, level triggered, there isn't really much of a queue.

1 bit, interrupt pending.
Another bit, interrupt active.

Issues wrt clearing the bits when returning. Nearly always a special
interrupt return instruction. (x86 special "deferred" handling of the
instruction after CLI/STI.)

With multiple interrupt sources, a bitmask.

Multiple arrivals sharing same bitmask -
0) don't do that (simplistic)
1) saturate - OR into existing.

Structure often seen: 1-deep saturating queue (per bitmask position).

1 bitmask, but per int group, for "this interrupt is dispatched, handler
has been started" (although it may have been preempted).

Second bitmask for pending interrupt that has not yet been dispatched.

Less often: 2-deep saturating queue. x86 APIC? (Why... circuits.)

Much less common: deeper queues, less combining. Some event driven
RTOSes simplified by keeping interrupt requests separate, unmerged.
Inevitably must handle possibility of interrupt overflow. But this can
become an error condition, rather than common.
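
Concretely, in C (names invented, widths arbitrary), the common
two-bitmask structure:

    #include <stdint.h>

    static uint32_t irq_pending;     /* request seen, handler not started  */
    static uint32_t irq_in_service;  /* handler started, may be preempted  */

    /* New arrival: OR into pending.  A second arrival just "saturates". */
    void irq_arrive(unsigned src)
    {
        irq_pending |= 1u << src;
    }

    /* Dispatch the highest-numbered pending source not already in
       service; returns -1 if there is nothing to do. */
    int irq_dispatch(void)
    {
        uint32_t ready = irq_pending & ~irq_in_service;

        for (int src = 31; src >= 0; src--) {
            if (ready & (1u << src)) {
                irq_pending    &= ~(1u << src);
                irq_in_service |=  (1u << src);
                return src;
            }
        }
        return -1;
    }

    /* End of handler: the special interrupt return clears in-service. */
    void irq_eoi(unsigned src)
    {
        irq_in_service &= ~(1u << src);
    }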

Issue:

Conceivable, but I am not aware of it: a counter per interrupt group. I
think it has the worst properties: you still need to poll if int
sources are multiplexed.


---+ Conclusion

Lots of combinations of these mechanisms.

I think I am stopping because of exhaustion of myself, not of the
possibilities seen, that I want to document.

--
The content of this message is my personal opinion only. Although I am
an employee (currently Imagination Technologies, which acquired MIPS; in
the past of companies such as Intellectual Ventures/QIPS, Intel, AMD,
Motorola, and Gould), I reveal this only so that the reader may account
for any possible bias I may have towards my employers' products. The
statements I make here in no way represent my employers' positions on
the issue, nor am I authorized to speak on behalf of my employers, past
or present.
 
On 12/8/2013 11:00 AM, Andy (Super) Glew wrote:
<snip>


Fun stuff, eh? I keep thinking that interrupt architecture is one of
the areas of computer architecture that is most in need of cleanup.

<snip of Andy being amazing :) >

Lots of combinations of these mechanisms.

I think I am stopping because of exhaustion of myself, not of the
possibilities seen, that I want to document.

What a mess. We decided early on that we didn't want to deal with it.

A distinction that was useful to us: a fault, a trap, and an interrupt
are not the same thing.

* A fault is internally generated by the running program, and has
termination semantics. Example: illegalInstruction

* A trap is internally generated, not necessarily associated with a
particular program or execution, and has resumption semantics. Example:
not-present-page trap.

* An interrupt is externally generated and has resumption semantics.
Example: i/o complete interrupt.

The difference between interrupt and trap boils down to whether the
event can be precluded by having the CPU avoid doing anything that would
raise it (a trap), or whether the event is triggered by something that
is outside CPU control (interrupt).

For all three, handler invocation is an involuntary function call, and
the handler is a function like any other. All three have dispatch
vectors that are simply arrays of function pointers indexed by event
kind; the vectors are in memory and the three vector roots are in
hardware registers. Hardware detecting an event selects the appropriate
root, indexes the vector with the event kind, and calls the function.
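
In C terms (purely illustrative - not any machine's actual layout or
sizes), the dispatch amounts to:

    /* A handler is an ordinary function taking the event kind. */
    typedef void (*handler_fn)(unsigned kind);

    /* The three vectors are plain arrays of function pointers in memory;
       the three roots would live in hardware registers. */
    static handler_fn fault_vector[64];
    static handler_fn trap_vector[64];
    static handler_fn interrupt_vector[64];

    enum event_class { FAULT, TRAP, INTERRUPT };

    /* What hardware does on detecting an event: select the root,
       index by kind, and make an involuntary call to the handler. */
    void deliver_event(enum event_class cls, unsigned kind)
    {
        handler_fn *root =
            (cls == FAULT) ? fault_vector :
            (cls == TRAP)  ? trap_vector  : interrupt_vector;

        if (root[kind])
            root[kind](kind);
    }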

The expected use is that the fault vector and its root are in the
executing turf and hence under the control of the faulting program (or
more likely its run-time system). Trap and interrupt vectors are not
expected to be in the application turf but instead in a turf used by the
OS for event management. Trap and interrupt handlers are expected to be
distant functions (the invocation switches turf); fault handlers may or
may not be.

Cascaded faults (including an attempt to return from a fault handler)
cause a trap. The trap handler clears up the rubble.
 
On Sunday, December 8, 2013 6:23:26 PM UTC-2, Aleksandar Kuktin wrote:
On Sun, 08 Dec 2013 11:00:35 -0800, Andy (Super) Glew wrote:
[snip]
---++ Threads
I can't quite work this into my taxonomy, so let me just say: one of the
most common unusual or alternative interrupt schemes is thread based.
Instead of logically switching threads, you start off with a
multithreaded processor, and interrupt delivery conceptually just
unblocks a stalled thread, the thread for the interrupt handler.
Conceptually letting the interrupted thread continue execution.

I *really* like this idea.

One advantage of using co-routines instead of subroutines for handling interrupts is that the suspended PC encodes some state, whereas the subroutine (traditional) interrupt handler always starts at the exact same address and so has to spend some time figuring out the current system state.

The TX-2 machine (MIT Lincoln Laboratory) was an early co-routine based design, and the Xerox PARC machines starting with the Alto used this extensively. I have used this in some of my designs (using the distributed RAM instead of the flip-flops in FPGA slices for registers allows you to have, for example, 16 PCs in the same area where you would normally have just one) and it can work very well.

At the software level, the Apertos (later Aperios) operating system from Sony Labs implemented its drivers as co-routines instead of subroutines.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.55.5315

-- Jecel
 
