CS222 Lecture: Input-output devices and interfacing Last revised 3/26/98
Transparencies: IBM 370 block diagram,
Priority Encoder, Z80 DMA, Software polled interrupt network
Outline:
1. Device characteristics and control requirements
2. Connecting the CPU to IO devices - options and examples
3. Control options: programmed, interrupt-driven, DMA
I. IO Devices (T's)
- -- ------- -----
A. Practically anything can be connected to a computer system as an IO
device: not only conventional peripherals but such things as automobile
carburetors and nuclear power plants. However, we want to look at some
of the more common kinds of device: terminals, printers, tape units,
disks, and real-time devices.
B. TERMINALS
1. Terminals function both as input devices and as output devices. When a
terminal is used for input, though, the host often has to be responsible
for "echoing" the typed characters back to the device. Thus, each
input operation may actually involve both input and output
communication with the device.
2. Input characteristics:
a. Speed: The communication link may allow speeds of up to 6000
bytes/second.
i. However, when a human is typing, the speed tends to be quite low.
(Observe that a 60wpm typist is typing about 5 characters per
second, with occasional pauses.)
ii. It is common to connect PC's to a host system as terminals. The
maximum data rate can be achieved when the PC is downloading
data it has stored internally to the host.
b. Input tends to come in bursts, with pauses between.
c. Input can require significant per-character activity by the
host - including:
i. Checking for line terminators.
ii. Processing control characters such as DEL, break etc.
iii. Echoing the data back for display. This may involve sending
back more than one character for one character received - e.g.
CRLF for CR; BS space BS for DEL on a video terminal etc.
d. Often, there is a provision for synchronization protocols. One
such protocol, XON/XOFF, uses two special characters ^Q (XON) and
^S (XOFF).
i. If the terminal sends XOFF to the host, then the host will not
send any more data to the terminal until it sends XON. This
allows terminals to keep up with bursts of data from the CPU -
especially important for hardcopy devices or PC's that have
to pause to write data to a local disk from time to time.
ii. In like manner, the terminal may also respond to XON/XOFF from
the CPU. This is especially necessary for block mode terminals
or PC's.
e. There are three options for handling this per-character processing
i. The CPU can do it.
ii. A special terminal interface board with its own microprocessor
can handle it. Such boards typically handle 8-32 lines.
iii. A dedicated "front end" minicomputer can handle it for all
terminals on the system. This is common on larger mainframes.
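The per-character work listed under item c can be sketched as follows.
This is only an illustrative model, not any particular system's terminal
driver: the function name process_input and the choice to return the
completed lines together with the echo stream are assumptions made for the
example. It handles a line terminator (CR echoed as CRLF) and DEL on a video
terminal (echoed as BS space BS).

```python
# Hypothetical sketch of host-side per-character terminal input handling.
CR, LF, BS, DEL, SPACE = "\r", "\n", "\b", "\x7f", " "

def process_input(chars):
    """Return (completed_lines, echo_stream) for a stream of typed characters."""
    lines, current, echo = [], [], []
    for ch in chars:
        if ch == CR:                       # line terminator ends the current line
            lines.append("".join(current))
            current = []
            echo.append(CR + LF)           # one character in, two echoed back
        elif ch == DEL:                    # rub out the previous character, if any
            if current:
                current.pop()
                echo.append(BS + SPACE + BS)
        else:                              # ordinary character: store and echo it
            current.append(ch)
            echo.append(ch)
    return lines, "".join(echo)
```

Whichever of the three options in item e is chosen (CPU, interface board, or
front end), it has to do work of roughly this kind for every character typed.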
3. Output characteristics
a. Terminals are generally line-oriented - i.e. the host normally
sends the characters making up a single line in one burst, often
with pauses between lines. (But not always)
b. Some per-character processing may be required for output, too -
e.g. Conversion of control characters like Form Feed, Tab to
line feeds, spaces.
c. Note that XON/XOFF's sent to the host as input by the terminal
actually control output from the host to the terminal and vice
versa.
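A minimal sketch of the host side of XON/XOFF flow control follows. The
class name HostOutput and its methods are hypothetical; the only facts taken
from the discussion above are that ^S (XOFF) stops host output and ^Q (XON)
resumes it. The character codes 13 hex and 11 hex are the standard ASCII
values of ^S and ^Q.

```python
XON, XOFF = "\x11", "\x13"    # ^Q and ^S

class HostOutput:
    """Host-to-terminal output queue that obeys XON/XOFF from the terminal."""

    def __init__(self):
        self.paused = False
        self.pending = []

    def receive_from_terminal(self, ch):
        # Flow-control characters arrive as input but control output.
        if ch == XOFF:
            self.paused = True
        elif ch == XON:
            self.paused = False

    def queue(self, text):
        self.pending.extend(text)

    def send_ready(self):
        """Return the characters transmitted now; nothing while paused."""
        if self.paused:
            return ""
        out = "".join(self.pending)
        self.pending.clear()
        return out
```

Data queued while the terminal has sent XOFF is held, not discarded, and goes
out as soon as XON arrives.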
C. PRINTERS are output only devices and are available with a broad spectrum
of capabilities, from relatively slow dot-matrix printers to fast laser
and page printers. The former may print under 100 cps; the latter 10's
of 1000's of cps.
D. GRAPHICS DEVICES, both video and hardcopy, have many of the
characteristics of printers. (In fact, many printers can also
do graphics.) Video graphics devices impose special requirements on the
system, because the display must be continually refreshed. This is
generally accomplished by including a local memory in the device to store
a "bit map" of the screen as displayed. Sometimes, this local memory is
seen as part of the regular system memory space by the host CPU.
E. POINTING DEVICES (mice, graphics tablets etc.) are input devices often
used in conjunction with video graphics devices.
1. They typically send "messages" to the CPU to report "events" such as
mouse movement (rotation of the ball), button clicking, or the touching
of the tablet.
2. A distinctive characteristic of these devices is that their input is
often unsolicited. The CPU must be able to receive and respond to a
message from them at any time.
F. MAGNETIC TAPE has already been discussed under memory. Tape devices have
data rates of 100,000 to 1 million cps, and are used for both input and
output.
G. DISK has also been discussed already. Data rates may exceed 1 million
cps. Disks are also used for both input and output.
H. REAL TIME DEVICES
1. This broad category includes various sensors (input devices) and
effectors (output devices) that might be found in process control
environments. The distinctive characteristic of these
devices is that the outside world is directly affected by the
computation in progress as it occurs.
2. The requirements vary widely, but range from very low volumes of data
(e.g. a simple A/D converter) to very high volumes (e.g. an image
digitizer.) Often, fast CPU response to certain kinds of external
events detected by a sensor is expected (e.g. to an
over-temperature or over-pressure alarm in process control.)
I. NETWORK INTERFACES
1. A computer that is part of a network will contain a network
interface that connects to the computer's system bus on one side and
to the network on the other side. (For illustration, we will assume
that the network is an ethernet.) The interface must be capable of
two basic operations, plus additional functions:
a. Transferring a stream of data from the computer to the network -
which may involve waiting for the network to be idle before
initiating a transmission. (Ethernet is a shared bus).
b. Monitoring the data coming over the network, and receiving any
data that is addressed to this particular computer (or is
broadcast to all computers on the network), while ignoring all
other data.
c. Data is transmitted in units called packets, which have headers
that contain network control information, followed by the actual
data; frequently, the interface is made responsible for generating
and interpreting these headers.
d. For these reasons, a network interface will often be built around
a dedicated microprocessor on the interface card itself.
2. The data rate that must be supported depends, of course, on the
type of network; for ethernet it can be either 10 MBits/sec (about
1M Byte/sec) or 100 Mbits/second; other networks can reach gigabit
data rates.
J. THE SYSTEM CLOCK
1. Most systems include a real-time clock that keeps track of the current
time of day. Often, this clock will be set up to interrupt the CPU
at regular or programmable intervals (often every 1/60 second as
tied to the power-line frequency). At every tick, system software
updates the internal time.
2. These are IO devices only in the sense that they interface to the
CPU like IO devices.
K. MISCELLANEOUS DEVICES
In addition to the devices we have listed here, there are, of course, many
others that could be named - scanners, video input, etc.
The principles we develop for the devices we did list will work with
these, too.
L. Summary of Requirements Imposed on the System
Device           I/O   Transfer rate (bytes/sec)   Special CPU involvement
------           ---   -------------------------   -----------------------
Terminal/PC      I,O   up to 6000                  Considerable per-character
                                                   processing
Printer          O     100 - 100,000+
Graphics device  O     Varies widely
Pointing device  I     Generally very low          Respond at any time to
                                                   message from device
Tape             I,O   up to 1 million or so       Blocking/deblocking data
Disk             I,O   up to 2 million or so       Blocking data; scheduling of
                                                   requests to minimize head
                                                   movement
Real Time        I,O   varies                      varies - often needs very fast
                                                   response to certain events
Network          I,O   up to gigabit (bit rate)    Considerable processing
                                                   generally done by dedicated CPU
                                                   that is part of the interface
Clock            I     none                        Update internal time each "tick"
II. Connecting the CPU to IO Devices - Options and Examples
-- ---------- --- --- -- -- ------- - ------- --- --------
A. In discussing IO systems, one will often hear reference to an "IO port".
A port is simply a mechanism for the transmission of information
to/from a particular device. Implementing a port requires the building
of an interface to the system bus, plus an interface to the device.
1. The interface to the device may take one of three forms:
a. It may be a serial line (one wire for each direction of data flow).
This is commonly used for low data-rate devices (terminals, some
printers and graphics devices, pointing devices) - especially when
they are located at a considerable physical distance from the CPU.
b. It may be a parallel cable (one wire per bit of data to be
transferred at one time, plus control lines). This is required for
high data-rate devices. Because these busses are built for speed,
their physical length must be kept short, requiring the devices
they serve to be physically close to the CPU.
c. An extension of the idea of a parallel interface to the device is
to use a bus system in its own right to service multiple devices
from one controller.
i. In large computer systems, controllers for devices like disks
are often set up this way, to allow one controller to service
multiple disks.
ii. On smaller systems (PC's and workstations), there are a number
of peripheral bus standards in common use - e.g. SCSI (Small
Computer System Interface) - pronounced "Scuzzy"; IDE.
The computer system itself contains a single interface from its
internal bus to the external, and then individual devices in
turn are interfaced to it. (Of course, now each device must
have its own internal interface between the external bus and the
device proper!)
d. The MPF-I uses both serial and parallel IO.
i. Communication with the keyboard and display is done in parallel.
ii. But data is stored on cassette tape as serial data.
e. We will begin by focussing on parallel IO. Serial IO always
requires an interface that connects to the system bus as a parallel
device and to the external device as a serial device, so we need
to understand parallel operation first.
B. In designing an IO system, there are a number of key design choices
to be made. Some of these choices are determined by the architecture
of the CPU, while others are made by the system designer.
1. Separate IO processor versus CPU responsible for IO.
a. On very large mainframe computer systems, it is common to have
one or more separate programmable processors responsible for
handling IO, with the main CPU doing computation only. This
allows each kind of processor to be customized for the particular
task it has to perform.
Example: IBM 370 - TRANSPARENCY
This approach has found its way into smaller computers as well;
for example, some microcomputer families offer the system designer
the option of either having the microprocessor do IO itself or
adding a special IO processor to the system configuration. (But
this is always an option, never a necessity as with the 370.)
b. Note that, regardless of which approach is taken, the following
discussion is still relevant either to the CPU itself or to
the IO processor/coprocessor - though we will assume that the
CPU handles IO for simplicity.
2. Several of the bus system design issues we discussed in an earlier
lecture come into play here.
a. Separate IO and memory busses versus a single physical bus.
b. Separate IO address space versus Memory-mapped IO.
i. Primarily an issue on systems with a single physical bus -
though memory mapped IO COULD be used on any system by ignoring
the separate IO bus if it is present and hooking everything to
the memory bus.
ii. Mandatory if the CPU has a single physical bus which is NOT
divided into two logical busses, and no special I/O instructions.
Example: PDP-11, VAX, Motorola 680x0 series processors
c. Synchronous versus asynchronous bus protocols.
d. Linear select versus partially decoded versus fully-decoded
addressing.
i. In any interfacing scheme, an IO port must be able to recognize
and respond to its unique address (whether that be an IO port
number or an address within the memory map.) This requires
decoding logic.
ii. In a fully-decoded addressing scheme, all of the relevant bits of
the address bus are considered by the decoder. This means that
the device will respond to the correct address, and no others.
However, this requires considerable decoding logic (e.g. 8
address bits + 3 control lines for a port on the Z80).
Therefore, on smaller systems one of two shortcuts is often used
to reduce decoding logic requirements.
iii. Partial decoding: only some of the relevant address bits are
actually decoded; the rest are ignored. This means that the
port will actually respond to several different addresses,
though only one is its "official address".
Example: the address of 8255 #1 port A on the MPF-I is 80.
However, since the decoding logic for IO ports ignores A3 and
A2, this chip will actually respond to port numbers 84, 88, and
8C as well.
iv. If the number of ports is small, it may be possible to assign
port addresses so that each port has one bit in its address that
is unique to it. This address bus bit can then be used directly
as a chip select, with no decoding.
Example: A Z80-based system with 8 ports might assign (binary)
addresses to the ports as follows:
1111 1110
1111 1101
1111 1011
1111 0111
1110 1111
1101 1111
1011 1111
0111 1111
(Note that each address includes exactly one 0.)
Now suppose each device has two separate low-true chip select
inputs, as is often the case. One of these can be derived by
                           ____      __
NAND-ing the complement of IORQ with M1. This will be common
                                         ____            __
to all chips: no chip can respond unless IORQ is low and M1 is
high. Individual selects for each port can now be derived as
follows. For the first:
__
CS = A0
for the second:
__
CS = A1
etc.
(Of course, if each chip includes some address decoding logic on
it, as the 8255's do, we might only include (say) 6 of the 8
address bus bits in this linear select scheme.)
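The two shortcuts above can be checked with a short sketch. The function
names and the 8-bit port address width are assumptions made for the
example. The first function lists every address a partially decoded port
answers to; the second lists which ports a given address selects under the
one-zero-bit linear select assignment shown above.

```python
def partial_decode_aliases(official, ignored_bits, width=8):
    """All addresses a port responds to when ignored_bits are not decoded."""
    aliases = set()
    for combo in range(1 << len(ignored_bits)):
        addr = official
        for i, bit in enumerate(ignored_bits):
            addr &= ~(1 << bit)              # clear the ignored bit ...
            if (combo >> i) & 1:
                addr |= 1 << bit             # ... then try it as a 1 as well
        aliases.add(addr & ((1 << width) - 1))
    return sorted(aliases)

def linear_select(address, width=8):
    """Ports selected by an address: port n is selected when bit n is 0."""
    return [n for n in range(width) if not (address >> n) & 1]

# The 8255 example: official port 80 hex, A3 and A2 ignored.
print([hex(a) for a in partial_decode_aliases(0x80, [2, 3])])
# The first linear-select port address from the list above.
print(linear_select(0b11111110))
```

Note that under linear select an address containing more than one 0 (for
example 1111 1100) selects two ports at once, so software must use only the
assigned addresses.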
C. Basic requirements for implementing an IO port
1. Regardless of what schemes we use for port addressing and selection,
bus protocol etc. each port must meet certain requirements:
a. It will have an interface to the system data bus.
i. If it is an input port (data from it can be read by the CPU), it
must have one tri-state bus driver (buffer) per bit, which will
be enabled only when the CPU is reading from the port.
ii. If it is an output port (data can be sent to it from the CPU),
it must have one latch per bit to hold the data the CPU sends,
since the data will only be on the data bus for a brief time.
This latch must be enabled to capture data only when the CPU is
writing to the port.
b. It will have to include the necessary logic to respond to system
bus control signals and (if necessary) to generate handshake
signals or wait state requests.
c. It will have an interface to the device it serves.
2. Often, additional facilities are included in a port
a. An input port may include a latch to hold the data from the device
until the CPU reads it.
b. The port may have handshaking signals with the device it serves to
control transfer between the port latches and the device.
c. The port may have one or more status bits that the CPU can read to
determine if the port is ready for a data transfer. (These status
bits will actually be part of a separate status port that the CPU
can examine.)
Example: input port for a keyboard. When a key is struck, a byte is
transferred to a latch in the port. The "data available"
status bit will be set and will remain set until the data is
read by the CPU. (This bit will be part of a separate port.)
d. The port may be able to generate an interrupt to inform the CPU that
the device has just completed a data transfer to/from the latches in
the port, so that it would now be appropriate for the CPU to do a
transfer to/from the port.
Example: In the above, the setting of "data available" may trigger
an interrupt.
III. Control Options
A. We have seen how parallel devices can be interfaced to a computer
system bus by the use of appropriate devices. We now turn to a
consideration of mechanisms for transmitting data to/from the device
at an appropriate rate.
B. IO devices vary widely in the rates at which they can handle data.
1. At one end of the spectrum, the rate at which a keyboard produces
data is limited by the typing speed of a human typist, which is rarely
more than 5 characters per second, and can be as slow as one character
every few seconds (or less.) Pointing devices (e.g. mice) have
similar characteristics.
2. At the other end of the spectrum, devices such as standard magnetic
tape (not cassettes) or disks can transfer data at rates in excess of
1 million bytes/second. Fast network interfaces can transfer
10's or even 100's of millions of bytes/second.
3. Compare these numbers with CPU speeds, where the clock rate may be
on the order of 100s of MHz. In some cases, the CPU could execute
1000's of instructions in the time the IO device takes to process
one byte of data; in other cases, the CPU cannot even execute one
instruction in this time.
C. For any device then, we need some way of coordinating the device speed
and the CPU speed. In most cases, the CPU will have to wait for the
slower IO device to perform its work; but in other cases the reverse may
be true.
D. For illustration, we will use a typical moderate-speed printer, which
may print at a rate of 100 CPS (characters per second) or so, or roughly
1 millionth of the clock rate of a typical CPU. (We will assume
for now that the printer is connected as a parallel device, though
a serial connection may also be used. We will also ignore the possibility
of the printer having its own memory to buffer incoming characters.)
1. If we assume that the CPU executes a loop in which it sends
characters to the printer as fast as it can, and the loop contains
10 instructions, then (at roughly 100 million instructions per second)
the CPU can send 10 million characters per second and the printer can
handle only 100. If we did this, all but about 1 character in every
100,000 would be lost!
2. To prevent such things from occurring, devices such as printers are
typically built with some sort of status flag which indicates whether
the printer is able to accept a character. This flag will be part of
a status byte that the CPU can read (possibly along with other flags
such as "out of paper" etc.) Thus, the interface to the printer will
include at least two separate ports: an output port to which an ASCII
character may be sent, and an input port which may be read in order to
determine the device's status.
3. Two different kinds of status flags may be used:
a. A "ready" flag would be 1 to indicate that the device is able to
accept a new character, and 0 to indicate that it cannot accept
a character because it is still printing a previous one.
b. A "busy" flag would be 1 to indicate that the device is printing a
previously-sent character, and so cannot accept a new one, and 0
to indicate that it is available to accept a new character.
c. Clearly, these definitions of "ready" and "busy" are complementary;
if the designer of a particular system chose to implement a ready
flag, and a user wanted a busy flag, all he would have to do is
invert the flag supplied.
d. For our discussion, we will assume a printer with a "ready" flag.
We assume the printer manages this flag as follows:
- When it is first turned on, the printer sets its ready flag to 1.
- When it receives a character to be printed, it sets its ready
flag to 0.
- When it has finished printing the character, it sets its ready
flag back to 1.
4. To prevent loss of data, we impose the following requirement: a
character to be printed can only be sent to the printer's output
port when the printer's ready flag is 1.
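The arithmetic in item 1 and the ready-flag rules in item 3d can be sketched
as follows. The Printer class is a hypothetical model written for this
example; it simply counts characters that arrive while the ready flag is 0,
which is exactly what the requirement in item 4 is meant to prevent.

```python
def fraction_lost(cpu_chars_per_sec, printer_chars_per_sec):
    """Fraction of characters lost if the CPU never checks the ready flag."""
    return 1 - printer_chars_per_sec / cpu_chars_per_sec

class Printer:
    """Model of a printer that manages a ready flag as described above."""

    def __init__(self):
        self.ready = True          # set to 1 when first turned on
        self.printed = []
        self.lost = 0

    def write(self, ch):
        if self.ready:
            self.printed.append(ch)
            self.ready = False     # busy printing this character
        else:
            self.lost += 1         # sent while not ready: the character is lost

    def finish(self):
        self.ready = True          # printing complete; ready for the next one
```

With the figures used above, fraction_lost(10_000_000, 100) is 0.99999.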
E. Four basic approaches may be taken to synchronizing the CPU and its
external devices.
1. A strictly hardware-based approach, in which the device interface
"hangs" the CPU if a data transfer is attempted when the device is
not ready, by using handshaking or WAIT control lines.
a. In our example, we might construct a configuration like the
following:
                                          ______
Select for printer from decoder --------o|      \      |\       ____
                                         |       )-----| >o---- WAIT
Ready flag from device -----------------o|______/      |/
                             ____
b. This circuit would assert WAIT to the CPU (thus forcing some number
of wait states to occur) whenever the device was selected by an IO
operation but was not ready for it.
c. Such an approach, however, is seldom desirable. A busy device will
hang up the entire system, enabling it to do nothing else, until
it has finished its work. Thus, we shall not pursue this approach
any further.
2. A strictly software-based approach, in which the CPU tests the device
status before sending data.
a. The simplest form of this is an approach known as busy waiting.
Suppose, for illustration, that:
- Printer data to be printed is to be output to port 40
- Printer status can be read from port 41. The low order bit of
the status is the ready bit.
Then the following subroutine could be used to print a character
from the A register. (The example code is for a Z80):
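PRTCHR  PUSH AF          ; save the character to be printed
WAIT    IN   A,(41H)     ; read the printer status port
        BIT  0,A         ; test the ready bit (Z is set if it is 0)
        JP   Z,WAIT      ; not ready yet - test again
        POP  AF          ; recover the character
        OUT  (40H),A     ; send it to the printer data port
        RET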
Note that the loop from WAIT through JP would typically be
executed over a thousand times for each character printed. As
in the previous strictly-hardware example, the CPU would do nothing
else during this time.
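As a rough check on the "over a thousand times" figure, the sketch below
counts loop iterations per printed character. The T-state counts are the
documented Z80 timings for these three instructions; the 4 MHz clock, the
100 CPS printer, and the absence of wait states are assumptions chosen for
the example.

```python
# Z80 T-states for the three instructions in the WAIT loop.
T_STATES = {"IN A,(n)": 11, "BIT 0,A": 8, "JP Z,nn": 10}

def iterations_per_char(clock_hz, chars_per_sec):
    """Whole passes through the WAIT loop in the time one character prints."""
    t_states_per_char = clock_hz // chars_per_sec
    loop_t_states = sum(T_STATES.values())
    return t_states_per_char // loop_t_states

print(iterations_per_char(4_000_000, 100))   # 1379 under these assumptions
```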
b. Another variant of this is an approach known as polling. Here,
instead of looping forever on a status test, the CPU arranges to
test the status periodically, doing other useful things in the
meantime. Such an approach might be used, for example, on a CPU
dedicated to serving the IO needs of a large number of IO devices
(e.g. a communications "front end" processor on a mainframe system).
Such a processor might execute code like the following:
for i := 1 to NumberOfDevices do
test status of device i
if it is ready, then service it
c. With somewhat more difficulty, polling of devices might be
intermixed with other kinds of computation. This is made difficult
by the need to have the other computational routines "remember"
to call the polling routines from time to time.
d. Except for CPU's totally dedicated to IO, total software control of
IO is rarely satisfactory.
Example: simple microcomputer operating systems such as CP/M and
MS-DOS use a polling approach to keyboard input. For
example, in CP/M the keyboard ready flag is checked under
two circumstances:
- as part of a busy waiting loop whenever the current program
needs keyboard input. (At this point, busy waiting is
appropriate since the program cannot proceed until the data
is obtained.)
- Whenever IO is done to another device such as the screen
or printer, CP/M also checks the keyboard flag.
The effect of this is that one can control-C a running
program which does output to the screen or printer; but the
only way to stop a program that does no IO at all is to
reset the system!
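The polling loop in item b can be made concrete with a small sketch. The
Device class and poll_once function are hypothetical names used only for
illustration; a dedicated front-end processor would repeat poll_once forever,
while a general-purpose program would have to call it from time to time, as
item c describes.

```python
class Device:
    """Stand-in for one IO device with a ready flag."""

    def __init__(self, name):
        self.name = name
        self.ready = False
        self.serviced = 0

    def service(self):
        # Performing the transfer clears the ready flag.
        self.ready = False
        self.serviced += 1

def poll_once(devices):
    """One pass over all devices; service each one that is ready."""
    serviced = []
    for dev in devices:
        if dev.ready:
            dev.service()
            serviced.append(dev.name)
    return serviced
```

A device that becomes ready just after it was tested waits a full pass before
it is serviced, which is one reason pure polling responds slowly when there
are many devices.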
3. A third approach to IO control uses a mixture of hardware and software
techniques, by utilizing the interrupt capabilities of the CPU. This
approach is called interrupt-driven IO.
a. Most CPU's have one or more interrupt control lines, which may be
asserted by an external device. (For now, we assume just one.) We
begin by connecting the device's ready flag to the CPU's
interrupt input, so that an interrupt is requested whenever the
device needs CPU attention:
                                 (other devices)
                                |\      |        ___
Ready flag of device -----------| >o----O------- INT
                                |/      |
(Note: this gate is normally an
open collector gate to allow
multiple devices to connect to the
same line.)
b. We further arrange for the CPU to respond to the interrupt request
by executing a software routine that performs an appropriate data
transfer to/from the device, thus clearing the ready flag and
removing the interrupt condition until the operation is complete,
at which time a new interrupt will be generated.
c. Interrupt-driven IO has several complexities that must be dealt with
in a complete system:
i. With the simple hardware configuration shown above, we assumed
that we would always want the device to interrupt when it becomes
ready. In the case of a device like a printer, however, there
may be times when we have no work for it to do. In such a case,
we want to be able to tell the device to quit interrupting until
some more work comes along. This is conventionally handled by
including an interrupt-enable flag in each interface, which the
CPU can set and clear to determine whether that particular
device may interrupt.
                             ____   (other devices)
Ready flag of device -------|    \         |        ___
        ____                |     )o-------O------- INT
        |  |----------------|____/         |
        |  |
        |__| Interrupt-enable flip flop (settable/clearable by CPU)
ii. If the system has more than one IO device (as it generally does),
some provision must be made to cause the software for the proper
device to be invoked when the interrupt is received.
iii. Further, if two or more devices become ready at the same time we
want to guarantee that each is serviced in turn without
interference from the other. We may wish to prioritize the
interrupts so that the highest priority device gets served
first, and we may even wish to allow a higher priority device to
interrupt a lesser-priority one.
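The interrupt-enable gating in item i, combined with several open-collector
outputs sharing one line, amounts to the following logic. This is a sketch
of the logic only; the function name and the (ready, enabled) pair
representation are assumptions made for the example.

```python
def int_line(interfaces):
    """Level of the shared active-low INT line.

    interfaces is a list of (ready, enabled) pairs, one per device. The line
    is pulled low (0) if any device is both ready and enabled; otherwise it
    floats high (1).
    """
    return 0 if any(ready and enabled for ready, enabled in interfaces) else 1
```

A ready device whose enable flag is clear does not disturb the line, so an
idle printer can be told to stop interrupting until more work arrives.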
d. The problem of identifying the device responsible for the interrupt
can be handled in one of two ways:
i. The simplest approach (from a hardware standpoint) is simply to
have the CPU poll all the devices to see which one is in need
of service. However, this makes the interrupt service routine
slow, and so is not generally desirable.
ii. Instead, most systems make some provision for the interrupting
device to place some data on the system bus to identify itself.
This is done in response to an interrupt acknowledge signal from
the CPU - e.g.
                       _______                              ________
Interrupt request             |                            |
                              |____________________________|
                       ________________                    __________
Acknowledge                            |                  |
                                       |__________________|
                                          ________________
Device identification ___________________/                \________
                                         \________________/
(Note: we assume that the device also uses the acknowledge as an
indicator to remove its request.)
iii. Anything that will uniquely identify the device can be chosen
for the device's response; but the most typical choice is to
require the device to put on the bus a memory address which
is either:
- The starting address of a service routine for the device.
- The address of a memory location which contains the starting
address of a service routine for the device.
(The second option is more flexible since it allows system
software to be restructured without rewiring the devices,
so long as a table of service routine addresses is kept in
a fixed, known location.)
This approach is known as VECTORED interrupts.
Example: On the Intel 8086/8088, memory locations 0..3FF are
reserved for interrupt vectors. An interrupting
device puts an 8 bit interrupt type on the bus,
which the CPU uses as an index into a table of 32
bit addresses of service routines. Each device is
generally assigned a unique interrupt type from
among the 256 possible values. A very similar
approach is used by later members of the 80x86 family,
except the table can be anywhere in memory, not
necessarily at 0..3FF (a special CPU register is
used to point to it.)
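The 8086/8088 table lookup can be sketched as follows. The function names
are hypothetical, and memory is modelled as a plain byte array. Each 4-byte
entry holds a 16-bit offset followed by a 16-bit segment, both stored low
byte first, which is why 256 entries occupy exactly locations 0..3FF hex.

```python
def vector_entry_address(int_type):
    """Address of the 4-byte table entry for an 8-bit interrupt type."""
    if not 0 <= int_type <= 0xFF:
        raise ValueError("interrupt type must fit in 8 bits")
    return int_type * 4

def service_routine_address(memory, int_type):
    """20-bit physical address of the service routine for int_type."""
    base = vector_entry_address(int_type)
    offset = memory[base] | (memory[base + 1] << 8)
    segment = memory[base + 2] | (memory[base + 3] << 8)
    return ((segment << 4) + offset) & 0xFFFFF
```

For example, type 21 hex uses the entry at 84 hex; if that entry holds offset
1234 and segment F000, the routine starts at physical address F1234.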
iv. Vectored interrupts are the more sophisticated of these schemes
and are available on most medium and larger CPUs, including
many micros.
e. The problem of multiple devices interrupting at the same time
can be handled in several different ways.
i. First, all CPU's have some mechanism whereby interrupt
recognition can be temporarily disabled. (The request is
present, but the CPU ignores it until interrupts are
re-enabled.) This allows an interrupt service routine to
protect itself from interrupts by other devices.
ii. But we still have to ensure that when an interrupt is
accepted only one device will respond. One way to do this
is by daisy-chaining. We add one new input and one new output
to each interface, known as IEI (interrupt-enable-in) and IEO
(interrupt-enable out.)
The various interfaces are connected as follows:
___
INT ------------------------------------------------------------
                   |            |            |            |
               ----------   ----------   ----------   ----------
Acknowledge ---|IEI  IEO|---|IEI  IEO|---|IEI  IEO|---|IEI  IEO|
               ----------   ----------   ----------   ----------
                   ||           ||           ||           ||
Other ==========================================================
bus signals
Note that the first device receives an input of 1 when the
CPU acknowledges an interrupt, and either keeps it or passes
it on to the next device. On a Z80, the IEI input to the
first device might be derived as follows:
____       _____
IORQ ----o|     \
__        |      )---- IEI to first device
M1 ------o|_____/
Each interface is now wired something like the following:
                               ____
       -----------------------|    \
       |              |\    /-|     )--------- IEO
       |           +--| >o-/  |____/
       |           |  |/       ____
IEI ---+-----------)----------|    \    Put device's vector
                   |          |     )-- on bus
Internal request --+----------|____/
                   |
                   |  |\               ___
                   +--| >o------------ INT
                      |/
                        ___
- Any device may assert INT, and multiple devices may do so at
the same time.
- However, only the requesting device nearest the CPU will see
the acknowledge signal, and so it alone will put its vector
on the bus.
- To prevent race conditions, however, we must ensure that no
device near the CPU decides to request an interrupt (and
thus "steal" IEI) when a device further down the chain is in
the process of being acknowledged. This can be done by
wiring the internal request so that it cannot be set when IEI
coming into the interface is high.
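The behaviour of the chain during an acknowledge can be sketched as follows.
The function name and the list representation are assumptions for the
example; device 0 is taken to be the one nearest the CPU. Each interface
computes IEO = IEI AND NOT request, and puts its vector on the bus only when
IEI AND request is true.

```python
def daisy_chain_acknowledge(requests, vectors):
    """Return (index, vector) of the device that answers the acknowledge.

    requests[i] is True if device i has an internal request pending.
    Returns (None, None) if no device is requesting.
    """
    iei = True                         # acknowledge enters the first device
    responder = None
    for i, requesting in enumerate(requests):
        if iei and requesting:
            responder = i              # this device puts its vector on the bus
        iei = iei and not requesting   # IEO passed to the next device
    if responder is None:
        return None, None
    return responder, vectors[responder]
```

Because IEO goes low at the first requesting device, every device further
down the chain sees IEI low and stays silent, so position in the chain is the
device's priority.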
iii. Another way to handle the problem of multiple devices
interrupting at the same time is by the use of a special purpose
support chip called a priority encoder.
- As an example, a one out of eight priority encoder has 8 inputs
and 4 outputs. The inputs are numbered 0, 1, 2 ... 7, with
7 being the highest priority input and 0 the lowest.
- One of the outputs is asserted if at least one of the inputs
is asserted. This output is called GS.
- The remaining three outputs encode the number of the highest
priority input that is currently asserted. (If no input is
asserted, these outputs are normally ignored.) These
outputs are designated A2, A1, A0.
Examples:
- No input is asserted. GS is not asserted, A2..A0 ignored.
- Input 4 is asserted. GS is asserted, A2..A0 encode 4.
- Inputs 4,5 asserted. GS is asserted, A2..A0 encode 5.
- All inputs asserted. GS is asserted, A2..A0 encode 7.
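The encoder's behaviour in the examples above can be written as a short
function. This models logical assertion only, not the electrical (often
low-true) levels of a real part; the function name and the use of None for
"outputs ignored" are assumptions made for the example.

```python
def priority_encode(inputs):
    """One-out-of-eight priority encoder.

    inputs is a sequence of 8 booleans; index 7 has the highest priority.
    Returns (gs, code): gs is True if any input is asserted, and code is the
    number of the highest-priority asserted input (None if there is none).
    """
    for n in range(7, -1, -1):         # scan from highest priority down
        if inputs[n]:
            return True, n
    return False, None
```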
TRANSPARENCY - TERRELL PAGE 359
Notes:
- The 8212 is a microprocessor support chip that accepts an
                                                 ___
8-bit data item from an external source when its STB input
                        ___
is asserted. So long as STB remains asserted, the data in
the 8212 will follow the inputs, changing with any change on
                                                       ___
them. At the same time, the 8212 also starts to assert INT.
The data contained in the 8212 is put onto the system bus
                                    ___
when DS2 is asserted, at which time INT is de-asserted.
- In this example, the 8212 is wired to load a value suitable
for forming a Z80 interrupt vector. The connection from
__                            ___
GS of the priority encoder to STB of the 8212 causes the
interrupt request process to begin as soon as any one of the
eight devices requests an interrupt. However, the interrupt
that is actually finally generated will be the highest
priority one in effect when the Z80 acknowledges, since the
8212 follows changes in the output of the priority encoder.
- This circuit relieves the individual interfaces from the task
of generating a vector to put on the bus.
- It is assumed that each individual interface removes its
interrupt request when it is addressed for a data transfer.
Thus if multiple devices are simultaneously requesting
interrupts, each will be serviced in priority order and will
remove its request at that time.
iv. Some CPU's effectively internalize what the priority encoder
does, by having multiple interrupt lines coming in at
different levels.
- For example, the PDP-11 has 4 such lines, designated
BR4 .. BR7, with BR7 being the highest priority.
Each level also has its own acknowledge line.
- The CPU contains a three bit field in the PSW that encodes
a processor priority. (This can range from 0..7). Under
normal conditions, the CPU priority will be 3 or less.
- An interrupt will only be acknowledged when the CPU
priority is less than that of the incoming request - e.g.
a BR4 request will only be acknowledged if the CPU
priority is 3 or less.
- If multiple requests are coming in, the highest priority
request is acknowledged.
- Generally, the service routine for a given device will
see that the PSW is set to a priority level equal to
the priority of the interrupt that called for the service.
This means that a service routine for a level 4 device
(e.g. a terminal) cannot be interrupted by any other
level 4 device, but can be interrupted by level 5 or
higher devices. When the service routine exits, it
resets the CPU priority to what it was on entrance.
v. This last approach is known as MULTIPLE-LEVEL interrupts. It
is the most sophisticated scheme, found on most medium and
larger CPUs (including many 16/32 bit micros)
vi. Of course, it is still possible to have more devices than levels.
(And generally this will be true.) In this case, a daisy
chain can be used to prioritize devices on the same level.
Example:
-------
| CPU |<-- Level 4 daisy chain
|     |<-- Level 5 daisy chain
|     |<-- Level 6 daisy chain
|     |<-- Level 7 daisy chain
-------
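The acknowledge rule for multiple-level interrupts, with a daisy chain breaking ties within a level, can be sketched as follows. This is an illustrative model only, not PDP-11 microcode: the function name select_request, the dictionary layout, and the device names are assumptions made for the example.

```python
def select_request(cpu_priority, pending):
    """Pick the interrupt request to acknowledge, if any.

    cpu_priority: current processor priority from the PSW (0..7).
    pending: dict mapping a request level (4..7) to a list of requesting
             devices on that level, ordered by daisy-chain position
             (the device nearest the CPU comes first).
    Returns (level, device), or None when no request may interrupt.
    """
    for level in sorted(pending, reverse=True):     # highest level first
        # A request is honored only if its level exceeds the CPU priority.
        if pending[level] and level > cpu_priority:
            return level, pending[level][0]         # nearest device wins
    return None


if __name__ == "__main__":
    # CPU running at priority 4 (e.g. inside a level 4 service routine).
    print(select_request(4, {4: ["terminal"], 5: ["disk", "tape"]}))
```

Note that a service routine running at level 4 is not interrupted by another level 4 request, because the comparison is strictly greater-than.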
4. Another approach to IO control is direct memory access (DMA). In this
approach, the data transfers themselves are carried out entirely by hardware.
a. We have noted that the speed of IO devices ranges from several
thousand times slower than the CPU to as fast as the CPU itself.
When device speeds approach those of the CPU, the other forms of
IO control we have discussed cease to be usable, since any form
of software IO requires several machine instructions (at least)
to transfer a single item of data. Thus, when device speeds
approach 10% or so of CPU speed, software control of IO becomes
impractical.
b. The alternative for fast devices is to allow the device interface
to gain control of the system bus each time it has a data item
to transfer. This means, in essence, that the interface must
contain some of the capabilities of a CPU.
- It must be able to generate the various system bus signals for
a MEMORY operation (such as MREQ) and to gate its own address
and data information onto the bus.
- Typically, the interface needs at least two registers of its
own:
- A memory address register to keep track of where the next
transfer is to go to/come from. This register must be
incremented after each transfer.
- A counter to keep track of the number of data items remaining to
be transferred. It is decremented after each transfer; typically,
the DMA interface will interrupt the CPU when this count reaches 0.
- Often, a third register is needed. If the device itself is
addressable (as would be true in the case of a disk, say), then
the interface also needs a DEVICE ADDRESS register to keep track
of the location on the device to/from which the transfers occur.
Example: (for 8-bit CPU with 16 bit address):
                                                       D Bus
                                                      ^^^^^^^^
                                                      ||||||||
                                                   --------------
                                                   | Tri-state  |
                                                   | buffers    |<- TS Enable
                                                   |____________|
                          A Bus                       ||||||||
                     ^^^^^^^^^^^^^^^^                 |===============> To device
                     ||||||||||||||||                 ||||||||
                ---------------------------    -----------------------
TS enable  ---> | 16 bit address register |    | 8 bit data register |
Load lower ---> | with tri-state outputs  |    | with latched inputs |<--Load
Load upper ---> |                         |    | and 2 sets of       |
Increment  ---> |                         |    | tri-state outputs   |
                ---------------------------    -----------------------
                     ^^^^^^^^ ^^^^^^^^                ^^^^^^^^
                     |||||||| ||||||||                ||||||||
                       D Bus    D Bus                 ||||||||
To internal __________                         -----------------------
control              |                         |         MUX         |<- In/Out
                -----------                    -----------------------
                | Counter |<-Load                 ^^^^^^^^ ^^^^^^^^
                -----------                       |||||||| ||||||||
                  ^^^^^^^^                          D Bus   Device
                  ||||||||
                   D Bus
c. When a DMA device is in use, the only task of the software is to
load up the registers in the interface and start it doing the
transfer.
d. Because DMA interfaces are complex, DMA is typically used only in
cases where device speeds make it necessary.
e. One other issue arises in conjunction with DMA interfaces:
cycle-stealing versus burst mode transfer:
i. Most interfaces are designed so that they have to
go through the process of requesting the bus and waiting for
acknowledgement for EACH data transfer done. This is fine,
so long as the data rate of the device is low enough. This
mode is called CYCLE-STEALING, because each transfer "steals"
one memory cycle from the CPU.
ii. For very fast devices (e.g. some disks), there might not be
enough time to allow the interface to request and wait for the
bus for each transfer. Such interfaces may work in a BURST MODE
in which, once the interface has control of the bus, it keeps
it until a whole block of transfers is done - i.e. it goes through
repeated memory cycles, but holds the bus request active the
whole time without ever releasing it.
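The register behavior and the difference between cycle-stealing and burst mode can be sketched with a toy model. Everything here is an assumption made for illustration (the class name DMAInterface, its methods, and the use of a bytearray as system memory); it counts bus requests rather than modelling real bus timing, and it handles only device-to-memory transfers.

```python
class DMAInterface:
    """Toy DMA interface: an address register, a counter, a data path."""

    def __init__(self, memory):
        self.memory = memory        # bytearray standing in for system RAM
        self.address = 0            # memory address register
        self.count = 0              # data items remaining to transfer
        self.bus_requests = 0       # number of times the bus was requested
        self.interrupt = False      # raised when the count reaches 0

    def start(self, address, count):
        """The only software task: load the registers and start."""
        self.address, self.count = address, count
        self.interrupt = False

    def _transfer_one(self, byte):
        self.memory[self.address] = byte
        self.address += 1           # address increments after each transfer
        self.count -= 1
        if self.count == 0:
            self.interrupt = True   # tell the CPU the block is finished

    def run(self, device_bytes, burst=False):
        """Move bytes from the device into memory."""
        data = iter(device_bytes)
        if burst and self.count > 0:
            self.bus_requests += 1          # one request, held for the block
        while self.count > 0:
            if not burst:
                self.bus_requests += 1      # cycle stealing: one per item
            self._transfer_one(next(data))


if __name__ == "__main__":
    ram = bytearray(16)
    dma = DMAInterface(ram)
    dma.start(address=4, count=3)
    dma.run(b"abc")
    print(ram[4:7], dma.bus_requests, dma.interrupt)
```

In this model the same three-byte block costs three bus requests in cycle-stealing mode and one in burst mode, which is the trade-off described in item e.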
5. Finally, we mention the use of separate, dedicated IO processors to
almost totally remove IO responsibility from the CPU. For example,
we have already mentioned the IBM 370 configuration. Here,
communication between the CPU and the Pio is generally by means of
shared memory plus the ability for each processor to interrupt the
other.
6. We can contrast these approaches in terms of the extent to which they
allow for the OVERLAPPING of computation and IO:
Busy-waiting        No overlap possible: computation halts while
                    IO is being done
Polling             Some overlap possible
Interrupt-driven    Considerable overlap - but CPU must still pause
                    computation to handle each byte transferred
DMA and Pio         Total overlap - CPU is only involved in
                    initiating the request.
IV. Interrupts on the Z80
                                                        ___     ___  ___
A. We have noted that the Z80 has two interrupt inputs: INT and NMI. INT
is the one typically used for ordinary IO activity.
1. The Z80 contains an internal flip-flop called IFF1 which allows
software to control the recognition of interrupts. The following
gating structure is present on the chip itself:
___         _____
INT ------o|     \
           |      )---- interrupt recognition circuits.
IFF1 ------|_____/
                                                                   ___
Note the effect of this: when IFF1 is clear, an external signal on INT
is simply ignored by the CPU until IFF1 is set.
2. The setting and clearing of IFF1 is controlled by two instructions:
EI (enable interrupts) and DI (disable interrupts.) Also, IFF1 is
automatically cleared under certain circumstances:
a. When the CPU is reset. Any software that wants to use interrupts
must therefore explicitly enable them as part of its initialization
code.
b. When an interrupt is acknowledged. This means that the routine
which responds to an interrupt must re-enable interrupts, typically
just before it exits.
c. When a non-maskable interrupt is received. In this case, the
hardware stores the current value of IFF1 in a second flip-flop
IFF2 and clears IFF1. When the non-maskable interrupt handler
software exits, it may use the RETN instruction to return from
the interrupt and reset IFF1 to the saved value in IFF2.
                                                                     ___
3. The Z80 has three different interrupt modes that determine how an INT
interrupt (if enabled) is actually handled. These modes are called
mode 0, mode 1, and mode 2. Software may set the mode by using the
instructions IM 0, IM 1, and IM 2. (The default mode at reset is 0.)
a. Mode 0 is also called the 8080 mode, since in this mode the Z80
behaves like an 8080.
b. Mode 1 provides a very simple mechanism for simple systems. It
requires minimal hardware in the interface.
c. Mode 2 provides a more sophisticated mechanism for more complex
systems.
d. We will discuss the modes in the order 1, 0, then 2.
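The interrupt-enable rules in items 1 and 2 can be collected into a small state model. This is a sketch that follows the simplified description given in these notes, not a full account of the chip; the class and method names are hypothetical.

```python
class InterruptEnable:
    """Toy model of IFF1, IFF2 and the interrupt mode, per the notes above."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.iff1 = False       # reset clears IFF1, so software must EI
        self.iff2 = False
        self.mode = 0           # default interrupt mode at reset

    def ei(self):
        self.iff1 = True        # EI instruction

    def di(self):
        self.iff1 = False       # DI instruction

    def im(self, mode):
        if mode not in (0, 1, 2):
            raise ValueError("mode must be 0, 1 or 2")
        self.mode = mode        # IM 0, IM 1, IM 2

    def int_request(self):
        """Return True if a maskable interrupt request is acknowledged."""
        if not self.iff1:
            return False        # ignored until IFF1 is set
        self.iff1 = False       # acknowledging clears IFF1
        return True

    def nmi(self):
        self.iff2 = self.iff1   # hardware saves IFF1 in IFF2
        self.iff1 = False

    def retn(self):
        self.iff1 = self.iff2   # RETN restores the saved value


if __name__ == "__main__":
    cpu = InterruptEnable()
    print(cpu.int_request())    # interrupts are disabled after reset
    cpu.ei()
    print(cpu.int_request())    # now acknowledged, and IFF1 is cleared again
```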
B. Mode 1 interrupts on the Z80.
                                                                     ___
1. When interrupt mode 1 is selected, the Z80 responds to an enabled INT
interrupt as if the program had executed an RST 38 instruction.
That is:
a. The address of the instruction that was about to be fetched when
the interrupt was accepted is instead pushed on the system stack.
b. The PC is loaded with 38H.
c. Note that this is the same as a CALL 38H.
d. It is assumed that memory locations 38 and following contain a
routine to service the interrupt. Often, the code here will
simply jump to a service routine elsewhere in memory.
e. Note that 38H is within the address space of the MPF-I monitor
ROM. The MPF-I code beginning at 38H is the following:
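PUSH HL          ; save HL on the stack
LD HL,(FF01)     ; HL <- word stored at FF01..FF02
EX (SP),HL       ; top of stack <- that word; HL <- its original value
RET              ; pop the word into the PC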
- The effect of the first three instructions is to push the
word located at addresses FF01..FF02 (in RAM) onto the stack,
without destroying any register. (HL is used, but its original
value is restored by the exchange.)
- The RET pops this value and transfers control to it.
- The MPF-I user can therefore route a mode 1 interrupt to a
service routine anywhere in memory by placing the address of
the routine in RAM locations FF01..FF02. During initialization,
the MPF-I monitor loads these locations with the address of
its breakpoint service routine, so a mode 1 interrupt would be
treated as a program breakpoint in the absence of user
alteration of these locations.
2. In addition to the above, the Z80 also clears IFF1, disabling further
interrupts until the software executes EI again.
3. The interrupt handler software that is invoked by the interrupt should
behave as follows:
a. It must save and later restore any registers it uses, to avoid
messing up the program that was interrupted. Often, this is done
by using EX AF, AF' and EXX both at the start and finish of the
routine; but this may only be done if no other software uses these
instructions.
b. It can return to the program that was interrupted as follows:
EI
RETI
(Note: the RETI is essentially the same as an RET, and an RET could
be used instead in many cases. But RETI does have a purpose in
connection with certain special devices to be discussed later.)
4. A mode 1 interrupt does not expect the hardware to do anything
special in response to the acknowledge signal from the CPU; thus,
it imposes the least hardware requirements on the system.
5. A key limitation of mode 1 is that if the system has more than one
device capable of generating an interrupt, then there is no way for
the service routine to know "who did it", short of somehow polling all
of the devices to see which of them have their ready flags set.
On a system that has 8 or fewer devices capable of interrupting, a
fairly elegant scheme can be used. (This also illustrates how hardware
can be constructed to allow the software to selectively disable
interrupts from certain devices at certain times.)
TRANSPARENCY FROM TERRELL PAGE 347
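The way the four-instruction monitor routine at 38H (item B.1.e) routes a mode 1 interrupt can be traced with a short simulation. This is a sketch under stated assumptions: the stack is a Python list, RAM is a dictionary of bytes, and the addresses 1850 and 1800 used in the demonstration are made-up values, not MPF-I facts.

```python
def mode1_interrupt(pc, hl, stack, ram):
    """Trace a mode 1 interrupt through the monitor code at 38H.

    pc:    address of the instruction that was about to be fetched.
    hl:    contents of HL when the interrupt was accepted.
    stack: Python list used as the stack (end of the list is the top).
    ram:   dict mapping address -> byte; FF01..FF02 hold the user's vector.
    Returns (new_pc, hl) as they stand after the RET.
    """
    stack.append(pc)                            # interrupt acts like RST 38
    stack.append(hl)                            # PUSH HL
    hl = ram[0xFF01] | (ram[0xFF02] << 8)       # LD HL,(FF01), low byte first
    stack[-1], hl = hl, stack[-1]               # EX (SP),HL
    new_pc = stack.pop()                        # RET
    return new_pc, hl


if __name__ == "__main__":
    stack = []
    ram = {0xFF01: 0x00, 0xFF02: 0x18}          # hypothetical routine at 1800
    new_pc, hl = mode1_interrupt(0x1850, 0xBEEF, stack, ram)
    print(hex(new_pc), hex(hl), [hex(x) for x in stack])
```

After the trace, control is at the user's routine, HL is unchanged, and the only thing left on the stack is the return address into the interrupted program.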
C. Mode 0 interrupts on the Z80
1. Mode 0 on the Z80 imitates the interrupt-handling provisions of the
8080. When an interrupt is accepted in mode 0, the CPU does an
instruction fetch cycle, not from memory but from the IO bus. The
interrupting device is expected to recognize the interrupt
                            ____     __
acknowledgement on the bus (IORQ and M1 both low) and to place a 1-byte
opcode on the data bus. The CPU treats this as an instruction to
execute, just as if it had been fetched from memory.
2. In addition to the above, the Z80 also clears its interrupt-enable,
of course.
3. While any 1-byte opcode may be used as the response from the device,
the most common choice will be one of the 8 RST instructions
(opcodes C7, CF, D7, DF, E7, EF, F7 and FF). Each of these
instructions behaves as follows:
a. Push the current PC on the system stack.
b. Transfer control to one of the eight memory locations 0, 8, 10, 18,
20, 28, 30, or 38 (all in hex).
c. In other words, these instructions behave like a subroutine call;
but the subroutine address is implicit in the opcode rather than
requiring two more bytes in the instruction.
d. These instructions were put in the 8080 instruction set primarily
for use as interrupt-acknowledgement responses by devices. Due
to the requirement that the response be only 1 byte long, ordinary
CALL instructions could not be used.
4. Mode 0, then, allows for 8 different devices to each force the Z80 to
execute an appropriate service routine.
Example: A system with a keyboard, display, and disk, each
capable of generating an interrupt. The system designer
decides to put the first few instructions of the service
routines in the following locations:
10 - service routine for keyboard
20 - service routine for display
30 - service routine for disk
When the keyboard generates an interrupt and sees the
acknowledge coming back, it will put the opcode for
RST 10 (D7) on the data bus.
5. Mode 0 is not terribly useful on the MPF-I, because the monitor
uses the reserved locations for the RST instructions for other
purposes, except for location 38 (as discussed above). Thus, only
RST 38 is usable.
6. As an aside, note that mode 1 can be thought of as a special case of
mode 0, in which the bus is not actually read; instead, an FF
(RST 38) is used as the op-code to execute.
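The relationship between the eight RST opcodes and their target locations is regular enough to compute: the opcode has the bit pattern 11ttt111, and the target address is ttt times 8. A minimal sketch (the function names are made up for illustration):

```python
RST_OPCODES = (0xC7, 0xCF, 0xD7, 0xDF, 0xE7, 0xEF, 0xF7, 0xFF)


def rst_target(opcode):
    """Return the address that an RST opcode transfers control to."""
    if (opcode & 0xC7) != 0xC7 or not 0 <= opcode <= 0xFF:
        raise ValueError("not an RST opcode: %02X" % opcode)
    return opcode & 0x38            # keep only the three 'ttt' bits


def rst_opcode(address):
    """Return the opcode a device must supply to reach a restart address."""
    if address not in range(0x00, 0x40, 8):
        raise ValueError("not a restart location: %02X" % address)
    return 0xC7 | address


if __name__ == "__main__":
    for op in RST_OPCODES:
        print("%02X -> %02X" % (op, rst_target(op)))
```

For the keyboard in the example above, rst_opcode(0x10) gives D7, the byte the keyboard interface places on the data bus.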
D. The most flexible Z80 interrupt mode is mode 2. Mode 2 provides for
VECTORED interrupts.
1. In a vectored interrupt scheme, the device supplies the address of
a memory location which in turn contains the address of a service
routine for the device.
2. Since the Z80 data bus is only 8 bits wide and an address is 16
bits, the vector address is specified as follows:
a. The Z80 contains an internal I register, which must be pre-set
by the programmer to contain the high order byte of the
address of the interrupt vectors. Two instructions allow
access to the I register:
LD I,A
LD A,I
b. The device, when responding to an interrupt acknowledge, will
place the low order byte of the vector address on the bus.
(This value must always be even.)
c. Note that the combination I+D-Bus is not the address of the
actual service routine to execute, but rather the address of
a memory location containing the address of the service
routine.
3. As with the other modes, the Z80 pushes the PC and disables
further interrupts before loading the PC with the value specified
by the vector.
4. Example: A system with keyboard, display, and disk. The interrupt
handlers for these devices begin at
1234
2017
3029
respectively. The system designer decides to put the interrupt
vector table at locations 8000 on up.
a. At system startup, the I register will be loaded with 80.
b. Memory locations 8000..8005 will contain 34 12 17 20 29 30
(remember byte-reversed format.)
c. The keyboard interface will respond to interrupt acknowledge
by placing 00 on the bus; the display 02, and the disk 04.
5. With vectored interrupts, up to 128 different devices can be
accommodated.
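The two-step lookup in mode 2 can be checked against the keyboard/display/disk example with a short sketch. The function names and the use of a 64K bytearray as memory are assumptions made for illustration; the addresses are the ones from the example.

```python
def build_vector_table(memory, base, handlers):
    """Store each 16-bit handler address at base, low byte first."""
    for i, handler in enumerate(handlers):
        memory[base + 2 * i] = handler & 0xFF       # low byte
        memory[base + 2 * i + 1] = handler >> 8     # high byte


def mode2_handler(memory, i_register, bus_byte):
    """Return the service routine address for a mode 2 interrupt.

    i_register: high byte of the vector table address (the I register).
    bus_byte:   low byte supplied by the device; it must be even.
    """
    if bus_byte & 1:
        raise ValueError("vector low byte must be even")
    pointer = (i_register << 8) | bus_byte          # address of the vector
    return memory[pointer] | (memory[pointer + 1] << 8)


if __name__ == "__main__":
    memory = bytearray(0x10000)
    build_vector_table(memory, 0x8000, [0x1234, 0x2017, 0x3029])
    print(memory[0x8000:0x8006].hex())
    for name, low in [("keyboard", 0x00), ("display", 0x02), ("disk", 0x04)]:
        print(name, "%04X" % mode2_handler(memory, 0x80, low))
```

Since the low byte must be even, it can take 128 distinct values, which is where the limit of 128 devices comes from.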
E. NMI on the Z80
                                                             ___
1. We now can say something about how the Z80 responds to an NMI.
2. No interaction with external hardware is involved, and no acknowledge
signal is put on the bus.
3. The Z80 does the following:
a. Push the address of the next instruction.
b. Put 66H in the PC.
c. Save IFF1 into IFF2 and clear IFF1.
4. NMI interrupt handling routines should terminate with RETN, which
restores IFF1 from IFF2.
Copyright ©1999 - Russell C. Bjork