CS222 Lecture: Overall System Structure last revised 3/30/99
Objectives:
1. To overview the major building blocks of a complete computer system
2. To discuss issues and options in the design of bus systems.
Materials: Transparency of Stallings page 76
I. Introduction
-- ------------
A. At the start of the course, we noted that a computer system can be
described at five different levels of detail. More recently, we have
been focussing on one of those levels - the hardware design level - and
have divided it into three sublevels. What are they? ASK
1. The system level
2. The CPU implementation level
3. The logic design level
We focussed on the lowest of these three levels in CS221; we have been
looking at the middle level in the last week or so; now we move up
to the overall, or system level.
B. We will do this by looking at two issues:
1. The major kinds of building blocks from which complete systems are
constructed.
2. The way in which these building blocks are interconnected in order to
produce a complete system.
C. As we are doing this, we will also introduce a system of notation that
can be used to describe an overall system.
1. In this system, each major component of a system is denoted by a
single upper-case letter, with each type of component being denoted
by a different letter - e.g. a P is a processor, an M is a memory
element, etc.
2. The description is sometimes further qualified - e.g. an M might be
a. Primary memory, such as semiconductor RAM (M.primary or M.p or Mp).
b. Secondary memory, such as disk (M.secondary or M.s or Ms)
etc.
3. A formal definition of the syntax of the notation we will use is
contained in an appendix to:
C. Gordon Bell and Allan Newell. Computer Structures: Readings
and Examples (NY: McGraw Hill, 1971)
II. Basic Building Blocks
-- ----- -------- ------
A. All computer systems are built by combining certain kinds of basic
building blocks, which fall into several broad categories. (We will
overview the categories now and study each in detail later in the
course.)
B. The following are basic components:
1. Memories (M)
a. A memory is a device that stores information without altering its
meaning or form - i.e. if a certain binary value is stored into a
memory, then that exact same binary value can be retrieved at a
later time.
b. The simplest form of memory is a register, which stores a single
atomic value. Registers are basic building blocks of many
components of a system. Those contained in the CPU are often
directly visible to the assembly language programmer (e.g. the
accumulator of the von Neumann machine.)
c. Other memories store multiple values, with some mechanism being
used to specify a particular value that is to be read or written.
i. Primary memory is typically organized using a linear addressing
scheme, so that each value stored is assigned a unique address
in the range 0 .. total_size - 1.
ii. Secondary memory may require that values be addressed by
physical position - e.g. surface, track, sector, location
within sector.
d. Note that we can further distinguish different types of memory -
e.g. Mp, Ms.
2. Data elements (D)
a. A data element is a device that changes the meaning of information
without altering its form - i.e. it may take in a binary value and
output a binary value that is the result of some computation on the
input value.
b. A simple example is a shifter, which receives a binary value and
outputs a binary value; but the value outputted is different from
the value put in - e.g. if the shifter does a left shift then the
output is the input value * 2. (Devices like adders also fall into
this category.)
c. A more complicated example is an ALU, which can perform any one of
several operations on an operand or set of operands presented to
it. (This can be realized by a set of combinatorial circuits - one
per function - plus MUX(es) to control the inputs to the circuits
and to select which function appears as the output.)
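To make the MUX-based ALU idea concrete, the following is a small C sketch,
not a description of any particular machine: each "circuit" computes its
result unconditionally, and a select input - played here by a switch
statement - chooses which result appears at the output. The operation names
and the 16-bit width are invented for illustration.

/* Sketch of a D element (ALU): one combinational "circuit" per function,
 * with the switch acting as the MUX that selects the output. */
#include <stdio.h>
#include <stdint.h>

enum alu_op { ALU_ADD, ALU_AND, ALU_OR, ALU_SHL };   /* invented op codes */

static uint16_t alu(enum alu_op op, uint16_t a, uint16_t b)
{
    /* Each result is computed unconditionally, as the hardware would. */
    uint16_t sum  = (uint16_t)(a + b);
    uint16_t conj = a & b;
    uint16_t disj = a | b;
    uint16_t shl  = (uint16_t)(a << 1);   /* left shift: output = input * 2 */

    switch (op) {                          /* the "MUX" selecting one output */
    case ALU_ADD: return sum;
    case ALU_AND: return conj;
    case ALU_OR:  return disj;
    case ALU_SHL: return shl;
    }
    return 0;
}

int main(void)
{
    printf("5 << 1 = %d\n", alu(ALU_SHL, 5, 0));   /* prints 10, i.e. 5 * 2 */
    printf("5 + 3  = %d\n", alu(ALU_ADD, 5, 3));
    return 0;
}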
3. Transducers (T)
a. A transducer is a device that changes the form of information without
altering its meaning.
b. Most IO devices are transducers. For example:
i. A keyboard transforms the representation of a character as the
physical motion of a key into the representation of that same
character as a binary code.
ii. A screen transforms the representation of a character as a
binary code into the representation of that same character as
a pattern of dots on the screen.
4. Control elements (K)
a. A control element is a device that controls the operation of other
devices.
b. Control elements are frequently used to interface between various
devices. For example, disks (a type of M) typically have
controllers that control operations such as the positioning of the
heads.
c. Control elements are frequently realized as state machines.
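As an illustration of the state-machine view of a K, here is a small C
sketch of an imaginary controller that steps through idle / seeking /
transferring states in response to events. The states and events are made
up for illustration; no real disk controller is being described.

/* Sketch of a control element (K) realized as a state machine. */
#include <stdio.h>

enum state { IDLE, SEEKING, TRANSFERRING };
enum event { CMD_SEEK, SEEK_DONE, XFER_DONE };

static enum state step(enum state s, enum event e)
{
    switch (s) {
    case IDLE:         return (e == CMD_SEEK)  ? SEEKING      : IDLE;
    case SEEKING:      return (e == SEEK_DONE) ? TRANSFERRING : SEEKING;
    case TRANSFERRING: return (e == XFER_DONE) ? IDLE         : TRANSFERRING;
    }
    return IDLE;
}

int main(void)
{
    enum state s = IDLE;
    enum event script[] = { CMD_SEEK, SEEK_DONE, XFER_DONE };
    for (int i = 0; i < 3; i++) {
        s = step(s, script[i]);
        printf("after event %d -> state %d\n", i, s);
    }
    return 0;
}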
C. The following are used to connect basic components:
1. Links (L)
a. A link is a path for transmitting information between two points
without altering either its meaning or form.
b. Technically, even a piece of wire is a link. But links are
explicitly noted in describing a system only when the link has
important characteristics that affect system performance, such as
speed limitations.
2. Switches (S)
a. A switch is a device that provides alternate paths for information
between other devices - i.e. a set of potential links.
i. In the most sophisticated case, several links can be active at
the same time - i.e. the switch behaves like a telephone
switchboard.
ii. Simpler switches allow only one link to be active at a given
time.
b. Note that the system bus that forms the heart of many computers is
a form of switch that allows only one link to be active at a time.
D. One last basic component is the processor (P)
1. A processor can actually be viewed as composed of simpler components:
One or more M's (registers)
D (ALU)
K (control)
One or more S's (internal data paths)
2. However, in describing a complete system it is often expedient to
treat a processor as a single unit without worrying about its internal
structure.
3. What makes a processor a processor is that it is capable of fetching,
interpreting and executing instructions - i.e. it is programmable.
4. Every computer system has at least one processor - a central processor.
a. Some have additional processors - e.g. systems with multiple CPU's
or mainframe systems that have a single CPU and some number of IO
processors or a personal computer having a single CPU and a
floating-point coprocessor.
b. When several processors are present in a system, we qualify the P
symbol to denote the type of each - e.g.
P.central or P.c or Pc
P.input_output or P.io or Pio
P.floating_point or P.fpp or Pfpp
E. When we describe networks of interconnected computer systems, we may
choose to treat an entire computer system as a single building block, for
which we would use the letter C.
F. Some examples of complete systems:
1. The von Neumann machine:
M--P--T
or: K
/|\
M
/ | \
T--D--T
2. IBM 370/155:
Mp(#0) --
|
Mp(#1) ---- S ---|--- M.cache --- Pc --- T('Console)
| |----------------|
Mp(#2) -| | |--Ms(#0;fixed head disk)
| |--- Pio(#0) -- K --|
Mp(#3) -| | |--Ms(#1;fixed head disk)
|
|
.
.
.
| |--Ms(#0;movable arm disk)
|--- Pio(#4) -- K --|
|--Ms(#1;movable arm disk)
|
|--Ms(#2;movable arm disk)
3. A network of PC's connected to a central file-server via ethernet:
C C C C C
|_______|_______|_______|_______|___ S.ethernet
|
C.server
III. Computer Connection Structures
--- -------- ---------- ----------
A. At the highest level, a complete computer system can be regarded as being
built up from three basic modules:
1. One or more CPU's - often one per system, but sometimes more than one
(multiprocessor systems).
2. The memory system, consisting of one or more kinds of memory plus
associated controllers.
3. The IO system, consisting of some number of IO devices plus associated
controllers.
B. A key design issue in building a computer system is how these various
devices are to be CONNECTED.
1. The SPEED at which information flows between the various components
of the system can be the determining factor in overall system
performance. If system performance is limited by the speed of the
interconnection system, then technical improvements to the individual
components will not result in performance gains.
2. Interconnection systems often have a longer lifetime than individual
components; thus design decisions concerning them can have long-term
implications.
Example: DEC designed the UNIBUS as an interconnect structure for
its PDP-11's in 1970. The PDP-11 CPU went through a series of
generations, but the UNIBUS architecture connecting the CPU to
peripherals remained the same. Early members of the VAX family -
introduced in the late 1970's - still used a UNIBUS for connecting to
many of their IO devices. Thus, the UNIBUS architecture lived through
several generations of PDP-11 CPU and on into the next CPU
architecture.
(Note: later PDP-11's and some VAXes used a bus structure called the
QBus, and several other bus architectures have been used on
various VAX models over the years.)
C. By far the most commonly-used interconnect structure today is a bus
structure in which all modules connect to a set of shared lines called a
BUS SYSTEM.
1. Today there are several industry-standard bus architectures which allow
components from several different manufacturers to be assembled
together into a single system.
Example: The PCI Bus is widely used in both Wintel PCs and Macintoshes.
2. In addition, many manufacturers have their own proprietary bus
architectures.
3. Actually - as we shall see - a given computer may have several bus
systems. Further, there are a number of fundamental design choices
which must be made when designing such a system.
IV. Overview of Bus Systems
-- -------- -- --- -------
A. A bus system is a simple form of switch (S)
1. A bus consists of a collection of wires that individual components
(CPU's, memories, and IO device interfaces) plug into.
2. These wires include some for carrying addresses, some for carrying
information, and some for control.
TRANSPARENCY: STALLINGS P. 76
Example: the Z80 system bus consists of 40 lines: 16 for address, 8 for
data, and 16 for control (including power, ground, clock.)
Example: the UNIBUS consists of 56 lines: 18 for address, 16 for data,
and 22 for control.
B. Generally, bus architectures are standardized - either by one company or
by an industry group such as IEEE. This allows many different kinds of
devices to be built to plug into it. Each device that plugs into a given
bus must know what signals to expect on what pins and what protocols will
be used to exchange information over the bus.
C. The transfer of a piece of information between two devices on the bus
involves a BUS CYCLE.
1. For each bus cycle, one device on the bus is designated the BUS MASTER,
and the other is designated the SLAVE.
a. The master may be the same device all the time, or provision may be
made to allow different devices to become bus masters on different
cycles.
i. CPU's are almost always masters.
ii. Memories are almost always slaves.
iii. IO devices are generally slaves when receiving commands or data
from the CPU, but can be masters when doing DMA transfers from
memory.
b. If multiple devices can be bus masters, then each cycle must be
preceded by some arbitration period when one device is chosen to
be the master for the current cycle. (This is generally done on a
priority basis.)
2. The first part of a bus cycle involves the bus master putting an
address on the bus, with the expectation that the slave device
will recognize it and respond.
a. When the slave is a memory, the address may consist of two parts:
i. The high order bits designate a particular memory device (if
there is more than one on the system).
ii. The low order bits designate a particular location in that
memory.
b. When the slave is an IO device, the address serves only to
designate the particular device. Furthermore, sometimes only a
portion of the address bits are used for this purpose, since the
number of IO devices on the system is usually small compared to
the total number of individually-addressable memory locations.
3. At the same time, the bus master uses control signals to indicate what
type of data transfer is to be performed.
a. Type of slave being addressed (memory, IO, or perhaps
a coprocessor). (Note: some bus architectures don't need this - no
distinction is drawn between types of devices.)
b. Direction: master to slave (write) or slave to master (read) or
(sometimes) slave to master to slave (read-modify-write).
c. Quantity of data to be transferred (one byte, one word, or
(sometimes) a block of contiguous locations).
4. The second part of the bus cycle involves the actual data transfer.
a. In the simplest case, one unit of data (i.e. as many bits as
there are in the data part of the bus) is transferred.
b. It is also possible to do BLOCK MODE transfers, in which several
units of data are transferred from successive addresses, one
after another. (The address specified in the address phase is
the address of the first unit of data).
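The following C sketch ties together the pieces of a bus cycle described
above: the address is decoded into a module number (high-order bits) and a
location (low-order bits), the direction line selects read or write, and a
count greater than one stands in for block mode. The 2-bit module field,
14-bit offset, and function names are invented for illustration; no real
bus is being modelled.

/* Sketch of one bus cycle: address decode, direction, optional block mode. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define MODULE_BITS  2                        /* high-order bits: which module */
#define OFFSET_BITS  14                       /* low-order bits: location      */
#define MODULE_SIZE  (1u << OFFSET_BITS)

static uint8_t memory[1 << MODULE_BITS][MODULE_SIZE];   /* 4 modules of 16K */

/* 'write' plays the role of the direction control line; count > 1 = block mode */
static void bus_cycle(uint16_t address, bool write, uint8_t *data, unsigned count)
{
    unsigned module = address >> OFFSET_BITS;        /* decode high-order bits */
    unsigned offset = address & (MODULE_SIZE - 1);   /* decode low-order bits  */

    for (unsigned i = 0; i < count; i++) {
        if (write)
            memory[module][offset + i] = data[i];
        else
            data[i] = memory[module][offset + i];
    }
}

int main(void)
{
    uint8_t out[4] = { 1, 2, 3, 4 }, in[4];
    bus_cycle(0x8000, true,  out, 4);   /* block write to module 2, offset 0 */
    bus_cycle(0x8000, false, in,  4);   /* block read back                   */
    printf("%d %d %d %d\n", in[0], in[1], in[2], in[3]);
    return 0;
}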
D. Bus cycles are used for a variety of different purposes. (Note: high
performance CPU's have a small amount of very high speed on-chip memory
known as cache memory that allows most of these accesses to be done
without the need for an actual bus cycle - more on this later.)
1. Each instruction executed by the CPU may involve a bus cycle for
INSTRUCTION FETCH. Here the master is the CPU and the slave is memory.
a. On CPU's that allow variable-length instructions, several bus
cycles may be needed to fetch a complete instruction.
b. As we have seen, some CPU's prefetch instructions - i.e.
they try to maintain a lookahead of several words into the
instruction stream by issuing instruction fetch cycles when the
bus is not otherwise in use.
2. Instructions executed by the CPU may involve additional bus cycles
with the CPU as master and memory as the slave - for:
a. OPERAND ADDRESS CALCULATION may or may not require a bus cycle.
(A bus cycle is required when indirect addressing is used.)
b. OPERAND FETCH (possibly more than one on 2 or 3 address machines).
c. OPERAND STORE (possibly more than one, though multiple stores for
a single instruction are less common than multiple fetches).
3. IO instructions will involve a bus cycle with the CPU as master and
an IO controller as the slave. This may involve:
a. TRANSFER OF A COMMAND (to the controller).
b. TRANSFER OF STATUS INFORMATION (from the controller).
c. TRANSFER OF DATA TO/FROM AN IO DEVICE.
4. On many systems, IO controllers can also initiate bus cycles. These
are of two types:
a. INTERRUPT CYCLES (CPU is the slave).
b. DMA CYCLES (memory is the slave).
5. On multi-processor systems, bus cycles may also be used for
INTER-PROCESSOR COMMUNICATION.
E. Implementation of a bus
1. This is an interesting problem because, in general, it must be possible
for different devices to drive a given line at different times.
a. Example: For a write transaction, the master drives data on to the
data bus; but for a read transaction, the slave does so. Further,
different read transactions may involve different slaves.
b. Example: If a bus can have multiple masters, then each master must
be capable of driving the address and control lines in the bus
when it is in charge.
2. This suggests that each device that can drive a given line of a bus
should contain a gate (called the driver) whose output is tied to
that line - e.g.
|
device #1 --- driver -----+
|
device #2 --- driver -----+
|
device #3 --- driver -----+
|
However, this won't work if ordinary gates are used for the drivers.
ASK CLASS WHY
3. One solution might be to implement a bus using MUXes:
_______
device #1 --------------| MUX |----- bus
device #2 --------------| |
device #3 --------------| |
. .
. .
a. This technique is frequently used for INTERNAL BUSSES in the CPU
b. But it is not a good approach for system busses. ASK WHY
- The total number of devices to be connected to the bus must be
known when it is built (inflexible)
- A lot more wires are needed - each bus slot must have its own
set of connections to the MUXes
4. The most common approach is to use TRI-STATE gates.
a. As the name implies, a tri-state gate is one whose output can be in
any of three states: 0, 1, or Hi-impedance.
b. The hi-impedance state is the new one. When the output is in this
state, it behaves like it is not connected at all - e.g. (viewing
the gate output as being like a switch)
Ordinary gate: Tri-state gate:
1 ---o 1 ---o
\___ o \___
0 ---o 0 ---o
Output is always connected Output is connected to 1 or 0
to 1 or 0 or not connected at all
c. Tri-state gates are available in many standard configurations (e.g.
AND, OR, NAND, flip-flops etc.) A tri-state gate has an additional
input called ENABLE. When this is active, the output of the gate
is determined by the other inputs, as usual; when it is inactive,
the output of the gate is effectively disconnected from the circuit.
d. Tri-state gates are realized by modifying the output circuit of
a standard gate. The following is the "totem-pole" circuit used
by TTL gates. (CMOS gates use a similar structure).
Vcc
|
__|/
|\
|
+--- output
|
__|/
|\
|
Ground
i. Each of the two transistors acts like a switch which is either
off or on. If the transistor is on, then the output of the gate
is effectively connected to Vcc or ground (as the case may be.)
(Clearly, we cannot allow both transistors to be on at the
same time. This would effectively short the power supply to
ground, resulting in the rapid destruction of one or both of
the transistors.)
ii. In an ordinary TTL gate one or the other of the two transistors
connecting to the output is on at any given time, and the other
is off, thus connecting the output to either Vcc or ground.
iii. In a disabled tri-state gate, BOTH transistors are off, thus
leaving the output effectively unconnected (as if the gate
weren't in the circuit at all).
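Behaviourally, a tri-state bus line can be thought of as follows. This is a
C simulation sketch with an invented resolution function - the real
behaviour is electrical, of course: each potential driver either presents a
0 or 1 or is effectively disconnected, and at most one driver may be
enabled at a time.

/* Sketch of several tri-state drivers sharing one bus line. */
#include <stdio.h>
#include <stdbool.h>

struct driver { bool enable; int value; };     /* value is 0 or 1 */

/* Returns the resolved line value, -1 if the line is floating,
 * or -2 on contention (two enabled drivers - not allowed on a real bus). */
static int resolve(const struct driver *d, int n)
{
    int line = -1;
    for (int i = 0; i < n; i++) {
        if (!d[i].enable) continue;            /* hi-Z: contributes nothing */
        if (line != -1) return -2;             /* bus contention            */
        line = d[i].value;
    }
    return line;
}

int main(void)
{
    struct driver bus_line[3] = { {false, 1}, {true, 0}, {false, 1} };
    printf("line = %d\n", resolve(bus_line, 3));   /* only driver 1 enabled: 0 */
    return 0;
}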
5. A third approach is to use OPEN-COLLECTOR gates. This is useful for
cases where a given device must drive a given line either to 0 or not
at all (i.e. it never has to drive the line to 1).
a. Open-collector gates are available in many standard configurations
(e.g. AND, OR, NAND, flip-flops etc.) However, the two states of
the output are disconnected or 0. (The disconnected state occurs
when the logic function the gate implements would call for a 1
to be output.)
Ordinary gate: Open collector gate:
1 ---o o
\___ \___
0 ---o 0 ---o
Output is always connected Output is connected to 0
to 1 or 0 or not connected at all
b. Open collector gates are realized by a different modification of the
standard gate output circuit. For example, this is the way a basic TTL
"totem pole" would be turned into an open-collector gate by omitting
one of the output transistors:
+--- output
|
__|/
|\
|
Ground
c. Open collector gates are most often used when any number of devices
must be able to assert the same line at the same time - e.g. an
arbitration line representing a bus request.
i. Because the only state to which a device can assert such a line is
0, such lines are most often configured as ACTIVE-LOW - i.e. the
active state is 0 and the inactive state is 1.
ii. To make sure the line goes to 1 when no device is asserting it,
such lines normally are terminated by a PULLUP RESISTOR to Vcc.
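The effect of open-collector drivers plus a pullup can be sketched in
software as a wired-AND: the line reads 1 unless some device pulls it to 0.
The "bus request" interpretation and the device count below are
illustrative only.

/* Sketch of an open-collector (wired-AND) line with a pullup. */
#include <stdio.h>
#include <stdbool.h>

/* pulling[i] is true if device i is pulling the (active-low) line to 0 */
static bool line_level(const bool *pulling, int n)
{
    for (int i = 0; i < n; i++)
        if (pulling[i])
            return false;      /* any one device can force the line low        */
    return true;               /* pullup resistor: high when nobody drives it  */
}

int main(void)
{
    bool request[4] = { false, false, true, false };   /* device 2 requests bus */
    printf("REQUEST line is %s\n", line_level(request, 4) ? "high (inactive)"
                                                          : "low (asserted)");
    return 0;
}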
F. Regardless of how the interfaces connect to the bus, the electrical
characteristics of a bus system have an important influence on
system performance.
1. Bus designers must take at least the following characteristics into
consideration:
a. PROPAGATION DELAY: When a bus master or slave near one end of the
bus places some information on the bus, it will take a measurable
time for that information to propagate to the other end, due to
effects of capacitance and inductance. This time increases with
increasing physical length of the bus.
b. SKEW: The propagation delay for different lines of the bus is not
necessarily the same; thus if several bits are changed at the same
time near one end of the bus, the changes may be seen at different
times at the other end. Also, if a common clock is used to
synchronize events, the clock may actually arrive at different
devices at different times.
c. LOADING: When we talked about the realization of gates, we mentioned
that a given gate can drive only a limited number of other gates. Since
some signals generated by a bus master must be received by all other
devices on the bus (e.g. to recognize their own address), there is
a limit as to how many devices may be plugged into the bus. Note,
too, that propagation delay tends to increase with increasing bus
load.
2. Bus designers take these factors into consideration when establishing
bus timing.
a. An appropriate interval must be allowed between the time a device
asserts a signal and the time it can expect the signal to be
received. This time is called the SETTLING TIME.
b. In the case of addresses, because skew could cause the wrong device
to respond to an address, a separate "address valid" control signal
is often used, asserted some time AFTER the address itself is put
on the bus (to ensure that all bits have settled.)
c. For over a decade now, bus speeds have lagged behind CPU speeds, so
that the basic bus cycle time is some multiple of the CPU cycle
time (e.g. 2:1 or 3:1 or 4:1).
V. General Issues in the Design of Bus Systems
- ------- ------ -- --- ------ -- --- -------
A. Before establishing the detailed assignments of different wires on the
bus, bus architects need to make a number of general design choices. We
consider these in turn now.
B. One fundamental choice, when designing an overall system, is whether to
have one bus to serve both memory and IO devices, or separate memory and
IO busses.
1. The difference can be seen by comparing diagrams:
Two busses Single bus
M--S--P--S--K--T P--S--M
|--K--T |--K--T
---K--T |--K--T
---K--T
or (w/DMA):
M--S--P--S--K--T non-DMA device
| |--K--T " "
| |
| ---K--T DMA device
|________|
or (w/CPU connected to memory bus only and an adapter used to
connect the busses)
M--S--P
|
K.adapter--S--K--T
|--K--T
|--K--T
2. On high-end computer systems, the choice is often made to have two or
more separate busses. Often considerations of speed are a reason for
going this route - a memory bus (which gets the most intense
use) can be made faster if it handles memory only, since the total
length of the bus is smaller.
3. Smaller computers generally use a single physical bus, possibly with
some control lines unique to memory operations and some to IO. (Note
that this design choice gave rise to the name UNIBUS (one bus) for the
PDP-11 bus - the first system to use this design.)
a. This makes for a simpler and less expensive system.
b. It also simplifies the building of interfaces for DMA devices. If
there are separate IO and memory busses, then DMA device controllers
must connect to both busses somehow, either directly or through an
adapter that ties the two busses together.
c. Use of one bus reduces the number of pins needed on the CPU
package.
4. Note that even when there is only one PHYSICAL bus, there can be
more than one LOGICAL bus if there are several sets of control lines.
____
Example: One of the control lines on Z80 bus (MREQ) is used only
____
for memory cycles and one (IORQ) is used only for IO
cycles. Exactly one of these two lines is asserted during
any given bus cycle.
5. When a single physical bus is used, there is also a choice to be made
between using MEMORY MAPPED IO and ISOLATED IO.
a. With memory-mapped IO, both memory and IO devices use the same
address space - i.e. a "memory read" or "memory write" operation to
certain addresses actually transfers data to or from a given device.
Example: The PDP-11 UNIBUS is designed for memory-mapped IO.
Addresses 000000 to 775777 (octal) are memory addresses
776000 to 777777 are IO devices
b. With isolated IO, separate address spaces are used for memory and
IO. This also requires separate control lines for each kind of
operation.
Example: The Z80 uses addresses 0000 to FFFF (hex) for memory
____
and 00 to FF (hex) for IO ports. The MREQ control
line causes memories to look at the address lines and IO
____
ports to ignore them, while IORQ causes ports to look at
the address lines and memory to ignore them.
c. Each approach has its advantages and disadvantages (ASK).
i. Advantages of memory-mapped IO:
- Fewer control lines on the bus (an important consideration with
the limited pinout of microprocessor chips.)
- The full instruction set of the processor can be used for IO,
not just a few specialized instructions. (E.g. bit-oriented
instructions can be used to test/set individual bits in
peripheral registers.)
- If the CPU itself is configured for memory-mapped IO, then the
opcodes that would have been needed for input-output
can be used for something else.
ii. Disadvantages of memory-mapped IO:
- With a limited number of bits available for addressing memory
(e.g. 16 on small microprocessors), memory-mapped IO reduces
the total amount of memory that can be installed in a system
since some of the address space must be used for IO addresses.
(This is less of a problem with processors that use wider
addresses.)
- With separate IO and memory addresses and control lines, it is
possible to tailor bus protocols to the characteristics of
memory and of IO ports separately.
- Interfacing can often be simpler with separate IO addresses -
e.g. the number of address bits to decode is smaller. (There
are many fewer ports than memory addresses.)
- Also, IO instructions can be shorter, since they need to
specify fewer address bits.
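The memory-mapped approach just described can be sketched in C as follows:
one address-decoding routine routes "memory" reads and writes either to RAM
or to a device, so ordinary load/store operations do IO. The address split
and the one-byte output "device" are invented for illustration (a real
system's map comes from its hardware manual); with isolated IO the
equivalent transfer would instead use special IO instructions such as the
Z80's IN and OUT.

/* Sketch of memory-mapped IO: memory and a device share one address space. */
#include <stdio.h>
#include <stdint.h>

#define MEM_SIZE  0x8000u          /* 0x0000-0x7FFF: RAM (hypothetical)     */
#define PORT_OUT  0x8000u          /* 0x8000: a one-byte output "device"    */

static uint8_t ram[MEM_SIZE];

static void bus_write(uint16_t addr, uint8_t value)
{
    if (addr < MEM_SIZE)
        ram[addr] = value;                 /* ordinary memory write          */
    else if (addr == PORT_OUT)
        putchar(value);                    /* same kind of cycle, but a device */
}

static uint8_t bus_read(uint16_t addr)
{
    return (addr < MEM_SIZE) ? ram[addr] : 0xFF;   /* unmapped: reads as all 1's */
}

int main(void)
{
    bus_write(0x0100, 42);                 /* goes to RAM                    */
    bus_write(PORT_OUT, 'A');              /* goes to the device: prints 'A' */
    bus_write(PORT_OUT, '\n');
    printf("ram[0x0100] = %d\n", bus_read(0x0100));
    return 0;
}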
6. Finally, we should note that some systems have been built in which
the CPU connects to a single physical bus, but other busses are
present in the system, being connected to the central bus via BUS
ADAPTERS.
Example: When the VAX line was first introduced, Digital wanted to
allow customers to continue to use peripherals that worked
with the PDP-11 UNIBUS, since there were many in existence.
However, it was necessary to design a new bus (the SBI bus)
for memory, to allow more than 256K of memory to be
present. The approach used was to build a system with one
to four UNIBUSes connected to the SBI bus via a bus adapter:
P -- S.SBI -- M
|
|------ K.UNIBUS adapter --- S.UNIBUS -- (To UNIBUS peripherals)
|
|------ K.UNIBUS adapter --- S.UNIBUS -- (To UNIBUS peripherals)
C. Another fundamental choice is the WIDTH of the bus (each bus) - the
number of bits used for addresses, and the number of bits used for data.
1. The address width ultimately determines how much memory and/or how
many IO devices can be connected to the bus.
Example: The UNIBUS's 18 bits of address allow up to 256K of memory,
which seemed more than adequate when it was designed.
However, this eventually proved inadequate, and later PDP-11's
and VAXes had to resort to using a separate bus for memory,
with some fairly complicated techniques used to allow IO
devices (on the UNIBUS) to access memory on the memory bus
for DMA operations.
2. The data width helps determine bus throughput (# of bytes transferred
over the bus per second).
Example: Microprocessors are generally classified as 8 bit, 16 bit, or
32 bit not on the basis of the width of their internal data
paths, but rather on the basis of their bus width.
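A back-of-the-envelope calculation shows the effect of data width on peak
throughput; the cycle rate below is simply an assumed figure, not taken
from any particular bus.

/* Peak throughput = bytes per cycle x cycles per second (illustrative numbers). */
#include <stdio.h>

int main(void)
{
    double cycles_per_second = 8.0e6;          /* assume 8 million bus cycles/s */
    int widths_bytes[] = { 1, 2, 4 };          /* 8-, 16-, 32-bit data paths    */

    for (int i = 0; i < 3; i++)
        printf("%d-bit bus: %.0f MB/s peak\n", widths_bytes[i] * 8,
               cycles_per_second * widths_bytes[i] / 1.0e6);
    return 0;
}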
D. Another fundamental choice is whether to DEDICATE various lines in the
bus to certain functions, or to MULTIPLEX certain lines.
1. In our discussion so far, we have assumed that the bus contains
separate lines for address and data.
2. To reduce the width of the bus (and thus the cost of each interface),
the same lines can be used for both functions, but at different times
during the bus cycle.
a. During the first half of the cycle, they carry address information.
b. During the second half, they carry the data being transferred.
3. Of course, this could result in an increased cycle time, since the
address and data parts of the cycle cannot overlap. It also makes
the memory system interface to the bus more complex, since it must
now contain a register to hold the address during the data part of
the cycle.
4. Microprocessors with data busses wider than 8 bits sometimes have to
use multiplexing due to pinout limitations on the chip package.
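The two-phase use of a multiplexed bus can be sketched as follows (invented
sizes and names): the slave latches the address during the first phase and
uses that latch to interpret the data that arrives on the same lines during
the second phase.

/* Sketch of a multiplexed address/data bus: same lines, two phases. */
#include <stdio.h>
#include <stdint.h>

static uint8_t memory[256];
static uint8_t latched_address;      /* the register the slave interface now needs */

/* Phase 1: the shared lines carry the address; the slave latches it. */
static void address_phase(uint8_t lines)
{
    latched_address = lines;
}

/* Phase 2: the same lines now carry data, interpreted using the latch. */
static void data_phase_write(uint8_t lines)
{
    memory[latched_address] = lines;
}

int main(void)
{
    address_phase(0x10);     /* first half of the cycle:  address on shared lines */
    data_phase_write(99);    /* second half of the cycle: data on the same lines  */
    printf("memory[0x10] = %d\n", memory[0x10]);
    return 0;
}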
E. The previous choices have dealt with the physical configuration of the
bus. Another important choice has to do with the bus PROTOCOL - the
rules whereby the bus master and slave exchange signals with one
another. Here, the fundamental choice is between SYNCHRONOUS and
ASYNCHRONOUS protocols.
1. In a synchronous protocol, all devices on the bus share a common clock.
The bus master puts signals on the bus and expects the slave to respond
within a certain time frame, without looking for explicit
acknowledgement from the slave that it has done so.
a. Example: The Z80 memory read protocol:
____________________________________
Address from CPU __/ \__
\____________________________________/
____ _______ ______
MREQ from CPU | |
|____________________________|
_______________
Data from memory ______________________/ \___
\_______________/
i. Note the delay between putting the address on the bus and
____
asserting MREQ. This ensures that the address has settled so
that only the right memory chip will respond.
ii. The protocol specifies a maximum interval between the falling
____
edge of MREQ and the time the memory gets its data on the bus.
(We will see that a separate wait line is provided for use by
memories that cannot meet this standard.)
b. Note that, in a synchronous protocol, all control signals are
generated by the master.
i. The slave has to respond by providing the data, but does not
send any control signals to the master. This simplifies the
construction of the slave interface.
ii. However, if the slave failed to respond the CPU would never
know. (A totally floating bus looks like a byte of all 1's, so
if the CPU addressed a non-functioning (or nonexistent) slave it
would think that the slave was sending it the value FF and that
would be treated as the slave's data.)
c. Note, too, that in a synchronous system all devices are expected
to be able to respond within a specific time frame when they are
addressed. Since this is not necessarily realistic, many
synchronous systems include a WAIT control signal that a device
may assert if it needs more time.
i. Example: Many bus systems - including that of the Z80 ____
we will be using in lab - include a control line called WAIT.
If the device being addressed asserts this, the progress of
the protocol is held up until it is released.
ii. A typical use of this control line is to interface memory
chips with a longer access time to a system. This is the
origin of the phrase "zero wait state memory" - describing
a system whose memory chips are fast enough not to require
use of this facility.
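A rough software model of a synchronous read with wait states follows. The
clock counts and names are invented, not the actual Z80 timing: the cycle
nominally takes a fixed number of clocks, and a slow device stretches it by
holding WAIT asserted.

/* Sketch of a synchronous read cycle stretched by wait states. */
#include <stdio.h>
#include <stdint.h>

struct memory_device {
    int     access_clocks;   /* clocks this device needs before its data is valid */
    uint8_t data;
};

/* Returns the number of clocks the cycle took (2 with zero wait states). */
static int sync_read(const struct memory_device *m, uint8_t *value)
{
    int clocks = 2;                     /* nominal (zero-wait-state) cycle length */

    while (clocks < m->access_clocks)   /* device would hold WAIT asserted...     */
        clocks++;                       /* ...so the CPU inserts a wait state     */

    *value = m->data;                   /* CPU samples the data, trusting that it
                                           is valid by now                        */
    return clocks;
}

int main(void)
{
    struct memory_device fast = { 2, 0xAA }, slow = { 5, 0x55 };
    uint8_t v;
    printf("fast memory: %d clocks (zero wait states)\n", sync_read(&fast, &v));
    printf("slow memory: %d clocks\n", sync_read(&slow, &v));
    return 0;
}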
2. In an asynchronous protocol, the CPU and port EXCHANGE a series of
signals.
a. For example, the following is the protocol for a memory or IO
read on the MC68000 microprocessor:
___________________________________
Address from CPU __/ \__
\___________________________________/
Address and data ______ ____
strobes from CPU \___________________________/
____________________________
Data from slave ___________/ \_
\____________________________/
_____
DTACK from slave _________________ ___
\___________________/
i. The CPU puts the address on the bus, waits for settling time,
then asserts the strobes (three separate lines). It then
waits for the memory/port to respond.
ii. The memory or port addressed places its data on the bus, waits
_____
for settling time, and then asserts DTACK.
iii. The CPU captures the data, then releases its strobes. After
a settling time it also releases its address.
_____
iv. The port, seeing the strobes no longer asserted, releases DTACK,
then (after a settling time) its data.
b. This exchange of control signals is often referred to as
"HANDSHAKING".
Note that the issue in the handshake is the TRANSMISSION of the
data, not its PROCESSING by the device. For example, a printer
may take several milliseconds to print a character. But its
interface will handshake with the CPU when the character to be
printed has been received, not when it has actually been
printed. (The CPU must still poll a status bit separately to be
sure that the printer has finished printing the preceding
character.)
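The handshake can be modelled in C roughly as follows. This is loosely
patterned on the 68000 sequence described above, but the structure and
function names are invented and no real timing is represented; note where a
real design would add a bus timeout.

/* Sketch of an asynchronous (handshaking) read: strobe, then wait for DTACK. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

struct bus {
    uint16_t address;
    uint8_t  data;
    bool     strobe;    /* asserted by master */
    bool     dtack;     /* asserted by slave  */
};

/* A very fast slave: responds as soon as it sees the strobe. */
static void slave_poll(struct bus *b, const uint8_t *mem)
{
    if (b->strobe) {
        b->data  = mem[b->address];
        b->dtack = true;            /* "here is your data"             */
    } else {
        b->dtack = false;           /* strobe released: release DTACK  */
    }
}

static uint8_t master_read(struct bus *b, uint16_t addr, const uint8_t *mem)
{
    b->address = addr;
    b->strobe  = true;
    while (!b->dtack)               /* wait for the handshake; a real bus       */
        slave_poll(b, mem);         /* would add a timeout to catch a missing   */
    uint8_t value = b->data;        /* or broken slave                          */
    b->strobe = false;
    slave_poll(b, mem);             /* let the slave see the strobe go away     */
    return value;
}

int main(void)
{
    uint8_t mem[16] = { [5] = 123 };
    struct bus b = { 0 };
    printf("read = %d\n", master_read(&b, 5, mem));
    return 0;
}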
3. The choice of synchronous versus asynchronous protocols is
basically made by the CPU designer. However, if desired,
handshaking can often be added to a synchronous system (e.g. by
appropriate use of WAIT on the Z80.)
For example, suppose we wanted to use handshaking for IO (only) ____
on a Z80 based system. For this, we add the new control line SYNC
and require that the addressed device assert this to indicate that
it has responded to the transfer request. Of course, now we must
make the Z80 wait during an IO cycle for this response to occur.
____
We can use WAIT for this, as follows:
____ _________ Note that this NAND gate
SYNC ____|\o____| \ ____
____ |/ | \ ____ asserts WAIT to the Z80
IORQ ____|\o____| )o____ WAIT if we are in an IO cycle
__ |/ | ) ____ __
M1 ___________| / (IORQ low and M1 high)
|________/ and the device hasn't
____
yet responded(SYNC high)
4. Both synchronous and asynchronous protocols have their pros and cons:
a. In favor of the synchronous approach:
i. Interfacing is simpler: the slave does not need to send any
signals back to the CPU. (However, this advantage goes away
if the slave is slow and must request wait states.)
ii. The synchronous approach is faster overall, since fewer signals
must be put on the bus. (Recall that each signal must be
followed by a settling time before other activity can occur.)
b. In favor of the asynchronous approach:
i. This approach can accommodate a wide variety of interface speeds
mixed on the same bus.
- This allows older and newer technology interfaces to be used
on the same system, increasing the range of devices that
can be interfaced.
- If a slow interface is replaced with a fast interface, system
speed immediately improves without changing anything else.
ii. This approach gives a positive assurance that the requested
data transfer has actually occurred - i.e. the interface
addressed exists, is working, and was able to respond. (Of
course, if an attempt is made to access a nonexistent or
nonfunctional device, the CPU could wait forever for a
handshake signal that never comes. This is usually handled by
having a bus timeout mechanism that causes a trap to a software
routine that deals with the problem.)
c. On systems having separate busses for memory and IO, it is common
to find that the memory bus is synchronous (for speed, and since
the memories can be assumed to be of uniform technology), while the
IO bus is asynchronous (for interface flexibility.)
F. Finally, if more than one device can serve as a bus master, there is the
matter of BUS ARBITRATION - how is a master chosen if more than one
device wants to use the bus at the same time?
1. There are two basic approaches that can be taken.
a. A centralized approach: one device (often the CPU) is designated
as the bus ARBITER. All requests to use the bus are routed to
it and it gives permission on a priority basis.
b. A decentralized approach: all potential bus masters look at the
arbitration lines, and the highest priority device recognizes that
it has priority and proceeds while all others wait.
2. An example of a centralized approach: DAISY-CHAINING:
a. Each device has a BUS-GRANT INPUT (BGI) and a BUS-GRANT OUTPUT
(BGO).
b. The devices are connected in a chain, such that the BGO of one
device connects to the BGI of its neighbor. The first device
on the chain receives an external grant signal (usually coming
from a centralized arbiter) and the last device on the chain has
no connection from its BGO. Usually, all devices are also connected
to a common request line.
_______
REQUEST ------------------------------------------------
Arbiter | | | |
---------- ---------- ---------- ----------
GRANT ---|BGI BGO |---|BGI BGO|---|BGI BGO|---|BGI BGO|
---------- ---------- ---------- ----------
|| || || ||
Other ================================================
bus signals
c. When the arbiter sees an incoming bus use request and is able to
grant it, it asserts BGI to the first device.
d. Each device behaves as follows:
i. If its BGI is not asserted, then it does not assert its BGO.
ii. If its BGI is asserted then
- If it wants the bus, it uses it and leaves BGO unasserted.
- Otherwise, it asserts BGO.
e. The result is that, if multiple devices request the bus, only the
one nearest the arbiter gets to use it.
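A small C sketch of the daisy-chain rule follows (device count and names
invented): the grant enters at the end of the chain nearest the arbiter,
and the first device along the chain that wants the bus keeps the grant
rather than passing it on.

/* Sketch of daisy-chained bus arbitration. */
#include <stdio.h>
#include <stdbool.h>

#define NDEV 4

/* Returns the index of the device that wins the bus, or -1 if none wants it.
 * grant_in plays the role of the arbiter asserting BGI on the first device. */
static int daisy_chain(const bool wants_bus[NDEV], bool grant_in)
{
    for (int i = 0; i < NDEV; i++) {
        if (!grant_in)
            return -1;              /* BGI not asserted: nothing downstream wins */
        if (wants_bus[i])
            return i;               /* use the bus, leave BGO unasserted         */
        /* otherwise pass the grant along: BGO = BGI */
    }
    return -1;
}

int main(void)
{
    bool wants[NDEV] = { false, true, false, true };    /* devices 1 and 3 request */
    printf("winner = device %d\n", daisy_chain(wants, true));   /* device 1 wins   */
    return 0;
}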
Copyright ©1999 - Russell C. Bjork