CPU Control Unit Implementation

CS222 Lecture: CPU Control: Hardwired control, Microprogramming
                                                        Last revised 3/13/99

Objectives:

1. To explain the concept of a control word
2. To show how control words can be generated using hardwired control
3. To explain the concept of microprogramming
4. To discuss key variations: horizontal vs diagonal; sequencing 
5. To discuss advantages and disadvantages of microprogramming

Materials:

1. Wilkes scheme transparency
2. Handouts of example machine architecture, organization, specifications for 
   hard-wired control, and microprogram.

I. Introduction
-  ------------

   A. We have seen that a CPU - whether simple or complex - basically consists
      of a control unit, plus a data part, encompassing:

      1. A set of registers (including registers that interface to the system
         bus)

      2. A set of D-units (adders, shifters etc.)

      3. A set of data paths (busses) connecting the above.

      (We continue to assume a single data path shared by all steps of
       instruction execution, and sequential execution of instruction steps.
       When we discuss parallelism, we will see that some components will have
       to be replicated.)

   B. The data part is capable of performing a set of micro-operations or 
      primitive computations that can be performed in one minor cycle (clock
      pulse).  Each micro-operation changes the contents of a single register.  
      An instruction in the user-visible instruction set must be programmed as a
      series of micro-operations (some of which may be done in parallel on the
      same clock pulse.)  The bus system is also capable of various 
      micro-operations - e.g. performing a read cycle, write cycle etc.

   C. Control of the system is accomplished by a control unit that - at the start
      of each minor cycle - activates the necessary control functions to cause
      the data part to perform the desired micro-operation(s) on the next clock
      pulse.  This can be pictured as follows:

        --------                        --------
                    ----------------->  Registers,
        Control     ----------------->  ALU,
                    ----------------->  data paths
                                        
                                        --------
                    ----------------->  Bus System
        --------                        --------

   D. The set of control signals that pass from Control to the data part and bus
      system is called a micro-word or control word.  Conceptually, each bit of 
      this micro-word corresponds to the enabling of one particular 
      micro-operation that some system component can perform - e.g.

        AC <- 0         (one bit)
        AC <- not AC    (one bit)
        AC <- AC + 1    (one bit)
        ...

      1. Note that, on a sophisticated machine, a microword could well comprise
         hundreds of bits - most of which will be zero.

      2. The description above is somewhat simplified from reality.  On many
         machines, advantage is taken of the fact that certain micro-operations
         are mutually exclusive - e.g. one does not simultaneously gate two
         different registers onto the same bus or load the same register from
         two different sources.  Thus, what passes from the control unit to the
         data part is often an encoding - e.g. 4 bits in the control word may be
         used to select which one of 16 registers is gated onto a specific bus.  In
         this case the bus will contain a MUX to decode the 4 bits and select 
         the correct register. Note, however, that the ultimate micro-word does 
         in fact contain the full 16 bits - what has been done, in effect, is to
         place part of the control unit (the MUX) in the data part, so that the
         complete micro-word only exists internally there.
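
         As a rough illustration of the encoding idea (a sketch only - the
         function name and the list representation are assumptions made for
         this example, not part of any real design), the decoding done in the
         data part can be modelled as expanding a 4-bit field into 16 enable
         lines, exactly one of which is active:

```python
def decode_register_select(field: int, width: int = 4) -> list[int]:
    """Expand an encoded select field into one-hot register enable lines."""
    if not 0 <= field < (1 << width):
        raise ValueError("select field out of range")
    # Line i is 1 only when the encoded field names register i.
    return [1 if i == field else 0 for i in range(1 << width)]

enables = decode_register_select(0b0101)
print(enables.index(1), sum(enables))   # 5 1 : register 5 gated, one line active
```

         The control word carries only the 4-bit field; the 16 decoded lines
         exist only at the output of the decoder.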

   E. The job of the control unit designer is to develop a means whereby an
      orderly sequence of control words may be presented to the data part (and 
      other hardware such as the memory) - one per clock pulse.

      For example, consider a very simple CPU that deals only with one size of
      binary integer operands, has a single accumulator and uses one-address
      instructions, each one word long, with a format like the following:

        op-code operand-address

      For such a machine, performing an instruction involves some combination
      of instruction fetch, instruction decode, operand address calculation,
      operand fetch, execution, and operand store phases.

      1. The instruction fetch phase of each instruction on this machine might 
         involve the following sequence of micro-operations - where each line
         represents the operations to be performed on one clock pulse:

                MAR <- PC
                MBR <- M[MAR], PC <- PC + 1

         (This is common to all instructions)

      2. The instruction decode phase of each instruction on this machine might
         involve the following micro-operation:

                IR <- MBR(op-code part)

         (This is common to all instructions.  From here on, the contents of
          the IR determine what micro-operations are done next.)

      3. For an ADD instruction, the remaining phases might look like this.

                MAR <- MBR(address part)        (Operand address calculation)
                MBR <- M[MAR]                   (Operand fetch)
                AC <- AC + MBR                  (Execute)
                                                (no operand store)

      4. For an unconditional branch instruction, these phases might look like
         this:

                MAR <- MBR(address part)        (Operand address calculation)
                                                (no operand fetch)
                PC <- MAR                       (Execute)
                                                (no operand store)

      5. For a store accumulator instruction, these phases might look like this:

                MAR <- MBR(address part)        (Operand address calculation)
                                                (no operand fetch)
                MBR <- AC                       (Execute)
                M[MAR] <- MBR                   (Operand store)

      6. For a shift accumulator left one place instruction, we need only an 
         execute phase:
                                                (No operand address calculation)
                                                (No operand fetch)
                AC <- shl AC                    (Execute)
                                                (No operand store)

      7. If indirect addressing is used, then we must add to the operand address
         calculation in each case (except, of course, the shift):

                MBR <- M[MAR]
                MAR <- MBR

   F. There are two basic ways such a sequence of control words can be 
      generated:

      1. Hardwired control: The control unit is implemented as a state machine,
         with combinatorial circuits generating each of the control functions 
         on the basis of the current state and certain variables such as the 
         op-code of the user instruction undergoing execution.

         a. For the example above, we might have a major state register that
            indicates which of the six phases of instruction execution we
            are currently in.  The transitions might include the following (for
            just the four instructions we considered above):

                                     Store AC
         <-----------------------------------------------------------------
       /                      Add, Branch, Shift                           \
      /<---------------------------------------------------                 \
     //                                 Store AC           \                 \
    //                              ------------------>     \                 \
    \\                             / Add,Branch        \    / Store AC        /
      --> INSTR --> INSTR   --> OPERAND --> OPERAND --> EXECUTE --> OPERAND --
          FETCH     DECODE      ADDRESS     FETCH      /            STORE
                          \     Shift                 /
                           -------------------------->

         b. Each major state might in turn be divided into minor states, each
            corresponding to one clock cycle.  These might be designated t1,
            t2 etc.  The number of minor states needed for a given major
            state is a function of how many sequential micro-operations are
            needed to accomplish the task of that state, which may in
            turn depend on the instruction being executed.

            e.g. The operand address calculation state needs one minor
                 state if direct addressing is used, to accomplish the
                 single required operation MAR <- MBR(address part).
                 However, if indirect addressing is called for by the
                 current instruction, two more minor states are needed.

         c. The conditions for each micro-operation would include an AND
            of the current major state and minor state - e.g. the
            micro-operation MAR <- PC that is part of instruction fetch 
            would have the following conditions for activation:

            (State = InstFetch and t1) : MAR <- PC

            This might be abbreviated down to:

                I1: MAR <- PC

            where I1 means minor state 1 of the Instruction fetch cycle.

         d. Obviously, some micro-operations would be activated at many
            places and so would have more complex conditions - e.g.
            MBR <- M[MAR]

            (State = InstFetch and t2) or (State = OperandFetch) : MBR <- M[MAR]

         e. Given that the above represents only a small fraction of the
            complete instruction set of a very simple machine, it is easy to
            see that the control equations for the various micro-operations
            could get quite complex!

      2. Microprogrammed control.  The various control words needed to
         implement the user instructions are stored in a ROM, with a sequencer
         causing the appropriate control word to be fetched at each clock
         cycle and fed to the rest of the CPU.

II. An Example of Hardwired Control
--  ---------- -- --------- -------

   A. To get some feel for what is involved in hardwired control, we will
      design the control unit for a very simple, hypothetical machine - similar
      to the one being used in a homework problem.  It is very loosely based on
      the DEC PDP-8, a 1970's minicomputer that has become a favorite example
      for teaching CPU implementation because it is probably the simplest
      commercial CPU ever to see widespread usage.  (Gordon's very first
      computer was a PDP-8).  We will ignore a number of efficiency issues in
      order to keep the design as simple and understandable as possible.

      The machine will have the following characteristics:
          
      1. Word Length:           16 bits         (The PDP-8 used 12 bit words!)
          
      2. Memory:                2048 x 16       (The PDP-8 was 4096 x 12!)

      3. Instruction Format:    All instructions are one word long
                                    op-code: 4 bits
                                    indirect addressing flag: 1 bit
                                    address: 11 bits
        
      4. Memory mapped IO is used, so no separate IO instructions are needed.

      5. Interrupts are handled as follows: save current PC at address 0 and 
         begin executing at address 1.

      6. Instruction set:
          
      Op-code   Instruction
          
      0000      ADD
      0001      SUB
      0010      AND
      0011      OR
      0100      LOAD AC
      0101      STORE AC
      0110      JMS             (memory <- PC; PC <- effective address + 1)
      0111      JMP
          
      1000      CLR AC
      1001      SHL AC
      1010      SHR AC
      1011      ASHR AC
      1100      INC AC
      1101      DEC AC
      1110      SKPZ            (skip next instruction if AC = 0)
      1111      SKPN            (skip next instruction if AC < 0)
                  
      (Note that op-codes 0000-0111 involve a memory address, while 1000-1111
      do not.  For the latter instructions, bits 11:0 of the instruction are
      ignored.)
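
      The instruction format can be made concrete with a short sketch that
      splits a 16-bit word into its fields.  The bit positions (op-code in
      bits 15:12, indirect flag in bit 11, address in bits 10:0) follow the
      field widths given above; the names Fields and split_instruction are
      made up for this illustration.

```python
from typing import NamedTuple

MNEMONICS = ["ADD", "SUB", "AND", "OR", "LOAD", "STORE", "JMS", "JMP",
             "CLR", "SHL", "SHR", "ASHR", "INC", "DEC", "SKPZ", "SKPN"]

class Fields(NamedTuple):
    opcode: int     # bits 15:12
    indirect: int   # bit 11
    address: int    # bits 10:0

def split_instruction(word: int) -> Fields:
    """Split a 16-bit instruction word into op-code, indirect flag, address."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    return Fields(word >> 12, (word >> 11) & 1, word & 0x7FF)

f = split_instruction(0x480A)        # 0100 1 00000001010
print(MNEMONICS[f.opcode], f.indirect, f.address)   # LOAD 1 10
```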
          
   B. The internal organization of this machine might look like this:
   
            ------------------------- Result (RBus)
            |                       |
            |                      ALU 
            |                      /  \
            |                    S1    S2
            |                    |      |
            |                    |      +-------- 0000000000000001        
            |                    |      |                 
            |                    |      |                 
            +---------> AC-------+      +-------- MBR <----->
            +---------> PC-------+                 ^            Memory
            +---------> MAR --+--+                 |
            |                 |                    |
            +---------> IR    +--------------------|-------->
            |______________________________________|
                        
                (Note: AC, MBR are 16 bits; PC, MAR are 11; IR is 5)
                       
         - Any of the registers can be loaded from the output of the ALU.
         - AC, PC, and MAR can drive the S1 input to the ALU (but only one at
           any time).  It will be 0 if no other input is specified. 
         - MBR can drive the S2 input.  The S2 input can also be driven by the 
           constant 1.  It will be 0 if no other input is specified. 
         - MAR and MBR have connections to memory (one way for MAR, two way for 
           MBR).

      1. The control word would contain the following bits - divided into
         mutually exclusive groups as shown:
   
         Enable AC to S1                (EAC)
         Enable PC to S1                (EPC)
         Enable MAR to S1               (EMA)
         (Note: if neither EAC nor EPC nor EMA is enabled, S1 = 0)
                
         Enable MBR to S2               (EMB)
         Enable 1 to S2                 (EONE)
         (Note: if neither EMB nor EONE is enabled, S2 = 0)
                
         RBus := S1 + S2                (EADD)
         RBus := S1 - S2                (ESUB)
         RBus := S1 and S2              (EAND)
         RBus := S1 or S2               (EOR)
         RBus := SHL S1                 (ESHL)
         RBus := SHR S1                 (ESHR)
         RBus := ASHR S1                (EASHR)
         (Note: if none of the above is specified, the RBus value is undefined)

         Memory read                    (READ)
         Memory write                   (WRITE)
                  
         The next group is not mutually-exclusive - more than one can be
         specified at the same time:

         AC := RBus                     (LAC)
         PC := RBus                     (LPC)
         MBR := RBus                    (LMB)
         MAR := RBus[10:0]              (LMA)
         IR := RBus[15:11]              (LIR)
                
         (We will put only the op-code part of a given instruction into the
          IR; we will get the address part from the MBR when we need it.)

      2. We will assume instruction execution is carried out in 2 or 3 major
         cycles.  Note that these do not correspond directly to the phases of
         instruction execution - F and E do several phases, and D does part of
         a phase.  The division is based on the fact that each major cycle
         does (in most cases) a single access to memory, and is based on the
         implementation of the DEC PDP-8.
   
         a. Instruction fetch/decode (always)   (F)     (IF, ID, part of OAC)
         b. Indirect address (op = 0xxx1)       (D)     (part of OAC if needed)
         c. Execution (always)                  (E)     (OF, EXEC and/or OS)
          
         Also, we will assume the possibility of an interrupt/exception cycle
         (I).

      3. The sequence of major cycles can be realized by the following state 
         machine:
                 
                (IR[4] = 0 and IR[0] = 1)                (interrupt request)
             F -----------------------------> D ----> E ------------------> I
            ^ \                                      /  \                    \
           /    ------------------------------------>    \                    \
          /           (otherwise)                         \ (otherwise)        \
          \                                               /                    /
           \                                             /                    /
            <-------------------------------------------+<-------------------
                
         This can be built by implementing the following state table -
         assuming that the states are encoded by F=00, D=01, E=10, I=11

         Current        IR[4] IR[0] Interrupt   Next
         state                                  state

         00             0       0       -       10
         00             0       1       -       01
         00             1       0       -       10
         00             1       1       -       10
         01             -       -       -       10
         10             -       -       0       00
         10             -       -       1       11
         11             -       -       -       00
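
         The state table can be expressed as a next-state function.  This is
         only a sketch of the table's behaviour (the function and constant
         names are chosen for this example); a hardware realization would
         reduce the table to gate-level equations instead.

```python
F, D, E, I = 0b00, 0b01, 0b10, 0b11   # state encodings from the table

def next_major_state(state: int, ir4: int, ir0: int, interrupt: int) -> int:
    """Return the next major state given IR[4], IR[0] and the interrupt line."""
    if state == F:
        # Only memory-reference instructions (IR[4] = 0) with the
        # indirect flag set (IR[0] = 1) pass through the D cycle.
        return D if (ir4 == 0 and ir0 == 1) else E
    if state == D:
        return E
    if state == E:
        return I if interrupt else F
    if state == I:
        return F
    raise ValueError("unknown state")

print(format(next_major_state(F, ir4=0, ir0=1, interrupt=0), "02b"))   # 01
```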

      4. Each major cycle will be divided into three minor cycles - designated
         F0, F1, F2, or D0, D1 ... etc.  (We assume that state transitions 
         in the major cycle state machine occur on transitions from minor cycle
         2 to minor cycle 0.)  [ This is not highly efficient, since some
         major cycles need only two or one minor cycle.  It is based on the
         DEC PDP-8, whose memory technology required 4 minor cycles per major
         cycle in all cases - and which did instruction execution as part of
         the F cycle for instructions not needing to reference memory. ]
                 
         The minor cycles can be generated by a modulo-3 counter connected
         directly to the system clock.  The outputs of this counter, together
         with the state machine, can be connected to a 1 out of 16 decoder to
         produce 12 control signals labelled F0, F1 .. I1, I2 (4 outputs of
         the decoder will be unused).
                                    ___________
         Clock ---> Major State ==> |         |--F0
                |   Register        | Decoder |--F1
                |       ^           |         |--F2
                |       |           |         |--D0
                --> Minor State ==> |         |--D1
                    Register        |_________| ...
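
         A behavioural sketch of the timing logic follows: a modulo-3 minor
         cycle counter, and a decoder that turns the 2-bit major state and
         2-bit minor state into the 12 one-hot timing signals F0 through I2
         (three minor cycles for each of the four major cycles).  The names
         used are assumptions for illustration only.

```python
MAJOR_NAMES = "FDEI"          # encodings 00, 01, 10, 11 from the state table

def next_minor(minor: int) -> int:
    """Modulo-3 counter: 0 -> 1 -> 2 -> 0."""
    return (minor + 1) % 3

def timing_signals(major: int, minor: int) -> dict[str, int]:
    """Decode the major and minor states into one-hot signals F0..I2."""
    if major not in range(4) or minor not in range(3):
        raise ValueError("major must be 0..3 and minor 0..2")
    return {f"{MAJOR_NAMES[m]}{t}": int((m, t) == (major, minor))
            for m in range(4) for t in range(3)}

signals = timing_signals(major=0b10, minor=1)
print([name for name, on in signals.items() if on])   # ['E1']
```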

   C. We now need to determine the derivation of each of the 19 control
      signals as a function of the current state (major and minor) and the
      contents of the instruction register (IR), plus the state of the AC
      (=0, <0) in a few cases.

      1. To get started, we need to derive the sequence of micro-operations
         corresponding to each major cycle.  For each micro-operation we note
         the control signals that must be active.
                 
         F cycle:       F0: MAR := PC           EPC, EADD, LMA
                        F1: MBR := M[MAR],      READ, EPC, EONE, EADD, LPC
                            PC := PC + 1
                        F2: IR := MBR[15:11],   LIR, EMB, EADD, LMA
                            MAR := MBR[10:0]
                                                        
         D cycle:       D0: MBR := M[MAR]       READ
                        D1: MAR := MBR[10:0]    EMB, EADD, LMA
                        D2: (no operation)

         E cycle:       Depends on op-code, as follows:
                 
                ADD     E0: MBR := M[MAR]       READ
                        E1: AC := AC + MBR      EAC, EMB, EADD, LAC
                        E2: (no operation)
                                                
                SUB     E0: MBR := M[MAR]       READ
                        E1: AC := AC - MBR      EAC, EMB, ESUB, LAC
                        E2: (no operation)
                                                
                AND     E0: MBR := M[MAR]       READ
                        E1: AC := AC and MBR    EAC, EMB, EAND, LAC
                        E2: (no operation)
                                                
                OR      E0: MBR := M[MAR]       READ
                        E1: AC := AC or MBR     EAC, EMB, EOR, LAC
                        E2: (no operation)
                                                
                LOAD    E0: MBR := M[MAR]       READ
                        E1: AC := MBR           EMB, EADD, LAC
                        E2: (no operation)
                                                
                STORE   E0: MBR := AC           EAC, EADD, LMB
                        E1: M[MAR]:= MBR        WRITE
                        E2: (no operation)
                                                
                JMS     E0: MBR := PC           EPC, EADD, LMB
                        E1: M[MAR] := MBR,      WRITE, EMA, EONE, EADD, LPC
                            PC := MAR + 1
                        E2: (no operation)
                                                
                JMP     E0: PC := MAR           EMA, EADD, LPC
                        E1, E2: (no operation)
                                
                CLR     E0: AC := 0             EADD, LAC
                        E1, E2: (no operation)
                                
                SHL     E0: AC := shl AC        EAC, ESHL, LAC
                        E1, E2: (no operation)
                                
                SHR     E0: AC := shr AC        EAC, ESHR, LAC
                        E1, E2: (no operation)
                                
                ASHR    E0: AC := ashr AC       EAC, EASHR, LAC
                        E1, E2: (no operation)
                               
                INC     E0: AC := AC + 1        EAC, EONE, EADD, LAC
                        E1, E2: (no operation)
                                
                DEC     E0: AC := AC - 1        EAC, EONE, ESUB, LAC
                        E1, E2: (no operation)
                                
                SKPZ    E0&(AC = 0): PC := PC+1 EPC, EONE, EADD, LPC
                        E1, E2: (no operation)
                                
                SKPN    E0&(AC < 0): PC := PC+1 EPC, EONE, EADD, LPC
                        E1, E2: (no operation)
                                
        I Cycle:        I0: MAR := 0            EADD, LMA
                        I1: MBR := PC           EPC, EADD, LMB
                        I2: M[MAR] := MBR,      WRITE, EONE, EADD, LPC
                            PC := 1

      2. Now, we are almost ready to derive the complete equations for each 
         control signal.  However, it will be helpful if we connect the four
         op-code bits of the IR to a one out of 16 decoder to derive intermediate
         signals corresponding to each machine instruction - e.g. the
         control signal ADD will be active just when the IR contains 0000 -
         the op-code for ADD.

      3. Now, we simply collect all references to each control signal to
         derive its equation.   We will need to look carefully for opportunities
         to reduce the complexity of the resulting realization.
                 
         Example: EPC is active on F0, F1, I1, and E0 for JMS, SKPZ and SKPN if
                  the skip is taken.  This leads to the following logic to
                  derive EPC:
                                        
                  F0 -------------
                  F1 -------------
                  I1 -------------
                                        
                  JMS --- and ----
                  E0  ---
                                     or --- EPC
                  SKPZ -- 
                  E0 ---- and ----
                  AC=0 --
                                        
                  SKPN --
                  E0 ---- and ----
                  AC[15]-

         Example: READ is active on F1, D0, and E0 for op-codes ADD .. LOAD.
                                  
                  F1 -------------------------------
                  D0 -------------------------------
                                                        or ---- READ
                  E0 --------------
                                  
                  ADD ----          and ------------
                  SUB ----
                  AND ---- or -----
                  OR  ----
                  LOAD ---
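
         The two example equations can also be written as boolean functions,
         collected directly from the micro-operation listing in C.1.  This is
         a sketch: timing signals and decoded op-codes are passed as strings
         purely for readability, and the function names are invented here.
         F1 appears in the EPC equation because PC := PC + 1 in the F cycle
         listing also gates the PC onto S1.

```python
MEMORY_READ_OPS = {"ADD", "SUB", "AND", "OR", "LOAD"}

def epc(timing: str, op: str, ac: int) -> bool:
    """Enable PC to S1, for a timing signal, decoded op-code and 16-bit AC."""
    ac_zero = (ac & 0xFFFF) == 0
    ac_negative = bool(ac & 0x8000)          # AC[15], the sign bit
    return (timing in ("F0", "F1", "I1")
            or (timing == "E0" and (op == "JMS"
                                    or (op == "SKPZ" and ac_zero)
                                    or (op == "SKPN" and ac_negative))))

def read(timing: str, op: str) -> bool:
    """Memory read: F1, D0, or E0 of an instruction that fetches an operand."""
    return timing in ("F1", "D0") or (timing == "E0" and op in MEMORY_READ_OPS)

print(epc("E0", "SKPZ", 0), read("E0", "STORE"))   # True False
```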

III. Microprogramming
---  ----------------

   A. As you can see, for even a very simple machine like the one we just
      looked at, hardwired control leads to very complex control logic.  For a 
      more complex machine like the VAX, the complexity would make hardwired 
      control virtually impossible.  Thus, the majority of CISCs use 
      microprogramming as a means of keeping the complexity of control within 
      limits (at the cost of a somewhat slower execution cycle.)

   B. The original micro-programming proposal was made by Wilkes and Stringer
      in 1951.

      -- TRANSPARENCY
      -- explain
      -- note two parts to the control word:
         -- microword to control ALU
         -- sequencing control

      1. The original proposal - calling for the use of a diode matrix - was
         never implemented.

      2. The earliest micro-programmed machines used core-technology type ROMs
         (--explain).  Current machines often use semiconductor ROMs.  Thus, in
         practice the decoding tree and matrices are replaced by a ROM that
         fetches a new control word on each clock pulse, based on the contents
         of a control address register (corresponding to register I in Wilkes'
         scheme.)

      3. It is also possible to use a writeable memory (PROM or RAM) for the
         control memory.  Most often this represents only a portion of the
         control store, with the rest still being ROM.  This allows for:

         a. Dynamic microprogramming - e.g. for adding custom user instructions
            to the standard set or emulating another machine.

         b. Diagnostics - a microprogram that exercises a suspected portion of
            the circuitry one micro-operation at a time may be loaded to assist
            in the isolation of hardware flaws.

   C. A micro-programmed implementation of our example machine.

      1. Structure of the control unit:

                -------------------------------------
                | Control store - small, fast ROM   |
                | 64 words x 32 bits                |
                |                                   |
                -------------------------------------
                 ||||||||||||||||||| ||| |||||| ||||
                -------------------------------------
                | Current word from control store   |
                -------------------------------------
                 ||||||||||||||||||| ||| |||||| ||||
                 ||||||||||||||||||| ||| |||||| ||||
                 Control word to     used by    not
                 registers, data     sequencer- used
                 paths, ALU, memory  see below

      2. Micro-word format

        ----------------------------------------------------------
        | Control word to send to data part | Sequencing control |
        ----------------------------------------------------------

         a. The control word part would be 19 bits wide - 1 bit for each
            possible micro-operation - e.g.

        E E E E E E E E E E E E L L L L R W L
        A P M M O A S A O S S A A P M M E R I
        C C A B N D U N R H H S C C B A A I R
                E D B D   L R H         D T
                              R           E

         b. We will say more about sequencing control in a moment.  
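
            To make the bit layout concrete, here is a small sketch that
            builds the 19-bit control word part from a list of active signal
            names, using the left-to-right bit order shown above.  The
            function name encode is an assumption for this example.

```python
SIGNALS = ["EAC", "EPC", "EMA", "EMB", "EONE", "EADD", "ESUB", "EAND", "EOR",
           "ESHL", "ESHR", "EASHR", "LAC", "LPC", "LMB", "LMA",
           "READ", "WRITE", "LIR"]

def encode(*active: str) -> str:
    """Return the 19-bit control word with the named signals set to 1."""
    unknown = set(active) - set(SIGNALS)
    if unknown:
        raise ValueError(f"unknown control signals: {sorted(unknown)}")
    return "".join("1" if name in active else "0" for name in SIGNALS)

# F0: MAR := PC needs EPC, EADD and LMA.
print(encode("EPC", "EADD", "LMA"))   # 0100010000000001000
```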

      3. Now, we can encode the control word part of the microwords that
         correspond to our previously written "microprogram" for our simple
         machine.  Assume we place the control word that corresponds to F0
         in control store location 0.  In general, we assign control words to
         successive locations in control store; the exceptions will be
         explained shortly.  In the table below, the column labelled "Next" 
         specifies which micro-word is to follow the current word, and thus 
         corresponds to the state machine portion of the hardwired 
         implementation.

        Address State   Control-word part       Next
        (hex)           (binary)
        00      F0      0100010000000001000     01
        01      F1      0100110000000100100     02
        02      F2      0001010000000001001     08 if indirect addressing is
                                                   specified; else 03
        03              0000000000000000000     depends on op-code - one of 20,
                                                22, 24, ... 3C, 3E
        04-07   (not used)
        08      D0      0000000000000000100     09
        09      D1      0001010000000001000     depends on op-code - one of 20,
                                                22, 24, ... 3C, 3E
        0A-0F   (not used)
        10      I0      0000010000000001000     11
        11      I1      0100010000000010000     12
        12      I2      0000110000000100010     00
        13-1F   (not used)
        20      E0-ADD  0000000000000000100     21
        21      E1-ADD  1001010000001000000     00 or 10 if interrupt pending
        22      E0-SUB  0000000000000000100     23
        23      E1-SUB  1001001000001000000     00 or 10 if interrupt pending
        24      E0-AND  0000000000000000100     25
        25      E1-AND  1001000100001000000     00 or 10 if interrupt pending
        26      E0-OR   0000000000000000100     27
        27      E1-OR   1001000010001000000     00 or 10 if interrupt pending
        28      E0-LOAD 0000000000000000100     29
        29      E1-LOAD 0001010000001000000     00 or 10 if interrupt pending
        2A      E0-STOR 1000010000000010000     2B
        2B      E1-STOR 0000000000000000010     00 or 10 if interrupt pending
        2C      E0-JMS  0100010000000010000     2D
        2D      E1-JMS  0010110000000100010     00 or 10 if interrupt pending

        2E      E0-JMP  0010010000000100000     00 or 10 if interrupt pending
        2F      (not used)
        30      E0-CLR  0000010000001000000     00 or 10 if interrupt pending
        31      (not used)
        32      E0-SHL  1000000001001000000     00 or 10 if interrupt pending
        33      (not used)
        34      E0-SHR  1000000000101000000     00 or 10 if interrupt pending
        35      (not used)
        36      E0-ASHR 1000000000011000000     00 or 10 if interrupt pending
        37      (not used)
        38      E0-INC  1000110000001000000     00 or 10 if interrupt pending
        39      (not used)
        3A      E0-DEC  1000101000001000000     00 or 10 if interrupt pending
        3B      (not used)
        3C      E0-SKPZ 0000000000000000000     3F if AC is 0, else 3D
        3D              0000000000000000000     00 or 10 if interrupt pending
        3E      E0-SKPN 0000000000000000000     3D if AC is NOT negative;else 3F
        3F              0100110000000100000     00 or 10 if interrupt pending
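
         When reading the table it can help to go the other way, from a
         19-bit control word back to the names of the signals it activates.
         A minimal sketch follows (the function name decode is made up for
         this example; the bit order is the one given in the micro-word
         format above).

```python
SIGNALS = ["EAC", "EPC", "EMA", "EMB", "EONE", "EADD", "ESUB", "EAND", "EOR",
           "ESHL", "ESHR", "EASHR", "LAC", "LPC", "LMB", "LMA",
           "READ", "WRITE", "LIR"]

def decode(word: str) -> list[str]:
    """List the control signals that are active in a 19-bit control word."""
    if len(word) != len(SIGNALS) or set(word) - {"0", "1"}:
        raise ValueError("expected a 19-character binary string")
    return [name for name, bit in zip(SIGNALS, word) if bit == "1"]

# Location 21 (E1-ADD): AC := AC + MBR
print(decode("1001010000001000000"))   # ['EAC', 'EMB', 'EADD', 'LAC']
```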
                                                
      4. The final issue we must consider is sequencing.

         a. Note in the example above that the next micro-word to be fetched
            after a given micro-word is determined in one of the following
            ways:

            i. The next micro-word occurs in the next sequential location
               in control store.

           ii. The next micro-word is either micro-word 00 or 10, depending
               on whether or not an interrupt is pending.  This is the case
               for the last micro-word of the execute cycle of each
               instruction.

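          iii. The next micro-word is either a designated micro-word (if a
               certain condition holds about the AC) or the next micro-word
               in sequence.  This is the case for the skip instructions:
               SKPZ goes to 3F if AC is 0, and SKPN goes to 3D if AC is not
               negative.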

           iv. The next micro-word depends on the contents of the IR - it is
               the control word that corresponds to the E0 state of the
               instruction whose opcode is in the IR.  This happens after
               the final F state of instructions that do not use
               indirect addressing, and after the final D state of instructions
               that do.

         b. These cases can be handled in the following ways:

            i. We can dedicate 6 bits of the micro-word to hold the
               address of the next micro-word to be used. This can handle the 
               case where we execute micro-words sequentially, and could
               handle other branches in the micro-program if needed.

           ii. We can build a "conditional branch" facility in the
               sequencer - if a specified condition is true, take the next
               micro-word from the address specified in the last 6 bits;
               else simply add 1 to the address of the current micro-word.

               The conditions we have to be able to test for are:

               indirect address type instruction (IR[4]=0 and IR[0]=1)
               AC = 0
               AC >= 0 (i.e. AC is not negative)

               (In effect, we build a small portion of the logic that would
                be used for hardwired control to handle these cases)

          iii. We can build a decode facility in the sequencer, where the
               address of the next micro-word is formed by taking the op-code
               stored in the IR (bits 4..1), prepending a 1 and appending
               a 0 to form a six-bit address.  Note how the control words
               for E0 of each instruction have been laid out to facilitate
               this - e.g.

               op code = 0000 (ADD) - E0 step is at 1 0000 0 = 20 hex
               op code = 0001 (SUB) - E0 step is at 1 0001 0 = 22 hex
               ...
               op code = 1111 (SKPN) - E0 step is at 1 1111 0 = 3E hex

           iv. We can build interrupt detection logic into the sequencer -
               at a certain point, if no interrupt is pending, we go to
               control-word 00; else we go to control-word 10 hex (I0).
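
         The address formation used by the decode facility can be sketched
         in a few lines of Python.  This is only an illustration of the
         bit layout described above; the function name e0_address is
         hypothetical and not part of the example machine.

```python
def e0_address(opcode: int) -> int:
    """Return the control-store address of the E0 micro-word for an opcode.

    Hypothetical helper: the 4-bit opcode goes in bits 4..1, with a 1
    prepended (bit 5) and a 0 appended (bit 0).
    """
    if not 0 <= opcode <= 0xF:
        raise ValueError("opcode must fit in 4 bits")
    return 0b100000 | (opcode << 1)


# The three cases listed in the notes.
for name, opcode in [("ADD", 0b0000), ("SUB", 0b0001), ("SKPN", 0b1111)]:
    print(f"{name}: E0 at {e0_address(opcode):02X} hex")
```

         Because the appended bit is 0, each instruction gets two
         consecutive control-store locations (E0 and E1) starting at an
         even address.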

         c. This leads to a sequencing portion of the micro-word with the
            following format:

            operation (3 bits) micro-program address (6 bits)

            000 = take next micro-word from address specified in next 6 bits
                  (normally the next location in control store)
            001 = decode op code by or-ing 4 bit op-code, shifted left one
                  place, with address specified in next 6 bits
            010 = if an interrupt is pending, then take the next micro-word
                  from address in control store specified by next 6-bits; else
                  take the next micro-word from location 0 in control store.
            011   (not used)
            100 .. 110 conditional branch in microprogram.  If specified
                       condition is true, take next control word from
                       address in control store specified by next 6 bits; else
                       take next control word in sequence - where condition is

                100 - MBR[15]=0 and MBR[11] = 1 *
                101 - AC = 0
                110 - AC >= 0

         * We test bits of MBR rather than IR because the one place where this
           is used is part of the same microinstruction that does IR <- MBR.
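
         As a rough sketch, the sequencer's next-address logic for this
         format might look like the following Python function.  The
         function and parameter names are hypothetical.  One assumption is
         made about operation 000: since the control store listing puts
         the successor address in the 6-bit field (and I2 uses it to
         return to 00), the sketch simply uses that field.

```python
def next_address(current, op, addr, ir_opcode=0, interrupt=False,
                 indirect=False, ac=0):
    """Compute the next micro-word address from the sequencing field.

    current   - address of the current micro-word (0..63)
    op, addr  - 3-bit operation and 6-bit address from the micro-word
    ir_opcode - 4-bit opcode of the user instruction
    indirect  - True when MBR[15]=0 and MBR[11]=1
    ac        - accumulator value as a signed integer
    """
    sequential = (current + 1) & 0x3F
    if op == 0b000:
        return addr                      # assumption: field holds successor
    if op == 0b001:
        return addr | (ir_opcode << 1)   # decode: or-in the shifted opcode
    if op == 0b010:
        return addr if interrupt else 0  # interrupt check at end of execute
    if op == 0b100:
        return addr if indirect else sequential
    if op == 0b101:
        return addr if ac == 0 else sequential
    if op == 0b110:
        return addr if ac >= 0 else sequential
    raise ValueError("unused sequencing operation")


# F2 (address 02) with an indirect instruction branches to D0 at 08.
print(f"{next_address(0x02, 0b100, 0x08, indirect=True):02X}")
```

         Note that only the conditional branches (100 .. 110) ever need
         the "add 1 to the current address" path.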

      5. Thus, we end up with a micro-word having a total of 28 bits (19
         control bits plus 9 sequencing bits), which we might implement
         using a 32-bit ROM with 4 bits of each word unused.  The contents
         would look like this - including both fields:

        00      F0      0100010000000001000     000 000001
        01      F1      0100110000000100100     000 000010
        02      F2      0001010000000001001     100 001000 
        03              0000000000000000000     001 100000
        04-07   (not used)
        08      D0      0000000000000000100     000 001001
        09      D1      0001010000000001000     001 100000
        0A-0F   (not used)
        10      I0      0000010000000001000     000 010001
        11      I1      0100010000000010000     000 010010
        12      I2      0000110000000100010     000 000000
        13-1F   (not used)
        20      E0-ADD  0000000000000000100     000 100001
        21      E1-ADD  1001010000001000000     010 010000
        22      E0-SUB  0000000000000000100     000 100011
        23      E1-SUB  1001001000001000000     010 010000
        24      E0-AND  0000000000000000100     000 100101
        25      E1-AND  1001000100001000000     010 010000
        26      E0-OR   0000000000000000100     000 100111
        27      E1-OR   1001000010001000000     010 010000
        28      E0-LOAD 0000000000000000100     000 101001
        29      E1-LOAD 0001010000001000000     010 010000
        2A      E0-STOR 1000010000000010000     000 101011
        2B      E1-STOR 0000000000000000010     010 010000
        2C      E0-JMS  0100010000000010000     000 101101
        2D      E1-JMS  0010110000000100010     010 010000
        2E      E0-JMP  0010010000000100000     010 010000
        2F      (not used)
        30      E0-CLR  0000010000001000000     010 010000
        31      (not used)
        32      E0-SHL  1000000001001000000     010 010000
        33      (not used)
        34      E0-SHR  1000000000101000000     010 010000
        35      (not used)
        36      E0-ASHR 1000000000011000000     010 010000
        37      (not used)
        38      E0-INC  1000110000001000000     010 010000
        39      (not used)
        3A      E0-DEC  1000101000001000000     010 010000
        3B      (not used)
        3C      E0-SKPZ 0000000000000000000     101 111111
        3D              0000000000000000000     010 010000
        3E      E0-SKPN 0000000000000000000     110 111101
        3F              0100110000000100000     010 010000
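
         To make the 28-bits-in-a-32-bit-ROM idea concrete, here is a small
         Python sketch that packs and unpacks one entry.  The layout chosen
         (control bits first, then the operation, then the address, with
         the top 4 bits of the ROM word left as zero) is an assumption made
         for illustration, and the function names are hypothetical.

```python
CONTROL_BITS, OP_BITS, ADDR_BITS = 19, 3, 6


def pack_microword(control: str, op: str, addr: str) -> int:
    """Pack three binary strings into one integer (assumed field order)."""
    if (len(control), len(op), len(addr)) != (CONTROL_BITS, OP_BITS, ADDR_BITS):
        raise ValueError("field widths must be 19, 3 and 6 bits")
    return int(control + op + addr, 2)


def unpack_microword(word: int):
    """Split a packed micro-word back into (control, op, addr) integers."""
    addr = word & 0x3F
    op = (word >> ADDR_BITS) & 0x7
    control = (word >> (ADDR_BITS + OP_BITS)) & 0x7FFFF
    return control, op, addr


# The F2 entry from the listing: control word, operation 100, address 08.
word = pack_microword("0001010000000001001", "100", "001000")
control, op, addr = unpack_microword(word)
print(f"ROM word {word:08X}: op={op:03b} addr={addr:02X}")
```

         Since the packed value never exceeds 28 bits, the upper 4 bits of
         each 32-bit ROM word remain available for later extensions.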
                                                
   D. The example we just worked was ad-hoc - which is not unusual for a
      situation where even one extra control word - especially in the fetch
      cycle - can have a profound impact on execution time.  Note also that
      decoding the instruction requires an extra cycle, when compared to
      the hardwired implementation - this is a price sometimes paid for using
      microprogrammed control.

V. Advantages/disadvantages of micro-programming
-  ------------------------ -- -----------------

   A. Advantages

      1. Great sophistication in the user instruction set can be achieved for
         relatively low cost.  Adding new instructions is cheap.

      2. Multiple user instruction sets can be available on the same machine.
         This allows a new machine to emulate a previous model to aid in
         the conversion process - e.g.

         a. Early IBM 360's contained microcode to emulate 1401's and/or 1620's
         b. Early DEC VAX's emulated PDP-11's.
         c. DEC Alpha's use a form of microcode (though different from what
            we have discussed here) to emulate VAX's.

      3. New architectures can be tried out by simulating them using writeable
         control store on an existing machine.  Special micro-engines have
         been built for just this kind of work.

      4. Micro-code can be written to allow direct execution of high-level
         languages - e.g. LISP, Pascal.

      5. For specialized applications (e.g. real-time systems), critical loops
         can be microprogrammed for faster execution time.

      6. Micro-programmed diagnostics.

      7. Bit-sliced processors, allowing implementation of custom machines.

   B. Disadvantages

      1. For a simple machine, the extra hardware needed for the control store
         and sequencer may be more complex than hardwiring.

      2. For a given level of technology, hardwired control will be faster, 
         since there is no delay for micro-instruction fetch from ROM before
         the control unit can produce a control word.
         
      3. Does not lend itself well to parallelism, as we shall see.

   C. CISCs typically use micro-programming, because hardwired control is
      generally not feasible due to the complexity of their instruction set.
      RISCs do not use micro-programming; their simplicity facilitates
      hard-wired control, which is faster and allows pipelining - our next
      topic.
Copyright ©1999 - Russell C. Bjork