Unit 4 PDF
Microprogrammed Design
At the heart of our development of digital design is the algorithm. It is our basic
tool for organizing our thoughts, and we use it to guide the design process. The
actual implementation of the control algorithm is less important than the algorithm
itself. We will accept any reasonable implementation scheme that conforms to
our demands for clarity, simplicity, and regularity. In Part II, we developed
systematic methods of realizing algorithmic state machines using building blocks
of the scale of MSI integrated circuits. Are there other ways to transform ASM
charts into circuits?
From the earliest days of computers, programmers have regarded computers
as machines for executing algorithms. In 1951, Maurice Wilkes* proposed building
a special "computer" for executing algorithmic state machines, with the logic
of the algorithm residing in a special program called a microprogram. Wilkes's
concept, microprogramming, was well ahead of the state of digital technology.
Microprogramming was not used commercially until 1964, when IBM employed
it extensively in the construction of the System/360 series of computers.
The year 1964 also saw the beginnings of the small, inexpensive computer.
In that year the Digital Equipment Corporation introduced the PDP-8 minicomputer,
which was the first CPU inexpensive enough to be dedicated to running algorithms
to control a particular device. The PDP-8 and its successors and imitators
enjoyed wide use for more than ten years in sophisticated digital control applications,
although engineers looked upon these uses as simply an extended form of con-
ventional programming. In 1974 another wave of technology produced the dra-
matically less expensive CPUs that we call microprocessors and microcomputers.
The ensuing explosion of applications will continue in the foreseeable future.
Unfortunately, many people equate the microprocessor with the concept of mi-
croprogramming, a serious misconception. Today's microprocessors and micro-
computers are inexpensive, small, conventional computers, programmed in a
conventional way. Microprogramming represents a different approach to pro-
gramming, and although many computers are constructed with the aid of micro-
programming techniques, the conventional software programmer normally does
not use the technique. To separate these ideas more clearly, we refrain from
using the sadly diluted "microcomputer" and "microprocessor" names when
referring to microprogrammed devices. Instead, we will speak of a "micropro-
grammed controller," or "microcontroller."
CLASSICAL MICROPROGRAMMING
Wilkes recognized the fundamental separation between controller and architecture
and was able to contemplate new and systematic ways of implementing the
control function. Although formal ASM charts had not yet been invented, Wilkes
proposed a machine whose fundamental operation was the execution of the ASM
state in Fig. 10-1. This standard state has at most one test variable, and may
have none.
Wilkes proposed to use diodes for his machine. At that time, diodes were
used to construct logic AND and OR functions. Although we no longer use
diodes for this purpose, to appreciate Wilkes's proposal you should understand
that diode construction provides a wired-OR capability similar to that of the
modern open-collector gate.
We will describe Wilkes's method by means of an example. Consider the
simple ASM in Fig. 10-2. Let's proceed with a clocked implementation of the
state generator using an encoded state assignment of the usual sort; Fig. 10-2
shows an arbitrary assignment of state variables B and A for the three states.
Implementing the algorithm calls for constructing the new values of the state
variables, B(NEW) and A(NEW), which serve as inputs to the clocked state
flip-flops. If we decode the current state variables, we may produce logic signals
S0, S1, and S2 for the individual states. A routine examination of the ASM
leads us to the following equations for the various outputs required in the imple-
mentation:
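[Figure: the state flip-flops B and A drive a decoder; AND gates combine the decoder outputs with the qualifiers X and Y (S0·X, S1, S2·Y, and related terms) to form the command outputs P, Q, R and the next-state values B(NEW) and A(NEW).]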
CLASSICAL MICROPROGRAMMING
WITH MODERN TECHNOLOGY
Let us now explore the design of microprogrammed controllers with modern
devices. We will find that this causes a few changes in Wilkes's scheme, but
only in the details. At the conceptual level, microprogramming remains the
realization of controllers by means of tables rather than gates, an idea that
survives from Wilkes. We may reduce any ASM chart to a table with inputs of
current state and status variables and outputs of next-state variables and commands.
If we translate an ASM chart into a table and then implement the table directly,
using hardware lookup techniques, we are microprogramming in the broadest
sense. The essential step is the direct implementation of the table, bypassing
gates, Boolean algebra, and so on.
To simplify our treatment, we adopt for the present a uniform representation
of logic truth in the microcode. The customary convention is positive logic, with
T = H; we adhere to this convention in this section. In software programming,
the choice of convention is of no consequence to the programmer, who is not
dealing with voltage. In microprogramming, where we remain close to the hardware,
this choice of positive logic will create problems, and we will return to discuss
how to reinsert the full power of mixed logic into the microcode.
If our microcontroller is to implement really large algorithms, comparable
to computer programs, we may need a sizable memory to hold the microinstructions
for all the branch paths of the ASM. At this stage, ROM is a good choice since
its contents remain intact even when the power is off. This means that the
algorithm is instantly available when the power goes on, which seems quite
desirable. The absence of inexpensive, fast ROM was the stumbling block in
implementing microprogrammed control after its introduction, and many years
of research ensued before the development of practical devices; it is no longer
a problem.
In the testing of status variables newer technology has forced some changes
in Wilkes's scheme. In the jargon of microprogramming, ASM variables are
called qualifiers. In Fig. 10-3, we supply the current 2-bit state address B,A
that we decode into individual signals for each state. After this decoding of the
state address, we incorporate the qualifier tests using AND gates, to create one
Next-state address    Command outputs
0 0 0 0 0 0 1
0 0 0 1 1 0 0 1
0 0 1 0 0 1 1 0 0
0 0 1 1 0 1 1 0 0
0 1 0 0 0 0 0 0
0 0 1 0 0 0 0
0 1 1 0 0 0 0 1 0
0 1 1 1 0 0 0 1 0
1 0 0 0 1 0 1 0 0
0 0 1 0 0 0 0
0 1 0 1 0 0 0
0 1 1 0 0 0 0
0 0 0 0 0 0 0
0 1 0 0 0 0 0
1 0 0 0 0 0 0
1 1 0 0 0 0 0
Next, we implement the table directly in a system of sixteen 5-bit words. Figure
10-4 is a sketch of the circuit. All lines into the PROM are address inputs; of
the 5 output bits from the PROM, 2 are inputs to the state flip-flops, and 3 are
command outputs to the architecture.
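To make the table-lookup idea concrete, here is a minimal Python sketch of a sixteen-word, 5-bit control ROM addressed by B, A, X, and Y. The ROM contents, the state behavior, and the names `address`, `make_rom`, and `step` are invented for illustration; they are not the table printed above, several of whose digits did not survive.

```python
# Hypothetical 16-word x 5-bit control ROM.
# Address bits (MSB first): B, A, X, Y.
# Word bits (MSB first): B(NEW), A(NEW), P, Q, R.

def address(b, a, x, y):
    """Concatenate state bits and qualifier bits into a 4-bit ROM address."""
    return (b << 3) | (a << 2) | (x << 1) | y

def make_rom():
    rom = [0b00_000] * 16  # unused entries: go to state 0, no commands
    for y in (0, 1):       # state 0 tests X only, so Y is a don't-care
        rom[address(0, 0, 1, y)] = 0b01_100  # X true: go to state 1, assert P
        rom[address(0, 0, 0, y)] = 0b00_000  # X false: stay in state 0
    for x in (0, 1):       # state 1 tests nothing: both qualifiers are don't-cares
        for y in (0, 1):
            rom[address(0, 1, x, y)] = 0b10_010  # go to state 2, assert Q
    for x in (0, 1):       # state 2 tests Y only, so X is a don't-care
        rom[address(1, 0, x, 1)] = 0b00_001  # Y true: go to state 0, assert R
        rom[address(1, 0, x, 0)] = 0b10_000  # Y false: stay in state 2
    return rom

def step(rom, state, x, y):
    """One clock period: look up the word, split it into its two parts."""
    word = rom[address(state >> 1, state & 1, x, y)]
    next_state = word >> 3
    commands = {"P": (word >> 2) & 1, "Q": (word >> 1) & 1, "R": word & 1}
    return next_state, commands

rom = make_rom()
print(step(rom, 0, 1, 0))  # state 0 with X true
```

Notice that every don't-care qualifier forces the same word to be written into several ROM locations; this duplication is the price of placing the qualifiers in the address.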
This approach to microprogrammed design is conceptually straightforward
but requires enormous ROMs as the algorithm becomes more complex. There
is an added benefit, however. Our original treatment based on Wilkes's work
allowed us to implement ASMs containing at most one test variable per state.
Since the present approach requires us to create a microinstruction for every
possible combination of qualifier values for each state, we are automatically able
to implement an ASM of arbitrary complexity; hence the name "multiple qualifier."
No matter how complicated the branch path through a state, it corresponds to
some microinstruction in this ROM-based design. Note the strong similarity to
the ROM-based implementation of logic circuits discussed in Chapter 4.
As an exercise in using the multiple qualifier method, you might wish to
implement the Black Jack Dealer machine of Chapter 6, using microprogrammed
control. Suppose you use the same architecture as in the hardwired solution
presented in Chapter 6. In Fig. 6-32, you can identify two state variables (say
B and A) and eight qualifiers (CARD.RDY.SYNC, CARD.RDY.DELAYED,
STAND, BROKE, ACECARD, ACE11FLAG, SCOREGT16, and SCOREGT21),
so the ROM address field will contain 10 bits. The number of microinstructions
is 2^10 = 1024! The eleven command signals (including the adder select signals)
together with the two inputs to the state-variable flip-flops require that each ROM
word have 13 bits. You will need a system containing a 1K x 13 ROM.
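The sizing arithmetic for the multiple qualifier method is easy to capture in a few lines of Python. The function name `multiple_qualifier_rom_size` is our own; the numbers come from the Black Jack exercise above and from the sixteen-word example of Fig. 10-2.

```python
def multiple_qualifier_rom_size(state_bits, qualifiers, commands):
    """Return (address bits, number of words, bits per word) for a ROM
    whose address is the state variables concatenated with every qualifier."""
    address_bits = state_bits + qualifiers
    words = 2 ** address_bits
    word_width = commands + state_bits  # commands plus next-state bits
    return address_bits, words, word_width

# Black Jack Dealer: 2 state variables, 8 qualifiers, 11 command signals.
print(multiple_qualifier_rom_size(2, 8, 11))  # (10, 1024, 13)

# Fig. 10-2 example: 2 state variables, qualifiers X and Y, commands P, Q, R.
print(multiple_qualifier_rom_size(2, 2, 3))   # (4, 16, 5)
```

Each added qualifier doubles the number of words, which is why the method scales so poorly.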
You may wish to write down the contents of some of the 1024 microinstructions
to solidify your grasp of the concepts in the multiple qualifier method. You
can appreciate the tedium of using this "simple" method to implement a complex
algorithm manually.
Figure 10-4 A PROM-based microprogrammed implementation of Fig. 10-2.
What is your reaction to this approach? Ours is:
(a) It is a straightforward but tedious implementation of a general ASM.
(b) The tables are very large, even for relatively small problems.
(c) The method would be feasible only with inexpensive ROMs or PROMs.
The problem is that the address for the ROM is a concatenation of a small
number of encoded state-variable bits and a large number of individual qualifier
bits. The state variables are important at all times in the execution of the
algorithm, but each qualifier appears only occasionally in the ASM. Most of the
time, the algorithm is indifferent to the value of most of the qualifiers, yet we
must enumerate each combination. We are forced to use a canonical form of
truth table rather than the compact form allowed by the typical ASM. This
ROM-based method is feasible if you have a "smart" PROM programmer that
The Wilkes scheme tests only one qualifier per state. The address is formed
from the state variables alone, requiring the decoding of only a small number
of variables. The scheme implements the qualifier tests with AND gates inserted
in an orderly manner inside the circuit, following the decoding of the address.
You have seen that using a ROM precludes this method, and our first attempt
was to move all the qualifier signals out into the address field. This allowed us
to implement general ASMs, but at a severe penalty. We would like to remove
the qualifier variables from the ROM address field so that we can eliminate
redundant microinstructions.
Providing properly sequenced command outputs to the architecture is the
purpose of an ASM. In microprogramming, the command outputs arise from
bits in the microcode. In addition to these command bits, microprogramming
instructions also provide the new values of the state variables. This gives us a
clue: our microcode has two components, an external one (command outputs)
and an internal one (the next microinstruction address). Perhaps by enlarging
the internal portion of the microinstruction we can shrink the number of instructions.
We are striving to develop a method of handling large problems, and we may
have to compromise the generality of the ASM structures that our method will
handle.
Let's start instead with the most elementary useful ASM operation: one
qualifier per state with no conditional outputs. Further, let's try to realize each
state with only one microinstruction. In such a scheme, the ROM address would
consist solely of the state variables, which would select the proper microinstruction.
This instruction must contain sufficient information to guide the development of
the next-state address. In particular, the instruction itself must specify which
qualifier this particular state is testing.
If we organize all the qualifiers in a list, we may designate any qualifier
by its index n in the list. The ASM structure we are trying to realize is
[ASM state: a decision box testing X(n), with a true (T) exit and a false (F) exit]
where X(n) is one of the qualifiers X. Let's include this index n (the qualifier
index) in the microinstruction. Then, in our single microinstruction for state
Each word is now wider than in the multiple qualifier method, since it includes
the index field and an extra address field, but our microinstruction table is reduced
to one row per state.
To execute a microinstruction, our primitive microcontroller must be able
to select the proper X(n) using the value of n in the instruction. Based on the
present value of the selected test input X(n) the processor must choose one of
the two address fields as input to the state flip-flops. Let's construct this selection
hardware. Index n is an address in a table, so we may use a multiplexer building
block with n as the code for the test input selection. The output of this multiplexer
is a variable X(n), whose value must specify either the true or the false address.
We can perform this last selection with a set of two-input multiplexers, using
X(n) as the select input. Figure 10-5 is the circuit for our processor.
Address n TA FA P Q R
0 0 1 3 1 0 0
1 0 0 0 0 1 0
2 1 0 2 1 0 0
3 0 2 2 0 0
TA and FA each require 2 bits, n has 1 bit, and there are three command outputs,
so this design would require four words of 8-bit ROM.
Unconditional state transitions occur in instructions 1 and 3. Since each
microinstruction must specify the index n of some test variable, we simply choose
any index and make both TA and FA point to the same next instruction.
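The selection hardware can be modeled in a few lines of Python. This is a sketch under stated assumptions: the three-instruction program and the names `MicroInstr`, `PROGRAM`, and `step` are hypothetical and do not reproduce the table above. The first selection stands for the qualifier multiplexer; the second stands for the two-input multiplexers that choose between the true and false addresses.

```python
from typing import NamedTuple, Sequence

class MicroInstr(NamedTuple):
    n: int         # qualifier index
    ta: int        # next address if X(n) is true
    fa: int        # next address if X(n) is false
    commands: str  # unconditional command outputs, shown symbolically

# Hypothetical program, one microinstruction per state.
PROGRAM = [
    MicroInstr(n=0, ta=1, fa=0, commands="P"),  # state 0: wait for X(0)
    MicroInstr(n=0, ta=2, fa=2, commands="Q"),  # state 1: unconditional (TA = FA)
    MicroInstr(n=1, ta=0, fa=2, commands=""),   # state 2: wait for X(1)
]

def step(program: Sequence[MicroInstr], state: int, qualifiers: Sequence[int]):
    """One clock period of the single-qualifier controller."""
    instr = program[state]               # ROM addressed by state alone
    selected = qualifiers[instr.n]       # qualifier multiplexer picks X(n)
    next_state = instr.ta if selected else instr.fa  # address multiplexers
    return next_state, instr.commands

print(step(PROGRAM, 0, [1, 0]))  # X(0) true: take the true address
print(step(PROGRAM, 1, [0, 0]))  # unconditional: TA and FA agree
```

State 1 shows the unconditional transition: whatever index we name, both address fields lead to the same place.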
Why did we eliminate the ASM's conditional output from our single-qualifier
scheme? Conditional outputs arose naturally in Wilkes's scheme and in the
multiple-qualifier approach. But here we have exactly one microinstruction per
ASM state, and all the information for the execution of that state must reside
in that microinstruction. If we were to permit conditional outputs, we would
need a way to designate, for each command output bit, whether it is to be
asserted unconditionally, or only on the true branch, or on the false branch, or
not at all. This would require 2 microinstruction bits per command output, a
considerable burden on the hardware. One of the virtues of the present method
The preceding single-qualifier structure is feasible, but the two address fields
can consume considerable space in the microinstruction. We can eliminate one
address field if we adopt a rule for inferring that address from the present address.
The obvious choice, which conforms closely to the practice used in conventional
computers, is to insist that one of the branch addresses be the next sequential
address. In the single-qualifier method, state assignments correspond to micro-
instruction addresses, and since the state assignments are at our disposal, we
may use normal sequencing to save bits in the microinstruction. The ASM
operation reflecting this modification of the single-qualifier scheme is shown in
Fig. 10-7a.
The microinstruction now contains one jump address, the qualifier index,
and the command output bits. In the version of ASM in Fig. 10-7a the jump
address is always the true path. We can enhance the versatility of this approach
by adding one more bit to the microinstruction, to allow the microprogrammer
to specify which path, true or false, the jump address refers to. Now the format
for the microinstruction is
TFBIT | Index n | Jump address JA | Command outputs
Now consider how we might build the processor for this method. Since
we have eliminated one of the address fields, we have also removed the need
for the two-input multiplexers on the state flip-flop inputs in Fig. 10-5. Instead,
we need to be able to increment the current address whenever the test variable
value is opposite of TFBIT in the microinstruction. We may incorporate this
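[Figure: a multiplexer selects X(n) using the index n from the PROM; the PROM also supplies TFBIT and the jump address JA; a counter with load (LD) and count (CNT) controls holds the microinstruction address.]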
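The load-or-count decision can be sketched in Python as follows. The program is hypothetical, as are the names `MicroInstr`, `PROGRAM`, and `next_address`. Two assumptions are ours and not the text's: qualifier 0 is taken to be wired permanently true so that TFBIT = 1 yields an unconditional jump, and the counter is assumed to wrap around at the end of the program.

```python
from typing import NamedTuple, Sequence

class MicroInstr(NamedTuple):
    n: int         # qualifier index
    tfbit: int     # value of X(n) for which the jump is taken
    ja: int        # jump address
    commands: str  # command outputs, shown symbolically

# Hypothetical program. Assumption: qualifiers[0] is hardwired true.
PROGRAM = [
    MicroInstr(n=1, tfbit=0, ja=0, commands=""),   # 0: loop here while X(1) is false
    MicroInstr(n=0, tfbit=1, ja=3, commands="P"),  # 1: unconditional jump to 3
    MicroInstr(n=0, tfbit=1, ja=0, commands=""),   # 2: unconditional jump to 0
    MicroInstr(n=2, tfbit=1, ja=1, commands="Q"),  # 3: jump to 1 if X(2), else count
]

def next_address(program: Sequence[MicroInstr], addr: int,
                 qualifiers: Sequence[int]) -> int:
    instr = program[addr]
    if bool(qualifiers[instr.n]) == bool(instr.tfbit):
        return instr.ja               # LD: load the counter with JA
    return (addr + 1) % len(program)  # CNT: increment (wrap is an assumption)

print(next_address(PROGRAM, 0, [1, 1, 0]))  # X(1) true, TFBIT 0: count to address 1
```

The counter replaces the address multiplexers of the two-address scheme: only one address is stored, and the other is always "here plus one."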
S1 0 0 1 2 0 1 0
S2 1 2 0 1 1 0 0
S0 2 1 0 1 0 0
S3 3 0 1 0 0
0 1 0
1 0 1
2 2 11
3 3 11
4 4 1 12
5 6 0 0
6 7 0 10
7 5 0 9
8 0 1 5
9 0 0
10 0 0
11 0 4
12 5 5
13 0 5
Figure 10-9 The Black Jack Dealer ASM revised for microprogrammed implementation.
Figure 10-10 A sophisticated microprogram sequencer.
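[Figure: the 2910 microprogram sequencer, showing the 12-bit next-instruction address output Y, the instruction input, the condition code input CC with its enable, the CI input and FULL output, the five-word stack, and the data input D.]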
The output of the μPC forms one input to the Y multiplexer. The 12-bit external
data input D forms another multiplexer input. Usually, D comes from the
jump-address field of the pipeline register.
The two remaining multiplexer inputs support subprogram calls and program
looping. The 2910 contains a five-word stack. In a subprogram call, the 2910
must save the return address on the top of the stack; a subprogram return must
supply the return address from the top of the stack. The 2910's internal stack
allows calls to microprogram subprograms to be nested five deep. Each subprogram
call results in a stack push operation, and each subprogram return causes a stack
pop operation. When a microinstruction executes a subprogram call, the required
return point is the address of the control store word following the subprogram
call instruction. This address is exactly the quantity that is currently stored in
the 2910's μPC; Fig. 10-12 shows a data path from the μPC to the stack that
supports subprogram calls.
The fourth input to the Y multiplexer is from an internal register R that
can hold a loop counter. The R-register can be loaded from the 2910's D-input.
Several 2910 instructions support the loading, testing, and decrementing of the
value in the R-register.
The 2910's 4-bit operation code supports 16 basic instructions. Each in-
struction has a "pass" and a "fail" option, generating a total of 32 possible
operations. The selection of pass or fail is controlled by the values of the 2910
inputs CC and CCEN, according to the following prescription: if the enable signal
CCEN is true and the test input (condition code) CC is false, then fail; otherwise
pass. Viewed another way, this structure allows the execution of the pass version
of an instruction when we are not testing the input, or when the input is true
CCEN CC Result
F F Pass
F T Pass
T F Fail
T T Pass
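The pass/fail rule reduces to a single Boolean expression. The sketch below uses logical truth values, as the table does; the function name `passes` is our own, and the voltage polarity of the actual pins is outside its scope.

```python
def passes(ccen: bool, cc: bool) -> bool:
    """Fail only when testing is enabled (CCEN true) and the test input is false."""
    return not (ccen and not cc)

# Reproduce the four rows of the table.
for ccen in (False, True):
    for cc in (False, True):
        print(ccen, cc, "Pass" if passes(ccen, cc) else "Fail")
```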
CJP (conditional jump) is a typical 2910 instruction. In its fail mode, this
instruction selects the μPC as the Y-output, accomplishing normal sequencing.
In its pass mode, CJP selects the D-input as the Y-output, thus performing a
branch. At the next system clock edge, the pipeline register will receive the
appropriate instruction and the 2910's μPC will capture the address + 1 of this
instruction.
Another example is CJS (conditional jump to subprogram). In its fail mode,
CJS sequences to the next microinstruction address, with no effect on the 2910's
internal stack. In its pass mode, CJS performs a subprogram jump, which selects
the branch address in the D-input as the value of Y, and, when the system clock
fires, causes the contents of μPC to be pushed onto the internal stack. (As
usual, μPC will receive the new Y + 1 when the clock transition occurs.)
In the fail mode, the instruction CRTN (conditional subprogram return)
performs normal sequencing, with no effect on the 2910's stack. In the pass
mode, CRTN delivers the top-of-stack element to Y, thereby supplying the return
address to the previous subprogram as the address of the next microinstruction.
When the clock transition occurs, the 2910 pops its stack, and μPC receives
Y + 1.
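The behavior of these three instructions can be summarized in a small Python model. It is a simplified sketch, not a description of the chip: only CJP, CJS, and CRTN are modeled, the class name `Sequencer` and method name `clock` are our own, and raising an error on a sixth nested call is a modeling choice rather than a statement about what the 2910 does when its stack is full.

```python
class Sequencer:
    """Simplified model of the Y multiplexer, the microprogram counter,
    and the stack for CJP, CJS, and CRTN only."""
    STACK_DEPTH = 5          # five-word stack
    ADDRESS_MASK = 0xFFF     # 12-bit addresses

    def __init__(self):
        self.upc = 0         # microprogram counter
        self.stack = []

    def clock(self, op: str, d: int, ccen: bool, cc: bool) -> int:
        """Execute one microinstruction's sequencing; return the Y output."""
        passed = not (ccen and not cc)
        if not passed:
            y = self.upc                     # fail: normal sequencing
        elif op == "CJP":
            y = d                            # branch to D
        elif op == "CJS":
            if len(self.stack) >= self.STACK_DEPTH:
                raise OverflowError("more than five nested calls (model limit)")
            self.stack.append(self.upc)      # save the return address
            y = d
        elif op == "CRTN":
            y = self.stack.pop()             # return address from top of stack
        else:
            raise ValueError(f"instruction {op!r} is not modeled")
        self.upc = (y + 1) & self.ADDRESS_MASK  # counter captures Y + 1
        return y

seq = Sequencer()
seq.clock("CJP", 0x010, ccen=False, cc=False)  # unconditional jump to 010
seq.clock("CJS", 0x100, ccen=True, cc=True)    # call 100; return point is 011
seq.clock("CJP", 0x200, ccen=True, cc=False)   # test fails: fall through to 101
print(hex(seq.clock("CRTN", 0, ccen=True, cc=True)))  # returns to 0x11
```

The return address pushed by CJS is simply the current counter value, because the counter already holds the address of the word following the call.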
When the 2910 is used as the sequencing element in Fig. 10-12, an appropriate
form for the flow-of-control portion of the microinstruction is:
Sequencer component: I | CCEN | D        Test index
Each microinstruction provides the 2910 with the I, CCEN, and D fields. The
index field goes to the architecture to guide the selection of the appropriate test
input, which becomes the 2910's CC input.
Thus far, the 2910's D-field arises from the corresponding field in the mi-
croinstruction pipeline register. Although this is by far the most common and
useful mode of operation, the 2910 also permits an alternative source of the D-
input. With each instruction, the 2910 asserts one of three D-field selection
signals. In the instructions described above, the 2910 asserts its Pipeline-Enable
signal PL(EN). On the other hand, the 2910's JMAP (Jump on Map Address)
instruction, which causes an unconditional jump to the D-field address, asserts
With the realization that our microprogramming methods are capable of describing
and executing quite complex algorithms, we begin to see the need for sophisticated
equipment to help the designer to manage the complexity. We have assumed
that the microprogram storage was a read-only memory: ROM, PROM, or
EPROM. In accordance with good programming practice, our microprograms
do not change; all the "data storage" is in the architecture. Even when
sophisticated microprogram sequencers such as the 2910 are used, the microprogram
remains fixed during execution; the sequencer itself contains storage for
subprogram return points and loop control. For such an environment, ROM seems
the natural choice. Many designers initially discarded RAM for this purpose
because of the volatility of its contents when power drops. But as the size and
complexity of modern microprograms have increased, this choice has been reversed.
For debugging complex microprograms, and when the microcode may be modified
in the field, RAM is essential. In microprogramming jargon, the microprogram
storage is the control store. If the control store is easily alterable, as is RAM,
it is called writable control store (WCS). If we use RAM, we must load it
frequently, and we need powerful microprogramming aids. In the next section
we describe a microprogrammable development system that provides the designer
with the hardware and software tools required to manage the design and de-
velopment process.
I   Mnemonic  Name               Reg/Cntr   Fail       Pass       Reg/Cntr  Enable
                                 contents   Y   Stack  Y   Stack
    ...                                                           HOLD      PL
2   JMAP      JUMP MAP           X          D   HOLD   D   HOLD   HOLD      MAP
3   CJP       COND JUMP PL       X          PC  HOLD   D   HOLD   HOLD      PL
4   PUSH      PUSH/COND LD CNTR  X          PC  PUSH   PC  PUSH   Note 1    PL
5   JSRP      COND JSB R/PL      X          R   PUSH   D   PUSH   HOLD      PL
6   CJV       COND JUMP VECTOR   X          PC  HOLD   D   HOLD   HOLD      VECT
7   JRP       COND JUMP R/PL     X          R   HOLD   D   HOLD   HOLD      PL
Note 1: If CCEN = LOW and CC = HIGH, hold; else load. X = Don't Care
I-field
Value Mnemonic Function
$0 JZ Jump to location 0
$1 CJS Conditional jump to subroutine at pipeline address
$2 JMAP Jump to map address
$3 CJP Conditional jump to pipeline address
$4 PUSH Push with conditional load of counter
$5 JSRP Conditional jump to subroutine at R address or at pipeline address
$6 CJV Conditional jump to vector address
$7 JRP Conditional jump to R address or pipeline address
$8 RFCT Repeat loop if counter is non-zero
$9 RPCT Jump to pipeline address if counter is non-zero
$A CRTN Conditional return from subroutine
$B CJPP Conditional jump to pipeline address with stack pop
$C LDCT Load counter from D input
$D LOOP Test end of loop
$E CONT Continue
$F TWB Three-way branch
Alternative instruction mnemonics:
CALL Equivalent to CJS
RTN Equivalent to CRTN
JUMP Equivalent to CJP
[Figure: the Logic Engine. The backpanel holds the designer's wire-wrap board; the microprogrammable controller contains the WCS, the pipeline register, and a shift register with serial in/out; the display panel and a monitor running on a microcomputer support system with disk storage complete the system. Test inputs, commands, control, status, and serial data pass between the blocks.]
The Backpanel
The Logic Engine's backpanel is large: 16 in. wide and 20 in. high. It has a
general-purpose work area to handle integrated circuit chips of 8 to 64 pins. For
a typical design, the backpanel can accommodate several hundred chips. The
designer has access to both sides of the board at all times. Ground and +5 V
appear as power grids on opposite sides of the board and there are extensive
provisions for attaching power-bypass capacitors (see Chapter 12).
Along one side of the backpanel is an area committed to the microinstruction
pipeline register and WCS for the designer's command signals. This permits
easy wire-wrapping of the command signals to the architecture, and allows the
designer to employ as many command signals as the design requires.
The Logic Engine's development and debugging monitor supports the detailed
control of the WCS and of the operations of the microprogram sequencer and
pipeline register. The designer may load the WCS from a floppy-disk file, an
example of downloading. The designer may read and modify any word in the
WCS, modify any word without disturbing the remainder, and display the contents
of a block of WCS. Since the 2910 microprogram sequencer is an integral part
of the Logic Engine, the monitor knows its characteristics in detail and thus can
support the display and modification of all of the sequencer's internal registers.
The designer may display the microinstruction pipeline register and modify any
portion of it. The monitor also permits the designer to specify whether, with
each manual change of the pipeline register, a designer's clock signal is to be
issued. These features give the designer an important debugging tool: the manual
entry of microinstructions into the pipeline register without modifying the writable
control store. Since the pipeline register's command field is wired to the designer's
architecture, the designer may exert detailed manual control of the circuit.
Our first step is to work out the architecture: the registers, busses, and data
paths. Forth is a stack-oriented language, and several of its important operations
involve manipulations of the elements on the stack. In this example, we focus
on the stack operations. We wish the several top elements of the stack to be
available for direct use; the deeper elements will be kept in a RAM. Figure 10-15
is a portion of the architecture, showing the top three elements of the stack.
(The stack elements may contain as many bits as required by the problem, but
this decision does not concern us here.) The input to each stack element is
through a set of multiplexers. Each of the potential sources of a given element
of the stack becomes an input to that element's multiplexer. (Notice the similarity
of this data-routing design to that used in the LD20.) For testing purposes, in
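[Figure: the top three stack elements S0, S1, and S2, each fed by a multiplexer (select fields S0CTL, S1CTL, S2CTL; MUXCTL, 8 bits) whose sources include SWR, Stack, and S0 through S3; REGCTL (6 bits) controls the registers; LD.L and TST.L are test inputs.]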
Now it is time to construct and test the architecture. On the Logic Engine's
backpanel, we lay out the chips required by the architecture, assemble the
appropriate sockets and chips, and wire-wrap the design. The size of the backpanel
permits us to develop and debug the architecture without partitioning the com-
ponents among small printed circuit boards.
At this point, we usually make some preliminary tests of the registers and
the data paths. The Logic Engine's display panel has numerous lights and
switches to assist us. Using wire-wraps or jumpers, we connect the important
outputs to any of the display panel's LEDs and connect switches to the inputs.
A disposable cardboard overlay for the display panel allows us to label the lights
and switches. We may use the display panel's variable-speed clock to provide
clocking signals. Manual clocking permits us to debug statically: we deliver
clock transitions only when we wish. This is a powerful debugging technique.
Now we exercise the architecture with the display panel's switches, and
observe the results on the lights. In effect, we are manually delivering rudimentary
control to the architecture prior to developing the actual control program, thus
allowing early detection of gross errors in the wiring or design.
Once we are satisfied that the architecture is working properly, we turn to the
detailed development of the control algorithm. We rely on the Logic Engine to
help us develop the control in two ways: by providing a standard environment
for developing and executing microprograms, and by aiding us to program and
test the code.
The Logic Engine's microprogram assembler, LEASMB, has two parts.
In the declaration phase we specify symbolic names for all the variables and
quantities of interest, and we describe the structure of the microinstruction. The
program phase contains the microcode itself, in symbolic form. The use of
symbolic notations is of great value because of their descriptive power and
because changes in the design may usually be made with little disturbance to
the program. During the earlier phases of the design, natural names will emerge
for the important signals that control the architecture. It is convenient to use
these names in the microcode.
Figure 10-17 is a microprogram for our Forth Machine design. We will
use this code to introduce the elements of the microassembly language. (Later,
you will study the process of generating a microprogram for a more complex
microprogrammed design. Our treatment of the microassembly language will be
informal.)
The microcode in Fig. 10-17 supports a small portion of the testing of the
Forth Machine-the manual loading of data from the switch register on the
display panel and the exercising of the Forth language's rotate instruction. This
code includes declarations and microinstructions that illustrate a variety of features
of the microassembler.
                         ID      FORTH TEST
                 * FORTH ENGINE
                 * SAMPLE DECLARATIONS AND SAMPLE MICROCODE
                         SIZE    18; Number of command bits
                         MODE    LOGIC
                 * TEST MUX CONFIGURATION
                 INMUX   COM     (0:3),T=%HHHH,D=0
                 LD.L    INV     INMUX=0,T=%L
                 TST.L   INV     INMUX=1,T=%L
                 * COMMAND FIELD DECLARATIONS
                 MUXCTL(7:0) COM (4:11),T=$FF,D=%TTTTTTTT
                 M0      EQU     MUXCTL(7:5); Mux 0 select signals
                 M1      EQU     MUXCTL(4:2); Mux 1 select signals
                 M2      EQU     MUXCTL(1:0); Mux 2 select signals
                 M0S2    INV     M0=5; Select Reg S2 thru Mux 0
                 M0SWR   INV     M0=0; Select Switch Reg thru Mux 0
                 M1S0    INV     M1=3; Select Reg S0 thru Mux 1
                 M2S1    INV     M2=1; Select Reg S1 thru Mux 2
                 REGCTL  COM     (12:17),T=%HHHHHH,D=%FFFFFF
                 LOAD3   EQU     %111111; Load S0,S1,S2
                 ROTATE  INV     M0S2,M1S0,M2S1,REGCTL=LOAD3; Rotate stack.
                         PROG
LOC XDDDI CCCC C
000                      ORG     0
000              BEGIN   EQU     *
000 10033 0FF0 0 LOAD    JUMP    TEST IF LD.L=%F
001 50013 0FF0 0         JUMP    * IF LD.L=%T
002 30003 00DF C         JUMP    BEGIN;M0SWR,M1S0,M2S1,
                                 REGCTL=LOAD3; **Push switches onto stack.
003 10003 1FF0 0 TEST    JUMP    LOAD IF TST.L=%F
004 50043 1FF0 0         JUMP    * IF TST.L=%T
005 30003 0ADF C ROT     JUMP    BEGIN;ROTATE; **Rotate top 3 stack elements
                         END
0 ERROR(S) DETECTED
Figure 10-17 Microcode for the design example,
LEASMB microprogram statements have an optional label field, an operation
field, and an optional operand field, in that order. Within the operand field, the
required subfields are separated by semicolons; comments may follow the operand
field, if preceded by a semicolon. Lines beginning with an asterisk are comments.
An LEASMB output listing, such as in Fig. 10-17, shows the source program
and, to the left of the program phase, the object code in hexadecimal notation.
An LEASMB program begins with the directive ID and ends with the
directive END. The ID is the name assigned to the object code produced by
the assembler; END marks the physical end of the program text. The declarations
precede the program; the directive PROG separates the two. In the declaration
phase, the directive SIZE specifies how many command bits are used in the
design. The directive MODE, which takes an argument LOGIC or VOLTAGE,
announces how the assembler is to interpret numeric data that could describe
either logical or voltage values. As mixed logicians, we choose LOGIC mode.
Unless otherwise specified, numbers are in decimal notation. Binary constants
are preceded by %, and may contain numerals 1 and 0, logical T and F, or
voltage H and L.
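The numeric conventions just described can be illustrated with a short Python sketch. It is not part of LEASMB: the function name `parse_constant` and the `true_level` parameter are assumptions made for illustration. The sketch assumes LOGIC mode (so 1 means true) and treats the leftmost character as the most significant bit.

```python
def parse_constant(text, true_level="H"):
    """Convert a LEASMB-style constant into an integer of logic values.

    Plain digits are decimal. A leading % marks a binary constant whose
    characters may be 1/0, T/F, or H/L. H and L are voltage levels, so
    they are mapped to truth values using true_level (assumed parameter).
    """
    if not text.startswith("%"):
        return int(text, 10)
    value = 0
    for ch in text[1:].upper():
        if ch in ("1", "T"):
            bit = 1
        elif ch in ("0", "F"):
            bit = 0
        elif ch in ("H", "L"):
            bit = 1 if ch == true_level else 0
        else:
            raise ValueError(f"bad character {ch!r} in {text!r}")
        value = (value << 1) | bit
    return value
```

With high-true signals, `%1011`, `%TFTT`, and `%HLHH` all describe the same logic value; declaring the signal low-true changes only the reading of the H/L form.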
The declaration phase has three main directives: COM, INV, and EQU.
The program phase has, besides mnemonics for each of the 2910 sequencer's
instructions, two directives: ORG and EQU. COM specifies the nature of the
command bits in the microinstruction; INV allows a symbolic name to invoke
complex operations on command bits; EQU allows the equivalencing of names
to values or to other names. ORG declares the origin of the microinstructions
in the control store.
Command bits, presented to the architecture by the command field of the
pipeline register, are defined by the COM directive. In Fig. 10-17, the definition
of REGCTL provides the following information: REGCTL is a field of 6 bits
which occupy bits 12 through 17 of the command field of the microinstruction.
[If we choose, we may refer to individual bits or groups of bits of REGCTL
using a default set of indices, 0 for the leftmost bit through 5 for the rightmost
bit. Thus REGCTL(0:3) would reference the leftmost 4 bits of the field. This
notation frees the programmer from a dependence on the positions of particular
command bits.] For each bit of REGCTL, truth is represented by a high voltage
level (T = %HHHHHH). Whenever any bits of REGCTL are not mentioned in
a microinstruction, the default values will be false (D = %FFFFFF). (A
microinstruction will usually deal explicitly with only a few command bits; the
default declarations of command bits are therefore of great importance in freeing
the programmer from unnecessary details. In this example, the default values
hold the current contents of the stack registers.)
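To make the role of the default values concrete, here is a minimal Python sketch of how an assembler might fill in the command field. The names `CommandField` and `build_command`, the 18-bit size, and the choice to fill undeclared bits with F are all assumptions for illustration; only the field positions and defaults of REGCTL and MUXCTL come from the declarations discussed in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CommandField:
    name: str
    first: int      # position of the field's leftmost bit in the command field
    last: int       # position of the field's rightmost bit
    default: str    # logic default, one T/F character per bit

def build_command(fields, size, mentioned):
    """Return the command field as a string of T/F logic values.

    mentioned maps field names to explicit values; every field that is
    not mentioned receives its declared default.
    """
    word = ["F"] * size                 # assumed filler for undeclared bits
    for f in fields:
        bits = mentioned.get(f.name, f.default)
        if len(bits) != f.last - f.first + 1:
            raise ValueError(f"{f.name}: wrong number of bits")
        word[f.first:f.last + 1] = bits
    return "".join(word)

REGCTL = CommandField("REGCTL", 12, 17, "FFFFFF")
MUXCTL = CommandField("MUXCTL", 4, 11, "FFFFFFFF")
FIELDS = [MUXCTL, REGCTL]
```

Calling `build_command(FIELDS, 18, {"REGCTL": "TTTTTT"})` sets only the six REGCTL bits; MUXCTL silently takes its default, which is the convenience the defaults are meant to provide.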
The MUXCTL(7:0) declaration specifies that MUXCTL occupies bits 4
through 11 of the command field. In our program we may refer to individual
bits or groups of bits using indices 7 (for the leftmost bit) to 0 (for the rightmost).
This explicit indexing notation overrides the normal default notation. For each
bit of MUXCTL, truth is represented by a high voltage (T = %HHHHHHHH).
By convention, their unspecified default values are false.
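The two indexing conventions (default indices counted up from 0 at the left, or explicit indices such as 7 down to 0) amount to a simple translation from a declared index to a bit position. The following Python sketch shows one way to express it; the function name `absolute_bit` and its parameters are hypothetical and do not describe LEASMB's internals.

```python
def absolute_bit(first_pos, left_index, right_index, index):
    """Map a declared field index to a position in the command field.

    first_pos is the position of the field's leftmost bit; left_index and
    right_index are the indices declared for the leftmost and rightmost bits.
    """
    step = 1 if right_index >= left_index else -1
    offset = (index - left_index) * step
    width = abs(right_index - left_index) + 1
    if not 0 <= offset < width:
        raise IndexError(f"index {index} outside ({left_index}:{right_index})")
    return first_pos + offset
```

For REGCTL, with default indices (0:5) starting at bit 12, index 0 maps to bit 12. For MUXCTL(7:0), starting at bit 4, index 7 maps to bit 4 and index 0 maps to bit 11.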
In our illustration, we usually wish to deal with the group of command
Developing a Microprogram
The following prescription is useful for developing the microprogram:
1. Develop the main architecture. For our LD30, we would follow almost
identically the path taken in Chapter 7, and we will adopt the LD20's main
architecture.
2. Specify the obvious command bits for controlling the architecture. We
need not worry about details at this stage, but our knowledge of the
architecture will give us much insight into the commands needed to control
it. We may modify our list of commands later.
3. Write high-level microinstructions for the control algorithm. We will use
the Logic Engine's microassembly language, which will allow us to express
complex operations in symbolic terms without going into details. Since we
have already studied the elements of the control process for the LD20, we
can draw on this experience, using notations that are as close as reasonable
to those used in the LD20's ASM chart.
4. Develop the declaration phase of the microcode. This is primarily the
specification of invocation variables to expand in ever-increasing detail the
high-level notations used in the microinstructions.
5. Tidy up. Complete the details of command-bit representations, test inputs,
minor architectural elements, and definitions of the behavior of specific
chips used in the architecture.
The Architecture of the LD30
We adopt almost intact the architecture developed for the LD20. Figure 10-18
shows one of the twelve bit-slices in the main data structure. We see the principal
registers, the ALU, and the main bussing system for the 12-bit data of the LD30.
[Figure 10-18: one bit-slice (bit j) of the LD30's main data structure, showing the MA, MB, PC, and IR registers, the memory, the data multiplexer with its EA and INPUT sources, and the ALU.]
We will use the LD20's structure for the accumulator and link; this is shown
in Fig. 10-19. Also, we will use the LD20's structure for the memory and its
controller. We anticipate that the LD20's state generator will be completely
missing from our microprogrammed implementation, since the microprogram
controller in the Logic Engine performs the next-state selection. We expect our
LD30 to contain at least one new architectural element: the test multiplexer that
supports the delivery of the specified test signal to the microprogram controller.
We will specify other architectural elements as we encounter them in developing
the control algorithm for the LD30.
Figure 10-19 The accumulator and link of the LD30.
To provide a concrete point of reference for our later work, it is useful to specify
as many of the architecture's control signals (the microinstruction commands)
We will use our earlier analysis of the LD20's control algorithm (see Chapter
7) to help develop the LD30's microcode. The hard work-understanding the
PDP-8's specifications well enough to describe a correct control algorithm-is
the same for the microprogrammed LD30 as for the hardwired LD20. In the
LD20, we expressed the algorithm as an ASM chart. For the LD30, we will
use the Logic Engine's LEASMB microassembly language. The two versions
differ dramatically in appearance-the ASM chart is a two-dimensional flow
diagram, whereas the microassembly language is linear, like a computer program.
The two versions also differ in detail, since the LD20's ASM contains many
states with multiple tests (multiple qualifiers) and conditional outputs, which are
not permitted in the single-qualifier microinstructions supported by the Logic
Engine. And the implementations of the two control algorithms are wildly different.
However, the overall sequencing should remain the same. We do not need an
ASM for our work on the LD30, but to tie this work more closely with the
LD20, we will use LD20 nomenclature when practical. Figure 10-20 is a high-level
diagram of the LD30's control algorithm. The algorithm consists of four
principal blocks: the fetch phase, the execute phase, the idle phase, and the
manual phase. We must decompose each block into a sequence of microinstructions.
As always, we will strive for a top-down development.
[Figure 10-20 LD30 control: flowchart with the decisions Halt?, Continue?, Manual operation?, and Operand needed?, and the actions fetch next instruction, form effective address, get operand from memory, execute the instruction (EXECUTE PHASE), IDLE, and perform MANUAL operation.]
The purpose of the idle phase is to detect an operator's action at the LD30
display panel. As in the LD20, we must take care to process only synchronized
input signals in our algorithm, and we must provide a means of processing each
depression of a pushbutton on the display panel only once.
We may contemplate the use of a flip-flop HALTFF to record the status
of halt requests, as we did in designing the LD20. A request to halt will eventually
cause the algorithm to reach the idle phase; we should be prepared to clear the
IDLE      EQU   *
          JUMP  * IF MANPULSE=%F; CLEAR.HALTFF
          JUMP  FETCH IF CONT.SW=%T
          CALL  MANUAL
          JUMP  IDLE
Figure 10-21 Two versions of the hardware for the LD30's single-pulser.
The first instruction jumps in place until all buttons are released, thus assuring
that any previous depression of a pushbutton is processed only once. The next
step is to hang up until the operator depresses a pushbutton, which then allows
the idle phase to proceed. This second procedure is superior, since it requires
less hardware and only one additional microinstruction. This form of single-
pulser was used in our Forth Machine design example.
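The two-step wait described above (first hang until every button is released, then hang until a button is pressed) can be modeled in a few lines of Python. This is only a behavioral sketch under the assumption that `samples` is a sequence of already-synchronized readings of the combined pushbutton signal, one per microinstruction; the function name is ours.

```python
def count_handled_presses(samples):
    """Count button depressions, handling each one exactly once.

    samples is an iterable of truthy/falsy readings of the combined
    pushbutton signal, assumed to be synchronized to the system clock.
    """
    handled = 0
    waiting_for_release = True          # first step: wait until all buttons are up
    for pressed in samples:
        if waiting_for_release:
            if not pressed:
                waiting_for_release = False
        elif pressed:                   # second step: a new depression arrives
            handled += 1
            waiting_for_release = True  # ignore it until it is released
    return handled
```

A button held down for many clock periods is counted once, and a button that is already down when the idle phase begins is ignored until it has been released and pressed again.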
As we did in the LD20, we will insert into the architecture the large OR
gate necessary to produce MANSW* from the collection of individual pushbutton
signals on the display panel. We could dispense with this OR gate by writing
microcode that serially single-pulses through the individual pushbutton signals,
but this would require synchronizing each signal separately and would generate
much additional microcode.
The Manual Phase of the LD30
The processing of manual operations in the LD20 was rather tedious, since the
manual operations were merged with the execute phase. In the LD30, the manual
phase is much more straightforward, as the following microcode shows:
MANUAL    EQU   *
          CALL  LDMA IF LDMA.SW=%T
          CALL  LDMB IF LDMB.SW=%T
          CALL  LDPC IF LDPC.SW=%T
          CALL  LDIR IF LDIR.SW=%T
          CALL  LDAC IF LDAC.SW=%T
          CALL  CLEAR IF CLEAR.SW=%T
          CALL  EXAMINE IF EXAMINE.SW=%T
          CALL  LDMEM IF LDMEM.SW=%T
          CALL  DEPOSIT IF DEPOSIT.SW=%T
          RTN
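The manual phase is a straight-line sequence of conditional subprogram calls: each switch is tested in turn, and control falls through to the next test after any call returns. A rough Python equivalent, with handler functions that are placeholders of our own invention, looks like this:

```python
# Order of the tests, copied from the MANUAL microcode above.
ORDER = ["LDMA", "LDMB", "LDPC", "LDIR", "LDAC",
         "CLEAR", "EXAMINE", "LDMEM", "DEPOSIT"]

def run_manual(switches, handlers):
    """Call each handler whose switch is asserted, in listing order.

    switches maps names such as "LDPC.SW" to booleans; handlers is a
    list of (name, callable) pairs. Returns the names that were called.
    """
    called = []
    for name, handler in handlers:
        if switches.get(name + ".SW", False):
            handler()                   # CALL name IF name.SW=%T
            called.append(name)
    return called                       # RTN
```

Since one microinstruction can test only one signal, the cost of this simplicity is one microinstruction per switch, which is acceptable because manual operations run at human speed.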
F2        CONT  INCREMENT.PC
F3        CALL  READ.TO.IR
F3.1      JUMP  FETCH.DONE IF NO.MEMORY; EA.TO.MA
F4        CALL  READ.TO.MB
          JUMP  FETCH.DONE IF DIRECT.ADDRESSING
          CALL  AUTO IF AUTO.INDEXING
F7        CALL  READ.TO.MB
In Chapter 7, developing the fetch phase was rather complex, but now that
we understand it, rendering our understanding into microcode is simple. As
always, we have made good use of our ability to describe each microinstruction
in high-level terms.
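As a reminder of what the fetch phase must accomplish for a memory-reference instruction, here is a Python sketch of the standard PDP-8 effective-address rules: page-zero or current-page addressing, optional indirection, and auto-indexing through locations 10 to 17 (octal). It models the PDP-8's documented behavior rather than the LD30 microcode itself; the function and constant names are ours.

```python
PAGE_MASK = 0o7600      # upper five address bits select the page
OFFSET_MASK = 0o0177    # low seven bits of a memory-reference instruction
INDIRECT_BIT = 0o0400   # instruction bit 3
PAGE_BIT = 0o0200       # instruction bit 4: 1 = current page, 0 = page zero

def effective_address(instr, instr_addr, memory):
    """Return the effective address, updating memory when auto-indexing."""
    ea = instr & OFFSET_MASK
    if instr & PAGE_BIT:
        ea |= instr_addr & PAGE_MASK    # same page as the instruction
    if not instr & INDIRECT_BIT:
        return ea                       # direct addressing: done
    if 0o10 <= ea <= 0o17:              # auto-index locations
        memory[ea] = (memory[ea] + 1) & 0o7777
    return memory[ea]                   # indirect: the pointer is the address
```

The three exits of this function correspond to the direct, indirect, and auto-indexed cases that the fetch microcode distinguishes with its DIRECT.ADDRESSING and AUTO.INDEXING tests.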
At the start of the execute phase, IR will contain the current instruction, MB
will contain any needed operand from memory, and MA will contain the effective
address EA, if required. For the execute phase, we must develop microcode to
decode the instruction and perform each type of PDP-8 operation. Given our
work on the LD20, the execution of most of the instructions is straightforward,
and we might proceed as follows; some of the details are left for you to complete.
In problems at the end of the chapter, you are asked to complete the
microcode for the Operate instruction,
IOT Instruction. The PDP-8's IOT instruction performs input and output
operations. As in the LD20, our LD30 must arrange for the signals IOP1, IOP2,
and IOP4 to be asserted, if requested, long enough to examine the values of the
incoming status signals IOSKIP, ACCLR, and ORAC. This examination will
require several microinstructions, during which the appropriate IOP signal must
remain solidly asserted. We could write brute-force microcode for this problem,
with code to test the status of each of the three IOP signals. However, our
choice is to introduce an element into the LD30's architecture to generate and
maintain the IOP signals as required. We have already seen such an element
in the LD20: the IOP signal enabler. This circuit has a 4-bit shift register that
allows the enabling of each IOP signal in turn. We will use this circuit, shown
in Fig. 7-34, in our LD30. The microcode for the IOT instruction must then
loop three times over the status tests, once for each of the possible IOP signals.
We use the 2910 sequencer's internal counting instructions to manage the looping.
IOT.CODE  EQU   *; for PDP-8 input-output operations
          LDCT  2 ;; load the 2910 R-register with (loop-count - 1)
IOT.LOOP  CONT  ; SHIFT.IOP.ENABLER
E3        JUMP  *+2 IF IOSKIP=%F
          CONT  ; INCREMENT.PC ; if IOSKIP is asserted
E4        JUMP  *+2 IF ACCLR=%F
          CONT  ; CLEAR.AC ; if ACCLR is asserted
E5        JUMP  *+2 IF ORAC=%F
          CONT  ; OR.INPUT.TO.AC ; if ORAC is asserted
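The effect of the three passes through this loop can be summarized in Python. The sketch below is a behavioral model, not a transcription of the microcode: `device` is a hypothetical callback standing in for the external I/O device, and we assume the standard PDP-8 encoding in which the three low-order instruction bits request IOP1, IOP2, and IOP4.

```python
def run_iot(instr, pc, ac, device):
    """Step through IOP1, IOP2, IOP4 and act on the device's status lines.

    device(pulse) is a hypothetical callback that returns the tuple
    (ioskip, acclr, orac, data) observed while the named pulse is asserted.
    """
    for pulse, request_bit in (("IOP1", 0o1), ("IOP2", 0o2), ("IOP4", 0o4)):
        if not instr & request_bit:
            continue                    # this pulse was not requested
        ioskip, acclr, orac, data = device(pulse)
        if ioskip:
            pc = (pc + 1) & 0o7777      # INCREMENT.PC
        if acclr:
            ac = 0                      # CLEAR.AC
        if orac:
            ac |= data & 0o7777         # OR.INPUT.TO.AC
    return pc, ac
```

Within one pass the order of the tests matters: clearing the AC before ORing in the input is what allows a device to load the AC rather than merely OR into it.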
The IOT instruction provides two suboperations, ION and IOF, for enabling
and disabling the interrupt system. The suboperations are distinguished from
The LD30 must be able to detect an interrupt request and present a synchronized
version of the request to the microprogrammed control algorithm. The PDP-8's
interrupt protocol also requires that we be able to determine if interrupts are
enabled. If an interrupt is to occur, we must be able to force the execution of
a CALL 0 instruction. For these functions, we adopt the same architectural
elements used in the LD20. We show the architecture for the interrupt system
in Fig. 10-22.
Handling an interrupt request began early in the fetch phase with a subprogram
call to INTERRUPT.TEST. At the time we wrote that call, we had no clear
[Figure 10-22: architecture of the LD30's interrupt system, showing the INTERRUPTENABLE flip-flop controlled by INT.EN.CTL and the logic for forcing a JMS into the IR.]
The microcode for the execute phase calls a subprogram to decode the PDP-8
instruction residing in the IR. The operation code for most instructions is specified
by the three bits in IR0, IR1, and IR2. Since our microinstructions can examine
only one signal at a time, a pure microcode solution to instruction decoding is
clumsy (although perfectly feasible). Here is a brute-force way:
DECODE.INST  EQU   *; pure microcode instruction decoding
             JUMP  CODE.1XX IF IR0=1
CODE.0XX     JUMP  CODE.01X IF IR1=1
CODE.00X     JUMP  TAD.CODE IF IR2=1
             JUMP  AND.CODE
CODE.01X     JUMP  DCA.CODE IF IR2=1
             JUMP  ISZ.CODE
CODE.1XX     JUMP  CODE.11X IF IR1=1
CODE.10X     JUMP  JMP.CODE IF IR2=1
             JUMP  JMS.CODE
CODE.11X     JUMP  OP.CODE IF IR2=1
             JUMP  IOT.CODE
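The brute-force decoder is a binary decision tree in which every node tests exactly one bit. The Python sketch below mirrors that tree and, for comparison, lists the same destinations indexed by the 3-bit operation code. The function name and the `BY_OPCODE` list are ours; the destination labels are taken from the microcode above.

```python
def decode_one_bit_at_a_time(ir0, ir1, ir2):
    """Mirror the DECODE.INST jumps: each test examines a single bit."""
    if ir0:                                          # CODE.1XX
        if ir1:                                      # CODE.11X
            return "OP.CODE" if ir2 else "IOT.CODE"
        return "JMP.CODE" if ir2 else "JMS.CODE"     # CODE.10X
    if ir1:                                          # CODE.01X
        return "DCA.CODE" if ir2 else "ISZ.CODE"
    return "TAD.CODE" if ir2 else "AND.CODE"         # CODE.00X

# The same mapping as a table indexed by the opcode IR0 IR1 IR2.
BY_OPCODE = ["AND.CODE", "TAD.CODE", "ISZ.CODE", "DCA.CODE",
             "JMS.CODE", "JMP.CODE", "IOT.CODE", "OP.CODE"]
```

Every path through the tree costs two or three microinstructions before the first useful work begins, which is the clumsiness that motivates a table lookup in hardware.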
Mapping ROM inputs                Jump
IR0 IR1 IR2  (other inputs)       address  Entry
 0   0   0    X   X   X   X       $10      AND.INST
 0   0   1    X   X   X   X       $11      TAD.INST
 0   1   0    X   X   X   X       $12      ISZ.INST
 0   1   1    X   X   X   X       $13      DCA.INST
 1   0   0    X   X   X   X       $14      JMS.INST
 1   0   1    X   X   X   X       $15      JMP.INST
 1   1   0    X   X   1   0       $16      IOT.ION.INST
 1   1   0    X   X   0   1       $17      IOT.IOF.INST
 1   1   0    X   X   0   0       $18      IOT.INST
 1   1   1    0   X   X   X       $19      OP.G1.INST
 1   1   1    1   0   X   X       $1A      OP.G2.INST
 1   1   1    1   1   X   X       $1B      OP.G3.INST
MICROCODED JUMP TABLE
              ORG   $10; opcode jump table
AND.INST      JUMP  AND.CODE
TAD.INST      JUMP  TAD.CODE
ISZ.INST      JUMP  ISZ.CODE
DCA.INST      JUMP  DCA.CODE
JMS.INST      JUMP  JMS.CODE
JMP.INST      JUMP  JMP.CODE
IOT.ION.INST  JUMP  IOT.ION.CODE
IOT.IOF.INST  JUMP  IOT.IOF.CODE
IOT.INST      JUMP  IOT.CODE
OP.G1.INST    JUMP  OP.G1.CODE
OP.G2.INST    JUMP  OP.G2.CODE
OP.G3.INST    JUMP  OP.G3.CODE
mapping ROM to point directly to the instruction execution code, thereby eliminating
the microcoded jump table.
With the addition of the mapping ROM for decoding instructions, we may
now write the DECODE.INST microcode whose existence we assumed when
we developed the original microcode for the LD30's execute phase:
DECODE.INST EQU *; instruction decoding
JMAP ;; jump to correct instruction processing code
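A mapping ROM is simply a lookup table addressed by the decoding inputs; a row containing don't-care entries (X) must be programmed into every ROM location that it matches. The Python sketch below expands such rows into ROM contents. It uses only the six memory-reference rows, whose patterns follow directly from the operation codes; the 7-bit pattern width and the helper names `expand` and `build_rom` are assumptions for illustration.

```python
from itertools import product

def expand(pattern):
    """Yield every ROM address matched by a pattern of 0, 1, and X."""
    choices = ["01" if ch == "X" else ch for ch in pattern]
    for bits in product(*choices):
        yield int("".join(bits), 2)

def build_rom(rows):
    """Program each (pattern, target) row into all matching locations."""
    rom = {}
    for pattern, target in rows:
        for address in expand(pattern):
            if address in rom:
                raise ValueError(f"{pattern} overlaps an earlier row")
            rom[address] = target
    return rom

# Opcode in the three leftmost bits; microprogram addresses $10 to $15.
ROWS = [("000XXXX", 0x10), ("001XXXX", 0x11), ("010XXXX", 0x12),
        ("011XXXX", 0x13), ("100XXXX", 0x14), ("101XXXX", 0x15)]
ROM = build_rom(ROWS)
```

A JMAP microinstruction then corresponds to the single lookup `ROM[inputs]`, replacing the chain of one-bit tests with one step.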
We have developed the main elements of the LD30's architecture and specified
the control algorithm in high-level terms. Now we will develop the declaration
phase of the microcode, in which we will expand our high-level invocations into
actual command signals directed to the LD30 architecture.
We have tried to choose obvious names for the intermediate operations. For
instance, SEL.MB means "select the MB register as the input to the ALU,"
ALU.PLUS means "cause the ALU to add its operands," and LOAD.AC means
"load the AC with whatever is at its data inputs." At this stage, if we were to
change our minds about some details of our main data path architecture, it is
likely that these declarations would not require alteration.
Next, we must expand the specifications for each of the intermediate-level
commands. Eventually, we will end up with detailed assignments of elementary
signals to microinstruction command bits, but we need not hurry this process.
We may describe the data multiplexer selection codes as follows, using the
same ordering of the multiplexer inputs as in the LD20.
* DATA MULTIPLEXER SELECTIONS
SEL.PC     INV  DATAMUXCTL=0
SEL.MB     INV  DATAMUXCTL=1
SEL.MA     INV  DATAMUXCTL=2
SEL.AC     INV  DATAMUXCTL=3
SEL.SWR    INV  DATAMUXCTL=5
SEL.MEM    INV  DATAMUXCTL=6
SEL.INPUT  INV  DATAMUXCTL=7
SEL.EA     INV  DATAMUXCTL=7, ENABLE.EA
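The INV directive lets one symbolic name stand for a list of field assignments and other invocations, so the assembler must expand names recursively until only command-field assignments remain. Here is a minimal Python sketch of that expansion. The three SEL entries are taken from the declarations above; the definition given for ENABLE.EA, including the field name `EAEN`, is hypothetical, since its real declaration is not shown here.

```python
INVOCATIONS = {
    "SEL.PC": ["DATAMUXCTL=0"],
    "SEL.INPUT": ["DATAMUXCTL=7"],
    "SEL.EA": ["DATAMUXCTL=7", "ENABLE.EA"],
    "ENABLE.EA": ["EAEN=1"],            # hypothetical low-level definition
}

def expand_invocation(name, seen=()):
    """Expand a symbolic name into {command field: value} assignments."""
    if name in seen:
        raise ValueError(f"circular invocation: {name}")
    result = {}
    for item in INVOCATIONS[name]:
        if "=" in item:
            field, value = item.split("=")
            new = {field: int(value)}
        else:                           # another invocation: expand it too
            new = expand_invocation(item, seen + (name,))
        for field, value in new.items():
            if result.get(field, value) != value:
                raise ValueError(f"conflicting values for {field}")
            result[field] = value
    return result
```

Because the expansion stops only at field assignments, the high-level microcode never needs to change when a low-level definition such as ENABLE.EA is revised.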
The field occupies 6 bits. We choose to specify the default assignment so that
whenever we are not specifically controlling the ALU, it will be performing a
PASS operation.
Specifying the control of most of the data registers is simple, since only a
single control signal is involved, but the AC, with its ability to shift and load,
requires two control signals. Here is a representative sample of the declarations:
* SOME DATA REGISTER OPERATIONS
LOAD.MB INV MBLD
LOAD.PC INV PCLD
Finally, the specification of the command bits for the MB, PC, and AC
registers is:
We are nearly done. We may now tabulate the test input signals needed
to drive the microcode, and prepare the necessary declaration statements. Table
10-5 contains the declarations. Each invocation describes the voltage polarity
of the signal and its position within the test multiplexer.
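Each test declaration therefore carries two facts: which input of the test multiplexer the signal occupies, and which voltage level represents truth. A small Python sketch shows how those two facts determine the logical value delivered to the microprogram controller. The class and function names, and the particular position and polarity used in the usage note, are invented for illustration and are not taken from Table 10-5.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Qualifier:
    name: str
    position: int      # input number on the test multiplexer
    true_level: str    # "H" or "L": the voltage that represents truth

def qualifier_value(qualifier, mux_inputs):
    """Return the logical value of one test, given the mux input voltages."""
    return mux_inputs[qualifier.position] == qualifier.true_level
```

For example, a low-true signal declared as `Qualifier("CONT.SW", 3, "L")` is true exactly when input 3 of the multiplexer is at a low voltage, so the microcode can write `IF CONT.SW=%T` without regard to the polarity of the wire.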
Our treatment of the microprogram for the LD30 is finished. We have shown
representative portions of the algorithm and of the declarations that describe the
terms used in the algorithm. How different is the LD30 machine from the LD20?
The main elements of the architecture are the same in both the hardwired
and microprogrammed designs. Several minor architectural features of the LD20
survive in the LD30, and the LD30 contains two major additions: the test input
multiplexer and the jump-address EPROM used in instruction decoding. Figure
10-23 shows the elements of the LD30's architecture. Not much is left.
The LD30 requires about 70 chips; the LD20, 125. The dramatic change
between the two is, of course, in the implementation of the control algorithm.
In moving from the LD20's hardwired control to the LD30's microprogrammed
control, we eliminate about 55 integrated circuit chips and innumerable wires.
The LD30's control algorithm looks like a program, and it is one. It contains
about 120 microprogram instructions and several hundred lines of declarations.
Comments are often useful in the microprogram, but with well-chosen nomenclature
and systematic top-down development, the microcode is largely self-documenting.
In the senior-level hardware laboratory at Indiana University, our students
study, construct, debug, and extend the LD20 hardwired and the LD30 micro-
programmed versions of the PDP-8. With the aid of the Logic Engine, our
students accomplish this in one semester, using wire-wrap technology. The proof
of their performance is to download and execute actual PDP-8 programs-a
source of immense satisfaction for the students and the instructors.
SUMMING UP
Starting in 1964 with IBM's use of microprogramming in their System/360 digital
computers, manufacturers have increasingly adopted microprogramming methods
for the control of computers. It is fair to say that most computer designs now
have at least some microcode in their control-a fact often unknown to the
programmer, since the microcode is not visible. The conventional programmer
works with the computer's machine or assembly language, or with higher-level
languages, without being aware that there is really another layer of programming-
the microcode-buried in the hardware.
[Figure 10-23: elements of the LD30's architecture. Legible labels: HALTFF, Interrupt, IOP, JAMIRO, Memory control, Effective address, AUTO-INDEXING, MB.ZERO, MANUAL.SW.]
The great advantages of microcoding are its uniformity and ease of modification.
Carrying the notion further, we could control all digital tasks with a
single type of controller, such as a Logic Engine. Using microcode, we make
the controller perform a specialized task for each type of device, for instance
executing the PDP-8's instructions in the LD30. With identical copies of the
basic microprogrammable processor, we could control computers, line printers,
card readers, floppy disks, terminals, and other devices having suitable speed
requirements. Each of these would have its own architecture, which would
EXERCISES