Microprogrammed
Design

At the heart of our development of digital design is the algorithm. It is our basic
tool for organizing our thoughts, and we use it to guide the design process. The
actual implementation of the control algorithm is less important than the algorithm
itself. We will accept any reasonable implementation scheme that conforms to
our demands for clarity, simplicity, and regularity. In Part II, we developed
systematic methods of realizing algorithmic state machines using building blocks
of the scale of MSI integrated circuits. Are there other ways to transform ASM
charts into circuits?
From the earliest days of computers, programmers have regarded computers
as machines for executing algorithms. In 1951, Maurice Wilkes* proposed building
a special "computer" for executing algorithmic state machines, with the logic
of the algorithm residing in a special program call1ed a microprogram. Wilkes's
concept, microprogramming, was well ahead of the state of digital technology.
Microprogramming was not used commercially until 1964, when IBM employed
it extensively in the construction of the System/360 series of computers.
The year 1964 also saw the beginnings of the small, inexpensive computer.
In that year the Digital Equipment Corporation introduced the PDP-8 minicomputer,
which was the first CPU inexpensive enough to be dedicated to running algorithms
to control a particular device. The PDP-8 and its successors and imitators
enjoyed wide use for more than ten years in sophisticated digital control applications,
although engineers looked upon these uses as simply an extended form of
conventional programming. In 1974 another wave of technology produced the
dramatically less expensive CPUs that we call microprocessors and microcomputers.
The ensuing explosion of applications will continue in the foreseeable future.
Unfortunately, many people equate the microprocessor with the concept of
microprogramming, a serious misconception. Today's microprocessors and
microcomputers are inexpensive, small, conventional computers, programmed in a
conventional way. Microprogramming represents a different approach to
programming, and although many computers are constructed with the aid of
microprogramming techniques, the conventional software programmer normally does
not use the technique. To separate these ideas more clearly, we refrain from
using the sadly diluted "microcomputer" and "microprocessor" names when
referring to microprogrammed devices. Instead, we will speak of a
"microprogrammed controller," or "microcontroller."

* M. V. Wilkes, "The best way to design an automatic calculating machine," Manchester
University Computer Inaugural Conference, 1951, p. 16. A more accessible reference is M. V.
Wilkes, "The growth of interest in microprogramming: a literature survey," Computing Surveys,
Vol. 1, September 1969, pp. 139-145.

CLASSICAL MICROPROGRAMMING
Wilkes recognized the fundamental separation between controller and architecture
and was able to contemplate new and systematic ways of implementing the
control function. Although formal ASM charts had not yet been invented, Wilkes
proposed a machine whose fundamental operation was the execution of the ASM
state in Fig. 10-1. This standard state has at most one test variable, and may
have none.

Figure 10-1 The ASM state executed by Wilkes's microprogramming machine.

Wilkes proposed to use diodes for his machine. At that time, diodes were
used to construct logic AND and OR functions. Although we no longer use
diodes for this purpose, to appreciate Wilkes's proposal you should understand
that diode construction provides a wired-OR capability similar to that of the
modern open-collector gate.
We will describe Wilkes's method by means of an example. Consider the
simple ASM in Fig. 10-2. Let's proceed with a clocked implementation of the
state generator using an encoded state assignment of the usual sort; Fig. 10-2
shows an arbitrary assignment of state variables B and A for the three states.
Implementing the algorithm calls for constructing the new values of the state
variables, B(NEW) and A(NEW), which serve as inputs to the clocked state flip-

372 Bridging the Hardware-Software Gap Part III


Figure 10-2 An ASM with three states, to illustrate classical microprogrammed control. State variables: BA.

flops. If we decode the current state variables, we may produce logic signals
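S0, S1, and S2 for the individual states. A routine examination of the ASM
leads us to the following equations for the various outputs required in the
implementation:

$$
\begin{aligned}
B(\mathrm{NEW}) &= S0\cdot\overline{X} + S2\cdot\overline{Y} \\
A(\mathrm{NEW}) &= S0\cdot X \\
P &= S0 + S2 \\
Q &= S1 \\
R &= S0\cdot\overline{X}
\end{aligned}
$$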

Wilkes proposed a systematic way of implementing these equations, which
we will model as follows. Arrange the test inputs on vertical wires, with individual
state signals emerging from the decoder on horizontal wires. Then, with diode
AND gates, produce the branch path terms required for each state and send
these horizontally to the right for use in constructing the command and next
state functions. For instance, in the first step, the scheme accepts state signal
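S0 and test input X, and produces $S0\cdot X$ and $S0\cdot\overline{X}$. Rather than use notations
for diode or simple AND gates, let us suppress the detail in order to emphasize
the standard, systematic structure of the design. We use a special square symbol
with two outputs:

[Square symbol: state signal S0 enters on its horizontal wire and crosses the vertical X wire; the two outputs are $S0\cdot X$ and $S0\cdot\overline{X}$.]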


Chap. 10 Microprogrammed Design 373
One such operation in each branching state will produce all the product terms
required by the logic equations. If a state has no test input, we omit the square
symbol, sending the raw state signal on to the right.
To merge the terms into final expressions, we must have a systematic way
of performing the logic OR operation of the appropriate product terms. The
product terms all appear on horizontal lines, so we will arrange the output signals
on a second set of vertical wires, to the right of the input lines. Taking advantage
of the wired OR capability of diodes, we add a diode at wire crossings where
we require the OR operation. For simplicity, we show this as a dot (•). The
hardware will produce the OR of all the dotted horizontal signals on a vertical
wire. Figure 10-3 shows the implementation for the ASM of Fig. 10-2, using
our square and dot notations for the logic operations.
Wilkes's idea was masterful. Here is a systematic method of building up
the random logic needed to implement any ASM with no more than one branch
per state. There is another powerful way of viewing the diode matrix, which
shows its close connection to conventional programming and which led to the
term microprogramming. The orderly arrangement of the matrix of wires in Fig.
10-3 suggests a lookup table. From the discussion of ROMs in Chapter 4 you
saw that we could represent any Boolean expression as a table in which we look
up the result for a particular set of values of the input variables. This was what
Wilkes did, using the technology of his day; he implemented the next-state
computation as a "table" of diodes. Each row at the right in Fig. 10-3 corresponds
to a table entry for a particular branch path of the algorithm. The table entry
describes the values of each output element in the system: the new values of
state variables and values for each command variable. We can interpret the
state machine as a primitive computer whose instruction set executes only the
ASM branch operation, and whose instructions come from a special bit pattern
program, a microprogram. Each row in Fig. 10-3 is an instruction in the
microprogram; in each row, the OR dots represent 1-bits, and the absence of a
dot represents a 0-bit.

Figure 10-3 A diode-based microprogrammed implementation of Fig. 10-2.
(A decoder driven by state flip-flops B and A feeds the rows $S0\cdot X$, $S0\cdot\overline{X}$, S1, $S2\cdot Y$, and $S2\cdot\overline{Y}$; the output columns are P, Q, R, B(NEW), and A(NEW).)

Why is this viewpoint powerful? We have substituted bits of memory for
a random collection of gates. Since the bits of the diode array have a close
correspondence to the ASM structure, and are out in the open, we may readily
understand and manipulate them. Changing a Wilkes microprogram merely involves
changing some diodes in a systematic manner.
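
To make the table view concrete, here is a minimal Python sketch of a matrix like that of Fig. 10-3. It is an illustration only, not Wilkes's circuit: the names `MATRIX`, `COLUMNS`, and `step` are ours, and the state codes (S0 = 00, S1 = 01, S2 = 10) and the dots in each row are our reading of the example, so treat them as assumptions.

```python
# Hypothetical model of the diode matrix for the Fig. 10-2 example.
# State codes (B, A) are an assumed assignment.
STATE_CODES = {"S0": (0, 0), "S1": (0, 1), "S2": (1, 0)}
COLUMNS = ("B_NEW", "A_NEW", "P", "Q", "R")

# One row per branch path: (state, qualifier or None, required value, dotted columns).
MATRIX = [
    ("S0", "X", 1, {"A_NEW", "P"}),
    ("S0", "X", 0, {"B_NEW", "P", "R"}),
    ("S1", None, None, {"Q"}),
    ("S2", "Y", 1, {"P"}),
    ("S2", "Y", 0, {"B_NEW", "P"}),
]


def step(b, a, inputs):
    """Return the value on every output column for state (b, a) and the inputs."""
    out = dict.fromkeys(COLUMNS, 0)
    for state, qualifier, value, dots in MATRIX:
        selected = STATE_CODES[state] == (b, a)                  # decoder line active
        passes = qualifier is None or inputs[qualifier] == value  # square (AND) symbol
        if selected and passes:
            for column in dots:                                   # wired OR on the column
                out[column] = 1
    return out
```

Changing the microprogram in this model amounts to editing the dot sets in `MATRIX`, much as one would move diodes in the array.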

CLASSICAL MICROPROGRAMMING
WITH MODERN TECHNOLOGY
Let us now explore the design of microprogrammed controllers with modern
devices. We will find that this causes a few changes in Wilkes's scheme, but
only in the details. At the conceptual level, microprogramming remains the
realization of controllers by means of tables rather than gates, an idea that
survives from Wilkes. We may reduce any ASM chart to a table with inputs of
current state and status variables and outputs of next-state variables and commands.
If we translate an ASM chart into a table and then implement the table directly,
using hardware lookup techniques, we are microprogramming in the broadest
sense. The essential step is the direct implementation of the table, bypassing
gates, Boolean algebra, and so on.
To simplify our treatment, we adopt for the present a uniform representation
of logic truth in the microcode. The customary convention is positive logic, with
T = H; we adhere to this convention in this section. In software programming,
the choice of convention is of no consequence to the programmer, who is not
dealing with voltage. In microprogramming, where we remain close to the hardware,
this choice of positive logic will create problems, and we will return to discuss
how to reinsert the full power of mixed logic into the microcode.
If our microcontroller is to implement really large algorithms, comparable
to computer programs, we may need a sizable memory to hold the microinstructions
for all the branch paths of the ASM. At this stage, ROM is a good choice since
its contents remain intact even when the power is off. This means that the
algorithm is instantly available when the power goes on, which seems quite
desirable. The absence of inexpensive, fast ROM was the stumbling block in
implementing microprogrammed control after its introduction, and many years
of research ensued before the development of practical devices; it is no longer
a problem.
In the testing of status variables newer technology has forced some changes
in Wilkes's scheme. In the jargon of microprogramming, ASM variables are
called qualifiers. In Fig. 10-3, we supply the current 2-bit state address B,A
that we decode into individual signals for each state. After this decoding of the
state address, we incorporate the qualifier tests using AND gates, to create one

line for each decision path in the ASM. There are more branch paths than states,
and each branch path results in one microinstruction.
As long as we implement the micromemory bit by bit with diodes, we can
dive into the hardware following the address-decoding stage and insert or remove
AND gates as needed. But ROMs and RAMs come as indivisible integrated
circuits-the designer has access to address inputs and memory outputs, but to
none of the interior circuitry. With RAM or ROM, the decoder of Fig. 10-3 is
inside the device, and we have no way to get in to insert the AND gates. We
need some other way to test the status variables in any given state, while
preserving the table-driven nature of microcoded design.

Microprogramming with Multiple Qualifiers per State

The only way to access microinstructions stored in RAM or ROM is through
the address inputs. The important elements in identifying a microinstruction are
the current state and the test inputs or qualifiers. We might construct the ROM
address from these two sources of bits: the current state code, obtained from
the state flip-flops, and the individual qualifier signals, using one address bit for
each qualifier. The size of the address field is the sum of the number of state
flip-flops and the number of test inputs used in the design. For n address bits,
a ROM contains $2^n$ words. Since the ROM's size grows exponentially with the
number of qualifiers, this method rapidly gets out of hand.
In this approach, the value of every qualifier contributes to each next
instruction address. This is highly redundant addressing, since most of the
combinations of qualifiers are of no interest. For ASMs built from such states
as are shown in Fig. 10-1, at most one qualifier is needed in each state, yet the
microinstruction address includes all the qualifiers. In general, every address is
possible, so each word in the ROM must contain a valid microinstruction. Each
microinstruction must supply the value of all command outputs and the value
of the next ROM address:

+--------------------+-----------------+
| Next-state address | Command outputs |
+--------------------+-----------------+

To illustrate this approach, let's again implement the small algorithm in
Fig. 10-2, this time basing it on a ROM. (In practice, we would choose PROM,
EPROM, or RAM, to permit more ready modification of the microcode during
the development phase of the design. After the design has stabilized, we could
then find a manufacturer to produce the ROMs if our production volume warranted
this step. Let's use PROM in this example.) There are two state variables and
two qualifiers, so we must have at least 4 ROM address bits, resulting in sixteen
microinstructions. We begin by exhaustively enumerating the outputs required
for each of the sixteen instructions:

   Address               Contents
B  A  X  Y     B(NEW)  A(NEW)  P  Q  R

0  0  0  0       1       0     1  0  1
0  0  0  1       1       0     1  0  1
0  0  1  0       0       1     1  0  0
0  0  1  1       0       1     1  0  0
0  1  0  0       0       0     0  1  0
0  1  0  1       0       0     0  1  0
0  1  1  0       0       0     0  1  0
0  1  1  1       0       0     0  1  0
1  0  0  0       1       0     1  0  0
1  0  0  1       0       0     1  0  0
1  0  1  0       1       0     1  0  0
1  0  1  1       0       0     1  0  0
1  1  0  0       0       0     0  0  0
1  1  0  1       0       0     0  0  0
1  1  1  0       0       0     0  0  0
1  1  1  1       0       0     0  0  0

Next, we implement the table directly in a system of sixteen 5-bit words. Figure
10-4 is a sketch of the circuit. All lines into the PROM are address inputs; of
the 5 output bits from the PROM, 2 are inputs to the state flip-flops, and 3 are
command outputs to the architecture.
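
As a hedged illustration of the multiple-qualifier method, the following Python sketch fills a sixteen-word table from branch logic and then clocks it. The function names, the address bit order B A X Y, the state codes (S0 = 00, S1 = 01, S2 = 10), and the logic inside `microinstruction` are our assumptions about the example, not a listing taken from the figure.

```python
def microinstruction(b, a, x, y):
    """Contents (B_NEW, A_NEW, P, Q, R) of the word at address B A X Y."""
    s0 = (b, a) == (0, 0)
    s1 = (b, a) == (0, 1)
    s2 = (b, a) == (1, 0)
    b_new = (s0 and not x) or (s2 and not y)
    a_new = s0 and x
    p = s0 or s2
    q = s1
    r = s0 and not x
    return tuple(int(bool(bit)) for bit in (b_new, a_new, p, q, r))


# Exhaustive enumeration: one word for every combination of state and qualifiers.
ROM = [
    microinstruction((addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1)
    for addr in range(16)
]


def run(inputs_per_clock, state=(0, 0)):
    """Clock the controller once per (x, y) pair; record (state, P, Q, R)."""
    trace = []
    b, a = state
    for x, y in inputs_per_clock:
        addr = (b << 3) | (a << 2) | (x << 1) | y   # all PROM inputs are address lines
        b_new, a_new, p, q, r = ROM[addr]
        trace.append(((b, a), p, q, r))
        b, a = b_new, a_new                          # state flip-flops load on the clock
    return trace
```

Note that the controller itself does nothing but look up a word; all of the algorithm lives in the contents of `ROM`.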
This approach to microprogrammed design is conceptually straightforward
but requires enormous ROMs as the algorithm becomes more complex. There
is an added benefit, however. Our original treatment based on Wilkes's work
allowed us to implement ASMs containing at most one test variable per state.
Since the present approach requires us to create a microinstruction for every
possible combination of qualifier values for each state, we are automatically able
to implement an ASM of arbitrary complexity; hence the name "multiple qualifier."
No matter how complicated the branch path through a state, it corresponds to
some microinstruction in this ROM-based design. Note the strong similarity to
the ROM-based implementation of logic circuits discussed in Chapter 4.
As an exercise in using the multiple qualifier method, you might wish to
implement the Black Jack Dealer machine of Chapter 6, using microprogrammed
control. Suppose you use the same architecture as in the hardwired solution
presented in Chapter 6. In Fig. 6-32, you can identify two state variables (say
B and A) and eight qualifiers (CARD.RDY.SYNC, CARD.RDY.DELAYED,
STAND, BROKE, ACECARD, ACE11FLAG, SCOREGT16, and SCOREGT21),
so the ROM address field will contain 10 bits. The number of microinstructions
is $2^{10} = 1024$! The eleven command signals (including the adder select signals)
together with the two inputs to the state-variable flip-flops require that each ROM
word have 13 bits. You will need a system containing a 1K × 13 ROM.
You may wish to write down the contents of some of the 1024 microinstructions
to solidify your grasp of the concepts in the multiple qualifier method. You
can appreciate the tedium of using this "simple" method to implement a complex
algorithm manually.

Figure 10-4 A PROM-based microprogrammed implementation of Fig. 10-2.
(The PROM address inputs come from state flip-flops B and A and from status inputs X and Y; its outputs are B(NEW), A(NEW), and the command outputs P, Q, and R to the architecture.)

What is your reaction to this approach? Ours is:
(a) It is a straightforward but tedious implementation of a general ASM.
(b) The tables are very large, even for relatively small problems.
(c) The method would be feasible only with inexpensive ROMs or PROMs.
The problem is that the address for the ROM is a concatenation of a small
number of encoded state-variable bits and a large number of individual qualifier
bits. The state variables are important at all times in the execution of the
algorithm, but each qualifier appears only occasionally in the ASM. Most of the
time, the algorithm is indifferent to the value of most of the qualifiers, yet we
must enumerate each combination. We are forced to use a canonical form of
truth table rather than the compact form allowed by the typical ASM. This
ROM-based method is feasible if you have a "smart" PROM programmer that
can accept your logic equations and expand them into a canonical truth table.
But in most applications another approach might be better.

One Qualifier per State

The Wilkes scheme tests only one qualifier per state. The address is formed
from the state variables alone, requiring the decoding of only a small number
of variables. The scheme implements the qualifier tests with AND gates inserted
in an orderly manner inside the circuit, following the decoding of the address.
You have seen that using a ROM precludes this method, and our first attempt
was to move all the qualifier signals out into the address field. This allowed us
to implement general ASMs, but at a severe penalty. We would like to remove
the qualifier variables from the ROM address field so that we can eliminate
redundant microinstructions.
Providing properly sequenced command outputs to the architecture is the
purpose of an ASM. In microprogramming, the command outputs arise from
bits in the microcode. In addition to these command bits, microprogramming
instructions also provide the new values of the state variables. This gives us a
clue: our microcode has two components, an external one (command outputs)
and an internal one (the next microinstruction address). Perhaps by enlarging
the internal portion of the microinstruction we can shrink the number of instructions.
We are striving to develop a method of handling large problems, and we may
have to compromise the generality of the ASM structures that our method will
handle.
Let's start instead with the most elementary useful ASM operation; one
qualifier per state with no conditional outputs. Further, let's try to realize each
state with only one microinstruction. In such a scheme, the ROM address would
consist solely of the state variables, which would select the proper microinstruction.
This instruction must contain sufficient information to guide the development of
the next-state address. In particular, the instruction itself must specify which
qualifier this particular state is testing.
If we organize all the qualifiers in a list, we may designate any qualifier
by its index n in the list. The ASM structure we are trying to realize is

[ASM decision box testing X(n), with a true exit (T) and a false exit (F)]

where X(n) is one of the qualifiers X. Let's include this index n (the qualifier
index) in the microinstruction. Then, in our single microinstruction for state
i, we must also include the next-state addresses j and k for the true and false
branches. Each microinstruction in our ROM will now look like:

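+-------------------+-----------------+------------------+-----------------+
| Qualifier index n | True address TA | False address FA | Command outputs |
+-------------------+-----------------+------------------+-----------------+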

Each word is now wider than in the multiple qualifier method, since it includes
the index field and an extra address field, but our microinstruction table is reduced
to one row per state.
To execute a microinstruction, our primitive microcontroller must be able
to select the proper X(n) using the value of n in the instruction. Based on the
present value of the selected test input X(n) the processor must choose one of
the two address fields as input to the state flip-flops. Let's construct this selection
hardware. Index n is an address in a table, so we may use a multiplexer building
block with n as the code for the test input selection. The output of this multiplexer
is a variable X(n), whose value must specify either the true or the false address.
We can perform this last selection with a set of two-input multiplexers, using
X(n) as the select input. Figure 10-5 is the circuit for our processor.

Figure 10-5 A primitive microprogram controller with an input qualifier index and both true and false jump addresses.
(The ROM word supplies the index n, the addresses TA and FA, and the command outputs; a test input selector picks X(n) from X(0), X(1), and so on; one two-input multiplexer and one flip-flop serve each address bit.)

Figure 10-6 Figure 10-2 redesigned to have no conditional outputs. State variables: BA.

To illustrate the use of this microprogrammable machine to execute an
algorithm, we will write the microcode for the ASM of Fig. 10-6, which is a
variant of Fig. 10-2 modified to eliminate the conditional output. Assigning
indices 0 and 1 to the qualifiers X and Y, respectively, yields the following four
microinstructions:

                  Code
Address    n   TA   FA    P  Q  R

   0       0    1    3    1  0  0
   1       0    0    0    0  1  0
   2       1    0    2    1  0  0
   3       0    2    2    0  0  1

TA and FA each require 2 bits, n has 1 bit, and there are three command outputs,
so this design would require four words of 8-bit ROM.
Unconditional state transitions occur in instructions 1 and 3. Since each
microinstruction must specify the index n of some test variable, we simply choose
any index and make both TA and FA point to the same next instruction.
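
The behavior of this two-address controller can be sketched in a few lines of Python. This is a model under assumptions, not the circuit of Fig. 10-5: the tuple layout and the name `clock` are ours, and the command bits in each row (in particular R asserted in instruction 3) reflect our reading of the microcode.

```python
# Microcode as (n, TA, FA, (P, Q, R)), indexed by ROM address.
MICROCODE = [
    (0, 1, 3, (1, 0, 0)),  # address 0: test X; true -> 1, false -> 3
    (0, 0, 0, (0, 1, 0)),  # address 1: unconditional (TA == FA) -> 0
    (1, 0, 2, (1, 0, 0)),  # address 2: test Y; true -> 0, false -> 2
    (0, 2, 2, (0, 0, 1)),  # address 3: unconditional (TA == FA) -> 2
]


def clock(address, qualifiers):
    """One clock cycle: return (next address, command outputs)."""
    n, ta, fa, commands = MICROCODE[address]
    selected = qualifiers[n]                 # test input selector picks X(n)
    next_address = ta if selected else fa    # two-input multiplexers choose TA or FA
    return next_address, commands
```

For example, `clock(0, [0, 0])` follows the false branch from address 0, while `clock(3, ...)` goes to address 2 whatever the qualifier values are, because both address fields agree.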
Why did we eliminate the ASM's conditional output from our single-qualifier
scheme? Conditional outputs arose naturally in Wilkes's scheme and in the
multiple-qualifier approach. But here we have exactly one microinstruction per
ASM state, and all the information for the execution of that state must reside
in that microinstruction. If we were to permit conditional outputs, we would
need a way to designate, for each command output bit, whether it is to be
asserted unconditionally, or only on the true branch, or on the false branch, or
not at all. This would require 2 microinstruction bits per command output, a
considerable burden on the hardware. One of the virtues of the present method
is its simplicity of form. Another reason for eliminating conditional outputs will
surface later.
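
To see where the figure of 2 bits per command output comes from, consider this small Python sketch of the scheme we have just declined to adopt. The encoding in `Mode` is purely hypothetical, as are the function names, and the S0 example (P always, R only on the false branch) is our reading of Fig. 10-2.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Hypothetical 2-bit code attached to one command output."""
    NEVER = 0b00
    ALWAYS = 0b01
    ON_TRUE = 0b10
    ON_FALSE = 0b11


def asserted(mode, qualifier_value):
    """Decide whether a command is asserted for the selected qualifier value."""
    if mode == Mode.ALWAYS:
        return True
    if mode == Mode.ON_TRUE:
        return bool(qualifier_value)
    if mode == Mode.ON_FALSE:
        return not qualifier_value
    return False


def outputs(modes, qualifier_value):
    """Command outputs for one microinstruction given the tested qualifier."""
    return {name: int(asserted(mode, qualifier_value)) for name, mode in modes.items()}


def command_field_bits(commands, conditional):
    """Width of the command field with or without conditional outputs."""
    return commands * (2 if conditional else 1)


# A state like S0 of Fig. 10-2 under this hypothetical scheme.
S0_MODES = {"P": Mode.ALWAYS, "Q": Mode.NEVER, "R": Mode.ON_FALSE}
```

With three commands the command field would grow from 3 bits to 6, and the controller would need extra decoding logic on every output line.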

Single-Qualifier, Single-Address Microcode

The preceding single-qualifier structure is feasible, but the two address fields
can consume considerable space in the microinstruction. We can eliminate one
address field if we adopt a rule for inferring that address from the present address.
The obvious choice, which conforms closely to the practice used in conventional
computers, is to insist that one of the branch addresses be the next sequential
address. In the single-qualifier method, state assignments correspond to
microinstruction addresses, and since the state assignments are at our disposal, we
may use normal sequencing to save bits in the microinstruction. The ASM
operation reflecting this modification of the single-qualifier scheme is shown in
Fig. 10-7a.
The microinstruction now contains one jump address, the qualifier index,
and the command output bits. In the version of ASM in Fig. 10-7a the jump
address is always the true path. We can enhance the versatility of this approach
by adding one more bit to the microinstruction, to allow the microprogrammer
to specify which path, true or false, the jump address refers to. Now the format
for the microinstruction is

+---------+-------+-----------------+-----------------+
| Index n | TFBIT | Jump address JA | Command outputs |
+---------+-------+-----------------+-----------------+

This form of microprogramming implements the basic ASM operation in Fig.
10-7b.

Figure 10-7 ASMs for single-qualifier, single-jump address microinstructions.
(a) Microinstruction branches on a true qualifier. (b) Microinstruction allows
selection of the jump condition.

Now consider how we might build the processor for this method. Since
we have eliminated one of the address fields, we have also removed the need
for the two-input multiplexers on the state flip-flop inputs in Fig. 10-5. Instead,
we need to be able to increment the current address whenever the test variable
value is the opposite of TFBIT in the microinstruction. We may incorporate this
operation into our state flip-flop assembly by replacing the simple flip-flops with
a programmable binary counter building block. Our microcontroller processes
a microinstruction in each clock cycle, so we are always either branching, which
corresponds to loading a new value into the counter, or sequencing the counter.
With the aid of the ASM in Fig. 10-7b, we may derive the condition for
loading the counter. We load the counter whenever the next address is to be
the microinstruction jump address JA:

$$
(\mathrm{NEXT.ADDRESS} = JA) = X(n)\cdot\mathrm{TFBIT} + \overline{X(n)}\cdot\overline{\mathrm{TFBIT}} = X(n) \odot \mathrm{TFBIT}
$$

We sequence the counter whenever we do not jump. An implementation of this
microprogrammable controller is shown in Fig. 10-8.

Figure 10-8 A counter-based implementation of the microinstruction control in Fig. 10-7b. (The counters are 74LS163s.)

Another convenience shown in Fig. 10-8 is making test input position 0 a
permanent true signal. The microprogrammer may then execute an unconditional
branch (no test variable) by designating n = 0 and TFBIT = 1 in the
microinstruction. Conversely, unconditional sequencing to the next instruction occurs
when n = 0 and TFBIT = 0.
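
The load-or-count decision is small enough to state directly in Python. The sketch below is a behavioral model only; the function names, the list of test inputs, and the 4-bit wraparound (chosen to resemble a single 4-bit counter) are our assumptions.

```python
def load_counter(x_n, tfbit):
    """True when the counter loads JA: the selected input matches TFBIT."""
    return x_n == tfbit


def next_address(address, n, tfbit, ja, test_inputs, address_bits=4):
    """Next microinstruction address for the single-address format."""
    inputs = (1,) + tuple(test_inputs)       # position 0 is the permanent true signal
    if load_counter(inputs[n], tfbit):
        return ja                            # branch: load the jump address
    return (address + 1) % (1 << address_bits)   # sequence: count, wrapping at the top
```

With n = 0 the selected input is always 1, so TFBIT alone decides between an unconditional branch (TFBIT = 1) and unconditional sequencing (TFBIT = 0).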
Having constructed this sophisticated single-qualifier controller, let's use
it for the simple ASM in Fig. 10-6. Earlier, when we used the single-qualifier
method with two address fields for this algorithm, our exact choice of state
assignment was unimportant. (Why?) In the present method, we are required
to make one of the exits from each state sequential. We encounter difficulty
with the assignment in Fig. 10-6 since, as it happens, neither branch from state
S2 is sequential. To avoid this problem, we may renumber the states in Fig.
10-6 so that S0 = 2, S1 = 0, S2 = 1, and S3 = 3. If we assign indices 1 and
2 to qualifiers X and Y, respectively, the microcode for the program is

                 Index           Jump address    Command outputs
State  Address     n     TFBIT       JA              P  Q  R

 S1       0        0       1          2              0  1  0
 S2       1        2       0          1              1  0  0
 S0       2        1       1          0              1  0  0
 S3       3        0       1          1              0  0  1
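
We can check a microprogram of this kind by stepping through it in software. The following Python sketch is one such check, with assumptions stated plainly: the rows follow our reading of the microcode for the renumbered states, index 0 is the permanent true input, and the names `MICROCODE` and `run` are ours.

```python
# (n, TFBIT, JA, (P, Q, R)) indexed by address; test inputs are (true, X, Y).
MICROCODE = [
    (0, 1, 2, (0, 1, 0)),  # S1: unconditional jump to S0
    (2, 0, 1, (1, 0, 0)),  # S2: jump to itself while Y is false, else sequence to S0
    (1, 1, 0, (1, 0, 0)),  # S0: jump to S1 if X is true, else sequence to S3
    (0, 1, 1, (0, 0, 1)),  # S3: unconditional jump to S2
]


def run(start, inputs_per_clock):
    """Return the (address, commands) pair seen in each clock cycle."""
    address, trace = start, []
    for x, y in inputs_per_clock:
        n, tfbit, ja, commands = MICROCODE[address]
        selected = (1, x, y)[n]                       # test input selector
        trace.append((address, commands))
        if selected == tfbit:
            address = ja                              # load the counter
        else:
            address = (address + 1) % len(MICROCODE)  # count to the next address
    return trace
```

Starting in S0 (address 2) with X true, the trace visits addresses 2 and 0 and returns to 2, which is the S0, S1, S0 path of the ASM.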

As another illustration, we could implement the Black Jack Dealer of Chapter
6 using the sophisticated single-qualifier scheme. The ASM in Fig. 6-32 is not
suitable, since it contains conditional outputs and states with multiple tests.
Converting this ASM to an appropriate form is a useful exercise. In most cases,
we may convert a conditional output to an unconditional one by creating a new
state for the output. This simple method will not work when the exact timing
of the conditional output is crucial; an example is in the GET state of Fig.
6-32. If we create a separate state for the HIT output, the hit light will blink
on and off as the ASM loops around the two-state loop. We will deal with this
particular problem presently.
States with multiple tests also require modification before we can use the
single-qualifier method. Where the timings are not critical, we may create new
states to perform each individual test. This approach would handle all cases in
Fig. 6-32 except the first two tests in the GET state. You will recall that this
structure is a manifestation of the familiar single pulser; it requires that
CARD.RDY.SYNC and CARD.RDY.DELAYED be tested simultaneously. For
the Black Jack Dealer algorithm, the solution, which also solves the problem of
the HIT output, is to use one of the other forms for describing the action of the
single pulser. (Notice how valuable was the knowledge that this troublesome
structure represented a standard design element. How much more difficult the
analysis would have been without this knowledge!)
Figure 10-9, the result of the ASM transformation, is a version of the Black
Jack Dealer algorithm that we may implement as a single-qualifier microprogram.
Figure 10-9 contains an address (state) assignment that makes good use of the
requirement that one branch of each state must lead to the next sequential state.
We have made an arbitrary choice of the qualifier index, and have shown the
index in brackets in each test in the ASM.
Here is the microcode for the Black Jack Dealer of Fig. 10-9, without the
command outputs. You should find it easy to include command bits in each
microinstruction. What is the size of the ROM required by this microprogram?

          Index           Jump address
Address     n     TFBIT       JA

   0        1       1          0
   1        1       0          1
   2        2       1         11
   3        3       1         11
   4        4       1         12
   5        6       0          0
   6        7       0         10
   7        5       0          9
   8        0       1          5
   9        0       1          0
  10        0       1          0
  11        0       1          4
  12        5       1          5
  13        0       1          5

Comparison of the Microprogramming Approaches

We have considered microprogramming from two viewpoints. How does the
single-qualifier approach compare with the multiple-qualifier method? Some facts
to consider are:

(a) The single-qualifier method has a one-to-one correspondence with an ASM
chart; the multiple-qualifier method does not.
(b) The single-qualifier method requires much less microcode.
(c) The single-qualifier method will handle large problems easily; the
multiple-qualifier method cannot tolerate many qualifiers before it becomes
unmanageable.
(d) The multiple-qualifier method handles a completely general ASM; the
single-qualifier method handles only a special case.
(e) The single-qualifier method requires more special hardware in the
microcontroller, although the multiple-qualifier method requires the larger ROM.
From these points we may draw some conclusions. The single-qualifier
method is clearer and easier to manage. It is more likely to result in correct
microcode the first time. Field-service personnel are more likely to understand
single-qualifier programs and they are much easier to document.
Item (c) is probably decisive, but even so, our sense of style leads us to
recommend the single-qualifier method for most applications.
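
A rough calculation makes the size comparison tangible. The Python sketch below computes ROM dimensions for the three microinstruction formats of this chapter; the function names are ours, and the single-address figure assumes three test inputs (the permanent true signal, X, and Y), which is our own accounting rather than a number quoted in the text.

```python
def bits_for(count):
    """Bits needed to encode `count` distinct values (at least 1)."""
    return max(1, (count - 1).bit_length())


def multiple_qualifier(state_bits, qualifiers, commands):
    """(words, bits per word): every qualifier is a ROM address bit."""
    return 2 ** (state_bits + qualifiers), state_bits + commands


def two_address(states, qualifiers, commands):
    """(words, bits per word): index n, TA, FA, and command bits."""
    return states, bits_for(qualifiers) + 2 * bits_for(states) + commands


def single_address(states, test_inputs, commands):
    """(words, bits per word): index n, TFBIT, JA, and command bits."""
    return states, bits_for(test_inputs) + 1 + bits_for(states) + commands


def total_bits(shape):
    """Total ROM bits for a (words, bits per word) pair."""
    words, width = shape
    return words * width


SMALL_MULTI = multiple_qualifier(state_bits=2, qualifiers=2, commands=3)
SMALL_TWO_ADDR = two_address(states=4, qualifiers=2, commands=3)
SMALL_ONE_ADDR = single_address(states=4, test_inputs=3, commands=3)
BLACKJACK_MULTI = multiple_qualifier(state_bits=2, qualifiers=8, commands=11)
```

For the small example the differences are modest, but the multiple-qualifier Black Jack Dealer already needs 1024 words of 13 bits, and each further qualifier doubles the word count.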

MOVING TOWARD PROGRAMMING


Figure 10-9 The Black Jack Dealer ASM revised for microprogrammed implementation.
(Among its elements are the tests CARD.RDY.SYNC [1], ACE11FLAG [5], SCOREGT16 [6],
and SCOREGT21 [7]; state 13, which selects ADD10, loads SCORE, and sets ACE11FLAG;
and state 8, which selects SUB10, loads SCORE, and clears ACE11FLAG.)

As we have developed the concepts of microprogramming in this chapter, our
language and our emphasis have come ever closer to those of the software
programmer. We began by emphasizing hardware, looking for ways to systematize
the design process to handle large problems. We followed Wilkes through his
discovery of table-driven hardware controllers. We finally arrived at a design
of a machine that executes only a restricted form of ASM operation but that
executes all such operations in a systematic manner. We call the specification
of each standard ASM state a "microinstruction," and we refer to the state-
variable values as a "memory address." We think in terms of writing a set of
instructions-a microprogram-to describe an algorithm for this specialized
machine, and we place the program into a memory.
It sounds like programming, but how far have we gone? Microprogramming
is a middle ground between hardwired design and conventional programming,
drawing advantages from each. From the viewpoint of the designer of hardware,
microprogramming offers a way to tackle large and complex control problems.
It retains much but not all of the speed and capability of parallel action that is
characteristic of hardwired ASM implementations. The single-qualifier approach
to microprogramming permits the designer simultaneously to receive status
information, control the flow of the algorithm, and issue detailed commands to
the controlled device. We lose multiple branches and conditional outputs and
also a bit of speed, but we gain a compact, highly structured, easily modified
method of formulating and implementing algorithms. At the other end of the
microprogramming spectrum, the multiple-qualifier approach costs nothing in
speed or in the flexibility of the algorithm, but tends to overpower the designer
with its exponentially increasing size of microcode.
The software programmer sees microprogramming as an entry into the field
of hardware. The serial, one-step-at-a-time nature of conventional programming
gives way to a more parallel but still program-oriented approach. The
microcontroller has a more primitive command structure than a conventional computer,
yet the simple but parallel nature of its operations makes for much greater speed
and versatility.
Why do we stress the programming aspect? Why has microprogramming
come to be considered a conceptual breakthrough in hardware design? The
answers lie in the great store of experience that computer science has gained in
using programs to emulate algorithms. Conventional programmers have a host
of software tools and strategies that we can use in microprogramming. By
transforming a hardware problem into the programming domain, we may look
forward to using editors, assemblers, language translators, and debugging aids
in support of the development of our microcode.
Let us try, then, to borrow from programming concepts to expand the
usefulness of microprogrammable controllers, without detracting from their inherent
power as emulators of hardware algorithms.

Cleaning Up the Outputs

Throughout our development of microprogramming, the microinstruction memory
(ROM, PROM, or RAM) has been the source of command outputs to the
architecture and control signals to the next-state controller. Unfortunately, these
memories undergo relatively long periods of instability when their address inputs
change, and so the signals emerging from the memory outputs have undesirable
voltage characteristics. Our circuits for sequencing microinstructions have worked
despite this drawback, since the changes in the memory addresses have been
synchronized with the system clock. The architecture is exposed to the impurities
of the command outputs, but since it is driven by the same clock as the control
unit, the designer may use these outputs reliably to feed the clocked architectural
elements. The command outputs are unsatisfactory for nonclocked uses, such
as serving as control signals to the world outside the clocked design. In these
important cases, the designer must purify the command outputs, usually by
passing them through a clocked flip-flop to assure a clean output.
The microinstruction memory has served as an instruction register, but in
recent practice the role of instruction register is removed from the microprogram
memory and is assumed by a true clocked register. This register, known as the
microinstruction register or pipeline register, receives the full output of the
microinstruction memory and delivers clean, reliable signals to the architecture
and to the circuits that produce the next microinstruction address. The designer
then has a uniform and reliable interface with the microprogram control unit.

Enhancing the Control Unit

Just as Fig. 10-8 grew out of consideration of more primitive controllers, we
may generalize it to produce a more powerful controller. In Fig. 10-8, the counter
and the coincidence gate perform a control function that responds to inputs and
produces an output. The inputs are the test input signal from the external
architecture and the TFBIT and jump address from the present microinstruction.
The output is the address of the next microinstruction. We may view the TFBIT
and jump address as components of an elementary computer branch instruction,
and the counter and coincidence gate as a primitive computer control unit. In
this view, Fig. 10-8 implements a computer with one flow-of-control instruction,
a simple branch. But ordinary computers have much more sophisticated branching
than this, so why should we not incorporate some of this sophistication into the
next-instruction-address evaluator within our microinstruction control unit?
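The single branch that Fig. 10-8 provides can be written out as a short sketch. The Python below is only an illustration: the function name and arguments are ours, and we assume that the coincidence gate loads the counter with the jump address when the selected test input agrees with TFBIT, and lets the counter increment otherwise.

```python
def next_address(current, test_input, tfbit, jump_address):
    """Next microinstruction address for a controller with one branch type.

    Assumption: the branch is taken when the test input equals TFBIT
    (the coincidence gate fires); otherwise the counter increments.
    """
    if test_input == tfbit:
        return jump_address   # counter is loaded with the jump address
    return current + 1        # counter counts: normal sequencing


# At address 5, branch to address 12 when the test input is true.
print(next_address(5, test_input=True, tfbit=True, jump_address=12))   # 12
print(next_address(5, test_input=False, tfbit=True, jump_address=12))  # 6
```

Everything a more capable sequencer adds (calls, returns, loops) amounts to a richer version of this one function.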
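We will view a microinstruction as consisting of two components, a
microinstruction sequencing part and a command output part:

+--------------------------------------------+---------------------+
|            Sequencer component             |  Command component  |
+------------------------+-------------------+---------------------+
| Sequencer instruction  | Test index        | Command outputs     |
| (to sequencer)         | (to architecture) | (to architecture)   |
+------------------------+-------------------+---------------------+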

The sequencing component of the microinstruction becomes an instruction to
be processed by a microprogram sequencer, contained within the microprogram
control unit. The sequencer and its instruction input may be relatively simple,
as in Fig. 10-8 and earlier figures, or it may be quite sophisticated. Our view
of the microinstruction control unit has been transformed into Fig. 10-10. With
these moves, we have a structure that is close indeed to conventional computers,
yet retains much of the power of hardware. Our microprogram control unit has
a memory, an instruction register, and is capable of determining the microprogram's
flow based on the present state of the system and an external input.
With these expanded capabilities, we can express quite complex control
algorithms. Our new model of a microprogrammable controller (Fig. 10-10) still
implements an ASM similar to Fig. 10-7 in which all command outputs are
unconditional and control of the microprogram either moves to the next sequential
state or branches to a new state in response to the test of a single input signal.
However, the possibilities for branching are considerably enlarged. We will see
that we may use subprogram calls and returns, loops, and other useful constructs
from the programming world. The opportunity to express highly complex algorithms
as microprograms means that the designer will need sophisticated aids to support
the development, debugging, and maintenance of the microprogram. At the same
time, the architectures to be managed by these more complex algorithms become
larger and more complex, requiring additional aids for hardware development.

Figure 10-10 A sophisticated microprogram sequencer.

Several microprogram sequencers are available as integrated circuit chips.
The first was the 2909, introduced by Advanced Micro Devices in 1975. The
2909 accepts a 2-bit operation code and provides 4 bits of next-instruction address;
several 2909s can be cascaded to produce larger addresses. The 2909 supports
conditional branches and subprogram calls and returns. As the technology
advanced, more address bits were provided within a single chip and more complex
operations were introduced. For instance, the 2910 integrated circuit produces
12 bits of next-instruction address and executes 32 instructions. The Texas
Instruments 74AS890 supports 64 instructions and has 14 address bits.

The 2910 Microprogram Sequencer

Microprogram sequencers are sophisticated devices with many features. We will
limit our discussion to the 2910 and those characteristics that support our study
of microprogramming. If another microsequencer is used, it will have similar
characteristics. Figure 10-11 shows the principal signals entering and leaving
the 2910. The 2910 is designed to produce the address of the next microinstruction
to be loaded into the pipeline register. It accepts a 4-bit operation code I, a
test-input signal CC (Condition Code), a control signal CCEN (Condition Code
Enable) to guide the use of the test-input signal, and a 12-bit data-input field D.
The D-field usually provides a microprogram branch address (our familiar jump
address) from the pipeline register, although it has other uses in the 2910. The
output of the 2910 is a 12-bit next-instruction address Y. The 2910 can support
microprograms containing up to 4096 instructions. Since the 2910 contains internal
registers, we must supply a system clock signal CP.
Figure 10-12 shows the internal architecture of the 2910. The next-instruction
address Y originates from one of four sources, selected by a 4-input multiplexer
based on the operation code and the value of the test input signal. Two of the
four sources are already familiar to us: the next sequential address (current
microinstruction address + 1), and a branch address derived from the current
microinstruction. The 2910 contains a microprogram counter-register (µPC)
which records Y + 1, the next sequential address, in case it is needed later.

Figure 10-11 The 2910 microprogram sequencer. (Signals labeled in the figure:
CLOCK; DIRECT.INPUT D, 12 bits; INSTRUCTION; CONDITION.CODE; CONDITION.CODE
ENABLE; CARRY.IN; NEXT.INSTRUCTION ADDRESS Y, 12 bits; PIPELINE.ENABLE,
MAP.ENABLE, and VECT.ENABLE; FULL.STACK.)

Figure 10-12 Twelve-bit data paths in the 2910 microprogram sequencer.

The output of the µPC forms one input to the Y multiplexer. The 12-bit external
data input D forms another multiplexer input. Usually, D comes from the
jump-address field of the pipeline register.
The two remaining multiplexer inputs support subprogram calls and program
looping. The 2910 contains a five-word stack. In a subprogram call, the 2910
must save the return address on the top of the stack; a subprogram return must
supply the return address from the top of the stack. The 2910's internal stack
allows calls to microprogram subprograms to be nested five deep. Each subprogram
call results in a stack push operation, and each subprogram return causes a stack
pop operation. When a microinstruction executes a subprogram call, the required
return point is the address of the control store word following the subprogram
call instruction. This address is exactly the quantity that is currently stored in
the 2910's µPC; Fig. 10-12 shows a data path from the µPC to the stack that
supports subprogram calls.
The fourth input to the Y multiplexer is from an internal register R that
can hold a loop counter. The R-register can be loaded from the 2910's D-input.
Several 2910 instructions support the loading, testing, and decrementing of the
value in the R-register.
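A short sketch may make the loop counter concrete. It models only two instructions as Table 10-1 summarizes them: LDCT loads R from the D-input, and RPCT jumps back to the pipeline address and decrements R while R is nonzero, or continues when R is zero. The function name is hypothetical, and the model ignores everything else in the 2910.

```python
def body_executions(n):
    """Count how often a loop body runs when LDCT loads R with n.

    Assumed microprogram shape: LDCT, then a loop body, then an RPCT
    whose D field holds the address of the loop body.
    """
    r = n                    # LDCT: load the counter from the D-input
    executions = 0
    while True:
        executions += 1      # the loop body runs once
        if r == 0:           # RPCT with counter zero: continue past the loop
            return executions
        r -= 1               # RPCT with counter nonzero: decrement and jump back


print(body_executions(3))    # 4
```

Under these assumptions a count of n yields n + 1 passes through the body, so a designer who wants exactly k passes would load k - 1.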
The 2910's 4-bit operation code supports 16 basic instructions. Each
instruction has a "pass" and a "fail" option, generating a total of 32 possible
operations. The selection of pass or fail is controlled by the values of the 2910
inputs CC and CCEN, according to the following prescription: if the enable signal
CCEN is true and the test input (condition code) CC is false, then fail; otherwise
pass. Viewed another way, this structure allows the execution of the pass version
of an instruction when we are not testing the input, or when the input is true
while we are testing it. The following table specifies the conditions for selecting
the option:

CCEN CC Result

F F Pass
F T Pass
T F Fail
T T Pass
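
The prescription reduces to a single Boolean expression. The sketch below restates it in Python; the function name is ours, and the arguments are logic values (true or false), not voltage levels.

```python
def passes(ccen, cc):
    """Return True for the pass option and False for the fail option.

    Fail only when testing is enabled (CCEN true) and the test input
    (CC) is false; pass in every other case.
    """
    return not (ccen and not cc)


# Reproduce the table above, row by row.
for ccen in (False, True):
    for cc in (False, True):
        label = "Pass" if passes(ccen, cc) else "Fail"
        print("T" if ccen else "F", "T" if cc else "F", label)
```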

CJP (conditional jump) is a typical 2910 instruction. In its fail mode, this
instruction selects the µPC as the Y-output, accomplishing normal sequencing.
In its pass mode, CJP selects the D-input as the Y-output, thus performing a
branch. At the next system clock edge, the pipeline register will receive the
appropriate instruction and the 2910's µPC will capture the address + 1 of this
instruction.
Another example is CJS (conditional jump to subprogram). In its fail mode,
CJS sequences to the next microinstruction address, with no effect on the 2910's
internal stack. In its pass mode, CJS performs a subprogram jump, which selects
the branch address in the D-input as the value of Y, and, when the system clock
fires, causes the contents of µPC to be pushed onto the internal stack. (As
usual, µPC will receive the new Y + 1 when the clock transition occurs.)
In the fail mode, the instruction CRTN (conditional subprogram return)
performs normal sequencing, with no effect on the 2910's stack. In the pass
mode, CRTN delivers the top-of-stack element to Y, thereby supplying the return
address to the previous subprogram as the address of the next microinstruction.
When the clock transition occurs, the 2910 pops its stack, and µPC receives Y
+ 1.
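The behavior of these instructions can be collected into a small behavioral model. The sketch below is an illustration under stated assumptions, not a description of the chip's internals: the class and method names are ours, only CONT, CJP, CJS, and CRTN are modeled, and the model simply raises an error on a sixth nested call rather than imitating what the real stack does when full.

```python
class Sequencer2910:
    """Toy model of four 2910 instructions: CONT, CJP, CJS, and CRTN."""

    STACK_DEPTH = 5          # the internal stack holds five words
    ADDRESS_MASK = 0xFFF     # addresses are 12 bits wide
    KNOWN = ("CONT", "CJP", "CJS", "CRTN")

    def __init__(self, upc=0):
        self.upc = upc       # microprogram counter: holds the last Y + 1
        self.stack = []      # return addresses; the last item is the top

    def step(self, instr, d=0, ccen=False, cc=False):
        """Return the next address Y, then update uPC and stack as the clock would."""
        if instr not in self.KNOWN:
            raise ValueError(f"instruction not modeled: {instr}")
        passed = not (ccen and not cc)    # fail only if enabled and CC is false
        y = self.upc                      # CONT and every fail option: next sequential address
        if passed and instr == "CJP":
            y = d                         # branch to the D-input
        elif passed and instr == "CJS":
            if len(self.stack) >= self.STACK_DEPTH:
                raise OverflowError("model refuses a sixth nested call")
            self.stack.append(self.upc)   # push the return address
            y = d
        elif passed and instr == "CRTN":
            y = self.stack.pop()          # return address from the top of the stack
        self.upc = (y + 1) & self.ADDRESS_MASK
        return y


# A hypothetical microprogram: address -> (instruction, D field)
program = {0: ("CONT", 0), 1: ("CJS", 8), 2: ("CONT", 0),
           8: ("CONT", 0), 9: ("CRTN", 0)}

seq = Sequencer2910(upc=1)   # the instruction at address 0 is in the pipeline
address, trace = 0, [0]
for _ in range(5):
    instr, d = program[address]
    address = seq.step(instr, d)   # CCEN false, so every instruction passes
    trace.append(address)
print(trace)   # [0, 1, 8, 9, 2, 3]
```

The trace shows the call at address 1 entering the subprogram at address 8 and the return at address 9 resuming at address 2, the word following the call.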
When the 2910 is used as the sequencing element in Fig. 10-12, an appropriate
form for the flow-of-control portion of the microinstruction is:

+-----------------------------+
|     Sequencer component     |
+---+------+---+--------------+
| I | CCEN | D | Test index   |
+---+------+---+--------------+

Each microinstruction provides the 2910 with the I, CCEN, and D fields. The
index field goes to the architecture to guide the selection of the appropriate test
input, which becomes the 2910's CC input.
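As an illustration of how such a sequencer component might be laid out in a control-store word, the sketch below packs and unpacks the four fields. The widths of I (4 bits) and D (12 bits) come from the 2910; the 1-bit CCEN, the 4-bit test index, the field order, and all names are assumptions made for the example.

```python
from typing import NamedTuple


class SequencerField(NamedTuple):
    i: int           # 4-bit 2910 operation code
    ccen: int        # 1-bit condition-code enable (assumed width)
    d: int           # 12-bit branch address
    test_index: int  # assumed 4 bits: select code for the test-input multiplexer


WIDTHS = (4, 1, 12, 4)   # same order as the fields above, leftmost field first


def pack(field):
    """Pack the fields into one integer, leftmost field most significant."""
    word = 0
    for value, width in zip(field, WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Recover the fields from a packed word."""
    values = []
    for width in reversed(WIDTHS):
        values.append(word & ((1 << width) - 1))
        word >>= width
    return SequencerField(*reversed(values))


cjp = SequencerField(i=0x3, ccen=1, d=0x123, test_index=5)
print(hex(pack(cjp)))        # 0x71235
print(unpack(pack(cjp)))
```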
Thus far, the 2910's D-field arises from the corresponding field in the
microinstruction pipeline register. Although this is by far the most common and
useful mode of operation, the 2910 also permits an alternative source of the
D-input. With each instruction, the 2910 asserts one of three D-field selection
signals. In the instructions described above, the 2910 asserts its Pipeline-Enable
signal PL(EN). On the other hand, the 2910's JMAP (Jump on Map Address)
instruction, which causes an unconditional jump to the D-field address, asserts
the Map-Enable signal MAP(EN) instead of PL(EN). This feature provides a
limited yet useful capability to select the D-field input from a source in the
designer's architecture, under control of the MAP(EN) signal. We will use this
feature of the 2910 in a subsequent design example. One other 2910 instruction
has similar characteristics; all other instructions cause the assertion of PL(EN).
If MAP(EN) is chosen, all inputs to the 2910's D-field must have three-
state characteristics and the designer must use the 2910 PL(EN) and MAP(EN)
signals to select the proper input.
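The selection of the D-field source can be pictured as follows. This is a sketch only: the function and variable names are ours, the three sources stand in for three-state drivers, and the pairing of CJV with the VECT enable is taken from the ENABLE column of Table 10-1.

```python
# Instructions that assert an enable other than PL; all others assert PL.
D_ENABLE = {"JMAP": "MAP", "CJV": "VECT"}


def d_input(mnemonic, pipeline_d, map_d, vect_d):
    """Return the asserted enable and the value that reaches the D-input."""
    enable = D_ENABLE.get(mnemonic, "PL")
    sources = {"PL": pipeline_d, "MAP": map_d, "VECT": vect_d}
    return enable, sources[enable]   # only the enabled source drives D


print(d_input("CJP", 0x123, 0x200, 0x300))    # ('PL', 291)
print(d_input("JMAP", 0x123, 0x200, 0x300))   # ('MAP', 512)
```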
Table 10-1 is a summary of the 2910's instructions. In this chapter we use
about half of these instructions, and will explain each new instruction at the
time of use. Consult an AM2910 data sheet for additional information, if you
desire. The instructions CJP, CJS, CRTN, and CONT (Continue) are by far the
most commonly used 2910 instructions. In this chapter, we use the alternative
mnemonics JUMP, CALL, and RTN in place of CJP, CJS, and CRTN.

Choosing a Microprogram Memory

With the realization that our microprogramming methods are capable of describing
and executing quite complex algorithms, we begin to see the need for sophisticated
equipment to help the designer to manage the complexity. We have assumed
that the microprogram storage was a read-only memory (ROM, PROM, or
EPROM). In accordance with good programming practice, our microprograms
do not change; all the "data storage" is in the architecture. Even when
sophisticated microprogram sequencers such as the 2910 are used, the microprogram
remains fixed during execution; the sequencer itself contains storage for
subprogram return points and loop control. For such an environment, ROM seems
the natural choice. Many designers initially discarded RAM for this purpose
because of the volatility of its contents when power drops. But as the size and
complexity of modern microprograms have increased, this choice has been reversed.
For debugging complex microprograms, and when the microcode may be modified
in the field, RAM is essential. In microprogramming jargon, the microprogram
storage is the control store. If the control store is easily alterable, as is RAM,
it is called writable control store (WCS). If we use RAM, we must load it
frequently, and we need powerful microprogramming aids. In the next section
we describe a microprogrammable development system that provides the designer
with the hardware and software tools required to manage the design and
development process.

TABLE 10-1 INSTRUCTIONS OF THE 2910 MICROPROGRAM SEQUENCER

Fail: CCEN = LOW and CC = HIGH. Pass: CCEN = HIGH or CC = LOW.

HEX                                    REG/CNTR   FAIL         PASS         REG/
I3-I0  MNEMONIC  NAME                  CONTENTS   Y    STACK   Y    STACK   CNTR    ENABLE

0      JZ        JUMP ZERO             X          0    CLEAR   0    CLEAR   HOLD    PL
1      CJS       COND JSB PL           X          PC   HOLD    D    PUSH    HOLD    PL
2      JMAP      JUMP MAP              X          D    HOLD    D    HOLD    HOLD    MAP
3      CJP       COND JUMP PL          X          PC   HOLD    D    HOLD    HOLD    PL
4      PUSH      PUSH/COND LD CNTR     X          PC   PUSH    PC   PUSH    Note 1  PL
5      JSRP      COND JSB R/PL         X          R    PUSH    D    PUSH    HOLD    PL
6      CJV       COND JUMP VECTOR      X          PC   HOLD    D    HOLD    HOLD    VECT
7      JRP       COND JUMP R/PL        X          R    HOLD    D    HOLD    HOLD    PL
8      RFCT      REPEAT LOOP, CNTR ≠ 0 ≠0         F    HOLD    F    HOLD    DEC     PL
                                       =0         PC   POP     PC   POP     HOLD    PL
9      RPCT      REPEAT PL, CNTR ≠ 0   ≠0         D    HOLD    D    HOLD    DEC     PL
                                       =0         PC   HOLD    PC   HOLD    HOLD    PL
A      CRTN      COND RTN              X          PC   HOLD    F    POP     HOLD    PL
B      CJPP      COND JUMP PL & POP    X          PC   HOLD    D    POP     HOLD    PL
C      LDCT      LD CNTR & CONTINUE    X          PC   HOLD    PC   HOLD    LOAD    PL
D      LOOP      TEST END LOOP         X          F    HOLD    PC   POP     HOLD    PL
E      CONT      CONTINUE              X          PC   HOLD    PC   HOLD    HOLD    PL
F      TWB       THREE-WAY BRANCH      ≠0         F    HOLD    PC   POP     DEC     PL
                                       =0         D    POP     PC   POP     HOLD    PL

Note 1: If CCEN = LOW and CC = HIGH, hold; else load. X = Don't Care

I-field
Value  Mnemonic  Function

$0     JZ        Jump to location 0
$1     CJS       Conditional jump to subroutine at pipeline address
$2     JMAP      Jump to map address
$3     CJP       Conditional jump to pipeline address
$4     PUSH      Push with conditional load of counter
$5     JSRP      Conditional jump to subroutine at R address or at pipeline address
$6     CJV       Conditional jump to vector address
$7     JRP       Conditional jump to R address or pipeline address
$8     RFCT      Repeat loop if counter is non-zero
$9     RPCT      Jump to pipeline address if counter is non-zero
$A     CRTN      Conditional return from subroutine
$B     CJPP      Conditional jump to pipeline address with stack pop
$C     LDCT      Load counter from D input
$D     LOOP      Test end of loop
$E     CONT      Continue
$F     TWB       Three-way branch

Alternative instruction mnemonics:
CALL   Equivalent to CJS
RTN    Equivalent to CRTN
JUMP   Equivalent to CJP

THE LOGIC ENGINE: A DEVELOPMENT SYSTEM FOR MICROPROGRAMMING*

* This section is modified from Franklin Prosser and David Winkel, "The Logic Engine
Development System-Support for Microprogrammed Bit-Slice Development," Proceedings of
MICRO-16, October 1983, pages 84-91.

Microprogramming permits the designer to tackle complex control tasks, but this
ability to deal conceptually with complex designs entails numerous practical
problems. On what type of breadboard should we construct the architecture?
How do we debug the architecture? How do we produce the microcode? How
do we load the microcode into a control store? How do we design and build
the microinstruction sequencer? How do we debug the microcode? How do we
modify the microcode?
These questions imply that designers need a powerful support system to
allow them to manage microprogrammed control. The control unit is itself only
one part of a good development system. The system must also support the
development and debugging of the architecture and the control algorithm. It
should minimize the usual headaches of design and the subtleties of constructing
the hardware. It should provide for convenient wire-wrapping for initial testing,
and for lights and switches for displaying and controlling individual signals during
the testing of the design, as well as lend powerful support to the development,
debugging, and modification of the control program.
Several commercial microprogrammable development systems have appeared.
We will describe one of these, the Logic Engine, which we designed and built.
Our goal is to reach a position from which we may easily produce and manage
complex hardware projects using microprogrammed control. The goal is ambitious,
and to achieve it requires an understanding of the design principles and practices
presented in this section.
Figure 10-13 shows the parts of the Logic Engine Development System.
The base unit houses the microprogrammable controller, a microcomputer-based
debugging support system, and a debugging display panel. Attached to the base
unit is a large, detachable backpanel for wire-wrap of the hardware architecture.
A terminal provides for convenient interaction between the development system
and its user.

Figure 10-13 The Logic Engine.

The Base Unit

The Logic Engine's microprogrammable control unit contains a 2910 microprogram
sequencer, a writable control store of up to 4K words, a microinstruction pipeline
register to deliver command signals to the designer's architecture, and a buffer
register to support communication between the controller and the microcomputer-
based monitor. Figure 10-14 shows the structure of the Logic Engine's controller.
The task of the controller is to present a properly sequenced set of signal voltages
(commands) to the designer's architecture. Therefore, a Logic Engine
microinstruction has two primary fields: a fixed-format sequencing field to direct the
2910 in the production of the address of the next microinstruction and an open-
ended field for specifying command signals to the designer's circuit. The sequencing
field consists primarily of the items required to direct the 2910: I, CCEN, and
D. The length of the command-bit field is determined by the particular project
and may exceed 100 bits.

Figure 10-14 The architecture of the Logic Engine controller.

The designer may choose two ways of executing the microinstructions. In
the automatic mode, the controller loads microinstructions from the writable
control store (WCS) into the pipeline register under the control of the 2910
sequencer. In the debugging mode, the designer can influence the delivery of
command signals to the architecture in several ways, using features of the Logic
Engine's debugging monitor.
The Logic Engine's support system consists of software running on a
microcomputer inside the base unit. The microcomputer has dual floppy-disk drives,
two serial input-output ports, and one parallel input-output port. The parallel
port provides the interface between the support system and the Logic Engine's
controller. One serial port is dedicated to the designer's display terminal; the
other serial port is available for connecting a serial printer, remote computer,
or other device. The Logic Engine's support software is organized around a
debugging monitor. Additional software includes a text editor, a microprogram
assembler, and various utility programs.
The Logic Engine's display panel provides about 100 LEDs for displaying
data, over two dozen pushbuttons and toggle switches for entering data, and a
variable-speed clock that includes a manual mode. The designer has access to
these whenever the backpanel is attached to the base unit. The base unit contains
a power supply adequate to operate the Logic Engine and the designer's circuit.

The Backpanel

The Logic Engine's backpanel is large: 16 in. wide and 20 in. high. It has a
general-purpose work area to handle integrated circuit chips of 8 to 64 pins. For
a typical design, the backpanel can accommodate several hundred chips. The
designer has access to both sides of the board at all times. Ground and +5 V
appear as power grids on opposite sides of the board and there are extensive
provisions for attaching power-bypass capacitors (see Chapter 12).
Along one side of the backpanel is an area committed to the microinstruction
pipeline register and WCS for the designer's command signals. This permits
easy wire-wrapping of the command signals to the architecture, and allows the
designer to employ as many command signals as the design requires.

The Supporting Software

The Logic Engine's development and debugging monitor supports the detailed
control of the WCS and of the operations of the microprogram sequencer and
pipeline register. The designer may load the WCS from a floppy-disk file, an
example of downloading. The designer may read any word in the WCS, modify
any word without disturbing the remainder, and display the contents
of a block of WCS. Since the 2910 microprogram sequencer is an integral part
of the Logic Engine, the monitor knows its characteristics in detail and thus can
support the display and modification of all of the sequencer's internal registers.
The designer may display the microinstruction pipeline register and modify any
portion of it. The monitor also permits the designer to specify whether, with
each manual change of the pipeline register, a designer's clock signal is to be
issued. These features give the designer an important debugging tool: the manual
entry of microinstructions into the pipeline register without modifying the writable
control store. Since the pipeline register's command field is wired to the designer's
architecture, the designer may exert detailed manual control of the circuit.

Table 10-2 summarizes some of the functions of the Logic Engine's monitor
that are available to the designer. In executing microcode from the WCS, the
most powerful debugging features are single-step and breakpoint. Single-step
permits the designer to execute one instruction at a time from the WCS, with
a complete Logic Engine register dump accompanying each instruction. The
breakpoint feature is used when the designer is running microcode at high speed.
The designer announces a particular WCS address that, if it becomes the candidate
for next microinstruction, will cause the controller to halt. Breakpoints permit
the designer to stop the execution of instructions at any address and then observe
the status of the system.
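The interplay of single-step and breakpoint can be sketched in a few lines. The code below is not the Logic Engine's monitor; it is a hypothetical model in which `next_address` stands in for the sequencer, `break_at` is the announced breakpoint address, and the run halts when that address becomes the candidate for the next microinstruction.

```python
def run(next_address, start, break_at=None, max_steps=1000):
    """Execute from `start`; halt when `break_at` is the next candidate.

    Returns the list of addresses executed and the address at which
    execution would resume. Single-step is simply max_steps=1.
    """
    address = start
    executed = []
    for _ in range(max_steps):
        executed.append(address)              # "execute" this microinstruction
        candidate = next_address(address)
        if candidate == break_at:
            return executed, candidate        # halt before the breakpoint address runs
        address = candidate
    return executed, address


def sequential(address):
    """A stand-in microprogram with no branches."""
    return address + 1


print(run(sequential, start=0, break_at=3))   # ([0, 1, 2], 3)
print(run(sequential, start=0, max_steps=1))  # ([0], 1)
```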

TABLE 10-2 FUNCTIONS OF THE LOGIC ENGINE'S MONITOR

M   Display and modify the WCS
E   Examine a block of the WCS
R   Display and modify the Logic Engine's registers
P   Load the pipeline and execute an instruction
C   Clear the 2910
B   Set or clear a breakpoint
G   Go! (Run microcode from the WCS)
I   Idle the Logic Engine
S   Execute a single instruction
H   Help!
L   Load the WCS from a disk file
U   Unload the WCS to a disk file

The Logic Engine's microprogram assembler provides powerful development
features within a structured microprogramming language. The microassembly
language encourages the designer to express the control algorithm in high-level
terms and provides for transforming the high-level specification into microcode.
The assembler supports the symbolic naming of single command bits and fields
of bits, and there is a convenient syntax for invoking the desired values of the
command bits. The assembler provides full mixed-logic capabilities, giving the
designer the freedom to specify signal values as voltages or as logic levels and
to describe the voltage convention for truth for each signal. The designer may
specify default values for command signals, so that in writing microcode only
command signals that deviate from the default values need be described. In the
sequencing portion of the microinstruction, the assembler supports the 2910's
instruction set and has a convenient syntax for specifying the designer's test
inputs.
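To suggest what default values and mixed-logic declarations accomplish, here is a small sketch. It does not reproduce the assembler's syntax: the signal names, bit positions, and the table format are all hypothetical. Each signal is declared with the voltage that represents truth and with a default logic value, and a microinstruction names only the signals that deviate from their defaults.

```python
# Hypothetical declarations: name -> (bit position, voltage for truth, default logic value)
SIGNALS = {
    "LOAD_SCORE": (0, "H", False),
    "CLEAR":      (1, "L", False),   # truth is the low voltage
    "RUN":        (2, "H", True),
}


def assemble_commands(overrides):
    """Build the command bits (1 = high voltage) for one microinstruction."""
    unknown = set(overrides) - set(SIGNALS)
    if unknown:
        raise KeyError(f"undeclared signals: {sorted(unknown)}")
    word = 0
    for name, (bit, truth_voltage, default) in SIGNALS.items():
        logic = overrides.get(name, default)                  # default unless overridden
        high = logic if truth_voltage == "H" else not logic   # logic value -> voltage
        if high:
            word |= 1 << bit
    return word


print(bin(assemble_commands({})))                # 0b110
print(bin(assemble_commands({"CLEAR": True})))   # 0b100
```

With no overrides, CLEAR is false and therefore sits at the high voltage, while RUN is true by default; asserting CLEAR drives its bit low without the designer ever writing a voltage level.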

DESIGNING AND DEBUGGING WITH THE LOGIC ENGINE


To illustrate the process of design using microprogrammed control, let's study
the design of a small portion of a machine that can directly execute the Forth
language in hardware. (It is not necessary to know the Forth language to follow
the example.)


The Initial Design

Our first step is to work out the architecture-the registers, busses, and data
paths. Forth is a stack-oriented language, and several of its important operations
involve manipulations of the elements on the stack. In this example, we focus
on the stack operations. We wish the several top elements of the stack to be
available for direct use; the deeper elements will be kept in a RAM. Figure 10-15
is a portion of the architecture, showing the top three elements of the stack.
(The stack elements may contain as many bits as required by the problem, but
this decision does not concern us here.) The input to each stack element is
through a set of multiplexers. Each of the potential sources of a given element
of the stack becomes an input to that element's multiplexer. (Notice the similarity
of this data-routing design to that used in the LD20.) For testing purposes, in
addition to the elements of the stack, inputs include the switch register on the
display panel. We will call the select signals for the three multiplexers M0, M1,
and M2, and refer to the entire collection of multiplexer select signals as MUXCTL.
In addition to holding its contents and loading new information, each element
of the stack may perform internal bit-shifting operations. To support these
activities, each stack element requires 2 control inputs; for stack elements S0,
S1, and S2, we call the stack controls S0CTL, S1CTL, and S2CTL, and we call
the collection of six stack-element controls REGCTL.

Figure 10-15 The architecture of a microprogrammed design. (Signals labeled in
the figure: SWR; S0, S1, S2, S3; S0CTL, S1CTL, S2CTL; REGCTL, 6 bits; MUXCTL,
8 bits.)
The next step in the design is to work out, in rough form, the algorithms
to control the architecture. The thought put into this step will often lead to
modifications of the architecture. The design process involves moving between
increasingly refined sketches of the architecture and control until we feel reasonably
confident that we understand our problem thoroughly. Then there is hope that
when we build our machine, it might actually work!
At this stage in the design of the algorithm, we can see what signals from
the architecture we must test in our microcode in order to direct the flow of the
microinstructions. The 2910 sequencer accepts a single signal as a test input,
and the 2910's instructions may act upon the value of this signal. Consistent
with our earlier development of microprogrammed controllers, we place a test-
input multiplexer in the Forth Machine's architecture to deliver the single test
signal to the 2910. Since we know which, if any, signal is required for testing
in each microinstruction, we may use some of the microinstruction's command
bits as the select code for the multiplexer. The full Forth Machine design requires
about a dozen test inputs, so a 16-input multiplexer with a 4-bit select code is
appropriate. Figure 10-16 shows the structure of the apparatus for selecting
test inputs, including the two that we use in this design. We have allocated the
first 4 microinstruction command bits (bits 0-3) to the control of the test multiplexer;
this is an arbitrary choice.
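The test-input selection can be sketched as follows. The function name is ours, and we assume that command bit 0 is the least significant bit of the command word; the 16 inputs are modeled as a list of Boolean values.

```python
def select_test_input(command_word, test_inputs):
    """Deliver one of 16 test inputs, chosen by command bits 0-3."""
    if len(test_inputs) != 16:
        raise ValueError("the multiplexer has exactly 16 inputs")
    select = command_word & 0xF        # bits 0-3 form the 4-bit select code
    return test_inputs[select]         # this value becomes the sequencer's test input


inputs = [False] * 16
inputs[2] = True                       # suppose only test input 2 is currently true
print(select_test_input(0b1010_0010, inputs))   # True  (select code 2)
print(select_test_input(0b1010_0011, inputs))   # False (select code 3)
```

Only the low four bits matter here; the remaining command bits pass to the architecture untouched.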

Figure 10-16 Structure of the test input in a microprogrammed design. (Labels
recovered from the figure: Logic Engine control unit; designer's architecture;
LD.L; TST.L; designer's test input.)

The Initial Testing of the Architecture

Now it is time to construct and test the architecture. On the Logic Engine's
backpanel, we lay out the chips required by the architecture, assemble the
appropriate sockets and chips, and wire-wrap the design. The size of the backpanel
permits us to develop and debug the architecture without partitioning the
components among small printed circuit boards.
At this point, we usually make some preliminary tests of the registers and
the data paths. The Logic Engine's display panel has numerous lights and
switches to assist us. Using wire-wraps or jumpers, we connect the important
outputs to any of the display panel's LEDs and connect switches to the inputs.
A disposable cardboard overlay for the display panel allows us to label the lights
and switches. We may use the display panel's variable-speed clock to provide
clocking signals. Manual clocking permits us to debug statically: we deliver
clock transitions only when we wish. This is a powerful debugging technique.
Now we exercise the architecture with the display panel's switches, and
observe the results on the lights. In effect, we are manually delivering rudimentary
control to the architecture prior to developing the actual control program, thus
allowing early detection of gross errors in the wiring or design.

Developing the Control Program

Once we are satisfied that the architecture is working properly, we turn to the
detailed development of the control algorithm. We rely on the Logic Engine to
help us develop the control in two ways: by providing a standard environment
for developing and executing microprograms, and by aiding us to program and
test the code.
The Logic Engine's microprogram assembler, LEASMB, has two parts.
In the declaration phase we specify symbolic names for all the variables and
quantities of interest, and we describe the structure of the microinstruction. The
program phase contains the microcode itself, in symbolic form. The use of
symbolic notations is of great value because of their descriptive power and
because changes in the design may usually be made with little disturbance to
the program. During the earlier phases of the design, natural names will emerge
for the important signals that control the architecture. It is convenient to use
these names in the microcode.
Figure 10-17 is a microprogram for our Forth Machine design. We will
use this code to introduce the elements of the microassembly language. (Later,
you will study the process of generating a microprogram for a more complex
microprogrammed design. Our treatment of the microassembly language will be
informal.)
The microcode in Fig. 10-17 supports a small portion of the testing of the
Forth Machine: the manual loading of data from the switch register on the
display panel and the exercising of the Forth language's rotate instruction. This
code includes declarations and microinstructions that illustrate a variety of features
of the microassembler.



LOGIC ENGINE DEVELOPMENT SYSTEM MICROPROGRAM ASSEMBLER
FORTH TEST: LOGIC ENGINE DEMONSTRATION PROGRAM

                    ID      FORTH TEST
* FORTH ENGINE
* SAMPLE DECLARATIONS AND SAMPLE MICROCODE
                    SIZE    18; Number of command bits
                    MODE    LOGIC
* TEST MUX CONFIGURATION
INMUX               COM     (0:3),T=%HHHH,D=0
LD.L                INV     INMUX=0,T=%L
TST.L               INV     INMUX=1,T=%L
* COMMAND FIELD DECLARATIONS
MUXCTL(7:0)         COM     (4:11),T=$FF,D=%TTTTTTTT
M0                  EQU     MUXCTL(7:5); Mux 0 select signals
M1                  EQU     MUXCTL(4:2); Mux 1 select signals
M2                  EQU     MUXCTL(1:0); Mux 2 select signals
M0S2                INV     M0=5; Select Reg S2 thru Mux 0
M0SWR               INV     M0=0; Select Switch Reg thru Mux 0
M1S0                INV     M1=3; Select Reg S0 thru Mux 1
M2S1                INV     M2=1; Select Reg S1 thru Mux 2
REGCTL              COM     (12:17),T=%HHHHHH,D=%FFFFFF
LOAD3               EQU     %111111; Load S0,S1,S2
ROTATE              INV     M0S2,M1S0,M2S1,REGCTL=LOAD3; Rotate stack.
                    PROG
LOC XDDDI CCCC C
000                         ORG     0
000                 BEGIN   EQU     *
000 10033 0FF0 0    LOAD    JUMP    TEST IF LD.L=%F
001 50013 0FF0 0            JUMP    * IF LD.L=%T
                            JUMP    BEGIN;M0SWR,M1S0,M2S1,
002 30003 00DF C                    REGCTL=LOAD3; **Push switches onto stack.
003 10003 1FF0 0    TEST    JUMP    LOAD IF TST.L=%F
004 50043 1FF0 0            JUMP    * IF TST.L=%T
005 30003 0ADF C    ROT     JUMP    BEGIN;ROTATE; **Rotate top 3 stack elements
                            END
0 ERROR(S) DETECTED

Figure 10-17 Microcode for the design example.

LEASMB microprogram statements have an optional label field, an operation
field, and an optional operand field, in that order. Within the operand field, the
required subfields are separated by semicolons; comments may follow the operand
field, if preceded by a semicolon. Lines beginning with an asterisk are comments.
An LEASMB output listing, such as in Fig. 10-17, shows the source program
and, to the left of the program phase, the object code in hexadecimal notation.
An LEASMB program begins with the directive ID and ends with the
directive END. The ID is the name assigned to the object code produced by
the assembler; END marks the physical end of the program text. The declarations
precede the program; the directive PROG separates the two. In the declaration
phase, the directive SIZE specifies how many command bits are used in the
design. The directive MODE, which takes an argument LOGIC or VOLTAGE,
announces how the assembler is to interpret numeric data that could describe
either logical or voltage values. As mixed logicians, we choose LOGIC mode.
Unless otherwise specified, numbers are in decimal notation. Binary constants
are preceded by %, and may contain numerals 1 and 0, logical T and F, or
voltage H and L.
The declaration phase has three main directives: COM, INV, and EQU.
The program phase has, besides mnemonics for each of the 2910 sequencer's
instructions, two directives: ORG and EQU. COM specifies the nature of the
command bits in the microinstruction; INV allows a symbolic name to invoke
complex operations on command bits; EQU allows the equivalencing of names
to values or to other names. ORG declares the origin of the microinstructions
in the control store.
Command bits, presented to the architecture by the command field of the
pipeline register, are defined by the COM directive. In Fig. 10-17, the definition
of REGCTL provides the following information: REGCTL is a field of 6 bits
which occupy bits 12 through 17 of the command field of the microinstruction.
[If we choose, we may refer to individual bits or groups of bits of REGCTL
using a default set of indices, 0 for the leftmost bit through 5 for the rightmost
bit. Thus REGCTL(0:3) would reference the leftmost 4 bits of the field. This
notation frees the programmer from a dependence on the positions of particular
command bits.] For each bit of REGCTL, truth is represented by a high voltage
level (T = %HHHHHH). Whenever any bits of REGCTL are not mentioned in
a microinstruction, the default values will be false (D = %FFFFFF). (A mi-
croinstruction will usually deal explicitly with only a few command bits; the
default declarations of command bits are therefore of great importance in freeing
the programmer from unnecessary details. In this example, the default values
hold the current contents of the stack registers.)
The MUXCTL(7:0) declaration specifies that MUXCTL occupies bits 4
through 11 of the command field. In our program we may refer to individual
bits or groups of bits using indices 7 (for the leftmost bit) to 0 (for the rightmost).
This explicit indexing notation overrides the normal default notation. For each
bit of MUXCTL, truth is represented by a high voltage (T = %HHHHHHHH).
By convention, their unspecified default values are false.
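To make the field declarations concrete, here is a minimal Python sketch of how an assembler might fill an 18-bit command field from COM-style declarations. It is an illustration, not LEASMB itself. The assumptions are ours: command bit 0 is the leftmost bit; in LOGIC mode with T=H a logical true is stored as a 1; the printed object code shows four hex digits followed by a fifth digit holding the last two bits, left-justified; and the MUXCTL default is taken as all ones because that reading reproduces the pattern 0FF0 0 that Fig. 10-17 appears to show for location 000.

```python
# Sketch of COM field declarations and default filling (illustrative only).
# Assumptions: bit 0 is the leftmost of 18 command bits; logical T is stored
# as 1 (T=H); defaults are INMUX=0, MUXCTL all ones, REGCTL all zeros.

SIZE = 18

# name -> (first bit, last bit, default value)
FIELDS = {
    "INMUX":  (0, 3, 0b0000),
    "MUXCTL": (4, 11, 0b11111111),
    "REGCTL": (12, 17, 0b000000),
}


def assemble_command(**mentioned):
    """Build the command word, using defaults for unmentioned fields."""
    word = 0
    for name, (first, last, default) in FIELDS.items():
        width = last - first + 1
        value = mentioned.get(name, default)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        shift = SIZE - 1 - last     # bit 0 is leftmost, so the last bit sets the shift
        word |= value << shift
    return word


def listing_hex(word):
    """Format 18 bits as four hex digits plus a left-justified fifth digit."""
    return f"{word >> 2:04X} {(word & 0b11) << 2:X}"


print(listing_hex(assemble_command()))                                # all defaults
print(listing_hex(assemble_command(INMUX=1)))                         # select test input 1
print(listing_hex(assemble_command(MUXCTL=0x0D, REGCTL=0b111111)))    # push switches
```

A microinstruction that mentions only one field, as in the second call, inherits every other bit from the declared defaults; this is the convenience the default declarations are meant to provide.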
In our illustration, we usually wish to deal with the group of command
signals that controls a particular multiplexer in Fig. 10-15. So, for convenience,
we define three variables M0, M1, and M2 in the next three lines of the program.
M0 is declared to be a field of 3 bits, equivalent to bits 7 to 5 of MUXCTL.
Similarly, M1 and M2 are declared to be equivalent to bits 4 to 2 and bits 1 to
0 of MUXCTL. With these definitions, we may refer to the field MUXCTL as
a whole or to subfields M0, M1, and M2, or to any bit or group of bits of
MUXCTL.
These uses of the EQU directive equate names with fields. EQU may also
be used to equate names with values or with addresses. LOAD3 is defined with
an EQU as equivalent to the binary pattern %111111. BEGIN is defined in the
program phase as equivalent to *. Since * in this context designates the current
assignable control-store address, this notation equates the symbol BEGIN to the
address 000.
In examining Fig. 10-15, you will see that, in order to select stack element
S0 as the output of multiplexer 1, we must present the code 3 (binary %011) to
the MUX1 select inputs. For convenience, we define a symbol M1S0 that will
invoke (INV) the value 3 on the field M1. If we wish to pass element S0 through
MUX1, we write M1S0, thus assigning the value 3 (%011) to the field M1 in the
microinstruction. In the microcode in Fig. 10-17, the instruction at location 002
illustrates this usage. The symbol ROTATE illustrates how we may easily develop
complex invocations. The use of ROTATE in the instruction at location 005
invokes the previously defined invocations M0S2, M1S0, and M2S1, and invokes
the value LOAD3 in the command bits defined for REGCTL. Invocations are
the most important concept in LEASMB; they are the key to achieving high-
level specifications of our microinstruction operations.
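The way one invocation can stand for several others may be easier to see in a short Python sketch. This is our own illustration of the idea, not the assembler's implementation; the table layout, the function names, and the all-ones MUXCTL default are assumptions.

```python
# Sketch of how invocations might expand (illustrative; not LEASMB itself).
# Subfields of MUXCTL(7:0) as declared with EQU: name -> (high bit, low bit).
SUBFIELDS = {"M0": (7, 5), "M1": (4, 2), "M2": (1, 0)}

# INV declarations: a name invokes field assignments or other invocations.
INVOCATIONS = {
    "M0S2":   [("M0", 5)],
    "M0SWR":  [("M0", 0)],
    "M1S0":   [("M1", 3)],
    "M2S1":   [("M2", 1)],
    "ROTATE": ["M0S2", "M1S0", "M2S1", ("REGCTL", 0b111111)],  # LOAD3
}


def expand(name):
    """Recursively expand an invocation into (field, value) assignments."""
    assignments = []
    for item in INVOCATIONS[name]:
        if isinstance(item, str):
            assignments.extend(expand(item))
        else:
            assignments.append(item)
    return assignments


def muxctl_value(assignments, default=0xFF):
    """Overlay subfield assignments on the 8-bit MUXCTL default (assumed)."""
    value = default
    for field, field_value in assignments:
        if field not in SUBFIELDS:
            continue                # e.g. REGCTL lives outside MUXCTL
        high, low = SUBFIELDS[field]
        mask = ((1 << (high - low + 1)) - 1) << low
        value = (value & ~mask) | ((field_value << low) & mask)
    return value


rotate = expand("ROTATE")
print(rotate)
print(f"{muxctl_value(rotate):02X}")
```

The single name ROTATE expands into three multiplexer selections and one register-control value, so the microinstruction that uses it stays readable while the bit-level detail lives in the declarations.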
The microprogram performs two operations: loading the contents of the
display panel's switch register into stack element S0 (and simultaneously pushing
the stack down), and performing a cyclic rotation of the top 3 elements of the
stack. Two pushbuttons on the display panel, LD and TST, control the actions.
When LD is pressed and released, the load-and-push operation will occur; pressing
and releasing TST will cause the execution of the rotate operation. It is necessary
to assure that the microcode for loading and rotating will be executed only once
for each push of the button. The code at locations 000 and 001 performs a
single-pulser function for the LD button; the code at locations 003 and 004 performs
a similar function for the TST button. From the discussion in Chapter 6, you
can see that in our microcode we have developed pure-control versions of the
single-pulsers. We return for a closer examination of this structure later in this
chapter.
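The behavior of the six microinstructions can be checked with a small simulation. The Python sketch below is an illustration only. It assumes that the buttons are presented as logical values sampled once per clock, that one microinstruction executes per sample, and that multiplexer i feeds stack element Si, so that a push performs S0<-SWR, S1<-S0, S2<-S1 and a rotate performs S0<-S2, S1<-S0, S2<-S1.

```python
# Simulation sketch of the six-microinstruction test program (illustrative).
# Assumptions: LD and TST are logical values sampled once per clock, and
# multiplexer i feeds stack element Si.

def run(samples, switch_register, stack=(0, 0, 0)):
    """Run one microinstruction per (ld, tst) sample; return stack and log."""
    s0, s1, s2 = stack
    address = 0
    log = []
    for ld, tst in samples:
        if address == 0:            # LOAD: JUMP TEST IF LD=%F
            address = 1 if ld else 3
        elif address == 1:          # JUMP * IF LD=%T (wait for release)
            address = 1 if ld else 2
        elif address == 2:          # JUMP BEGIN; push switches onto stack
            s0, s1, s2 = switch_register, s0, s1
            log.append("push")
            address = 0
        elif address == 3:          # TEST: JUMP LOAD IF TST=%F
            address = 4 if tst else 0
        elif address == 4:          # JUMP * IF TST=%T (wait for release)
            address = 4 if tst else 5
        else:                       # ROT: JUMP BEGIN; ROTATE
            s0, s1, s2 = s2, s0, s1
            log.append("rotate")
            address = 0
    return (s0, s1, s2), log


idle = [(False, False)] * 4
hold_ld = [(True, False)] * 10      # LD held for ten clocks
hold_tst = [(False, True)] * 10     # TST held for ten clocks

stack, log = run(idle + hold_ld + idle + hold_tst + idle, 7, (1, 2, 3))
print(stack, log)
```

Although each button is held for ten clock periods, the log records exactly one push and one rotate, which is the single-pulser behavior the microcode is meant to provide.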
When the instruction at location 003, JUMP LOAD IF TST.L=%F, is
executed, test input signal TST.L must be selected through the designer's test
multiplexer, and a false value of TST.L must cause a branch to microinstruction
LOAD. In the declaration phase, the command variable INMUX describes the
select signals for the test multiplexer. The declaration of the variable TST.L
specifies that TST.L will invoke the value 1 for INMUX. This agrees with Fig.
10-16, in which TST.L appears at input position 1 of the test multiplexer.
Test signals entering the test multiplexer may be represented as T = H or
T = L, like LD.H and TST.L. A jump may be desired for either a true or a
false value of a test signal, as in the microinstructions at locations 000 and 001,
or at 003 and 004. The LEASMB microassembler arranges for the correct voltage
to be presented to the 2910's CC input, in accordance with the specifications in
the program's instructions and declarations. The designer declares the voltage
conventions once and may thereafter deal with signals as logical entities. In our
example, the declaration of TST.L has a term T = %L appended to an otherwise
normal invocation. This declares that if a microinstruction ever specifies TST.L
as the source for the 2910's CC input, then truth is represented as a low voltage
level. The microprogram sequencer portion of an LEASMB microinstruction
contains a bit, set by the assembler, that specifies whether the incoming test
voltage should be inverted before it enters the 2910. (In exercises at the end of
the chapter, you are asked to ascertain from Fig. 10-17 which bit of the micro-
instruction corresponds to this CC-inversion flag, and what voltage transformations
are induced by the instructions in the microcode.)
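The bookkeeping that the assembler takes over can be sketched as follows. This Python fragment is a hypothetical illustration: the function names are ours, and the level at which the sequencer takes a conditional branch is left as a parameter (pass_level) whose default is only a stand-in, not a statement about the Logic Engine's wiring.

```python
# Sketch of the polarity bookkeeping an assembler must do (illustrative).
# Assumption: the sequencer takes a conditional branch when its test pin is
# at the level given by pass_level; "L" is only a stand-in default here.

def branch_voltage(truth_level, jump_when):
    """Voltage on the test signal at which the jump should be taken.

    truth_level: "H" or "L", the level that represents logical true.
    jump_when:   True for IF signal=%T, False for IF signal=%F.
    """
    opposite = {"H": "L", "L": "H"}
    return truth_level if jump_when else opposite[truth_level]


def needs_inversion(truth_level, jump_when, pass_level="L"):
    """True if the test voltage must be inverted before the sequencer."""
    return branch_voltage(truth_level, jump_when) != pass_level


# TST.L is declared with T=%L.
print(branch_voltage("L", False), needs_inversion("L", False))  # IF TST.L=%F
print(branch_voltage("L", True), needs_inversion("L", True))    # IF TST.L=%T
```

The designer writes only the logical condition; the two declarations (the signal's truth level and the sense of the jump) are enough to settle whether an inversion is required.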
Use this informal description of the language to follow the test program.
In the Logic Engine's microassembly language, we have tried to encourage the
use of high-level structured coding within a simple syntax. Let us now return
to the design of our Forth Machine circuit.

Testing the System

Using the command-bit structure declared in the microprogram, we wire the
appropriate bits of the Logic Engine's pipeline register to our architecture. We
assemble the test microprogram and load it into the Logic Engine's WCS, making
the appropriate monitor commands. We clear the Logic Engine, thereby causing
the next microinstruction to be taken from location 000. Feeling bold, we enter
the automatic run mode, causing the Logic Engine repeatedly to execute the
instructions at locations 000 and 003, waiting for us to press the LD or TST
pushbuttons on the display panel. We put a desired value into the display panel's
switch register, and press and release the LD button. Assuming that we have
wired the outputs of the stack elements to the LEDs on the display panel, we
may observe the results directly on the panel.
If, as is likely, there is some problem with the behavior of the load-and-
push operation, we must debug the architecture and the code. The single-step
mode is useful at this point; it allows us to execute our microcode yet freeze
the system after each instruction so that we may observe the status of any signals
in the architecture. Breakpoints are also useful, to suspend execution at a
chosen microinstruction. In debugging larger microprograms, a combination of
breakpoints and single-stepping is a powerful aid to the designer.
If we isolate a problem and wish to supply a particular set of commands
to the architecture, we may use the monitor's manual-execution mode, in which
we present manually generated command-bit patterns for loading into the pipeline
register. During manual execution, we may (in fact, must) specify whether we
wish the Logic Engine to issue a clocking signal to our architecture. We have
detailed control of the entire debugging process. If we determine that a particular
microinstruction is incorrect and the error does not warrant a reassembly of the
program at this time, we may manually change the microcode in the WCS.
Without all this assistance from the development system, the designer would
find it very difficult to express the control algorithm and debug the design.
Now we are ready to study a more complex microprogrammed design.

DESIGNING A MICROPROGRAMMED MINICOMPUTER


With the aid of powerful development facilities such as those provided by the
Logic Engine, we are in a position to tackle a more difficult problem. You have
already studied the LD20 hardwired minicomputer in Chapters 7, 8, and 9; in
this chapter we will borrow its architecture and will develop a microprogrammed
version of the control algorithm. We call the new design the LD30. Although
the execution will be slower, the development of the LD30's control algorithm
will be vastly simpler than the hardwired algorithm of the LD20. Nevertheless,
to accomplish a microprogrammed implementation of this magnitude will require
the use of all the sophisticated development aids offered by the Logic Engine.

Developing a Microprogram
The following prescription is useful for developing the microprogram:

1. Develop the main architecture. For our LD30, we would follow almost
identically the path taken in Chapter 7, and we will adopt the LD20's main
architecture.
2. Specify the obvious command bits for controlling the architecture. We
need not worry about details at this stage, but our knowledge of the ar-
chitecture will give us much insight into the commands needed to control
it. We may modify our list of commands later.
3. Write high-level microinstructions for the control algorithm. We will use
the Logic Engine's microassembly language, which will allow us to express
complex operations in symbolic terms without going into details. Since we
have already studied the elements of the control process for the LD20, we
can draw on this experience, using notations that are as close as reasonable
to those used in the LD20's ASM chart.
4. Develop the declaration phase of the microcode. This is primarily the
specification of invocation variables to expand in ever-increasing detail the
high-level notations used in the microinstructions.
5. Tidy up. Complete the details of command-bit representations, test inputs,
minor architectural elements, and definitions of the behavior of specific
chips used in the architecture.
The Architecture of the LD30
We adopt almost intact the architecture developed for the LD20. Figure 10-18
shows one of the twelve bit-slices in the main data structure. We see the principal
registers, the ALU, and the main bussing system for the 12-bit data of the LD30.



Figure 10-18 The main data paths of the LD30.
(Diagram labels for bit-slice j: AC, MA, MB, PC, IR, MEM IN and MEM OUT, DATA MUX,
ALU, EA, and INPUT.)

We will use the LD20's structure for the accumulator and link; this is shown
in Fig. 10-19. Also, we will use the LD20's structure for the memory and its
controller. We anticipate that the LD20's state generator will be completely
missing from our microprogrammed implementation, since the microprogram
controller in the Logic Engine performs the next-state selection. We expect our
LD30 to contain at least one new architectural element: the test multiplexer that
supports the delivery of the specified test signal to the microprogram controller.
We will specify other architectural elements as we encounter them in developing
the control algorithm for the LD30.

Figure 10-19 The accumulator and link of the LD30.
(Diagram labels: LINK, link multiplexer, 12-bit AC shift register, AC0-AC11.)

A First Approximation of the Command Bits

To provide a concrete point of reference for our later work, it is useful to specify
as many of the architecture's control signals (the microinstruction commands)
as are evident from the LD30's present architecture. This step could be deferred,
since at this point it is just an approximation, but we have learned to appreciate
having this early anchor to reality. As we pursue a top-down design, this knowledge
can keep us from generating interesting but unproductive microcode.
Our first attempt to define control signals for our LD30 might take the form
of Table 10-3. At a rough guess, we expect to have to test around 30 input or
status signals; 5 or 6 bits will probably be required for the test multiplexer's
select code. The controls for the data mux, the ALU, and the principal data
registers are the same as in the LD20. The link multiplexer and the memory
controls are also carried over from the earlier work on the LD20.

TABLE 10-3 PRELIMINARY COMMAND SIGNALS FOR THE LD30

TESTMUXCTL    COM    5 or 6 bits
DATAMUXCTL    COM    3 bits
ALUCTL        COM    6 bits
MALD          COM    1 bit
MBLD          COM    1 bit
PCLD          COM    1 bit
IRLD          COM    1 bit
ACCTL         COM    2 bits
LINKMUXCTL    COM    3 bits
MEMORYCTL     COM    2 bits
ENABLE.EA     COM    1 bit

We expect to need additional architectural elements for our LD30's interrupt
control, input-output control, Operate instruction priority control, and the halting
operations, but we are not yet clear about their nature.

Writing the High-Level Microcode

We will use our earlier analysis of the LD20's control algorithm (see Chapter
7) to help develop the LD30's microcode. The hard work (understanding the
PDP-8's specifications well enough to describe a correct control algorithm) is
the same for the microprogrammed LD30 as for the hardwired LD20. In the
LD20, we expressed the algorithm as an ASM chart. For the LD30, we will
use the Logic Engine's LEASMB microassembly language. The two versions
differ dramatically in appearance: the ASM chart is a two-dimensional flow
diagram, whereas the microassembly language is linear, like a computer program.
The two versions also differ in detail, since the LD20's ASM contains many
states with multiple tests (multiple qualifiers) and conditional outputs, which are
not permitted in the single-qualifier microinstructions supported by the Logic
Engine. And the implementations of the two control algorithms are wildly different.
However, the overall sequencing should remain the same. We do not need an
ASM for our work on the LD30, but to tie this work more closely with the
LD20, we will use LD20 nomenclature when practical. Figure 10-20 is a high-level
diagram of the LD30's control algorithm. The algorithm consists of four
principal blocks: the fetch phase, the execute phase, the idle phase, and the
manual phase. We must decompose each block into a sequence of microinstructions.
As always, we will strive for a top-down development.

Figure 10-20 LD30 control.
(Flowchart blocks: FETCH PHASE, with the halt test, fetch next instruction, form
effective address, and get operand from memory if an operand is needed; EXECUTE
PHASE, execute the instruction; IDLE, with the continue test and the
manual-operation test; and perform MANUAL operation.)

The Idle Phase of the LD30

The purpose of the idle phase is to detect an operator's action at the LD30
display panel. As in the LD20, we must take care to process only synchronized
input signals in our algorithm, and we must provide a means of processing each
depression of a pushbutton on the display panel only once.
We may contemplate the use of a flip-flop HALTFF to record the status
of halt requests, as we did in designing the LD20. A request to halt will eventually
cause the algorithm to reach the idle phase; we should be prepared to clear the
halt request at this time. The addition of a HALTFF flip-flop to our architecture
will require appending two command bits to our list (for controlling J and K in
a JK flip-flop or D and its enable in an enabled D flip-flop).
In the LD20 a composite signal MANSW* indicates the assertion of at least
one pushbutton signal. MANSW* is the input to a single-pulser whose output
is MANPULSE. MANPULSE is synchronized with our system clock and assures
that a depression of a pushbutton is responded to only once. Following the
discussion of single-pulsers in Chapter 6, we can envision several implementations
of the MANPULSE signal.
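One such implementation can be modeled in a few lines of Python. The sketch below assumes the familiar two-flip-flop arrangement, in which the first flip-flop synchronizes the input, the second delays it by one clock, and the pulse is the AND of the first output with the complement of the second; the function name and the sampled-input representation are ours.

```python
# Sketch of a two-flip-flop single-pulser (illustrative assumption about the
# circuit: Q1 synchronizes the input, Q2 delays Q1 by one clock, and the
# pulse is Q1 AND NOT Q2).

def single_pulse(samples):
    """Return the MANPULSE value seen during each clock period."""
    q1 = q2 = False
    pulses = []
    for mansw in samples:
        q1, q2 = mansw, q1          # both flip-flops load on the clock edge
        pulses.append(q1 and not q2)
    return pulses


presses = [False, True, True, True, False, False, True, False]
print(single_pulse(presses))
```

However long the composite signal stays asserted, the output is true for one clock period per depression, so the control algorithm responds to each press once.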
Figure 10-21a shows an architectural implementation of MANPULSE, involving
the two flip-flops and an AND gate that we have come to expect of this
circuit. With this circuit, the microcode for the LD30's idle phase might be:

IDLE        EQU   *
            JUMP  * IF MANPULSE=%F; CLEAR.HALTFF
            JUMP  FETCH IF CONT.SW=%T
            CALL  MANUAL
            JUMP  IDLE

Figure 10-21 Two versions of the hardware for the LD30's single-pulser.
(a) Pure architectural implementation: input MANSW*, output MANPULSE.
(b) For a pure microprogrammed implementation: input MANSW*, output MANUAL.SW.

From Chapter 6, we learned that the single-pulser could be expressed as
a pure algorithm rather than as an architectural element. In hardwired design,
both views result in the same circuit. In microprogrammed design, the algorithm
is expressed as microcode, and is never reduced to a hardwired circuit. Thus,
we may implement the LD30's idle phase using a single flip-flop to synchronize
the asynchronous signal MANSW*, as shown in Fig. 10-21b and in the following
microcode:



IDLE        EQU   *
            JUMP  * IF MANUAL.SW=%T; CLEAR.HALTFF
            JUMP  * IF MANUAL.SW=%F
            JUMP  FETCH IF CONT.SW=%T
            CALL  MANUAL
            JUMP  IDLE

The first instruction jumps in place until all buttons are released, thus assuring
that any previous depression of a pushbutton is processed only once. The next
step is to hang up until the operator depresses a pushbutton, which then allows
the idle phase to proceed. This second procedure is superior, since it requires
less hardware and only one additional microinstruction. This form of single-
pulser was used in our Forth Machine design example.
As we did in the LD20, we will insert into the architecture the large OR
gate necessary to produce MANSW* from the collection of individual pushbutton
signals on the display panel. We could dispense with this OR gate by writing
microcode that serially single-pulses through the individual pushbutton signals,
but this would require synchronizing each signal separately and would generate
much additional microcode.
The Manual Phase of the LD30

The processing of manual operations in the LD20 was rather tedious, since the
manual operations were merged with the execute phase. In the LD30, the manual
phase is much more straightforward, as the following microcode shows:
MANUAL      EQU   *
            CALL  LDMA IF LDMA.SW=%T
            CALL  LDMB IF LDMB.SW=%T
            CALL  LDPC IF LDPC.SW=%T
            CALL  LDIR IF LDIR.SW=%T
            CALL  LDAC IF LDAC.SW=%T
            CALL  CLEAR IF CLEAR.SW=%T
            CALL  EXAMINE IF EXAMINE.SW=%T
            CALL  LDMEM IF LDMEM.SW=%T
            CALL  DEPOSIT IF DEPOSIT.SW=%T
            RTN

LDMA        RTN   ; SWR.TO.MA ; (the first ";" marks a null sequencer field)
LDMB        RTN   ; SWR.TO.MB
LDPC        RTN   ; SWR.TO.PC
LDIR        RTN   ; SWR.TO.IR
LDAC        RTN   ; SWR.TO.AC
CLEAR       RTN   ; CLEAR.LINK, CLEAR.INT.ENABLE

LDMEM       CALL  WRITE; SWR.TO.MB
            RTN
DEPOSIT     CALL  WRITE; SWR.TO.MB
            RTN   ; INCREMENT.MA



This microcode employs a sequential test of each pushbutton; if one is
depressed, a subprogram is called to process it. A typical operation such as
SWR.TO.MA must cause the transfer of the switch register to the selected
register, in this case the memory address register. At this stage, we are unconcerned
with the details of this transfer, other than to realize that, in our architecture,
it can be performed in one clock cycle. Later, we will expand SWR.TO.MA
and its cousins into the proper command-bit assertions, but now we do not
permit these details to interrupt our thoughts.
Writing into memory requires the presentation to the memory buffer register
of the data to be written, followed by an execution of a standard memory-write
subprogram. We may specify the details of the WRITE subprogram at our
convenience, but to satisfy your curiosity, we will do so now:
WRITE       EQU   *
            CONT  ; START.WRITE
            JUMP  * IF CC=%F
            RTN
Using the standard memory protocol of the LD20, we issue a START.WRITE
signal (one of our two memory command signals) and wait until the cycle-
complete response CC becomes true.
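The flow of control through the manual phase and the WRITE subprogram can be traced with a very small interpreter. The Python sketch below is an illustration, not a model of the 2910: it covers only the MANUAL entries for LDMA and DEPOSIT, it uses the label WAIT in place of the * notation, and it assumes that the commands of a microinstruction are issued whether or not its branch is taken and that CC becomes true on the second poll.

```python
# Minimal microcode interpreter sketch (illustrative; not the 2910 itself).
# Each microinstruction is (label, op, target, test, commands), where test is
# None or (signal, required_value). Commands are issued whether or not the
# branch is taken. WAIT stands for the "*" of the original microcode.

PROGRAM = [
    ("MANUAL",  "CALL", "LDMA",    ("LDMA.SW", True),    []),
    (None,      "CALL", "DEPOSIT", ("DEPOSIT.SW", True), []),
    (None,      "RTN",  None,      None,                 []),
    ("LDMA",    "RTN",  None,      None,                 ["SWR.TO.MA"]),
    ("DEPOSIT", "CALL", "WRITE",   None,                 ["SWR.TO.MB"]),
    (None,      "RTN",  None,      None,                 ["INCREMENT.MA"]),
    ("WRITE",   "CONT", None,      None,                 ["START.WRITE"]),
    ("WAIT",    "JUMP", "WAIT",    ("CC", False),        []),
    (None,      "RTN",  None,      None,                 []),
]
LABELS = {row[0]: i for i, row in enumerate(PROGRAM) if row[0]}


def run(entry, read_signal, limit=100):
    """Execute from entry until its final RTN; return the issued commands."""
    pc, stack, issued = LABELS[entry], [], []
    for _ in range(limit):
        _, op, target, test, commands = PROGRAM[pc]
        issued.extend(commands)
        passed = test is None or read_signal(test[0]) == test[1]
        if op == "RTN":
            if not stack:
                return issued
            pc = stack.pop()
        elif op == "CALL" and passed:
            stack.append(pc + 1)
            pc = LABELS[target]
        elif op == "JUMP" and passed:
            pc = LABELS[target]
        else:
            pc += 1
    raise RuntimeError("microprogram did not return")


def make_signals(pressed, cc_ready_after=2):
    """Signal source: one button pressed; CC true from the given poll on."""
    polls = {"CC": 0}

    def read_signal(name):
        if name == "CC":
            polls["CC"] += 1
            return polls["CC"] >= cc_ready_after
        return name == pressed

    return read_signal


print(run("MANUAL", make_signals("DEPOSIT.SW")))
```

With DEPOSIT pressed, the trace shows the switch register presented to MB, the write started, and MA incremented after the memory cycle completes, in that order.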
We've completed the microcode for the idle and manual phases. Next
comes the instruction-fetch phase.
The Fetch Phase of the LD30
The conversion of the fetch phase of the LD20's ASM to our microprogrammed
version is straightforward:
FETCH       EQU   *
F1          JUMP  IDLE IF HALTFF=%T
            CALL  INTERRUPT.TEST
F1.1        CONT  ; PC.TO.MA, PC.TO.MB, LOAD.IOP.ENABLER

F2          CONT  ; INCREMENT.PC

F3          CALL  READ.TO.IR
F3.1        JUMP  FETCH.DONE IF NO.MEMORY; EA.TO.MA

F4          CALL  READ.TO.MB
            JUMP  FETCH.DONE IF DIRECT.ADDRESSING
            CALL  AUTO IF AUTO.INDEXING

F6          JUMP  FETCH.DONE IF NO.INDIRECT.OPERAND; MB.TO.MA

F7          CALL  READ.TO.MB

FETCH.DONE  JUMP  EXECUTE;; operand, if any, is in MB
*                           EA, if any, is in MA

AUTO        EQU   * ; auto-indexing
            CALL  WRITE; INCREMENT.MB
            RTN

This microcode calls on several subprograms that are as yet unspecified:
INTERRUPT.TEST, READ.TO.IR, READ.TO.MB. The memory-read subprograms
are short, and we will elaborate them here; the treatment of
INTERRUPT.TEST will appear later.
READ.TO.MB  EQU   * ; read memory(MA) into MB
            CALL  READ
            RTN   ; MEM.TO.MB

READ.TO.IR  EQU   * ; read memory(MA) into IR
            CALL  READ
            RTN   ; MEM.TO.IR

READ        EQU   * ; initiate and synchronize memory reads
            CONT  ; START.READ
            JUMP  * IF CC=%F
            RTN

In Chapter 7, developing the fetch phase was rather complex but now that
we understand it, rendering our understanding into microcode is simple. As
always, we have made good use of our ability to describe each microinstruction
in high-level terms.
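For readers who want to check the fetch phase against the PDP-8's addressing rules, the following Python sketch computes what the phase delivers for one instruction. It is an illustration at the level of the instruction set, not a translation of the microcode. The function and its return convention are hypothetical. We assume that NO.MEMORY corresponds to op codes 6 and 7, that the current-page bits come from the address of the instruction, and that MB is reported as None when the instruction does not read an operand (the microcode may in fact leave a value there).

```python
# Sketch of what the fetch phase computes for a PDP-8 instruction
# (illustrative; the function and its return convention are hypothetical).
# PDP-8 numbering: bit 0 is the most significant of 12 bits.
# Bits 0-2 op code, bit 3 indirect, bit 4 page select, bits 5-11 page offset.

MASK12 = 0o7777
NEEDS_OPERAND = {0, 1, 2}        # AND, TAD, ISZ read their operand


def fetch(memory, pc):
    """Return (ir, ma, mb, next_pc) as they stand on entry to execute."""
    ir = memory[pc]
    next_pc = (pc + 1) & MASK12
    opcode = ir >> 9
    if opcode >= 6:                      # IOT and Operate: no memory reference
        return ir, pc, None, next_pc
    indirect = (ir >> 8) & 1
    current_page = (ir >> 7) & 1
    ea = (ir & 0o177) | ((pc & 0o7600) if current_page else 0)
    mb = None
    if indirect:
        pointer = memory[ea]
        if 0o10 <= ea <= 0o17:           # auto-index locations
            pointer = (pointer + 1) & MASK12
            memory[ea] = pointer
        ea = pointer
    if opcode in NEEDS_OPERAND:
        mb = memory[ea]
    return ir, ea, mb, next_pc


memory = [0] * 4096
memory[0o200] = 0o1410                   # TAD indirect through location 0010
memory[0o10] = 0o277                     # auto-index pointer
memory[0o300] = 0o1234                   # the operand
ir, ma, mb, pc = fetch(memory, 0o200)
print(oct(ir), oct(ma), oct(mb), oct(pc), oct(memory[0o10]))
```

In the example, the pointer in location 0010 is incremented before use, so the operand is taken from location 0300 and the incremented pointer is written back, which is the job of the AUTO subprogram in the microcode.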

The Execute Phase of the LD30

At the start of the execute phase, IR will contain the current instruction, MB
will contain any needed operand from memory, and MA will contain the effective
address EA, if required. For the execute phase, we must develop microcode to
decode the instruction and perform each type of PDP-8 operation. Given our
work on the LD20, the execution of most of the instructions is straightforward,
and we might proceed as follows; some of the details are left for you to complete.

EXECUTE     EQU   * ; main entry for execute phase
            JUMP  *+2 IF SING.INST.SW=%F
            CONT  ; SET.HALTFF ; only if single instruction switch is on
            JUMP  DECODE.INST ;; decode the instruction, somehow

EXEC.DONE   JUMP  FETCH ;; all done, return to fetch phase

* PDP-8 instruction-execution code. The LD30 instruction
* decoding step must branch to the proper code.

AND.CODE    EQU   * ; for PDP-8 AND instruction
* You complete this microcode.

TAD.CODE    EQU   * ; for PDP-8 TAD instruction
            JUMP  TAD.END IF ALU.COUT=%F; ADD.MB.TO.AC
            CONT  ; COMPLEMENT.LINK ; only for a carry-out
TAD.END     JUMP  EXEC.DONE

ISZ.CODE    EQU   * ; for PDP-8 ISZ instruction
            CALL  WRITE; INCREMENT.MB
            JUMP  ISZ.END IF MB.IS.ZERO=%F
            CONT  ; INCREMENT.PC ; only if result is zero
ISZ.END     JUMP  EXEC.DONE

DCA.CODE    EQU   * ; for PDP-8 DCA instruction
* You complete this microcode.

JMS.CODE    EQU   * ; for PDP-8 CALL instruction
            CALL  WRITE; PC.TO.MB
            JUMP  EXEC.DONE; INCR.MA.TO.PC

JMP.CODE    EQU   * ; for PDP-8 JUMP instruction
* You complete this microcode.

IOT.CODE    EQU   * ; for PDP-8 IOT instruction
* This one is more complex. Defer it until later.

OP.CODE     EQU   * ; for PDP-8 Operate instruction
* This one is more complex. Defer it until later.


In the TAD.CODE block, the development of the arithmetic sum is caused
by the execution of the ADD.MB.TO.AC invocation in the first microinstruction.
The results of the addition, both the sum and the carry-out status, are available
during the instruction. At the end of the instruction, the sum is loaded into the
AC. During the instruction, we test the carry-out signal ALU.COUT, and if it
is true, we arrange to complement the link bit.
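The arithmetic performed by the two TAD microinstructions can be summarized in a short Python function. This is a sketch of the instruction's effect on a 12-bit accumulator and a 1-bit link; the function name is ours.

```python
# Sketch of the effect of the PDP-8 TAD instruction (illustrative).
# The 12-bit sum replaces AC; a carry out of the most significant bit
# complements the link rather than setting it.

def tad(ac, link, mb):
    """Return the new (ac, link) after adding mb to ac."""
    total = ac + mb
    carry_out = total > 0o7777
    if carry_out:
        link ^= 1                   # COMPLEMENT.LINK, only for a carry-out
    return total & 0o7777, link


print(tad(0o7777, 0, 0o0001))
print(tad(0o1234, 1, 0o0001))
```

The first call produces a carry-out, so the link changes from 0 to 1; the second produces none, so the link keeps its previous value.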
The microcode for the execution of three of the PDP-8's instructions is left
to you to complete. To finish the algorithm for the LD30's execute phase, we need
only specify the code for the Operate and IOT instructions, deal with interrupts,
settle on how we will decode instructions, and clean up a few loose ends.

The Operate instruction. In designing the LD20, we developed a sophisticated
method of saving clock cycles in the execution of the PDP-8's Operate
instructions. Microprogrammed control, although allowing much parallel activity
in each microinstruction, is inherently more serial than hardwired control. The
elaborate priority-detection system of the LD20 usually saved a few cycles (and
served as a good illustration of priority circuits), but attempting a similar scheme
in our inherently slower LD30 would be misplaced effort. We resign ourselves
to plodding serially through all the possible operations within each group of the
PDP-8's Operate instruction. Let's assume that the decoding of the instruction,
in addition to detecting the Operate instruction, has also identified the subgroup.
This is an arbitrary but reasonable choice, and it will fit right in with the elegant
scheme to be discussed later.
A reasonable microcode for a part of the PDP-8's Operate instruction might
then be as follows:
OP.G1.CODE  EQU   * ; for PDP-8 Operate instruction, group 1

G1.P1       JUMP  *+2 IF IR4=0
            CONT  ; CLEAR.AC ; CLA operation

            JUMP  *+2 IF IR5=0
            CONT  ; CLEAR.LINK ; CLL operation

G1.P2       JUMP  *+2 IF IR6=0
            CONT  ; COMPLEMENT.AC ; CMA operation

In problems at the end of the chapter, you are asked to complete the
microcode for the Operate instruction.

IOT Instruction. The PDP-8's IOT instruction performs input and output
operations. As in the LD20, our LD30 must arrange for the signals IOP1, IOP2,
and IOP4 to be asserted, if requested, long enough to examine the values of the
incoming status signals IOSKIP, ACCLR, and ORAC. This examination will
require several microinstructions, during which the appropriate IOP signal must
remain solidly asserted. We could write brute-force microcode for this problem,
with code to test the status of each of the three IOP signals. However, our
choice is to introduce an element into the LD30's architecture to generate and
maintain the IOP signals as required. We have already seen such an element
in the LD20: the IOP signal enabler. This circuit has a 4-bit shift register that
allows the enabling of each IOP signal in turn. We will use this circuit, shown
in Fig. 7-34, in our LD30. The microcode for the IOT instruction must then
loop three times over the status tests, once for each of the possible IOP signals.
We use the 2910 sequencer's internal counting instructions to manage the looping.
IOT.CODE  EQU * ; for PDP-8 input-output operations
          LDCT 2 ;; load the 2910 R-register with (loop-count - 1)
IOT.LOOP  CONT ; SHIFT.IOP.ENABLER
E3        JUMP *+2 IF IOSKIP=%F
          CONT ; INCREMENT.PC ; if IOSKIP is asserted
E4        JUMP *+2 IF ACCLR=%F
          CONT ; CLEAR.AC ; if ACCLR is asserted
E5        JUMP *+2 IF ORAC=%F
          CONT ; OR.INPUT.TO.AC ; if ORAC is asserted
          RPCT IOT.LOOP; decrement & test R-reg, branch if non-zero
          JUMP EXEC.DONE
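
To see why the R-register is loaded with 2 for three passes, the loop can be sketched in Python. The model below is an illustration under stated assumptions: RPCT branches and decrements while R is nonzero and falls through when R is zero; the device status for each IOP signal is supplied as a dictionary; and the input bus value is held constant for simplicity. All Python names are hypothetical.

```python
def run_iot_loop(status_by_pulse, pc=0, ac=0, input_bus=0):
    """Model IOT.LOOP. status_by_pulse holds one dict of status flags per IOP signal."""
    r = 2                                   # LDCT 2: loop count minus 1
    pulse = 0
    while True:
        status = status_by_pulse[pulse]     # SHIFT.IOP.ENABLER: next IOP signal
        if status.get("IOSKIP"):
            pc = (pc + 1) & 0o7777          # INCREMENT.PC
        if status.get("ACCLR"):
            ac = 0                          # CLEAR.AC
        if status.get("ORAC"):
            ac |= input_bus & 0o7777        # OR.INPUT.TO.AC
        pulse += 1
        if r == 0:                          # RPCT: fall through when R is zero
            break
        r -= 1                              # RPCT: otherwise decrement and branch
    return pc, ac, pulse
```

Because the test happens before the decrement, loading R with 2 yields exactly three passes, one for each IOP signal.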

The IOT instruction provides two suboperations, ION and IOF, for enabling
and disabling the interrupt system. The suboperations are distinguished from
regular IOT instructions by a zero device address in IR3-IR8. The microcode
for these operations is:
IOT.ION.CODE  EQU *; turn on (enable) interrupt system
              JUMP EXEC.DONE; SET.INT.ENABLE

IOT.IOF.CODE  EQU *; turn off (disable) interrupt system
              JUMP EXEC.DONE; CLR.INT.ENABLE

Interrupt Processing in the LD30

The LD30 must be able to detect an interrupt request and present a synchronized
version of the request to the microprogrammed control algorithm. The PDP-8's
interrupt protocol also requires that we be able to determine if interrupts are
enabled. If an interrupt is to occur, we must be able to force the execution of
a CALL 0 instruction. For these functions, we adopt the same architectural
elements used in the LD20. We show the architecture for the interrupt system
in Fig. 10-22.
Figure 10-22 Architecture of the interrupt system in the LD30. (The figure shows
INTERRUPT.REQUEST* synchronized to produce INTERRUPT.REQUEST; INT.EN.CTL(0) and
INT.EN.CTL(1) controlling INTERRUPT.ENABLE; and, for forcing a JMS, JAMIR0 acting
on the paths from ALU0, ALU1, and ALU2 to IR0, IR1, and IR2.)

Handling an interrupt request began early in the fetch phase with a subprogram
call to INTERRUPT.TEST. At the time we wrote that call, we had no clear
idea how interrupts would be handled. Now, having made some architectural
commitments to generate the INTERRUPT.REQUEST and INTERRUPT.ENABLE
signals, and to finesse the awkward problem of generating a CALL 0, we are
able to specify the microcode fully.
INTERRUPT.TEST  EQU *; determine if an interrupt should occur
* no interrupt is possible if the previous instruction was an
* IOT instruction that enabled the interrupt system.
                RTN IF IOT.ION=%T ; no interrupt if just enabled
                RTN IF INTERRUPT.ENABLE=%F; no int if disabled
                RTN IF INTERRUPT.REQUEST=%F; no int if no request
* Create an interrupt! Disable interrupts and force a "CALL 0".
                CJPP EXECUTE, PASS; CLR.INT.ENABLE, FORCE.JMS.TO.ZERO

We called the interrupt-test code as a subprogram. If no interrupt occurs,
the INTERRUPT.TEST microcode returns to the subprogram caller. If an interrupt
occurs, the algorithm enters the regular execute-phase microcode, which is not
a subprogram. In that case, the code never makes a normal subprogram return back
to the fetch phase, so we must take care to adjust the 2910 stack to remove the
unused return point. The 2910 operation CJPP in the pass mode accomplishes
both a jump and a stack pop.
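
The stack bookkeeping can be checked with a small Python model. This is a sketch only: the 2910 stack is represented by a Python list, the return point is an arbitrary label, and the function name and return convention are hypothetical.

```python
def interrupt_test(stack, iot_ion, int_enable, int_request):
    """Model INTERRUPT.TEST. Returns (next_location, int_enable, forced_jms)."""
    if iot_ion or not int_enable or not int_request:
        # RTN: pop the return point and resume in the fetch phase.
        return stack.pop(), int_enable, False
    # CJPP in pass mode: jump to EXECUTE and discard the unused return point.
    stack.pop()
    # CLR.INT.ENABLE and FORCE.JMS.TO.ZERO accompany the jump.
    return "EXECUTE", False, True
```

In both outcomes the list ends up one entry shorter, which is the point of using CJPP: the return point pushed by the call never lingers on the stack when an interrupt diverts control to the execute phase.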

Instruction Decoding in the LD30

The microcode for the execute phase calls a subprogram to decode the PDP-8
instruction residing in the IR. The operation code for most instructions is specified
by the three bits in IR0, IR1, and IR2. Since our microinstructions can examine
only one signal at a time, a pure microcode solution to instruction decoding is
clumsy (although perfectly feasible). Here is a brute-force way:
DECODE.INST  EQU * ; pure microcode instruction decoding
             JUMP CODE.1XX IF IR0=1
CODE.0XX     JUMP CODE.01X IF IR1=1
CODE.00X     JUMP TAD.CODE IF IR2=1
             JUMP AND.CODE
CODE.01X     JUMP DCA.CODE IF IR2=1
             JUMP ISZ.CODE
CODE.1XX     JUMP CODE.11X IF IR1=1
CODE.10X     JUMP JMP.CODE IF IR2=1
             JUMP JMS.CODE
CODE.11X     JUMP OP.CODE IF IR2=1
             JUMP IOT.CODE
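
The brute-force decoder is a binary decision tree on IR0, IR1, and IR2. The Python sketch below mirrors the jumps one for one and counts the microinstructions executed, assuming one microinstruction per jump; the function name is hypothetical.

```python
def decode_serial(ir0, ir1, ir2):
    """Return (target label, microinstructions executed) for the serial decode."""
    if ir0:                                  # JUMP CODE.1XX IF IR0=1
        if ir1:                              # CODE.1XX: JUMP CODE.11X IF IR1=1
            # CODE.11X: JUMP OP.CODE IF IR2=1, else JUMP IOT.CODE
            return ("OP.CODE", 3) if ir2 else ("IOT.CODE", 4)
        # CODE.10X: JUMP JMP.CODE IF IR2=1, else JUMP JMS.CODE
        return ("JMP.CODE", 3) if ir2 else ("JMS.CODE", 4)
    if ir1:                                  # CODE.0XX: JUMP CODE.01X IF IR1=1
        # CODE.01X: JUMP DCA.CODE IF IR2=1, else JUMP ISZ.CODE
        return ("DCA.CODE", 3) if ir2 else ("ISZ.CODE", 4)
    # CODE.00X: JUMP TAD.CODE IF IR2=1, else JUMP AND.CODE
    return ("TAD.CODE", 3) if ir2 else ("AND.CODE", 4)
```

Every instruction costs three or four microinstructions before its execution code even begins, which illustrates why the serial approach is clumsy for what is really a parallel decoding problem.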

In developing microcode for the execution phase, we decided that the
instruction-decoding step would distinguish the three groups of Operate instructions
and the two special cases of the IOT instruction. We could complete this decoding
with further tests of additional bits of the LD30's IR, although the test for the
special cases of the IOT instruction is complicated by having to
identify a 6-bit field of zeroes in bits IR3-IR8. All of this microcode is a serialized
version of a basically parallel function: the decoding of a several-bit code. We
already know that combinational logic, implemented with gates, decoders, or
ROMs, can achieve rapid decoding.
We may simplify the decoding of LD30 instructions and illustrate some
powerful control techniques by developing a solution that mixes architecture and
algorithm. The 2910, at the heart of the Logic Engine's microprogram controller,
normally expects to receive its D input from the microinstruction pipeline register;
14 of the 2910's 16 operation codes assume this. However, for the 2910's JMAP
instruction, the D input comes from a different source, usually from a "mapping
ROM." Let's use a ROM to assist the decoding.
Our goal is to examine bits of the LD30's IR and branch within our control
microcode to the proper code for executing each type of PDP-8 instruction. The
decoding of the PDP-8's basic instructions and our special cases requires examining
all 12 bits of the IR. A pure ROM solution would use ROMs with 12 address
inputs to produce about a dozen microcode addresses. However, many of the
bits of the IR are needed only to detect the two special cases of the IOT
instruction, ION and IOF. Our plan is to incorporate two signals in hardware:
IOT.ION will be true for the interrupt-enable suboperation; IOT.IOF will be true
for the interrupt-disable suboperation. For now, we assume the existence of
these two signals; we will discuss how to create them later.
Now the inputs to our instruction-decoding ROM are considerably simplified.
We require IR0, IR1, and IR2 to decode the main operation code; IR3 and IR11
to distinguish the Operate instruction groups; and IOT.ION and IOT.IOF to
identify the IOT subinstructions and to distinguish ordinary input-output IOT
instructions from the special cases. These 7 signals form the inputs to the ROM;
the ROM's output must produce microcode branch addresses that we can use
to transfer to sections of the microprogram dealing with the PDP-8's 12 instructions
and subinstructions. What are the 12 microcode branch addresses? From our
microcode for the execute phase we see examples such as AND.CODE,
IOT.CODE, and OP.G2.CODE. But the experienced microprogrammer will
recognize a pitfall: the actual addresses of these locations in the microprogram
memory depend heavily on the details of the microcode. If we modify the
microcode, the likelihood is that a reassembly of the microcode will change the
addresses of the subprograms. We certainly do not desire to reprogram our
ROM every time we reassemble the LD30's control microprogram! To avoid
this difficulty, we introduce a jump table into our microprogram. This table will
be in a fixed location in the microinstruction memory, and we will guarantee
not to alter its location. The ROM's outputs will point to entries in this jump
table, and the table entries, from their fixed locations, will jump to the subprogram
addresses. The advantages are great: the contents of the ROM remain fixed
even though we modify the microcode, and our microassembler will handle the
drudgery of locating the subprograms. Table 10-4 shows the structure of the
mapping ROM and the microcoded jump table. If speed of execution is sufficiently
important, after the design is ready for production you may program a new
mapping ROM to point directly to the instruction execution code, thereby eliminating
the microcoded jump table.

TABLE 10-4 DECODING THE LD30'S INSTRUCTIONS

CONTENTS OF THE MAPPING ROM

Inputs                                        Outputs
IR0  IR1  IR2  IR3  IR11  IOT.ION  IOT.IOF    Jump address
 0    0    0    X    X      X        X        $10  AND.INST
 0    0    1    X    X      X        X        $11  TAD.INST
 0    1    0    X    X      X        X        $12  ISZ.INST
 0    1    1    X    X      X        X        $13  DCA.INST
 1    0    0    X    X      X        X        $14  JMS.INST
 1    0    1    X    X      X        X        $15  JMP.INST
 1    1    0    X    X      1        0        $16  IOT.ION.INST
 1    1    0    X    X      0        1        $17  IOT.IOF.INST
 1    1    0    X    X      0        0        $18  IOT.INST
 1    1    1    0    X      X        X        $19  OP.G1.INST
 1    1    1    1    0      X        X        $1A  OP.G2.INST
 1    1    1    1    1      X        X        $1B  OP.G3.INST

MICROCODED JUMP TABLE
              ORG $10; opcode jump table
AND.INST      JUMP AND.CODE
TAD.INST      JUMP TAD.CODE
ISZ.INST      JUMP ISZ.CODE
DCA.INST      JUMP DCA.CODE
JMS.INST      JUMP JMS.CODE
JMP.INST      JUMP JMP.CODE
IOT.ION.INST  JUMP IOT.ION.CODE
IOT.IOF.INST  JUMP IOT.IOF.CODE
IOT.INST      JUMP IOT.CODE
OP.G1.INST    JUMP OP.G1.CODE
OP.G2.INST    JUMP OP.G2.CODE
OP.G3.INST    JUMP OP.G3.CODE

With the addition of the mapping ROM for decoding instructions, we may
now write the DECODE.INST microcode whose existence we assumed when
we developed the original microcode for the LD30's execute phase:
DECODE.INST EQU *; instruction decoding
JMAP ;; jump to correct instruction processing code
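
The two-level scheme (mapping ROM, then jump table) can be modeled in Python. This is a sketch under assumptions: the ROM rows follow Table 10-4 with the input order IR0, IR1, IR2, IR3, IR11, IOT.ION, IOT.IOF as inferred from the text, and the Python names are hypothetical. Input combinations not covered by the table (IOT.ION and IOT.IOF both true) are simply left out of the model.

```python
from itertools import product

# Rows as (pattern over IR0 IR1 IR2 IR3 IR11 IOT.ION IOT.IOF, ROM output).
ROM_ROWS = [
    ("000XXXX", 0x10), ("001XXXX", 0x11), ("010XXXX", 0x12),
    ("011XXXX", 0x13), ("100XXXX", 0x14), ("101XXXX", 0x15),
    ("110XX10", 0x16), ("110XX01", 0x17), ("110XX00", 0x18),
    ("1110XXX", 0x19), ("11110XX", 0x1A), ("11111XX", 0x1B),
]

# Fixed-location jump table: ROM output -> label of the execution code.
JUMP_TABLE = {
    0x10: "AND.CODE", 0x11: "TAD.CODE", 0x12: "ISZ.CODE", 0x13: "DCA.CODE",
    0x14: "JMS.CODE", 0x15: "JMP.CODE", 0x16: "IOT.ION.CODE",
    0x17: "IOT.IOF.CODE", 0x18: "IOT.CODE", 0x19: "OP.G1.CODE",
    0x1A: "OP.G2.CODE", 0x1B: "OP.G3.CODE",
}


def build_rom():
    """Expand the don't-care rows into one entry per 7-bit ROM address."""
    rom = {}
    for combo in product("01", repeat=7):
        bits = "".join(combo)
        for pattern, address in ROM_ROWS:
            if all(p in ("X", b) for p, b in zip(pattern, bits)):
                rom[bits] = address
    return rom


ROM = build_rom()


def decode(ir, iot_ion=0, iot_iof=0):
    """Model JMAP followed by the jump-table entry for a 12-bit IR."""
    def ir_bit(n):
        return (ir >> (11 - n)) & 1   # PDP-8 numbering: bit 0 is the MSB
    inputs = [ir_bit(0), ir_bit(1), ir_bit(2), ir_bit(3), ir_bit(11),
              iot_ion, iot_iof]
    return JUMP_TABLE[ROM["".join(str(b) for b in inputs)]]
```

Only JUMP_TABLE would change when the microcode is reassembled; ROM_ROWS, like the physical ROM, stays fixed. That separation is the design choice the jump table buys.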

Declarations for the LD30 Control Microprogram

We have developed the main elements of the LD30's architecture and specified
the control algorithm in high-level terms. Now we will develop the declaration
phase of the microcode, in which we will expand our high-level invocations into
actual command signals directed to the LD30 architecture.

Consider operations involving the main data paths. If we scan the executable
instructions in the microprogram, we can catalog a large number of operations
on the main data path. PC.TO.MA, CLEAR.AC, ADD.MB.TO.AC, and
SWR.TO.IR are examples. In our version of the microcode, there are 27 different
invocations that refer to the main data path architecture. These require selecting
an operand through the data multiplexer, performing an appropriate ALU operation,
and loading the result into some register. Still resisting the temptation to jump
into the details of the hardware too soon, we break down each of these high-
level invocations into more detailed specifications:
PC.TO.MA INV SEL.PC, ALU.PASS, LOAD.MA
CLEAR.AC INV ALU.ZERO, LOAD.AC
ADD.MB.TO.AC INV SEL.MB, ALU.PLUS, LOAD.AC
SWR.TO.IR INV SEL.SWR, ALU.PASS, LOAD.IR

We have tried to choose obvious names for the intermediate operations. For
instance, SEL.MB means "select the MB register as the input to the ALU,"
ALU.PLUS means "cause the ALU to add its operands," and LOAD.AC means
"load the AC with whatever is at its data inputs." At this stage, if we were to
change our minds about some details of our main data path architecture, it is
likely that these declarations would not require alteration.
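
The layering of invocations behaves like recursive macro expansion. The following Python sketch is not the LEASMB assembler; it is a hypothetical model that uses the four declarations above plus a few intermediate-level declarations taken from later in this section, and treats any name without a declaration as a leaf.

```python
# Hypothetical model of layered INV declarations; not the LEASMB assembler.
INVOCATIONS = {
    "PC.TO.MA": ["SEL.PC", "ALU.PASS", "LOAD.MA"],
    "CLEAR.AC": ["ALU.ZERO", "LOAD.AC"],
    "ADD.MB.TO.AC": ["SEL.MB", "ALU.PLUS", "LOAD.AC"],
    "SWR.TO.IR": ["SEL.SWR", "ALU.PASS", "LOAD.IR"],
    # Intermediate-level declarations that appear later in the section:
    "SEL.PC": ["DATAMUXCTL=0"],
    "SEL.MB": ["DATAMUXCTL=1"],
    "ALU.PASS": ["ALUCTL=LS181.PASS"],
    "ALU.PLUS": ["ALUCTL=LS181.PLUS"],
    "LOAD.AC": ["ACCTL=LS194.LOAD"],
}


def expand(name):
    """Expand an invocation until only undeclared (leaf) names remain."""
    if name not in INVOCATIONS:
        return [name]
    result = []
    for part in INVOCATIONS[name]:
        result.extend(expand(part))
    return result
```

Changing a low-level entry such as SEL.MB alters what every high-level invocation expands to, while the high-level declarations themselves stay untouched.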
Next, we must expand the specifications for each of the intermediate-level
commands. Eventually, we will end up with detailed assignments of elementary
signals to microinstruction command bits, but we need not hurry this process.
We may describe the data multiplexer selection codes as follows, using the
same ordering of the multiplexer inputs as in the LD20.
* DATA MULTIPLEXER SELECTIONS
SEL.PC     INV DATAMUXCTL=0
SEL.MB     INV DATAMUXCTL=1
SEL.MA     INV DATAMUXCTL=2
SEL.AC     INV DATAMUXCTL=3
SEL.SWR    INV DATAMUXCTL=5
SEL.MEM    INV DATAMUXCTL=6
SEL.INPUT  INV DATAMUXCTL=7
SEL.EA     INV DATAMUXCTL=7, ENABLE.EA

We have described the selection of each input to the multiplexer in terms
of its multiplexer select code. (As in the LD20, we have left data multiplexer
input 4 unused, to allow for expansion. We have routed INPUT and EA, both
controlled by three-state signals, into a single data multiplexer input. EA's
control signal is ENABLE.EA, which will become one of the LD30's
microinstruction command signals; INPUT needs no microinstruction command signal
for its three-state control since that is governed by the PDP-8's input-output
protocol.)
Now we can finally specify the details of DATAMUXCTL: it is a
microinstruction command-bit field of 3 bits, as suggested in Table 10-3. Until we begin
the wiring, the actual locations of the three bits are unimportant, even distracting.
If we later decide to change their position in the microinstruction, the only change
to the microprogram is in the COM statement itself.
A suitable final version of this command statement might be:
DATAMUXCTL COM (6:8),T=%HHH

where each select bit is declared as T = H.


We may unravel the specifications of the ALU operations in a similar way.
Here are specifications for two of the seven ALU operations needed in the LD30:
* ALU OPERATIONS (using the 74LS181)
ALU.PASS INV ALUCTL=LS181.PASS
ALU.PLUS INV ALUCTL=LS181.PLUS

To declare the operation of the 74LS181 ALU chip, we go to the integrated
circuit data book and extract the voltage behavior of the chip. For instance:
* 74LS181 arithmetic logic unit
* Order of signals is S3,S2,S1,S0,M,CIN
LS181.PASS EQU %LLLLLH
LS181.PLUS EQU %HLLHLH

The final specification of the command-bit field ALUCTL is:


ALUCTL COM (9:14),D=LS181.PASS

The field occupies 6 bits. We choose to specify the default assignment so that
whenever we are not specifically controlling the ALU, it will be performing a
PASS operation.
Specifying the control of most of the data registers is simple, since only a
single control signal is involved, but the AC, with its ability to shift and load,
requires two control signals. Here is a representative sample of the declarations:
* SOME DATA REGISTER OPERATIONS
LOAD.MB      INV MBLD
LOAD.PC      INV PCLD

* ACCUMULATOR OPERATIONS (using 74LS194)
LOAD.AC      INV ACCTL=LS194.LOAD
AC.RIGHT     INV ACCTL=LS194.RIGHT
AC.LEFT      INV ACCTL=LS194.LEFT

* 74LS194 SHIFT REGISTER OPERATION
* Order of signals is S1,S0
LS194.HOLD   EQU %LL
LS194.RIGHT  EQU %HL
LS194.LEFT   EQU %LH
LS194.LOAD   EQU %HH

Finally, the specification of the command bits for the MB, PC, and AC
registers is:
MBLD   COM (16),T=%L,D=%F
PCLD   COM (17),T=%L,D=%F
ACCTL  COM (19:20),T=%HH,D=LS194.HOLD
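
To make the role of field positions and defaults concrete, here is a Python sketch that fills in voltage levels for one microinstruction from the COM declarations shown in the text. It is an illustration, not the assembler's actual algorithm. It assumes that the leftmost level in a field goes to the lowest-numbered bit, that an undeclared default for DATAMUXCTL is all low, and that a false value for a T=%L bit is a high voltage. All Python names are hypothetical.

```python
# name: (first_bit, last_bit, default voltage levels)
FIELDS = {
    "DATAMUXCTL": (6, 8, "LLL"),      # no default declared; all-low is an assumption
    "ALUCTL": (9, 14, "LLLLLH"),      # D=LS181.PASS
    "MBLD": (16, 16, "H"),            # T=%L, D=%F: false is a high voltage
    "PCLD": (17, 17, "H"),            # T=%L, D=%F
    "ACCTL": (19, 20, "LL"),          # D=LS194.HOLD
}


def levels(value, width):
    """Encode an integer as voltage levels for T=%H bits: H for 1, L for 0."""
    return format(value, "0{}b".format(width)).replace("1", "H").replace("0", "L")


def assemble(settings):
    """Return {bit_position: 'H' or 'L'}, applying defaults to unset fields."""
    word = {}
    for name, (first, last, default) in FIELDS.items():
        field_levels = settings.get(name, default)
        if len(field_levels) != last - first + 1:
            raise ValueError("wrong width for field " + name)
        for offset, level in enumerate(field_levels):
            word[first + offset] = level
    return word


# ADD.MB.TO.AC expands to SEL.MB, ALU.PLUS, LOAD.AC:
add_mb_to_ac = assemble({
    "DATAMUXCTL": levels(1, 3),       # SEL.MB
    "ALUCTL": "HLLHLH",               # LS181.PLUS
    "ACCTL": "HH",                    # LS194.LOAD
})
```

Moving a field only requires editing its entry in FIELDS, which parallels the remark that relocating command bits changes nothing but the COM statement.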

We are nearly done. We may now tabulate the test input signals needed
to drive the microcode, and prepare the necessary declaration statements. Table
10-5 contains the declarations. Each invocation describes the voltage polarity
of the signal and its position within the test multiplexer.

TABLE 10-5 TEST INPUT DECLARATIONS OF THE LD30

* TEST MULTIPLEXER INPUTS
TM                   EQU TESTMUXCTL
MANUAL.SW            INV TM=2,T=%H
CONT.SW              INV TM=12,T=%L
LDMA.SW              INV TM=11,T=%L
LDMB.SW              INV TM=10,T=%L
LDPC.SW              INV TM=9,T=%L
LDIR.SW              INV TM=8,T=%L
LDAC.SW              INV TM=7,T=%L
CLEAR.SW             INV TM=6,T=%L
EXAMINE.SW           INV TM=5,T=%L
LDMEM.SW             INV TM=4,T=%L
DEPOSIT.SW           INV TM=3,T=%L
SING.INST.SW         INV TM=13,T=%L
IR4                  INV TM=26,T=%H
IR5                  INV TM=27,T=%H
IR6                  INV TM=28,T=%H
IR7                  INV TM=29,T=%H
IR8                  INV TM=30,T=%H
IR9                  INV TM=31,T=%H
IR10                 INV TM=32,T=%H
IR11                 INV TM=33,T=%H
INTERRUPT.REQUEST    INV TM=1,T=%H
INTERRUPT.ENABLE     INV TM=0,T=%H
HALTFF               INV TM=15,T=%L
CC                   INV TM=14,T=%H
NO.MEMORY            INV TM=23,T=%L
DIRECT.ADDRESSING    INV TM=25,T=%L
NO.INDIRECT.OPERAND  INV TM=22,T=%L
AUTO.INDEXING        INV TM=21,T=%L
IOT.ION              INV TM=24,T=%L
IOSKIP               INV TM=18,T=%L
ACCLR                INV TM=17,T=%L
ORAC                 INV TM=16,T=%L
ALU.COUT             INV TM=34,T=%L
MB.IS.ZERO           INV TM=20,T=%L
SKIP                 INV TM=19,T=%H
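
Each declaration in Table 10-5 pairs a test-multiplexer select code with the voltage that represents truth. A minimal Python sketch of how such a declaration could be interpreted follows; it covers only a few of the table's entries, and the function name and the dictionary of pin voltages are hypothetical.

```python
# name: (test multiplexer select code, voltage level that means "true")
TEST_INPUTS = {
    "INTERRUPT.ENABLE": (0, "H"),
    "INTERRUPT.REQUEST": (1, "H"),
    "HALTFF": (15, "L"),
    "IOSKIP": (18, "L"),
    "IR4": (26, "H"),
}


def read_test(name, pin_voltages):
    """Return the logical value of a test input, given voltages by select code."""
    select, true_level = TEST_INPUTS[name]
    return pin_voltages[select] == true_level
```

With this convention the microcode can write IOSKIP=%F or IR4=0 without caring whether the wire is high-active or low-active; the polarity lives in one place, the declaration.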

This concludes our presentation of the development of the LD30's microcode
declarations. The declarations tend to be lengthy, but they contain information
important both for the microprogram assembler and for the reader of the microcode.
Properly specified, the declarations exhibit a gradual evolution of detail, from
the high-level specifications in the microprogram algorithm down to the voltage
behavior of each command line in the microinstruction. Later modification and
maintenance of the design are greatly simplified by the existence of this orderly,
well-documented record. We said earlier that the LEASMB invocation was the
key to top-down design. Now you can see the power of this concept.
Our Design of the LD30 is Complete!

Our treatment of the microprogram for the LD30 is finished. We have shown
representative portions of the algorithm and of the declarations that describe the
terms used in the algorithm. How different is the LD30 machine from the LD20?
The main elements of the architecture are the same in both the hardwired
and microprogrammed designs. Several minor architectural features of the LD20
survive in the LD30, and the LD30 contains two major additions: the test input
multiplexer and the jump-address EPROM used in instruction decoding. Figure
10-23 shows the elements of the LD30's architecture. Not much is left.
The LD30 requires about 70 chips; the LD20, 125. The dramatic change
between the two is, of course, in the implementation of the control algorithm.
In moving from the LD20's hardwired control to the LD30's microprogrammed
control, we eliminate about 55 integrated circuit chips and innumerable wires.
The LD30's control algorithm looks like a program, and it is one. It contains
about 120 microprogram instructions and several hundred lines of declarations.
Comments are often useful in the microprogram, but with well-chosen nomenclature
and systematic top-down development, the microcode is largely self-documenting.
In the senior-level hardware laboratory at Indiana University, our students
study, construct, debug, and extend the LD20 hardwired and the LD30 micro-
programmed versions of the PDP-8. With the aid of the Logic Engine, our
students accomplish this in one semester, using wire-wrap technology. The proof
of their performance is to download and execute actual PDP-8 programs-a
source of immense satisfaction for the students and the instructors.

SUMMING UP
Starting in 1964 with IBM's use of microprogramming in their System/360 digital
computers, manufacturers have increasingly adopted microprogramming methods
for the control of computers. It is fair to say that most computer designs now
have at least some microcode in their control-a fact often unknown to the
programmer, since the microcode is not visible. The conventional programmer
works with the computer's machine or assembly language, or with higher-level
languages, without being aware that there is really another layer of programming-
the microcode-buried in the hardware.
Figure 10-23 Hardware elements in the LD30. (Elements shown: the main data paths
with LINK and AC; HALTFF; interrupt request; interrupt enable; IOP enabler;
JAMIR0; memory control; effective address; AUTO-INDEXING; MB.ZERO; MANUAL.SW
flip-flop; 5 combinational signals; instruction jump address; test input selector.)

The great advantages of microcoding are its uniformity and ease of
modification. Carrying the notion further, we could control all digital tasks with a
single type of controller, such as a Logic Engine. Using microcode, we make
the controller perform a specialized task for each type of device, for instance
executing the PDP-8's instructions in the LD30. With identical copies of the
basic microprogrammable processor, we could control computers, line printers,
card readers, floppy disks, terminals, and other devices having suitable speed
requirements. Each of these would have its own architecture, which would
include the device itself and the logic needed to interface with the microcode.
Each controller would have its own microcode to control the architecture.
This has great potential advantages for both the designer and the maintenance
engineer. The designer develops control algorithms with a uniform style, in a
programming mode, using support systems derived from experience with software.
The maintenance engineer would have to master only one type of control hardware.
Each device's specialized control algorithm could be dealt with as a microprogram,
a form that is generally easier to understand than a hardwired design. Diagnostic
and documentation aids based on software principles can greatly speed both the
development and the maintenance processes.
Unfortunately, although computer manufacturers make extensive use of
proprietary microprogrammable processors, few general-purpose microprogramming
systems such as the Logic Engine are available commercially.
In this chapter, we have discussed some basic principles of microprogramming
and have shown how the concepts may become a powerful tool for design,
requiring sophisticated support systems. We have not attempted to discuss the
wide variety of microprogramming techniques; for this you may consult more
specialized texts, such as those listed in the Readings and Sources.
Microprogramming is a great stride toward bridging the gap between hardware
and software. Let us now move all the way, to the use of small conventional
computers, microcomputers, as digital controllers.

READINGS AND SOURCES

ANDREWS, M., Principles of Firmware Engineering in Microprogram Control. Computer
Science Press, Woodland Hills, Calif., 1980. Good discussions of classical
microprogrammed architectures. Techniques for reducing the width and length of a ROM. Uses
ASM notation for microprogramming.
Bipolar Microprocessor Logic and Interface. Advanced Micro Devices, 901 Thompson
Place, P.O. Box 3453, Sunnyvale, Calif. 94088. Technical data and applications for
the Am2900, Am29100, and Am29300 series.
BLAKESLEE, THOMAS R., Digital Design with Standard MSI and LSI, 2nd ed. John Wiley
& Sons, New York, 1979. Good description of target versus host machines.
GLASSER, LANCE A., and DANIEL W. DOBBERPUHL, The Design and Analysis of VLSI
Circuits. Addison-Wesley Publishing Co., Reading, Mass., 1985.
KLINGMAN, EDWIN E., Microprocessor System Design, Vol. 2. Microcoding, Array Logic,
and Architectural Design. Prentice-Hall, Englewood Cliffs, N.J., 1982. Micro-
programming.
LSI Databook. Monolithic Memories, 2175 Mission College Blvd., Santa Clara, Calif.
95054. PALs, memory products, arithmetic units, system building blocks.
MANO, M. MORRIS, Computer System Architecture, 2nd ed. Prentice-Hall, Englewood
Cliffs, N.J., 1982.
MEAD, CARVER, and LYNN CONWAY, Introduction to VLSI Systems. Addison-Wesley
Publishing Co., Reading, Mass., 1980. The first VLSI textbook.

MICK, JOHN, and JAMES BRICK, Bit-Slice Microprocessor Design. McGraw-Hill Book Co.,
New York, 1980. A collection of design notes for the Advanced Micro Devices 2900
bit-slice family. This book is useful far beyond the AM2900 chips.
MYERS, GLENFORD J., Digital System Design with LSI Bit-Slice Logic. John Wiley &
Sons, New York, 1980.
PROSSER, FRANKLIN, Logic Engine Development System: System Reference Manual. Logic
Design, Laramie, Wyo., 1983.
PROSSER, FRANKLIN, Logic Engine Development System: LEASMB Microprogram Assembler
Reference Manual. Logic Design, Laramie, Wyo., 1983.
PROSSER, FRANKLIN, and ROBERT WEHRMEISTER, C421-C422 Advanced Computer Organization
Laboratory Manual. Computer Science Department, Indiana University, Bloomington,
Ind., 47405, 1985. Laboratory manual to support the construction, debugging, and
study of the LD20 and LD30 implementations of the PDP-81. The laboratory project
uses the Logic Engine Development System, manufactured by Logic Design. Ask the
authors of this book for information.
PROSSER, FRANKLIN, and DAVID WINKEL, "The Logic Engine Development System: Support
for microprogrammed bit-slice development," Proceedings MICRO 16, October 1983,
page 84.
System Design Handbook, 2nd ed. Monolithic Memories, 2175 Mission College Blvd.,
Santa Clara, Calif. 95054, 1985. Section 3 contains a shifter and pipeline architecture
similar to that of the Logic Engine.
WESTE, NEIL, and KAMRAN ESHRAGHIAN, Principles of CMOS VLSI Design: A Systems
Perspective. Addison-Wesley Publishing Co., Reading, Mass., 1985.

EXERCISES

10-1. Describe Wilkes's great contribution to the systematic development of control
algorithms.
10-2. Produce a Wilkes implementation similar to Fig. 10-3 for the ASM in Fig. 5-13.
10-3. Produce a Wilkes implementation similar to Fig. 10-3 for the ASM in Fig. 5-20.
10-4. Explain the meaning of "microprogram."
10-5. How does a ROM implementation of an ASM differ from Wilkes's implementation?
In what ways is the ROM method less flexible?
10-6. In microprogramming terminology, what is a "qualifier"?
10-7. Produce a ROM-based multiple-qualifier microprogrammed implementation of the
ASM in Fig. 5-13.
10-8. Produce a ROM-based multiple-qualifier microprogrammed implementation of the
ASM in Fig. 5-20.
10-9. Sketch a representative part of a ROM-based multiple-qualifier implementation of
the Black Jack Dealer ASM in Fig. 6-32.
10-10. Discuss the advantages and disadvantages of the following methods of ROM-based
microprogramming design. Consider the ease of implementation, generality of
ASM structure, understandability of the result, ease of use, and so on.
(a) Multiple qualifier.

(b) Single qualifier, two addresses.
(c) Single qualifier, one address.
(d) Single qualifier in a Logic Engine environment.
10-11. Produce a single-qualifier, single-address microprogram implementation of an ap-
propriate modification of the ASM in Fig. 5-13. For this task you must alter Fig.
5-13 to accommodate the requirements of the microprogram. State clearly any
assumptions you make in producing the altered ASM.
10-12. Perform Exercise 10-11 for the ASM in Fig. 5-20.
10-13. Design a controller for single-qualifier, two-address microinstructions (after Fig.
10-5). The controller should support up to 256 microinstructions, up to 32 test
inputs, and up to 40 command outputs. Specify the MSI chips that you use.
10-14. Perform Exercise 10-13 for a controller to execute microinstructions of the form
of Fig. 10-8.
10-15. Why does the text's single-qualifier method not permit conditional outputs? Design
a single-qualifier microinstruction interpreter that supports conditional outputs.
10-16. Complete the specification of the microcontroller in Fig. 10-10.
10-17. Characterize microprogrammed design as seen by:
(a) The hardware designer.
(b) The computer programmer.
10-18. Why is the hardwired control described in ASM charts not the best tool for the
design of large, complex systems?
10-19. Propose a Logic Engine implementation of the Black Jack Dealer's control algorithm
of Fig. 10-9. Express the microcode in the style of the Logic Engine's microassembly
language described in the text.
10-20. Although RAM is appropriate for storing microcode in the Logic Engine during
the development and debugging of an algorithm, why might RAM not be the best
choice for a production version of a microprogrammed controller? On the other
hand, what advantages might RAM-based microprogram storage have in a production
version?
10-21. Compare the LD20 and LD30 as to speed, number of states, total design effort,
and anticipated ease of maintenance.
10-22. The text mentions an early microprogram sequencer, the 2909 4-bit sequencer
slice. To achieve a realistic number of microprogram address bits, several of
these chips are to be cascaded together. What connections do you anticipate will
be required between the cascaded 2909 chips?
10-23. Look at Fig. 10-17, which contains the microcode for the Forth Machine design
example. The object code is presented in hexadecimal notation, with 4 bits per
hexadecimal digit. In the object code, 1 represents a high voltage and 0 a low
voltage. One of the bits in the X field of the object code is the 2910's CCEN
input. The Logic Engine implements an unconditional jump by making CCEN
false, thereby forcing a pass operation in the 2910. A conditional jump, which
makes use of the 2910's CC input, requires CCEN to be true.
(a) By inspection of the object code in Fig. 10-17, determine which bit of the X-
field is CCEN.
(b) State whether the 2910's CCEN is high-active or low-active.
10-24. Another of the X-field bits in the Logic Engine microcode of Fig. 10-17 is CCINV,
the assembler's specification that the voltage of the incoming test signal must be
inverted in order that the 2910 properly interpret the signal. The 2910's CC input
is low-active.
(a) By inspection of the object code, determine which bit corresponds to CCINV.
(b) Using the declaration for LD.L and TST.L, determine the correct status of
the CCINV bits (true or false) in the microinstructions at locations 000, 001,
003, and 005.
(c) Determine if CCINV is high-active or low-active.
10-25. From Fig. 10-17, expand the hexadecimal digits for the command bits in the
object microinstructions at locations 001 and 002, so that you can observe the 18
command signals individually. For each of these microinstructions, verify that
the bits in the object code are consistent with the declarations in the source
program.
10-26. In Fig. 10-17, why is the curious term T=%L appended to the declarations for
LD.L and TST.L?
10-27. Complete the LD30 microcode for the following PDP-8 instructions:
(a) AND.
(b) DCA.
(c) JMP.
10-28. Complete the LD30 microcode for the parts of the PDP-8's Operate instruction
not elaborated in the text:
(a) Group 1.
(b) Group 2.
10-29. Exercise 8-52 requires the addition of the PDP-8's Group 3 instructions to the
LD20's hardwired design. Repeat that exercise for the LD30, showing all mod-
ifications or extensions of the architecture and the microcode.
10-30. Consider the PDP-8's IOT instruction. Assume that the IOP signal enabler is
eliminated from the LD30's architecture. Implement the IOT instruction solely
in microcode. Remember to keep each required IOP signal asserted throughout
the testing of the incoming status signals.
10-31. Explain clearly why we chose to implement the decoding of the LD30's instructions
with a mapping ROM and a separate jump table rather than by using the faster
method of having the mapping ROM jump directly to the appropriate microcode.
10-32. From the LD30 microcode given in the text, find ten examples of operations on
the main data path, other than PC.TO.MA, CLEAR.AC, ADD.MB.TO.AC, and
SWR.TO.IR. Write LD30 invocation declarations for your ten operations, following
the examples in the text. Expand any intermediate operations until each element
is finally reduced to command declarations.
10-33. Perform Exercise 8-50 using the LD30's microprogrammed design. Show mod-
ifications or additions to the architecture and microcode.
10-34. Perform Exercise 8-51 using the LD30's microprogrammed design. Show mod-
ifications or additions to the architecture and microcode.
10-35. Perform Exercise 8-52 using the LD30's microprogrammed design. Show mod-
ifications or additions to the architecture and microcode.
10-36. Perform Exercise 8-53 using the LD30's microprogrammed design. Show mod-
ifications or additions to the architecture and microcode.

10-37. Choose a computer, such as the PDP-11, M6809, M68000, or Intel
8080, and write a microprogram to implement the processor's instruction set. Use
the style of the Logic Engine's microassembly language. The suggested processors
are complex, and you may wish to consider a subset of the instructions. The
user's manual for your chosen processor will probably not give much insight into
the internal organization of the processor. As we did with the LD20 and LD30
versions of the PDP-8, start with the instruction set and the obvious registers
referenced by the instructions. You are free to specify additional registers, data
paths, and control signals required to achieve your implementation.
10-38. Propose a Logic Engine implementation of the Black Jack Dealer's control algorithm
of Fig. 10-10. Express the microcode in the style of the Logic Engine's assembly
language described in the text.
