Power Estimation Methods For Sequential Logic Circuits: Pedram, M. Despain
3, SEPTEMBER 1995
Abstract- Recently developed methods for power estimation have primarily focused on combinational logic. We present a framework for the efficient and accurate estimation of average power dissipation in sequential circuits.
Switching activity is the primary cause of power dissipation in CMOS circuits. Accurate switching activity estimation for sequential circuits is considerably more difficult than that for combinational circuits, because the probability of the circuit being in each of its possible states has to be calculated. The Chapman-Kolmogorov equations can be used to compute the exact state probabilities in steady state. However, this method requires the solution of a linear system of equations of size 2^N, where N is the number of flip-flops in the machine.
We describe a comprehensive framework for exact and approximate switching activity estimation in a sequential circuit. The basic computation step is the solution of a nonlinear system of equations which is derived directly from a logic realization of the sequential machine. Increasing the number of variables or the number of equations in the system results in increased accuracy. For a wide variety of examples, we show that the approximation scheme is within 1-3% of the exact method, but is orders of magnitude faster for large circuits. Previous sequential switching activity estimation methods can have significantly greater inaccuracies.

Manuscript received June 15, 1994; revised February 20, 1995 and March 31, 1995. The work of C.-Y. Tsui and A. M. Despain was supported in part by the Advanced Research Projects Agency under Contract J-FBI-91-194. The work of M. Pedram was supported in part by the Advanced Research Projects Agency under Contract F33615-95-C-1627, and by the SRC under Contract 94-DJ-559. The work of J. Monteiro and S. Devadas was supported in part by the Advanced Research Projects Agency under Contract DABT63-94-C-0053, and in part by a NSF Young Investigator Award with matching funds from Mitsubishi Corporation.
C.-Y. Tsui is with the Department of Electrical Engineering, Hong Kong University of Science and Technology, Hong Kong.
M. Pedram and A. Despain are with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089 USA.
J. Monteiro and S. Devadas are with the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139 USA.
B. Lin is with IMEC, Leuven, Belgium.
IEEE Log Number 9413459.
1063-8210/95$04.00 © 1995 IEEE

I. INTRODUCTION

FOR MANY consumer electronic applications low average power dissipation is desirable, and for certain special applications low power dissipation is of critical importance. For applications such as personal communication systems and hand-held mobile telephones, low power dissipation may be the tightest constraint in the design. More generally, with the increasing scale of integration, we believe that power dissipation will assume greater importance, especially in multi-chip modules where heat dissipation is one of the biggest problems.
Power dissipation of a circuit, like its area or speed, may be significantly improved by changing the circuit architecture or the base technology [3]. However, once these architectural or technological improvements have been made, it is the switching of the logic that will ultimately determine the power dissipation.
Methods for the power estimation of logic-level combinational circuits based on switching activity estimation have been presented previously (e.g., [2], [4], [7], [9], [10], [13]). Power and switching activity estimation for sequential circuits is significantly more difficult, because the probability of the circuit being in any of its possible states has to be computed. Given a circuit with N flip-flops, there are 2^N possible states. These state probabilities are, in general, not uniform. As an example, consider the sequential circuit of Fig. 1 and the example State Transition Graph of Fig. 2. Assuming that the circuit was in state R at time 0, and that at each clock cycle random inputs are applied, at time ∞ (i.e., steady state) the probabilities of the circuit being in state R, A, B, C are [...], [...], [...], and [...], respectively. These state probabilities have to be taken into account during switching activity estimation of the combinational logic part of the machine. Power dissipation and switching activity of CMOS combinational logic are modeled by randomly applied vector pairs. In the case of sequential circuits, the vector pair (v1, v2) applied to the combinational logic is composed of a primary input part and a present state part (see Fig. 1), namely (i1@s1, i2@s2). Given i1@s1, the next state s2 is uniquely determined by the functionality of the combinational logic. For example, if i1 happens to be 0 and the machine of Fig. 2 is in state R, the machine will move to state B. This correlation between the applied vector pairs has to be taken into account in order to obtain accurate estimates of the switching activity in sequential circuits.
A first attempt at estimating switching activity in logic-level sequential circuits was presented in [4]. This method can accurately model the correlation between the applied vector pairs, but assumes that the state probabilities are all uniform. Extensions of this method can produce accurate estimates for acyclic sequential circuits such as pipelines, but not for more general cyclic circuits [5].
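The non-uniformity of steady-state probabilities noted in the Introduction can be checked numerically on a small machine. The following Python sketch is only an illustration under stated assumptions: the four-state transition table `NEXT_STATE` is hypothetical (it is not the State Transition Graph of Fig. 2, apart from reusing the state names and the R-to-B move on input 0), the single input bit is assumed to be 1 with probability 0.5, and the steady state is reached by repeatedly applying the one-step transition rule rather than by solving the linear system directly.

```python
# Hypothetical 4-state machine with one primary input bit.
# NEXT_STATE[state][input_bit] -> next state. Illustrative only.
NEXT_STATE = {
    "R": {0: "B", 1: "A"},
    "A": {0: "R", 1: "R"},
    "B": {0: "R", 1: "C"},
    "C": {0: "C", 1: "R"},
}
P_INPUT_ONE = 0.5  # assumed probability that the input bit is 1


def steady_state(next_state, p_one, start="R", tol=1e-12, max_iter=10_000):
    """Iterate the state distribution until it stops changing."""
    states = list(next_state)
    prob = {s: 0.0 for s in states}
    prob[start] = 1.0  # the machine starts in `start` at time 0
    for _ in range(max_iter):
        new = {s: 0.0 for s in states}
        for s, p in prob.items():
            new[next_state[s][0]] += p * (1.0 - p_one)
            new[next_state[s][1]] += p * p_one
        if max(abs(new[s] - prob[s]) for s in states) < tol:
            return new
        prob = new
    raise RuntimeError("state probabilities did not converge")


state_prob = steady_state(NEXT_STATE, P_INPUT_ONE)
for name, p in state_prob.items():
    print(f"prob({name}) = {p:.4f}")
```

For this particular hypothetical table the reset state R ends up twice as likely as each of the other three states, so a uniform assumption of 0.25 per state would misweight every vector pair that starts in R.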
TSUI et al.: POWER ESTIMATION METHODS FOR SEQUENTIAL LOGIC CIRCUITS 40.5
II. PRELIMINARIES

Equation (1) is used by the power estimation techniques such as [4], [9] to relate switching activity to power dissipation.

B. Combinational Circuits
Average power can be estimated for combinational circuits by computing the average switching activity at every gate in the circuit.
It is assumed that we are given transition probabilities at each of the primary inputs to the circuit. That is, for every primary input the probabilities of the primary input staying at 0 (0 → 0), staying at 1 (1 → 1), making a 0 → 1 transition, and making a 1 → 0 transition are given. Given these probabilities, the average switching activity at each gate in the circuit can be calculated.
A symbolic simulation method that performs this computation was given in [4]. Under the chosen gate delay model, the method first constructs a Boolean function representing the logical value at any gate output at each time point ≥ t based on the primary input variables I_0 applied at time 0 and I_t applied at time t. For instance, one may compute the functions f_i(t+1) and f_i(t+2) for a particular gate g_i. The Boolean conditions at the inputs that correspond to a 0 → 1 transition on g_i between times t+1 and t+2 are represented by the function \overline{f_i(t+1)} · f_i(t+2). The probability of a 0 → 1 transition occurring between time t+1 and t+2 given the transition probabilities at the primary inputs is the probability of the Boolean function \overline{f_i(t+1)} · f_i(t+2) evaluating to a 1. (This probability can be evaluated exactly using Binary Decision Diagrams [1] or approximately using Monte Carlo simulation.)
For each gate, probabilities of transitions occurring at any time point can be evaluated efficiently, and these probabilities are summed over all the time points to obtain the average switching activity (at each gate).
Under the zero delay, unit delay, or a general delay model (where delays are obtained from library cells), the symbolic simulation method takes into account the correlation due to reconvergence of input signals and accurately measures switching activity.
The same computation can be performed more efficiently, although not exactly, using probabilistic simulation techniques such as [10] and [13] or Monte-Carlo simulation [2]. In the remainder of this paper, whenever we need to perform the above computation, we will refer to the symbolic simulation equations (which provide the exact solution). It should however be made clear that any other solution technique (probabilistic simulation, Monte-Carlo simulation, etc.) can be used instead.

III. THE EXACT METHOD

A. Modeling Correlation
To model the correlation between the two vectors in a randomly applied vector pair, we have to augment the combinational estimation method described in Section II-B. This augmentation is summarized in Fig. 3.
In Fig. 3, we have a block corresponding to the symbolic simulation equations for the combinational logic of the general sequential circuit shown in Fig. 1. The symbolic simulation equations have two sets of inputs, namely (I_0, I_t) for the primary inputs and (PS, NS) for the present state lines. However, given I_0 and PS, NS is uniquely determined by the functionality of the combinational logic. This is modeled by prepending the next state logic to the symbolic simulation equations.
The configuration of Fig. 3 implies that the gate output switching activity can be determined given the vector pair (I_0, I_t) for the primary inputs, but only PS for the state lines. Therefore, to compute gate output transition probabilities, we require the transition probabilities for the primary input lines, and the static probabilities for the present state lines. This configuration was originally proposed in [4].
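To make the configuration of Fig. 3 concrete, the following Python sketch enumerates every combination of (I_0, I_t, PS) for a toy machine, derives NS from the next state logic, and accumulates the probability of a 0 → 1 and a 1 → 0 transition at one gate under the zero delay model. Everything specific here is an assumption made for illustration: the machine has one primary input and one flip-flop, `next_state` and `gate` are hypothetical functions, and exhaustive enumeration stands in for the symbolic simulation equations.

```python
from itertools import product


def next_state(i, ps):
    """Hypothetical next state logic: NS = I xor PS."""
    return i ^ ps


def gate(i, ps):
    """Hypothetical gate whose output switching is measured: g = I and PS."""
    return i & ps


def transition_probs(pi_trans, p_ps_one):
    """Return (prob of 0->1, prob of 1->0) at the gate output.

    pi_trans[(i0, it)] is the transition probability of the primary input;
    p_ps_one is the static probability of the present state line being 1.
    """
    rise = fall = 0.0
    for i0, it, ps in product((0, 1), repeat=3):
        weight = pi_trans[(i0, it)] * (p_ps_one if ps else 1.0 - p_ps_one)
        ns = next_state(i0, ps)          # NS is fixed once I_0 and PS are known
        before, after = gate(i0, ps), gate(it, ns)
        if (before, after) == (0, 1):
            rise += weight
        elif (before, after) == (1, 0):
            fall += weight
    return rise, fall


# Assumed inputs: all four input transitions equally likely, PS = 1 half the time.
uniform_transitions = {pair: 0.25 for pair in product((0, 1), repeat=2)}
rise, fall = transition_probs(uniform_transitions, 0.5)
print(f"0->1: {rise:.3f}, 1->0: {fall:.3f}, switching activity: {rise + fall:.3f}")
```

Note that only a static probability is supplied for the state line; the second state vector is never chosen independently, which is exactly the correlation the prepended next state logic is meant to capture.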
\sum_{i=1}^{K} prob(s_i) = 1.

This linear set of K equations can be solved to obtain the different prob(s_i)'s.
This system of equations is known as the Chapman-Kolmogorov equations for a discrete-time discrete-transition Markov process. Indeed, if the Markov process satisfies the conditions that it has a finite number of states, its essential states form a single chain, and it contains no periodic states, then the above system of equations will have a unique solution [12].

where the C_m's are cubes of the disjoint cover. Each C_m is a function of the present state lines and primary inputs. We partition the inputs to C_m into two groups: the symbolic state support SS_m, which includes all states s_i that have set the appropriate state bits, and the primary input support I_m, which includes the PI inputs of C_m. Hence C_m = SS_m I_m. The signal probability of n is thus given by:

prob(n) = \sum_{m \in Disjoint-Cover(n)} prob(C_m).    (3)
Assume that the probability of i_1 being a 1 is 0.5, and state probabilities are prob(00) = [...], prob(01) = prob(10) = [...], and prob(11) = [...]. (The first bit corresponds to ps_1 and the second to ps_2.) The probability of the first cube is

prob(i_1 ∧ ps_1) = prob(i_1) × [prob(10) + prob(11)]
                 = 0.5 × ([...] + [...])
                 = 1/4.

Similarly the probability of the second cube is:

[...]

Finally we have:

[...]

state machine controllers, datapaths, as well as pipelines. First, the power dissipation of the circuit was calculated using the exact state probabilities as described in Section III-C. Next, given the exact state probabilities, the line probabilities were determined as described in the previous paragraph. Using the topology of Fig. 3 and the computed present state line probabilities for the PS lines, approximate power dissipations were calculated for each circuit. The average error in the power dissipation measures obtained using the line probability approximation over all the circuits was only 2.8%. The maximum error for any one example was 7.3%. Assuming uniform line probabilities of 0.5 as in [4] results in significant errors of over 40% for some examples.
The above experiment leads us to conclude that if accurate line probabilities can be determined then using line probabilities rather than state probabilities is a viable alternative. We only have to determine N numbers for an N flip-flop machine, one for each present state line, rather than 2^N numbers, one for each possible state.
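The reduction from 2^N state probabilities to N line probabilities can be sketched in a few lines of Python. The state probabilities in `STATE_PROB` are hypothetical values chosen for illustration (they are not the values used in the example above); the first character of each state code is taken to be ps_1 and the second ps_2. The sketch also shows what is lost: treating the lines as independent gives a different joint probability than the state distribution it was derived from.

```python
# Hypothetical steady-state probabilities of a 2-flip-flop machine,
# keyed by the state code "ps1 ps2". Illustrative values only.
STATE_PROB = {"00": 0.4, "01": 0.1, "10": 0.2, "11": 0.3}


def line_probabilities(state_prob):
    """Probability of each present state line being 1 (one number per line)."""
    n_lines = len(next(iter(state_prob)))
    return [
        sum(p for code, p in state_prob.items() if code[j] == "1")
        for j in range(n_lines)
    ]


lines = line_probabilities(STATE_PROB)

# Exact probability of ps1 = ps2 = 1 versus the value implied by
# treating the two lines as independent.
exact_both_high = STATE_PROB["11"]
approx_both_high = lines[0] * lines[1]

print(f"prob(ps1) = {lines[0]:.2f}, prob(ps2) = {lines[1]:.2f}")
print(f"prob(ps1 and ps2): exact {exact_both_high:.2f}, "
      f"line approximation {approx_both_high:.2f}")
```

The gap between the two joint probabilities is the correlation between state lines that the line probability approximation ignores.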
Fig. 5. Calculation of signal and transition probabilities by network unrolling.

B. Inaccuracy in Formulation
correlation between the state lines, when computing the state line probabilities.
The exact present state line probabilities can be obtained by unrolling the next state logic ∞ times (Fig. 4(a)). This is however impractical. We thus approximate the signal probabilities by unrolling the next state logic k times where k is a user defined parameter.
The equations corresponding to k = 2 will be:

ns_1^1 = f_1(i_1^1, ..., i_M^1, ps_1^1, ..., ps_N^1)
       = f_1(i_1^1, ..., i_M^1, ns_1^0, ..., ns_N^0)
       = f_1(i_1^1, ..., i_M^1, f_1(i_1^0, ..., i_M^0, ps_1^0, ..., ps_N^0), ..., f_N(i_1^0, ..., i_M^0, ps_1^0, ..., ps_N^0))
  ...
ns_N^1 = f_N(i_1^1, ..., i_M^1, f_1(i_1^0, ..., i_M^0, ps_1^0, ..., ps_N^0), ..., f_N(i_1^0, ..., i_M^0, ps_1^0, ..., ps_N^0)).

The number of equations is the same. The number of primary input variables has increased, but the probabilities for these variables are known.
Fig. 5(a) shows the method used to calculate signal probability of the internal nodes of the FSM using the k-unrolled network with signal probability feedback.

B. Switching Activity Computation
The topology of Fig. 3 was proposed as a means of taking into account the correlation between the applied input vector pair when computing the transition probabilities. This method takes one cycle of correlation into account.
It is possible to take multiple cycles of correlation into account by prepending the symbolic simulation equations with the k-unrolled network. This is illustrated in Fig. 5(b). Instead of connecting the next state logic network to the symbolic simulation equations, we unroll the next state logic network k times and connect the next state lines of the kth stage of the unrolled network, the next state lines of the (k - 1)th stage, and the primary input of the (k - 1)th stage to the symbolic simulation equations.

VI. IMPROVING ACCURACY USING m-EXPANDED NETWORKS

A. State Line Probability Computation
We describe a different method to improve the accuracy of the basic approximation strategy outlined in Section IV. This method models the correlation between m-tuples of present state lines. The method is pictorially illustrated in Fig. 6 for m = 2.
The number of equations in the case of m = 2 is 3N/2. We have:

ns_{i,i+1}[11] = ns_i ∧ ns_{i+1} = f_i ∧ f_{i+1}
ns_{i,i+1}[10] = ns_i ∧ \overline{ns_{i+1}} = f_i ∧ \overline{f_{i+1}}
ns_{i,i+1}[01] = \overline{ns_i} ∧ ns_{i+1} = \overline{f_i} ∧ f_{i+1}.

We have to solve for prob(ns_{i,i+1}[11]), prob(ns_{i,i+1}[10]), and prob(ns_{i,i+1}[01]) (rather than prob(ns_i) and prob(ns_{i+1}) as in the case of m = 1). We use:

prob(ps_i ∧ ps_{i+1}) = prob(ns_{i,i+1}[11])
prob(ps_i ∧ \overline{ps_{i+1}}) = prob(ns_{i,i+1}[10])
prob(\overline{ps_i} ∧ ps_{i+1}) = prob(ns_{i,i+1}[01])

in the evaluation of the prob(f_i)'s.
The signal probability evaluation methods of Section VII-C can be easily augmented to use the above probabilities. In the case of the OBDD-based method, placing each ps_i and ps_{i+1} pair adjacent in the chosen ordering allows signal probability computation by a linear-time traversal.
The number of equations for m = 3 is 7N/3. When m = N, the number of equations will become 2^N and the method will degenerate to the Chapman-Kolmogorov method.
The choice of the m-tuples of present and next state lines is made by grouping next state lines that have the maximal amount of shared logic into each m-tuple. Note that the accuracy of line probability estimation will depend on the choice of the m-tuples.

B. Switching Activity Computation
To estimate switching activity given m-tuple present state line probabilities, the topology of Fig. 3 is used as before. The difference is that for m = 2 the prob(ps_i ∧ ps_{i+1}), prob(ps_i ∧ \overline{ps_{i+1}}), and prob(\overline{ps_i} ∧ ps_{i+1}) values are used to calculate the switching activities.

VII. SOLVING THE NONLINEAR SYSTEM OF EQUATIONS

We describe two methods to solve the nonlinear system of equations obtained using k-unrolled or m-expanded networks. We will assume that the nonlinear system can be represented as P = G(P) or as Y(P) = 0 as described in Section IV.
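The fixed-point form P = G(P) suggests a simple successive-substitution (Picard-Peano style) iteration, sketched below in Python for a toy machine. The sketch rests on several assumptions that are not taken from the paper: the machine has one primary input with probability 0.5 and two flip-flops, the next state functions `f1` and `f2` are hypothetical, the present state lines are treated as independent (the k = 0, m = 1 case), and signal probabilities are evaluated by exhaustive enumeration rather than by OBDD's.

```python
from itertools import product

P_INPUT = 0.5  # assumed probability of the primary input being 1


def f1(i, ps1, ps2):
    """Hypothetical next state function for line 1."""
    return i | (ps1 & ps2)


def f2(i, ps1, ps2):
    """Hypothetical next state function for line 2."""
    return i & (1 - ps1)


NEXT_STATE_FUNCS = (f1, f2)


def signal_prob(func, p_in, p_lines):
    """prob(func = 1), treating the input and state lines as independent."""
    probs = (p_in, *p_lines)
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        if func(*bits):
            weight = 1.0
            for bit, p in zip(bits, probs):
                weight *= p if bit else 1.0 - p
            total += weight
    return total


def picard_peano(p_in, start, tol=1e-10, max_iter=1000):
    """Iterate P <- G(P) until successive iterates agree to within tol."""
    p = list(start)
    for _ in range(max_iter):
        new = [signal_prob(f, p_in, p) for f in NEXT_STATE_FUNCS]
        if max(abs(a - b) for a, b in zip(new, p)) < tol:
            return new
        p = new
    raise RuntimeError("iteration did not converge (possible oscillation)")


line_prob = picard_peano(P_INPUT, [0.5, 0.5])
print([round(p, 6) for p in line_prob])
```

For this toy machine the iteration settles quickly; in general successive substitution can oscillate, which is why a fallback such as Newton-Raphson is useful.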
g_i = p_j · prob(f_{i, ps_j}) + (1 - p_j) · prob(f_{i, \overline{ps_j}}).

Differentiating with respect to p_j gives:

∂g_i/∂p_j = prob(f_{i, ps_j}) - prob(f_{i, \overline{ps_j}}).
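The cofactor form of the derivative lends itself to a compact Newton-Raphson sketch on Y(P) = P - G(P) = 0. The Python code below is a hedged illustration, not the paper's implementation: the two next state functions are hypothetical, the primary input probability is assumed to be 0.5, cofactor probabilities are obtained by pinning a state line's probability to 1.0 or 0.0 during exhaustive enumeration, and the linear solve uses Cramer's rule, so it only handles two state lines.

```python
from itertools import product

P_INPUT = 0.5  # assumed probability of the primary input being 1

# Hypothetical next state functions of (i, ps1, ps2).
FUNCS = (
    lambda i, ps1, ps2: i | (ps1 & ps2),
    lambda i, ps1, ps2: i & (1 - ps1),
)


def signal_prob(func, p_in, p_lines, fixed=None):
    """prob(func = 1); fixed=(j, v) pins state line j to v (a cofactor)."""
    probs = [p_in, *p_lines]
    if fixed is not None:
        j, value = fixed
        probs[j + 1] = float(value)
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        if func(*bits):
            weight = 1.0
            for bit, p in zip(bits, probs):
                weight *= p if bit else 1.0 - p
            total += weight
    return total


def newton_raphson(p_in, start, tol=1e-12, max_iter=50):
    """Solve Y(P) = P - G(P) = 0 for two state lines."""
    p = list(start)
    for _ in range(max_iter):
        y = [p[i] - signal_prob(FUNCS[i], p_in, p) for i in range(2)]
        if max(abs(v) for v in y) < tol:
            return p
        # dY_i/dp_j = delta_ij - [prob(f_i | ps_j = 1) - prob(f_i | ps_j = 0)]
        jac = [
            [
                (1.0 if i == j else 0.0)
                - (signal_prob(FUNCS[i], p_in, p, (j, 1))
                   - signal_prob(FUNCS[i], p_in, p, (j, 0)))
                for j in range(2)
            ]
            for i in range(2)
        ]
        det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
        # Cramer's rule for jac * delta = -y.
        d0 = (-y[0] * jac[1][1] + y[1] * jac[0][1]) / det
        d1 = (-y[1] * jac[0][0] + y[0] * jac[1][0]) / det
        p = [p[0] + d0, p[1] + d1]
    raise RuntimeError("Newton-Raphson did not converge")


solution = newton_raphson(P_INPUT, [0.5, 0.5])
print([round(v, 6) for v in solution])
```

Because prob(f_i) is linear in each individual p_j, the cofactor difference is the exact partial derivative here, not a finite-difference estimate.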
B. Signal Probability Evaluation
In the nonlinear equation solver, regardless of whether we are using the Picard-Peano method or the Newton-Raphson method, we have to repeatedly evaluate the signal probability of a Boolean function given input probabilities, i.e., compute prob[f_i(i_1, i_2, ..., i_M, ps_1, ps_2, ..., ps_N)] given the prob(i_k)'s and the prob(ps_j)'s.
There exist several methods to evaluate signal probability. An exact method corresponds to using Ordered Binary Decision Diagrams (OBDD's) [1]. If an OBDD can be created for f_i, then prob(f_i) can be evaluated in linear time in the size of the OBDD for f_i. OBDD's can be cofactored in linear time, allowing for the efficient evaluation of the Jacobian entries.
An alternative is to use Monte Carlo simulation. Approximate signal probabilities can be computed using random logic simulation on the multilevel network corresponding to f_i. Our experience has been that the signal probabilities quickly converge to the exact results obtained using OBDD's. In order to evaluate a particular Jacobian entry, the appropriate input to f_i has to be set to 0 (1) and random simulation is performed on the remaining inputs.

VIII. EXPERIMENTAL RESULTS

In this section we present experimental results that illustrate the following points:
• Exact and explicit computation of state probabilities is possible for controller type circuits. However, it is not viable for data path circuits.
• Purely combinational logic estimates result in significant inaccuracies.
• Assuming uniform probabilities for the present state line probabilities and state probabilities as in [4] can result in significant inaccuracies in power estimates.
• Computing the present state line probabilities using the technique presented in the previous sections results in 1) accurate switching activity estimates for all internal nodes in the network implementing the sequential machine; 2) accurate, robust and computationally efficient power estimate for the sequential machine.
In Table I, results are presented for several circuits. In the table, combinational corresponds to the purely combinational estimation method of [4] and uniform-prob corresponds to the sequential estimation method of [4] that assumes uniform state probabilities. The column line-prob corresponds to the technique of Section IV, using the Newton-Raphson method with a convergence criterion of 0.0001% to solve the equations. These equations correspond to k = 0 or m = 1. Finally, state-prob corresponds to the exact state probability calculation method of Section III. The zero delay model was assumed; however, any other delay model could have been used instead.
The first set of circuits corresponds to finite state machine controllers. These circuits typically have the characteristic that the state probabilities are highly nonuniform. Restricting oneself to combinational power dissipation (combinational) or assuming uniform state probabilities (uniform-prob) results in significant errors. However, the line probability method of Section IV produces highly accurate estimates when compared to exact state probability calculation.
accum4    0.084  0  0
accum8    0.086  0  0
accum16   0.096  0  0
count4    0.169  0  0
count8    0.192  0  0
cbp32.4   -      -  -
mult8
s1196

count7    1   0.2   0.2   1   1      1
count8    1   0.2   0.2   1   1      1
cbp32.4   3   0.8   2.4   4   18.5   74
add16     3   0.3   0.9   3   3      9
mult8     2   3.25  6.5   4   9.25   37
s953      30  0.04  1.1   4   0.5    2
s1196     2   1.1   2.2   2   2      4
s1238     2   1.15  2.3   2   2.5    5

iterations, and if oscillation is detected, the Newton-Raphson method is applied. The Newton-Raphson method does not require the domain to be contractive; however, the initial guess has to be “close” to the solution P* in a manner quantified by Theorem 7.3.
In Table V, we present results that indicate the improvement in accuracy in power estimation when k-unrolled or m-expanded networks are used. Results are presented for the finite state machine circuits of Table I for 0 ≤ k ≤ 2 and 1 ≤ m ≤ 4.⁴ The percentage differences in power from the exact power estimate are given. In general, if k → ∞, the error will reduce to 0%; however, increasing k when k is small is not guaranteed to reduce the error in total power estimates (e.g., consider styr). This phenomenon can be explained as follows. The total power estimate is obtained by summing power consumptions of all nodes in the circuit. The individual power estimates may be under- or over-estimated, yet when they are added together, the overall error may become small due to error cancelation. Increasing k improves the accuracy of power estimates for individual nodes (see Table VI), but does not necessarily improve the accuracy of power estimate for the circuit due to the unpredictability of the error cancelation during the summing step. The m-expansion-based method behaves more predictably for this set of examples; however, again no guarantees can be made regarding the improvement in accuracy (of total power estimates) on increasing m, except that when m is set to the number of flip-flops in the machine, the method produces the Chapman-Kolmogorov equations, and therefore the exact state probabilities are obtained. The Newton-Raphson method with a convergence criterion of 0.0001% was used to obtain the line probabilities in Tables V and VI.
⁴The initial error for the dk16 and sreg benchmarks is 0; thus, there is no need to improve the accuracy by using larger values of k and m.
The CPU times for power estimation are in seconds on a SUN SPARC-2. These times can be compared with those listed in Table I under the “Line Prob.” column as those times correspond to k = 0 and m = 1. Based on these results, we conclude that k = 1 and m = 2 provide a good compromise between accuracy and run-time.
During the synthesis process, we often want to know the switching activity of individual nodes instead of a single power consumption figure. Table VI presents the percentage error in
TABLE V
RESULTS OF POWER ESTIMATION BASED ON k-UNROLLED AND m-EXPANDED NETWORKS

can accurately model the correlation between the applied input vector pairs can be used.
Massoud Pedram (M’90-S’9&M’91) received the B.S. degree in electrical engineering from the California Institute of Technology in 1986, and the M.S. and Ph.D. degrees in electrical engineering and computer sciences from the University of California, Berkeley in 1989 and 1991, respectively.
He is an Assistant Professor of Electrical Engineering-Systems at the University of Southern California. His research interests span many aspects of design and synthesis of VLSI circuits, with particular emphasis on layout optimization, logic synthesis and behavioral optimization, layout-driven synthesis, and design for low power.
Dr. Pedram is a recipient of the National Science Foundation’s Research Initiation Award in 1992 and the Young Investigator Award in 1994. His research has received a number of awards including one ICCD Best Paper Award and a Distinguished Paper Citation from ICCAD. He has served on the technical program committee of a number of conferences and workshops, including the Design Automation Conference. He was the co-founder and General Chair of the 1994 International Workshop on Low Power Design, and the General Chair of the 1995 International Symposium on Low Power Design. He has given several tutorials on low power design at major CAD conferences and forums including ICCAD and DAC. He is a member of the ACM.

Srinivas Devadas (S’87-M’88) received the B. Tech degree in electrical engineering from the Indian Institute of Technology, Madras in 1985, and the M.S. and Ph.D. degrees in electrical engineering from the University of California, Berkeley, in 1986 and 1988, respectively.
Since August 1988, he has been at the Massachusetts Institute of Technology, Cambridge, and is currently an Associate Professor of Electrical Engineering and Computer Science. He held the Analog Devices Career Development Chair of Electrical Engineering from 1989 to 1991. His research interests span all aspects of synthesis of VLSI systems.
Dr. Devadas has received five Best Paper Awards at CAD conferences and journals, including the 1990 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN Best Paper Award. In 1992, he received the NSF Young Investigator Award. He has served on the technical program committees of several conferences and workshops including the International Conference on Computer Design, and the International Conference on Computer-Aided Design. He is a member of the ACM.

Alvin M. Despain (S’58-M’65) received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of Utah in 1960, 1962, and 1966, respectively.
He is the Powell Professor of Computer Engineering at the University of Southern California (USC), and a Professor in the Computer Science and Electrical Engineering Systems Departments. He has been an Assistant Research Professor at the University of Utah, an Associate Professor at Utah State University, a Visiting Associate Professor at Stanford University, a Professor at the University of California at Berkeley, and has been at USC since 1989. He is a pioneer in the study of high-performance computer systems for symbolic calculations. His research group builds experimental software and hardware systems including compilers, custom VLSI processors, and multiprocessor systems. Their goal is to determine principles for the design of high-performance computer systems. Despain’s research interests include computer architecture, multiprocessor and multicomputer systems, logic programming, and design automation.

Bill Lin received the B.Sc., M.S., and Ph.D. degrees in electrical engineering and computer sciences from the University of California, Berkeley, in 1985, 1988, and 1991, respectively.
Since graduating from Berkeley, he has been working in the VLSI Systems Design Methodologies division of the Inter-University Micro-Electronics Center (IMEC) in Leuven, Belgium. At IMEC, he is currently heading the System Control and Communications group, which is mainly focusing on system design technology for embedded hardware-software systems. This group also has a major effort in asynchronous design methods and high-speed telecom ATM-based network applications. He has been work-package leader in several E.C. sponsored Esprit projects and Belgian Flemish government sponsored IWT projects. Previously, he has worked at the Hewlett Packard Corporation, the Hughes Aircraft Company, and the Western Digital Corporation. He has authored or co-authored more than 70 scientific publications in the area of CAD methods for VLSI design.
Dr. Lin has served on the program committee of several international conferences. In 1987, he received a Best Paper Award at the 24th Design Automation Conference, Miami, FL. In 1989 and 1990, respectively, he received a Distinguished Paper Citation at the IFIP VLSI conference in Munich, Germany, and at the ICCAD conference in Santa Clara, CA. In 1994, he received a best paper nomination at the ACM Design Automation Conference, San Diego, CA.