Power Simulation and Estimation in VLSI Circuits

Massoud Pedram
University of Southern California
Department of EE-Systems
Los Angeles, CA
pedram@ceng.usc.edu
Abstract
In the past, the major concerns of the VLSI designer were area, speed, and cost; power consideration was typically of secondary importance. In recent years, however, this has begun to change and, increasingly, power is being given comparable weight to other design considerations. Several factors have contributed to this trend, including the remarkable success and growth of the class of battery-powered personal computing devices and wireless communications systems that demand high-speed computation and complex functionality with low power consumption. In these applications, extending the battery service life is a critical design concern. There also exists a significant pressure for producers of high-end products to reduce their power consumption. The main driving factors for lower power dissipation in these products are the cost associated with packaging and cooling as well as the circuit reliability.
System designers have started to respond to the requirement of power-constrained system designs by a combination of architectural improvements and advanced design automation methodologies and techniques for low power. In parallel, researchers and CAD tool developers have introduced a variety of models, algorithms, and techniques for estimating the power dissipation in VLSI circuits and systems in support of the low-power optimization and synthesis techniques.
This article describes representative contributions to the power modeling and estimation of VLSI circuits at various levels of design abstraction.
Introduction
In the past, the major concerns of the VLSI designer were area, performance, cost, and reliability; power considerations were mostly of only secondary importance. In recent years, however, this has begun to change and, increasingly, power is being given comparable weight to area and speed. Several factors have contributed to this trend. Perhaps the primary driving factor has been the remarkable success and growth of the class of personal computing devices (portable desktops, audio- and video-based multimedia products) and wireless communications systems (personal digital assistants and personal communicators), which demand high-speed computation and complex functionality with low power consumption. There also exists a strong pressure for producers of high-end products to reduce their power consumption.
When the target is a low-power application, the search for the optimal solution must include, at each level of abstraction, a design improvement loop. In such a loop, a power analyzer/estimator ranks the various design, synthesis, and optimization options, and thus helps in selecting the one that is potentially more effective from the power standpoint. Obviously, collecting the feedback on the impact of the different choices on a level-by-level basis, instead of just at the very end of the flow (i.e., at the gate level), enables a shorter development time. On the other hand, this paradigm requires the availability of power simulators and estimators, as well as synthesis and optimization tools, that provide accurate and reliable results at various levels of abstraction.
As pointed out in the introduction, the availability of level-by-level power analysis and estimation tools that provide fast and accurate results is key to increasing the effectiveness of automatic design frameworks. We start this section with a concise description of techniques for software-level estimation. We then move to the behavioral level, where we discuss existing power estimation approaches that rely on information-theoretic, complexity-based, and synthesis-based models. Next, we focus our attention on designs described at the RT-level. This is the area where most of the research activity on power modeling and estimation has been concentrated in recent times; we cover two of the most investigated classes of methods, namely, those relying on regression-based models and on sampling-based models. Finally, we move to gate-level power estimation, where we discuss existing dynamic power estimation approaches that rely on statistical sampling and probabilistic compaction, as well as probability-based signal propagation schemes.
This review article may be supplemented with other surveys on the topic.
Software-Level Power Estimation
The first task in the estimation of power consumption of a digital system is to identify the typical application programs that will be executed on the system. A nontrivial application program consumes millions of machine cycles, making it nearly impossible to perform power estimation using the complete program at, say, the RT-level. Most of the reported results are based on power macromodeling, an estimation approach which is extensively used for behavioral and RT-level estimation (see the sections on behavioral-level and RT-level power estimation). In one approach, the power cost of a CPU module is characterized by estimating the average capacitance that would switch when the given CPU module is activated. In another, the switching activities on address, instruction, and data buses are used to estimate the power consumption of the microprocessor. Based on actual current measurements of some processors, Tiwari et al. present the following instruction-level power model:
$$Energy_p = \sum_i (BC_i \times N_i) + \sum_{i,j} (SC_{ij} \times N_{ij}) + \sum_k OC_k$$
where $Energy_p$ is the total energy dissipation of the program, which is divided into three parts. The first part is the summation of the base energy cost of each instruction ($BC_i$ is the base energy cost and $N_i$ is the number of times instruction $i$ is executed). The second part accounts for the circuit state ($SC_{ij}$ is the energy cost when instruction $i$ is followed by $j$ during the program execution, and $N_{ij}$ is the number of times this occurs). Finally, the third part accounts for the energy contribution $OC_k$ of other instruction effects, such as stalls and cache misses, during the program execution.
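As a minimal sketch of how the instruction-level model can be evaluated, the following Python snippet sums the three energy terms (base costs, circuit-state costs for consecutive instruction pairs, and other effects) over an instruction trace. The instruction names, the cost tables, and the `program_energy` helper are hypothetical and chosen only for illustration; they are not measurements from the cited work.

```python
from collections import Counter

def program_energy(trace, base_cost, switch_cost, other_costs=()):
    """Energy_p = sum_i BC_i*N_i + sum_{i,j} SC_ij*N_ij + sum_k OC_k."""
    counts = Counter(trace)                    # N_i: executions of instruction i
    pairs = Counter(zip(trace, trace[1:]))     # N_ij: times i is followed by j
    base = sum(base_cost[i] * n for i, n in counts.items())
    # Pairs without a tabulated circuit-state cost are assumed to cost 0.
    state = sum(switch_cost.get(p, 0.0) * n for p, n in pairs.items())
    return base + state + sum(other_costs)

# Hypothetical costs in arbitrary energy units.
trace = ["add", "mul", "add", "load"]
base_cost = {"add": 1.0, "mul": 3.0, "load": 2.0}
switch_cost = {("add", "mul"): 0.5, ("mul", "add"): 0.5, ("add", "load"): 0.25}
energy = program_energy(trace, base_cost, switch_cost, other_costs=[4.0])
print(energy)
```

Here the base term is 7.0, the circuit-state term is 1.25, and the single "other effect" (for instance one cache miss) contributes 4.0.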
Hsieh et al. present a new approach, called profile-driven program synthesis, to perform RT-level power estimation for high-performance CPUs. Instead of using a macromodeling equation to model the energy dissipation of a microprocessor, the authors use a synthesized program to exercise the microprocessor in such a way that the resulting instruction trace behaves (in terms of performance and power dissipation) much the same as the original trace. The new instruction trace is, however, much shorter than the original one and can hence be simulated on an RT-level description of the target microprocessor to provide the power dissipation results quickly.
Specifically, this approach consists of the following steps:
1. Perform architectural simulation of the target microprocessor under the instruction trace of typical application programs.
2. Extract a characteristic profile, including parameters such as the instruction mix, instruction/data cache miss rates, branch prediction miss rate, pipeline stalls, etc., for the microprocessor.
3. Use mixed integer linear programming and heuristic rules to gradually transform a generic program template into a fully functional program.
4. Perform RT-level simulation of the target microprocessor under the instruction trace of the new synthesized program.
Notice that the performance of the architectural simulator, in gate-vectors/second, is orders of magnitude higher than that of an RT-level simulator.
This approach has been applied to the Intel Pentium processor (a superscalar pipelined CPU with set-associative instruction and data caches, branch prediction, and a dual instruction pipeline), demonstrating orders-of-magnitude reduction in the RT-level simulation time with negligible estimation error.
Behavioral-Level Power Estimation
In contrast to some of the RT-level methods discussed later, estimation techniques at the behavioral level cannot rely on information about the gate-level structure of the design components, and hence must resort to abstract notions of physical capacitance and switching activity to predict power dissipation in the design.
Information-Theoretic Models
Information-theoretic approaches for high-level power estimation depend on information-theoretic measures of activity (for example, entropy) to obtain quick power estimates.
Entropy characterizes the randomness or uncertainty of a sequence of applied vectors and thus is intuitively related to switching activity; that is, if the signal switching is high, it is likely that the bit sequence is random, resulting in high entropy. Suppose the sequence contains $t$ distinct vectors and let $p_i$ denote the occurrence probability of vector $v_i$ in the sequence. Obviously, $\sum_{i=1}^{t} p_i = 1$. The entropy of the sequence is given by
$$h = -\sum_{i=1}^{t} p_i \log p_i$$
where $\log x$ denotes the base-2 logarithm of $x$. The entropy achieves its maximum value of $\log t$ when $p_i = 1/t$. For an $n$-bit vector, $t \le 2^n$. This makes the computation of the exact entropy very expensive. Assuming that the individual bits in the vector are independent, we can write
$$h = -\sum_{i=1}^{n} \left( q_i \log q_i + (1 - q_i) \log (1 - q_i) \right)$$
where $q_i$ denotes the signal probability of bit $i$ in the vector sequence. Note that this equation is only an upper bound on the exact entropy, since the bits may be dependent. This upper-bound expression is, however, the one that is used for power estimation purposes. Furthermore, it has been shown that, under the temporal independence assumption, the average switching activity of a bit is upper-bounded by one half of its entropy.
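The relation between the exact sequence entropy, the bitwise upper bound, and the switching-activity bound can be checked numerically. The sketch below is illustrative only: it assumes base-2 logarithms, a toy vector sequence, and helper names (`binary_entropy`, `sequence_entropy`, `bitwise_entropy_bound`) that are not taken from the source.

```python
import math
from collections import Counter

def binary_entropy(q):
    """Entropy (in bits) of a single bit with signal probability q."""
    return -sum(p * math.log2(p) for p in (q, 1.0 - q) if p > 0.0)

def sequence_entropy(vectors):
    """Exact entropy h = -sum_i p_i log2 p_i over the distinct vectors."""
    n = len(vectors)
    return -sum((c / n) * math.log2(c / n) for c in Counter(vectors).values())

def bitwise_entropy_bound(vectors):
    """Upper bound on h obtained by treating the bits as independent."""
    n = len(vectors)
    width = len(vectors[0])
    return sum(binary_entropy(sum(v[i] for v in vectors) / n)
               for i in range(width))

# Toy sequence with two fully correlated bits (an assumption for illustration).
vectors = [(0, 0), (1, 1), (0, 0), (1, 1)]
exact = sequence_entropy(vectors)
bound = bitwise_entropy_bound(vectors)

# Under temporal independence a bit with signal probability q toggles
# with probability 2q(1-q), which stays below half of its entropy.
q = 0.25
activity = 2.0 * q * (1.0 - q)
half_entropy = 0.5 * binary_entropy(q)
print(exact, bound, activity, half_entropy)
```

Because the two bits always agree, the exact entropy is 1 bit while the independence-based expression gives 2 bits, which shows why the latter is only an upper bound.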
The power dissipation in the circuit can be approximated as
$$Power = 0.5\, V^2 f\, C_{tot} E_{avg}$$
where $C_{tot}$ is the total capacitance of the logic module (including gate and interconnect capacitances) and $E_{avg}$ is the average activity of each line in the circuit, which is in turn approximated by one half of its average entropy, $h_{avg}$. The average line entropy is computed by abstracting information obtained from a gate-level implementation. In one model, it is assumed that the word-level entropy per logic level reduces quadratically from circuit inputs to circuit outputs, whereas in another it is assumed that the bit-level entropy from one logic level to the next decreases in an exponential manner. Based on these assumptions, two different computational models are obtained.
Marculescu et al. derive a closed-form expression for the average line entropy for the case of a linear gate distribution, i.e., when the number of nodes scales linearly between the number of circuit inputs, $n$, and circuit outputs, $m$. The expression for $h_{avg}$ is given by
$$h_{avg} = \frac{2 n h_{in}}{(n + m) \ln \frac{h_{in}}{h_{out}}} \left[ 1 - \frac{m}{n} \frac{h_{out}}{h_{in}} - \frac{\left(1 - \frac{m}{n}\right)\left(1 - \frac{h_{out}}{h_{in}}\right)}{\ln \frac{h_{in}}{h_{out}}} \right]$$
where $h_{in}$ and $h_{out}$ denote the average bit-level entropies of circuit inputs and outputs, respectively. $h_{in}$ is extracted from the given input sequence, whereas $h_{out}$ is calculated from a quick functional simulation of the circuit under the given input sequence or by empirical entropy propagation techniques for precharacterized library modules. Nemani and Najm propose the following expression for $h_{avg}$:
$$h_{avg} = \frac{2}{3 (n + m)} \left( H_{in} + 2 H_{out} \right)$$
where $H_{in}$ and $H_{out}$ denote the average sectional (word-level) entropies of circuit inputs and outputs, respectively. The sectional entropy measures $H_{in}$ and $H_{out}$ may be obtained by monitoring the input and output signal values during a high-level simulation of the circuit. In practice, however, they are approximated as the summation of individual bit-level entropies, $h_{in}$ and $h_{out}$.
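To compare the two average-entropy models on the same circuit, one can evaluate both closed forms side by side. The sketch below assumes the forms $h_{avg} = \frac{2 n h_{in}}{(n+m)\ln(h_{in}/h_{out})}[1 - \frac{m}{n}\frac{h_{out}}{h_{in}} - (1-\frac{m}{n})(1-\frac{h_{out}}{h_{in}})/\ln(h_{in}/h_{out})]$ for the linear gate distribution and $h_{avg} = \frac{2}{3(n+m)}(H_{in} + 2H_{out})$ for the quadratic word-level model; the circuit sizes and entropies are hypothetical, and the word-level entropies are approximated as sums of bit-level entropies.

```python
import math

def h_avg_linear_distribution(n, m, h_in, h_out):
    """Average line entropy for a linear gate distribution.

    Assumes h_in > h_out > 0 so that ln(h_in / h_out) is positive.
    """
    ratio = h_out / h_in
    log_term = math.log(h_in / h_out)
    bracket = (1.0 - (m / n) * ratio
               - (1.0 - m / n) * (1.0 - ratio) / log_term)
    return 2.0 * n * h_in / ((n + m) * log_term) * bracket

def h_avg_quadratic(n, m, H_in, H_out):
    """Average line entropy assuming quadratic decay of word-level entropy."""
    return 2.0 * (H_in + 2.0 * H_out) / (3.0 * (n + m))

# Hypothetical module: 8 inputs, 4 outputs, bit-level entropies 1.0 and 0.5.
n, m, h_in, h_out = 8, 4, 1.0, 0.5
linear = h_avg_linear_distribution(n, m, h_in, h_out)
# Word-level entropies approximated as sums of bit-level entropies.
quadratic = h_avg_quadratic(n, m, n * h_in, m * h_out)
print(linear, quadratic)
```

For this example both estimates fall between the output and input bit-level entropies, as one would expect from an average taken along the logic depth.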

If the circuit structure is given, the total module capacitance is calculated by traversing the circuit netlist and summing up the gate loadings. Wire capacitances are estimated using statistical wire load models. Otherwise, $C_{tot}$ is estimated by quick mapping (for example, mapping to universal gates) or by information-theoretic models that relate the gate complexity of a design to the difference of its input and output entropies. One such model, proposed by Cheng and Agrawal, for example, estimates $C_{tot}$ as
$$C_{tot} = \frac{m}{n}\, 2^n\, h_{out}$$
This estimate tends to be too pessimistic when $n$ is large; hence, Ferrandi et al. present a new total capacitance estimate based on the number $N$ of nodes (i.e., 2-to-1 multiplexors) in the Ordered Binary Decision Diagram (OBDD) representation of the logic circuit, as follows:
$$C_{tot} = \alpha\, \frac{m}{n}\, N\, h_{out} + \beta$$
The coefficients of the model, $\alpha$ and $\beta$, are obtained empirically by doing linear regression analysis on the total capacitance values for a large number of synthesized circuits.
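A short numerical sketch makes the difference between the two capacitance estimates concrete and shows how either one feeds the entropy-based power expression. It assumes the forms $C_{tot} = (m/n)\,2^n h_{out}$ and $C_{tot} = \alpha (m/n) N h_{out} + \beta$, with activity taken as half the average entropy; all numeric values, including the regression coefficients `alpha` and `beta`, are hypothetical.

```python
def c_tot_exponential(n, m, h_out):
    """Entropy-based estimate that grows as 2**n with the input count."""
    return (m / n) * (2 ** n) * h_out

def c_tot_obdd(n, m, num_nodes, h_out, alpha, beta):
    """OBDD-based estimate; alpha and beta come from linear regression."""
    return alpha * (m / n) * num_nodes * h_out + beta

def entropy_power(v, f, c_tot, h_avg):
    """Power = 0.5 * V^2 * f * C_tot * E_avg with E_avg = h_avg / 2."""
    return 0.5 * v ** 2 * f * c_tot * (0.5 * h_avg)

small = c_tot_exponential(4, 2, 0.5)      # modest for few inputs
large = c_tot_exponential(16, 2, 0.5)     # explodes as n grows
obdd = c_tot_obdd(4, 2, 10, 0.5, alpha=2.0, beta=1.0)
power = entropy_power(v=1.0, f=2.0, c_tot=small, h_avg=0.5)
print(small, large, obdd, power)
```

The jump from `small` to `large` illustrates why the exponential estimate becomes pessimistic for wide modules, whereas the OBDD-based estimate scales with the node count of the function's diagram.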
Entropic models for the controller circuitry are proposed by Tyagi, where three entropic lower bounds on the average Hamming distance (bit changes) with state set $S$ and with $T$ states are provided. The tightest lower bound derived in that paper applies to a sparse finite state machine (FSM), i.e., one in which $t$, the total number of transitions with nonzero steady-state probability, is bounded by a term proportional to a power of $T$ divided by $\sqrt{\log T}$. The bound has the form
$$\sum_{s_i, s_j \in S} p_{ij} H(s_i, s_j) \ge h(p_{ij}) + c_1 \log T + c_2 + c_3 \log \log T$$
where $p_{ij}$ is the steady-state transition probability from $s_i$ to $s_j$, $H(s_i, s_j)$ is the Hamming distance between the two states, $h(p_{ij})$ is the entropy of the probability distribution $p_{ij}$, and $c_1$, $c_2$, $c_3$ stand for the signed numeric constants derived in that paper. Notice that the lower bound is valid regardless of the state encoding used.
In a related work, using a Markov chain model for the behavior of the states of the FSM, the authors derive theoretical lower and upper bounds for the average Hamming distance on the state lines, which are valid irrespective of the state encoding used in the final implementation. Experimental results obtained for the MCNC benchmark suite show that these bounds are tighter than previously reported bounds.
Complexity-Based Models
These models relate the circuit power dissipation to some notion of circuit complexity. Example parameters that influence the circuit complexity include the number and the type of arithmetic and/or Boolean operations in the behavioral description, the number of states and/or transitions in a controller description, and the number of cubes (literals) in a minimum sum-of-products (factored-form) expression of a Boolean function.
Most of the proposed complexity-based models rely on the assumption that the complexity of a circuit can be estimated by the number of equivalent gates. This information may be generated on-the-fly using analytical predictor functions, or retrieved from a precharacterized high-level design library. An example of this technique is the chip estimation system, which uses the following expression for the average power dissipation of a logic module:
$$Power = f N \left( Energy_{gate} + 0.5\, V^2 C_{load} \right) E_{gate}$$
where $f$ is the clock frequency, $N$ is the gate equivalent count for the component, $Energy_{gate}$ is the average internal consumption for an equivalent gate (it includes parasitic capacitance contributions as well as short-circuit currents) per logic transition, $C_{load}$ is the average capacitive load for an equivalent gate (it includes fanout load capacitances and interconnect capacitances), and $E_{gate}$ is the average output activity for an equivalent gate per cycle. $C_{load}$ is estimated statistically based on the average fanout count in the circuit and custom wire load models. $E_{gate}$ is dependent on the functionality of the module. The data is precalculated and stored in the library and is independent of the implementation style (static vs. dynamic logic, clocking strategy), library-specific parameters (gate inertia, glitch generation and propagation), and the circuit context in which the module is instantiated. This is an example of an implementation-independent and data-independent power estimation model.
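The gate-equivalent expression is simple enough to evaluate directly. The sketch below assumes the form $Power = f N (Energy_{gate} + 0.5 V^2 C_{load}) E_{gate}$; the parameter values are hypothetical library entries in SI units, not data from the chip estimation system itself.

```python
def gate_equivalent_power(f, n_gates, energy_gate, v, c_load, e_gate):
    """Average power of a module modeled as n_gates equivalent gates."""
    energy_per_transition = energy_gate + 0.5 * v ** 2 * c_load
    return f * n_gates * energy_per_transition * e_gate

# Hypothetical values: 100 MHz clock, 1000 equivalent gates,
# 0.1 pJ internal energy, 2 V supply, 50 fF load, activity 0.25.
p = gate_equivalent_power(f=1e8, n_gates=1000, energy_gate=1e-13,
                          v=2.0, c_load=5e-14, e_gate=0.25)
print(p)  # power in watts
```

Note that nothing in the call depends on the actual input data of the module, which is exactly what makes the model data-independent.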
Nemani and Najm present a high-level estimation model for predicting the area of an optimized single-output Boolean function. The model is based on the assumption that the area complexity of a Boolean function $f$ is related to the distribution of the sizes of the on-set and off-set of the function. For example, using the linear measure, the area complexity of the on-set of $f$ is written as
$$C_1(f) = \sum_{i=1}^{N} c_i p_i$$
where the set of integers $\{c_1, c_2, \ldots, c_N\}$ consists of the distinct sizes of the essential prime implicants of the on-set and weight $p_i$ is the probability of the set of all minterms in the on-set of $f$ which are covered by essential primes of size $c_i$, but not by essential primes of any larger size. The area complexity of the off-set of $f$, $C_0(f)$, is similarly calculated. Hence, the area complexity of function $f$ is estimated as
$$C(f) = \frac{C_1(f) + C_0(f)}{2}$$
The authors next derive a family of regression curves (which happen to have exponential form) relating the actual area $A(f)$ of random logic functions optimized by the SIS program (in terms of the number of gates) to the area complexity measure $C(f)$ for different output probabilities of function $f$. These regression equations are subsequently used for total capacitance estimation and hence high-level power estimation. The work has been extended to area estimation of multiple-output Boolean functions.
A similar technique would rely on predicting the quality of results produced by EDA flows and tools. The predictor function is obtained by performing regression analysis on a large number of circuits synthesized by the tools and relating circuit-specific parameters and/or design constraints to post-synthesis power dissipation results. For example, one may be able to produce the power estimate for an unoptimized Boolean network by extracting certain structural properties of the underlying directed acyclic graph, average complexity of each node, and user-specified constraints, and plugging these values into the predictor function.
Complexity-based power prediction models for controller circuitry have been proposed by Landman and Rabaey. These techniques provide quick estimation of the power dissipation in a control circuit based on the knowledge of its target implementation style (that is, precharged pseudo-NMOS or dynamic PLA), the number of inputs, outputs, states, and so on. The estimates gain a higher degree of accuracy by introducing empirical parameters that are determined by curve fitting and least-squares fit error analysis on real data. For example, the power model for an FSM implemented in standard cells is given by
$$Power = 0.5\, V^2 f \left( N_I C_I E_I + N_O C_O E_O \right) N_M$$
where $N_I$ and $N_O$ denote the number of external input plus state lines and external output plus state lines for the FSM, $C_I$ and $C_O$ are regression coefficients which are empirically derived from low-level simulation of previously designed standard cell controllers, $E_I$ and $E_O$ denote the switching activities on the external input plus state lines and external output plus state lines, and finally $N_M$ denotes the number of minterms in an optimized cover of the FSM. Dependence on $N_M$ indicates that this model requires a partial (perhaps symbolic) implementation of the FSM.

Synthesis-Based Models
One approach for behavioral-level power prediction is to assume some RT-level template and produce estimates based on that assumption. This approach requires the development of a quick synthesis capability which makes some behavioral choices (mimicking a full synthesis program). Important behavioral choices include type of I/O, memory organization, pipelining issues, synchronization scheme, bus architecture, and controller design. This is a difficult problem, especially in the presence of tight timing constraints. Fortunately, designers or the environment often provide hints on what choices should be made. After the RT-level structure is obtained, the power is estimated by using any of the RT-level techniques described in the section on RT-level power estimation.
Relevant data statistics, such as the number of operations of a given type, bus and memory accesses, and I/O operations, are captured by static profiling based on stochastic analysis of the behavioral description and data streams, or by dynamic profiling based on direct simulation of the behavior under a typical input stream. Instruction-level or behavioral simulators are easily adapted to produce this information.
RT-Level Power Estimation
Most RT-level power estimation techniques use regression-based, switched-capacitance models for circuit modules. Such techniques, which are commonly known as power macromodeling, are reviewed next.
Regression-Based Models
A typical RT-level power estimation flow consists of the following steps:
1. Characterize every component in the high-level design library by simulating it under pseudorandom data and fitting a multivariable regression curve (i.e., the power macromodel equation) to the power dissipation results using a least mean square error fit.
2. Extract the variable values for the macromodel equation either from static analysis of the circuit structure and functionality or by performing a behavioral simulation of the circuit. In the latter case, a power co-simulator linked with a standard RT-level simulator can be used to collect input data statistics for various RT-level modules in the design.
3. Evaluate the power macromodel equations for high-level design components which are found in the library by plugging the parameter values into the corresponding macromodel equations.
4. Estimate the power dissipation for random logic or interface circuitry by simulating the gate-level description of these components, or by performing probabilistic power estimation. The low-level simulation can be significantly sped up by the application of statistical sampling techniques or automata-based compaction techniques.
The macromodel for the components may be parameterized in terms of the input bit width, the internal organization/architecture of the component, and the supply voltage level. Notice that there are cases where the construction of the macromodel in the characterization step can be done analytically using the information about the structure of the gate-level description of the modules, without resorting to simulation, as proposed by Benini et al. On the other hand, if the low-level netlist of the library components is not known (which may be the case for soft macros), the characterization step can be replaced by data collection from past designs of the component followed by appropriate process technology scaling. In addition, the macromodel equation may be replaced by a table lookup with the necessary interpolation equations.
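The characterization and evaluation steps of such a flow can be sketched with an ordinary least-squares fit. The example below is a deliberately small illustration: it assumes a one-variable macromodel (power as a linear function of mean input activity), and the characterization data, the `fit_line` helper, and the observed activity value are all hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Characterization: simulated power (mW) of one library component under
# pseudorandom data with different mean input activities (made-up numbers).
activity = [0.1, 0.2, 0.3, 0.4, 0.5]
power_mw = [1.2, 2.2, 3.2, 4.2, 5.2]
a, b = fit_line(activity, power_mw)

# Evaluation: plug in the activity observed during behavioral simulation.
observed_activity = 0.35
estimate = a + b * observed_activity
print(a, b, estimate)
```

A real macromodel would typically use several variables and a multivariable regression, but the division of labor is the same: the fit is done once per library component, and only the cheap evaluation is repeated for each design.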
In the following paragraphs, we review various power macromodel equations which exhibit different levels of accuracy versus computation/information usage tradeoff.
The simplest power macromodel, known as the power factor approximation technique, is a constant-type model which uses an experimentally determined weighting factor to model the average power consumed by a given module per input change. For example, the power dissipation of an $n \times n$ bit integer multiplier can be written as
$$Power = 0.5\, V^2 n^2 C f_{activ}$$
where $V$ is the supply voltage level, $C$ is the capacitive regression coefficient, and $f_{activ}$ is the activation frequency of the module (this should not be confused with the average bit-level switching activity of multiplier inputs).
The weakness of this technique is that it does not account for the data dependency of the power dissipation. For example, if one of the inputs to the multiplier is held at a constant value, we would expect the power dissipation to be less than when both inputs are changing randomly. In contrast, the stochastic power analysis technique proposed by Landman and Rabaey is based on an activity-sensitive macromodel, called the dual bit type model, which maintains that switching activities of high-order bits depend on the temporal correlation of data, whereas lower-order bits behave randomly. The module is thus completely characterized by its capacitance models in the sign and white-noise bit regions. The macromodel equation form is then given by
$$Power = 0.5\, V^2 f \left( n_u C_u E_u + n_s \sum_{xy \in \{++,\, +-,\, -+,\, --\}} C_{xy} E_{xy} \right)$$
where $C_u$ and $E_u$ represent the capacitance coefficient and the mean activity of the unsigned bits of the input sequence, while $C_{xy}$ and $E_{xy}$ denote the capacitance coefficient and the transition probability for the sign change $xy$ in the input stream. $n_u$ and $n_s$ represent the number of unsigned and sign bits in the input patterns, respectively. Note that $E_u$, $E_{xy}$, and the boundary between sign and noise bits are determined based on the applied signal statistics collected from simulation runs. Expanding in this direction, one can use a bitwise data model as follows:
$$Power = 0.5\, V^2 f \sum_{i=1}^{n} C_i E_i$$
where $n$ is the number of inputs for the module in question, $C_i$ is the regression capacitance for input pin $i$, and $E_i$ is the switching activity for the $i$-th pin of the module. This equation can produce more accurate results by including, for example, spatio-temporal correlation coefficients among the circuit inputs. This will, however, significantly increase the number of variables in the macromodel equation, and thus the equation evaluation overhead.
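The data dependency captured by the bitwise data model can be illustrated by extracting per-pin switching activities from two input streams and evaluating $Power = 0.5 V^2 f \sum_i C_i E_i$ for each. This is only a sketch: the streams, the per-pin regression capacitances, and the helper names are hypothetical.

```python
def switching_activities(vectors):
    """E_i: fraction of consecutive cycles in which input pin i toggles."""
    transitions = len(vectors) - 1
    width = len(vectors[0])
    return [sum(a[i] != b[i] for a, b in zip(vectors, vectors[1:])) / transitions
            for i in range(width)]

def bitwise_power(v, f, caps, activities):
    """Bitwise data model: 0.5 * V^2 * f * sum_i C_i * E_i."""
    return 0.5 * v ** 2 * f * sum(c * e for c, e in zip(caps, activities))

caps = [2.0, 1.0]                                # hypothetical C_i per pin
busy = [(0, 0), (1, 1), (0, 0), (1, 1)]          # both pins toggle each cycle
quiet = [(0, 0), (0, 1), (0, 0), (0, 1)]         # pin 0 is held constant

p_busy = bitwise_power(1.0, 1.0, caps, switching_activities(busy))
p_quiet = bitwise_power(1.0, 1.0, caps, switching_activities(quiet))
print(p_busy, p_quiet)
```

A constant-type model would assign the same power to both streams, whereas the bitwise model reports a lower value when one pin never switches.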
Accuracy may be improved (especially for components with deep logic nesting, such as multipliers) by power macromodeling with respect to both the average input and output activities (the input-output data model), that is,
$$Power = 0.5\, V^2 f \left( C_I E_I + C_O E_O \right)$$
where $C_I$ and $C_O$ represent the capacitance coefficients for the mean activities of the input and output bits, respectively. The dual bit type model or the bitwise data model may be combined with the input-output data model to create a more accurate, but more expensive, macromodel form. More recently, a 3D-table power macromodeling technique has been presented which captures the dependence of power dissipation in a combinational logic circuit on the average input signal probability, the average switching activity of the input lines, and the average (zero-delay) switching activity of the output lines. The latter parameter is obtained from a fast functional simulation of the circuit. That paper also presents an automatic macromodel construction procedure based on random sampling principles. Note that the equation form and variables used for every module are the same (although the regression coefficients are different).
A parametric power model is described by Liu and Svensson, where the power dissipation of the various components of a typical processor architecture, including on-chip memory, busses, local and global interconnect lines, H-tree clock net, off-chip drivers, random logic, and data path, is expressed as a function of a set of parameters related to the implementation style and internal architecture of these components. For example, consider a typical on-chip memory (a storage array of transistor memory cells), which consists of four parts: the memory cells, the row decoder, the column selection, and the read/write circuits. The power model for a cell array of $2^{n-k}$ rows and $2^k$ columns in turn consists of expressions for (1) the power consumed by $2^k$ memory cells on a row during one precharge or one evaluation, (2) the power consumed by the row decoder, (3) the power needed for driving the selected row, (4) the power consumed by the column select part, and (5) the power dissipated in the sense amplifier and the readout inverter. For instance, the memory cell power, expression (1) above, is given by
$$Power_{memcell} = 0.5\, V\, V_{swing}\, 2^k \left( C_{int} + 2^{n-k} C_{tr} \right)$$
where $V_{swing}$ is the voltage swing on the bit/$\overline{\text{bit}}$ line (which may be different for read versus write), $C_{int}$ gives the wiring-related row capacitance per memory cell, and $2^{n-k} C_{tr}$ gives the total drain capacitances on the bit/$\overline{\text{bit}}$ line. Notice that during the read time, every memory cell on the selected row drives exactly one of the bit or $\overline{\text{bit}}$ lines.
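The structure of the memory-cell expression, with $2^k$ cells per row and $2^{n-k}$ drain capacitances per bit line, is easy to see in a direct evaluation. The sketch below assumes the form $0.5\, V\, V_{swing}\, 2^k (C_{int} + 2^{n-k} C_{tr})$; the parameter values are hypothetical and in arbitrary units, and since this expression carries no explicit frequency factor, the result is best read as the cost of one precharge or evaluation.

```python
def memcell_power(v, v_swing, n, k, c_int, c_tr):
    """Memory-cell term for an array of 2**(n-k) rows and 2**k columns."""
    columns = 2 ** k                          # cells on the selected row
    bitline_cap = c_int + 2 ** (n - k) * c_tr  # wiring plus drain capacitance
    return 0.5 * v * v_swing * columns * bitline_cap

# Hypothetical 16-word array organized as 4 rows by 4 columns (n=4, k=2).
square = memcell_power(v=2.0, v_swing=0.5, n=4, k=2, c_int=1.0, c_tr=0.5)
# Same capacity organized as 8 rows by 2 columns (n=4, k=1).
tall = memcell_power(v=2.0, v_swing=0.5, n=4, k=1, c_int=1.0, c_tr=0.5)
print(square, tall)
```

Changing $k$ for a fixed $n$ trades the number of active columns against the bit-line loading, which is the kind of architectural parameter the model is meant to expose.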
A salient feature of the above macromodel techniques is that they only provide information about average power consumption over a relatively large number of clock cycles. The above techniques, which are suitable for estimating the average power dissipation, are referred to as cumulative power macromodels. In some applications, however, estimation of average power only is not sufficient. Examples are circuit reliability analysis (maximum current limits, heat dissipation and temperature gradient calculation, latch-up conditions), noise analysis (resistive voltage drop and inductive bounce on power and ground lines), and design optimization (power/ground net topology design, number and placement of decoupling capacitors, buffer insertion, etc.). In these cases, cycle-accurate (pattern-accurate) power estimates are required.
Mehta et al. propose a clustering approach for pattern-accurate power estimation. This approach relies on the assumption that closely related input transitions have similar power dissipation. Hence, each input pattern is first mapped into a cluster, and then a table lookup is performed to obtain the corresponding power estimates from precalculated and stored power characterization data for the cluster. The weakness of this approach is that, for efficiency reasons, the number of clusters has to be relatively small, which would introduce errors into the estimation result. In addition, the assumption that closely related patterns (e.g., patterns with short Hamming distance) result in similar power dissipation may be quite inaccurate, especially when mode-changing bits are involved, i.e., when a bit change may cause a dramatic change in the module behavior.
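The map-then-lookup idea behind the clustering approach can be sketched as follows. The example is an assumption-laden illustration rather than the published procedure: it represents each input transition by its toggle vector, assigns it to the nearest cluster center by Hamming distance, and returns a stored value; the centers, the table entries, and the helper names are hypothetical.

```python
def hamming(a, b):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def toggles(prev, cur):
    """Toggle vector of an input transition: 1 where a bit changes."""
    return tuple(int(p != c) for p, c in zip(prev, cur))

def cluster_power(prev, cur, centers, table):
    """Return the stored power of the cluster nearest to this transition."""
    t = toggles(prev, cur)
    best = min(range(len(centers)), key=lambda i: hamming(t, centers[i]))
    return table[best]

# Hypothetical clusters: "almost nothing toggles" and "almost all toggle".
centers = [(0, 0, 0, 0), (1, 1, 1, 1)]
table = [0.5, 4.0]   # hypothetical pre-characterized power per cluster

high = cluster_power((0, 0, 0, 0), (1, 1, 1, 0), centers, table)
low = cluster_power((0, 1, 0, 1), (0, 1, 0, 0), centers, table)
print(high, low)
```

With only two clusters every transition is forced into one of two power values, which makes the accuracy limitation of a small cluster count easy to see; a single mode-changing bit would also be treated like any other one-bit toggle.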
Addressing these problems, Wu et al. describe an automatic procedure for cycle-accurate macromodel generation based on statistical sampling for the training set design and regression analysis combined with appropriate statistical tests (i.e., the F-test) for macromodel variable selection and coefficient calculation. The test identifies the most (least) power-critical variable to add to (delete from) the set of selected variables. The statistical framework enables prediction of the power value and the confidence level for the predicted power value. This work is extended by Qiu et al. to capture important first-order temporal correlations and spatial correlations of up to order three at the circuit inputs. Note that here the equation form and variables used for each module are unique to that module type. Experimental results are reported for power macromodels with a relatively small number of input variables, in terms of the typical error for the average power and for the cycle power.
The above-mentioned high-level power macromodeling techniques are mostly applicable to the combinational modules (such as adders and multipliers) in the circuit. This is because the power consumption in a combinational circuit module can be fully determined from the statistical characteristics of the vector sequence applied to its primary inputs. As mentioned above, in some cases it is beneficial to use variables related to the vectors which appear on the primary outputs to further improve the accuracy of the power macromodel for combinational modules. For sequential circuit modules, using only variables related to the external inputs and outputs is not enough to model the power consumption accurately. This is because power consumption in sequential circuits strongly depends on the state transitions, which are not visible from outside. To generate an accurate power model for a sequential module, one thus needs to include some variables related to the internal state of the module.
For complex intellectual property (IP) cores, generating a power macromodel is even
more challenging because of the following:
- The IP core may have internal state variables; it is therefore very difficult to
  generate an accurate power model for the IP core if we have access to information
  about its primary inputs and primary outputs only. Therefore, it is desirable to
  construct a power macromodel for the IP core that includes variables related to
  not only its primary inputs/outputs but also its internal state variables.
- At the system level, IP cores are usually specified behaviorally through input/output
  behavioral modeling. Power modeling, however, requires information about the
  internal states of the IP core. Therefore, it is necessary to modify the IP core
  specification at the system level so that the power model and power evaluation
  tool can retrieve the information they need for power estimation and optimization.
- There may be thousands of internal state variables in the IP core. Using all of
  these variables will result in a power model which is too large to store in the
  system library and too expensive to evaluate. Furthermore, requiring information
  about all internal states greatly increases the complexity of the IP core specification
  and puts an undue burden on the IP vendor to reveal details about the IP core which
  can otherwise be hidden from the IP user.
To solve these problems, one needs to develop an IP power model constructor which
will automatically generate power macromodels for IP cores using the minimum number
of internal state variables. This may be achieved by using a statistical relevance testing
procedure such that only the important variables are retained in the power model. In
this way, a power macromodel with a small number of variables, yet sufficient accuracy,
can be constructed.
Sampling-Based Models
RT-level power evaluation can be implemented in the form of a power co-simulator
for standard RT-level simulators. The co-simulator is responsible for collecting input
statistics from the output of the behavioral simulator and producing the power value at
the end. If the co-simulator is invoked by the RT-level simulator every simulation cycle
to collect activity information in the circuit, it is called census macromodeling.
Evaluating the macromodel equation at each cycle during the simulation is actually
a census survey. The overhead of data collection and macromodel evaluation can be
high. To reduce the runtime overhead, Hsieh et al. use simple random sampling to select
a sample and calculate the macromodel equation for the vector pairs in the sample only
[ ]. The sample size is determined before simulation. Sampler macromodeling randomly
selects n cycles and marks those cycles. When the behavioral simulator reaches
a marked cycle, the macromodeling invokes the behavioral simulator for the current
input vectors and previous input vectors for each module. The input statistics are
collected only in these marked cycles. Instead of selecting only one sample of large size,
we can select several samples of at least [ ] units each (to ensure normality of the sample
distribution) before the simulation. Then the average value of the sample means is the
estimate of the population mean. In this manner, the overhead of collecting input
statistics at every cycle, which is required by census macromodeling, is substantially
reduced. Experimental results show that sampler macromodeling results in an average
efficiency improvement of [ ]X over census macromodeling, with an average error of [ ].
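The sampler macromodeling procedure described above can be sketched as follows. This is a minimal illustration, not the implementation of Hsieh et al.: the function names, the toy per-cycle power function, and the choice of 5 samples of 40 units are assumptions made only for the example.

```python
import random
import statistics

def sampler_macromodel(macromodel, num_cycles, num_samples, units_per_sample, rng):
    """Estimate mean power from several simple random samples of cycles.

    macromodel(cycle) stands in for evaluating the macromodel equation on the
    vector pair of that cycle; it is called only for the marked cycles.
    Returns (estimate, number_of_macromodel_evaluations).
    """
    sample_means = []
    evaluations = 0
    for _ in range(num_samples):
        # Mark the cycles of this sample before "simulation" starts.
        marked = rng.sample(range(num_cycles), units_per_sample)
        values = [macromodel(cycle) for cycle in marked]
        evaluations += len(values)
        sample_means.append(statistics.fmean(values))
    # The average of the sample means estimates the population mean.
    return statistics.fmean(sample_means), evaluations

def toy_macromodel(cycle):
    # Hypothetical per-cycle power in mW, for illustration only.
    return 1.0 + 0.5 * ((cycle * 7919) % 100) / 100

estimate, evaluations = sampler_macromodel(
    toy_macromodel, 10_000, 5, 40, random.Random(0))
```

In this sketch the macromodel is evaluated 200 times instead of 10,000 times, which is where the runtime saving over census macromodeling comes from.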
The macromodel equation is developed by using a training set of input vectors. The
training set satisfies certain assumptions, such as being pseudorandom data, speech data,
etc. Hence, the macromodel may become biased, meaning that it produces very good
results for the class of data which behave similarly to the training set; otherwise, it
produces poor results. One way to reduce the gap between the power macromodel equation
and the gate-level power estimation is to use a regression estimator as follows [ ]. It can
be shown that the plot of the gate-level power value versus a well-designed macromodel
equation estimate for many functional units reveals an approximately linear relationship.
Hence, the macromodel equation can be used as a predictor for the gate-level
power value. In other words, the sample variance of the ratio of gate-level power to
macromodel equation power tends to be much smaller than that of the gate-level power
by itself. It is thus more efficient to estimate the mean value of this ratio and then use
a linear regression equation to calculate the mean value of the circuit-level power.
Adaptive macromodeling thus invokes a gate-level simulator on a small number of cycles
to improve the macromodel equation estimation accuracy. In this manner, the bias
of the static macromodels is reduced or even eliminated. Experimental results show
that census macromodeling incurs a large error (an average of [ ] for the benchmark
circuits) compared to gate-level simulation. Adaptive macromodeling, however,
exhibits an average error of only [ ], which demonstrates the superiority of the adaptive
macromodeling technique.
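To make the idea of correcting the macromodel with a few gate-level simulations concrete, the sketch below uses the classical ratio estimator from survey sampling, i.e., a zero-intercept simplification of the regression estimator mentioned above. It is an assumption-laden illustration rather than the cited method: the function name and all data values are hypothetical.

```python
def ratio_estimate(macro_all, gate_sampled):
    """Correct the macromodel mean with a few gate-level simulations.

    macro_all[c]  : macromodel power of cycle c, available for every cycle.
    gate_sampled  : {cycle: gate-level power} for the few simulated cycles.
    """
    macro_mean = sum(macro_all) / len(macro_all)
    # Ratio of gate-level power to macromodel power on the sampled cycles.
    ratio = sum(gate_sampled.values()) / sum(macro_all[c] for c in gate_sampled)
    # Scale the cheap, full-population macromodel mean by that ratio.
    return ratio * macro_mean

macro_all = [1.0, 2.0, 4.0, 3.0, 2.0, 6.0]   # hypothetical macromodel values
gate_sampled = {1: 2.6, 4: 2.6}              # hypothetical gate-level values
corrected = ratio_estimate(macro_all, gate_sampled)
```

Because the ratio varies much less than the gate-level power itself, only a small number of expensive gate-level cycles is needed to remove most of the macromodel bias.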
Gate-Level Power Estimation
In CMOS circuits, power is mostly consumed during logic switching. This simplifies the
power estimation problem to that of calculating the toggle count of each circuit node
under a gate-level delay model. Several gate-level power estimation techniques have
been proposed in the past. These techniques, which offer orders of magnitude speedup
compared to the conventional simulation-based techniques at the circuit level, can be
classified into two classes: dynamic and static. Dynamic techniques explicitly simulate
the circuit under a typical input vector sequence (or input stream). The potential
problem with these approaches is that the estimation results strongly depend on the
input vector sequence, a phenomenon often described as pattern dependence. The static
techniques do not explicitly simulate under any input vector sequence. Instead, they
rely on statistical information (such as the mean activity and correlations) about the
input stream. The pattern dependence problem in these techniques is less severe, as these
techniques either implicitly consider all input vector sequences or perform a smoothing
operation on the input streams to reduce dependence on any given sequence.
Statistical Sampling
Existing techniques for power estimation at the gate and circuit level can be divided
into two classes: static and dynamic. Static techniques rely on probabilistic information
about the input stream (such as the mean activity of the input signals and their
correlations) to estimate the internal switching activity of the circuit. While these are
very efficient, their main limitation is that they cannot accurately capture factors such
as slew rates, glitch generation and propagation, and DC fighting. Dynamic techniques
explicitly simulate the circuit under a typical input stream. They can be applied at
both the circuit and gate level. Their main shortcoming is, however, that they are very
slow. Moreover, their results are highly dependent on the simulated sequence. To alleviate
this dependence and thereby produce a trustworthy power estimate, the required
number of simulated vectors is usually high, which further exacerbates the runtime
problem. An example of direct simulation techniques is [ ].
To address this problem, a Monte Carlo simulation technique was proposed in [ ].
This technique uses an input model based on a Markov process to generate the input
stream for simulation. The simulation is performed in an iterative fashion. In each
iteration, a vector sequence of fixed length (called a sample) is simulated. The simulation
results are monitored to calculate the mean value and variance of the samples. The
iteration terminates when some stopping criterion is met (see Figure [ ]).
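The iterative loop just described can be sketched as below. The stopping criterion shown (a normal-theory confidence interval whose half-width must fall below a fraction of the running mean) is one common choice and is an assumption here, as are the function name, the 5% error bound, the 95% confidence level (z = 1.96), and the minimum of 30 samples.

```python
import math
import random
import statistics

def monte_carlo_power(simulate_sample, max_error=0.05, z=1.96,
                      min_samples=30, max_samples=10_000):
    """Simulate samples until the mean power is known within max_error.

    simulate_sample() stands in for simulating one fixed-length vector
    sequence and returning its measured power.
    Returns (mean_power, number_of_samples_used).
    """
    values = []
    while len(values) < max_samples:
        values.append(simulate_sample())
        if len(values) >= min_samples:
            mean = statistics.fmean(values)
            # Half-width of the confidence interval under a normality assumption.
            half_width = z * statistics.stdev(values) / math.sqrt(len(values))
            if half_width <= max_error * mean:
                break
    return statistics.fmean(values), len(values)

# Hypothetical power source: normally distributed sample power around 10 mW.
rng = random.Random(1)
mean_power, samples_used = monte_carlo_power(lambda: rng.gauss(10.0, 1.0))
```

Rearranging the stopping condition shows that the number of samples needed grows with the ratio of the sample variance to the square of the sample mean.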
This approach suffers from three major shortcomings. First, when the vectors are
regenerated for simulation, the spatial correlations among various inputs cannot be
adequately captured, which may lead to inaccuracy in the power estimates. Second,
the required number of samples, which directly impacts the simulation runtime, is
approximately proportional to the ratio between the sample variance and the square
of the sample mean value. For certain input sequences, this ratio becomes large, thus
significantly increasing the simulation runtime. Finally, there is a general concern
about the normality assumption of the sample distribution. Since the stopping criterion
is based on such an assumption, if the sample distribution deviates significantly from the
normal distribution (which may happen if the number of units per sample is small or the
population distribution is ill-behaved), then the simulation may terminate prematurely.
Ill-behaved population distributions that may cause premature termination include
bimodal, multimodal, and distributions with long or asymmetric tails.
The first concern can be addressed by developing a sampling procedure which randomly
draws the units in the sample from the input population (instead of building an
approximate Markov model of the population and then generating samples which conform
to this approximate model). Such techniques are known as simple random sampling
[ ]. They may or may not be combined with the Monte Carlo simulation loop.
Figure [ ]: Monte Carlo Simulation Loop
The second (and, to some extent, the third) concern can be addressed by developing
more efficient sampling procedures. One such procedure, called stratified random
sampling, is introduced in [ ]. The key idea is to partition the population into disjoint
subpopulations (called strata) in such a way that the power consumption characteristics
within each stratum become more homogeneous. The samples are then taken randomly
from each stratum. The units in each sample are allocated in proportion to the sizes of
the strata. This generally results in a significant reduction in the sampling variance (see
Figures [ ] and [ ]).
The stratification itself is based on a low-cost predictor (e.g., zero-delay power
estimation) which needs to be evaluated for every unit in the population. The zero-delay
power estimates need not be highly accurate on a unit-by-unit basis; indeed, they
should only show high statistical correlation with circuit-level power estimates (see, for
example, the scatter plot for the ISCAS benchmark C[ ] shown in Figure [ ]). When
the population size is large, one can use a two-stage stratified sampling procedure to
reduce the overhead of predictor calculation and stratification. The proposed two-stage
stratified sampling technique can be easily extended to multi-stage sampling.
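A minimal sketch of single-stage stratified random sampling with proportional allocation follows. It assumes strata of equal size formed by ranking the units with the low-cost predictor; the function name, the toy population, and the toy predictor and power functions are hypothetical and serve only to illustrate the idea.

```python
import random
import statistics

def stratified_estimate(units, predictor, measure, num_strata, sample_size, rng):
    """Estimate the mean of measure(u) over all units by stratified sampling.

    predictor(u): cheap estimate (e.g. zero-delay power), evaluated for every unit.
    measure(u)  : expensive measurement (e.g. circuit-level power), evaluated
                  only for the sampled units. Assumes len(units) >= num_strata.
    """
    ranked = sorted(units, key=predictor)
    n = len(ranked)
    estimate = 0.0
    for k in range(num_strata):
        # Units with similar predictor values end up in the same stratum.
        stratum = ranked[k * n // num_strata:(k + 1) * n // num_strata]
        weight = len(stratum) / n
        # Proportional allocation: the stratum's quota follows its size.
        quota = min(len(stratum), max(1, round(sample_size * weight)))
        chosen = rng.sample(stratum, quota)
        estimate += weight * statistics.fmean(measure(u) for u in chosen)
    return estimate

units = list(range(1000))                    # hypothetical population of units
estimate = stratified_estimate(
    units,
    predictor=lambda u: u,                   # toy stand-in for zero-delay power
    measure=lambda u: 2.0 * u + 1.0,         # toy stand-in for circuit-level power
    num_strata=10, sample_size=50, rng=random.Random(0))
```

The predictor only has to rank the units well; its absolute values never enter the estimate, which is why high correlation matters more than unit-by-unit accuracy.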
Compared to [ ], the technique of [ ] offers the following advantages: (1) It performs
sampling directly on the population, and thus the estimation results are unbiased. (2)
Experimental results on a large set of the ISCAS benchmarks under non-random input
sequences show a [ ]X improvement in the sampling efficiency. (3) Difficult population
distributions can be handled more effectively.
Figure [ ]: Sample Distribution for Simple Random Sampling (x-axis: Estimated Power (mW); y-axis: Distribution)
Figure [ ]: Sample Distribution for Stratified Random Sampling (x-axis: Estimated Power (mW); y-axis: Distribution)
Figure [ ]: Scatter Plot of Circuit-Level Power versus Zero-Delay Power
Another efficient sampling procedure, called linear regression estimation, is introduced
in [ ]. The key observation is that the sample variance of the ratio of circuit-level
power to zero-delay power tends to be much smaller than that of the circuit-level
power by itself. This can be easily seen by examining the scatter plots of circuit-level
power versus zero-delay power for a large number of circuits under a variety of input
populations.
It is thus more efficient to estimate the mean value of this ratio and then use a
regression equation to calculate the mean value of the circuit-level power. Experimental
results again show a [ ]X improvement in the sampling efficiency compared to simple
random sampling.
So far, we have only considered examples of parametric sampling techniques, i.e.,
those whose stopping criterion is derived by assuming normality of the sample
distribution. A nonparametric sampling technique (i.e., one whose stopping criterion is
derived without any a priori assumption about the sample distribution) is presented in
[ ]. In general, nonparametric techniques do not suffer from the premature termination
problem; however, they tend to be too conservative and lead to oversampling
to achieve the specified error and confidence levels. The technique of [ ], which uses
the properties of order statistics, tries to reach a good tradeoff between the estimation
accuracy and the computational efficiency. More research is needed to assess the
efficiency and robustness of nonparametric versus parametric sampling techniques.
For synchronous sequential circuits (e.g., finite state machines driven by a common
clock), if we know the state transition graph of the FSM, we can solve the Chapman-Kolmogorov
equations for the stationary state probabilities (cf. Section [ ]). Next, based
on these probabilities, we can randomly generate a present-state vector which, along with
a randomly generated external input vector, determines the next-state vector. With a
second random external input vector (generated according to the statistics of the external
input sequence), a unit of the power sample can be constructed. The subsequent units
are generated by randomly setting the state line vectors, followed by random generation
of two external input vectors and power measurement. Unfortunately, the number of
Chapman-Kolmogorov equations is exponential in the number of flip-flops, and hence
this approach is not feasible for large FSMs.
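For a small FSM, the stationary state probabilities can be obtained as sketched below. The sketch uses simple power iteration on an explicit state transition matrix, which is an illustrative choice and an assumption of this example; the function name and the 2-state matrix are hypothetical.

```python
def stationary_distribution(transition, tol=1e-12, max_iter=100_000):
    """Return pi with pi = pi * transition for a row-stochastic matrix.

    transition[i][j] is the probability of moving from state i to state j.
    Assumes the chain is ergodic so that the iteration converges.
    """
    n = len(transition)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical 2-state FSM (a single flip-flop) with its transition probabilities.
pi = stationary_distribution([[0.9, 0.1],
                              [0.5, 0.5]])
```

The matrix has one row per state, so a circuit with k flip-flops needs up to 2^k rows; this is the exponential growth that rules the approach out for large FSMs.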
Existing statistical techniques for the estimation of the mean power in synchronous
sequential circuits (e.g., finite state machines driven by a common clock) are classified
into two groups: hierarchical and flat. The hierarchical technique performs behavioral or
RT-level simulation of the target circuit for all units (i.e., external input patterns) in the
user-specified population and collects the state line values corresponding to each unit.
Next, it treats the FSM as a combinational circuit (i.e., it cuts the sequential feedback
lines) which receives an extended input sequence and which needs to be simulated at
the circuit level for accurate power estimation; the new sequence is the concatenation
of the external input line and the state line values. The FSM power estimation problem
is thereby transformed into a sampling problem for the resulting combinational circuit
(assuming that all the flip-flops are edge-triggered). The shortcomings of this technique
are that it requires RT-level simulation of the target FSM for all units of the population
and that the resulting state line values must be stored for sampling in the next phase.
The first problem is not critical, since RT-level simulation is orders of magnitude
faster than circuit-level simulation, and hence we can afford simulating the whole
vector sequence at the RT level. The second problem may be of concern if the length of
the input sequence is large and the computer memory is limited.
The hierarchical technique takes a somewhat more complicated form when the input
population is infinite; in this case, the signal and transition probabilities of the state lines
must be modeled by a Markov process which is, in turn, derived from a Markov model
representation of the external input sequence. This problem has been tackled in [ ],
where the following two-phase procedure is proposed: (1) Use sequence generation based
on the given Markov model of each bit of the external input sequence and do Monte
Carlo simulation to compute the signal probabilities and transition probabilities of each
state line, and hence construct the corresponding Markov model for each state line. (2)
Cut the feedback lines in the FSM and do Monte Carlo simulation on the combinational
part of the FSM using the bit-level Markov models of the external input and state lines.
A major disadvantage of this technique is that, in the second step, while estimating power
in the combinational circuit, spatial correlations between the signal lines are ignored.
This introduces large errors in the power estimates.
The flat technique [ ] consists of two phases: warm-up period and random
sampling. The purpose of the warm-up period is to perform a number of input vector
simulations to achieve steady-state probability conditions on the state lines. Only then
is power sampling performed. This is because, for meaningful power sampling, the state
vectors fed into the circuit have to be generated according to their stationary state
probability distribution in the FSM. The problems of how to choose an initial state
for each sample and how long the warm-up length should be are discussed in [ ]. A
randomness test to dynamically decide the proper warm-up length is proposed in [ ].
The test is applied to the sequence of power values observed during the warm-up period
(a binning technique is used to transform the sequence of power values into a binary
sequence so that the test can be applied). A major concern with this technique is that
the randomness test should actually be applied to the sequence of state line patterns
rather than to the sequence of power values. The former task appears to be difficult.
In practice, the warm-up period requires a large number of simulated vectors (depending
on the FSM behavior and characteristics of the external input sequence). This
makes the efficiency of power estimation for sequential circuits much lower than that for
combinational circuits.
In addition to estimating the mean value of the power dissipation in a circuit, the theory
of order statistics and stratified sampling techniques have been used to estimate the
maximum power dissipation [ ] and the cumulative distribution function for the power
dissipation [ ]. This information is very useful to chip designers who are interested in
reliability analysis and AC/DC noise analysis.
Probabilistic Compaction
Another approach for reducing the power simulation time is to compact the given long
stream of bit vectors using probabilistic automata [ ]. The idea in [ ] is to build
a stochastic state machine (SSM) which captures the relevant statistical properties of
a given long bit stream, and then excite this machine by a small number of random
inputs so that the output sequence of the machine is statistically equivalent to the initial
one. The relevant statistical properties denote, for example, the signal and transition
probabilities and first-order spatiotemporal correlations among bits and across consecutive
time frames. The procedure then consists of decomposing the SSM into a set of
deterministic state machines and realizing it through SSM synthesis with some auxiliary
inputs. The compacted sequence is generated by uniformly random excitement
of such inputs. As an example, consider the input sequence shown in Figure [ ]; the
corresponding SSM model is shown in Figure [ ], and the compacted sequence is shown
in Figure [ ].
Figure [ ]: Initial Sequence
The number of states in the probabilistic automaton is proportional to the number
of distinct patterns in the initial vector sequence. Since this number may be large
(i.e., worst-case exponential in the number of bits in each vector), one has to manage
the complexity by either: (1) Partitioning the n bits into b groups with a maximum
size of k bits per group and then building a probabilistic automaton for each group of
bits independently; (2) Partitioning the long vector sequence into consecutive blocks of
vectors such that the number of distinct vectors in each block does not exceed some
user-defined parameter, say K. The shortcoming of the first approach is that one
may generate a bit pattern (vector) that was not present in the initial sequence (since
correlations across different groups of bits are ignored). This is, in turn, a problem in
certain applications with forbidden input patterns (codes not used, illegal instructions,
bad memory addresses, etc.). Thus, the second approach is often more desirable.
Figure [ ]: SSM Model for the Sequence
Figure [ ]: Compacted Sequence
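The second partitioning strategy (consecutive blocks with at most K distinct vectors each) can be sketched with a single greedy pass. The greedy rule and the function name are assumptions of this illustration; the source does not say how the block boundaries are chosen.

```python
def partition_blocks(vectors, max_distinct):
    """Split a vector sequence into consecutive blocks.

    Each block contains at most max_distinct (K >= 1) distinct vectors,
    so the automaton built for a block has a bounded number of states.
    """
    blocks, current, distinct = [], [], set()
    for vector in vectors:
        # Start a new block when this vector would be distinct pattern K+1.
        if vector not in distinct and len(distinct) == max_distinct:
            blocks.append(current)
            current, distinct = [], set()
        current.append(vector)
        distinct.add(vector)
    if current:
        blocks.append(current)
    return blocks

# Hypothetical 2-bit input sequence, written as bit strings.
sequence = ["00", "01", "00", "10", "11", "11", "00"]
blocks = partition_blocks(sequence, max_distinct=2)
```

Since every block contains only vectors that actually occur in the original sequence, an automaton built per block cannot emit a forbidden pattern, unlike the bit-grouping strategy.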
An improved algorithm for vector compaction is presented in [ ]. The foundation
of this approach is also probabilistic in nature. It relies on adaptive (dynamic) modeling
of binary input streams as first-order Markov sources of information and is applicable to
both combinational and sequential circuits. The adaptive modeling technique itself (best
known as dynamic Markov chain modeling) was recently introduced in the literature on
data compression [ ] as a candidate to solve various data compression problems. This
original formulation is extended in [ ] to manage not only correlations among adjacent
bits that belong to the same input vector, but also correlations between successive
patterns. The model completely captures spatial correlations and first-order temporal
correlations and, conceptually, it has no inherent limitation to being further extended to
capture temporal dependencies of higher orders.
A hierarchical technique for compacting large sequences of input vectors is presented
in [ ]. The distinctive feature of this approach is that it introduces hierarchical Markov
chain modeling as a flexible framework for capturing not only complex spatiotemporal
correlations, but also dynamic changes in the sequence characteristics, such as different
input modes. The hierarchical Markov model is used to structure the input space into
a hierarchy of macro- and micro-states. At the first level in the hierarchy, there is a
Markov chain of macro-states describing the input modes, whereas at the second level
there is a Markov chain of micro-states describing the internal behavior of each input
mode. The primary motivation in doing this structuring is to enable a better modeling
of the different stochastic levels that are present in sequences that arise in practice.
Another important property of such models is to individualize different operating
modes of a circuit or system via higher levels in the hierarchical Markov model, thus
providing a high adaptability to different system operating modes (e.g., active, standby,
sleep, etc.). The structure of the model itself is general and, with further extensions, can
allow an arbitrary number of activations of its sub-models. Once we have constructed the
hierarchy for the input sequence, starting with a macro-state, a compaction procedure
with a specified compaction ratio is applied to compact the set of micro-states within
that macro-state. When done with processing the current macro-state, the control
returns to the higher level in the hierarchy and, based on the conditional probabilities
that characterize the Markov chain at this level, a new macro-state is entered and the
process repeats.
This sequence compaction approach has been extended in [ ] to handle FSMs.
More precisely, it is shown that: (1) Under the stationarity and ergodicity assumptions,
complete capture of the characteristics of the external input sequence feeding into a
target FSM is sufficient to correctly characterize the joint (external inputs plus internal
states) transition probabilities. (2) If the external input sequence has order k, then a
lag-k Markov chain model of this sequence will suffice to exactly model the joint transition
probabilities in the target FSM. (3) If the input sequence has order two or higher, then
modeling it as a lag-one Markov chain cannot exactly preserve even the first-order joint
transition probabilities in the target FSM. The key problem is thus to determine the
order of the Markov source which models the external input sequence. In [ ], based on
the notion of block (metric) entropy, a technique for identifying the order of a composite
source of information (that is, a source which emits sequences that can be piecewise
modeled by Markov chains of different orders) is introduced.
Results of these approaches show [ ] orders of magnitude compaction (depending
on the initial length and characteristics of the input sequence) with negligible error (i.e.,
[ ] in most cases) using PowerMill as the simulator. As a peculiar property, note that
none of these approaches needs the actual circuit to compact the input sequences.
Probabilistic Simulation
Power dissipation in CMOS circuits comes from three sources: (1) the leakage current,
(2) the short-circuit current, which is due to the DC path between the supply rails during
output transitions, and (3) the charging and discharging of capacitive loads during logic
changes. In a well-designed circuit, the contribution of the first two sources is relatively
small. Therefore, power dissipation at the gate level can be accurately approximated by
only considering the charging and discharging of capacitive loads, as given by:
$$Power_{gate} = \frac{1}{2 T_c} V_{dd}^{2} \sum_{n} C_n \, sw_n$$
where $T_c$ is the clock period, $V_{dd}$ is the supply voltage, the summation is performed
over all gates (nodes) in the circuit, $C_n$ is the load capacitance of node n, and $sw_n$ is the
average switching activity of node n (that is, the expected number of signal changes
per clock cycle).
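As a quick numeric illustration, the sketch below evaluates the gate-level power equation in the form Power = (1 / (2 T_c)) V_dd^2 * sum over n of C_n * sw_n. That form, including the factor 1/2, is the usual one for switching power and is assumed here; the function name and all component values are hypothetical.

```python
def gate_level_power(nodes, vdd, clock_period):
    """Switching power in watts.

    nodes: iterable of (load_capacitance_in_farads, switching_activity) pairs,
           where switching_activity is the expected number of signal changes
           per clock cycle.
    """
    switched_capacitance = sum(c * sw for c, sw in nodes)
    return 0.5 * vdd ** 2 * switched_capacitance / clock_period

# Hypothetical two-node circuit: 10 fF toggling every other cycle on average,
# 20 fF toggling every fourth cycle, 3.3 V supply, 10 ns clock period.
power = gate_level_power([(10e-15, 0.5), (20e-15, 0.25)],
                         vdd=3.3, clock_period=10e-9)
```

With the load capacitances and supply voltage fixed by the design, the estimation problem reduces to finding the switching activity of every node.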
In the above equation, $sw_n$ is related to the timing relationship (or delay) among the
input signals of each gate. Indeed, the output signal of a node may change more than
once due to unequal signal arrival times at its inputs. The power consumed by these
extra signal changes is generally referred to as the glitch power. If the contribution of the
glitch power is relatively small, one can approximate $sw_n$ by using the zero-delay model,
i.e., by assuming that all logic changes propagate through the circuit instantaneously, and
hence each node changes logic value at most once. The ratio of the power estimates
obtained under a real-delay and the zero-delay models reflects the significance of glitch
power. A typical value for this ratio is [ ]. For some types of circuits, such as parity
chains, adders, and multipliers, this ratio could be well above [ ]. For this type of circuit,
the delay model becomes very important; that is, the delay characterization of each gate
must be very accurate. Moreover, CMOS gates have an inertial delay. Only glitches with
adequate strength (that is, glitch width) can overcome the gate inertia and propagate
through the gate to its output. Accurate modeling of these effects requires significantly
longer execution time.
A direct simulative approach to power estimation would enumerate all pairs of input
vectors and compute the number of logic changes per vector pair. The average
switching activity is then the sum, over all possible vector pairs, of the product of the
occurrence probability and the number of logic changes for each vector pair. This
procedure will obviously be exponential in complexity. In the following, we review some
important definitions that will help manage this complexity.
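The direct enumeration just described can be sketched for a single node under the zero-delay model. The node function g = (a AND b) OR c and the uniform, independent vector-pair probability are hypothetical choices made for the example, as is the function name.

```python
from itertools import product

def average_switching_activity(node_fn, num_inputs, pair_probability):
    """Sum over all vector pairs of probability times number of logic changes.

    Under the zero-delay model a node changes at most once per vector pair,
    so the number of changes is 1 if its value differs and 0 otherwise.
    """
    vectors = list(product((0, 1), repeat=num_inputs))
    return sum(
        pair_probability(v1, v2) * (node_fn(*v1) != node_fn(*v2))
        for v1 in vectors
        for v2 in vectors
    )

def g(a, b, c):
    # Hypothetical node function, for illustration only.
    return (a & b) | c

# All 8 x 8 vector pairs equally likely (temporally and spatially independent inputs).
sw_g = average_switching_activity(g, 3, lambda v1, v2: 1.0 / 64.0)
```

The loop visits 4^k vector pairs for k inputs, which makes the exponential cost of direct enumeration explicit.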
- Signal probability: The signal probability $P_s(n)$ of a node n is defined as the average
  fraction of clock cycles in which the steady-state value of n is logic one under the
  zero-delay model.
- Transition probability: The transition probability $P_t^{ij}(n)$ of a node n is defined as
  the average fraction of clock cycles in which the steady-state value of n undergoes a
  change from logic value i to logic value j under the zero-delay model. Note that
  $P_t^{01}(n) = P_t^{10}(n)$ and $P_t^{01}(n) + P_t^{11}(n) = P_s(n)$.
- Temporal independence: Under the temporal independence assumption, the signal
  value of an input node n at clock cycle i is independent of its signal value at
  clock cycle i - 1.
- Spatial independence: Under the spatial independence assumption, the logic value of
  an input node n is independent of the logic value of any other input node m.
- Spatiotemporal correlation: The spatial and temporal correlations of the input signals
  are collectively referred to as spatiotemporal correlation.
- Full signal swing: All logic transitions are from zero to one and vice versa (that
  is, no intermediate voltages are present).
Figure [ ]: An OBDD example. (a) An example circuit; (b) the OBDD representation for g
The concepts of signal and transition probabilities are very useful for power estimation
under the zero-delay model. Under the real-delay model, the signal and transition
probabilities are mainly used to describe the input vector statistics.
Under the zero-delay model, $P_t^{01}(n) + P_t^{10}(n)$ can replace the term $sw_n$ in the power
calculation equation to give the power consumption for each node. With the temporal
independence assumption, $P_t^{01}(n)$ can be written as:
$$P_t^{01}(n) = (1 - P_s(n)) \, P_s(n)$$
Therefore, the simulation can be performed using one vector to obtain the signal
probabilities, instead of using two vectors to explicitly calculate the transition probabilities.
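As a short worked step, the relation above can be turned into a closed form for the zero-delay switching activity. The sketch assumes temporal independence and a stationary signal, and it reads the expression as $P_t^{01}(n) = (1 - P_s(n)) P_s(n)$; the index notation $n_{i-1}$, $n_i$ for the value of node n in consecutive clock cycles is introduced here only for the derivation.

```latex
\begin{align*}
P_t^{01}(n) &= \Pr[n_{i-1} = 0] \, \Pr[n_i = 1] = (1 - P_s(n)) \, P_s(n), \\
P_t^{10}(n) &= \Pr[n_{i-1} = 1] \, \Pr[n_i = 0] = P_s(n) \, (1 - P_s(n)), \\
sw_n &= P_t^{01}(n) + P_t^{10}(n) = 2 \, P_s(n) \, (1 - P_s(n)).
\end{align*}
```

For example, $P_s(n) = 0.5$ gives $sw_n = 0.5$, the maximum of this expression, while $P_s(n) = 0.1$ gives $sw_n = 0.18$.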
The early w orks for computing the signal probabilities in a com binational net w ork
adopt adopt the spatial indep endence assumption to k eep the problem manageable In
Figure  w e use signal probabilit y calculation based on OBDDs to illustrate ho w the
spatial indep endence assumption impro v es the eciency  Figure  a sho ws the example
circuit Figure  b sho ws the global function of no de g represen ted b y an OBDD With
the spatial indep endenc e assumption the signal probabilit y P
s
 g  can b e calculated b y
recursiv ely cofactoring the global function of g  that is
g  cg
c

cg
c
P
s
 g   P
s
 c  P  g
c
   P
s
 c  P  g
c

In part (b) of the figure, the left and right branches of the root node compute the two cofactors of g with respect to the root variable c. By recursively applying these equations with respect to all OBDD variables, $P_s(g)$ can be computed efficiently by traversing each OBDD node exactly once using a dynamic programming approach. Without the spatial independence assumption, the second equation becomes invalid. In the worst case, one may need to explicitly enumerate all the disjoint cubes (paths in the OBDD representation) of function g to compute $P_s(g)$.
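The recursive cofactoring described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the OBDD is hand-built as nested tuples `(variable, low_child, high_child)` instead of being produced by a BDD package, the example function g = a·b + c is hypothetical (it is not claimed to be the circuit of the figure), and the inputs are assumed to be spatially independent.

```python
def signal_probability(root, input_probability):
    """Signal probability of the function rooted at `root`.

    Each OBDD node is visited once; results are cached by node identity,
    which is the dynamic programming step.
    """
    memo = {}

    def visit(node):
        if node is True:
            return 1.0
        if node is False:
            return 0.0
        key = id(node)
        if key not in memo:
            variable, low, high = node
            p = input_probability[variable]
            # P(g) = P(c) * P(g_c) + P(not c) * P(g_{not c})
            memo[key] = p * visit(high) + (1.0 - p) * visit(low)
        return memo[key]

    return visit(root)


# Hypothetical example: g = (a AND b) OR c with variable order a, b, c.
NODE_C = ("c", False, True)
NODE_B = ("b", NODE_C, True)    # reached when a = 1
ROOT_G = ("a", NODE_C, NODE_B)  # a = 0 leaves only c; NODE_C is shared

if __name__ == "__main__":
    print(signal_probability(ROOT_G, {"a": 0.5, "b": 0.5, "c": 0.5}))
```

Because the node for c is shared by two parents, the cache ensures it is evaluated only once, which mirrors the single traversal of each OBDD node mentioned in the text.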
When we compute the signal probabilities using OBDDs, the OBDDs can be either global or local. The former refers to OBDDs that are constructed in terms of the variables associated with circuit inputs, while the latter refers to OBDDs that are constructed in terms of the variables associated with some set of intermediate nodes in the fanin cone of the node in question.
We should point out that the temporal and spatial independence assumptions are made only to improve efficiency, at the cost of reduced accuracy. These assumptions do not hold for most real-life input vectors. Therefore, the ability to do power estimation while accounting for real-life temporal and spatial correlations becomes an important criterion when comparing various estimation techniques.
The second class of switching activity estimation techniques, called static techniques, does not explicitly simulate the circuit under the given input vector sequence. This class can be further divided into the following groups: exact and approximate.
 Exact Probabilistic Analysis
These exact techniques provide the exact switching activity estimates under the assumed delay models and specified input statistics. This is achieved by implicit exhaustive enumeration of all input vectors. Unfortunately, the worst-case complexity of these approaches is exponential. Moreover, the data structures employed to perform the implicit enumeration are also exponential in the circuit input size. As a result, the feasibility of these techniques is restricted to small circuits.
Exact Techniques under the Zero Delay Model
In one approach, each of the circuit inputs is associated with a variable name that represents the input signal probability. Given the algebraic expressions, in terms of these variables, of all fanin nodes of an internal circuit node g, an algebraic expression for g can be derived based on the node function of g. The signal probability of each circuit node g is then calculated from the derived algebraic expression. A much more effective approach based on OBDDs has also been proposed; this latter technique was illustrated with the OBDD example figure. Both of these approaches assume temporal and spatial independence of circuit input signals.
Another OBDD-based approach, which considers the temporal correlation of the input signals, uses twice as many OBDD variables as there are circuit inputs. That is, for each circuit input c, two OBDD variables, $c^{1}$ and $c^{2}$, are used to represent the signal values of c under the first and second vectors of a two-vector simulation. The computation of the transition probability of each circuit node g is carried out as follows. Let $g^{01}$ represent the Boolean function for g to produce a 0 to 1 signal change under the zero delay model. We can write

$$g^{01} = \bar{c}^{1}\bar{c}^{2}\, g_{\bar{c}^{1}\bar{c}^{2}} + \bar{c}^{1}c^{2}\, g_{\bar{c}^{1}c^{2}} + c^{1}\bar{c}^{2}\, g_{c^{1}\bar{c}^{2}} + c^{1}c^{2}\, g_{c^{1}c^{2}}$$
$$P_t^{01}(g) = P_t^{00}(c)\, P(g_{\bar{c}^{1}\bar{c}^{2}}) + P_t^{01}(c)\, P(g_{\bar{c}^{1}c^{2}}) + P_t^{10}(c)\, P(g_{c^{1}\bar{c}^{2}}) + P_t^{11}(c)\, P(g_{c^{1}c^{2}})$$

where each subscripted g denotes the cofactor of $g^{01}$ with respect to the indicated values of $c^{1}$ and $c^{2}$.
The ordering of the OBDD variables is arranged so that the two variables for each circuit input c are next to each other. This ordering provides an effective method to find the cofactors directly from the OBDD nodes. A more efficient method of calculating the transition probabilities without variable duplication has also been presented.
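To make the two-vector formulation concrete, the sketch below computes the 0-to-1 transition probability of a node by brute-force enumeration of vector pairs. It is not the OBDD traversal described above; it only evaluates the same quantity on a tiny example. The assumptions are that inputs are spatially independent but may be temporally correlated, that each input is described by four pair probabilities $P_t^{00}, P_t^{01}, P_t^{10}, P_t^{11}$, and that the function and variable names are hypothetical.

```python
from itertools import product


def node_transition_probability_01(g, input_names, pair_probability):
    """P_t^{01}(g): g is 0 under the first vector and 1 under the second.

    pair_probability[name][(a, b)] is the probability that input `name`
    takes value a in the first vector and b in the second.
    """
    n = len(input_names)
    total = 0.0
    for first in product((0, 1), repeat=n):
        if g(*first):
            continue  # g must be 0 under the first vector
        for second in product((0, 1), repeat=n):
            if not g(*second):
                continue  # g must be 1 under the second vector
            weight = 1.0
            for name, a, b in zip(input_names, first, second):
                weight *= pair_probability[name][(a, b)]
            total += weight
    return total


def and2(a, b):
    return a & b


# Temporally independent inputs with signal probability 0.5.
INDEPENDENT = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# Inputs that never change between the two vectors.
FROZEN = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

if __name__ == "__main__":
    names = ["a", "b"]
    print(node_transition_probability_01(and2, names, {"a": INDEPENDENT, "b": INDEPENDENT}))
    print(node_transition_probability_01(and2, names, {"a": FROZEN, "b": FROZEN}))
```

With independent inputs the result equals $(1 - P_s(g)) P_s(g)$, and with frozen inputs it is zero, which shows why ignoring temporal correlation can overestimate the activity.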
Exact Techniques under the Real Delay Model
An approach based on symbolic simulation has been proposed for the real delay model. In the symbolic simulation, the state of an internal circuit node g is described by a set of symbolic functions defined over time in temporal order, e.g., $g_{t_0}, g_{t_1}, \ldots, g_{t_n}$, where $g_{t_i}$ specifies the input vectors that will produce logic one at node g from time $t_i$ until time $t_{i+1}$ (or from $t_n$ onward if $i = n$). The state of each circuit input c is described by two single-variable symbolic functions, one for each vector of the two-vector simulation. The computation of all symbolic functions for each gate can be performed in topological order. This procedure is similar to event-driven simulation. That is, for each symbolic function defined at time t at any fanin of gate g, a new symbolic function at the output of g at time t + d (d is the delay of the gate) is constructed based on the Boolean function of g and the states of all other fanins of g at time t. The $sw_n$ term for node g is calculated by summing the signal probabilities of the exclusive-or of symbolic functions that are defined at two consecutive time instances, e.g.,

$$sw_g = \sum_{i=0}^{n-1} P_s\left(g_{t_i} \oplus g_{t_{i+1}}\right)$$
The major disadvantage of this method is that it is computationally intractable, as the number of symbolic functions for each gate could be large. Moreover, the symbolic functions could become very complicated when one attempts to model the glitch propagation mechanism accurately and potentially filter out glitches that have short width.
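The summation over consecutive symbolic functions can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the symbolic functions are written by hand as Python callables over the first and second input vectors (a real implementation would build them as OBDDs during event-driven propagation), each gate has unit delay, and the circuit y = a AND (NOT a) is a hypothetical example chosen because it glitches.

```python
from itertools import product


def switching_activity(waveform, num_inputs, pair_probability):
    """Sum of P_s(g_{t_i} XOR g_{t_{i+1}}) over consecutive time points.

    `waveform` lists the functions g_{t_0}, ..., g_{t_n}; each maps the
    (first, second) input vectors to the node value over that interval.
    """
    vectors = list(product((0, 1), repeat=num_inputs))
    total = 0.0
    for first in vectors:
        for second in vectors:
            weight = pair_probability(first, second)
            for earlier, later in zip(waveform, waveform[1:]):
                if earlier(first, second) != later(first, second):
                    total += weight
    return total


# y = a AND (NOT a); the inverter output lags the input by one delay unit.
GLITCHY_AND = [
    lambda first, second: first[0] & (1 - first[0]),    # settled under the first vector
    lambda first, second: second[0] & (1 - first[0]),   # a has changed, inverter has not
    lambda first, second: second[0] & (1 - second[0]),  # settled under the second vector
]


def uniform_pairs(first, second):
    """One input, four equally likely (first, second) value pairs."""
    return 0.25


if __name__ == "__main__":
    print(switching_activity(GLITCHY_AND, 1, uniform_pairs))
```

Under the zero delay model this node never switches, so the nonzero result comes entirely from the glitch produced when the input rises.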
So far, no exact techniques have been proposed for considering the spatial correlation of input signals under either the zero or the real delay model. The difficulty lies in the fact that spatial correlations can, in general, exist among every m-tuple of inputs, for m up to the number of circuit inputs n. In the worst case, there is an exponential number of spatial correlation parameters that need to be considered, compared to a number that is only linear in n for the temporal correlation case.
Approximate Techniques
These techniques were developed as approximations of the implicit enumeration approaches. They are referred to as approximate probabilistic techniques mainly because the probabilistic quantities, such as signal and transition probabilities, are explicitly (as opposed to implicitly, as in the exact techniques) propagated in the networks.
Probabilistic Techniques under the Zero Delay Model
An efficient algorithm has been proposed to estimate the signal probability of each internal node using pairwise correlation coefficients among circuit nodes. This technique allows the spatial correlations between pairs of circuit input signals to be considered. A more general approach that accounts for spatiotemporal correlations has also been proposed. The mathematical foundation of this extension is a four-state time-homogeneous Markov chain, where each state represents some assignment of binary values to two lines x and y, and each edge describes the conditional probability of going from one state to the next. The computational requirement of this extension can, however, be high, since it is linear in the product of the number of nodes and the number of paths in the OBDD representation of the Boolean function in question. A practical method using local OBDD constructions and dynamic level bounding is described by the authors.
This work has been extended to handle highly correlated input streams using the notions of conditional independence and isotropy of signals. Based on these notions, it is shown that the relative error in calculating the signal probability of a logic gate using pairwise correlation coefficients can be bounded from above.
Probabilistic Techniques under the Real Delay Model
In CREST, the concept of probability waveforms is proposed to estimate the mean and variance of the current drawn by each circuit node. Although CREST was originally developed for switch-level simulation, the concept of probability waveforms can be easily extended to the gate level as well. A probability waveform consists of an initial signal probability and a sequence of transition events occurring at different time instances. Associated with each transition event is the probability of the signal change. In a probability waveform, one can calculate the signal probability at any time instance t from the initial signal probability and the probabilities of the transition events that occurred before t. The propagation mechanism for probability waveforms is event-driven in nature. For each transition event arriving at time t at the input of a gate g, a transition event is scheduled at the gate output at time t + d (d is the delay of the gate); the probability of the event is calculated based on the probability of the input events and the signal probabilities of the other fanins at time t.
In DENSIM, the notion of transition density, which is the average number of transitions per second, is proposed. It can replace $sw_n / T_c$ in the power calculation equation to compute the power dissipation for each node. An efficient algorithm based on the Boolean difference operation is proposed to propagate the transition densities from circuit inputs throughout the circuit, with the transition density of each node calculated based on the following formula:

$$D(y) = \sum_{i=1}^{n} P\left(\frac{\partial y}{\partial x_i}\right) D(x_i)$$

where y is the output of a node, the $x_i$'s are the inputs of y, $D(y)$ is the transition density of node y, and $\frac{\partial y}{\partial x_i}$ is the Boolean difference of y with respect to $x_i$. $P\left(\frac{\partial y}{\partial x_i}\right)$ is calculated using OBDDs. The accuracy of the transition density equation can be improved by considering higher-order Boolean differences.
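The transition density formula above can be sketched for a single gate as follows. This is a minimal illustration, not the DENSIM implementation: the probability of each Boolean difference is obtained by enumerating the other inputs rather than with OBDDs, the inputs are assumed to be spatially independent, and all names are hypothetical.

```python
from itertools import product


def boolean_difference_probability(f, num_inputs, i, signal_probability):
    """P(dy/dx_i): probability that toggling input i toggles the output."""
    others = [k for k in range(num_inputs) if k != i]
    total = 0.0
    for values in product((0, 1), repeat=num_inputs - 1):
        x = [0] * num_inputs
        weight = 1.0
        for k, v in zip(others, values):
            x[k] = v
            weight *= signal_probability[k] if v else 1.0 - signal_probability[k]
        x[i] = 0
        y0 = f(x)
        x[i] = 1
        y1 = f(x)
        if y0 != y1:  # Boolean difference y|x_i=1 XOR y|x_i=0 is 1
            total += weight
    return total


def transition_density(f, signal_probability, input_density):
    """D(y) = sum over i of P(dy/dx_i) * D(x_i)."""
    n = len(signal_probability)
    return sum(
        boolean_difference_probability(f, n, i, signal_probability) * input_density[i]
        for i in range(n)
    )


if __name__ == "__main__":
    p = [0.5, 0.5]  # input signal probabilities
    d = [2.0, 4.0]  # input transition densities (transitions per second)
    print(transition_density(lambda x: x[0] & x[1], p, d))  # AND gate
    print(transition_density(lambda x: x[0] ^ x[1], p, d))  # XOR gate
```

For the AND gate each input is observable only when the other input is 1, whereas every input transition of the XOR gate reaches the output, so the XOR output density is simply the sum of the input densities.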
Under the real delay model, the correlation between gate fanins is the major source of inaccuracy for the probabilistic techniques. Moreover, because the signals may change more than once, it is very difficult to accurately model these correlations. Based on an assumption about the probability distribution function of the glitch width, a conceptual low-pass filter module has been proposed that improves the accuracy of DENSIM.
In TPS, the concept of probability waveforms is extended to partially account for the signal correlation among different gate fanins. The signal correlation is approximated by the steady-state correlations under the zero delay model. To implement the technique efficiently, the probability waveform of a node is divided into four tagged waveforms based on its initial and final steady states. For a node g, joint tagged waveforms are formed at the gate inputs. After the probability waveforms for all joint tagged waveforms are calculated, they are combined into four tagged waveforms according to a forcing set table derived from the node function of g. OBDDs are used to calculate the probabilities of each tagged waveform and the signal correlations. This approach requires significantly less memory and runs much faster than symbolic simulation, yet achieves high accuracy in aggregate power consumption. One interesting characteristic of this approach is that it gives the exact power estimate when the zero delay model is assumed. Moreover, the technique can be combined with methods that consider the spatiotemporal correlations of circuit input signals.
Both DENSIM and TPS use OBDDs to improve their efficiency; their computational work can be divided into two phases. The first phase constructs the required OBDDs and computes the signal probabilities or correlations using OBDDs; the second phase computes either the transition density or the tagged waveforms during a postorder traversal from circuit inputs to circuit outputs.
Probabilistic Techniques for Finite State Machines
The above-mentioned probabilistic methods for power estimation focus on combinational logic circuits. Accurate average switching activity estimation for FSMs is considerably more difficult than that for combinational circuits, for two reasons:
1. The probability of the circuit being in each of its possible states has to be calculated.
2. The present-state line inputs of the FSM are strongly correlated; that is, they are temporally correlated due to the machine behavior as represented in its State Transition Graph (STG) description, and they are spatially correlated because of the given state encoding.
A first attempt at estimating switching activity in FSMs was based on the following idea: unroll the next-state logic once (thus capturing the temporal correlations of present-state lines) and then perform symbolic simulation on the resulting circuit, which is hence treated as a combinational circuit. This method does not, however, capture the spatial correlations among present-state lines, and it makes the simplistic assumption that the state probabilities are uniform.
This work was subsequently improved as follows. For each state $s_i$, $1 \le i \le K$, in the STG, we associate a variable $prob(s_i)$ corresponding to the steady-state probability of the machine being in state $s_i$ (that is, as $t \to \infty$). For each edge e in the STG, we have $e.Current$ signifying the state that the edge fans out from, $e.Next$ signifying the state that the edge fans out to, and $e.Input$ signifying the input combination corresponding to the edge. Given static probabilities for the primary inputs to the machine, we can compute $prob(Input)$, the probability of the combination $Input$ occurring.* We can compute $prob(e.Input)$ using

$$prob(e.Input) = prob(e.Current) \times prob(Input)$$

For each state $s_i$ we can write an equation

$$prob(s_i) = \sum_{\forall e \text{ such that } e.Next = s_i} prob(e.Input)$$

Given K states, we obtain K equations, out of which any one equation can be derived from the remaining K − 1 equations. We have a final equation

$$\sum_{i=1}^{K} prob(s_i) = 1$$
This linear set of K equations can be solved to obtain the different $prob(s_i)$'s. This system of equations is known as the Chapman-Kolmogorov equations for a discrete-time discrete-transition Markov process. Indeed, if the process satisfies the conditions that it has a finite number of states, its essential states form a single chain, and it contains no periodic states, then the above system of equations will have a unique solution.
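The state probability equations above can be set up and solved for a small machine as in the following Python sketch. It is an illustration under stated assumptions: the STG is given explicitly as a list of `(current, next, input_probability)` edges, one balance equation is replaced by the normalization equation, and a plain Gauss-Jordan elimination is used instead of a numerical library. The two-state machine is hypothetical.

```python
def steady_state_probabilities(num_states, edges):
    """Solve prob(s_i) = sum over edges into s_i of prob(e.Current) * prob(e.Input),
    together with sum of prob(s_i) = 1."""
    k = num_states
    # Row i: prob(s_i) - sum_{e.Next = s_i} prob(e.Current) * prob(e.Input) = 0
    a = [[0.0] * (k + 1) for _ in range(k)]
    for i in range(k):
        a[i][i] = 1.0
    for current, nxt, p_input in edges:
        a[nxt][current] -= p_input
    # One balance equation is redundant; use the normalization instead.
    a[k - 1] = [1.0] * k + [1.0]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system: no unique steady state")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, k + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical machine with one primary input x, P(x = 1) = 0.25:
# s0 stays in s0 on x = 0 and moves to s1 on x = 1; s1 always returns to s0.
EDGES = [
    (0, 0, 0.75),
    (0, 1, 0.25),
    (1, 0, 0.75),
    (1, 0, 0.25),
]

if __name__ == "__main__":
    print(steady_state_probabilities(2, EDGES))
```

The explicit matrix has one row per state, which is exactly why this formulation does not scale to machines with many flip-flops.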
The Chapman-Kolmogorov method requires the solution of a linear system of equations with one unknown per state, and the number of states can grow as $2^N$, where N is the number of flip-flops in the machine. In general, this method cannot handle circuits with a large number of flip-flops because it requires explicit consideration of each state in the circuit. On the positive side, state probabilities for some very large FSMs have been calculated using a fully implicit technique.
A method for approximate switching activity estimation of sequential circuits has also been described. The basic computation step is the solution of a nonlinear system of equations as follows:

$$prob(ns_1) = prob(f_1(i_1, i_2, \ldots, i_M, ps_1, ps_2, \ldots, ps_N))$$
$$prob(ns_2) = prob(f_2(i_1, i_2, \ldots, i_M, ps_1, ps_2, \ldots, ps_N))$$
$$\vdots$$
$$prob(ns_N) = prob(f_N(i_1, i_2, \ldots, i_M, ps_1, ps_2, \ldots, ps_N))$$

where $prob(ns_i)$ corresponds to the probability that $ns_i$ is a 1, and $prob(f_i(i_1, i_2, \ldots, i_M, ps_1, ps_2, \ldots, ps_N))$ corresponds to the probability that $f_i(i_1, i_2, \ldots, i_M, ps_1, ps_2, \ldots, ps_N)$ is a 1, which is, of course, dependent on the $prob(ps_j)$ and the $prob(i_k)$.
We are interested in the steady-state probabilities of the present-state and next-state lines, implying that

$$prob(ps_i) = prob(ns_i) = p_i, \quad 1 \le i \le N$$

A similar relationship was used in the Chapman-Kolmogorov equations.

* Static probabilities can be computed from specified transition probabilities.
The set of equations, given the values of $prob(i_k)$, becomes

$$y_1 = p_1 - g_1(p_1, p_2, \ldots, p_N) = 0$$
$$y_2 = p_2 - g_2(p_1, p_2, \ldots, p_N) = 0$$
$$\vdots$$
$$y_N = p_N - g_N(p_1, p_2, \ldots, p_N) = 0$$
where the $g_i$'s are nonlinear functions of the $p_i$'s. We will denote the above equations as $Y(P) = 0$ or as $P = G(P)$. In general, the Boolean function $f_i$ can be written as a list of minterms over the $i_k$ and $ps_j$, and the corresponding $g_i$ function can be easily derived.
The fixed point (or zero) of this system of equations, $P = G(P)$ (or $Y(P) = 0$), can be found using the Picard-Peano (or Newton-Raphson) iteration. The uniqueness or the existence of the solution is not guaranteed for an arbitrary system of nonlinear equations. However, since in this application there is a correspondence between the nonlinear system of equations and the State Transition Graph of the sequential circuit, there will exist at least one solution to the nonlinear system. Further, convergence is guaranteed under mild assumptions for this application.
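The Picard-Peano iteration for $P = G(P)$ simply applies G repeatedly until the line probabilities stop changing. The sketch below shows this on a hypothetical two-flip-flop circuit with next-state functions $ns_1 = i \oplus ps_1$ and $ns_2 = i \cdot ps_1$. The $g_i$ functions are derived under the assumption that the primary input and the present-state lines are mutually independent, which is the approximation that makes the method inexact; the function names, tolerance, and iteration limit are also assumptions.

```python
def picard_peano(g, initial, tolerance=1e-12, max_iterations=10_000):
    """Iterate P <- G(P) until successive iterates differ by less than tolerance."""
    p = list(initial)
    for _ in range(max_iterations):
        nxt = g(p)
        if max(abs(a - b) for a, b in zip(nxt, p)) < tolerance:
            return nxt
        p = nxt
    raise RuntimeError("Picard-Peano iteration did not converge")


P_INPUT = 0.25  # assumed static probability of the primary input i


def g_example(p):
    """g_1 for ns_1 = i XOR ps_1 and g_2 for ns_2 = i AND ps_1,
    treating i and ps_1 as independent."""
    p1 = p[0]
    g1 = P_INPUT * (1.0 - p1) + (1.0 - P_INPUT) * p1
    g2 = P_INPUT * p1
    return [g1, g2]


if __name__ == "__main__":
    print(picard_peano(g_example, [0.0, 0.0]))
```

Only N line probabilities are tracked here instead of one probability per state, which is the source of the speed advantage over the exact formulation.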
Increasing the number of variables or the number of equations in the above system results in increased accuracy. For a wide variety of examples, it is shown that the approximation scheme is close in accuracy to the exact method but is orders of magnitude faster for large circuits. Previous sequential switching activity estimation methods exhibit significantly greater inaccuracies.
Transistor-Level Power Estimation
Transistor-level simulators provide the highest accuracy for circuit power estimation. They are capable of handling various device models, different circuit design styles, single-phase and multi-phase clocking methodologies, tristate drives, etc. However, they suffer from memory and execution time constraints and are not suitable for large cell-based designs. Circuit techniques for measuring power (capacitive and short-circuit power components) using "power meters" have been described. A fast and accurate circuit-level simulator based on stepwise equivalent conductance and piecewise-linear waveform approximation has also been described.
PowerMill is a transistor-level power simulator and analyzer that applies an event-driven timing simulation algorithm (based on simplified table-driven device models, circuit partitioning, and single-step nonlinear iteration) to increase the speed by two to three orders of magnitude over SPICE while keeping its results close to those of SPICE. It reports detailed power information (instantaneous, average, and RMS current values) as well as the total power consumption due to capacitance currents, transient short-circuit currents, and leakage currents.
Conclusions
The increased degree of automation of industrial design frameworks has produced a substantial change in the way digital ICs are developed. The design of modern systems usually starts from specifications given at a very high level of abstraction. This is because existing EDA tools are able to automatically produce low-level design implementations directly from descriptions of this type.
It is widely recognized that power consumption has become a critical issue in the development of digital systems; electronic designers therefore need tools that allow them to explicitly control the power budget during the various phases of the design process. This is because the power savings obtainable through automatic optimization are usually more significant than those achievable by means of technological choices (e.g., process and supply-voltage scaling).
In this paper, we have provided a non-exhaustive review of existing methodologies and tools for high-level power modeling and estimation, as well as for power-constrained synthesis and optimization. Such methodologies and tools are younger, and therefore less developed, than those available at the gate and circuit level. A wealth of research results and a few pioneering commercial tools have appeared nonetheless in the last couple of years. We expect this field to remain quite active in the foreseeable future. New trends and techniques will emerge; some approaches described in this review will consolidate, while others will become obsolete in view of technological and strategic changes in the world of microelectronics.
References
F. N. Najm, "A Survey of Power Estimation Techniques in VLSI Circuits," IEEE Transactions on VLSI Systems.
M. Pedram, "Power Minimization in IC Design: Principles and Applications," ACM Transactions on Design Automation of Electronic Systems.
J. M. Rabaey and M. Pedram (Editors), Low Power Design Methodologies, Kluwer Academic Publishers.
J. Mermet and W. Nebel (Editors), Low Power Design in Deep Submicron Electronics, Kluwer Academic Publishers.
T. Sato, Y. Ootaguro, M. Nagamatsu, H. Tago, "Evaluation of Architectural-Level Power Estimation for CMOS RISC Processors," ISLPE: IEEE International Symposium on Low Power Electronics, San Jose, CA, October.
C.-L. Su, C.-Y. Tsui, A. M. Despain, "Low Power Architecture Design and Compilation Techniques for High-Performance Processors," IEEE CompCon, February.
V. Tiwari, S. Malik, A. Wolfe, "Power Analysis of Embedded Software: A First Step Towards Software Power Minimization," IEEE Transactions on VLSI Systems.
C.-T. Hsieh, M. Pedram, H. Mehta, F. Rastgar, "Profile-Driven Program Synthesis for Evaluation of System Power Dissipation," DAC: ACM/IEEE Design Automation Conference, Anaheim, CA, June.
D. Marculescu, R. Marculescu, M. Pedram, "Information Theoretic Measures for Power Analysis," IEEE Transactions on CAD.
M. Nemani, F. Najm, "Towards a High-Level Power Estimation Capability," IEEE Transactions on CAD.
K.-T. Cheng, V. D. Agrawal, "An Entropy Measure for the Complexity of Multi-Output Boolean Functions," DAC: ACM/IEEE Design Automation Conference, Orlando, FL, June.
F. Ferrandi, F. Fummi, E. Macii, M. Poncino, D. Sciuto, "Power Estimation of Behavioral Descriptions," DATE: IEEE Design Automation and Test in Europe, Paris, France, February.
R. E. Bryant, "Graph-Based Algorithms for Boolean Function Manipulation," IEEE Transactions on CAD, August.
A. Tyagi, "Entropic Bounds on FSM Switching," IEEE Transactions on VLSI Systems.
D. Marculescu, R. Marculescu, and M. Pedram, "Theoretical Bounds for Switching Activity Analysis in Finite-State Machines," ISLPED: ACM/IEEE International Symposium on Low Power Electronics and Design, Monterey, CA, August.
K. Muller-Glaser, K. Kirsch, K. Neusinger, "Estimating Essential Design Characteristics to Support Project Planning for ASIC Design Management," ICCAD: IEEE/ACM International Conference on Computer Aided Design, Santa Clara, CA, November.
M. Nemani, F. Najm, "High-Level Area Prediction for Power Estimation," CICC: Custom Integrated Circuits Conference, Santa Clara, CA, May.
M. Nemani, F. Najm, "High-Level Area and Power Estimation for VLSI Circuits," ICCAD: IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, November.
P. Landman, J. Rabaey, "Activity-Sensitive Architectural Power Analysis for the Control Path," ISLPD: ACM/IEEE International Symposium on Low Power Design, Dana Point, CA, April.
A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, R. W. Brodersen, "Optimizing Power Using Transformations," IEEE Transactions on CAD.
J. M. Chang, M. Pedram, "Module Assignment for Low Power," EuroDAC: IEEE European Design Automation Conference, Geneva, Switzerland, September.
N. Kumar, S. Katkoori, L. Rader, R. Vemuri, "Profile-Driven Behavioral Synthesis for Low Power VLSI Systems," IEEE Design and Test of Computers.
R. San Martin, J. Knight, "Optimizing Power in ASIC Behavioral Synthesis," IEEE Design and Test of Computers.
L. Benini, A. Bogliolo, M. Favalli, G. De Micheli, "Regression Models for Behavioral Power Estimation," PATMOS: International Workshop on Power and Timing Modeling, Optimization and Simulation, Bologna, Italy, September.
L. Benini, A. Bogliolo, G. De Micheli, "Characterization-Free Behavioral Power Modeling," DATE: IEEE Design Automation and Test in Europe, Paris, France, February.
L. Benini, A. Bogliolo, G. De Micheli, "Adaptive Least Mean Square Behavioral Power Modeling," EDTC: IEEE European Design and Test Conference, Paris, France, March.
S. Powell, P. Chau, "Estimating Power Dissipation of VLSI Signal Processing Chips: The PFA Techniques," IEEE Workshop on VLSI Signal Processing, Vol. IV.
P. Landman, J. Rabaey, "Power Estimation for High-Level Synthesis," EDAC: IEEE European Conference on Design Automation, Paris, France, February.
S. Gupta, F. N. Najm, "Power Macromodeling for High-Level Power Estimation," DAC: ACM/IEEE Design Automation Conference, Anaheim, CA, June.
D. Liu, C. Svensson, "Power Consumption Estimation in CMOS VLSI Chips," IEEE Journal of Solid State Circuits.
H. Mehta, R. Owens, M. J. Irwin, "Energy Characterization Based on Clustering," DAC: ACM/IEEE Design Automation Conference, Las Vegas, NV, June.
Q. Wu, C.-S. Ding, C.-T. Hsieh, M. Pedram, "Statistical Design of Macro-Models for RT-Level Power Evaluation," ASPDAC: ACM/IEEE Asia South Pacific Design Automation Conference, Chiba, Japan, January.
Q. Qiu, Q. Wu, M. Pedram, C.-S. Ding, "Cycle-Accurate Macro-Models for RT-Level Power Analysis," ISLPED: ACM/IEEE International Symposium on Low Power Electronics and Design, Monterey, CA, August.
C.-T. Hsieh, C.-S. Ding, Q. Wu, M. Pedram, "Statistical Sampling and Regression Estimation in Power Macro-Modeling," ICCAD: IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, November.
C. M. Huizer, "Power Dissipation Analysis of CMOS VLSI Circuits by means of Switch-Level Simulation," IEEE European Solid State Circuits Conference.
R. Burch, F. Najm, P. Yang, T. Trick, "A Monte Carlo Approach for Power Estimation," IEEE Transactions on VLSI Systems.
C.-S. Ding, C.-T. Hsieh, Q. Wu, M. Pedram, "Stratified Random Sampling for Power Estimation," ICCAD: IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, November.
L.-P. Yuan, C.-C. Teng, S.-M. Kang, "Statistical Estimation of Average Power Dissipation in CMOS VLSI Circuits Using Nonparametric Technique," ISLPED: ACM/IEEE International Symposium on Low Power Electronics and Design, Monterey, CA, August.
F. N. Najm, S. Goel, and I. N. Hajj, "Power Estimation in Sequential Circuits," DAC: ACM/IEEE Design Automation Conference, San Francisco, CA, June.
T.-L. Chou, K. Roy, "Statistical Estimation of Sequential Circuit Activity," ICCAD: IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, November.
L.-P. Yuan, C.-C. Teng, S.-M. Kang, "Statistical Estimation of Average Power Dissipation in Sequential Circuits," DAC: ACM/IEEE Design Automation Conference, Anaheim, CA, June.
A. Hill, C.-C. Teng, S. M. Kang, "Simulation-Based Maximum Power Estimation," ISCAS: IEEE International Symposium on Circuits and Systems, Vol. IV, Atlanta, GA, May.
C.-S. Ding, Q. Wu, C.-T. Hsieh, M. Pedram, "Statistical Estimation of the Cumulative Distribution Function for Power Dissipation in VLSI Circuits," DAC: ACM/IEEE Design Automation Conference, Anaheim, CA, June.
C.-Y. Tsui, D. Marculescu, R. Marculescu, and M. Pedram, "Improving the Efficiency of Power Simulators by Input Vector Compaction," DAC: ACM/IEEE Design Automation Conference, Las Vegas, NV, June.
D. Marculescu, R. Marculescu, M. Pedram, "Stochastic Sequential Machine Synthesis Targeting Constrained Sequence Generation," DAC: ACM/IEEE Design Automation Conference, Las Vegas, NV, June.
R. Marculescu, D. Marculescu, M. Pedram, "Adaptive Models for Input Data Compaction for Power Simulators," ASPDAC: ACM/IEEE Asia-Pacific Design Automation Conference, Chiba, Japan, January.
G. V. Cormack, R. N. Horspool, "Data Compression Using Dynamic Markov Modeling," Computer Journal.
R. Marculescu, D. Marculescu, M. Pedram, "Power Estimation Using Hierarchical Markov Models," DAC: ACM/IEEE Design Automation Conference, Anaheim, CA, June.
D. Marculescu, R. Marculescu, M. Pedram, "Sequence Compaction for Probabilistic Analysis of Finite State Machines," DAC: ACM/IEEE Design Automation Conference, Anaheim, CA, June.
R. Marculescu, D. Marculescu, M. Pedram, "Composite Sequence Compaction for Finite State Machines Using Block Entropy and Higher-Order Markov Models," ISLPED: ACM/IEEE International Symposium on Low Power Electronics and Design, Monterey, CA, August.
K. P. Parker and J. McCluskey, "Probabilistic Treatment of General Combinational Networks," IEEE Transactions on Computers, June.
S. Chakravarty, "On the Complexity of Using BDDs for the Synthesis and Analysis of Boolean Circuits," Annual Allerton Conference on Communication, Control and Computing.
P. Schneider and U. Schlichtmann, "Decomposition of Boolean Functions for Low Power Based on a New Power Estimation Technique," WLPD: International Workshop on Low Power Design, Napa, CA, April.
R. Marculescu, D. Marculescu, and M. Pedram, "Switching Activity Analysis Considering Spatiotemporal Correlation," ICCAD: IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, November.
A. Ghosh, S. Devadas, K. Keutzer, S. Malik, and J. White, "Estimation of Average Switching Activity in Combinational and Sequential Circuits," DAC: ACM/IEEE Design Automation Conference, Anaheim, CA, June.
S. Ercolani, M. Favalli, M. Damiani, P. Olivo, and B. Ricco, "Testability Measures in Pseudorandom Testing," IEEE Transactions on CAD, June.
R. Marculescu, D. Marculescu, M. Pedram, "Efficient Power Estimation for Highly Correlated Input Streams," DAC: ACM/IEEE Design Automation Conference, San Francisco, CA, June.
F. Najm, R. Burch, P. Yang, I. Hajj, "Probabilistic Simulation for Reliability Analysis of CMOS VLSI Circuits," IEEE Transactions on CAD.
F. Najm, "Transition Density: A New Measure of Activity in Digital Circuits," IEEE Transactions on CAD.
T. L. Chou, K. Roy, and S. Prasad, "Estimation of Circuit Activity Considering Signal Correlation and Simultaneous Switching," ICCAD: IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, November.
F. Najm, "Low-Pass Filter for Computing the Transition Density in Digital Circuits," IEEE Transactions on CAD, September.
C.-Y. Tsui, M. Pedram, A. M. Despain, "Efficient Estimation of Dynamic Power Dissipation Under a Real Delay Model," ICCAD: IEEE/ACM International Conference on Computer Aided Design, Santa Clara, CA, November.
C.-Y. Tsui, J. Monteiro, M. Pedram, S. Devadas, A. M. Despain, B. Lin, "Power Estimation in Sequential Logic Circuits," IEEE Transactions on VLSI Systems.
G. D. Hachtel, E. Macii, A. Pardo, F. Somenzi, "Markovian Analysis of Large Finite State Machines," IEEE Transactions on CAD.
H. M. Lieberstein, A Course in Numerical Analysis, Harper & Row Publishers.
S. M. Kang, "Accurate Simulation of Power Dissipation in VLSI Circuits," IEEE Journal of Solid State Circuits, October.
G. Y. Yacoub, W. H. Ku, "An Enhanced Technique for Simulating Short-Circuit Power Dissipation," IEEE Journal of Solid-State Circuits, June.
P. Buch, S. Lin, V. Nagasamy, and E. S. Kuh, "Techniques for Fast Circuit Simulation Applied to Power Estimation of CMOS Circuits," SLPD: ACM/IEEE International Symposium on Low Power Design, Dana Point, CA, April.
C. X. Huang, B. Zhang, A.-C. Deng, and B. Swirski, "The Design and Implementation of PowerMill," SLPD: ACM/IEEE International Symposium on Low Power Design, Dana Point, CA, April.