Minimum Energy CMOS Circuit Design Considering Variation


Chris Ferguson, Max Korbel, and David Money Harris

Harvey Mudd College, Claremont, CA 91711


Abstract. Near-threshold and sub-threshold voltage operation is a potentially very useful way to lower power consumption in devices that must be very energy efficient but do not require intensive computation. Most research has focused on lowering the total energy by dropping the supply voltage to a very low value and extending the cycle time as necessary to reduce the failure rate. Random dopant fluctuation, however, can cause intrinsic errors due to mismatch between consecutive devices, and these errors cannot be solved by simply increasing the cycle time. Taking this process variation into account may raise the optimal minimum-energy near-threshold voltage above the value previously expected.


1 Introduction

A growing body of applications, including wireless sensor networks and energy-scavenging systems, has modest computational requirements but needs extremely low energy per operation. Such systems are typically operated at a relatively low voltage while actively computing, and then turned off to stop leakage when the computation is complete. The best operating voltage is a balance of dynamic and leakage energy; operating at a lower voltage reduces dynamic energy but increases the computation time and hence the total leakage energy. Early work in this field concluded that minimum energy is achieved by using minimum-sized transistors nearly everywhere and operating sub-threshold at a supply voltage of about 300 mV.
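
The balance between dynamic and leakage energy can be illustrated with a first-order model. The sketch below is not the model used in this study: it assumes a switched-capacitance dynamic term, a constant leakage current, and a delay that grows exponentially as the supply voltage drops. Every parameter value and function name is hypothetical and chosen only to make the shape of the curve visible.

```python
import math

# Hypothetical first-order parameters (illustrative only, not from the paper).
C_SWITCHED = 1e-15   # switched capacitance per operation, farads
I_LEAK = 1e-9        # leakage current, amperes (treated as constant)
T0 = 1e-9            # operation delay at the reference voltage, seconds
V_REF = 0.6          # reference supply voltage, volts
N_VT = 0.04          # assumed slope of the exponential delay growth, volts


def delay(vdd):
    """Operation delay; grows exponentially as vdd drops below V_REF."""
    return T0 * math.exp((V_REF - vdd) / N_VT)


def dynamic_energy(vdd):
    """Switching energy per operation, C * V^2."""
    return C_SWITCHED * vdd ** 2


def leakage_energy(vdd):
    """Energy leaked while the operation completes, I * V * T."""
    return I_LEAK * vdd * delay(vdd)


def total_energy(vdd):
    return dynamic_energy(vdd) + leakage_energy(vdd)


def min_energy_voltage(v_lo=0.2, v_hi=0.8, steps=601):
    """Brute-force search for the supply voltage with the lowest total energy."""
    grid = [v_lo + i * (v_hi - v_lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=total_energy)
```

In this toy model, lowering the voltage shrinks the dynamic term while the leakage term grows, so `min_energy_voltage()` lands at an interior point of the grid; increasing `I_LEAK` moves that point toward higher voltage.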


Much of this work is based on the assumption that, at this sizing and voltage, all circuits will function properly given a sufficiently long delay. This assumption fails to consider the possibility that mismatch between devices could cause a time-independent failure that can never be corrected by waiting longer. This type of failure would be caused by mismatch between consecutive devices, which arises primarily from stochastic effects such as random dopant fluctuation (RDF). As the supply voltage is lowered and sizing is kept small, these random effects are more likely to cause a catastrophic mismatch. If these failures force a chip to move away from the minimum-energy voltage, then it is possible that a circuit design created to prevent these failures might allow for lower-energy operation.


2 Background

Previous work suggests that the optimal operating voltage for minimum energy is near 300 mV [2] and predicts that, even when considering process variation, the nominal delay is 15 FO4 delays [1]. One paper [4] even claims that functional operation is possible at 60 mV.


Most previous work does not treat RDF as a significant issue and predicts that including RDF will simply require longer hold times. This study shows, however, that many errors are intrinsic and will cause failures regardless of how far the clock cycle is extended. This research suggests that these intrinsic failures push the optimal voltage up closer to 500 mV.


Some papers, such as [3], claim that larger sizing is necessary to account for process variation, but this research suggests that minimum sizing at a higher voltage is actually the lower-energy option. Other research has focused on designing circuits specifically for operation in the sub-threshold or near-threshold regions.


3 Simulation Methods

This study was carried out using a 65 nm IBM process. The investigation primarily focused on examining the circuit design space in terms of circuit sizing and supply voltage. By investigating the failure rate of circuits in this design space as well as their energy, a reliability-focused minimum-energy design could be found. In order to best characterize circuits in this technology, several different circuits were simulated and examined: a nand2, a nor2, an inverter, a chain of 12 fanout-of-four inverters, and a latch. The design of the latch can be seen in Figure 1.




Figure 1. The design of the latch used in this study.



Based on preliminary investigation, current through the pMOS transistors causes the circuits to fail. Thus, from the perspective of this study, changing device sizing means changing the width of the pMOS transistors. For the combinational circuits, this sizing change is simply a multiple of the minimum width applied to the pMOS transistor(s).



3.1 Failure Criteria

Because of the high degree of variation in the circuits involved, the definition of what constitutes a device failure was carefully considered. The criteria were based on the static noise margins of an ideal inverter at a given voltage. For any measured circuit, a successful high value was at or above the V_OH of an ideal inverter at the given operational voltage, and a successful low value was below the V_OL under the same conditions. Applying these criteria to an individual device's output is intended to provide the best indication of when that device would cause an overall failure in a chip.


In addition to these criteria on the output, DC noise was added to the inputs (and clock) of all circuits under test. This DC offset was set to be the same as the output failure criteria, V_OH and V_OL. This allows the simulation to best reflect the worst-case scenario, in which a marginal gate precedes the device under test.
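
The pass/fail rule above can be sketched as a small classifier. The function names and any threshold values passed in are hypothetical; in the study the thresholds come from the static noise margins of an ideal inverter at each supply voltage. The `marginal_input` helper reflects one reading of the DC offset (inputs driven at the thresholds themselves and not at the rails), which is an assumption.

```python
def output_fails(v_out, expected_high, v_oh, v_ol):
    """Return True if a measured output voltage violates the failure criteria.

    A high output passes when it is at or above v_oh; a low output passes
    when it is below v_ol. Both thresholds are supplied by the caller.
    """
    if expected_high:
        return v_out < v_oh
    return v_out >= v_ol


def failure_rate(samples, expected_high, v_oh, v_ol):
    """Fraction of Monte Carlo output samples that fail the criteria."""
    failures = sum(output_fails(v, expected_high, v_oh, v_ol) for v in samples)
    return failures / len(samples)


def marginal_input(logic_high, v_oh, v_ol):
    """Assumed worst-case input level: the threshold, not the supply rail."""
    return v_oh if logic_high else v_ol
```

A per-circuit failure rate is then the fraction of Monte Carlo samples for which `output_fails` returns True.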



3.2 Latch Sizing Effects

An initial investigation determined that the optimal way to resize the latch is to upsize all of the pMOS transistors in the different devices in the latch, rather than changing any specific inverter or tristate. This was found by measuring the different failure modes detailed in section REF for a minimum-sized latch and comparing them to the failure rates when each of the individual devices is upsized.


The simulation was performed using a 500-point Monte Carlo simulation at 0.3 V, including local mismatch and global variation as determined by the IBM model. Table 1 details the results from this simulation. As shown, upsizing device X1 improves the write-low failure rate, X2 improves the write-high failure rate, X3 improves the hold-high failure rate, and X4 improves the hold-low failure rate.



                  X1              X2              X3              X4              All Data
                  1x      2x      1x      2x      1x      2x      1x      2x      1x      2x
Write-high        8e-3    8e-3    8e-3    0       8e-3    8e-3    8e-3    8e-3    8e-3    0
Hold-high         26e-3   26e-3   26e-3   18e-3   26e-3   10e-3   26e-3   34e-3   26e-3   2e-3
Write-low         8e-3    2e-3    8e-3    12e-3   8e-3    10e-3   8e-3    8e-3    8e-3    2e-3
Hold-low          30e-3   24e-3   30e-3   30e-3   30e-3   46e-3   30e-3   12e-3   30e-3   4e-3
Total Failures    56e-3   50e-3   56e-3   52e-3   56e-3   56e-3   56e-3   46e-3   56e-3   8e-3

Table 1. Latch device resizing results at 0.3 V, simulation size N = 500. Failure rates are given as the fraction of the total number of runs that failed; for example, a rate of 5e-1 means that half of the runs had a device failure.



From these results, it is clear that no individual device in the latch can be upsized to change the total failure rate with much success: the best single-device change lowers it only from 56e-3 to 46e-3. However, if all the pMOS transistors in all the devices are upsized, the number of latch failures drops substantially, to a total failure rate of 8e-3.
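
Because each rate in Table 1 comes from only 500 runs, it can help to convert the rates back to failure counts and attach a rough sampling uncertainty. The sketch below does this with a binomial standard error. That statistical treatment is an assumption added here for illustration, not an analysis from the study, and it ignores any correlation between the runs being compared.

```python
import math

N_RUNS = 500  # simulation size stated for Table 1

# Total failure rates with the named device upsized to 2x (from Table 1).
TOTAL_RATE_2X = {"X1": 50e-3, "X2": 52e-3, "X3": 56e-3, "X4": 46e-3, "All": 8e-3}
BASELINE_RATE = 56e-3  # total failure rate of the minimum-sized (1x) latch


def failure_count(rate, n=N_RUNS):
    """Number of failing runs implied by a reported rate."""
    return round(rate * n)


def standard_error(rate, n=N_RUNS):
    """Binomial standard error of an estimated failure rate."""
    return math.sqrt(rate * (1.0 - rate) / n)


def improvement_in_sigmas(rate, baseline=BASELINE_RATE, n=N_RUNS):
    """Reduction from the baseline, in units of the baseline standard error."""
    return (baseline - rate) / standard_error(baseline, n)
```

Under this rough measure, each single-device change stays within about one standard error of the 1x baseline, while upsizing every device moves the total failure rate by more than four.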


3.3 Timing Effects

Because this study was concerned with failures caused by mismatch and not by timing, a suitable simulation cycle time had to be selected so that even the slowest circuits would have changed values. However, because the energy of the circuit is dominated by leakage energy, a cycle time that is too long gives circuit designs that leak less an advantage over other designs purely because of the overly long leakage time. The delay time for the simulation was therefore set to the worst-case delay (µ + 3σ) of 50 fanout-of-4 inverters.


This delay time was calculated using a chain of 50 FO4 inverters with local and global variation. A different value was calculated for each circuit design (sizing and voltage) that was tested. This allowed for a consistent timing method for the simulations across all different sizes and voltages.
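
The µ + 3σ cycle time can be computed from sampled chain delays as in the minimal sketch below. The function names and the input data layout are hypothetical, and the use of the sample standard deviation is an assumption; the study does not say which estimator was used.

```python
import statistics


def worst_case_delay(chain_delays, n_sigma=3.0):
    """Return mean + n_sigma * standard deviation of sampled chain delays."""
    mu = statistics.mean(chain_delays)
    sigma = statistics.stdev(chain_delays)  # sample standard deviation (assumed)
    return mu + n_sigma * sigma


def cycle_times(delays_by_design):
    """Map each (sizing, voltage) design point to its own worst-case cycle time."""
    return {design: worst_case_delay(d) for design, d in delays_by_design.items()}
```

Here `delays_by_design` would hold, for each sizing and voltage pair, the Monte Carlo delays of the 50-inverter chain, so that every design point gets its own cycle time.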



3.4 Final Simulation Setup

For the final major simulation, the grid of voltage and sizing was split into seven blocks, where each block was given approximately 600,000 Monte Carlo simulations in total. It was broken up in this way so that the most detailed results could be obtained where they mattered most, in the least amount of simulation time. From previous smaller simulations, it was possible to predict where the most interesting data would probably be and where the minimum energy was likely to occur. The blocks were chosen as shown in Table 2. Each set of runs gave a failure rate and an average energy. Each Monte Carlo simulation covered a latch, an inverter, a nand2, a nor2, and a fanout-of-4 inverter chain. The simulations were run with a clock period equal to the worst-case delay of 50 fanout-of-4 inverters, as described in the previous section.


Voltage (mV) \ Sizing   1x       1.25x    1.5x     2x
400                     40001    40001    40001    85715
425                     40001    40001    40001    85715
450                     40001    40001    40001    85715
475                     40001    40001    40001    85715
500                     300000   40001    40001    85715
525                     300001   300001   120001   85715
550                     300001   300001   120001   85715
575                     100001   100001   120001   200001
600                     100001   100001   120001   200001
625                     100001   100001   120001   200001

Table 2. The number of Monte Carlo simulations run for each sizing-voltage pair. Note: the point (1x, 500 mV) was run again at a higher number of simulations to get more accurate results near the minimum energy location.


4 Results

The results of this research are primarily based on these final simulation runs. Energy minima occur at low voltages because dynamic energy decreases as voltage decreases, but below a certain point leakage energy dominates the total energy and causes it to rise again at very low voltages. Figure 2 shows the total average energy as a function of voltage for different sizings of a latch. The minimum-energy voltage moves lower as the sizing increases (from 500 mV for 1x sizing to 425 mV for 2x sizing), but the energy at the minimum increases with sizing, so 1x has the lowest energy despite its higher voltage. This makes sense, since larger transistors increase leakage energy if the clock cycle is kept constant.



Figure 2. Energy versus voltage for sizings 1x, 1.25x, 1.5x, and 2x of the latch.


Figure 3 again shows energy versus voltage, but this time for 1x sizing of the inverter, nand2, and nor2. It is apparent that for these combinational structures, just as for the latch, higher voltages such as 500 mV and 525 mV are the most energy efficient. Also, just as for the latch, the minimum energies for the combinational logic increase with transistor size.



Figure 3. Energy versus voltage for the inverter, NAND2, and NOR2; all for 1x sizing.



[Figure 2 plot: "Energy vs. Vdd for Different Latch Sizings"; x-axis Vdd (V), 0.4 to 0.625; y-axis Energy (fJ), 0.5 to 1.1; series 1x, 1.25x, 1.5x, 2x]

[Figure 3 plot: "Energy vs. Vdd for Combinational Logic for 1x Sizing"; x-axis Vdd (V), 0.4 to 0.625; y-axis Energy (fJ), 0.1 to 0.22; series Inverter, NAND2, NOR2]

Failure rate data was taken at each of these points as well. As expected, for any given device, the failure rate decreases as the voltage increases. However, there is something odd about the detected failure rates: beyond a certain point they suddenly drop to zero, and this drop is not due to a lack of sufficiently high resolution.


For example, the lowest non-zero detected failure rate is 7.27 x 10^-4 at 525 mV for the inverter at 1x sizing, but 25 mV higher there are no failures detected. The inverter simulation at 550 mV and 1x sizing had about 300,000 Monte Carlo runs, which gave it a failure rate resolution of about 3.33 x 10^-6. That means that the real failure rate could have been two orders of magnitude lower than the failure rate at 525 mV and still would probably have been detected as non-zero. The same observation was made for the failure rate trends of almost all devices and sizes, with no non-zero failure rate being lower than 7.27 x 10^-4 despite the high resolutions. Figure 4 shows, for 1x sizing, how sudden the failure rate drop-offs are. All simulations were run up to 625 mV, so in this case the curves for the latch, nand2, and inverter end where they suddenly drop to zero failures.
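
The resolution argument can be checked with a few lines of arithmetic. The sketch below assumes that Monte Carlo runs fail independently with a fixed probability, which is a simplification introduced here; the variable names are hypothetical and the numeric inputs are the values quoted in the text.

```python
def resolution(n_runs):
    """Smallest non-zero failure rate that n_runs simulations can report."""
    return 1.0 / n_runs


def prob_zero_failures(true_rate, n_runs):
    """Probability of observing no failures if runs fail independently."""
    return (1.0 - true_rate) ** n_runs


N_550MV = 300_000      # approximate run count for the inverter at 550 mV, 1x
RATE_525MV = 7.27e-4   # lowest non-zero failure rate, inverter at 525 mV, 1x

# A hypothetical true rate two orders of magnitude below the 525 mV value.
hypothetical_rate = RATE_525MV / 100
expected_failures = hypothetical_rate * N_550MV
p_miss = prob_zero_failures(hypothetical_rate, N_550MV)
```

Under this assumption, such a rate would produce about two expected failures in 300,000 runs and would be missed entirely only a little more than 10% of the time, which is consistent with the statement that it would probably have been detected.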



Figure 4. Failure rate versus voltage on a logarithmic scale for 1x sizing.


Further investigation of the sudden drop-offs shows that the failure rate plateaus at a constant value just before dropping to zero. Simulations were run at successively closer voltages to the drop-off points to confirm this. For the inverter, for example, the failure rate was exactly the same at 525 mV and 530 mV, but at 540 mV there were suddenly zero failures detected. The curve does not seem to be continuous. It was also confirmed that the simulations were, in fact, still random and unique each time they were run. Changing the simulator options to make the output more accurate did not solve the problem. The reason for these strange results has not yet been determined, but it is possible that they result from some kind of intersection between two non-continuous models used by the simulation software in the voltage range of interest.



[Figure 4 plot: "Failure Rate vs. Voltage for 1x sizing"; x-axis Voltage (V), 0.4 to 0.6; y-axis Failure Rate (log scale), 1.00E-04 to 1.00E+00; series Inverter, NOR2, Latch, NAND2]

5 Conclusions

Our results suggest that the optimal minimum-energy voltage is closer to 500 mV than to the 300 mV that other research has suggested. The accuracy of these results cannot be guaranteed, however, since our simulations were discovered to be flawed. It is possible that our results relating to failure rates are incorrect and that the real trends differ from those we observed. Future research may determine the source of the simulation issues and show that the results are correct, which would then imply that the intrinsic failures caused by RDF and mismatched consecutive devices push the optimal minimum-energy voltage up to a higher value than was previously expected.




References


[1] B. Zhai, S. Hanson, D. Blaauw, D. Sylvester, “Analysis and Mitigation of Variability in Subthreshold Design”, ISLPED 2005

[2] G. Chen, D. Blaauw, T. Mudge, D. Sylvester, N. Kim, “Yield-Driven Near-Threshold SRAM Design”, ICCAD 2007

[3] J. Kwong, A. Chandrakasan, “Variation-Driven Device Sizing for Minimum Energy Sub-threshold Circuits”, ISLPED 2006

[4] S. Gupta, A. Raychowdhury, K. Roy, “Digital Computation in Subthreshold Region for Ultralow-Power Operation: A Device-Circuit-Architecture Codesign Perspective”, IEEE 2010