Lecture 4

mittenturkey, Electronics - Devices

26 Nov 2013

Lecture 4: Delay Optimization and Logical Effort

CMOS VLSI Design, 4th Ed.

5: DC and Transient Response

Outline


- RC Delay Models
- Delay Estimation
- Logical Effort
- Delay in a Logic Gate
- Multistage Logic Networks
- Choosing the Best Number of Stages
- Examples
- Summary




Delay Definitions


- $t_{pdr}$: rising propagation delay
  - Max time from input to rising output crossing $V_{DD}/2$
- $t_{pdf}$: falling propagation delay
  - Max time from input to falling output crossing $V_{DD}/2$
- $t_{pd}$: average propagation delay
  - $t_{pd} = (t_{pdr} + t_{pdf})/2$
- $t_r$: rise time
  - From output crossing $0.2\,V_{DD}$ to $0.8\,V_{DD}$
- $t_f$: fall time
  - From output crossing $0.8\,V_{DD}$ to $0.2\,V_{DD}$


Delay Definitions


- $t_{cdr}$: rising contamination delay
  - Min time from input to rising output crossing $V_{DD}/2$
- $t_{cdf}$: falling contamination delay
  - Min time from input to falling output crossing $V_{DD}/2$
- $t_{cd}$: average contamination delay
  - $t_{cd} = (t_{cdr} + t_{cdf})/2$


Simulated Inverter Delay


- Solving differential equations by hand is too hard
- SPICE simulator solves the equations numerically
  - Uses more accurate I-V models too!
- But simulations take time to write, may hide insight

[Figure: simulated inverter waveforms $V_{in}$ and $V_{out}$ (V, 0.0 to 2.0) versus t (s, 0.0 to 1n); $t_{pdf}$ = 66 ps, $t_{pdr}$ = 83 ps]

Delay Estimation


- We would like to be able to easily estimate delay
  - Not as accurate as simulation
  - But easier to ask "What if?"
- The step response usually looks like a 1st-order RC response with a decaying exponential.
- Use RC delay models to estimate delay
  - C = total capacitance on output node
  - Use effective resistance R
  - So that $t_{pd} = RC$
- Characterize transistors by finding their effective R
  - Depends on average current as gate switches


Effective Resistance


- Shockley models have limited value
  - Not accurate enough for modern transistors
  - Too complicated for much hand analysis
- Simplification: treat transistor as resistor
  - Replace $I_{ds}(V_{ds}, V_{gs})$ with effective resistance R
    - $I_{ds} = V_{ds}/R$
  - R averaged across switching of digital gate
- Too inaccurate to predict current at any given time
  - But good enough to predict RC delay


RC Delay Model


- Use equivalent circuits for MOS transistors
  - Ideal switch + capacitance and ON resistance
  - Unit nMOS has resistance R, capacitance C
  - Unit pMOS has resistance 2R, capacitance C
- Capacitance proportional to width
- Resistance inversely proportional to width

[Figure: equivalent RC circuits for a width-k nMOS (resistance R/k) and a width-k pMOS (resistance 2R/k), each with capacitance kC on the gate, source, and drain terminals; labels: input capacitance, output capacitance]

RC Values


- Capacitance
  - $C = C_g = C_s = C_d = 2$ fF/µm of gate width in 0.6 µm
  - Gradually decline to 1 fF/µm in nanometer techs.
- Resistance
  - $R \approx 6$ kΩ·µm in 0.6 µm process
  - Improves with shorter channel lengths
- Unit transistors
  - May refer to minimum contacted device (4/2 λ)
  - Or maybe 1 µm wide device
  - Doesn't matter as long as you are consistent


Inverter Delay Estimate


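- Estimate the delay of a fanout-of-1 inverter

[Figure: inverter (pMOS width 2, nMOS width 1) with input A and output Y driving an identical inverter, and its equivalent RC circuit: resistance R, diffusion capacitances 2C and C on Y, and load gate capacitances 2C and C]

$d = 6RC$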
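The 6RC estimate can be reproduced by tallying the capacitance on the output node. The following is a minimal sketch, assuming the unit inverter above (nMOS width 1, pMOS width 2) and normalizing R and C to 1; the function name `inverter_delay_rc` is hypothetical and not part of the lecture.

```python
def inverter_delay_rc(fanout):
    """Estimate the delay of a unit inverter in units of RC.

    Assumes the slide's unit inverter: nMOS width 1 (resistance R) and
    pMOS width 2 (resistance 2R/2 = R), so either transition sees R.
    """
    diffusion_cap = 1 + 2            # nMOS drain (C) + pMOS drain (2C) on the output
    gate_cap_per_load = 1 + 2        # each driven inverter presents C + 2C of gate cap
    total_cap = diffusion_cap + fanout * gate_cap_per_load
    resistance = 1                   # effective R of the unit inverter
    return resistance * total_cap


print(inverter_delay_rc(1))  # fanout-of-1 case from the slide
```

With `fanout=1` the total is 3C of diffusion plus 3C of load, which gives the slide's 6RC.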


Delay Model Comparison


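Example: 3-input NAND

- Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).

[Figure: 3-input NAND with three parallel pMOS transistors of width 2 and three series nMOS transistors of width 3]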


3-input NAND Caps

- Annotate the 3-input NAND gate with gate and diffusion capacitance.

[Figure: 3-input NAND (pMOS width 2, nMOS width 3) annotated with capacitances: 5C of gate capacitance on each input (2C + 3C), 9C on the output node, and 3C on each internal node of the nMOS stack]


Example


- Sketch a 2-input NOR gate with selected transistor widths so that effective rise and fall resistances are equal to a unit inverter's. Annotate the gate and diffusion capacitances.


Elmore Delay


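- ON transistors look like resistors
- Pullup or pulldown network modeled as RC ladder
- Elmore delay of RC ladder

[Figure: RC ladder with series resistors $R_1, R_2, R_3, \ldots, R_N$ and a capacitor $C_1, C_2, C_3, \ldots, C_N$ to ground at each node]

$$t_{pd} \approx \sum_{\text{nodes } i} R_{i\text{-to-source}}\, C_i = R_1 C_1 + (R_1 + R_2) C_2 + \cdots + (R_1 + R_2 + \cdots + R_N) C_N$$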
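The Elmore sum translates directly into a running total of resistance back to the source. The following is a minimal sketch in Python; the function name `elmore_delay` and the list-based interface are illustrative assumptions, not something defined in the lecture.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the series resistor feeding node i (R_1 first, nearest
    the source); capacitances[i] is the capacitor from node i to ground.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitor per ladder node")
    delay = 0
    r_to_source = 0
    for r, c in zip(resistances, capacitances):
        r_to_source += r             # total resistance between node i and the source
        delay += r_to_source * c     # node i contributes R_(i-to-source) * C_i
    return delay


# Two-node ladder: R1*C1 + (R1 + R2)*C2 = 1*1 + (1 + 2)*3 = 10
print(elmore_delay([1, 2], [1, 3]))
```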
 

       


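Example: 3-input NAND

- Estimate worst-case rising and falling delay of 3-input NAND driving h identical gates.

[Figure: 3-input NAND (pMOS width 2, nMOS width 3) with 9C on output Y, 3C on each internal node n1 and n2, and a load of 5hC from h copies]

$$t_{pdr} = (9 + 5h)RC$$

$$t_{pdf} = (3C)\left(\frac{R}{3}\right) + (3C)\left(\frac{R}{3} + \frac{R}{3}\right) + \left[(9 + 5h)C\right]\left(\frac{R}{3} + \frac{R}{3} + \frac{R}{3}\right) = (12 + 5h)RC$$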
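The two NAND3 delays can be checked by applying the Elmore sum to the pull-up and pull-down paths. This is a sketch under the slide's values (9C on Y, 3C on each internal node, 5C per driven input), with R and C normalized to 1; `elmore_delay` and `nand3_delays` are hypothetical helper names. Summing the pull-down terms gives RC + 2RC + (9 + 5h)RC = (12 + 5h)RC.

```python
from fractions import Fraction


def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder, source-side elements first."""
    delay = 0
    r_to_source = 0
    for r, c in zip(resistances, capacitances):
        r_to_source += r
        delay += r_to_source * c
    return delay


def nand3_delays(h):
    """Return (t_pdr, t_pdf) of the 3-input NAND in units of RC."""
    load = 9 + 5 * h                     # 9C of diffusion on Y plus h gates of 5C
    # Rising, worst case: a single pMOS of width 2 (2R/2 = R) charges Y.
    t_pdr = elmore_delay([Fraction(1)], [load])
    # Falling: three series nMOS of width 3 (R/3 each) discharge n1, n2, then Y.
    t_pdf = elmore_delay([Fraction(1, 3)] * 3, [3, 3, load])
    return t_pdr, t_pdf


for h in (0, 1, 4):
    print(h, nand3_delays(h))
```

Exact fractions are used so that the R/3 terms add up without rounding noise.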
      
 
 
 

Delay Components


- Delay has two parts
  - Parasitic delay
    - 9 or 12 RC
    - Independent of load
  - Effort delay
    - 5h RC
    - Proportional to load capacitance


Contamination Delay


- Best-case (contamination) delay can be substantially less than propagation delay.
- Ex: If all three inputs fall simultaneously

$$t_{cdr} = \left[(9 + 5h)C\right]\left(\frac{R}{3}\right) = \left(3 + \frac{5}{3}h\right)RC$$

[Figure: 3-input NAND (pMOS width 2, nMOS width 3) with 9C on output Y, 3C on each internal node n1 and n2, and a load of 5hC]

Delay in a Logic Gate


- Express delays in process-independent unit

$$d = \frac{d_{abs}}{\tau}, \qquad \tau = 3RC$$

  - $\tau \approx 3$ ps in 65 nm process
  - $\tau \approx 60$ ps in 0.6 µm process
- Delay has two components: $d = f + p$
- f: effort delay = gh (a.k.a. stage effort)
  - Again has two components
- g: logical effort
  - Measures relative ability of gate to deliver current
  - $g \equiv 1$ for inverter
- h: electrical effort = $C_{out} / C_{in}$
  - Ratio of output to input capacitance
  - Sometimes called fanout
- p: parasitic delay
  - Represents delay of gate driving no load
  - Set by internal parasitic capacitance
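The relation d = gh + p and the conversion back to absolute time can be sketched as follows. The function names are illustrative assumptions; the values of τ are the approximate figures quoted above.

```python
def stage_delay(g, h, p):
    """Normalized stage delay d = f + p, with effort delay f = g * h."""
    return g * h + p


def absolute_delay_ps(d, tau_ps):
    """Convert a normalized delay to picoseconds: d_abs = d * tau."""
    return d * tau_ps


# Inverter (g = 1, p = 1) driving four times its own input capacitance (h = 4).
d = stage_delay(g=1, h=4, p=1)
print(d)                          # normalized delay
print(absolute_delay_ps(d, 3))    # with tau of about 3 ps (65 nm)
print(absolute_delay_ps(d, 60))   # with tau of about 60 ps (0.6 um)
```

Keeping d normalized until the last step is what makes the result process independent: only `tau_ps` changes between technologies.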


Delay Plots

- $d = f + p = gh + p$
- What about NOR2?

[Figure: normalized delay d (0 to 6) versus electrical effort $h = C_{out}/C_{in}$ (0 to 5), annotated with effort delay f and parasitic delay p. Inverter: g = 1, p = 1, d = h + 1. 2-input NAND: g = 4/3, p = 2, d = (4/3)h + 2.]




Computing Logical Effort


- DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current.
- Measure from delay vs. fanout plots
- Or estimate by counting transistor widths

[Figure: transistor widths for three gates. Inverter (pMOS 2, nMOS 1): $C_{in}$ = 3, g = 3/3. 2-input NAND (pMOS 2, nMOS 2): $C_{in}$ = 4, g = 4/3. 2-input NOR (pMOS 4, nMOS 1): $C_{in}$ = 5, g = 5/3.]

Catalog of Gates

Logical effort of common gates

Gate type        Number of inputs
                 1    2      3          4               n
Inverter         1
NAND                  4/3    5/3        6/3             (n+2)/3
NOR                   5/3    7/3        9/3             (2n+1)/3
Tristate / mux   2    2      2          2               2
XOR, XNOR             4, 4   6, 12, 6   8, 16, 16, 8


Catalog of Gates

Parasitic delay of common gates, in multiples of $p_{inv}$ ($\approx 1$)

Gate type        Number of inputs
                 1    2    3    4    n
Inverter         1
NAND                  2    3    4    n
NOR                   2    3    4    n
Tristate / mux   2    4    6    8    2n
XOR, XNOR             4    6    8
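The closed-form entries in the two catalog tables can be captured in a small lookup. This is a sketch only: the gate names (`"inv"`, `"nand"`, `"nor"`, `"mux"`) and function names are hypothetical, and the XOR/XNOR rows are left out because the tables give no general formula for them.

```python
from fractions import Fraction


def logical_effort(gate, n=1):
    """Logical effort g per input for an n-input gate, per the catalog table."""
    if gate == "inv":
        return Fraction(1)
    if gate == "nand":
        return Fraction(n + 2, 3)
    if gate == "nor":
        return Fraction(2 * n + 1, 3)
    if gate == "mux":                # tristate / mux: 2 regardless of n
        return Fraction(2)
    raise ValueError(f"no closed form tabulated for {gate!r}")


def parasitic_delay(gate, n=1):
    """Parasitic delay p in multiples of p_inv (taken as 1)."""
    if gate == "inv":
        return 1
    if gate in ("nand", "nor"):
        return n
    if gate == "mux":
        return 2 * n
    raise ValueError(f"no closed form tabulated for {gate!r}")


print(logical_effort("nand", 2), logical_effort("nor", 2))
print(parasitic_delay("nand", 3), parasitic_delay("mux", 4))
```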


Example: Ring Oscillator


- Estimate the frequency of an N-stage ring oscillator

  Logical Effort: g = 1
  Electrical Effort: h = 1
  Parasitic Delay: p = 1
  Stage Delay: d = 2
  Frequency: $f_{osc} = 1/(2Nd) = 1/(4N)$

31 stage ring oscillator in 0.6 µm process has frequency of ~ 200 MHz


Example: FO4 Inverter


- Estimate the delay of a fanout-of-4 (FO4) inverter

  Logical Effort: g = 1
  Electrical Effort: h = 4
  Parasitic Delay: p = 1
  Stage Delay: d = 5

- The FO4 delay is about
  - 300 ps in 0.6 µm process
  - 15 ps in a 65 nm process




Multistage Logic Networks


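- Logical effort generalizes to multistage networks
- Path Logical Effort: $G = \prod g_i$
- Path Electrical Effort: $H = \dfrac{C_{out\text{-}path}}{C_{in\text{-}path}}$
- Path Effort: $F = \prod f_i = \prod g_i h_i$

[Figure: four-stage path with input capacitances 10, x, y, z and a load of 20. Stage 1: $g_1 = 1$, $h_1 = x/10$. Stage 2: $g_2 = 5/3$, $h_2 = y/x$. Stage 3: $g_3 = 4/3$, $h_3 = z/y$. Stage 4: $g_4 = 1$, $h_4 = 20/z$.]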

Multistage Logic Networks


- Logical effort generalizes to multistage networks
- Path Logical Effort: $G = \prod g_i$
- Path Electrical Effort: $H = \dfrac{C_{out\text{-}path}}{C_{in\text{-}path}}$
- Path Effort: $F = \prod f_i = \prod g_i h_i$
- Can we write F = GH?
 
 

Paths that Branch


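- No! Consider paths that branch:

  G = 1
  H = 90 / 5 = 18
  GH = 18
  $h_1$ = (15 + 15) / 5 = 6
  $h_2$ = 90 / 15 = 6
  F = $g_1 g_2 h_1 h_2$ = 36 = 2GH

[Figure: inverter with input capacitance 5 driving two inverters of input capacitance 15, each of which drives a load of 90]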

Branching Effort


- Introduce branching effort
  - Accounts for branching between stages in path

$$b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}, \qquad B = \prod b_i$$

  - Note: $\prod h_i = BH$
- Now we compute the path effort
  - F = GBH
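The branching example (an inverter with input capacitance 5 driving two inverters of 15, each driving 90) can be worked through numerically. This is a minimal sketch; `branching_effort` is a hypothetical helper name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import prod


def branching_effort(c_on_path, c_off_path):
    """b = (C_on-path + C_off-path) / C_on-path for one stage output."""
    return Fraction(c_on_path + c_off_path, c_on_path)


G = 1                                   # two inverters: g1 * g2 = 1
H = Fraction(90, 5)                     # path electrical effort
b = [branching_effort(15, 15),          # first stage drives two inverters of 15
     branching_effort(90, 0)]           # second stage drives only the on-path load
B = prod(b)
F = G * B * H

h = [Fraction(15 + 15, 5), Fraction(90, 15)]   # per-stage electrical efforts
print(B, F, prod(h) == B * H)
```

The last value printed confirms the note that the product of the stage electrical efforts equals BH.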


Multistage Delays


- Path Effort Delay: $D_F = \sum f_i$
- Path Parasitic Delay: $P = \sum p_i$
- Path Delay: $D = \sum d_i = D_F + P$
  


Designing Fast Circuits




$$D = \sum d_i = D_F + P$$

- Delay is smallest when each stage bears same effort

$$\hat{f} = g_i h_i = F^{1/N}$$

- Thus minimum delay of N stage path is

$$D = N F^{1/N} + P$$

- This is a key result of logical effort
  - Find fastest possible delay
  - Doesn't require calculating gate sizes
 

Gate Sizes


- How wide should the gates be for least delay?

$$\hat{f} = gh = g\,\frac{C_{out}}{C_{in}} \quad\Rightarrow\quad C_{in_i} = \frac{g_i\, C_{out_i}}{\hat{f}}$$

- Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
- Check work by verifying input cap spec is met.
 
 

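Example: 3-stage path

- Select gate sizes x and y for least delay from A to B

[Figure: three-stage path from A to B. The first gate has input capacitance 8 and drives three gates of size x; the on-path gate of size x drives two gates of size y; the gates of size y drive loads of 45.]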

Example: 3-stage path

  Logical Effort: G = (4/3)*(5/3)*(5/3) = 100/27
  Electrical Effort: H = 45/8
  Branching Effort: B = 3 * 2 = 6
  Path Effort: F = GBH = 125
  Best Stage Effort: $\hat{f} = \sqrt[3]{F} = 5$
  Parasitic Delay: P = 2 + 3 + 2 = 7
  Delay: D = 3*5 + 7 = 22 = 4.4 FO4
 

Example: 3-stage path

- Work backward for sizes
  - y = 45 * (5/3) / 5 = 15
  - x = (15*2) * (5/3) / 5 = 10

[Figure: sized path from A to B with transistor widths. First stage: P: 4, N: 4. Second stage: P: 4, N: 6. Third stage: P: 12, N: 3. Loads of 45.]
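The whole 3-stage calculation, including the final check that the first gate's input capacitance comes out at the specified 8, can be sketched in a few lines. The function `size_path` and its argument layout are assumptions made for illustration; `b[i]` is taken to be the branching effort at the output of stage i.

```python
from math import prod, isclose


def size_path(g, b, p, c_in, c_load):
    """Return (F, f_hat, D, input_caps) for a path sized for least delay."""
    n = len(g)
    F = prod(g) * prod(b) * (c_load / c_in)    # path effort F = G * B * H
    f_hat = F ** (1.0 / n)                     # best stage effort
    D = n * f_hat + sum(p)                     # minimum path delay
    caps = []
    c_on_path = c_load
    for g_i, b_i in zip(reversed(g), reversed(b)):
        # Capacitance transformation: the stage drives b_i copies of c_on_path.
        c_on_path = g_i * (b_i * c_on_path) / f_hat
        caps.append(c_on_path)
    caps.reverse()
    return F, f_hat, D, caps


# Slide values: g = 4/3, 5/3, 5/3; branching 3 then 2; p = 2, 3, 2.
F, f_hat, D, caps = size_path(
    g=[4 / 3, 5 / 3, 5 / 3], b=[3, 2, 1], p=[2, 3, 2], c_in=8, c_load=45
)
print(F, f_hat, D, caps)
```

`caps` lists the input capacitance of each stage from A to B, so its first entry should reproduce the input spec and the others should match x and y.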

Best Number of Stages


- How many stages should a path use?
  - Minimizing number of stages is not always fastest
- Example: drive 64-bit datapath with unit inverter

$$D = N F^{1/N} + P = N (64)^{1/N} + N$$

[Figure: initial driver of size 1 driving a datapath load of 64 through N inverter stages]

N:             1     2      3              4
Stage sizes:   1     1, 8   1, 4, 16       1, 2.8, 8, 23
f:             64    8      4              2.8
D:             65    18     15 (fastest)   15.3
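The table for the 64-bit datapath example can be regenerated by evaluating D for each N. This is a minimal sketch, assuming a chain of inverters (g = 1, p = 1 per stage) so that F = 64; `inverter_chain_delay` is a hypothetical name.

```python
def inverter_chain_delay(F, N, p_inv=1.0):
    """Delay D = N * F**(1/N) + N * p_inv of an N-stage inverter chain."""
    return N * F ** (1.0 / N) + N * p_inv


F = 64
for N in range(1, 5):
    stage_effort = F ** (1.0 / N)
    print(N, round(stage_effort, 1), round(inverter_chain_delay(F, N), 1))

# Search a few stage counts for the one with the least delay.
best_N = min(range(1, 9), key=lambda N: inverter_chain_delay(F, N))
print("fastest N:", best_N)
```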

Derivation


- Consider adding inverters to end of path
  - How many give least delay?

[Figure: logic block of $n_1$ stages with path effort F, followed by $N - n_1$ extra inverters]

$$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$$

$$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$$

- Define best stage effort $\rho = F^{1/N}$

$$p_{inv} + \rho\,(1 - \ln \rho) = 0$$



Best Stage Effort



- $p_{inv} + \rho\,(1 - \ln \rho) = 0$ has no closed-form solution
- Neglecting parasitics ($p_{inv} = 0$), we find $\rho = 2.718$ (e)
- For $p_{inv} = 1$, solve numerically for $\rho = 3.59$
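The numerical solution for the best stage effort can be sketched with a simple bisection. The function name `best_stage_effort` and the bracket [1, 20] are assumptions chosen for this illustration; the residual is positive at ρ = 1 and decreases for ρ > 1, so the bracket contains exactly one root for the small values of p_inv considered here.

```python
import math


def best_stage_effort(p_inv, lo=1.0, hi=20.0, tol=1e-9):
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection."""

    def residual(rho):
        return p_inv + rho * (1.0 - math.log(rho))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid          # root lies above mid
        else:
            hi = mid          # root lies at or below mid
    return 0.5 * (lo + hi)


print(best_stage_effort(0.0))   # close to e when parasitics are neglected
print(best_stage_effort(1.0))   # close to 3.59 for p_inv = 1
```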
 
  

Review of Definitions

Term                Stage                                                                  Path
number of stages    1                                                                      N
logical effort      g                                                                      $G = \prod g_i$
electrical effort   $h = C_{out}/C_{in}$                                                   $H = C_{out\text{-}path}/C_{in\text{-}path}$
branching effort    $b = (C_{on\text{-}path} + C_{off\text{-}path})/C_{on\text{-}path}$    $B = \prod b_i$
effort              $f = gh$                                                               $F = GBH$
effort delay        f                                                                      $D_F = \sum f_i$
parasitic delay     p                                                                      $P = \sum p_i$
delay               $d = f + p$                                                            $D = \sum d_i = D_F + P$

Method of Logical Effort

1) Compute path effort: $F = GBH$
2) Estimate best number of stages: $N = \log_4 F$
3) Sketch path with N stages
4) Estimate least delay: $D = N F^{1/N} + P$
5) Determine best stage effort: $\hat{f} = F^{1/N}$
6) Find gate sizes: $C_{in_i} = \dfrac{g_i\, C_{out_i}}{\hat{f}}$


Limits of Logical Effort


- Chicken and egg problem
  - Need path to compute G
  - But don't know number of stages without G
- Simplistic delay model
  - Neglects input rise time effects
- Interconnect
  - Iteration required in designs with wire
- Maximum speed only
  - Not minimum area/power for constrained delay


Example 1

Calculate the a) logical effort, b) parasitic delay, c) effort, and d) delay in the following 6-input AND implementations as a function of the path electrical effort H. Which implementation is the fastest if a) H=1, b) H=5 and c) H=20?

[Figure: four 6-input AND implementations, labeled (a), (b), (c), (d)]

Example 2


For the path between x and F in the following circuit calculate:

  a. Total delay
  b. Path effort and parasitic delay
  c. Minimum theoretical delay

[Figure: circuit from x to F; labels 6, 8, 4, 20, C]

Summary


- Logical effort is useful for thinking of delay in circuits
  - Numeric logical effort characterizes gates
  - NANDs are faster than NORs in CMOS
  - Paths are fastest when effort delays are ~4
  - Path delay is weakly sensitive to stages, sizes
  - But using fewer stages doesn't mean faster paths
  - Delay of path is about $\log_4 F$ FO4 inverter delays
  - Inverters and NAND2 best for driving large caps
- Provides language for discussing fast circuits
  - But requires practice to master