Lecture 4

Electronics - Devices

26 Nov 2013

Lecture 4: Delay Optimization and Logical Effort

CMOS VLSI Design, 4th Ed.

5: DC and Transient Response

Outline

RC Delay Models

Delay Estimation

Logical Effort

Delay in a Logic Gate

Multistage Logic Networks

Choosing the Best Number of Stages

Examples

Summary

Delay Definitions

$t_{pdr}$: rising propagation delay

Max time from input to rising output crossing $V_{DD}/2$

$t_{pdf}$: falling propagation delay

Max time from input to falling output crossing $V_{DD}/2$

$t_{pd}$: average propagation delay

$t_{pd} = (t_{pdr} + t_{pdf})/2$

$t_r$: rise time

From output crossing $0.2\,V_{DD}$ to $0.8\,V_{DD}$

$t_f$: fall time

From output crossing $0.8\,V_{DD}$ to $0.2\,V_{DD}$

Delay Definitions

$t_{cdr}$: rising contamination delay

Min time from input to rising output crossing $V_{DD}/2$

$t_{cdf}$: falling contamination delay

Min time from input to falling output crossing $V_{DD}/2$

$t_{cd}$: average contamination delay

$t_{cd} = (t_{cdr} + t_{cdf})/2$

Simulated Inverter Delay

Solving differential equations by hand is too hard

SPICE simulator solves the equations numerically

Uses more accurate I-V models too!

But simulations take time to write, may hide insight

[Figure: simulated inverter transient response, $V_{in}$ and $V_{out}$ (V, 0.0 to 2.0) versus t (s, 0 to 1n); $t_{pdf}$ = 66 ps, $t_{pdr}$ = 83 ps]

Delay Estimation

We would like to be able to easily estimate delay

Not as accurate as simulation

But easier to ask “What if?”

The step response usually looks like a 1st-order RC response with a decaying exponential.

Use RC delay models to estimate delay

C = total capacitance on output node

Use effective resistance R

So that $t_{pd} = RC$

Characterize transistors by finding their effective R

Depends on average current as gate switches

Effective Resistance

Shockley models have limited value

Not accurate enough for modern transistors

Too complicated for much hand analysis

Simplification: treat transistor as resistor

Replace $I_{ds}(V_{ds}, V_{gs})$ with effective resistance R

$I_{ds} = V_{ds}/R$

R averaged across switching of digital gate

Too inaccurate to predict current at any given time

But good enough to predict RC delay

RC Delay Model

Use equivalent circuits for MOS transistors

Ideal switch + capacitance and ON resistance

Unit nMOS has resistance R, capacitance C

Unit pMOS has resistance 2R, capacitance C

Capacitance proportional to width

Resistance inversely proportional to width

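[Figure: equivalent circuits for an nMOS and a pMOS transistor of width k. Each terminal (g, s, d) carries capacitance kC; the nMOS has ON resistance R/k and the pMOS has ON resistance 2R/k. The gate capacitance is labeled as input capacitance and the diffusion capacitance as output capacitance.]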

RC Values

Capacitance

$C = C_g = C_s = C_d = 2$ fF/μm of gate width in 0.6 μm

Roughly 1 fF/μm in nanometer techs.

Resistance

$R \approx 6$ kΩ·μm in 0.6 μm process

Improves with shorter channel lengths

Unit transistors

May refer to minimum contacted device (4/2 λ)

Or maybe 1 μm wide device

Doesn’t matter as long as you are consistent

Inverter Delay Estimate

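Estimate the delay of a fanout-of-1 inverter

[Figure: unit inverter (pMOS width 2, nMOS width 1) with input A and output Y driving an identical inverter, and its equivalent RC circuit with resistance R and capacitances C and 2C on node Y]

d = 6RC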
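
As a rough check of the 6RC figure, the capacitance on node Y can be tallied in code. This is a minimal sketch that assumes the unit model from the RC Delay Model slide (a width-k transistor contributes kC of gate capacitance and kC of diffusion capacitance, and the switching path has resistance R). The function name `inverter_delay_rc` is illustrative, not from the lecture.

```python
def inverter_delay_rc(fanout, nmos_width=1, pmos_width=2):
    """Return the inverter delay as a multiple of RC.

    Assumption: the driver's effective resistance is R (unit nMOS, or a
    width-2 pMOS with 2R/2 = R), so delay = R * (total capacitance on Y).
    """
    width = nmos_width + pmos_width
    diffusion_cap = width          # driver's own drain capacitance, in units of C
    load_cap = fanout * width      # gate capacitance of the driven copies
    return diffusion_cap + load_cap

print(inverter_delay_rc(1))  # 6, i.e. d = 6RC for a fanout-of-1 inverter
```

With the same assumptions, a fanout of 4 gives 15RC.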

Delay Model Comparison

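Example: 3-input NAND

Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).

[Figure: 3-input NAND schematic. The three series nMOS transistors have width 3 and the three parallel pMOS transistors have width 2.]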
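
The widths in this example follow from the unit model (unit nMOS resistance R, unit pMOS resistance 2R, resistance inversely proportional to width). The sketch below is one way to express that reasoning; the function names are hypothetical, and it assumes the worst case of a single parallel transistor conducting.

```python
def nand_widths(n_inputs):
    """Widths giving worst-case rise and fall resistance R for an n-input NAND."""
    # Pull-down: n nMOS in series, each R/k, must total R -> k = n.
    # Pull-up: worst case is one pMOS on, 2R/k = R -> k = 2.
    return {"nmos": n_inputs, "pmos": 2}

def nor_widths(n_inputs):
    """Widths giving worst-case rise and fall resistance R for an n-input NOR."""
    # Pull-down: worst case is one nMOS on, R/k = R -> k = 1.
    # Pull-up: n pMOS in series, each 2R/k, must total R -> k = 2n.
    return {"nmos": 1, "pmos": 2 * n_inputs}

print(nand_widths(3))  # {'nmos': 3, 'pmos': 2}
print(nor_widths(2))   # {'nmos': 1, 'pmos': 4}
```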

3-input NAND Caps

Annotate the 3-input NAND gate with gate and diffusion capacitance.

[Figure: 3-input NAND annotated with capacitances: 5C on each input, 9C on the output node, and 3C on each internal node of the nMOS stack]

Example

Sketch a 2-input NOR gate with selected transistor widths so that effective rise and fall resistances are equal to a unit inverter’s. Annotate the gate and diffusion capacitances.

Elmore Delay

ON transistors look like resistors

Pullup or pulldown network modeled as an RC ladder

[Figure: RC ladder with series resistors $R_1, R_2, R_3, \dots, R_N$ and node capacitors $C_1, C_2, C_3, \dots, C_N$]

$$t_{pd} \approx \sum_{\text{nodes } i} R_{i\text{-to-source}}\, C_i = R_1 C_1 + (R_1 + R_2) C_2 + \dots + (R_1 + R_2 + \dots + R_N) C_N$$
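
The Elmore sum for an RC ladder is easy to evaluate with a running total of the resistance back to the source. The following is a minimal sketch; the function name `elmore_delay` and the list-based interface are assumptions made for illustration.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the resistor feeding node i from the previous node
    (or from the source for i = 0); capacitances[i] hangs on node i.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitor per resistor")
    delay = 0.0
    r_to_source = 0.0
    for r, c in zip(resistances, capacitances):
        r_to_source += r            # total resistance from the source to this node
        delay += r_to_source * c    # this node's contribution R_(i-to-source) * C_i
    return delay

# Two-node ladder: R1*C1 + (R1 + R2)*C2 = 1*3 + (1 + 2)*4
print(elmore_delay([1.0, 2.0], [3.0, 4.0]))  # 15.0
```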

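Example: 3-input NAND

Estimate worst-case rising and falling delay of 3-input NAND driving h identical gates.

[Figure: 3-input NAND (pMOS width 2, nMOS width 3) driving h copies. The output node Y carries 9C of diffusion capacitance plus 5hC of load; the internal nodes n1 and n2 of the nMOS stack carry 3C each.]

$$t_{pdr} = (9 + 5h)RC$$

$$t_{pdf} = (3C)\left(\frac{R}{3}\right) + (3C)\left(\frac{R}{3} + \frac{R}{3}\right) + \left[(9 + 5h)C\right]\left(\frac{R}{3} + \frac{R}{3} + \frac{R}{3}\right) = (12 + 5h)RC$$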

Delay Components

Delay has two parts

Parasitic delay: 9RC (rising) or 12RC (falling), independent of the load

Effort delay: 5h RC, proportional to the load

Contamination Delay

Best-case (contamination) delay can be substantially less than propagation delay.

Ex: If all three inputs fall simultaneously

$$t_{cdr} = \left[(9 + 5h)C\right]\left(\frac{R}{3}\right) = \left(3 + \frac{5}{3}h\right)RC$$

[Figure: the same 3-input NAND driving a load of 5hC, with 9C on output node Y and 3C on internal nodes n1 and n2]
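
The propagation and contamination delays of the 3-input NAND can be evaluated term by term for any fanout h. This sketch assumes the capacitances annotated in the figure (9C + 5hC on Y, 3C on each internal node) and widths of 3 for nMOS and 2 for pMOS; the function name `nand3_delays` is illustrative. Results are in units of RC.

```python
from fractions import Fraction

R_N = Fraction(1, 3)   # one width-3 nMOS: R/3
R_P = Fraction(1)      # one width-2 pMOS: 2R/2 = R

def nand3_delays(h):
    """Return (t_pdr, t_pdf, t_cdr) in units of RC for fanout h."""
    c_out = 9 + 5 * h                       # capacitance on Y, in units of C
    t_pdr = c_out * R_P                     # worst case: a single pMOS pulls up
    # Elmore sum through the nMOS stack: two internal nodes, then the output
    t_pdf = 3 * R_N + 3 * (2 * R_N) + c_out * (3 * R_N)
    t_cdr = c_out * (R_P / 3)               # best case: three pMOS in parallel
    return t_pdr, t_pdf, t_cdr

print(nand3_delays(0))  # parasitic parts only
print(nand3_delays(1))
```

Setting h = 0 isolates the parasitic delay, and the remaining 5h term in the propagation delays is the effort delay.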

Delay in a Logic Gate

Express delays in process-independent unit

$$d = \frac{d_{abs}}{\tau}, \qquad \tau = 3RC$$

$\tau \approx$ 3 ps in 65 nm process

$\tau \approx$ 60 ps in 0.6 μm process

Delay has two components: $d = f + p$

f: effort delay = gh (a.k.a. stage effort)

Again has two components

g: logical effort

Measures relative ability of gate to deliver current

$g \equiv 1$ for inverter

h: electrical effort = $C_{out} / C_{in}$

Ratio of output to input capacitance

Sometimes called fanout

p: parasitic delay

Represents delay of gate driving no load

Set by internal parasitic capacitance
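
The normalized delay model is a single linear expression, so it is simple to tabulate. This is a minimal sketch using the inverter (g = 1, p = 1) and 2-input NAND (g = 4/3, p = 2) values from the lecture and the quoted values of τ; the function names are illustrative.

```python
from fractions import Fraction

def stage_delay(g, h, p):
    """Normalized delay d = gh + p (in units of tau)."""
    return g * h + p

def absolute_delay_ps(d, tau_ps):
    """Convert normalized delay to picoseconds: d_abs = d * tau."""
    return d * tau_ps

inv = stage_delay(1, 4, 1)                  # inverter driving h = 4
nand2 = stage_delay(Fraction(4, 3), 3, 2)   # 2-input NAND driving h = 3

print(inv, absolute_delay_ps(inv, 3), absolute_delay_ps(inv, 60))
print(nand2)
```

With τ of about 3 ps (65 nm) or 60 ps (0.6 μm), the h = 4 inverter works out to roughly 15 ps and 300 ps respectively.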

Delay Plots

$d = f + p = gh + p$

[Figure: normalized delay d versus electrical effort $h = C_{out}/C_{in}$ (h from 0 to 5, d from 0 to 6), annotated with effort delay f and parasitic delay p. Inverter: g = 1, p = 1, d = h + 1. 2-input NAND: g = 4/3, p = 2, d = (4/3)h + 2.]

NOR2?

Computing Logical Effort

DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current.

Measure from delay vs. fanout plots

Or estimate by counting transistor widths

[Figure: inverter (pMOS 2, nMOS 1): $C_{in}$ = 3, g = 3/3. 2-input NAND (pMOS 2, nMOS 2): $C_{in}$ = 4, g = 4/3. 2-input NOR (pMOS 4, nMOS 1): $C_{in}$ = 5, g = 5/3.]
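
Counting transistor widths can be written out directly. The sketch below assumes the sizing used in the lecture (NAND: series nMOS of width n, pMOS of width 2; NOR: nMOS of width 1, series pMOS of width 2n) and a unit inverter with input capacitance 3; the function names are illustrative.

```python
from fractions import Fraction

INVERTER_CIN = 3   # unit inverter: pMOS width 2 + nMOS width 1

def logical_effort(nmos_width, pmos_width):
    """g = input capacitance of one gate input / input capacitance of the inverter."""
    return Fraction(nmos_width + pmos_width, INVERTER_CIN)

def nand_logical_effort(n):
    return logical_effort(n, 2)        # (n + 2) / 3

def nor_logical_effort(n):
    return logical_effort(1, 2 * n)    # (2n + 1) / 3

print(nand_logical_effort(2), nor_logical_effort(2))  # 4/3 5/3
```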

Catalog of Gates

Logical effort of common gates

Gate type         Number of inputs
                  1    2      3          4              n
Inverter          1
NAND                   4/3    5/3        6/3            (n+2)/3
NOR                    5/3    7/3        9/3            (2n+1)/3
Tristate / mux    2    2      2          2              2
XOR, XNOR              4, 4   6, 12, 6   8, 16, 16, 8

Catalog of Gates

Parasitic delay of common gates, in multiples of $p_{inv}$ ($\approx 1$)

Gate type         Number of inputs
                  1    2    3    4    n
Inverter          1
NAND                   2    3    4    n
NOR                    2    3    4    n
Tristate / mux    2    4    6    8    2n
XOR, XNOR              4    6    8

Example: Ring Oscillator

Estimate the frequency of an N-stage ring oscillator

Logical Effort: g = 1

Electrical Effort: h = 1

Parasitic Delay: p = 1

Stage Delay: d = 2

Frequency: $f_{osc} = 1/(2 \cdot N \cdot d) = 1/4N$

31 stage ring oscillator in 0.6 μm process has frequency of ~ 200 MHz

Example: FO4 Inverter

Estimate the delay of a fanout-of-4 (FO4) inverter

Logical Effort: g = 1

Electrical Effort: h = 4

Parasitic Delay: p = 1

Stage Delay: d = 5

$d \approx$ 300 ps in 0.6 μm process

$d \approx$ 15 ps in a 65 nm process

Multistage Logic Networks

Logical effort generalizes to multistage networks

Path Logical Effort: $G = \prod g_i$

Path Electrical Effort: $H = \dfrac{C_{out\text{-}path}}{C_{in\text{-}path}}$

Path Effort: $F = \prod f_i = \prod g_i h_i$

[Figure: four-stage path with input capacitance 10, intermediate gate sizes x, y, z and load 20. $g_1 = 1$, $h_1 = x/10$; $g_2 = 5/3$, $h_2 = y/x$; $g_3 = 4/3$, $h_3 = z/y$; $g_4 = 1$, $h_4 = 20/z$.]

Can we write $F = GH$?

Paths that Branch

No! Consider paths that branch:

$G = 1$

$H = 90 / 5 = 18$

$GH = 18$

$h_1 = (15 + 15) / 5 = 6$

$h_2 = 90 / 15 = 6$

$F = g_1 g_2 h_1 h_2 = 36 = 2GH$

[Figure: an inverter with input capacitance 5 drives two inverters with input capacitance 15 each; each of those drives a load of 90]

Branching Effort

Introduce branching effort

Accounts for branching between stages in path

$$b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$$

$$B = \prod b_i$$

Note: $\prod h_i = BH$

Now we compute the path effort

$F = GBH$
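
The branching example (input capacitance 5, two branches of 15, loads of 90) can be checked numerically. This is a sketch under the assumption that both stages are inverters (g = 1), as implied by G = 1; the helper name `branching_effort` is illustrative.

```python
from fractions import Fraction

def branching_effort(c_on_path, c_off_path):
    """b = (C_on-path + C_off-path) / C_on-path for one stage."""
    return Fraction(c_on_path + c_off_path, c_on_path)

g1 = g2 = 1
h1 = Fraction(15 + 15, 5)     # stage 1 sees both branches: 6
h2 = Fraction(90, 15)         # stage 2 drives its own load: 6

G = g1 * g2
H = Fraction(90, 5)                                     # 18
B = branching_effort(15, 15) * branching_effort(90, 0)  # 2 * 1
F = g1 * h1 * g2 * h2                                   # 36

print(F, G * B * H, h1 * h2 == B * H)  # 36 36 True
```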

Multistage Delays

Path Effort Delay: $D_F = \sum f_i$

Path Parasitic Delay: $P = \sum p_i$

Path Delay: $D = \sum d_i = D_F + P$

Designing Fast Circuits

$$D = \sum d_i = D_F + P$$

Delay is smallest when each stage bears same effort

$$\hat{f} = g_i h_i = F^{1/N}$$

Thus minimum delay of N stage path is

$$D = N F^{1/N} + P$$

This is a key result of logical effort

Find fastest possible delay

Doesn’t require calculating gate sizes

Gate Sizes

How wide should the gates be for least delay?

$$\hat{f} = gh = g\,\frac{C_{out}}{C_{in}} \quad\Rightarrow\quad C_{in_i} = \frac{g_i\, C_{out_i}}{\hat{f}}$$

Working backward, apply capacitance transformation to find input capacitance of each gate

Check work by verifying input cap spec is met.

Example: 3-stage path

Select gate sizes x and y for least delay from A to B

[Figure: path from A to B. The first gate has input capacitance 8 and drives three gates of size x; the on-path x gate drives two gates of size y; each y gate drives a load of 45.]

Example: 3-stage path

Logical Effort: $G = (4/3) \cdot (5/3) \cdot (5/3) = 100/27$

Electrical Effort: $H = 45/8$

Branching Effort: $B = 3 \cdot 2 = 6$

Path Effort: $F = GBH = 125$

Best Stage Effort: $\hat{f} = \sqrt[3]{F} = 5$

Parasitic Delay: $P = 2 + 3 + 2 = 7$

Delay: $D = 3 \cdot 5 + 7 = 22 = 4.4$ FO4

Example: 3-stage path

Work backward for sizes

$y = 45 \cdot (5/3) / 5 = 15$

$x = (15 \cdot 2) \cdot (5/3) / 5 = 10$

[Figure: sized path from A to B. First gate (input capacitance 8): P: 4, N: 4. Second gate (x = 10): P: 4, N: 6. Third gate (y = 15): P: 12, N: 3. Load: 45.]
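
The whole 3-stage example can be replayed in a few lines. This sketch takes the per-stage logical efforts, parasitic delays and branching factors from the numbers in the example (G = (4/3)(5/3)(5/3), P = 2 + 3 + 2, B = 3 · 2); the variable names are illustrative, and floating-point values are only approximately equal to the hand results.

```python
from fractions import Fraction

g = [Fraction(4, 3), Fraction(5, 3), Fraction(5, 3)]   # logical effort per stage
p = [2, 3, 2]                                          # parasitic delay per stage
branch = [3, 2]        # stage 1 drives three x gates, stage 2 drives two y gates
c_in, c_load = 8, 45

G = g[0] * g[1] * g[2]              # 100/27
B = branch[0] * branch[1]           # 6
H = Fraction(c_load, c_in)          # 45/8
F = G * B * H                       # 125
N = len(g)
f_hat = float(F) ** (1.0 / N)       # best stage effort, about 5
D = N * f_hat + sum(p)              # least delay, about 22

# Work backward from the load: C_in = g * C_out / f_hat
y = float(g[2]) * c_load / f_hat                      # about 15
x = float(g[1]) * (branch[1] * y) / f_hat             # about 10
c_in_check = float(g[0]) * (branch[0] * x) / f_hat    # should recover about 8

print(F, round(f_hat, 3), round(D, 3), round(y, 3), round(x, 3), round(c_in_check, 3))
```

The final value is the check suggested on the Gate Sizes slide: working backward should land on the specified input capacitance of 8.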

Best Number of Stages

How many stages should a path use?

Minimizing number of stages is not always fastest

Example: drive 64-bit datapath with unit inverter

$$D = N F^{1/N} + P = N(64)^{1/N} + N$$

N    f      D       Inverter sizes (initial driver first, load of 64)
1    64     65      1
2    8      18      1, 8
3    4      15      1, 4, 16          Fastest
4    2.8    15.3    1, 2.8, 8, 23
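
The stage-count comparison for the 64-bit datapath example can be regenerated from $D = N F^{1/N} + N p_{inv}$. This is a minimal sketch assuming an inverter-only chain with $p_{inv} = 1$; the function name `chain_delay` is illustrative.

```python
def chain_delay(path_effort, n_stages, p_inv=1.0):
    """Return (stage effort f, path delay D) for an n-stage inverter chain."""
    f = path_effort ** (1.0 / n_stages)
    return f, n_stages * f + n_stages * p_inv

for n in range(1, 5):
    f, d = chain_delay(64, n)
    print(n, round(f, 1), round(d, 1))

# Stage count with the smallest delay among 1..4
best_n = min(range(1, 5), key=lambda n: chain_delay(64, n)[1])
print("fastest:", best_n)
```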

Derivation

Consider adding inverters to end of path

How many give least delay?

[Figure: logic block of $n_1$ stages with path effort F, followed by $N - n_1$ extra inverters]

$$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$$

$$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$$

Define best stage effort $\rho = F^{1/N}$

$$p_{inv} + \rho\,(1 - \ln \rho) = 0$$

Best Stage Effort

$p_{inv} + \rho\,(1 - \ln \rho) = 0$ has no closed-form solution

Neglecting parasitics ($p_{inv} = 0$), we find $\rho = 2.718$ (e)

For $p_{inv} = 1$, solve numerically for $\rho = 3.59$
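
The numerical solution for the best stage effort can be reproduced with a simple bisection on $p_{inv} + \rho(1 - \ln \rho) = 0$. This is a sketch, not the method used in the lecture; the bracket [2, 10] is an assumption that holds for $0 \le p_{inv} \le 10$, where the left side is positive at ρ = 2, negative at ρ = 10, and decreasing in between.

```python
import math

def best_stage_effort(p_inv, lo=2.0, hi=10.0, tol=1e-9):
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection."""
    def residual(rho):
        return p_inv + rho * (1.0 - math.log(rho))

    if residual(lo) < 0 or residual(hi) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid        # residual decreases with rho, so the root is to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(best_stage_effort(0.0), 3))  # 2.718
print(round(best_stage_effort(1.0), 2))  # 3.59
```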

Review of Definitions

Term               Stage                                                         Path
number of stages   1                                                             N
logical effort     g                                                             $G = \prod g_i$
electrical effort  $h = C_{out}/C_{in}$                                          $H = C_{out\text{-}path}/C_{in\text{-}path}$
branching effort   $b = (C_{on\text{-}path} + C_{off\text{-}path})/C_{on\text{-}path}$   $B = \prod b_i$
effort             $f = gh$                                                      $F = GBH$
effort delay       f                                                             $D_F = \sum f_i$
parasitic delay    p                                                             $P = \sum p_i$
delay              $d = f + p$                                                   $D = \sum d_i = D_F + P$

Method of Logical Effort

1) Compute path effort: $F = GBH$

2) Estimate best number of stages: $N = \log_4 F$

3) Sketch path with N stages

4) Estimate least delay: $D = N F^{1/N} + P$

5) Determine best stage effort: $\hat{f} = F^{1/N}$

6) Find gate sizes: $C_{in_i} = \dfrac{g_i\, C_{out_i}}{\hat{f}}$

Limits of Logical Effort

Chicken and egg problem

Need path to compute G

But don’t know number of stages without G

Simplistic delay model

Neglects input rise time effects

Interconnect

Iteration required in designs with wire

Maximum speed only

Not minimum area/power for constrained delay

Example 1

Calculate the a) logical effort, b) parasitic delay, c) effort and d) delay in the following 6-input AND implementations as a function of the path electrical effort H. Which implementation is the fastest if a) H=1, b) H=5 and c) H=20?

[Figure: four 6-input AND implementations, labeled (a) to (d)]

Example 2

For the path between x and F in the following circuit calculate:

a. Total delay

b. Path effort and parasitic delay

c. Minimum theoretical delay

[Figure: circuit containing the path from x to F; the surviving labels are 6, 8, 4, 20 and C]

Summary

Logical effort is useful for thinking of delay in circuits

Numeric logical effort characterizes gates

NANDs are faster than NORs in CMOS

Paths are fastest when effort delays are ~4

Path delay is weakly sensitive to stages, sizes

But using fewer stages doesn’t mean faster paths

Delay of path is about $\log_4 F$ FO4 inverter delays

Inverters and NAND2 best for driving large caps

Provides language for discussing fast circuits

But requires practice to master