Bayesian Model Fusion: Large-Scale

Nov 2, 2013

Slide 1

Bayesian Model Fusion: Large-Scale Performance Modeling of Analog and Mixed-Signal Circuits by Reusing Early-Stage Data

Fa Wang*, Wangyang Zhang*, Shupeng Sun*, Xin Li*, Chenjie Gu†

*ECE Dept., Carnegie Mellon University, Pittsburgh, PA 15213

†Intel Corp., Hillsboro, OR 97124

Slide 2

Outline

Background

Bayesian Model Fusion

Experiment Results

Conclusion

Slide 3

Process Variations and Performance Modeling

Statistical performance modeling: approximate circuit performance as an analytical function of process variations

A performance model is a powerful tool for efficient circuit analysis:

Yield estimation

Corner extraction

Sensitivity analysis

[Figure: small size, large variation, across the 65nm, 45nm, and 32nm technology nodes]

f(∆X) ≈ α_1·g_1(∆X) + α_2·g_2(∆X) + ... + α_M·g_M(∆X)

f: circuit performance of interest (e.g., read delay of SRAM)

∆X: a vector of random variables to model process variations

g_i(∆X): basis functions (e.g., linear or quadratic polynomials)

α_i: model coefficients
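As a rough illustration of why an analytical performance model makes yield estimation cheap, the sketch below evaluates a small linear model on many random samples instead of running circuit simulations. The coefficients, the specification limit `spec`, the separate constant term `alpha0`, and the assumption that ∆X is a vector of independent standard normal variables are all hypothetical choices for illustration, not values from the slides.

```python
import math
import numpy as np

def linear_model(alpha0, alpha, dX):
    """Evaluate f(dX) = alpha0 + sum_i alpha_i * dX_i for each row of dX."""
    return alpha0 + dX @ alpha

def mc_yield(alpha0, alpha, spec, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(f(dX) <= spec), evaluated on the model only."""
    rng = np.random.default_rng(seed)
    dX = rng.standard_normal((n_samples, alpha.size))  # assumed N(0, I) variations
    return float(np.mean(linear_model(alpha0, alpha, dX) <= spec))

def analytic_yield(alpha0, alpha, spec):
    """Closed form for a linear model of independent standard normals."""
    sigma = float(np.linalg.norm(alpha))
    z = (spec - alpha0) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

alpha0 = 1.0                                  # hypothetical nominal performance
alpha = np.array([0.05, 0.02, 0.0, 0.01])     # hypothetical sensitivities
spec = 1.1                                    # hypothetical upper specification limit
print(mc_yield(alpha0, alpha, spec), analytic_yield(alpha0, alpha, spec))
```

The entries of `alpha` double as sensitivities in this linear case, which is one way to read the "sensitivity analysis" item above.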

Slide 4

Solving Performance Model: Least Squares Fitting (LSF)

Determine the performance model by LSF

A set of sampling points is collected

Model coefficients are solved from the following linear equation

[Equation: a matrix of basis function values, with one row per MC sample (total of K MC samples) and one column per basis function (Basis 1, Basis 2, ..., Basis M; total of M basis functions), multiplied by the vector of model coefficients, equals the vector of sampled performance values]

The problem must be over-determined in order to be solvable (i.e., K > M)
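The fitting step above can be sketched with NumPy's least-squares solver. This is a minimal sketch on synthetic data: the linear basis (a constant plus one term per variable), the names `basis_matrix` and `fit_lsf`, and the "true" coefficients are assumptions made for illustration.

```python
import numpy as np

def basis_matrix(dX):
    """Linear basis: g_1 = 1 (constant), g_(i+1) = dX_i. Returns a K x M matrix."""
    K = dX.shape[0]
    return np.hstack([np.ones((K, 1)), dX])

def fit_lsf(dX, f):
    """Solve G @ alpha ~= f in the least-squares sense (requires K > M)."""
    G = basis_matrix(dX)
    K, M = G.shape
    if K <= M:
        raise ValueError("LSF needs more samples than basis functions (K > M)")
    alpha, *_ = np.linalg.lstsq(G, f, rcond=None)
    return alpha

rng = np.random.default_rng(0)
true_alpha = np.array([1.0, 0.5, -0.2, 0.1])   # hypothetical coefficients, M = 4
dX = rng.standard_normal((50, 3))              # K = 50 synthetic MC samples
f = basis_matrix(dX) @ true_alpha + 1e-3 * rng.standard_normal(50)
alpha_hat = fit_lsf(dX, f)
print(alpha_hat)
```

With K = 50 and M = 4 the system is comfortably over-determined; passing fewer samples than basis functions makes `fit_lsf` raise instead of returning an arbitrary solution.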

Slide 5

Challenge: High Dimensionality

High dimensionality becomes a challenge in performance
modeling

Large number of independent random variables must be used to
describe variations in each transistor

Increased number of transistors in circuits

Example: a commercial 32nm CMOS process

~40 random variables to model mismatches of a single transistor

Due to high dimensionality (i.e., a large # of basis functions), it is unrealistic to apply LSF, which requires the # of MC samples to exceed the # of basis functions

Circuit                  Transistor #    Random variable #
Operational amplifier    ~50             ~2000
SRAM critical path       ~10K            ~400K
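The counts in the table follow from simple arithmetic, sketched below. The figure of 40 variables per transistor is the approximate value quoted above; the basis-function counts for linear and quadratic models are standard combinatorics added here for illustration and do not appear on the slide.

```python
def n_random_variables(n_transistors, vars_per_transistor=40):
    """Approximate number of mismatch variables for a circuit."""
    return n_transistors * vars_per_transistor

def n_linear_basis(n_vars):
    """Constant term plus one linear term per variable."""
    return n_vars + 1

def n_quadratic_basis(n_vars):
    """Constant, linear, square, and cross terms: (N + 1)(N + 2) / 2."""
    return (n_vars + 1) * (n_vars + 2) // 2

for name, n_tx in [("Operational amplifier", 50), ("SRAM critical path", 10_000)]:
    n = n_random_variables(n_tx)
    print(name, n, n_linear_basis(n), n_quadratic_basis(n))
```

Since LSF needs more MC samples than basis functions, even the linear model of the SRAM critical path would call for several hundred thousand circuit simulations.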

Slide 6

Sparsity

To handle the high dimensionality problem, the sparsity feature of circuits has been explored [1]

Sparsity means that the circuit performance variability is dominated by only a few random variables

Example: in an SRAM critical path, the Vth mismatches of many transistors are not important

The performance model has a sparse profile:

Most of the coefficients are zero or close to zero

[Figure: performance, basis functions, and model coefficients]

[1] X. Li, "Finding deterministic solution from underdetermined equation: large-scale performance modeling of analog/RF circuits," TCAD, vol. 29, no. 11, pp. 1661-1668, Nov. 2010

Slide 7

Sparse Regression

Sparse regression is an efficient performance modeling algorithm that utilizes the sparsity feature

Sparse regression is better than LSF because, by using the sparsity feature, it requires fewer samples

Efficiency of performance modeling can be further improved,
by considering

from design flow (will
be discussed in detail later)
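The sparse regression idea above can be sketched with orthogonal matching pursuit (OMP), the baseline named later in the experiments. This is a generic textbook-style OMP on synthetic data, not the authors' implementation: the function name `omp`, the random basis matrix, and the three nonzero coefficients are assumptions for illustration, and the columns of `G` are assumed to have comparable norms.

```python
import numpy as np

def omp(G, f, n_nonzero):
    """Greedy sparse fit of G @ alpha ~= f with at most n_nonzero nonzero coefficients."""
    K, M = G.shape
    selected = np.zeros(M, dtype=bool)
    residual = f.astype(float).copy()
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the unused basis function most correlated with the residual.
        score = np.abs(G.T @ residual)
        score[selected] = -np.inf
        selected[int(np.argmax(score))] = True
        # Refit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(G[:, selected], f, rcond=None)
        residual = f - G[:, selected] @ coef
    alpha = np.zeros(M)
    alpha[selected] = coef
    return alpha

rng = np.random.default_rng(1)
K, M = 80, 200                       # fewer samples than basis functions
G = rng.standard_normal((K, M))      # synthetic basis function values
true_alpha = np.zeros(M)
true_alpha[[5, 50, 120]] = [3.0, -2.0, 1.5]   # hypothetical sparse profile
f = G @ true_alpha
alpha_hat = omp(G, f, n_nonzero=3)
print(np.flatnonzero(alpha_hat))
```

Here K < M, so LSF would not apply, yet the few important coefficients can still be identified because the rest are exactly zero in this synthetic setup.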

Slide 8

Outline

Background

Bayesian Model Fusion

Experiment Results

Conclusion

Slide 9

Bayesian Model Fusion (BMF): Overview

Key idea: BMF facilitates late-stage performance modeling by reusing data collected in the early stage

[Figure: early-stage data and late-stage data each feeding performance modeling; in the proposed flow, BMF combines both for performance modeling]

Slide 10

AMS Circuit Design Flow

Analog and mixed-signal (AMS) circuit design spans multiple stages

[Figure: design cycle for analog and mixed-signal circuits, from the schematic design stage (early stage) to the layout design stage (late stage), with circuit modeling and performance modeling steps]

Slide 11

Correlation in AMS Design Flow

One important fact in the AMS design flow is that different stages share the same circuit topology and functionality

This leads to correlation among different stages

[Figure: comparator at the schematic stage and at the layout stage]

Slide 12

Correlation in Performance Models

Correlation: f_E(∆X) and f_L(∆X) are "likely" to be similar

f_E(∆X) = α_E1·g_1(∆X) + α_E2·g_2(∆X) + α_E3·g_3(∆X) + α_E4·g_4(∆X)

f_L(∆X) = α_L1·g_1(∆X) + α_L2·g_2(∆X) + α_L3·g_3(∆X) + α_L4·g_4(∆X)

α_L1 ≈? α_E1, α_L2 ≈? α_E2, α_L3 ≈? α_E3, α_L4 ≈? α_E4

f_E(∆X): early-stage performance model

f_L(∆X): late-stage performance model

α_Ei, α_Li: model coefficients

g_i(∆X): basis functions

Slide 13

The Proposed Algorithm Flow

Early stage: early-stage data -> early-stage performance model -> prior

Late stage: very few late-stage data -> likelihood

Prior and likelihood -> Bayesian inference (proposed) -> late-stage performance model

Slide 14

Prior

The prior is a distribution that describes the uncertainty of parameters based on early-stage data, before late-stage data is taken into account

In our work, information from the early design stage is encoded in the prior, which describes the uncertainty of the late-stage model coefficients

[Figure: prior distribution pdf(α_L,m), comparing two candidate values α_L,m1 and α_L,m2, one with higher probability and one with lower probability]

Slide 15

Prior

Magnitude information of early-stage model coefficients is encoded in the prior

Magnitude information here describes whether the absolute value of a coefficient is relatively large or small

Information about small (or zero) coefficients represents the sparsity profile, which is essential for the performance model [1]

Define the prior distribution as a zero-mean Gaussian distribution

Key idea of encoding: the shape of the prior is related to the magnitude information

[Figure: prior distribution]

[1] X. Li, "Finding deterministic solution from underdetermined equation: large-scale performance modeling of analog/RF circuits," TCAD, vol. 29, no. 11, pp. 1661-1668, Nov. 2010
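One way to picture "the shape of the prior is related to magnitude information" is sketched below. The slide states only that the prior is a zero-mean Gaussian; the specific mapping used here, a standard deviation proportional to the early-stage coefficient magnitude with a hypothetical scale `lam` and a small floor `eps`, is an assumption made for illustration.

```python
import numpy as np

def prior_sigma(alpha_early, lam=1.0, eps=1e-6):
    """Assumed mapping: the prior std of each late-stage coefficient grows with |alpha_E,m|."""
    return lam * np.abs(alpha_early) + eps

def log_prior(alpha_late, sigma):
    """Log of a product of independent zero-mean Gaussian pdfs."""
    return float(np.sum(-0.5 * (alpha_late / sigma) ** 2
                        - np.log(sigma * np.sqrt(2.0 * np.pi))))

alpha_early = np.array([2.0, 0.0, 0.5])     # hypothetical early-stage coefficients
sigma = prior_sigma(alpha_early)
near_early = np.array([1.8, 0.0, 0.6])      # follows the early-stage magnitude profile
off_profile = np.array([0.0, 1.8, 0.6])     # large where the early-stage value was zero
print(log_prior(near_early, sigma), log_prior(off_profile, sigma))
```

A narrow prior on a coefficient that was near zero in the early stage strongly penalizes a large late-stage value there, which is how the sparsity profile carries over.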

Slide 16

Likelihood

The likelihood is a function of the parameters that evaluates how well the parameters fit the data

Late-stage information is encoded in the likelihood function

Specifically, late-stage performance function information is encoded in the likelihood function

In our work, the likelihood function describes how well the model coefficients fit the late-stage data

[Figure: likelihood(α_L,m), comparing two candidate values α_L,m1 and α_L,m2, one giving a better fit and one a worse fit]

Slide 17

Maximum-A-Posteriori Estimation

However, if we determine the model coefficients solely based on the likelihood, we may have an over-fitting problem

In our case, the # of samples in the late stage is smaller than the # of model coefficients in the late stage

Bayes' theorem:

Posterior ∝ Prior × Likelihood, i.e., posterior(α_L) ∝ pdf(α_L) · likelihood(α_L)

Maximum-a-posteriori (MAP) estimation:

MAP estimation of α_L: choose the α_L that maximizes the posterior

[Figure: prior distribution pdf(α_L) and likelihood(α_L) combined into the posterior, with the MAP estimate of α_L marked]
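To make the MAP step concrete, the sketch below combines a zero-mean Gaussian prior with a Gaussian fitting-error likelihood, for which the posterior maximum has a closed form. The Gaussian error model, the noise level `noise_std`, the magnitude-based prior width, and all data are assumptions for illustration; the slides do not give these details.

```python
import numpy as np

def map_estimate(G, f, sigma_prior, noise_std):
    """MAP coefficients for a zero-mean Gaussian prior and Gaussian fitting error.

    Minimizes ||f - G a||^2 / noise_std^2 + sum_m (a_m / sigma_prior_m)^2,
    solved through a K x K system so that K < M is not a problem.
    """
    D = sigma_prior ** 2                       # prior variances, length M
    GD = G * D                                 # same as G @ diag(D)
    A = GD @ G.T + (noise_std ** 2) * np.eye(G.shape[0])
    return GD.T @ np.linalg.solve(A, f)

rng = np.random.default_rng(2)
K, M = 20, 100                                 # fewer late-stage samples than coefficients
alpha_early = np.zeros(M)
alpha_early[:5] = [2.0, -1.0, 0.5, 1.5, -0.8]  # hypothetical sparse early-stage model
alpha_late = alpha_early * (1.0 + 0.1 * rng.standard_normal(M))  # similar, not identical
G = rng.standard_normal((K, M))
f = G @ alpha_late + 0.01 * rng.standard_normal(K)

sigma_prior = np.abs(alpha_early) + 1e-3       # assumed magnitude-based prior width
alpha_map = map_estimate(G, f, sigma_prior, noise_std=0.01)
alpha_lik, *_ = np.linalg.lstsq(G, f, rcond=None)  # likelihood only (minimum-norm fit)

err_map = np.linalg.norm(alpha_map - alpha_late) / np.linalg.norm(alpha_late)
err_lik = np.linalg.norm(alpha_lik - alpha_late) / np.linalg.norm(alpha_late)
print(err_map, err_lik)
```

The likelihood-only fit reproduces the 20 samples but spreads weight over all 100 coefficients, which is the over-fitting problem noted above; the prior steers the solution toward the early-stage profile.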

Slide 18

Outline

Background

Bayesian Model Fusion

Experiment Results

Conclusion

Slide 19

SRAM Example

Example 1: CMOS SRAM

Designed in a commercial 32nm SOI process

61572 independent random process parameters are considered

Read delay is the performance of interest

A linear performance model is fitted

Experiments run on a 2.5GHz Linux server with 16GB memory

Slide 20

Modeling Error

Two different methods are compared:

The proposed method (BMF)

Orthogonal Matching Pursuit (OMP)

[Figure: modeling error of BMF and OMP, annotated with "4x"]

Slide 21

Modeling Time Speed-up

BMF requires 4x fewer samples than OMP to achieve similar accuracy on the SRAM example

4x runtime speed-up to build the performance model

                              OMP      BMF (Proposed)
Post-layout samples           400      100
Delay error                   1.02%    0.99%
Simulation cost (Hour)        38.77    9.69
Fitting cost (Second)         3.56     2.11
Total modeling cost (Hour)    38.77    9.69

Slide 22

RO Example

Example 2: CMOS ring oscillator

Designed in a commercial 32nm SOI process

7177 independent random process parameters are considered

Power, frequency, and phase noise are the performances of interest

A linear performance model is fitted

Experiments run on a 2.5GHz Linux server with 16GB memory

Slide 23

Modeling Error

Modeling error is measured for power, frequency, and phase noise

[Figure: three modeling error plots (power, frequency, phase noise), each annotated with "9x"]

Slide 24

Modeling Time Speed-up

BMF requires 9x fewer samples than OMP to achieve similar accuracy on the RO example

9x runtime speed-up to build the performance model

                              OMP      BMF (Proposed)
Post-layout samples           900      100
Power error                   0.77%    0.72%
Frequency error               0.65%    0.54%
Phase noise error             0.12%    0.12%
Simulation cost (Hour)        12.58    1.40
Fitting cost (Second)         5.75     1.69
Total modeling cost (Hour)    12.58    1.40
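The speed-up figures can be checked directly from the two cost tables (SRAM and RO). The short sketch below recomputes them from the reported numbers; the dictionary layout and function names are only for illustration.

```python
# Reported costs from the SRAM and RO tables.
results = {
    "SRAM": {"OMP": {"samples": 400, "sim_hours": 38.77, "fit_seconds": 3.56},
             "BMF": {"samples": 100, "sim_hours": 9.69, "fit_seconds": 2.11}},
    "RO":   {"OMP": {"samples": 900, "sim_hours": 12.58, "fit_seconds": 5.75},
             "BMF": {"samples": 100, "sim_hours": 1.40, "fit_seconds": 1.69}},
}

def total_hours(row):
    """Simulation cost plus fitting cost, both expressed in hours."""
    return row["sim_hours"] + row["fit_seconds"] / 3600.0

def speedup(case):
    """Ratio of OMP total modeling cost to BMF total modeling cost."""
    return total_hours(case["OMP"]) / total_hours(case["BMF"])

for name, case in results.items():
    ratio = case["OMP"]["samples"] / case["BMF"]["samples"]
    print(f"{name}: {ratio:.0f}x fewer samples, {speedup(case):.2f}x lower total cost")
```

Fitting takes seconds while simulation takes hours, so the total modeling cost is set almost entirely by the number of post-layout samples, and the runtime speed-up tracks the sample reduction.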

Slide 25

Conclusion

The proposed BMF method facilitates efficient high-dimensional performance modeling at the late stage by reusing early-stage data

BMF achieves a runtime speed-up of 4x or more over the traditional OMP method on the SRAM and RO test cases (about 4x for SRAM and about 9x for RO)

BMF can be used for commercial applications such as macromodeling-based verification