Bayesian Model Fusion: Large-Scale

Nov 2, 2013

Slide 1


Bayesian Model Fusion: Large-Scale Performance Modeling of Analog and Mixed-Signal Circuits by Reusing Early-Stage Data

Fa Wang*, Wangyang Zhang*, Shupeng Sun*, Xin Li*, Chenjie Gu


*ECE Dept., Carnegie Mellon University, Pittsburgh, PA 15213

Intel Corp., Hillsboro, OR 97124


Slide 2


Outline


Background


Bayesian Model Fusion


Experiment Results


Conclusion


Slide 3


Process Variations and Performance Modeling






Statistical performance modeling: approximate circuit
performance as an analytical function of process variations






A performance model is a powerful tool for efficient circuit analysis:


Yield estimation


Corner extraction


Sensitivity analysis

[Figure: small size, large variation; technology nodes 65nm, 45nm, 32nm]

$$ f(\Delta X) \approx \sum_{i=1}^{M} \alpha_i \, g_i(\Delta X) $$

f: circuit performance of interest (e.g., read delay of SRAM)

∆X: a vector of random variables to model process variations

g_i(∆X): basis functions (e.g., linear or quadratic polynomials)

α_i: model coefficients

Slide 4


Solving Performance Model: Least Squares Fitting (LSF)


[Figure: LSF determines the performance model from a total of K MC samples and a total of M basis functions (Basis 1, Basis 2, ..., Basis M); the unknowns are the model coefficients]


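A set of sampling points is collected

The model coefficients are solved from the following linear equation

$$ G \cdot \alpha = f, \quad G_{k,i} = g_i(\Delta X^{(k)}), \quad k = 1, \ldots, K, \; i = 1, \ldots, M $$

(G is the K-by-M matrix of basis-function values at the sampling points; f here is the vector of the K simulated performance values)

The problem must be over-determined to be solvable (i.e., K > M)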
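The least-squares fit described on this slide can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the authors' implementation: the sample counts, the random "process variations", and the names `G`, `dX`, and `alpha_true` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

K, M = 200, 10                          # K MC samples, M basis functions (K > M)
dX = rng.standard_normal((K, M - 1))    # hypothetical process-variation samples

# Linear basis: g_1 = 1 (constant term), g_2..g_M = the individual variables.
G = np.hstack([np.ones((K, 1)), dX])    # K x M matrix, G[k, i] = g_i(dX_k)

alpha_true = rng.standard_normal(M)     # synthetic "true" coefficients
f = G @ alpha_true + 0.01 * rng.standard_normal(K)  # synthetic performance values

# Least-squares solution of the over-determined system G @ alpha = f.
alpha_lsf, *_ = np.linalg.lstsq(G, f, rcond=None)

max_coeff_error = np.max(np.abs(alpha_lsf - alpha_true))
print(f"max coefficient error: {max_coeff_error:.4f}")
```

With K well above M the fitted coefficients land close to the synthetic ones; once K drops below M the system has infinitely many exact solutions, which is the limitation the next slides address.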



Slide 5


Challenge: High Dimensionality


High dimensionality becomes a challenge in performance
modeling


A large number of independent random variables must be used to describe variations in each transistor


Increased number of transistors in circuits


Example: a commercial 32nm CMOS process


~40 random variables to model mismatches of a single transistor


Because of the high dimensionality (i.e., a large # of basis functions), it is unrealistic to apply LSF (which requires # of MC samples > # of basis functions)



Circuit                  Transistor #   Random variable #
Operational amplifier    ~50            ~2000
SRAM critical path       ~10K           ~400K
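The counts in the table follow from the "~40 random variables per transistor" figure, and the number of basis functions grows with them. The short sketch below reproduces that arithmetic. The helper names are hypothetical, and the basis counts assume a full linear model (N + 1 terms) or a full quadratic model ((N + 1)(N + 2) / 2 terms) in N variables.

```python
VARS_PER_TRANSISTOR = 40  # "~40 random variables" per transistor, from the slide

def num_random_variables(num_transistors):
    """Approximate number of random variables for a circuit."""
    return num_transistors * VARS_PER_TRANSISTOR

def num_linear_basis(n):
    """Constant term plus one linear term per variable."""
    return n + 1

def num_quadratic_basis(n):
    """All monomials of degree <= 2 in n variables."""
    return (n + 1) * (n + 2) // 2

circuits = {"Operational amplifier": 50, "SRAM critical path": 10_000}
for name, transistors in circuits.items():
    n = num_random_variables(transistors)
    print(f"{name}: {n} variables, "
          f"{num_linear_basis(n)} linear basis functions, "
          f"{num_quadratic_basis(n)} quadratic basis functions")
```

Since LSF needs more MC samples than basis functions, even the linear model of the SRAM critical path would need several hundred thousand simulations.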

Slide 6



To handle the high-dimensionality problem, the sparsity of circuits has been exploited [1]


Sparsity means that the circuit performance variability is dominated by only a few random variables


Example: in an SRAM critical path, many Vth mismatches of transistors are not important


The performance model has a sparse profile:


Most of the coefficients are zero or close to zero











[Figure: sparsity of the performance model; labels: basis functions, model coefficients, performance]

[1] X. Li, "Finding deterministic solution from underdetermined equation: large-scale performance modeling of analog/RF circuits," TCAD, vol. 29, no. 11, pp. 1661-1668, Nov. 2010

Slide 7


Sparse Regression


Sparse regression is an efficient performance modeling algorithm that exploits sparsity


Sparse regression is better than LSF because, by exploiting sparsity, it requires fewer samples


The efficiency of performance modeling can be further improved by considering additional information from the design flow (discussed in detail later)
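To make the idea of sparse regression concrete, here is a minimal sketch of Orthogonal Matching Pursuit (OMP), the sparse regression method used as the baseline in the experiments later in this deck. It is a textbook-style greedy implementation on synthetic data, not the code used by the authors; the function name `omp`, the problem sizes, and the chosen non-zero positions are assumptions for illustration.

```python
import numpy as np

def omp(G, f, n_nonzero):
    """Greedy OMP: select n_nonzero columns of G and fit them by least squares."""
    M = G.shape[1]
    residual = f.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the basis function most correlated with the current residual.
        corr = np.abs(G.T @ residual)
        if support:
            corr[support] = 0.0          # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(G[:, support], f, rcond=None)
        residual = f - G[:, support] @ coef
    alpha = np.zeros(M)
    alpha[support] = coef
    return alpha

rng = np.random.default_rng(0)
K, M = 80, 200                            # fewer samples than basis functions
G = rng.standard_normal((K, M))
alpha_true = np.zeros(M)
alpha_true[[3, 57, 120]] = [2.0, -1.5, 1.0]   # only three non-zero coefficients
f = G @ alpha_true                        # noise-free synthetic data

alpha_omp = omp(G, f, n_nonzero=3)
recovered_support = sorted(np.flatnonzero(alpha_omp).tolist())
print("recovered non-zero positions:", recovered_support)
```

Because only a handful of coefficients matter, the fit succeeds with K < M, a regime in which plain LSF has no unique solution.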


Slide 8


Outline


Background


Bayesian Model Fusion


Experiment Results


Conclusion


Slide 9


Bayesian Model Fusion (BMF): Overview


Key idea: BMF facilitates late-stage performance modeling by reusing data collected in the early stage

[Figure: traditional vs. proposed flow. Traditional: early-stage data and late-stage data each feed their own performance modeling. Proposed: BMF combines early-stage and late-stage data for performance modeling]

Slide 10



Analog and mixed-signal (AMS) circuit design spans multiple stages









AMS Circuit Design Flow

[Figure: design cycle for analog and mixed-signal circuits, from the schematic design stage (early stage) to the layout design stage (late stage); labels: circuit modeling, performance modeling]

Slide 11


Correlation in AMS Design Flow


One important fact in the AMS design flow is that different stages share the same circuit topology and functionality

This leads to correlation among different stages

[Figure: comparator at the schematic stage and at the layout stage]

Slide 12


Correlation in Performance Models










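Correlation: f_E(∆X) and f_L(∆X) are “likely” to be similar

$$ f_E(\Delta X) = \alpha_{E1} g_1(\Delta X) + \alpha_{E2} g_2(\Delta X) + \alpha_{E3} g_3(\Delta X) + \alpha_{E4} g_4(\Delta X) + \cdots $$

$$ f_L(\Delta X) = \alpha_{L1} g_1(\Delta X) + \alpha_{L2} g_2(\Delta X) + \alpha_{L3} g_3(\Delta X) + \alpha_{L4} g_4(\Delta X) + \cdots $$

$$ \alpha_{L1} \approx \alpha_{E1}\,? \quad \alpha_{L2} \approx \alpha_{E2}\,? \quad \alpha_{L3} \approx \alpha_{E3}\,? \quad \alpha_{L4} \approx \alpha_{E4}\,? $$

f_E(∆X): early-stage performance model

f_L(∆X): late-stage performance model

α_Ei, α_Li: model coefficients

g_i(∆X): basis functions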

Slide 13


The Proposed Algorithm Flow

[Figure: early stage: early-stage data -> early-stage performance model -> prior. Late stage: very few late-stage data -> likelihood. Prior and likelihood enter Bayesian inference (proposed), which yields the late-stage performance model]

Slide 14


Prior










The prior is a distribution that describes the uncertainty of the parameters based on early-stage data, before late-stage data is taken into account


In our work, information from the early design stage is encoded in the prior, which describes the uncertainty of the late-stage model coefficients
















[Figure: prior distribution pdf(α_L,m) over the axes α_L,m1 and α_L,m2, with regions of higher probability and lower probability]

Slide 15


Prior


Magnitude information of the early-stage model coefficients is encoded in the prior


Magnitude information here describes whether the absolute value of a coefficient is relatively large or small


Small (or zero) coefficients represent the sparsity profile, which is essential for performance modeling [1]


Define the prior distribution as a zero-mean Gaussian distribution


Key idea of encoding: the shape of the prior is related to the magnitude information













[Figure: prior distribution]

[1] X. Li, "Finding deterministic solution from underdetermined equation: large-scale performance modeling of analog/RF circuits," TCAD, vol. 29, no. 11, pp. 1661-1668, Nov. 2010
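One way to write a zero-mean Gaussian prior whose shape tracks the early-stage magnitude is shown below. The slides state only that the prior is zero-mean Gaussian and that its shape depends on magnitude information; the specific choice of a standard deviation proportional to the early-stage coefficient magnitude, the constant \lambda, and the notation \alpha_{E,m} for the m-th early-stage coefficient are assumptions made for illustration.

```latex
% Assumed form: independent zero-mean Gaussian prior per late-stage coefficient
\mathrm{pdf}(\alpha_{L,m})
  = \frac{1}{\sqrt{2\pi}\,\sigma_m}
    \exp\!\left(-\frac{\alpha_{L,m}^{2}}{2\sigma_m^{2}}\right),
\qquad
\sigma_m = \lambda\,\lvert \alpha_{E,m} \rvert
% lambda > 0 is a hypothetical scaling constant, not given in the slides
```

Under this assumption, a small early-stage coefficient yields a narrow prior that pulls the late-stage coefficient toward zero, while a large one leaves the late-stage coefficient nearly free. This is how the sparsity profile of the early-stage model would carry over.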

Slide 16


Likelihood


The likelihood is a function of the parameters that evaluates how well the parameters fit the data


Late-stage information is encoded in the likelihood function


Specifically, information about the late-stage performance function is encoded in the likelihood function


In our work, the likelihood function describes how well the model coefficients fit the late-stage data









[Figure: likelihood(α_L,m) over the axes α_L,m1 and α_L,m2, with regions of better fit and worse fit]
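As a small illustration of "how well the model coefficients fit the late-stage data", the sketch below evaluates a log-likelihood for two candidate coefficient vectors. It assumes independent zero-mean Gaussian modeling error with standard deviation `sigma0`; that error model, the function name, and the synthetic data are assumptions, since the slides do not spell out the likelihood formula.

```python
import numpy as np

def log_likelihood(alpha, G, f, sigma0):
    """Gaussian log-likelihood of coefficients alpha given samples (G, f)."""
    residual = f - G @ alpha
    K = f.shape[0]
    return (-0.5 * K * np.log(2.0 * np.pi * sigma0**2)
            - residual @ residual / (2.0 * sigma0**2))

rng = np.random.default_rng(1)
G = rng.standard_normal((20, 4))             # 20 synthetic late-stage samples
alpha_good = np.array([1.0, 0.0, -0.5, 0.0]) # coefficients that generated the data
f = G @ alpha_good + 0.05 * rng.standard_normal(20)
alpha_bad = np.zeros(4)                      # a poor candidate

ll_good = log_likelihood(alpha_good, G, f, sigma0=0.05)
ll_bad = log_likelihood(alpha_bad, G, f, sigma0=0.05)
print(f"better fit: {ll_good:.1f}, worse fit: {ll_bad:.1f}")
```

The candidate with the smaller residual gets the higher likelihood, which matches the "better fit" and "worse fit" regions in the figure.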

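Slide 17


Maximum-A-Posteriori Estimation


However, if we determine the model coefficients solely from the likelihood, we may run into an over-fitting problem


In our case, the # of samples in the late stage is smaller than the # of model coefficients in the late stage


Bayes' theorem:

$$ \underbrace{\mathrm{pdf}(\alpha_L \mid \text{data})}_{\text{Posterior}} \;\propto\; \underbrace{\mathrm{pdf}(\alpha_L)}_{\text{Prior}} \cdot \underbrace{\mathrm{likelihood}(\alpha_L)}_{\text{Likelihood}} $$


Maximum-a-posteriori (MAP) estimation:

$$ \max_{\alpha_L} \; \mathrm{pdf}(\alpha_L) \cdot \mathrm{likelihood}(\alpha_L) $$

[Figure: the prior distribution pdf(α_L) and likelihood(α_L) combine into the posterior; the MAP estimate of α_L is the maximizer of the posterior]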
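Under two assumptions that the slides do not state explicitly, the MAP estimate has a closed form: a Gaussian modeling error in the likelihood, and a zero-mean Gaussian prior whose standard deviation is proportional to the magnitude of the corresponding early-stage coefficient. Maximizing the posterior is then equivalent to a weighted ridge regression that penalizes each late-stage coefficient by the inverse square of its early-stage counterpart. The sketch below illustrates this on synthetic data; the function `bmf_map`, the hyperparameter `eta`, and the small constant `eps` (which avoids division by zero) are hypothetical names and choices, not the authors' implementation.

```python
import numpy as np

def bmf_map(G, f, alpha_early, eta, eps=1e-6):
    """MAP estimate with a zero-mean Gaussian prior, std ~ |alpha_early|."""
    # Per-coefficient penalty: large where the early-stage coefficient is small.
    weights = 1.0 / (alpha_early**2 + eps)
    A = G.T @ G + eta * np.diag(weights)
    return np.linalg.solve(A, G.T @ f)

rng = np.random.default_rng(0)
K, M = 20, 100                      # fewer late-stage samples than coefficients
alpha_early = np.zeros(M)
alpha_early[:5] = [2.0, -1.0, 1.5, 0.5, -0.8]   # sparse early-stage model (synthetic)
alpha_late = alpha_early.copy()
alpha_late[:5] *= 1.0 + 0.1 * rng.standard_normal(5)  # similar, not identical

G = rng.standard_normal((K, M))
f = G @ alpha_late + 0.01 * rng.standard_normal(K)

alpha_map = bmf_map(G, f, alpha_early, eta=1e-3)
# Baseline for comparison: same penalty on every coefficient (ignores early data).
alpha_ridge = np.linalg.solve(G.T @ G + 1e-3 * np.eye(M), G.T @ f)

norm_late = np.linalg.norm(alpha_late)
err_map = np.linalg.norm(alpha_map - alpha_late) / norm_late
err_ridge = np.linalg.norm(alpha_ridge - alpha_late) / norm_late
print(f"relative error with early-stage prior: {err_map:.3f}")
print(f"relative error with uniform penalty:   {err_ridge:.3f}")
```

In this toy setting the early-stage prior tells the solver which coefficients may be non-zero, so 20 samples suffice for 100 coefficients, while the uniform penalty cannot resolve the under-determined system. In practice `eta` would need to be tuned, for example by cross-validation.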

Slide 18


Outline


Background


Bayesian Model Fusion


Experiment Results


Conclusion


Slide 19


SRAM Example


Example 1: CMOS SRAM


Designed in a commercial 32nm SOI process


61572 independent random process parameters are considered


Read delay is considered as the performance of interest


A linear performance model is fitted


Experiments run on a 2.5GHz Linux server with 16GB memory

Slide 20


Modeling Error


Two different methods are compared:


The proposed method (BMF)


Orthogonal Matching Pursuit (OMP)


[Figure: modeling error of BMF and OMP; annotation: 4x]

Slide 21


Modeling Time Speed-up


BMF requires 4x fewer samples than OMP to achieve similar accuracy on the SRAM example


4x runtime speed-up to build the performance model

                              OMP (Traditional)   BMF (Proposed)
Post-layout samples           400                 100
Read delay error              1.02%               0.99%
Simulation cost (Hour)        38.77               9.69
Fitting cost (Second)         3.56                2.11
Total modeling cost (Hour)    38.77               9.69

Slide 22


RO Example


Example 2: CMOS ring oscillator


Designed in a commercial 32nm SOI process


7177 independent random process parameters are considered


Power, frequency and phase noise are considered as the performances of interest


A linear performance model is fitted


Experiments run on a 2.5GHz Linux server with 16GB memory

Slide 23


Modeling Error


Modeling error is measured for power, frequency and phase noise

[Figure: modeling error for power, frequency and phase noise; each panel annotated 9x]

Slide 24


Modeling Time Speed-up


BMF requires 9x fewer samples than OMP to achieve similar accuracy on the RO example


9x runtime speed-up to build the performance model

                              OMP (Traditional)   BMF (Proposed)
Post-layout samples           900                 100
Power error                   0.77%               0.72%
Frequency error               0.65%               0.54%
Phase noise error             0.12%               0.12%
Simulation cost (Hour)        12.58               1.40
Fitting cost (Second)         5.75                1.69
Total modeling cost (Hour)    12.58               1.40
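The speed-up figures can be checked directly from the numbers reported in the SRAM and RO tables. The short script below does that arithmetic; the dictionary layout and helper name are illustrative choices, and the values are copied from the slides.

```python
SECONDS_PER_HOUR = 3600.0

# (simulation hours, fitting seconds, post-layout samples) as reported in the tables
reported = {
    "SRAM": {"OMP": (38.77, 3.56, 400), "BMF": (9.69, 2.11, 100)},
    "RO":   {"OMP": (12.58, 5.75, 900), "BMF": (1.40, 1.69, 100)},
}

def total_hours(sim_hours, fit_seconds):
    """Total modeling cost in hours: simulation plus fitting."""
    return sim_hours + fit_seconds / SECONDS_PER_HOUR

speedup = {}
for circuit, methods in reported.items():
    omp_hours = total_hours(*methods["OMP"][:2])
    bmf_hours = total_hours(*methods["BMF"][:2])
    speedup[circuit] = omp_hours / bmf_hours
    sample_ratio = methods["OMP"][2] / methods["BMF"][2]
    print(f"{circuit}: {sample_ratio:.0f}x fewer samples, "
          f"{speedup[circuit]:.2f}x total speed-up")
```

The fitting cost is a few seconds against hours of simulation, so the total speed-up is essentially the ratio of sample counts: roughly 4x for the SRAM and roughly 9x for the RO.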

Slide 25


Conclusion


The proposed BMF method facilitates efficient high-dimensional performance modeling at the late stage by reusing early-stage data


BMF achieves about 4x (SRAM) and 9x (RO) runtime speed-up over the traditional OMP method on the test cases


BMF can be used for commercial applications such as macro-modeling-based verification