"Zero" Defect IC Quality - Auburn University


Outlier Screening for “Zero” Defect IC Quality

Adit D. Singh

Electrical & Computer Engineering

Auburn University

adsingh@auburn.edu

05/18/01 V4.3



Outline



Many integrated circuits contain fabrication defects upon manufacture

Die yields may only be 20-50% for high-end circuits

ICs must be carefully tested to screen out faulty parts before integration in systems

Small “latent” defects that cause early life failure must also be eliminated

New “screening” methods address this problem

IC Testing is a Difficult Problem



Need 2^3 = 8 input patterns to exhaustively test a 3-input NAND

2^N tests needed for an N-input circuit

Many ICs have > 100 inputs

Only a very few input combinations can be applied in practice

2^100 = 1.27 x 10^30

Applying 10^30 tests at 10^9 per second (1 GHz) will require 10^21 secs = 400 billion centuries!

[Figure: 3-input NAND]
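
The arithmetic on this slide can be checked with a short Python sketch. The function name and the 365.25-day year are assumptions made for illustration; they are not part of the original slides.

```python
# Seconds in one century, assuming a 365.25-day year (an assumption of this sketch).
SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 3600


def exhaustive_test_time_centuries(num_inputs, tests_per_second=1e9):
    """Centuries needed to apply all 2^N input patterns at the given rate."""
    patterns = 2 ** num_inputs
    seconds = patterns / tests_per_second
    return seconds / SECONDS_PER_CENTURY


print(2 ** 3)                                          # patterns for a 3-input NAND
print(f"{2 ** 100:.2e}")                               # patterns for a 100-input circuit
print(f"{exhaustive_test_time_centuries(100):.2e}")    # centuries at 1 GHz
```

With the full 1.27 x 10^30 pattern count, the result is on the order of 4 x 10^11 centuries, consistent with the "400 billion centuries" figure.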



IC Testing in Practice


For high-end ICs

20-100 seconds of test time on very expensive production testers

Several thousand test patterns applied

Test patterns chosen to detect likely faults

High economic impact
- total test costs are approaching total manufacturing costs


Despite the costs, testing is imperfect





How well must we test?


Approximate order-of-magnitude estimates



Number of parts per typical system: 100


Acceptable system defect rate: 1% (1 per 100)


Therefore, required part reliability


1 defect in 10,000


100 Defects Per Million (100 DPM)



Requirement ~100 DPM for commercial ICs



~500 DPM for ASICs
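
A minimal sketch of the order-of-magnitude estimate above, assuming independent part failures. The function names are hypothetical. The first function uses the slide's first-order approximation (system defect rate is roughly the part count times the per-part defect rate); the second checks it against the exact expression.

```python
def required_part_dpm(parts_per_system, system_defect_rate):
    """Per-part defect budget in DPM, using the first-order approximation."""
    per_part_rate = system_defect_rate / parts_per_system
    return per_part_rate * 1_000_000


def system_defect_rate_exact(parts_per_system, part_dpm):
    """Probability that at least one part in the system is defective."""
    p = part_dpm / 1_000_000
    return 1 - (1 - p) ** parts_per_system


print(required_part_dpm(100, 0.01))          # part budget for a 1% system defect rate
print(system_defect_rate_exact(100, 100))    # slightly below 1% for 100 parts at 100 DPM
```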



How well must we test?


Assume 2 million ICs manufactured with 50% yield



1 million GOOD >> shipped


1 million BAD >> test escapes cause defective parts to be shipped

For 100 BAD parts in 1M shipped (DPM=100)

Test must detect 999,900 out of the 1,000,000 BAD

For 100 DPM: Needed Test Coverage = 99.99%
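
The coverage requirement can be worked backwards from the target DPM. The sketch below follows the slide's convention of counting escapes per million good parts shipped (the escapes themselves are ignored in the denominator). The function name is an assumption for illustration.

```python
def required_coverage(total_parts, yield_fraction, target_dpm):
    """Fraction of defective parts the test must catch to meet the target DPM."""
    good = total_parts * yield_fraction
    bad = total_parts - good
    # Escapes allowed, counted per million good parts shipped (slide's convention).
    allowed_escapes = target_dpm * good / 1_000_000
    return 1 - allowed_escapes / bad


coverage = required_coverage(2_000_000, 0.50, 100)
print(f"{coverage:.4%}")
```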


DPM Depends on incoming Yield

Test Coverage: 99.99% (Escapes 100 per million defective)


1 Million Parts @ 10% Yield


0.1 million GOOD >> shipped


0.9 million BAD >> 90 test escapes


DPM = 90 /0.1 = 900



1 Million Parts @ 90% Yield


0.9 million GOOD >> shipped


0.1 million BAD >> 10 test escapes


DPM = 10/0.9 = 11
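
The two cases above follow from one formula. This is a sketch with a hypothetical function name; like the slide, it expresses DPM as escapes per million good parts shipped.

```python
def shipped_dpm(yield_fraction, coverage, total_parts=1_000_000):
    """DPM of the shipped population for a given incoming yield and test coverage."""
    good = total_parts * yield_fraction
    bad = total_parts - good
    escapes = bad * (1 - coverage)       # defective parts that pass the test
    return escapes / good * 1_000_000    # per million good parts shipped


for y in (0.10, 0.50, 0.90):
    print(f"yield {y:.0%}: {shipped_dpm(y, 0.9999):.0f} DPM")
```

At a fixed 99.99% coverage, the shipped DPM falls from about 900 at 10% yield to about 11 at 90% yield.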


Defect Clustering on Wafers


Defects on semiconductor wafers are not uniformly distributed but are clustered

Local regions of low yield can give high DPM

Good die from “Bad Neighborhoods” must be more carefully tested to ensure no test escapes
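
One way to quantify a "bad neighborhood" is the local yield around a die. The sketch below assumes a rectangular wafer map and an 8-neighbor window; both are illustrative choices, not something the slides specify.

```python
def neighborhood_yield(wafer_map, row, col):
    """Fraction of passing die among the up-to-8 neighbors of (row, col).

    wafer_map is a list of rows; 1 = passing die, 0 = failing die,
    None = no die at that site. Returns None if there are no neighbors.
    """
    passed = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the die itself
            r, c = row + dr, col + dc
            if 0 <= r < len(wafer_map) and 0 <= c < len(wafer_map[r]):
                die = wafer_map[r][c]
                if die is not None:
                    total += 1
                    passed += die
    return passed / total if total else None


# Hypothetical wafer map with a defect cluster on the right-hand side.
wafer = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
]
print(neighborhood_yield(wafer, 2, 3))  # passing die surrounded by failures
print(neighborhood_yield(wafer, 1, 0))  # passing die in a clean region
```

A passing die with a low neighborhood yield is a candidate for the more careful testing described above.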

IC Reliability: Early Life Failures


Manufacturing defects cause ICs to fail the production test
- “Killer” defects
- failing parts discarded following testing

ICs also experience significant early life or “infant mortality” failures

Reliability problem

Infant mortality results from “Latent” defects
- manufacturing flaws undetectable at initial test

Defect Types



Killer Defect

Latent Defect

Resistive open due to unfilled via causing a TDF [Madge et al, IEEE D&T 2003]


A Resistive Via with an unfilled Void


Stress Testing



Infant mortality results from latent manufacturing flaws that are undetectable at initial wafer probe testing

Tested using accelerated life cycle or stress tests

Burn-in tests exercise circuits at elevated voltages and temperatures for a few hours up to a few days in temperature controlled burn-in “ovens”

Burn-in is Expensive

High end circuits have nanometer feature sizes and operate on low voltages

Stress voltages and temperatures must be carefully (individually) controlled to avoid damaging the circuits >> expensive ovens

Needed burn-in times are growing because voltage/temperature stress levels can only be marginally increased from the nominal

Most ICs today cannot afford Burn-in!








New Automotive Reliability Specs

Motivated by

Long warranties (~4 years)

High warranty repair costs ($1000)

Large number of parts per auto

High volumes (~30 million US sales)

Requirement: < 10 DPM

“Zero Defects” Quality


Focus: DPM Due to Latents



Assume Latent Defect Density = 1% of Killer Defect Density

Average No. (per die)
  Killer Defects (λ)       1.0      0.5      0.1      0.01
  Latent Defects           0.01     0.005    0.001    0.0001

Die Yield (e^-λ)           37%      60%      90%      99%
Probability of Latent      0.01     0.005    0.001    0.0001
PPM Latents                10,000   5,000    1,000    100




Poisson Yield Model



Assume Latent Defect Density = 1% of Killer Defect Density

Die Yield (e^-λ)           37%      60%      90%      99%
Probability of Latent      0.01     0.005    0.001    0.0001
PPM Latents                10,000   5,000    1,000    100

DPM from Latents:
  Digital                  10,000   5,000    1,000    100
  Analog (Latents 0.1%?)   1,000    500      100      10
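
The table entries can be reproduced from the Poisson model. In this sketch the function names are assumptions, and the probability of a latent defect is computed exactly as 1 - e^(-average latents per die); the slide rounds this to the average itself, which is nearly identical at these small values.

```python
import math


def poisson_yield(killer_defects_per_die):
    """Die yield: probability of zero killer defects under a Poisson model."""
    return math.exp(-killer_defects_per_die)


def latent_ppm(killer_defects_per_die, latent_ratio=0.01):
    """PPM of die carrying at least one latent defect."""
    latent_avg = latent_ratio * killer_defects_per_die
    return (1 - math.exp(-latent_avg)) * 1_000_000


for lam in (1.0, 0.5, 0.1, 0.01):
    print(f"λ={lam}: yield {poisson_yield(lam):.1%}, "
          f"digital {latent_ppm(lam):.0f} PPM, "
          f"analog {latent_ppm(lam, latent_ratio=0.001):.0f} PPM")
```

Even at 99% yield the digital case sits near 100 PPM, well above a 10 DPM automotive target, which is the motivation for screening.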



Analog Parts

Reliability Screening



Screening involves discarding parts on suspicion without proof of functional error

Excessive yield loss if parts from “bad neighborhoods” are completely discarded
* 100-1000X overkill because there is no test information from the die itself

“Outlier” detection methods can screen out potential reliability failures with less overkill
- but at higher test cost



Exploiting parameter correlation: “outlier” screening

Key Idea:

Analog circuit performance measures within die or between nearby die on the wafer should be correlated because of parameter matching

Any anomalies, even if within functional specifications, indicate a defect which could be a test escape and fail in the field, or result in a reliability problem

Exploiting parameter correlation for Reliability Screening

Application to
- Digital Delay Testing
- Analog Testing

Looks for an “outlier” electrical signature of a latent defect



Timing Tests



Two-pattern test vectors <V1 V2> cause a change at the outputs

Switching delay is the time from the application (launch) of V2 until change at the output

Worst case switching delay < clock period


Limitation of Testing for Timing Fails at Functional Clock Rate

Timing margins that allow for parameter variations, clock skew, and variations in test conditions can make “small” defects undetectable.

[Figure: critical path exercised by <V1 V2>, showing the timing margin]
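
A small sketch of why the timing margin hides small delay defects. All delay values below are hypothetical and chosen only to illustrate the pass/fail condition (worst case switching delay < clock period).

```python
def passes_at_speed(nominal_delay_ns, defect_delay_ns, clock_period_ns):
    """An at-speed test passes if the total path delay fits in the clock period."""
    return nominal_delay_ns + defect_delay_ns < clock_period_ns


# Hypothetical critical path: 8 ns nominal delay, 10 ns clock, so 2 ns of margin.
print(passes_at_speed(8.0, 1.5, 10.0))  # extra delay smaller than the margin
print(passes_at_speed(8.0, 2.5, 10.0))  # extra delay larger than the margin
```

Any defect-induced delay smaller than the margin still passes, so the defective part escapes the functional-rate test.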

minVDD Testing



minVDD Testing finds the lowest VDD for which the circuit passes a transition delay fault (TDF) test for a given clock speed

An abnormal minVDD value with respect to the expected value for the lot/neighborhood indicates a defect that may be a test escape or reliability failure
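
The slides do not say how "abnormal" is decided. One possible approach, shown here purely as an assumption, is a robust comparison against the neighborhood median using the median absolute deviation (MAD). The threshold of 6 and the die data are hypothetical.

```python
from statistics import median


def minvdd_outliers(minvdd_by_die, threshold=6.0):
    """Return IDs of die whose minVDD deviates strongly from the group median."""
    values = list(minvdd_by_die.values())
    center = median(values)
    # MAD scaled by 1.4826 approximates one standard deviation for normal data.
    mad = median([abs(v - center) for v in values])
    spread = 1.4826 * mad
    if spread == 0:
        return []  # no spread to compare against; nothing flagged in this sketch
    return [die for die, v in minvdd_by_die.items()
            if abs(v - center) / spread > threshold]


# Hypothetical minVDD readings (volts) for die in one wafer neighborhood.
neighborhood = {"d1": 0.80, "d2": 0.81, "d3": 0.79, "d4": 0.80,
                "d5": 0.82, "d6": 0.81, "d7": 0.95}
flagged = minvdd_outliers(neighborhood)
print(flagged)
```

Median-based statistics are used so that the outlier itself does not inflate the spread estimate it is compared against.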

minVDD Testing



minVDD is found by repeatedly running the test vectors at different VDD voltages and performing a binary search until the failing voltage is identified within desired accuracy

Since binary searches on full vector sets can be expensive, methods have been developed to work with reduced test sets.
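
The binary search described above can be sketched as follows. The `passes_at` callback stands in for running the test vectors on a tester at a given supply voltage; it, the voltage range, and the resolution are assumptions for illustration. The sketch also assumes pass/fail is monotonic in VDD.

```python
def find_min_vdd(passes_at, vdd_low, vdd_high, resolution=0.005):
    """Lowest passing VDD (within `resolution` volts) found by binary search.

    passes_at(vdd) -> bool runs the test set at the given supply voltage.
    Returns None if the part fails even at vdd_high.
    """
    if not passes_at(vdd_high):
        return None
    if passes_at(vdd_low):
        return vdd_low
    # Invariant: the part fails at vdd_low and passes at vdd_high.
    while vdd_high - vdd_low > resolution:
        mid = (vdd_low + vdd_high) / 2
        if passes_at(mid):
            vdd_high = mid
        else:
            vdd_low = mid
    return vdd_high


# Hypothetical die that passes the TDF test at or above 0.83 V.
min_vdd = find_min_vdd(lambda vdd: vdd >= 0.83, vdd_low=0.6, vdd_high=1.2)
print(f"{min_vdd:.3f} V")
```

Each iteration costs one full pass through the test set, which is why the reduced test sets mentioned above matter for test time.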


MinVDD Test


Reducing VDD slows the gates and increases circuit delay until the circuit fails at the rated clock

A delay defect can make it fail earlier, exposing hidden “latent” defects

[Figure: critical path exercised by <V1 V2>, showing the timing margin]

MinVDD vs Device Speed


Two different lots showing min VDD outliers and lot-to-lot intrinsic variation.

Minimum VDD results for different functional tests
clearly showing min VDD outliers (circled)

minVDD Testing


Applying Outlier Screening to
Analog Parts




“Outliers Screening with Multiple-Parameter Correlation Testing for Analogue ICs”


Liguan Fang, Mohammed Lemnawar and Yizi Xing


Automotive Business Line

Philips Semiconductors

Nijmegen, The Netherlands


European Test Symposium, Southampton May 2006


Outliers Screening with Multiple-Parameter Correlation Testing for Analogue ICs

Goal:


Achieve zero-defect product quality in automotive application through outlier screening

In Vehicle Network Communication IC (interface between the protocol controller and the physical bus)

How Multiple-Parameter Correlation Testing Works for Analog ICs

Specifications can have wide limits to allow for normal process-based parameter variations

However, parameters within the same device track

One test result can often be more tightly predicted based on others for the same part
- Two identical channels have matched gain
- Blocks using the same core amplifier layout should have measured gain reflecting the feedback ratios
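
To illustrate how one result can be "more tightly predicted" from another, the sketch below fits a least-squares line between two matched-channel gain measurements and compares the spread of the residual with the spread of the raw measurement. The data and function names are hypothetical; the slides do not state how the correlation was actually derived.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(xs, ys, slope, intercept):
    """Difference between each measured y and its prediction from x."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical gains (dB) of two identical channels on five parts.
gain_a = [18.2, 19.1, 20.0, 20.9, 21.8]
gain_b = [18.3, 19.0, 20.1, 20.8, 21.9]

slope, intercept = fit_line(gain_a, gain_b)
res = residuals(gain_a, gain_b, slope, intercept)
print(f"raw spread of channel B: {pstdev(gain_b):.3f} dB")
print(f"spread after prediction: {pstdev(res):.3f} dB")
```

Because process variation moves both channels together, the residual spread is far narrower than the raw spread, so limits on the residual can be much tighter than the specification limits.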


How Multiple-Parameter Correlation Testing Works

Block Diagram of the test Vehicle

In Vehicle Network Communication IC

Correlation in transmitter performance in two modes (A+B and A+C)

FA indicates 10% of field returns caused by particle defects at C4 and C5

Correlated Tests 540 and 550

Test 540: Blocks A and C active

Test 550: Blocks A and B active

New Test 555

Test 555 = Test 550 - 0.35 * Test 540 + 57.94

Mean value (from data) 0; standard deviation 2.2

6 Sigma Test limits for Test 555 = [-13.2, 13.2]
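
The derived test is a one-line calculation on two existing measurements. The sketch below uses the coefficients and standard deviation quoted on the slide; the function names and the example readings are hypothetical, and the units of Tests 540 and 550 are not given in the source.

```python
SIGMA_555 = 2.2            # standard deviation of Test 555, from the slide
LIMIT_555 = 6 * SIGMA_555  # 6-sigma limit, i.e. about 13.2


def compute_test_555(test_550, test_540):
    """Derived Test 555: zero-mean residual of Test 550 predicted from Test 540."""
    return test_550 - 0.35 * test_540 + 57.94


def is_outlier_555(test_550, test_540):
    """True if the derived value falls outside the 6-sigma limits."""
    return abs(compute_test_555(test_550, test_540)) > LIMIT_555


# Hypothetical readings: the first part tracks the correlation, the second does not.
print(is_outlier_555(12.06, 200.0))
print(is_outlier_555(30.00, 200.0))
```

Both parts could sit inside their individual specification limits; only the second is flagged because its two readings disagree with each other.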

Verification based on historic data logs

One 6-sigma New Test 555 outlier out of 160K devices tested and passed over 4 months

Test introduced in Production

Six screened parts in Batch A, mostly from expected wafer locations: near wafer edge or near other failed die

Defects observable in 4 out of 6 cases using only visual inspection with microscope

Defect Visualization


Reduction Trend in Customer Returns

Conclusions (Philips)

Virtually all the “Outliers” could be traced to physical defects, suggesting the potential for reliability failure and customer returns

Reduction in customer returns since introducing the test is strong evidence of its effectiveness

The correlation test needs only minimal post-test data calculation and no extra measurement

The extra test time was less than 1 ms

Overall a very low cost and effective approach

European Test Symposium, Southampton U.K.

May 2006


Adit D. Singh

Auburn University

Outlier Screening


What is screening?


Discarding some “suspect” die without conclusive
evidence that they will fail in operation


Based on:


Profiling: Bad Neighborhood


“company you keep”


Outliers: Behave differently in some way


“something doesn’t look right”




Basis for Screening


Profiling: “company you keep”



Outliers: “something doesn’t look right”


Airport Security “Screens” because exhaustive testing - “strip searching and x-raying” - of every airline passenger is cost prohibitive

Same cost trade-off!


What happens next?




Suspects are discarded
- (bumped off the flight)

Suspects are tested further

Test Optimization
- saves the high cost of exhaustive testing of every passenger

Adaptive testing
- Appropriate tests are applied depending on “what looks different”


Extreme cases!

More generally

Questions?