†: Corresponding Author
The 11th Asia Pacific Industrial Engineering and Management Systems Conference
The 14th Asia Pacific Regional Meeting of International Foundation for Production Research
Melaka, 7–10 December 2010
Constructing a Wafer Defect Diagnostic System by Integrating Yield
Prediction and Defect Pattern Recognition
Li-Chang Chao†
Graduate Institute of Industrial Management, Taiwan Shoufu University
No. 168, Nanshi Li, Madou Town, Tainan County 72153, Taiwan (R.O.C.)
Tel.: +886-6-5718888 ext. 854; fax: +886-6-5712473
Email: fredchao@dwu.edu.tw
Li-I Chao
Computer Center, Taiwan Shoufu University
No. 168, Nanshi Li, Madou Town, Tainan County 72153, Taiwan (R.O.C.)
Email: liichao@yahoo.com.tw
Abstract
Wafer yield is a highly effective measure of the process capability of integrated circuit manufacturers. The number of defects and the cluster intensity of defects on a wafer are two critical factors influencing wafer yield. As wafer sizes increase, defects cluster more strongly. Clustered defects cause the conventional Poisson yield model to underestimate actual wafer yield. The cluster parameter α of the negative binomial model can be very scattered and even negative when the model is applied to predict yield. Compound Poisson yield models are complicated. When regression methods are used to model yield, the goodness of fit must be considered. Obtaining a good back-propagation neural network predictor requires substantial effort to identify the network parameters. Thus, although some yield models consider the effects of defect clustering on yield prediction, these models have drawbacks. Furthermore, operators can find the possible causes of process variation by analyzing the defect pattern on a wafer. However, judging process variation manually is time-consuming, and erroneous judgments reduce the accuracy of variation detection. Although some methods for recognizing defect patterns on a wafer have been proposed, these methods still have flaws. This study presents a novel wafer defect diagnostic system that integrates a general regression neural network with a multi-class support vector machine to predict wafer yield and recognize the defect pattern on a wafer. A simulation study demonstrates the effectiveness of the proposed method.
Keywords: Defect, clustering phenomenon, yield model, general regression neural network, pattern recognition, support vector machines.
1. INTRODUCTION
As wafer sizes increase, defects cluster more strongly. Clustered defects cause the conventional Poisson yield model to underestimate actual wafer yield, as defects are no longer uniformly distributed over a wafer. Although some yield models consider the effects of defect clustering on yield prediction, these models have drawbacks. Furthermore, wafers are inspected during manufacturing to retrieve information about the defect number and the defect pattern, either by manual inspection or by automatic defect classification. However, human recognition of defect patterns can be time-consuming and inaccurate. Although some methods for recognizing defect patterns on a wafer have been proposed, these methods still have flaws.
Numerous mathematical models have been developed for predicting wafer yield in the last 40 years (Cunningham, J. A. 1990) (Stapper, C. H. 1991) (Stapper, C. H. & Rosner, R. J. 1995) (Tyagi, A. & Bayoumi, A. M. 1992). The Poisson model is the simplest model to use. However, Stapper (1985) reported that defects are typically clustered rather than dispersed randomly over a wafer. Clustered defects usually violate the independence assumption of the Poisson model. Under this scenario, numerous yield models obtain more accurate yield predictions than the Poisson
model. Compound Poisson yield models are complicated (Cunningham, J. A. 1990). The cluster parameter of the negative binomial model can be very scattered and even negative when the model is applied to predict yield (Cunningham, J. A. 1990). Dupret and Kielbasa (2004) used partial least squares (PLS) regression to model the yield from measurements obtained during production. However, PLS regression requires advanced statistical knowledge. Consequently, each of these mathematical yield models has particular problems in predicting wafer yield.
The statistical approach, the heuristic approach and the simulation approach are the fundamental approaches to solving pattern recognition problems (Nieddu, L. & Patrizi, G. 2000). The statistical approach classifies patterns by means of an underlying statistical model for generating them. The heuristic approach utilizes soft computing schemes to perform pattern recognition. However, the expensive evaluation processes required to reach optimal solutions must be overcome (Bhanu et al. 1995). The simulation approach leads to a class of artificial neural systems termed neural networks (Jain et al. 2000) (Nieddu, L. & Patrizi, G. 2000). However, the parameters of neural networks must be determined adequately (Fiesler, E. 1994).
Constructing a wafer yield model and constructing a wafer defect pattern recognizer are important issues in integrated circuit (IC) manufacturing. This study presents a novel wafer defect diagnostic system that integrates a general regression neural network (GRNN) with a multi-class support vector machine (SVM) to predict wafer yield and recognize the defect pattern on a wafer. A simulation study demonstrates the effectiveness of the proposed method.
2. RELATED LITERATURE
This section first introduces defect cluster indices, which capture the effects of defect clustering on yield prediction. Approaches to predicting wafer yield and approaches to solving pattern recognition problems are then surveyed.
2.1 Defect Cluster Index
Many cluster indices have been developed to describe the intensity of defects scattered on a wafer (Stapper, C. H. 1973) (Tyagi, A. & Bayoumi, A. M. 1992, 1994). The negative binomial yield model utilizes a cluster parameter to measure the intensity of defect clustering (Stapper, C. H. 1973). Tyagi and Bayoumi (1992, 1994) proposed a variance/mean ratio $V/M$ to evaluate the intensity of defect clustering. Jun et al. (1999) proposed a cluster index $CI$ to evaluate the intensity of defect clustering on a wafer. Chao and Tong (2009) proposed a cluster index $CI_E$ for depicting the varying intensity of clustered defects on a wafer.
2.2 Yield Models
The Poisson yield model is based on the Poisson distribution (Ferris-Prabhu, A. V. 1992). It is sufficiently effective for small chip sizes but tends to underestimate yields for larger chip sizes (Cunningham, J. A. 1990). To capture the clustering properties of defects in the yield model, some spatial distributions, including compound Poisson distributions, have been considered (Raghavachari, M. et al. 1997). The compound Poisson yield model replaces the defect density, which is assumed to be constant in the Poisson yield model, with a probability density function. The negative binomial yield model, which is widely applied, employs a gamma distribution for the defect density (Okabe, T. et al. 1972) (Stapper, C. H. 1973). The negative binomial model has been shown to be a powerful prediction model in IC manufacturing. However, reports also show that the cluster parameter in the negative binomial model can be very scattered and even negative when the model is used to predict yield (Cunningham, J. A. 1990). Other yield models used in various companies are summarized in the literature (Stapper, C. H. & Rosner, R. J. 1995).
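As a rough illustration of why clustering matters, the following sketch compares the commonly quoted closed forms of the two models: Y = exp(-A·D) for the Poisson model and Y = (1 + A·D/α)^(-α) for the negative binomial model. These closed forms and the function names are assumptions introduced here for illustration; they are the standard textbook expressions and are not quoted from this paper.

```python
import math


def poisson_yield(defect_density: float, chip_area: float) -> float:
    """Poisson yield: probability that a chip of the given area has no defect."""
    return math.exp(-defect_density * chip_area)


def negative_binomial_yield(defect_density: float, chip_area: float, alpha: float) -> float:
    """Negative binomial yield; a small alpha means strongly clustered defects."""
    if alpha <= 0:
        raise ValueError("alpha must be positive for this closed form")
    return (1.0 + defect_density * chip_area / alpha) ** (-alpha)


# Illustrative values only: 0.5 defects per cm^2 on a 1 cm^2 chip.
D, A = 0.5, 1.0
for alpha in (0.5, 1.0, 5.0, 1e6):
    print(alpha, poisson_yield(D, A), negative_binomial_yield(D, A, alpha))
```

For any positive α the negative binomial value is at least as large as the Poisson value, and the two coincide as α grows, which matches the statement that the Poisson model underestimates yield when defects cluster.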
2.2.1 General Regression Neural Network
GRNN is a three-layer network model (Specht, D. F. 1991). Input units are merely distribution units that forward the measurement variables to the pattern units in the second (hidden) layer. This hidden layer consists of one neuron for each pattern in the training set. The GRNN is essentially trained after one pass through the training patterns, and its activation function is normally an exponential function. The only parameter of GRNN is the smoothing factor, which influences the output value: higher smoothing factors produce smoother, more relaxed surface fits through the data. Unlike a conventional regression model, GRNN is defined through the joint continuous probability density function of the data, rather than through a specified functional form that must be determined in advance. The GRNN model utilizes a Parzen window (Parzen, E. 1962), which is a nonparametric approach to estimating the joint continuous probability density function.
GRNN measures how far a given sample pattern is from the patterns in the training set. When a new pattern is presented to the network, the input pattern is compared to all of the patterns in the training set to determine how far it is from those patterns. The output predicted by the network is a weighted combination of all of the outputs in the
training set. The contribution of each training output depends on how far the new pattern is from the corresponding pattern in the training set. GRNN uses an algorithm to find appropriate individual smoothing factors for each input as well as an overall smoothing factor. The algorithm proceeds in two parts. The first part trains the network with the data in the training set. The second part tests a whole range of smoothing factors to find the combination that works best on the test set.
The major differences between GRNN and other supervised neural networks are that GRNN can both predict continuous-valued outputs and categorize data, and that it requires fewer training parameters. Moreover, GRNN can be used for any regression problem in which a linearity assumption is violated, and it converges quickly to the optimal regression surface as the number of samples becomes large. The GRNN model is therefore used in this study to construct the wafer yield model.
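The distance-weighted prediction described above can be sketched in a few lines. This is a minimal illustration assuming a single overall smoothing factor `sigma` and a Gaussian weighting of squared Euclidean distance; it is not the NeuroShell 2 implementation used in the study, and the function and parameter names are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Predict the output for `query` as a kernel-weighted average of training outputs."""
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    weights = [math.exp(-d / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the nearest training pattern.
        return train_y[sq_dists.index(min(sq_dists))]
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Two training patterns with one input each (illustrative values only).
xs = [(0.0,), (1.0,)]
ys = [0.0, 1.0]
print(grnn_predict(xs, ys, (0.5,), sigma=0.3))    # midway between both patterns
print(grnn_predict(xs, ys, (0.0,), sigma=0.05))   # small sigma: follows the nearest pattern
print(grnn_predict(xs, ys, (0.0,), sigma=100.0))  # large sigma: approaches the mean output
```

The three calls show the role of the smoothing factor: a small value reproduces the nearest training output, while a large value flattens the fit toward the average of all outputs.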
2.3 Recognizing Defect Patterns
The techniques used for wafer defect pattern recognition fall into the statistical approach, the heuristic approach and the simulation approach (Nieddu, L. & Patrizi, G. 2000). The statistical approach can be viewed as determining a strategy for classifying samples based on measured feature vectors such that the classification error is minimized. The heuristic approach attempts to clarify the essential problem and uses available personal knowledge to solve it with the assistance of soft computing schemes, but it has many limitations (Bhanu et al. 1995) (Chen, C. L. & Chang, M. H. 1998). The simulation approach emulates the computational paradigm of a biological system: current knowledge of cerebral processes is transferred from a neuro-physiological medium to an electronic one. This leads to neural networks, which also have many drawbacks (Jain et al. 2000) (Nieddu, L. & Patrizi, G. 2000). SVMs have been widely used for pattern recognition in recent years. Several studies report that SVM classification is more accurate than existing classification algorithms (Hsu, C. W. & Lin, C. J. 2002) (Joachims, T. 1998). The multi-class SVM is therefore used in this study to recognize wafer defect patterns.
2.3.1 Support Vector Machines
The SVM technique was introduced by Vapnik (Cortes, C. & Vapnik, V. 1995). The original intent of the SVM algorithm was to use a linear separating hyperplane to build a classifier. Among all hyperplanes separating the data, there exists a unique optimal hyperplane distinguished by the maximum margin of separation between the hyperplane and the closest training points. If the training set of instance-label pairs is not linearly separable, the linear SVM may not work well. A non-linear kernel can then be used to solve the classification problem. The most commonly applied non-linear kernels are the polynomial kernel, the Gaussian kernel and the sigmoid kernel. The classification problem can obtain reasonable results when the Gaussian kernel is applied to map samples into a higher dimensional space (Keerthi, S. S. & Lin, C. J. 2003).
The classification problem mentioned above refers to binary classification. Many real-world problems, however, have more than two classes. A multi-class SVM can be employed to solve classification problems that have more than two classes. Many methods have been developed for multi-class SVM, such as the one-against-all method (Bottou et al. 1994), the one-against-one method (Kreßel, U. 1999), the Directed Acyclic Graph method (Platt et al. 2000) and methods that consider all data at once (Vapnik, V. 1998).
This study utilizes a multi-class SVM for wafer defect pattern recognition. Because the training time of the one-against-one method is the shortest of these methods (Hsu, C. W. & Lin, C. J. 2002), this method is used for wafer defect pattern recognition in this study.
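The one-against-one decomposition can be illustrated independently of the binary learner: one classifier is trained for every pair of classes, and a new sample is assigned to the class that collects the most pairwise votes. In the sketch below a nearest-centroid rule stands in for the binary SVM, purely so that the example is self-contained; the function names, the tie-breaking rule and the toy data are assumptions and do not reproduce LIBSVM.

```python
from collections import Counter
from itertools import combinations


def train_centroid_pair(samples, labels, a, b):
    """Hypothetical stand-in for a binary SVM: nearest-centroid rule for classes a and b."""
    def centroid(cls):
        pts = [x for x, y in zip(samples, labels) if y == cls]
        return [sum(col) / len(pts) for col in zip(*pts)]

    ca, cb = centroid(a), centroid(b)

    def decide(x):
        da = sum((u - v) ** 2 for u, v in zip(x, ca))
        db = sum((u - v) ** 2 for u, v in zip(x, cb))
        return a if da <= db else b

    return decide


def train_one_against_one(samples, labels, train_pair=train_centroid_pair):
    """Train one binary classifier per unordered pair of classes: k(k-1)/2 in total."""
    classes = sorted(set(labels))
    return [train_pair(samples, labels, a, b) for a, b in combinations(classes, 2)]


def predict_one_against_one(pairwise, x):
    """Each pairwise classifier casts one vote; ties go to the smallest label (an assumption)."""
    votes = Counter(clf(x) for clf in pairwise)
    top = max(votes.values())
    return min(c for c, v in votes.items() if v == top)


# Toy two-feature data with three pattern labels (illustrative values only).
X = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
y = ["random", "random", "edge", "edge", "bottom", "bottom"]
model = train_one_against_one(X, y)
print(len(model), predict_one_against_one(model, (0.2, 0.4)))
```

With the five defect patterns considered in this study, the same decomposition would train 5 × 4 / 2 = 10 pairwise classifiers.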
3. PROPOSED APPROACH
Wafers must be further analyzed to determine whether a specific defect pattern causes a medium or low yield. Therefore, the factors affecting yield are selected as the features for both the yield model and the pattern recognition in the wafer defect diagnostic system. This study constructs a wafer defect diagnostic system that integrates a GRNN with a multi-class SVM to predict wafer yield and recognize the defect pattern on a wafer.
3.1 Feature Selection
Yield models can be described as

$$Y = f(D, A, K) \qquad (1)$$

where $D$ represents the average number of defects per unit area and $K$ represents an empirical correction factor for the chip area $A$ (Cunningham, J. A. 1990). The average number of defects per unit area $D$ describes the intensity of the defect-dense areas on a wafer, so $D$ can be used as a feature factor for the wafer defect diagnostic system.
The defect number and the cluster intensity of defects on a wafer are two critical factors that may influence wafer yield. The angle variation $CV_A$ and the distance variation $CV_D$, obtained by measuring the variation in angle and in distance of the individual defects on a wafer, are also utilized as feature factors. $CV_A$ and $CV_D$ can be derived as follows:
Step 1: Determine the positive angle $\theta_i$, which is the
angle between the coordinates of the individual defect and the x-axis. The angle $\theta_i$ can be described as

$$\theta_i = \tan^{-1}\left(\frac{y_i}{x_i}\right), \quad i = 1, 2, \ldots, n \qquad (2)$$

where $x_i$ and $y_i$ denote the x and the y coordinates, respectively, of the $i$-th defect in the x-y plane. Sorting $\theta_i$ in ascending order yields $\theta_{(i)}$. A sequence of angle differences is defined as

$$A_i = \theta_{(i)} - \theta_{(i-1)}, \quad i = 1, 2, \ldots, n \qquad (3)$$

where $\theta_{(0)} = 0$.
Step 2: Determine $L_i$ as the distance between the individual defect and the origin of the coordinate axes. The distance $L_i$ can be described as

$$L_i = \sqrt{x_i^2 + y_i^2}, \quad i = 1, 2, \ldots, n \qquad (4)$$

Sorting $L_i$ in ascending order yields $L_{(i)}$. The sequence of distance differences is defined by

$$D_i = L_{(i)} - L_{(i-1)}, \quad i = 1, 2, \ldots, n \qquad (5)$$

where $L_{(0)} = 0$.
Step 3: The $CV_A$ and $CV_D$ are defined as

$$CV_A = \frac{S_A}{\bar{A}} \qquad (6)$$

$$CV_D = \frac{S_D}{\bar{D}} \qquad (7)$$

where $\bar{A}$ and $S_A^2$ denote the sample mean and the sample variance of $A_i$, respectively, and $\bar{D}$ and $S_D^2$ denote the sample mean and the sample variance of $D_i$, respectively. The variations of the angle differences and the distance differences are smaller when defects are randomly distributed than when defects are clustered. Whatever the cluster pattern, at least one of these two variations increases. Therefore, the wafer map presents some pattern of defect clusters whenever one of these differences possesses a large variation, and $CV_A$ and $CV_D$ can serve as feature factors for the wafer defect diagnostic system.
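Steps 1 to 3 can be sketched as follows. The sketch assumes that the wafer centre is the origin, that the "positive angle" is obtained by folding `atan2` into [0, 2π), and that at least two defects are present and not all located at the origin; these choices and the function names are assumptions made for illustration.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)


def angle_distance_cv(defects):
    """Return (CV_A, CV_D) for a list of (x, y) defect coordinates."""
    # Step 1: positive angles, sorted, then successive differences with theta_(0) = 0.
    thetas = sorted(math.atan2(y, x) % (2.0 * math.pi) for x, y in defects)
    angle_diffs = [t - prev for t, prev in zip(thetas, [0.0] + thetas[:-1])]
    # Step 2: distances to the origin, sorted, then successive differences with L_(0) = 0.
    radii = sorted(math.hypot(x, y) for x, y in defects)
    dist_diffs = [r - prev for r, prev in zip(radii, [0.0] + radii[:-1])]
    # Step 3: coefficient of variation of each difference sequence.
    return coefficient_of_variation(angle_diffs), coefficient_of_variation(dist_diffs)


# Defects spread evenly in angle and radius versus defects packed into one small area.
spread = [(i * math.cos(0.5 * i), i * math.sin(0.5 * i)) for i in range(1, 9)]
packed = [(5.0 + 0.01 * i, 5.0 + 0.01 * i) for i in range(8)]
print(angle_distance_cv(spread))
print(angle_distance_cv(packed))
```

For the evenly spread defects both values are close to zero, while the packed defects give large values, which is the behaviour the feature factors are meant to capture.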
Moreover, the cluster index $CI_E$ is utilized as a feature factor and can be described as:

$$CI_E(p_1, p_2, \ldots, p_s) = \sum_{i=1}^{s} \left( p_i \log_2\left(\frac{1}{p_i}\right) \right) \qquad (8)$$

where $s$ represents the number of defect clusters and $p_i$ represents the proportion of defects in the $i$-th cluster to the total number of wafer defects. The more pronounced the clustering phenomenon, the larger the $CI_E$. The $CI_E$ thus has the advantage of accurately detecting the intensity of defect clustering and is employed as a feature factor for the wafer defect diagnostic system.
3.2 Constructing Diagnostic System
A major cause affecting yield is the degree to which defects are clustered (Friedman et al. 1997) (Stapper et al. 1983). In addition to the random pattern, common wafer defect clustering patterns include the bull’s eye pattern, the crescent moon pattern, the bottom pattern and the edge pattern (Friedman et al. 1997).
Four feature factors ($D$, $CV_A$, $CV_D$ and $CI_E$) are suggested for the wafer defect diagnostic system. The GRNN predicts wafer yield by employing these four feature factors as inputs and the actual yield of the wafer as the only output of the GRNN yield model. The percentage of chips without defects on a wafer is used as the actual yield value of the wafer. Then, a multi-class SVM classifies wafer defect patterns by employing the same four feature factors as inputs and one of five defect patterns as output. The relationships among these feature factors, yields and defect patterns can be constructed by presenting adequate training and testing samples to the wafer defect diagnostic system.
The proposed approach for the wafer defect diagnostic system can be described as follows:
Step 1: Obtain the simulated defect wafer maps. Use the Borland Delphi programming language to simulate all possible defect clustering patterns for 8-inch wafers.
Step 2: Calculate the values of all feature factors for each wafer. For each defect clustering pattern on a wafer, calculate the four feature factors ($D$, $CV_A$, $CV_D$ and $CI_E$).
Step 3: Build a GRNN yield model. Input the feature factors from Step 2 into the GRNN yield model. The actual yield of the wafer is the only output of the GRNN yield model. The percentage of chips without defects on a wafer is used as the actual yield value of the wafer. In this study, the neural network package NeuroShell 2 is employed to train and test the GRNN.
Step 4: Build a multi-class SVM classifier. The same four feature factors are suggested for recognizing defect patterns. A multi-class SVM classifies wafer defect patterns by employing these four feature factors as inputs and one of five defect patterns as output. In this study, LIBSVM is employed to train and test the multi-class SVM
(Chang, C. C. & Lin, C. J. 2004) (Hsu, C. W. & Lin, C. J. 2002).
Step 5: Diagnose the wafers with medium or low yield. Input the wafers predicted in Step 3 to have medium or low yield into the multi-class SVM trained in Step 4. These wafers can then be further analyzed to determine whether a specific defect pattern causes the medium or low yield.
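The target value used in Step 3, the percentage of chips without defects, can be computed from a defect map once a chip layout is fixed. The sketch below assumes square chips on a regular grid centred on the wafer and counts only chips that lie entirely inside the wafer; the wafer radius, the chip size and the function name are hypothetical, since the source does not state the layout used.

```python
import math


def actual_yield(defects, wafer_radius=10.0, chip_size=1.0):
    """Fraction of whole chips on the wafer that contain no defect."""
    n = int(math.floor(wafer_radius / chip_size))
    chips = set()
    for i in range(-n, n):
        for j in range(-n, n):
            # A chip is kept only if all four of its corners lie on the wafer.
            corners = [((i + di) * chip_size, (j + dj) * chip_size)
                       for di in (0, 1) for dj in (0, 1)]
            if all(math.hypot(cx, cy) <= wafer_radius for cx, cy in corners):
                chips.add((i, j))
    # Map each defect to the chip that contains it; defects off the grid are ignored.
    defective = set()
    for x, y in defects:
        cell = (math.floor(x / chip_size), math.floor(y / chip_size))
        if cell in chips:
            defective.add(cell)
    return 1.0 - len(defective) / len(chips)


print(actual_yield([(0.2, 0.2), (0.3, 0.4), (1.5, 0.5)]))
```

Two defects that fall on the same chip cost only one chip, which is why clustered defects lower the yield less than the same number of scattered defects.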
4. IMPLEMENTATION
4.1 Simulation Study
This study employs three design factors to simulate defect cluster patterns on 8-inch wafers: the defect number, the percentage of defects located in grey regions and the size of the grey regions. The defect number is the number of defects distributed over the entire wafer; five factor levels of 25, 50, 100, 200 and 300 defects are simulated. The grey regions represent the defect-dense areas on a wafer, and the percentage of defects located in them is the second factor; for the four clustering patterns, four factor levels of 80%, 85%, 90% and 95% are simulated, and the remaining defects are distributed randomly. Three sizes of grey regions are considered: 25, 49 and 81 cm$^2$.
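A single simulated wafer map under these three design factors might be generated as in the following sketch. The study itself used Borland Delphi; the Python code, the square shape and central position of the grey region, and the function names are assumptions for illustration only. The wafer radius follows from the 8-inch (20.32 cm) diameter.

```python
import math
import random

WAFER_RADIUS_CM = 10.16  # 8-inch wafer: 20.32 cm diameter


def uniform_on_wafer(rng):
    """Draw one point uniformly over the wafer disc by rejection sampling."""
    while True:
        x = rng.uniform(-WAFER_RADIUS_CM, WAFER_RADIUS_CM)
        y = rng.uniform(-WAFER_RADIUS_CM, WAFER_RADIUS_CM)
        if math.hypot(x, y) <= WAFER_RADIUS_CM:
            return x, y


def simulate_wafer(n_defects, grey_fraction, grey_area_cm2,
                   grey_center=(0.0, 0.0), seed=0):
    """Place a fraction of the defects in a square grey region and the rest at random."""
    rng = random.Random(seed)
    half = math.sqrt(grey_area_cm2) / 2.0
    n_grey = round(n_defects * grey_fraction)
    cx, cy = grey_center  # assumed to place the region at least partly on the wafer
    defects = []
    while len(defects) < n_grey:
        x = rng.uniform(cx - half, cx + half)
        y = rng.uniform(cy - half, cy + half)
        if math.hypot(x, y) <= WAFER_RADIUS_CM:
            defects.append((x, y))
    defects += [uniform_on_wafer(rng) for _ in range(n_defects - n_grey)]
    return defects


wafer_map = simulate_wafer(n_defects=100, grey_fraction=0.9, grey_area_cm2=49.0)
print(len(wafer_map))
```

Moving `grey_center` toward the wafer edge or bottom gives a crude stand-in for the other clustering patterns; the actual pattern geometry used in the study is not specified in the text.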
Each factor-level combination of the above three design factors is replicated five times, giving 1225 simulation trials, that is, 1225 simulated wafer maps. The 1225 simulated wafer maps were utilized as samples for constructing the wafer defect diagnostic system. The 1225 wafers were divided into two parts: 980 wafers were used to train the diagnostic system, and the remaining 245 wafers were used to test its accuracy.
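The total of 1225 trials is consistent with one reading of the design, which is an assumption here rather than something the text states explicitly: the four clustering patterns use all three design factors, whereas the random pattern depends only on the defect number. The arithmetic can be checked as follows.

```python
defect_levels = [25, 50, 100, 200, 300]
grey_percentages = [80, 85, 90, 95]
grey_sizes_cm2 = [25, 49, 81]
clustering_patterns = 4  # bull's eye, crescent moon, bottom, edge
replicates = 5

# Clustering patterns: every combination of the three design factors.
clustered_trials = (clustering_patterns * len(defect_levels)
                    * len(grey_percentages) * len(grey_sizes_cm2) * replicates)
# Random pattern (assumed): only the defect number varies.
random_trials = len(defect_levels) * replicates

total_trials = clustered_trials + random_trials
train_size, test_size = 980, 245
print(clustered_trials, random_trials, total_trials)
print(train_size / total_trials, test_size / total_trials)
```

Under this reading the 980/245 division corresponds to an 80%/20% split of the simulated wafers.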
The four feature factors ($D$, $CV_A$, $CV_D$ and $CI_E$) are obtained for each simulated wafer by simple calculation. These four feature factors are utilized as inputs, and the respective yield and defect pattern of the 1225 wafer maps are utilized as outputs for the proposed diagnostic system. The trained diagnostic system can then be used to determine whether a specific defect pattern causes a medium or low yield.
The software utilized in this study was the neural network package NeuroShell 2 for the GRNN and LIBSVM for the multi-class SVM (Hsu, C. W. & Lin, C. J. 2002). To obtain results that generalize, five-fold cross-validation was used to determine the optimal parameter combinations. In this study, the only parameter of the GRNN, that is, the smoothing factor, is set at 0.0722, and the penalty parameter and the kernel parameter for the multi-class SVM were 8192 and 0.125, respectively.
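The reported SVM parameters are both powers of two (8192 = 2^13 and 0.125 = 2^-3), which is what a search over an exponentially spaced grid would return. The source does not describe the search procedure, so the sketch below is only one plausible arrangement: the exponent ranges, the fold assignment and the `score_fn` callback are hypothetical and no SVM is actually trained here.

```python
import itertools


def power_of_two_grid(c_exponents=range(-5, 16, 2), gamma_exponents=range(-15, 4, 2)):
    """Candidate (penalty, kernel parameter) pairs on an exponentially spaced grid."""
    return [(2.0 ** c, 2.0 ** g)
            for c, g in itertools.product(c_exponents, gamma_exponents)]


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def grid_search(score_fn, n_samples, k=5):
    """Return (mean score, penalty, kernel parameter) of the best grid point.

    score_fn(penalty, gamma, train_idx, valid_idx) is a hypothetical callback that
    trains a classifier on train_idx and returns its accuracy on valid_idx.
    """
    best = None
    for penalty, gamma in power_of_two_grid():
        scores = [score_fn(penalty, gamma, tr, va)
                  for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / k
        if best is None or mean_score > best[0]:
            best = (mean_score, penalty, gamma)
    return best


print((8192.0, 0.125) in power_of_two_grid())
```

With 980 training wafers and five folds, each validation fold holds 196 wafers and each training fold 784.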
Ten extra wafer maps are simulated to show how the proposed diagnostic system performs on new wafers. Table 1 summarizes the attribute values for these 10 wafers. The GRNN predicted yield column shows that wafers 1, 2, 4, 6, 8 and 10 present a medium or low yield. These wafers must be further analyzed to determine whether a specific defect pattern causes the drop in yield. Table 2 shows the actual defect pattern and the respective pattern recognized by the multi-class SVM for these 6 wafers. Table 2 reveals that the proposed approach produces a good diagnosis for the wafers with medium or low yield.
5. CONCLUSION
This study presents a novel diagnostic system that utilizes a GRNN for predicting wafer yield and a multi-class SVM for recognizing wafer defect patterns. A simulated case is applied to demonstrate the effectiveness of the proposed model.
The merits of the proposed approach are summarized as follows:
1. The proposed method utilizes four relevant feature factors, $D$, $CV_A$, $CV_D$ and $CI_E$, as inputs for constructing the wafer defect diagnostic system. The diagnostic results show that the proposed method achieves accurate diagnoses.
2. The proposed system can be integrated with KLA inspection machines to diagnose wafers presenting medium or low yield.
Table 1: Attribute values for these extra 10 wafers
Table 2: Actual defect pattern and the multi-class SVM recognized pattern for these 6 wafers
ACKNOWLEDGMENT
The authors would like to thank the National Science Council of the Republic of China, Taiwan, for financially supporting this research under Contract No. NSC 99-2218-E-434-001-.
REFERENCES
Bhanu, B., Lee, S. & Ming, J. (1995). Adaptive image segmentation using a genetic algorithm. IEEE Transactions on Systems Man Cybernetics, 25(12), 1543–1567.
Bottou, L., Cortes, C., Denker, J., Drucker, H., Guyon, I., Jackel, L., LeCun, Y., Muller, U., Sackinger, E., Simard, P. & Vapnik, V. (1994). Comparison of classifier methods: A case study in handwritten digit recognition. Proceedings of the International Conference on Pattern Recognition. Los Alamitos, CA: IEEE Computer Society Press.
Chang, C. C. & Lin, C. J. (2004). LIBSVM: A library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
Chao, L. C. & Tong, L. I. (2009). Wafer defect pattern recognition by multi-class support vector machines by using a novel defect cluster index. Expert Systems with Applications, 36(6), 10158–10167.
Chen, C. L. & Chang, M. H. (1998). Optimal design of fuzzy sliding-mode control: A comparative study. Fuzzy Sets and Systems, 93(1), 37–48.
Cortes, C. & Vapnik, V. (1995). Support vector networks. Machine Learning, 20(3), 273–297.
Cunningham, J. A. (1990). The use and evaluation of yield models in integrated circuit manufacturing. IEEE Trans. on Semiconductor Manufacturing, 3(2), 60–71.
Dupret, Y. & Kielbasa, R. (2004). Modeling semiconductor manufacturing yield by test data and partial least squares. Proceedings of 16th International Conference on Microelectronics (pp. 404–407). France.
Ferris-Prabhu, A. V. (1992). Introduction to semiconductor device yield modeling. Boston: Artech House.
Fiesler, E. (1994). Comparative bibliography of ontogenic neural networks. Proceedings of the International Conference on Artificial Neural Networks (pp. 793–796). Sorrento, Italy.
Friedman, D. J., Hansen, M. H., Nair, V. N. & James, D. A. (1997). Model-free estimation of defect clustering in integrated circuit fabrication. IEEE Transactions on Semiconductor Manufacturing, 10(3), 344–359.
Hsu, C. W. & Lin, C. J. (2002). A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13(2), 415–425.
Jain, A. K., Duin, R. P. W. & Mao, J. (2000). Statistical pattern recognition: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1), 4–37.
Joachims, T. (1998). Text categorization with support vector machines: learning with many relevant features. Proceedings of ECML-98, 10th European Conference on Machine Learning (pp. 137–142).
Jun, C. H., Hong, Y., Kim, S. Y., Park, K. S. & Park, H. (1999). A simulation-based semiconductor chip yield model incorporating a new defect cluster index. Microelectronics Reliability, 39(4), 451–456.
Keerthi, S. S. & Lin, C. J. (2003). Asymptotic behaviors of support vector machines with Gaussian kernel. Neural Computation, 15(7), 1667–1689.
Kreßel, U. (1999). Pairwise classification and support vector machines. Advances in kernel methods: Support Vector Learning (pp. 255–268). Cambridge, MA: MIT Press.
Nieddu, L. & Patrizi, G. (2000). Formal methods in pattern recognition. European Journal of Operational Research, 120(3), 459–495.
Okabe, T., Nagata, M. & Shimada, S. (1972). Analysis of yield of integrated circuits and a new expression of the yield. Electrical Engineering in Japan, 92(12), 135–141.
Parzen, E. (1962). On estimation of a probability density function and mode. The Annals of Mathematical Statistics, 33(3), 1065–1076.
Platt, J. C., Cristianini, N. & Shawe-Taylor, J. (2000). Large margin DAGs for multiclass classification. In: Advances in neural information processing systems (pp. 547–553). Cambridge, MA: MIT Press.
Raghavachari, M., Srinivasan, A. & Sullo, P. (1997). Poisson mixture yield models for integrated circuits: A critical review. Microelectronics Reliability, 37(4), 565–580.
Specht, D. F. (1991). A general regression neural network. IEEE Trans. Neural Networks, 2(6), 568–576.
Stapper, C. H., Armstrong, F. M. & Saji, K. (1983). Integrated circuit yield statistics. Proceedings of the IEEE, 71(4), 453–470.
Stapper, C. H. (1973). Defect density distribution for LSI yield calculations. IEEE Transaction on Electron Devices (Correspondence), 20(7), 655–657.
Stapper, C. H. (1991). On Murphy’s yield integral. IEEE Trans. on Semiconductor Manufacturing, 4(4), 294–297.
Stapper, C. H. & Rosner, R. J. (1995). Integrated circuit yield management and yield analysis: Development and implementation. IEEE Transactions on Semiconductor Manufacturing, 8(2), 95–102.
Stapper, C. H. (1985). The effects of wafer to wafer defect density variations on integrated circuit defect and fault distributions. IBM Journal of Research Development, 29(1), 87–97.
Tyagi, A. & Bayoumi, A. M. (1992). Defect clustering viewed through generalized Poisson distribution. IEEE Trans. on Semiconductor Manufacturing, 5(3), 196–206.
Vapnik, V. (1998). Statistical learning theory. New York: Wiley.
AUTHOR BIOGRAPHIES
Li-Chang Chao is an Assistant Professor in the Department of Industrial Management, Diwan University. He received a Doctoral Degree from the Department of Industrial Engineering and Management at National Chiao Tung University, Taiwan (R.O.C.) in 2009. His teaching and research interests include operations research, quality management and data mining. His email address is <fredchao@dwu.edu.tw>
Li-I Chao is a Lecturer in the Computer Center, Diwan University. He received a Master's Degree from the Department of Information Management at National Kaohsiung University of Applied Sciences, Taiwan (R.O.C.) in 2008. His teaching and research interests include data mining and knowledge management. His email address is <liichao@yahoo.com.tw>