Using Data Mining Technique to Improve the Manufacturing Yield of LCD Industry
Ruey-Shun Chen
Department of Information Management, China University of Technology, Hsinchu, Taiwan
rschen@cute.edu.tw

Y.C. Chen
Department of Information Engineering, National Cheng Kung University, Tainan, Taiwan
Dvd000001@gmail.com

C.C. Chen
Department of Information Management, Tung Hai University, Taichung, Taiwan
emily@thu.edu.tw
Abstract

This paper applies data mining techniques to the TFT-LCD industry. First, the Information Gain of each attribute is computed. The data mining engine is then used to further analyze the attribute with the largest value. The engine is built on an Association model and uses a multidimensional Cube to store information such as the machines and the product quantity that pass through each station. Once the data is uploaded to the Data Warehouse, the next step is to find the association rules, generated by the algorithm, between different stations and machines and to compute their confidence levels. This paper proposes an effective data mining methodology for the Array manufacturing process. The results show that the yield increased, the process time decreased, and the stability of the manufacturing process improved substantially.
Keywords: Data mining, LCD, manufacturing process, Association rule.
I. Introduction
The LCD industry has been regarded as the “second semiconductor industry” in Taiwan. Jin Ye Electron Inc. first introduced TN-LCD technology from the United States and started the first TN-LCD production line in Taiwan. Subsequently, in 1997, CPT obtained LCD production know-how from ADI, a company invested in by the Japanese Mitsubishi Dynamo group. Two years later, seven production lines had been set up. The TFT-LCD industry is expected to succeed the semiconductor industry as the next gem in Taiwan.
Improving yield by statistical and experimental methods, such as yield rate models or yield rate simulation, requires very precise statistical techniques. Unfortunately, when it comes to complex interactions and non-linear attributes, these traditional analysis methods are of limited use. Moreover, it is impractical to manually retrieve useful data for decision making from the manufacturing database, since hundreds of attributes are usually needed to establish system behavior patterns. Data mining techniques can therefore be used to analyze the large volume of raw data in the database as the basis for improving the yield rate.
The purpose of the present investigation is to put forward a valid data mining methodology for Array production. By identifying faults and improving the yield, enterprises can reduce production cost, achieve higher order-meeting rates and increase their competitiveness.
II. LCD and Data Mining
A. LCD
LCD manufacturing requires processes similar to those for semiconductors, such as lithography, etching, ashing and ion doping. It also requires a class 1 to class 1,000 clean room environment similar to that for semiconductors. The difference is the number of steps needed to manufacture LCDs, which is usually below 40, compared with around 230 steps for 64M DRAMs.
The TFT array substrate is transformed from a bare sheet of glass into the nervous system of the display in a rigorous, intricate, highly sensitive process that includes a series of application and reduction steps involving chemicals, gases and heat. These steps can be repeated five to ten times, depending upon the design of the process. Frequently, the machines performing these steps are arranged in a cluster-tool configuration to improve efficiency between process steps, which have varying process times and maintenance schedules. Another popular equipment configuration is known as in-line.
B. Data Mining Task
Most data mining goals fall into the following main categories [2], [4], [6]:
a. Data Processing: Depending on the goals and requirements of the KDD process, analysts may select, filter, aggregate, sample, clean and/or transform data.
b. Prediction: Given a data item and a predictive model, predict the value for a specific attribute of the data item.
c. Regression: Given a set of data items, analyze the dependency of some attribute values upon the values of other attributes in the same item, and automatically produce a model that can predict these attribute values for new data.
d. Associations: Given a set of data items, identify relationships between attributes and items, such as the presence of one pattern implying the presence of another pattern.
e. Classification: Given a set of predefined categorical classes, determine to which of these classes a data item belongs.
f. Clustering: Given a set of data items, partition this set into a set of classes such that items with similar characteristics are grouped together. Clustering is best used for finding groups of items that are similar.
C. Data Mining Methodology
A variety of techniques are available to achieve the above goals. The most commonly used techniques can be categorized into the following groups [2], [3], [4], [5], [6]:
a. Statistical Methods: Historically, statistical work has focused mainly on testing preconceived hypotheses and on fitting models to data.
b. Decision Trees: A decision tree is a tree where each non-terminal node represents a test or decision on the considered data item. Depending on the outcome of the test, one chooses a certain branch.
c. Case-Based Reasoning: Case-based reasoning (CBR) is a technology that tries to solve a given problem by making direct use of past experiences and solutions.
d. Rough Sets: A rough set is defined by a lower and an upper bound of a set. Every member of the lower bound is a certain member of the set. Every non-member of the upper bound is a certain non-member of the set.
e. Neural Networks: Neural networks (NN) are a class of systems modeled after the human brain. As in the human brain, the strength of neuron interconnections may change in response to a presented stimulus or an obtained output, which enables the network to “learn”.
f. Bayesian Networks: Bayesian belief networks (BBNs) are graphical representations of probability distributions, derived from co-occurrence counts in the set of data items. Specifically, a BBN is a directed, acyclic graph, where the nodes represent attribute variables and the edges represent probabilistic dependencies between the attribute variables.
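The abstract states that the Information Gain of each attribute is computed before further analysis, but the paper does not give the formula. As a minimal sketch, assuming the standard entropy-based definition used for attribute selection in decision trees, the computation could look as follows. The record layout, the attribute names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(records, attribute, target):
    """Entropy of `target` minus its expected entropy after splitting on `attribute`."""
    base = entropy([r[target] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy records, for illustration only.
records = [
    {"machine_group": "PVD", "mask": 1, "low_yield": True},
    {"machine_group": "PVD", "mask": 2, "low_yield": True},
    {"machine_group": "CVD", "mask": 1, "low_yield": False},
    {"machine_group": "CVD", "mask": 2, "low_yield": False},
]

gains = {a: information_gain(records, a, "low_yield") for a in ("machine_group", "mask")}
# The attribute with the largest gain is the one passed on for further analysis.
best_attribute = max(gains, key=gains.get)
print(gains, best_attribute)
```

In this toy data the machine group fully separates low-yield from normal records, so it has the larger gain and would be the attribute examined further.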
III. Algorithm
First, the data model should be defined, which begins with abstracting the data. The Array manufacturing process is divided into five masks. The machines are divided into seven groups (PVD, CVD, PHO, Wet Etch, Dry Etch, Strip, and Clean), along with other supplementary equipment. Panels in the line defect group are unacceptable to customers and have a profound impact on yield. Panels in the dot defect group, which affects product rank, are conditionally acceptable to customers and have a lighter impact on yield. The machines in each manufacturing step strongly influence the cause of defects. For example, suppose two line defects and one dot defect are found on one substrate after 5 manufacturing steps, and the causes of these defects come from 3 different steps associated with 3 machine groups. Since manufacturing machines in the same group can be used in different mask processes at the same time, the numbering of defect causes can be simplified as 1.n to represent each machine group; to further distinguish which process a machine belongs to, the value n can be redefined by mask. As the defect type is the value generated by inspection, all defect types and causes can be combined into a 3-dimensional structure. The machine ID from the history data then becomes the attribute value of the multidimensional structure.
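The 3-dimensional structure described above can be sketched as a mapping from (machine group, mask, defect type) to the machine IDs recorded for that cell. This is only an illustration under stated assumptions: the field names, the machine ID format and the sample rows are hypothetical, and the paper does not specify how its Cube is implemented.

```python
from collections import Counter, defaultdict

# The seven machine groups and five masks named in the text.
MACHINE_GROUPS = ("PVD", "CVD", "PHO", "Wet Etch", "Dry Etch", "Strip", "Clean")
MASKS = (1, 2, 3, 4, 5)


def build_cube(history):
    """Map (machine_group, mask, defect_type) -> Counter of machine IDs."""
    cube = defaultdict(Counter)
    for rec in history:
        if rec["machine_group"] not in MACHINE_GROUPS or rec["mask"] not in MASKS:
            raise ValueError(f"unknown cell coordinates in record: {rec}")
        cell = (rec["machine_group"], rec["mask"], rec["defect_type"])
        cube[cell][rec["machine_id"]] += 1
    return cube


# Hypothetical history rows mirroring the example in the text: two line
# defects and one dot defect on one substrate, traced to three different
# steps and three machine groups.
history = [
    {"machine_group": "PVD", "mask": 1, "defect_type": "line", "machine_id": "PVD-01"},
    {"machine_group": "PHO", "mask": 2, "defect_type": "line", "machine_id": "PHO-03"},
    {"machine_group": "Dry Etch", "mask": 3, "defect_type": "dot", "machine_id": "DRY-02"},
]
cube = build_cube(history)
line_defects = sum(sum(ids.values()) for (_, _, dtype), ids in cube.items() if dtype == "line")
print(len(cube), line_defects)
```

Keeping the machine ID as the cell value, rather than as a fourth axis, follows the statement that the machine ID becomes the attribute value of the structure.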
An association rule describes combinations of data properties and their statistical relationships. Its general form is X1, X2, …, Xn => Y, meaning that the combination of X1, X2, …, Xn is used to predict Y. The question discussed here is the relationship between the combination of machines in the array substrate production line of a TFT-LCD panel factory and its yield, which is an application of association rules. Support and confidence are defined as follows. Let A be an item, Ai, or a combination of items, Ai ∪ … ∪ Aj, where Ai, …, Aj are all items (or columns) from the data. The Support of A is the number of data transactions that include A divided by the total number of data transactions. The Confidence of A→B is the probability that B occurs given that A occurs.
Support(A→B) = P(A ∩ B)
Confidence(A→B) = P(B|A) = P(A ∩ B)/P(A)
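The two definitions above translate directly into counting over transactions. The sketch below assumes each transaction is represented as a set of item labels; the labels and the four sample transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """P(consequent | antecedent) = support(A and B) / support(A)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical transactions: the machine and mask a substrate passed
# through, plus a "low_yield" marker where applicable.
transactions = [
    {"PVD-01", "mask3", "low_yield"},
    {"PVD-01", "mask3", "low_yield"},
    {"PVD-01", "mask3"},
    {"CVD-02", "mask1"},
]

s = support(transactions, {"PVD-01", "mask3", "low_yield"})       # 2 of 4 transactions
c = confidence(transactions, {"PVD-01", "mask3"}, {"low_yield"})  # 2 of the 3 with A
print(s, c)
```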
Association algorithm procedures
The data mining engine is built on the Association model, and the Association rule uses a multidimensional Cube to store the serial numbers of machines and the quantity of array substrates passing each manufacturing station. The procedure of the data mining process in the Association algorithm is as follows.
Step 1: Original data is centrally stored in the data warehouse according to the data model defined by the Cube.
Step 2: One item is selected from machine group, mask type, or serial number as the variable and the rest are kept as constants to find the combination with the higher defect rate.
(i, j, k) represents the three variables. At this stage, under (0, j, k), (i, 0, k), (i, j, 0), the combination creating the highest defect rate is identified along with its Support and Confidence values.
Step 3: Two items are selected from machine group, mask type, or serial number as the variables and the rest are kept as constants to find the combination with the higher defect rate.
(i, j, k) represents the three variables. At this stage, under (i, 0, 0), (0, j, 0), (0, 0, k), the combination creating the highest defect rate is identified along with its Support and Confidence values.
Step 4: Machine group, mask type, and serial number are all used as variables to find the combination with the higher defect rate.
Step 5: The Support and Confidence values from
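Steps 2 to 4 above can be read as grouping the records on one, two and then all three of the dimensions and ranking the resulting combinations. The following is a minimal sketch under that reading, not the authors' implementation: the field names, the ranking key and the sample records are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("machine_group", "mask", "serial")


def rank_combinations(records, size):
    """Group records on `size` dimensions and rank groups by low-yield confidence."""
    results = []
    for dims in combinations(DIMENSIONS, size):
        groups = defaultdict(lambda: [0, 0])  # key -> [records, low-yield records]
        for rec in records:
            key = tuple(rec[d] for d in dims)
            groups[key][0] += 1
            groups[key][1] += rec["low_yield"]
        for key, (n, low) in groups.items():
            results.append({
                "dims": dims,
                "values": key,
                "support": low / len(records),  # P(combination and low yield)
                "confidence": low / n,          # P(low yield | combination)
            })
    return sorted(results, key=lambda r: (r["confidence"], r["support"]), reverse=True)


# Hypothetical records, for illustration only.
records = [
    {"machine_group": 1, "mask": 3, "serial": 1, "low_yield": True},
    {"machine_group": 1, "mask": 3, "serial": 1, "low_yield": True},
    {"machine_group": 1, "mask": 3, "serial": 2, "low_yield": False},
    {"machine_group": 2, "mask": 1, "serial": 1, "low_yield": False},
]

for size in (1, 2, 3):  # one, two, then all three dimensions varied
    top = rank_combinations(records, size)[0]
    print(size, top["dims"], top["values"], top["support"], top["confidence"])
```

Ranking by confidence first and support second is a design choice of this sketch; a production system would likely also apply minimum support and confidence thresholds.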
IV. Application and Analysis
Real case data is applied to the developed algorithm for discussion purposes. The encoded data was provided by LCD manufacturers.
A. Application of Algorithm: Using LCD as an example
The data sources are the CIM system information and defect analysis of LCD manufacturers. The information includes the production information of each station, for instance the yield, defect rate, serial numbers, defect type, defect causes, and production start and end times.
Step 1: Original data is stored in the data warehouse according to the data model defined by the Cube.
The original data is taken from the Data Warehouse after transformation and calculation. Data with yield below 90% is selected and analyzed together with the data for the S-Open cause in the line defect group.
Step 2: One item is selected from machine group, mask type, or serial number as the variable and the rest are kept as constants to find the combination with a higher defect rate.
Combinations with yield below 90% are identified. Among them, mask type 3 with the No. 1 machine in the first machine group shows the highest likelihood of causing a high defect rate.
Step 3: Two items are selected from machine group, mask type, or serial number as the variables and the rest are kept as constants to find the combination with a higher defect rate.
Support (machine group 3, mask type 3 with low yield) = 198/784 = 25.26%
Confidence (machine group 3, mask type 3 with low yield) = 198/215 = 92.09%
Step 4: Machine group, mask type, and serial number are selected as the variables to find the combination with a higher defect rate.
Step 5: The Support and Confidence from
The Support and Confidence listed above surpass those of other combinations. However, each machine group is indispensable in the five manufacturing procedures.
Step 6: Reports
For each defect analyzed, Support and Confidence are computed with the proposed data mining technique to obtain valid information.
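The Support and Confidence figures reported in Step 3 can be checked against the definitions Support(A→B) = P(A ∩ B) and Confidence(A→B) = P(A ∩ B)/P(A). Reading 784 as the total number of records and 215 as the number of records for machine group 3 with mask type 3 is an inference from those formulas, not something the paper states explicitly.

```python
total_records = 784       # denominator of the reported Support
antecedent_records = 215  # denominator of the reported Confidence
joint_records = 198       # numerator shared by both figures

support = joint_records / total_records
confidence = joint_records / antecedent_records
antecedent_share = antecedent_records / total_records  # P(A) under this reading

print(f"support    = {support:.2%}")
print(f"confidence = {confidence:.2%}")
print(f"P(A)       = {antecedent_share:.2%}")
```

Under this reading the two figures are consistent with each other, since the Support equals the Confidence multiplied by P(A).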
B. Benefit Analysis
(1) Improve the yield of the Array manufacturing process
As an example, the test number is the sum of all panels tested during a given week. The data is automatically recorded by the testing facility. The number of good items is the number of qualified panels. Dividing the number of qualified panels by the total number of tested panels gives a yield of 90.7%. After the improvement was carried out, the yield increased from 90.7% to 95.3%. Based on these facts, it can be concluded that the CIM data mining system can effectively improve the yield, with an improvement margin of 4.6 percentage points.
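The yield calculation and the improvement margin can be written out as a short sketch. The two yield values are those reported in the text; the panel counts passed to `yield_rate` are hypothetical, since the paper does not list the weekly test numbers.

```python
def yield_rate(good_panels, tested_panels):
    """Yield = qualified panels divided by total tested panels."""
    if tested_panels <= 0:
        raise ValueError("tested_panels must be positive")
    return good_panels / tested_panels


# Hypothetical counts chosen only to reproduce the reported 90.7% yield.
example_yield = yield_rate(907, 1000)

before, after = 0.907, 0.953               # yields reported in the text
gain_points = (after - before) * 100       # absolute gain in percentage points
relative_gain = (after - before) / before  # gain relative to the starting yield

print(f"{example_yield:.1%} {gain_points:.1f} points {relative_gain:.2%}")
```

The margin is an absolute difference between the two yield rates, which corresponds to a relative gain of roughly 5% over the starting yield.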
V. Conclusion
This paper proposed recording the defects that occur during the manufacturing process and combining them with Array data. The research conclusions are as follows:
a. A data mining system is established for the LCD industry.
b. The Information Gain is calculated from attribute data. Further analysis of the highest value is conducted using the data mining engine.
c. The data mining system improves the yield by 4.6%.
This paper applied actual data to the data mining technique and conducted experiments to show that the proposed algorithm is feasible. The yield increased by 4.6%, and the overall manufacturing stability improved substantially. The system can actively monitor the manufacturing process and notify engineers via the Notes system or beepers to effectively improve the process and yield.
References
[1] Jiawei Han and Micheline Kamber, “Data Mining: Concepts and Techniques”, 2001.
[2] Ming-Syan Chen, Jiawei Han, and Philip S. Yu, “Data Mining: An Overview from a Database Perspective”, IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 6, pp. 866-883, 1996.
[3] S.S. Chen, P.Y. Hsu, and Y.L. Chen, “Mining Association Rules in Sequence Data”, Journal of Information Management, vol. 6, no. 2, pp. 167-182, 1999.
[4] R. Srikant and R. Agrawal, “Mining Quantitative Association Rules in Large Relational Tables”, ACM SIGMOD, pp. 1-12, 1996.
[5] Peter Cabena et al., “Discovering Data Mining: From Concept to Implementation”, Prentice Hall, 1998.
Figure 1. Yield Improvement Passage
[Figure: yield rate improvement trend by week, 6/16 to 9/15. Left axis: failure index (0.0 to 15.0); right axis: yield rate (70.0% to 100.0%). Legend: PRO, WET, PHO, CVD, PVD, yield rate, index. Annotated yield values: 90.7% and 95.3%. Data labels, in extraction order: 9.2, 8.5, 7.4, 6.1, 6.5, 7.5, 6.8, 6.0, 5.5, 5.8, 5.5, 4.9, 4.9, 4.7.]