ICIC Express Letters
ICIC International ⓒ2010 ISSN 1881-803X
Volume 4, Number 5, October 2010
pp. 1–6
Predicting of the Semiconductor Book-to-Bill Ratio by Using a Novel Genetic Algorithm based Support Vector Machine
Shih-Yu Chang¹, Chi-Yo Huang², Yu-Hsien Yang¹, Gwo-Hshiung Tzeng³ and Kuang-Hua Hu⁴
¹Department of Computer Science
National Tsing Hua University
No. 101, Kuang Fu Rd, Sec. 2, Hsinchu, Taiwan
shihyuch@cs.nthu.edu.tw
²Department of Industrial Education
National Taiwan Normal University
No. 162, Ho-Ping East Road I, Taipei, Taiwan
cyhuang66@ntnu.edu.tw
³Institute of Project Management
Kainan University
No. 1, Kainan Road, Luchu, Taiwan
ghtzeng@mail.knu.edu.tw
⁴Department of Finance
Ta Hwa Institute of Technology
No. 1, Ta-Hwa Road, Chiung-Lin, Hsin-Chu, Taiwan
khhu@thit.edu.tw
Received February 2010; accepted April 2010
ABSTRACT. The book-to-bill (BB) ratio is a demand-to-supply ratio of the number of orders booked to the number of orders filled. BB ratio predictions are therefore important to production, sales, marketing, and finance managers, as well as to investors, since accurate predictions can serve as the foundation for equipment ordering, integrated circuit (IC) product pricing, inventory control, debt planning, and investments. Despite this importance, few studies have tried to predict the BB ratio. To develop a forecast mechanism for BB ratio prediction, this study introduces a novel genetic algorithm (GA) based support vector machine (SVM). The kernel function and its parameters are optimized within the GA based SVM. The GA based SVM is more general because it depends less on experience. The semiconductor BB ratio was predicted from historical data between 1996 and 2010. In this study, the association between future realizations of BB ratio performance and current fluctuations in semiconductor industry information is used to assess the relative usefulness of these BB ratios. Based on the GA based SVM, how forecast volatility and forecast inflation lead to delayed tool deliveries can be demonstrated. Meanwhile, the prediction accuracy of the GA based SVM is higher than 80% for most time slots. In the future, the GA based SVM can be applied to other predictions of industry cycles or trends.
Keywords: Genetic algorithm (GA), support vector machine (SVM), semiconductor, book-to-bill ratio (BB ratio).
1. Introduction.
The book-to-bill (BB) ratio is a demand-to-supply ratio of the number of orders booked to the number of orders filled. It is an important indicator of sales in many high-tech industries, such as semiconductor manufacturing and printed circuit board (PCB) manufacturing. In the semiconductor industry, organizations such as Semiconductor Equipment and Materials International (SEMI), the Semiconductor Equipment Association of Japan (SEAJ), and Very Large-Scale Integrated Research (VLSI Research) provide reports on the BB ratio [1].
The Semiconductor Industry Association (SIA) employs Price Waterhouse LLP, a major international independent accounting firm, to collect data from firms, to investigate questionable data submitted by companies, and to calculate the semiconductor BB ratio. A BB ratio below 1 implies oversupply, while a value above 1 implies a shortage. For example, a book-to-bill ratio of 0.9 implies that $90 worth of new orders was received for every $100 of products billed in the given time slot.
The BB ratio is essential for both industry managers and investors. For managers in functional departments, including production, sales, marketing, and finance, the BB ratio can serve as the foundation for equipment ordering, integrated circuit (IC) product pricing, inventory control, debt planning, and investments. For investors, the BB ratio can serve as the basis for making appropriate investments.
Although the BB ratio is important to managers and investors, timeliness is usually sacrificed for relevance or reliability [2]. With most accounting information systems, for example, the earliest time a transaction can be recorded is the moment an order is placed. Revenue is subsequently recognized at the point of sale, and the associated earnings typically are not revealed until the end of the quarter. As a result, the earliest point at which users can access some sets of potentially value-relevant financial statement information is the quarterly report date [2]. An accurate prediction of the BB ratio is therefore important to the semiconductor industry, since it can serve as a general indicator of the future supply and demand situation.
However, few studies have tried to predict the BB ratio. The SVM, a machine learning algorithm based on statistical learning theory presented by Vapnik [6, 7], is a possible approach to predicting the highly fluctuating BB ratio: it uses a linear model to implement nonlinear class boundaries by nonlinearly mapping the input BB ratio vector into a high-dimensional feature space. How to select the parameters so as to optimize the prediction results, however, can still prevent researchers from precisely predicting the nonlinear BB ratio time series.
To resolve the above BB ratio prediction problems and help industry managers and investors make correct decisions, a novel genetic algorithm (GA) based support vector machine (SVM) is introduced to regress the BB ratio time series. Since parameter selection in the SVM significantly affects prediction accuracy, the GA is used to optimize the parameters. Rolling prediction of the historical nonlinear BB ratio time series from 1996 to 2010 is used to verify the feasibility of the GA based SVM algorithm. The results demonstrate that the GA-based SVM method can predict the fluctuation of BB ratios.
The remainder of this paper is organized as follows. In Section 2, the industrial background of the BB ratio is provided. Section 3 introduces the novel GA-based SVM, including the GA and the SVM algorithms. In Section 4, predictions of the nonlinear historical BB ratio time series are demonstrated. The empirical results and findings are discussed in Section 5. Finally, Section 6 concludes the article with observations, conclusions, and recommendations for further study.
2. Literature Review.
As in many customized capital goods industries, the semiconductor equipment supply chain faces an order fulfillment dilemma. On the one hand, buyers of equipment expect their suppliers to be responsive and to fulfill orders within a relatively short order lead time; on the other hand, the high value and customized nature of the products make it risky for the supplier to keep finished products or subsystems in inventory, leading to long and variable manufacturing lead times. To resolve this dilemma, the buyers (producers of microchips) provide their equipment suppliers with order forecasts for the next 24 months and longer.
Demand for semiconductor production equipment is triggered by the demand for chips, including microprocessors and memory chips. Given that the demand for chips is in turn generated by the demand for electronic devices (e.g., servers, personal computers, and cell phones), semiconductor equipment makers find themselves at the wrong end of the “bullwhip” [3]. They face business cycles that flood them with orders in one year and starve them for work in the next.
The large chip producers create market forecasts on a monthly or quarterly basis. These forecasts are used to project production capacity needs for the next 2 to 5 years. Forecasts and capacity plans are updated on a rolling horizon basis. Chip manufacturers combine these product-level demand forecasts with equipment output models to allocate forecasted capacity requirements to both existing and potentially new semiconductor fabs. If the forecasted capacity requirement is not supported by the size and productivity of the installed equipment base, additional equipment must be ordered. This projected need for additional equipment is shared with equipment suppliers in the form of soft orders, consistent with the principle of forecast sharing and collaborative planning.
Typically, chip manufacturers are unlikely to commit to purchasing equipment at the time of the first forecast. Over the next two years, the chip manufacturer will obtain new information about developments in the market for chips as well as about the effective capacity of the currently installed equipment base (based on production yields, throughput time, and machine uptime). As a result, the chip manufacturer may update the order and will usually delay making a firm order (i.e., issuing a purchase order) until about 3 to 6 months prior to the projected delivery date [4].
The book-to-bill (BB) ratio is a demand-to-supply ratio of the number of orders booked to the number of orders filled. The BB ratio is compiled monthly by Price Waterhouse LLP on behalf of the Semiconductor Industry Association (SIA), based on surveys of firms that manufacture semiconductors. The numerator of the BB ratio represents a seasonally adjusted three-month moving average of new orders received, while the denominator represents a seasonally adjusted three-month moving average of chips shipped. A BB ratio of 1.10 indicates, therefore, that $1.10 in new orders has been received for every $1 of chips shipped, which ordinarily would be interpreted as a positive signal regarding future industry sales levels.
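The ratio definition above can be illustrated with a minimal sketch. It assumes plain trailing three-month averages and omits the seasonal adjustment applied to the published index; the function names and the monthly figures are hypothetical and serve only as an illustration.

```python
def moving_average(values, window=3):
    """Trailing moving average; returns one value per full window."""
    if window <= 0 or len(values) < window:
        raise ValueError("need at least `window` observations")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def book_to_bill(bookings, billings, window=3):
    """BB ratio: averaged new orders divided by averaged shipments.

    Assumption: no seasonal adjustment is applied here, unlike the
    published index.
    """
    if len(bookings) != len(billings):
        raise ValueError("bookings and billings must cover the same months")
    booked = moving_average(bookings, window)
    billed = moving_average(billings, window)
    return [b / s for b, s in zip(booked, billed)]


# Hypothetical monthly figures (same currency unit for both series).
bookings = [110.0, 120.0, 100.0, 90.0]
billings = [100.0, 100.0, 100.0, 100.0]
ratios = book_to_bill(bookings, billings)
```

A value above 1 in `ratios` corresponds to more orders booked than billed over the averaging window, and a value below 1 to the reverse.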
The accounting firm collects data from a voluntary sample of companies on the fifth business day of the month. The SIA issues a press release containing the preliminary estimate of the index for the previous month between the ninth and the twelfth of each month. The news release is made from California and is picked up on the newswire. The release of the BB ratio was reported in the Wall Street Journal on the following day for all releases during 1995 and 1996, and for ten of the 12 releases in 1994. The Wall Street Journal typically reports the value of the index and the change from the previous month's index. In addition, comments are sometimes solicited from firms in response to the release of the index [2]. For example, a spokesman for Advanced Micro Devices responded to a decline in the index by stating that “the stock market murders all chip stock but the industry is fundamentally sound” (Wall Street Journal 1996a). The SIA does not typically include the name of the accounting firm in the BB press release.
The first release of the BB ratio is technically a preliminary figure that is frequently adjusted by a small amount in the following month. However, it is the preliminary figure that attracts the primary news coverage and that would be expected to convey the newest information to the market. The adjustments to the preliminary BB announcements were not found to be associated with stock returns and are not considered further in this study.
Chandra et al. [5] find a significant correlation between changes in the BB ratio and subsequent changes in quarterly earnings. They also find significant stock price movements on BB release dates. Our study is similar to that of Chandra et al. [5], and the results of both studies are generally consistent in finding that BB announcements provide information to investors. While Chandra et al. [5] focus on the impact of the BB release on the stock prices of a small sample of semiconductor manufacturers, our study examines the broader industry-wide information effects for firms in the semiconductor, semiconductor components, and technology areas.
3. SVM with GA.
In this section, the semiconductor BB ratio prediction mechanism, a GA-based SVM algorithm, is introduced. The GA is used to optimize the parameters of the SVM based forecast mechanism. Historical BB ratio data sets are then predicted accordingly.
3.1. SVM
The SVM is a machine learning algorithm based on statistical learning theory, presented by Vapnik [6,7]. The SVM uses a linear model to implement nonlinear class boundaries by nonlinearly mapping the input vector into a high-dimensional feature space. A linear model constructed in the new space can represent a nonlinear decision boundary in the original space. In the new space, an optimal separating hyperplane is constructed. Thus, the SVM is known as the algorithm that finds a special kind of linear model, the maximum margin hyperplane, which gives the maximum separation between the decision classes. The training examples that are closest to the maximum margin hyperplane are called support vectors. All other training examples are irrelevant for defining the binary class boundaries.
For the linearly separable case, a hyperplane separating the binary decision classes in the three-attribute case can be represented as the following equation:
$$y = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3, \qquad (1)$$
where $y$ is the outcome, $x_1, x_2, x_3$ are the attribute values, and there are four weights $w_0, \ldots, w_3$ to be learned by the learning algorithm. In Eq. (1), the weights $w_i$ are parameters that determine the hyperplane. The maximum margin hyperplane can be represented as the following equation in terms of the support vectors:
$$y = b + \sum_{i} \alpha_i y_i \, \mathbf{x}_i \cdot \mathbf{x}, \qquad (2)$$
where $y_i$ is the class value of the training example $\mathbf{x}_i$ and $\cdot$ represents the dot product. The vector $\mathbf{x}$ represents a test example and the vectors $\mathbf{x}_i$ are the support vectors. In this equation, $b$ and $\alpha_i$ are parameters that determine the hyperplane. From the implementation point of view, finding the support vectors and determining the parameters $b$ and $\alpha_i$ are equivalent to solving a linearly constrained quadratic programming problem.
As mentioned above, the SVM constructs a linear model to implement nonlinear class boundaries by transforming the inputs into a high-dimensional feature space. For the nonlinearly separable case, a high-dimensional version of Eq. (2) is simply represented as follows:
$$y = b + \sum_{i} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}). \qquad (3)$$
The function $K(\mathbf{x}_i, \mathbf{x})$ is defined as the kernel function. Any function that meets Mercer's condition can be used as the kernel function, such as the polynomial, sigmoid, and Gaussian radial basis function (RBF) kernels used in SVMs. In this work, the RBF kernel given by Eq. (4) is used:
$$K(\mathbf{x}_i, \mathbf{x}) = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}\|^2}{2\sigma^2}\right), \qquad (4)$$
where $\sigma^2$ denotes the variance of the Gaussian kernel. In addition, for the separable case, there is a lower bound of 0 on the coefficients $\alpha_i$ in Eq. (3); for the non-separable case, the SVM can be generalized by placing an upper bound $C$ on the coefficients $\alpha_i$. Therefore, the choice of $C$ and $\sigma^2$ of an SVM model is important to the accuracy of prediction.
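The kernel expansion described above can be sketched in a few lines. The sketch assumes the common Gaussian form $\exp(-\|\mathbf{x}_i - \mathbf{x}\|^2 / (2\sigma^2))$ for the RBF kernel; the function names and the toy support vector, coefficient, and bias values are hypothetical and are not taken from the study.

```python
import math


def rbf_kernel(x, z, sigma2):
    """Gaussian RBF kernel with variance sigma2 (assumed 2*sigma^2 form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma2))


def decision_value(x, support_vectors, labels, alphas, b, sigma2):
    """Kernel expansion b + sum_i alpha_i * y_i * K(x_i, x) for a test vector x."""
    return b + sum(
        alpha * y * rbf_kernel(sv, x, sigma2)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )


# Hypothetical toy model with a single support vector at the origin.
value = decision_value(
    x=[0.0, 0.0],
    support_vectors=[[0.0, 0.0]],
    labels=[1],
    alphas=[2.0],
    b=-1.0,
    sigma2=1.0,
)
```

A small `sigma2` makes the kernel decay quickly with distance, so each support vector influences only nearby points; a large value smooths the resulting function.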
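The learning algorithm for a nonlinear classifier SVM follows the design of an optimal separating hyperplane in a feature space. The procedure is the same as the one associated with hard and soft margin classifier SVMs in x-space. Accordingly, the dual Lagrangian in z-space is [8]
$$L_d(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j \mathbf{z}_i^T \mathbf{z}_j \qquad (5)$$
and, using the chosen kernels, the Lagrangian is maximized as follows:
$$L_d(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j K(\mathbf{x}_i, \mathbf{x}_j) \qquad (6)$$
subject to
$$\alpha_i \ge 0, \quad i = 1, \ldots, l, \qquad (7)$$
$$\sum_{i=1}^{l} \alpha_i y_i = 0, \qquad (8)$$
where $l$ is the number of training examples. Note that the constraints must be revised for use in a nonlinear soft margin classifier SVM. The only difference between these constraints and those of the separable nonlinear classifier is the upper bound $C$ on the Lagrange multipliers $\alpha_i$. Consequently, the constraints of the optimization problem become
$$C \ge \alpha_i \ge 0, \quad i = 1, \ldots, l, \qquad (9)$$
$$\sum_{i=1}^{l} \alpha_i y_i = 0. \qquad (10)$$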
3.2. Parameter Optimization
To design an effective SVM model, the values of the SVM parameters have to be chosen carefully in advance [9, 10]. These parameters include: (1) the regularization parameter $C$, which determines the tradeoff between minimizing the training error and minimizing the complexity of the model; (2) the parameter sigma ($\sigma$ or $\sigma^2$) of the kernel function, which defines the nonlinear mapping from the input space to some high-dimensional feature space (only the Gaussian kernel is considered in this research, and the variance of the kernel function is $\sigma^2$); and (3) the kernel function used in the SVM, which constructs a nonlinear decision hypersurface in the input space [8].
To solve this SVM design problem, Lin [10] provided a systematic method for selecting SVM parameters. Lin's approach to selecting the parameters of support vector regression was based on applying the concept of sampling theory to the Gaussian filter. Min and Lee [11] also proposed a grid-search technique using 5-fold cross validation to find the optimal parameter values of the SVM kernel function.
In contrast to the abovementioned methods of SVM parameter optimization, this research develops a novel GA based method, the GA based SVM, for optimizing the two SVM parameters ($C$ and $\sigma^2$) simultaneously. The first parameter, $C$, determines the tradeoff between fitting error minimization and model complexity. The second parameter, $\sigma^2$, is the bandwidth of the radial basis function (RBF) kernel.
3.3. GA
A GA is based on the evolutionary process of animals, in which the propagation of desired traits happens by natural selection. In each generation, the best traits are combined to produce offspring that are better than their parents. In this sense, it is a greedy algorithm.
Since structured methods for efficiently selecting the parameters are lacking, the GA is used in the proposed SVM model to optimize parameter selection. To establish a GA-based feature selection and parameter optimization system, the following main steps (as shown in Figure 1) must be carried out. The GA procedure is explained in detail below.
A. Chromosome representation
The two SVM parameters, $C$ and $\sigma^2$, were directly coded to form the chromosome in the proposed method. The chromosome $X$ is represented as $X = \{p_1, p_2\}$, where $p_1$ and $p_2$ denote the regularization parameter $C$ and $\sigma^2$ (the parameter of the kernel function), respectively.
B. Evaluating fitness function
A fitness function, which assesses the performance of each chromosome, must be designed before the search for optimal values of the SVM parameters starts. In this study, the mean absolute percentage error (MAPE) is used as the fitness function. The MAPE is as follows:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{a_i - f_i}{a_i} \right| \times 100\%, \qquad (11)$$
where $a_i$ and $f_i$ represent the actual and forecast values and $n$ is the number of forecasting periods.
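The MAPE described above can be computed as in the following minimal sketch. It assumes the standard definition (the mean of absolute relative errors, expressed as a percentage); the function name and the sample values are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error over paired actual/forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    relative_errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical values: both forecasts miss the actual value by 10%.
error = mape([100.0, 200.0], [110.0, 180.0])
```

Because a BB ratio is always positive, the division by the actual value is safe for this application; the zero check only guards other uses.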
C. Selection and reproduction
Based on the fitness function, chromosomes with better fitness (here, a lower MAPE) are more likely to yield offspring in the next generation. The tournament selection method is applied to choose chromosomes for reproduction.
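Tournament selection can be sketched as follows. This is a generic illustration, not the authors' implementation: the tournament size `k`, the function name, and the sample population are assumptions, and the winner is taken to be the contestant with the lowest error because the fitness measure is a MAPE.

```python
import random


def tournament_select(population, errors, k=2, rng=random):
    """Draw k distinct contestants at random and return the one with the lowest error."""
    contestants = rng.sample(range(len(population)), k)
    winner = min(contestants, key=lambda index: errors[index])
    return population[winner]


# Hypothetical chromosomes (C, sigma^2) with their MAPE values in percent.
population = [(1.0, 0.5), (10.0, 1.0), (100.0, 2.0)]
errors = [30.0, 12.0, 25.0]
parent = tournament_select(population, errors, k=2, rng=random.Random(0))
```

A larger `k` increases selection pressure: with `k` equal to the population size, the best chromosome always wins.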
D. Crossover
Once a pair of chromosomes has been selected for crossover, one or more randomly selected positions are assigned to the to-be-crossed chromosomes. The newly crossed chromosomes are then combined with the rest of the chromosomes to generate a new population. This study uses the method proposed by Adewuya [12] to prevent post-crossover overload when a genetic algorithm with real-valued chromosomes is applied.
,
(12)
Move closer:
,
(13)
(14)
Move away:
,
(15)
(16)
and
represent the pair of populations before crossover operation;
and
represent the
pair of new populations after crossover operation.
E. Mutation
The mutation operation follows the crossover operation and determines whether a chromosome should be mutated in the next generation. In this study, the uniform mutation method is applied in the presented model. Researchers can, however, select the mutation method in GA-SVM best suited to their problems of interest. Uniform mutation can be represented as follows:
$$x^{old} = \{x_1, x_2, \ldots, x_n\}, \qquad (17)$$
$$x_k^{new} = LB_k + r \times (UB_k - LB_k), \qquad (18)$$
$$x^{new} = \{x_1, x_2, \ldots, x_k^{new}, \ldots, x_n\} \qquad (19)$$
where $n$ denotes the number of parameters, $r$ represents a random number in the range $(0, 1)$, and $k$ is the position of the mutation. LB and UB are the lower and upper bounds on the parameters, respectively; $LB_k$ and $UB_k$ denote the lower and upper bounds at location $k$. $x^{old}$ represents the population before the mutation operation; $x^{new}$ represents the new population following the mutation operation.
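Uniform mutation of a real-valued chromosome can be sketched as follows, assuming the usual definition in which one randomly chosen gene is redrawn uniformly between its lower and upper bounds. The function name and the bounds for $C$ and $\sigma^2$ are hypothetical.

```python
import random


def uniform_mutation(chromosome, lower, upper, rng=random):
    """Redraw one randomly chosen gene uniformly within its bounds.

    Returns the mutated copy and the mutated position k; the input
    chromosome is left unchanged.
    """
    k = rng.randrange(len(chromosome))   # position of the mutation
    r = rng.random()                     # random number in [0, 1)
    mutated = list(chromosome)
    mutated[k] = lower[k] + r * (upper[k] - lower[k])
    return mutated, k


# Hypothetical chromosome (C, sigma^2) and search bounds.
old = [1000.0, 1.0]
lower_bounds = [1.0, 0.01]
upper_bounds = [10000.0, 10.0]
new, position = uniform_mutation(old, lower_bounds, upper_bounds, random.Random(42))
```

Because the new gene is drawn from the whole interval, the operator can move a parameter far from its current value, which helps the search escape a poor region.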
FIGURE 1. The GA function
4. Predicting of the BB ratio by the GA based SVM.
4.1. Research Data
The empirical analysis is based on a proprietary data set consisting of the BB ratio from November 1996 to May 2010. The data set served as the input data in the rolling prediction simulation.
4.2. Concepts of the GA Based SVM
In this study, the Gaussian radial basis function (RBF) is used as the kernel function of the SVM. Tay and Cao [13] showed that the upper bound $C$ and the kernel parameter $\sigma^2$ play an important role in the performance of SVMs. An improper selection of these two parameters can cause overfitting or underfitting. Since there is little general guidance for determining the parameters of an SVM, this study employs the GA to select the optimal parameters $C$ and $\sigma^2$ simultaneously for the best prediction performance.
In Figure 2, the optimal parameters are decided by using the GA-based SVM model. First, some data sets are given for training. Further training runs are then carried out using the GA to select the optimal combination of input data sets and improve regression performance. The GA function checks whether the stop condition is satisfied; if not, the procedure continues to alternate between the GA function and the SVM function until the optimal values of the parameters $C$ and $\sigma^2$ are derived. Next, the raw data sets are imported with the optimal parameters and the remaining steps of the support vector machine are run. That is, the data are used to train a model, which then predicts new input data to obtain the prediction results and the MAPE.
FIGURE 2. The GA-based SVM model
4.3. GA-SVM Results
Figure 3 shows the semiconductor BB ratio values observed from November 1996 to May 2010. The historical BB ratio values were selected as input data sets for the proposed GA-based SVM model. Predictions based on the data set were executed using both the SVM and the GA based SVM algorithms to benchmark the performance of the novel GA based SVM framework. The MAPE serves as the indicator of prediction accuracy.
Figure 4 shows the prediction results of the GA based SVM and of the fixed SVM with parameters $C = 1000$ and $\sigma^2 = 1$ as the basis for comparison. The results of the GA based SVM fit the raw historical BB ratio data closely, with a comparatively lower prediction error rate.
FIGURE 3. Trend chart based on the historical BB ratio value from 1996 to 2010
FIGURE 4. The BB ratio for the raw data, data after GA-SVM, and fixed SVM
When the GA based SVM framework is employed for rolling predictions, the prediction accuracy is higher than 80% for most time slots (Figure 5) of the BB ratio in the semiconductor industry.
FIGURE 5. The MAPE by the GA based SVM
5. Discussion.
In this research, a novel GA based SVM framework was introduced for predicting the semiconductor BB ratio. The GA based SVM performed better than the fixed SVM: prediction results from the traditional SVM with fixed parameter values showed higher error rates. These results imply that the prediction errors of the traditional SVM algorithm can be reduced dramatically by using parameters optimized by the GA.
With the GA based SVM algorithm, the prediction accuracy of the semiconductor BB ratio can be higher. The months before the current year serve as the training data, and the months in the current year serve as the raw data for prediction. For example, in Figure 5, the model was trained with the months in 1996 and 1997 and then used to predict the BB ratios in 1998, from which the MAPE can be calculated.
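The rolling scheme described above can be sketched as a generator of train/test splits. The sketch assumes an expanding training window (all months before the target year) and records stored as `(year, month, value)` tuples; the function name and the placeholder values are hypothetical and do not reproduce the proprietary data set.

```python
def rolling_year_splits(records, first_test_year):
    """Yield (test_year, train, test) with all earlier months as training data.

    `records` is a list of (year, month, value) tuples sorted by time.
    """
    years = sorted({year for year, _, _ in records})
    for test_year in years:
        if test_year < first_test_year:
            continue
        train = [r for r in records if r[0] < test_year]
        test = [r for r in records if r[0] == test_year]
        if train and test:
            yield test_year, train, test


# Placeholder series: November 1996 to December 1998, all values set to 1.0.
records = (
    [(1996, m, 1.0) for m in (11, 12)]
    + [(1997, m, 1.0) for m in range(1, 13)]
    + [(1998, m, 1.0) for m in range(1, 13)]
)
splits = list(rolling_year_splits(records, first_test_year=1998))
```

In such a loop, a model would be fitted on `train` and scored on `test` for each year, so early target years are evaluated with very little training data.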
Based on the MAPE results shown in Figure 5, the prediction accuracy is usually higher than 80% while the global semiconductor market keeps growing steadily. However, some extreme situations exist. The extreme situations that caused significant prediction errors are discussed below.
In 1998, the MAPE reached 40% because the training data were insufficient (only two years, 1996 and 1997) and the BB ratio decreased rapidly relative to 1997 (see Figure 3) due to a recession (see Figure 6). In 1999 and 2000, the semiconductor industry kept growing, and the prediction results are satisfactory, with prediction accuracy higher than 80%. Next, in 2001 the MAPE was as high as 70% due to the severe industry downturn (Figure 6) caused by the internet bubble and the resulting recession of the world's economy. In 2009, the Financial Tsunami drove the BB ratio down again, which significantly influenced the prediction accuracy.
Source: McClean et al. [14,15]
FIGURE 6. Worldwide semiconductor industry growth rate (1978–2005)
6. Conclusions.
Forecasting the future BB ratio is important to the semiconductor manufacturing industry. In this paper, a forecasting method consisting of the SVM and a GA was provided. It is found that the novel GA based SVM algorithm can predict the BB ratios accurately for time slots in which there is no significant recession in the semiconductor industry. In the future, more data samples can be collected to verify the accuracy of this forecast mechanism.
REFERENCES
[1] T. Chen and Y.C. Wang, “A Hybrid Fuzzy and Neural Approach for Forecasting the Book-to-Bill Ratio in the Semiconductor Manufacturing Industry,” The International Journal of Advanced Manufacturing Technology, pp. 1-13.
[2] N.L. Fargher, L.R. Gorman and M.S. Wilkins, “Timely Industry Information as an Assurance Service, Evidence on the Information Content of the Book-to-Bill Ratio,” Auditing, vol. 17, pp. 109-124, 1998.
[3] H.L. Lee, V. Padmanabhan and S. Whang, “Information Distortion in a Supply Chain: the Bullwhip Effect,” Management Science, vol. 43, no. 4, pp. 546-558, 1997.
[4] C. Terwiesch, Z.J. Ren, T.H. Ho and M.A. Cohen, “An Empirical Analysis of Forecast Sharing in the Semiconductor Equipment Supply Chain,” Management Science, vol. 51, no. 2, pp. 208-220, 2005.
[5] U. Chandra, A. Procassini and G. Waymire, “The Information Content of Non-financial Disclosures,” Working Paper, Emory University, 1997.
[6] V. Vapnik, “Statistical Learning Theory,” J. Wiley, 1998.
[7] K. Kim, “Financial Time Series Forecasting using Support Vector Machines,” Neurocomputing, vol. 55, no. 1-2, pp. 307-319, 2003.
[8] C.H. Wu, G.H. Tzeng, Y.J. Goo and W.C. Fang, “A Real-valued Genetic Algorithm to Optimize the Parameters of Support Vector Machine for Predicting Bankruptcy,” Expert Systems with Applications, vol. 32, no. 2, pp. 397-408, 2007.
[9] K. Duan, S.S. Keerthi and A.N. Poo, “Evaluation of Simple Performance Measures for Tuning SVM Hyperparameters,” Neurocomputing, vol. 51, pp. 41-59, 2003.
[10] P.T. Lin, “Support Vector Regression: Systematic Design and Performance Analysis,” Unpublished Doctoral Dissertation, Department of Electronic Engineering, National Taiwan University.
[11] J.H. Min and Y.C. Lee, “Bankruptcy Prediction using Support Vector Machine with Optimal Choice of Kernel Function Parameters,” Expert Systems with Applications, vol. 28, no. 4, pp. 603-614, 2005.
[12] A.A. Adewuya, “New Methods in Genetic Search with Real-valued Chromosomes,” Unpublished Master's thesis, Massachusetts Institute of Technology.
[13] F.E.H. Tay and L. Cao, “Application of Support Vector Machines in Financial Time Series Forecasting,” Omega, vol. 29, no. 4, pp. 309-317, 2001.
[14] B. McClean, B. Matas, and T. Yancey, The McClean Report, 2001 Edition. Scottsdale, Arizona: IC Insights, 2001.
[15] B. McClean, B. Matas, and T. Yancey, The McClean Report, 2005 Edition. Scottsdale, Arizona: IC Insights, 2005.