Forecasting stock exchange movements using neural networks: Empirical evidence from Kuwait
Mohamed M. Mostafa
New York Institute of Technology, Global Program in Bahrain, Manama, Bahrain
Available online 23 February 2010.
Abstract
Financial time series are complex and dynamic, and they are characterized by extreme volatility. The major aim of this research is to forecast the Kuwait stock exchange (KSE) closing price movements using data for the period 2001–2003. Two neural network architectures, multi-layer perceptron (MLP) neural networks and generalized regression neural networks, are used to predict the KSE closing price movements. The results of this study show that neuro-computational models are useful tools in forecasting stock exchange movements in emerging markets. These results also indicate that the quasi-Newton training algorithm produces smaller forecasting errors than other training algorithms. Due to their robustness and the flexibility of their modeling algorithms, neuro-computational models are expected to outperform traditional statistical techniques such as regression and ARIMA in forecasting stock exchanges' price movements.
Keywords: KSE; Neural networks; Forecasting
Article Outline
1. Introduction and related work
2. Methodology
2.1. Data
2.2. Multi-layer perceptron
2.3. Generalized regression neural network
3. Results
3.1. MLP-based forecasting
3.2. GRNN-based forecasting
4. Implications, limitations and future research
References
1. Introduction and related work
The increasing globalization of financial markets has heightened interest in emerging markets. In the Middle East there are 11 formal stock markets monitored by the Standard and Poor's Emerging Markets Database (Smith, 2007). Stock markets are usually classified as either developed or emerging. Three of the Middle East stock markets are categorized as developed: Kuwait, United Arab Emirates and Qatar. Middle East stock markets are, however, relatively small by world standards, as the 11 formal stock markets in the region account for around 0.9% of world stock market capitalization (Smith, 2007).
The Kuwait stock exchange (KSE) was officially established in 1984 after the crash of Almanakh (over-the-counter market). In the post-liberation period (1992) the KSE underwent many reforms, among them the government privatization program to sell its holdings of shares in local shareholding companies, the launching of mutual funds, and the emergence of institutional investors as the dominant players in the market (Al-Loughani & Chappell, 2000). Throughout its history, the KSE has been characterized by irregularity in trading and in the price formation process. However, the Kuwaiti parliament has recently introduced measures that permit foreign investors to trade on the KSE (Al-Loughani & Chappell, 2000). This study aims to forecast KSE movements using neural network (NN) models.
NN models have been successfully used in prediction or forecasting studies across many disciplines. One of the first successful applications of MLP is reported by Lapedes and Farber (1988). Using two deterministic chaotic time series generated by the logistic map and the Glass–Mackey equation, they designed an MLP that can accurately mimic and predict such dynamic non-linear systems. Another major application of MLP is in forecasting electric load consumption (e.g. Darbellay and Slama, 2000; McMenamin and Monforte, 1998).
Many other problems have been solved by MLP. A short list includes air pollution forecasting (e.g. Videnova, Nedialkova, Dimitrova, & Popova, 2006), maritime traffic forecasting (Mostafa, 2004), airline passenger traffic forecasting (Nam & Yi, 1997), railway traffic forecasting (Zhuo, Li-Min, Yong, & Yan-hui, 2007), commodity prices (Kohzadi, Boyd, Kemlanshahi, & Kaastra, 1996), ozone level (Ruiz-Suarez, Mayora-Ibarra, Torres-Jimenez, & Ruiz-Suarez, 1995), student grade point averages (Gorr, Nagin, & Szczypula, 1994), forecasting macroeconomic data (Aminian, Suarez, Aminian, & Walz, 2006), financial time series forecasting (Yu & Lai Wang, 2009), advertising (Poh, Yao, & Jasic, 1998), and market trends (Aiken & Bsat, 1999).
Due to its good performance in noisy environments, the generalized regression neural network (GRNN) has been extensively used in various prediction and forecasting tasks in the literature. For example, Gaetz, Weinberg, Rzempoluck, and Jantzen (1998) analyzed EEG activity of the brain using a GRNN. Chtioui, Panigrahi, and Francl (1999) used a GRNN for leaf wetness prediction. Ibric, Jovanovi, Djuri, Paroj, and Solomun (2002) used a GRNN in the design of extended-release aspirin tablets. Cigizoglu (2005) employed a GRNN to forecast monthly water flow in Turkey. In that study, GRNN forecasting performance was found to be superior to the MLP and other statistical and stochastic methods. Kim and Lee (2005) used a GRNN-based genetic algorithm to predict silicon oxynitride etching. Hanna, Ural, and Saygili (2007) developed a GRNN model to predict seismic condition in sites susceptible to liquefaction. Shie (2008) used a hybrid method integrating a GRNN and a sequential quadratic programming method to determine an optimal parameter setting of an injection-molding process.
There is an extensive literature on financial applications of NNs (e.g. Harvey et al., 2000; Kumar and Bhattacharya, 2006). For example, Cao, Leggio, and Schniederjans (2005) used NNs to predict stock price movements for firms traded on the Shanghai stock exchange. The authors compared the predictive power of linear models to that of univariate and multivariate NN models. Results showed that NN models outperform the linear models. These results were statistically significant across the sample firms and indicated that NN models are useful for stock price prediction. Kryzanowski, Galler, and Wright (1993) used NN models with historic accounting and macroeconomic data to identify stocks that will outperform the market. McGrath (2002) used market-to-book and price-earnings ratios in an NN model to rank stocks based on the likelihood estimates. Ferson and Harvey (1993) and Kimoto et al. (1990) used a series of macroeconomic variables to capture predictable variation in stock price returns.
McNelis (1996) used the Chilean stock market to predict returns in the Brazilian markets. Yumlu, Gurgen, and Okay (2005) used various NN architectures to model the performance of the Istanbul stock exchange over the period 1990–2002. Leigh, Hightower, and Modani (2005) used NN models and linear regression models to model the New York Stock Exchange Composite Index data for the period from 1981 to 1999. Results were robust and informative as to the role of trading volume in the stock market. Chen, Leung, and Daouk (2003) predicted the direction of return on the market index of the Taiwan stock exchange using a probabilistic NN model. The results were then compared to the generalized methods of moments (GMM) with Kalman filter. From this literature survey we find that no previous studies have attempted to predict the movements of the KSE. In this study we aim to fill this research gap through the application of MLP and GRNN models to forecast the movements of the KSE.
2. Methodology
2.1. Data
This study covers the period from June 17, 2001 through November 30, 2003. Thus, the data set contains 612 data points in the time series. This data set is quite similar in length to data sets in previous studies of a similar nature. We used data for all listed companies traded on the KSE. The data consist of daily closing prices. The source of the closing price data is the KSE (www.Kuwaitse.com). Table 1 shows descriptive statistics of the opening and closing prices used in this study.
Table 1. Descriptive statistics of KSE opening and closing prices.
Statistic            Open           Close
Mean                 2550.021242    2553.124346
Standard Error       36.7838047     36.81831376
Median               2196.25        2194.55
Mode                 1746.7         1764.2
Standard Deviation   909.9810725    910.8347795
Sample Variance      828065.5523    829619.9955
Kurtosis             −0.63889504    −0.64911602
Skewness             0.891027464    0.88577702
Range                2945.2         2942.8
Minimum              1578.7         1579
Maximum              4523.9         4521.8
Count                612            612
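The entries of Table 1 are related by simple identities (standard error = standard deviation / √n, sample variance = standard deviation squared, range = maximum − minimum), and a short calculation can confirm that the reported figures agree with one another. The sketch below only re-uses numbers printed in Table 1; the function and variable names are illustrative.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)


n = 612                                   # Count row of Table 1
open_sd, close_sd = 909.9810725, 910.8347795

open_se = standard_error(open_sd, n)      # compare with 36.7838047
close_se = standard_error(close_sd, n)    # compare with 36.81831376
open_var = open_sd ** 2                   # compare with 828065.5523
open_range = 4523.9 - 1578.7              # compare with 2945.2

print(round(open_se, 4), round(close_se, 4), round(open_var, 2), round(open_range, 1))
```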
2.2. Multi-layer perceptron
An MLP consists of sensory units that make up the input layer, one or more hidden layers of processing units (perceptrons), and one output layer of processing units (perceptrons). The MLP performs a functional mapping from the input space to the output space. An MLP with a single hidden layer having H hidden units and a single output, y, implements mappings of the form
(1)
(2)
where Z_h is the output of the hth hidden unit, W_h is the weight between the hth hidden unit and the output unit, and W_0 is the output bias. There are N sensory inputs, X_j. The jth input is weighted by an amount β_j in the hth hidden unit. The output of an MLP is compared to a target output and an error is calculated. This error is back-propagated through the neural network and used to adjust the weights. This process aims at minimizing the mean square error between the network's prediction output and the target output.
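For reference, a single-hidden-layer mapping consistent with the symbols defined above can be written as follows. This is a hedged sketch in the standard textbook form: the activation functions g and f, the double index on β_{jh}, and the hidden-unit bias β_{0h} are assumptions, since they are not stated in the text.

```latex
\begin{align}
y   &= g\!\left( W_0 + \sum_{h=1}^{H} W_h Z_h \right) \\
Z_h &= f\!\left( \beta_{0h} + \sum_{j=1}^{N} \beta_{jh} X_j \right)
\end{align}
```

With g taken as the identity, the output is a weighted sum of the hidden-unit outputs, a common choice for regression targets such as closing prices.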
The MLP was first developed to mimic the functioning of the brain. It consists of interconnected nodes, referred to as processing elements, that receive, process, and transmit information. An MLP consists of three types of layers. The first layer is known as the input layer and corresponds to the problem input variables, with one node for each input variable. The second layer is known as the hidden layer and is useful in capturing non-linear relationships among variables. The final layer is known as the output layer and corresponds to the classification being predicted (Baranoff, Sager, & Shively, 2000). Fig. 1 represents the typical structure of an MLP.
Fig. 1. General structure of MLP.
The network must first be trained until it produces the correct output with a tolerable error. Training proceeds as follows. Input is fed to the input nodes; the middle-layer nodes take the input values and process them based on the randomly allocated initial weights of the links. The input travels from one layer to another, and every layer processes the values based on the weights of its links. When the value finally reaches the output node, the actual output is compared with the expected output. The difference is calculated and propagated backwards, and the links adjust their weights accordingly. After the error has propagated all the way back to the first layer of middle-level nodes, the input is again fed to the input nodes. The cycle repeats and the weights are adjusted over and over again until the error is minimized. The key here is the weights of the different links, which determine the output value.
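The training cycle described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the software used in the study: the tanh hidden units, the linear output, the learning rate, the per-sample weight updates, and the toy sine series are all assumptions made for the example.

```python
import math
import random


def train_mlp(samples, hidden=3, lr=0.05, epochs=500, seed=0):
    """Train a single-hidden-layer MLP by online back-propagation.

    samples: list of (inputs, target) pairs; inputs is a list of floats.
    Returns (predict, history), where history holds the mean squared
    error observed during each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Randomly allocated initial weights; position 0 holds the bias.
    beta = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(hidden)]
    w = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        # Hidden outputs Z_h (tanh units) and the linear network output y.
        z = [math.tanh(b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)))
             for b in beta]
        y = w[0] + sum(wh * zh for wh, zh in zip(w[1:], z))
        return z, y

    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, target in samples:
            z, y = forward(x)
            err = y - target  # actual output minus expected output
            squared_error += err * err
            # Propagate the error backwards through the hidden units
            # (computed before the output weights are changed).
            deltas = [err * w[h + 1] * (1.0 - z[h] ** 2) for h in range(hidden)]
            w[0] -= lr * err
            for h in range(hidden):
                w[h + 1] -= lr * err * z[h]
                beta[h][0] -= lr * deltas[h]
                for j in range(n_in):
                    beta[h][j + 1] -= lr * deltas[h] * x[j]
        history.append(squared_error / len(samples))

    def predict(x):
        return forward(x)[1]

    return predict, history


# Toy series (not KSE data): predict the next value from the two previous ones.
series = [math.sin(0.3 * t) for t in range(60)]
samples = [([series[t - 2], series[t - 1]], series[t]) for t in range(2, 60)]
predict, history = train_mlp(samples)
print(f"MSE first epoch: {history[0]:.4f}, last epoch: {history[-1]:.4f}")
```

Real price series would normally be scaled before training, because tanh units saturate for inputs far from zero.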
The MLP is the most frequently used neural network technique in pattern recognition (Bishop, 1999) and classification problems (Sharda, 1994). However, numerous researchers document the disadvantages of the MLP approach. For example, Calderon and Cheh (2002) argue that the standard MLP network is subject to problems of local minima. Swicegood and Clark (2001) claim that there is no formal method of deriving an MLP network configuration for a given classification task. Thus, there is no direct method of finding the ultimate structure for the modeling process. Consequently, the refining process can be lengthy, accomplished by iterative testing of various architectural parameters and keeping only the most successful structures. Wang (1995) argues that the standard MLP provides unpredictable solutions in terms of classifying statistical data.
2.3. Generalized regression neural network
The GRNN was devised by Specht (1991), casting a statistical method of function approximation into a neural network form. The GRNN, like the MLP, is able to approximate any functional relationship between inputs and outputs (Wasserman, 1993). Structurally, the GRNN resembles the MLP. However, unlike the MLP, the GRNN does not require an estimate of the number of hidden units to be made before training can take place. Furthermore, the GRNN differs from the classical MLP in that every weight is replaced by a distribution of weights, which minimizes the chance of ending up in local minima. Therefore, no test and verification sets are required, and in principle all available data can be used for the training of the network (Parojcic, Ibric, Djuric, Jovanovic, & Corrigan, 2005).
The GRNN is a method of estimating the joint probability density function (pdf) of x and y, given only a training set. The estimated value is the most probable value of y and is defined by
(3)
The density function f(x, y) can be estimated from the training set using Parzen's estimator (Parzen, 1962)
(4)
The probability estimate f(x, y) assigns a sample probability of width σ for each sample x_i and y_i, and the probability estimate is the sum of these sample probabilities (Specht, 1991). Defining the scalar function
(5)
and performing the indicated integration yields the following:
(6)
The resulting regression (6) is directly applicable to problems involving numerical data.
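For reference, the general regression derivation in Specht (1991), which the passage above follows, can be written out as below. The notation is a sketch matching the symbols in the text; the input dimension p and the sample size n are symbols introduced here and are not named in the text.

```latex
\begin{align}
\hat{y}(x) &= E[y \mid x]
  = \frac{\int_{-\infty}^{\infty} y\, f(x, y)\, dy}{\int_{-\infty}^{\infty} f(x, y)\, dy} \tag{3} \\
\hat{f}(x, y) &= \frac{1}{(2\pi)^{(p+1)/2}\, \sigma^{p+1}} \cdot \frac{1}{n}
  \sum_{i=1}^{n} \exp\!\left[ -\frac{(x - x_i)^{T}(x - x_i)}{2\sigma^{2}} \right]
  \exp\!\left[ -\frac{(y - y_i)^{2}}{2\sigma^{2}} \right] \tag{4} \\
D_i^{2} &= (x - x_i)^{T}(x - x_i) \tag{5} \\
\hat{y}(x) &= \frac{\sum_{i=1}^{n} y_i \exp\!\left( -D_i^{2} / 2\sigma^{2} \right)}
  {\sum_{i=1}^{n} \exp\!\left( -D_i^{2} / 2\sigma^{2} \right)} \tag{6}
\end{align}
```

Substituting (4) into (3), the normalizing constant cancels and the integrals over y reduce to the kernel-weighted average in (6), so σ is the only free parameter.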
The first hidden layer in the GRNN contains the radial units. A second hidden layer contains units that help to estimate the weighted average. Each output has a special unit assigned in this layer that forms the weighted sum for the corresponding output. To get the weighted average from the weighted sum, the weighted sum must be divided by the sum of the weighting factors. A single special unit in the second layer calculates the latter value. The output layer then performs the actual divisions (using special division units). Hence, the second hidden layer always has exactly one more unit than the output layer. In regression problems, typically only a single output is estimated, and so the second hidden layer usually has two units. Fig. 2 shows the general structure of the GRNN.
The GRNN can be modified by assigning radial units that represent clusters rather than each individual training case: this reduces the size of the network and increases execution speed. Centers can be assigned using any appropriate algorithm (e.g., sub-sampling, K-means or Kohonen).
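The layer structure described above (radial units, a weighted-sum unit, a sum-of-weights unit, and a division unit) maps directly onto the regression in Eq. (6). The following is a minimal sketch with one radial unit per training case; the function name, the toy data, and the choice of σ are illustrative assumptions, not the configuration used in the study.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Kernel-weighted average of the training targets for input x."""
    weights = []
    for xi in train_x:
        # Squared Euclidean distance between x and the stored case xi.
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        # Radial unit: Gaussian kernel of width sigma.
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    weighted_sum = sum(w * yi for w, yi in zip(weights, train_y))
    weight_total = sum(weights)
    # Division unit: weighted average of the targets.
    return weighted_sum / weight_total


# Toy data (not KSE data): three one-dimensional training cases.
train_x = [[0.0], [1.0], [2.0]]
train_y = [1.0, 3.0, 5.0]

near = grnn_predict([1.0], train_x, train_y, sigma=0.1)     # follows the nearest case
smooth = grnn_predict([0.0], train_x, train_y, sigma=1000)  # approaches the overall mean
print(near, smooth)
```

A small σ reproduces the nearest training target, while a large σ approaches the overall mean of the targets. If σ is very small relative to the distances involved, every kernel weight can underflow to zero and the division fails, so a practical implementation would guard against that case.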
Fig. 2. General structure of the generalized regression neural network (GRNN).
3. Results
3.1. MLP-based forecasting
There are many software packages available for analyzing MLP models. We chose the NeuroIntelligence package (Alyuda Research Company, 2003). This software applies artificial intelligence techniques to automatically find an efficient MLP architecture.
Typically, the application of MLP requires a training data set and a testing data set (Lek & Guegan, 1999). The training data set is used to train the MLP and must have enough examples of data to be representative of the overall problem. The testing data set should be independent of the training set and is used to assess the classification/prediction accuracy of the MLP after training. Following Lim and Kirikoshi (2005), an error back-propagation algorithm with weight updates occurring after each epoch was used for MLP training.
Fig. 3 shows the actual versus the fitted closing price values for the whole series using the quick propagation training algorithm. Fig. 4 shows the negative exponential decay in the error rate. As can be seen from Fig. 4, the best network was obtained after around 500 epochs (trials). This figure is quite similar to the typical convergence of errors in MLP models (similar error graphs were obtained using other training algorithms).
Fig. 3. Actual versus fit using the quick propagation training algorithm.
Fig. 4. Error convergence and best network error.
Fig. 5 shows the actual versus the fitted closing price values for the whole series using the conjugate gradient descent training algorithm. Fig. 6 shows the actual versus the fitted closing price values for the whole series using the quasi-Newton training algorithm.
Fig. 5. Actual versus fit using the conjugate gradient descent training algorithm.
Fig. 6. Actual versus fit using the quasi-Newton training algorithm.
Based on the descriptive statistics of the different training algorithms reported in Table 2, along with the plots of the training algorithms used, it seems that the quasi-Newton training algorithm produces smaller forecasting errors than the other two methods. This is also evident from the target vs. output graph and from the error dependence graph using the quasi-Newton training algorithm (Fig. 7).
Table 2. MLP training algorithms major statistics.
                                  Target      Output      AE        ARE
Quick propagation (a)
  Mean                            2534.744    2537.564    48.450    0.021
  SD                              908.299     890.698     31.966    0.015
  Min                             1579.000    1708.253    0.336     0.000
  Max                             4521.800    4369.447    155.094   0.084
Conjugate gradient descent (b)
  Mean                            2534.744    2535.314    26.598    0.010
  SD                              908.299     902.767     22.605    0.009
  Min                             1579.000    1653.851    0.334     0.000
  Max                             4521.800    4433.294    170.075   0.049
Quasi-Newton algorithm (c)
  Mean                            2534.744    2535.313    24.956    0.010
  SD                              908.299     903.801     21.542    0.008
  Min                             1579.000    1646.779    0.303     0.000
  Max                             4521.800    4442.282    166.271   0.046
(a) Correlation = 0.998; R² = 0.996.
(b) Correlation = 0.999; R² = 0.999.
(c) Correlation = 0.999; R² = 0.999.
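The summary columns in Table 2 can be illustrated with a short calculation. The sketch below assumes that AE denotes the absolute error |target − output| and ARE the absolute relative error |target − output| / |target|; the table does not spell these abbreviations out, so this reading is an assumption, and the numbers used are toy values rather than KSE data.

```python
import math


def error_summary(targets, outputs):
    """Mean AE, mean ARE, RMSE and Pearson correlation of two series."""
    n = len(targets)
    abs_err = [abs(t - o) for t, o in zip(targets, outputs)]
    rel_err = [ae / abs(t) for ae, t in zip(abs_err, targets)]
    rmse = math.sqrt(sum(ae * ae for ae in abs_err) / n)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    ss_t = sum((t - mean_t) ** 2 for t in targets)
    ss_o = sum((o - mean_o) ** 2 for o in outputs)
    return {
        "mean_ae": sum(abs_err) / n,
        "mean_are": sum(rel_err) / n,
        "rmse": rmse,
        "correlation": cov / math.sqrt(ss_t * ss_o),
    }


# Toy targets and network outputs (illustrative values only).
summary = error_summary([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(summary)
```

Because large individual errors are squared, the RMSE is never smaller than the mean absolute error, which is why both are often reported together.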
Fig. 7. Predicted vs. actual and error dependence graph using the quasi-Newton training algorithm.
3.2. GRNN-based forecasting
There are many computer software packages available for building and analyzing NNs. Because of its extensive capabilities for building networks based on a variety of training and learning methods, the NeuralTools Professional package (Palisade Corporation, 2005) was chosen to conduct the GRNN analysis in this study. This software automatically scales all input data. Scaling typically involves mapping each variable to a fixed range, such as one with minimum and maximum values of 0 and 1. NeuralTools Professional uses a non-linear scaling function known as 'tanh', which scales inputs to a (−1, 1) range. This function tends to squeeze data together at the low and high ends of the original data range. It may thus be helpful in reducing the effects of outliers (Tam, Tong, Lau, & Chan, 2005). Table 3 shows the basic properties of the GRNN model used in this study. Figs. 8 and 9 show the error distribution and the predicted versus actual results using the GRNN network. These figures indicate the robustness of the GRNN and the normality of its error distribution.
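The two kinds of scaling mentioned above can be contrasted with a small sketch. The exact formula used by NeuralTools is not given in the text; the tanh variant below (standardize, then apply tanh) is an assumed, commonly used form, and the function names and sample values are illustrative.

```python
import math


def minmax_scale(values):
    """Linear scaling to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def tanh_scale(values):
    """Non-linear scaling to (-1, 1): standardize, then apply tanh (assumed form)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [math.tanh((v - mean) / sd) for v in values]


# Illustrative price levels spanning the range reported in Table 1.
prices = [1578.7, 2196.25, 2550.0, 4523.9]
linear = minmax_scale(prices)
squashed = tanh_scale(prices)
print(linear)
print(squashed)
```

Under the tanh form, values far from the mean are compressed toward ±1, which is the squeezing effect at the ends of the range described in the text.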
Table 3. GRNN architecture.
Summary
Net Information
  Configuration                        GRNN Numeric Predictor
Training
  Number of Cases                      490
  Number of Trials                     61
  Reason Stopped                       Auto-Stopped
  % Bad Predictions (30% Tolerance)    0.0000%
  Root Mean Square Error               24.51
  Mean Absolute Error                  17.13
  Std. Deviation of Abs. Error         17.53
Testing
  Number of Cases                      122
  % Bad Predictions (30% Tolerance)    0.0000%
  Root Mean Square Error               32.12
  Mean Absolute Error                  22.30
  Std. Deviation of Abs. Error         23.12
Data Set
  Number of Rows                       612
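The case counts in Table 3 (490 training and 122 testing cases out of 612 rows) correspond to an 80/20 partition. A partition of that size could be produced as sketched below; whether the cases were drawn at random or taken chronologically is not stated in the text, so the random draw, the seed, and the function name are assumptions.

```python
import random


def split_cases(n_rows, test_fraction=0.2, seed=0):
    """Randomly partition row indices into training and testing sets."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(n_rows * test_fraction)  # fractional part is dropped
    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return train, test


train_rows, test_rows = split_cases(612)
print(len(train_rows), len(test_rows))
```

For time series, a chronological split (training on the earlier rows and testing on the later ones) is a common alternative that avoids testing on dates surrounded by training data.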
Fig. 8. GRNN histogram of residuals (training/testing).
Fig. 9. GRNN predicted versus actual values (training/testing).
4. Implications, limitations and future research
Our results confirm the theoretical work by Hecht-Nielson (1989), who showed that NNs can learn input–output relationships to the point of making perfect forecasts with the data on which the network is trained. The good performance of the NN models in predicting KSE closing price movements can be traced to their inherent non-linearity. This makes an NN ideal for dealing with non-linear relations that may exist in the data. Thus, neuro-computational models are needed to better understand the inner dynamics of stock markets. Our results are also in line with the findings of other researchers who have investigated the performance of NNs compared to traditional statistical techniques, such as regression analysis, discriminant analysis, and logistic regression analysis. For example, in a study of credit-scoring models used in commercial and consumer lending decisions, Bensic, Sarlija, and Zekic-Susac (2005) compared the performance of logistic regression, neural networks and decision trees. The probabilistic NN (PNN) model produced the highest hit rate and the lowest type I error. Similar findings have been reported in studies examining the performance of NNs in predicting bankruptcy (Anandarajan, Lee, & Anandarajan, 2001) and in the diagnosis of acute appendicitis (Sakai et al., 2007).
Despite its contributions, this study has a number of limitations. First, although the NN models performed satisfactorily, future research might improve their performance by integrating fuzzy discriminant analysis and genetic algorithms (GA) with NN models. Mirmirani and Li (2004) pointed out that traditional algorithms search for optimal weight vectors for a neural network with a given architecture, while GA can yield an efficient exploration of the search space when the modeler has little a priori knowledge of the structure of problem domains.
Second, future research might use other NN architectures such as self-organizing maps (SOMs) to classify movements in the KSE. Due to the unsupervised character of their learning algorithm and their excellent visualization ability, SOMs have recently been used in myriad classification tasks. Examples include classifying cognitive performance in schizophrenic patients and healthy individuals (Silver & Shmoish, 2008), mutual funds classification (Moreno, Marco, & Olmeda, 2006), crude oil classification (Fonseca, Biscaya, de Sousa, & Lobo, 2006), and classifying magnetic resonance brain images (Chaplot, Patnaik, & Jagannathan, 2006).
References
M. Aiken and M. Bsat, Forecasting market trends with neural networks, Information Systems Management 16 (1999), pp. 42–49.
N. Al-Loughani and D. Chappell, Modeling the day-of-the-week effect in the Kuwait stock exchange: A non-linear GARCH representation, Applied Financial Economics 11 (2000), pp. 353–359.
Alyuda Research Company (2003). NeuroIntelligence User Manual (Version 2.1).
F. Aminian, E. Suarez, M. Aminian and D. Walz, Forecasting economic data with neural networks, Computational Economics 28 (2006), pp. 71–88.
M. Anandarajan, P. Lee and A. Anandarajan, Bankruptcy prediction of financially stressed firms: an examination of the predictive accuracy of artificial neural networks, International Journal of Intelligent Systems in Accounting, Finance and Management 10 (2001), pp. 69–81.
E. Baranoff, T. Sager and T. Shively, A semi-parametric stochastic spline model as a managerial tool for potential insolvency, Journal of Risk and Insurance 67 (2000), pp. 369–396.
M. Bensic, N. Sarlija and M. Zekic-Susac, Modelling small-business credit scoring by using logistic regression, neural networks and decision trees, Intelligent Systems in Accounting, Finance and Management 13 (2005), pp. 133–150.
C. Bishop, Neural networks for pattern recognition, Oxford University Press, New York (1999).
T. Calderon and J. Cheh, A roadmap for future neural networks research in auditing and risk assessment, International Journal of Accounting Information Systems 3 (2002), pp. 203–236.
Q. Cao, K. Leggio and M. Schniederjans, A comparison between Fama and French's model and artificial neural networks in predicting the Chinese stock market, Computers and Operations Research 32 (2005), pp. 2499–2512.
S. Chaplot, L. Patnaik and N. Jagannathan, Classification of magnetic resonance brain images using wavelets as input to support vector machines and neural network, Biomedical Signal Processing and Control 1 (2006), pp. 86–92.
A. Chen, M. Leung and H. Daouk, Application of neural networks to an emerging financial market: forecasting and trading the Taiwan stock index, Computers and Operations Research 30 (2003), pp. 901–923.
Y. Chtioui, S. Panigrahi and L. Francl, A generalized regression neural network and its application for leaf wetness prediction to forecast plant disease, Chemometrics and Intelligent Laboratory Systems 48 (1999), pp. 47–58.
H. Cigizoglu, Generalized regression neural network in monthly flow forecasting, Civil Engineering and Environmental Systems 22 (2005), pp. 71–84.
G. Darbellay and M. Slama, Forecasting the short-term demand for electricity: Do neural networks stand a better chance, International Journal of Forecasting 16 (2000), pp. 71–83.
W. Ferson and C. Harvey, The risk and predictability of international equity returns, Review of Financial Studies 6 (1993), pp. 527–566.
A. Fonseca, J. Biscaya, J. de Sousa and A. Lobo, Geographical classification of crude oils by Kohonen self-organizing maps, Analytica Chimica Acta 556 (2006), pp. 374–382.
M. Gaetz, H. Weinberg, E. Rzempoluck and K. Jantzen, Neural network classification and correlation analysis of EEG and MEG activity accompanying spontaneous reversals of the Necker cube, Cognitive Brain Research 6 (1998), pp. 335–346.
W. Gorr, D. Nagin and J. Szczypula, Comparative study of artificial neural network and statistical models for predicting student point averages, International Journal of Forecasting 10 (1994), pp. 17–34.
A. Hanna, D. Ural and G. Saygili, Evaluation of liquefaction potential of soil deposits using artificial neural networks, Engineering Computations 24 (2007), pp. 5–16.
C. Harvey, K. Travers and M. Costa, Forecasting emerging market returns using neural networks, Emerging Markets Quarterly 4 (2000), pp. 43–55.
Hecht-Nielson, R. (1989). Theory of the back-propagation neural network. In International joint conference on neural networks (pp. 593–605). Washington, DC.
S. Ibric, M. Jovanovi, C. Djuri, J. Paroj and L. Solomun, The application of generalized regression neural network in the modeling and optimization of aspirin extended release tablets with Eudragit® RSPO as matrix substance, Journal of Controlled Release 82 (2002), pp. 213–222.
B. Kim and B. Lee, Prediction of silicon oxynitride plasma etching using a generalized regression neural network, Journal of Applied Physics 98 (2005), pp. 1–6.
Kimoto, T., Asakawa, K., Yoda, M., & Takeoka, M. (1990). Stock market prediction system with modular neural networks. In Proceedings of the IEEE international conference on neural networks (pp. 1–16).
N. Kohzadi, M. Boyd, B. Kemlanshahi and I. Kaastra, A comparison of artificial neural network and time series models for forecasting commodity prices, Neurocomputing 10 (1996), pp. 169–181.
L. Kryzanowski, M. Galler and D. Wright, Using artificial neural networks to pick stocks, Financial Analysts Journal 49 (1993), pp. 21–27.
KSE (www.Kuwaitse.com), visited on November 10, 2008.
K. Kumar and S. Bhattacharya, Artificial neural network vs. linear discriminant analysis in credit ratings forecast, Review of Accounting and Finance 5 (2006), pp. 216–227.
A. Lapedes and R. Farber, How neural nets work. In: D. Anderson, Editor, Neural information processing systems, American Institute of Physics, New York (1988), pp. 442–456.
W. Leigh, R. Hightower and N. Modani, Forecasting the New York stock exchange composite index with past price and interest rate on condition of volume spike, Expert Systems with Applications 28 (2005), pp. 1–8.
S. Lek and J. Guegan, Artificial neural networks as a tool in ecological modelling: an introduction, Ecological Modelling 120 (1999), pp. 65–73.
C. Lim and T. Kirikoshi, Predicting the effects of physician-directed promotion on prescription yield and sales uptake using neural networks, Journal of Targeting, Measurement and Analysis for Marketing 13 (2005), pp. 158–167.
C. McGrath, Terminator portfolio, Kiplinger's Personal Finance 56 (2002), pp. 56–57.
J. McMenamin and F. Monforte, Short term energy forecasting with neural networks, Energy Journal 19 (1998), pp. 43–52.
P. McNelis, A neural network analysis of Brazilian stock prices: Tequila effects vs. pisco sour effects, Journal of Emerging Markets 1 (1996), pp. 29–44.
S. Mirmirani and H. Li, Gold price, neural networks and genetic algorithm, Computational Economics 23 (2004), pp. 193–200.
D. Moreno, P. Marco and I. Olmeda, Self-organizing maps could improve the classification of Spanish mutual funds, European Journal of Operational Research 147 (2006), pp. 1039–1054.
M. Mostafa, Forecasting the Suez Canal traffic: A neural network analysis, Maritime Policy and Management 31 (2004), pp. 139–156.
K. Nam and J. Yi, Predicting airline passenger volume, Journal of Business Forecasting Methods and Systems 16 (1997), pp. 14–17.
Palisade Corporation (2005). NeuralTools professional user guide (version 1.0). Ithaca, New York: Palisade Corporation.
J. Parojcic, S. Ibric, Z. Djuric, M. Jovanovic and O. Corrigan, An investigation into the usefulness of generalized regression neural network analysis in the development of level A in vitro–in vivo correlation, European Journal of Pharmaceutical Sciences 30 (2005), pp. 264–272.
E. Parzen, On the estimation of a probability density function and mode, Annals of Mathematical Statistics 33 (1962), pp. 1065–1076.
H. Poh, J. Yao and T. Jasic, Neural networks for the analysis and forecasting of advertising impact, International Journal of Intelligent Systems in Accounting, Finance and Management 7 (1998), pp. 253–268.
J. Ruiz-Suarez, O. Mayora-Ibarra, J. Torres-Jimenez and L. Ruiz-Suarez, Short-term ozone forecasting by artificial neural network, Advances in Engineering Software 23 (1995), pp. 143–149.
Sakai et al., 2007
S. Sakai, K. Ko
bayashi, S. Toyabe, N. Mandai, T. Kanda and K. Akazawa,
Comparison of the levels of accuracy of an artificial
neural network
model and a logistic
regression model for the diagn
osis of acute appendicitis,
Journal of Medical Systems
31
(2007), pp. 357
–
364.
Full Text
via CrossRef

View Record in Scopus

Cited By in Scopus
(4)
Sharda, 1994
R. Sharda,
Neural netwo
rks
for the MS/OR analyst: An application
bibliography,
Interfaces
24
(1994), pp. 116
–
13
0.
Shie, 2008
J. Shie, Optimization of injection-molding process for mechanical properties of polypropylene components via a generalized regression neural network, Polymers for Advanced Technologies 19 (2008), pp. 73–83.
Silver and Shmoish, 2008
H. Silver and M. Shmoish, Analysis of cognitive performance in schizophrenia patients and healthy individuals with unsupervised clustering models, Psychiatry Research 159 (2008), pp. 167–179.
Smith, 2007
G. Smith, Random walks in Middle Eastern stock markets, Applied Financial Economics 17 (2007), pp. 587–596.
Specht, 1991
D. Specht, A general regression neural network, IEEE Transactions on Neural Networks 2 (1991), pp. 568–576.
Swicegood and Clark, 2001
P. Swicegood and J. Clark, Off-site monitoring systems for predicting bank underperformance: A comparison of neural networks, discriminant analysis, and professional human judgment, International Journal of Intelligent Systems in Accounting, Finance and Management 10 (2001), pp. 169–186.
Tam et al., 2005
C. Tam, T. Tong, T. Lau and K. Chan, Selection of vertical framework system by probabilistic neural network models, Construction Management and Economics 23 (2005), pp. 245–254.
Videnova et al., 2006
I. Videnova, D. Nedialkova, M. Dimitrova and S. Popova, Neural networks for air pollution forecasting, Applied Artificial Intelligence 20 (2006), pp. 493–506.
Wang, 1995
S. Wang, The unpredictability of standard back propagation neural networks in classification applications, Management Science 41 (1995), pp. 555–559.
Wasserman, 1993
P. Wasserman, Advanced methods in neural computing, Van Nostrand-Reinhold, New York (1993).
Yu and Lai Wang, 2009
L. Yu and S.K. Lai Wang, A neural-network-based nonlinear metamodeling approach to financial time series forecasting, Applied Soft Computing 9 (2009), pp. 563–574.
Yumlu et al., 2005
S. Yumlu, F. Gurgen and N. Okay, A comparison of global, recurrent and smoothed-piecewise neural models for Istanbul stock exchange (ISE) prediction, Pattern Recognition Letters 26 (2005), pp. 2093–2103.
Zhuo et al., 2007
W. Zhuo, J. Li-Min, Q. Yong and W. Yan-hui, Railway passenger traffic volume prediction based on neural network, Applied Artificial Intelligence 21 (2007), pp. 1–10.
Expert Systems with Applications
Volume 37, Issue 9, September 2010, Pages 6302–6309
Forecasting model of global stock index by stochastic time effective neural network
Zhe Liao (a) and Jun Wang (a)
(a) Institute of Financial Mathematics and Financial Engineering, College of Science, Beijing Jiaotong University, Beijing 100044, PR China
Available online 8 June 2009.
Abstract
In this paper, we investigate the statistical properties of the fluctuations of the Chinese Stock Index, and we study the statistical properties of HSI, DJI, IXIC and SP500 by comparison. According to the theory of artificial neural networks, a stochastic time effective function is introduced in the forecasting model of the indices in the present paper, which gives an improved neural network: the stochastic time effective neural network model. In this model, a promising data mining technique in machine learning is proposed to uncover the predictive relationships of numerous financial and economic variables. We suppose that investors decide their investment positions by analyzing the historical data on the stock market, and that the historical data are given weights depending on their time: the nearer the time of the historical data is to the present, the stronger the impact the data have on the predictive model. We also introduce the Brownian motion in order to make the model have the effect of random movement while maintaining the original trend. In the last part of the paper, we test the forecasting performance of the model by using different volatility parameters, and we show some results of the analysis for the fluctuations of the global stock indices using the model.
Keywords: Brownian motion; Stochastic time effective function; Data analysis; Neural network; Returns; Predict
Article Outline
1. Introduction
2. Methodology for stochastic time effective function
2.1. Introduction of stochastic time effective function
2.2. Procedure followed in the stochastic time effective model
3. Experiment analysis
3.1. Selection and preprocessing of data
3.2. Training stochastic time effective neural network
4. Conclusions
Acknowledgements
References
1. Introduction
Recently, some progress has been made in the work done on the fluctuations of the Chinese stock market; for example, see (Ji and Wang, 2007) and (Li and Wang, 2006). In the present paper, we investigate the statistical properties of the fluctuations of the indices of the Chinese Stock Exchange, and study the statistical properties of HSI, DJI, IXIC and SP500 by comparison. A predictive model of the stock prices is constructed using the theory of artificial neural networks and a stochastic time effective function; further, the data from the Chinese stock markets are analyzed in the model. China has Stock A and Stock B in the stock markets. The indices of Stock A and Stock B play an important role in the Chinese stock markets, and the database of the indices is from the website www.sse.com.cn.
Recently, the properties of the fluctuations of the stock markets have been studied in many research fields; for example, see (Azoff, 1994), (Nakajima, 2000), (Pino et al., 2008), (Shtub and Versano, 1999) and (Wang, 2007). Artificial neural networks are one of the technologies that have made great progress in the study of the stock markets. Usually stock prices can be seen as a random time sequence with noise. Artificial neural networks, as large-scale parallel processing nonlinear systems that depend on their own intrinsic link data, provide methods and techniques that can approximate any nonlinear continuous function without a priori assumptions about the nature of the generating process; see (Pino et al., 2008) and (Shtub and Versano, 1999). They have good self-learning ability and a strong anti-jamming capability, and have been widely used in financial fields such as the analysis and prediction of stock prices, profits, exchange rates and risk. Although the historical data have a great influence on the investors’ positions, the degree of impact of the data depends on the date (or time) at which they occurred: the data have a stronger effect when they are very near the current state. Furthermore, we also introduce the Brownian motion in the model (see Wang, 2007), in order to make the model have the effect of random movement while maintaining the original trend. We test the forecasting performance of the model by using different volatility parameters, and we show some results of the analysis for the fluctuations of the global stock index using the model. In this work, the forecasting model is developed to estimate the level of returns on the Stock A Index of China. In Section 3, the results show that the forecasting model can predict the index behavior better in a short time interval than in a long time interval, and show the different performances of HSI, DJI, IXIC and SP500 by comparison; see (Abhyankar et al., 1997), (Austin et al., 1997) and (Balvers et al., 1990).
In this paper, forecasting based on a neural network involves two major steps: data preprocessing and structure design. In the preprocessing stage, the collected data should be normalized and properly adjusted in order to reduce the impact of noise in the stock markets. At the design stage, different data training sets, validations and data processing will cause different results; see (Breen et al., 1990) and (Campbell, 1987). In stock markets, the environment and behavior of the markets may change greatly; for example, see the Chinese stock markets in 2007. As a result, the data in the training set should be time-variant, reflecting the different behavior patterns of the markets at different times. If all the data are used to train the network equivalently, the network system may not be consistent with the development of the stock markets; see (Chenoweth and Obradovic, 1996), (Chenoweth and Obradovic, 1996), (Cybenko, 1989) and (Demuth and Beale, 1998). Especially in the current Chinese stock markets, trading rules and management systems are changing rapidly, for example, the daily price limit (now 10%), shareholding reformation, the direct investment of Hong Kong stock markets, the reorganization of A share, B share and H share, and the establishment of financial derivatives such as futures and options. Therefore, it is difficult to reflect the current development of the Chinese stock markets using only the historical data of the past. However, if only the recent data are selected, a lot of useful information held by the early data will be lost. In the financial model of the present paper, a promising data mining technique in machine learning is proposed to uncover the predictive relationships of numerous financial and economic variables. Considering the above-mentioned financial situation, this paper presents an improved neural network model, the stochastic time effective series neural network model: each historical datum is given a weight depending on the time at which it occurs, and we also use probability density functions to classify the various variables from the training samples; see (Desai and Bharati, 1998), (Duda et al., 2001) and (Elton and Gruber, 1991).
2. Methodology for stochastic time effective function
2.1. Introduction of stochastic time effective function
In this section, first we describe a three-layer BP neural network model (see Azoff, 1994), which is shown in Fig. 1. The neural network model includes three layers: input layer, hidden layer and output layer. In the model, choosing the proper number of hidden layer nodes requires validation techniques to avoid under-fitting (too few neurons) and over-fitting (too many neurons). Generally, too many neurons in the hidden layers, and hence too many connections, produce a neural network that memorizes the data and lacks the ability to generalize.
Fig. 1. Three-layer neural network.
Suppose that a three-layer neural network has $N$ neurons, and for any fixed neuron $n$ ($n = 1, 2, \ldots, N$) the model has the following structure: let $\{x_i(n): i = 1, 2, \ldots, p\}$ denote the set of inputs of the neurons and $\{y_j(n): j = 1, 2, \ldots, m\}$ the set of outputs of the hidden layer neurons; $V_{ij}$ is the weight that connects node $i$ in the input layer to node $j$ in the hidden layer, $W_{jk}$ is the weight that connects node $j$ in the hidden layer to node $k$ in the output layer; and $\{o_k(n): k = 1, 2, \ldots, q\}$ denotes the set of outputs of the neurons. Then the output value for a unit is given by the following function
$$ y_j(n) = f\Big(\sum_{i=1}^{p} V_{ij}\, x_i(n) - \theta_j\Big), \qquad o_k(n) = f\Big(\sum_{j=1}^{m} W_{jk}\, y_j(n) - \theta_k\Big), $$
where $\theta_j$, $\theta_k$ are the neural thresholds and $f(u) = 1/(1 + e^{-u})$ is the sigmoid activation function.
Let $T_k(n)$ be the actual value of the data sets; then the error of the corresponding neuron $k$ to the output is defined as $\varepsilon_k = T_k - o_k$. In this paper, the error of the output is defined as $\tfrac{1}{2}\varepsilon_k^2$, and the error of the sample $n$ ($n = 1, 2, \ldots, N$) is defined as
$$ e(n, t) = \frac{1}{2}\,\varphi(t_n)\,\big(T(n) - o(n)\big)^2, $$
where $\varphi(t)$ is the stochastic time effective function. Now we define $\varphi(t)$ as follows:
$$ \varphi(t_n) = \frac{1}{\tau}\exp\Big\{\int_{t_1}^{t_n}\mu(t)\,dt + \int_{t_1}^{t_n}\sigma(t)\,dB(t)\Big\}, $$
where $\tau$ ($> 0$) is the time strength coefficient, $t_1$ is the current time or the time of the newest data in the data set, and $t_n$ is an arbitrary time point in the data set. $\mu(t)$ is the drift function (or the trend term), $\sigma(t)$ is the volatility function, and $B(t)$ is the standard Brownian motion (Wang, 2007). The stochastic time effective function implies that recent information has a stronger effect on the investors than old information. In detail, the more recently an event happened, the more strongly investors and markets are affected, and the impact of the data decays exponentially in time; see (Nakajima, 2000) and (Pino et al., 2008). Then the total error of all the data training sets in the output layer with the stochastic time effective function is defined as
$$ E = \frac{1}{N}\sum_{n=1}^{N} e(n, t) = \frac{1}{2N}\sum_{n=1}^{N} \frac{1}{\tau}\exp\Big\{\int_{t_1}^{t_n}\mu(t)\,dt + \int_{t_1}^{t_n}\sigma(t)\,dB(t)\Big\}\big(T(n) - o(n)\big)^2. $$
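As a rough numerical illustration of a time weight of this kind, the Python sketch below assumes constant drift μ and volatility σ and takes the weight to be (1/τ)·exp{∫ μ dt + ∫ σ dB(t)}, with both integrals running from the newest time to the time of each datum. The function names `time_effective_weights` and `weighted_total_error`, the choice of time units, and the way the Brownian path is simulated are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def time_effective_weights(t, mu=1.0, sigma=1.0, tau=1.0, rng=None):
    """Sketch of a stochastic time weight for each sample time in `t`.

    `t` must be increasing, with t[-1] the newest datum. Constant mu and
    sigma are an assumption; time units (e.g. years) are up to the caller.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(t, dtype=float)
    lag = t[-1] - t                      # distance to the newest datum (>= 0)
    gaps = np.diff(t)                    # positive time steps between samples
    steps = rng.normal(0.0, np.sqrt(gaps))   # Brownian increments per gap
    # B(t_n) - B(t_newest): minus the sum of the increments after sample n
    b = -np.concatenate([np.cumsum(steps[::-1])[::-1], [0.0]])
    return np.exp(-mu * lag + sigma * b) / tau

def weighted_total_error(target, output, phi):
    """Mean over samples of 0.5 * phi * (target - output)^2."""
    target, output, phi = (np.asarray(a, dtype=float) for a in (target, output, phi))
    return float(np.mean(0.5 * phi * (target - output) ** 2))
```

With σ = 0 the weight reduces to a pure exponential decay in the age of the datum; with σ > 0 each weight is additionally perturbed by a simulated Brownian path, so repeated calls give different weights unless a seeded generator is passed in.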
2.2. Procedure followed in the stochastic time effective model
Note that the training objective of the stochastic time effective neural network is to modify the weights so as to minimize the error between the network’s prediction and the actual target. When all the training data are the newest data (that is, $t_1 = t_n$), the stochastic time effective neural network reduces to the general neural network model. In Fig. 2, the training procedure of the stochastic time effective neural network is shown; its steps are as follows:
Step 1: Train a stochastic time effective neural network by choosing five kinds of stock prices in the input layer: daily opening price, daily closing price, daily highest price, daily lowest price and daily trade volume, and one price of the stock prices in the output layer: the closing price of the next trading day. Then set the connective weights, and input the training data sets.
Step 2: At the beginning of data processing, the connective weights $V_{ij}$ and $W_{jk}$ follow the uniform distribution on $(-1, 1)$, and the neural thresholds $\theta_k$, $\theta_j$ are set to 0.
Step 3: Introduce the stochastic time effective function $\varphi(t)$ in the error function $e(n, t)$. Choose different volatility parameters. Give the transfer function from the input layer to the hidden layer and the transfer function from the hidden layer to the output layer.
Step 4: Establish an error acceptable model and set a pre-set minimum error. If the output error is below the pre-set minimum error, go to Step 6; otherwise go to Step 5.
Step 5: Modify the connective weights. Calculate $\delta$ backward for the node in the output layer:
$$ \delta(n) = \varphi(t_n)\, o(n)\,[1 - o(n)]\,[T(n) - o(n)]. $$
Calculate $\delta$ backward for the node in the hidden layer:
$$ \delta_h = o_h\,[1 - o_h] \sum_{h'} W_{hh'}\, \delta_{h'}, $$
where $\varphi(t_n)$ is the stochastic time effective function, $o(n)$ is the output of the neuron $n$, $T(n)$ is the actual value of the neuron $n$ in the data sets, $o(n)[1 - o(n)]$ is the derivative of the sigmoid activation function, and $h'$ ranges over the nodes in the next layer after node $h$ that are connected with node $h$. Modify the weights from each layer to the previous layer:
$$ W_{hh'} \leftarrow W_{hh'} + \eta\, \delta_{h'}\, o_h, $$
where $\eta$ is the learning step, which usually takes a constant value between 0 and 1.
Step 6: Output the predictive value.
Fig. 2.
Procedure followed in the stochastic time effective model.
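The steps above can be sketched as a small full-batch training loop. The following Python code is a minimal illustration under several assumptions that are not in the source: inputs and targets are already scaled to (0, 1), the time weights `phi` are supplied by the caller, the gradient is averaged over the whole training set in each cycle, and the names `train_time_weighted` and `predict` are hypothetical. The default sizes (20 hidden nodes, 100 cycles, minimum error 0.0001) follow the values reported later in Section 3.2.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_time_weighted(X, T, phi, hidden=20, eta=0.5,
                        max_cycles=100, min_error=1e-4, seed=0):
    """Time-weighted backpropagation for one hidden layer and one output."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    V = rng.uniform(-1.0, 1.0, (p, hidden))   # Step 2: weights ~ U(-1, 1)
    W = rng.uniform(-1.0, 1.0, hidden)
    theta_h = np.zeros(hidden)                # Step 2: thresholds start at 0
    theta_o = 0.0
    history = []
    for _ in range(max_cycles):
        Y = sigmoid(X @ V - theta_h)          # hidden-layer outputs
        O = sigmoid(Y @ W - theta_o)          # network outputs
        E = float(np.mean(0.5 * phi * (T - O) ** 2))   # Step 3: weighted error
        history.append(E)
        if E < min_error:                     # Step 4: stop when small enough
            break
        # Step 5: deltas for the output node and the hidden nodes
        d_out = phi * (T - O) * O * (1.0 - O)
        d_hid = Y * (1.0 - Y) * np.outer(d_out, W)
        W += eta * (Y.T @ d_out) / n
        theta_o -= eta * d_out.mean()
        V += eta * (X.T @ d_hid) / n
        theta_h -= eta * d_hid.mean(axis=0)
    return (V, W, theta_h, theta_o), history

def predict(params, X):
    """Step 6: forward pass with the trained parameters."""
    V, W, theta_h, theta_o = params
    return sigmoid(sigmoid(X @ V - theta_h) @ W - theta_o)
```

Because every sample's delta is multiplied by its weight, samples with a small `phi` (old data) move the network weights less than recent ones; setting `phi` to a constant recovers ordinary backpropagation.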
3. Experiment analysis
3.1. Selection and preprocessing of data
In this paper, we select the data of the Stock A Index (SAI) and the Stock B Index (SBI) of China for each trading day in an 18-year period, from December 19, 1990 to June 7, 2008, taken from the Shanghai and Shenzhen Stock Exchanges. We also choose the data of HSI, DJI, IXIC and SP500 for comparison. First, we study the statistical properties of the returns of the index using the stochastic time effective neural network model, and then study the correlation between the Chinese stock indices and foreign stock indices. Fig. 3 presents the correlation coefficients between SAI, SBI, HSI, DJI, IXIC and SP500. In Fig. 3, we can see that the correlation coefficients between SAI and SBI, HSI, DJI, IXIC and SP500 are 0.885, 0.629, 0.463, 0.108 and 0.329; and the correlation coefficients between SBI and SAI, HSI, DJI, IXIC and SP500 are 0.885, 0.817, 0.688, 0.458 and 0.623.
Fig. 3. Correlation coefficients.
In the model of this paper, we suppose that the network inputs include five kinds of data: daily opening price, daily closing price, daily highest price, daily lowest price and daily trade volume, and that the network output is the closing price of the next trading day.
Fig. 4 presents the plot of the time sequence of log returns of SAI, SBI, IXIC and SP500. We denote the price of SAI, SBI, IXIC and SP500 at time $t$ by $P(t)$; then $R(t)$ denotes the logarithm of the return rate, given by
$$ R(t) = \ln P(t+1) - \ln P(t). $$
In
Fig. 4
, we can see that the prices of the index fluctuate wildly, and this indicates that there
is a big noise in the data that causes difficulty in forecasting. Thus, we should go
through data
preprocessing before forecasting, so the data are normalized as follows:
Similarly to (5), the normalized values of the above

mentioned five kinds of data can also be
given.
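The preprocessing described here can be illustrated with a short Python sketch. Two assumptions are made that are not confirmed by the text as it stands: the log return is taken as the difference of consecutive log prices, and the normalization is min-max scaling of each series to [0, 1], which is only one common choice. The function names are hypothetical.

```python
import numpy as np

def log_returns(prices):
    """Log return between consecutive observations: ln P(t+1) - ln P(t)."""
    p = np.asarray(prices, dtype=float)
    return np.log(p[1:]) - np.log(p[:-1])

def min_max_normalize(x):
    """Scale each column (or a 1-D series) to the range [0, 1].

    Assumes the series is not constant; a constant column would divide by zero.
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)
```

Applied to a two-dimensional array whose columns are the opening price, closing price, highest price, lowest price and trade volume, `min_max_normalize` scales each of the five inputs independently, so the large numerical range of the volume does not dominate the prices.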
Fig. 4.
The plot of log returns of SAI (a), SBI (b), IXIC (c), and SP500 (d).
3.2. Training stochastic time effective neural network
Data sets are divided into two parts, the training set and the testing set. We collect the data of SAI in 1990–2006 as the training set and the data of SAI in 2007–2008 as the testing set. According to the procedures of the three-layer network introduced in Section 2, the number of neural nodes in the input layer is 5, the number of neural nodes in the hidden layer is 20 and the number of neural nodes in the output layer is 1; the threshold of the maximum training cycles is 100 and the threshold of the minimum error is 0.0001. We take the drift parameter μ(t) and the volatility parameter σ(t), written as the pair (μ(t), σ(t)), to be (1, 0), (1, 1) and (1, 2). When (μ(t), σ(t)) is (1, 0), the model has the effect of the time effective function only; when (μ(t), σ(t)) is (1, 1), the model has the effect of both the time effective function and normal randomization; and when (μ(t), σ(t)) is (1, 2), the model has the effect of intensive randomization.
In Table 1, parts of the training errors on different trading dates for (a) SAI, (b) SBI, (c) HSI, (d) DJI, (e) IXIC and (f) SP500 are given. It can be clearly seen that the relative error during the years 1991 and 1992 is larger than that in the year 2007, which shows the effect of the time effective function. Furthermore, the gap in the relative errors of SAI and SBI is much greater than that of the foreign stock markets. So we can conclude that the value of the historical data in the foreign stock markets is greater than that in the Chinese stock markets; this means that the Chinese stock markets fluctuate more sharply than the foreign markets.
Table 1. Comparison of errors of different dates for SAI, SBI, HSI, DJI, IXIC and SP500 ((μ(t), σ(t)) is (1, 1)).
Time        Actual      Predictive   Error
(a) SAI
90/12/20    104.39      144.57       −0.384
91/05/10    108.53      149.06       −0.373
06/09/20    1820.8      1830.3       −0.005
07/02/13    2973.4      2974.0       −0.0001
(b) SBI
92/02/24    124.65      128.88       −0.034
93/12/24    93.14       87.64        0.059
05/07/11    59.88       60.8308      −0.02
06/11/15    108.27      106.37       0.017
(c) HSI
01/07/31    12316.7     12123.8      0.016
02/01/29    11014.2     10712.2      0.027
06/10/11    17862.8     17849.5      0.001
07/09/26    26430.2     26230.8      0.008
(d) DJI
91/12/23    3022.6      2948.1       0.025
92/07/07    3295.2      3340.8       −0.014
06/07/26    11102.5     11125.5      −0.002
07/12/28    13365.9     13412.9      −0.004
(e) IXIC
91/11/08    548.08      540.3        0.014
92/03/05    621.97      635.6        −0.022
06/03/07    2268.38     2290.0       −0.010
07/05/29    2572.06     2559.6       0.005
(f) SP500
91/01/03    321.91      339.06       −0.053
92/02/04    413.85      408.38       0.013
06/10/31    1377.94     1376.05      0.001
07/02/05    1446.99     1448.47      −0.001
Fig. 5 shows the fluctuations of the time sequence of relative errors during the years 1991 to 2007 for the prices of SAI, SBI, HSI, DJI, IXIC and SP500. In these plots, 0 represents the data farthest from the current date, and a larger t (date) represents a date that is nearer to the current date. Fig. 5 also clearly indicates that the stochastic time effective neural network model can be realized by assigning different weights to the data of different times. The time sequences of relative errors of (b) SBI and (d) DJI in Fig. 5 clearly reflect the randomization produced by the Brownian motion. We also conclude that SBI is similar to DJI, and that SAI is similar to HSI. This conclusion is also supported by the correlation analysis shown in Fig. 3.
Fig. 5. Relative errors of SAI, SBI, HSI, DJI, IXIC and SP500 ((μ(t), σ(t)) is (1, 1)).
In order to test the validity of the volatility parameter σ(t), we take σ(t) to be 0 or 2. If σ(t) is 0, then the model has no effect of randomization but has only the effect of the time effective function. If σ(t) is 2, then the model has an intensive effect of wild fluctuation.
In Table 2, the predictive values and relative errors of SAI for the different values of σ(t) are given. This shows that the relative error is the smallest when σ(t)=1 (mostly below 1%) and the largest when σ(t)=2 (over 10%). So we can conclude that adding the Brownian motion to this financial model helps to increase the precision of prediction. However, the volatility parameter should not be too large; otherwise it will increase the error of the prediction.
Table 2. Predictive values and errors of time effective neural network model of SAI by different σ(t).
Time        Actual      Predictive   Error
(a) σ(t)=1
08/06/03    3605.859    3627.168     −0.0059
08/06/04    3535.809    3604.064     −0.0193
08/06/05    3516.219    3530.171     −0.0039
08/06/06    3493.189    3507.505     −0.0040
(b) σ(t)=0
08/06/03    3605.859    3655.273     −0.014
08/06/04    3535.809    3689.849     −0.044
08/06/05    3516.219    3559.069     −0.012
08/06/06    3493.189    3502.137     −0.003
(c) σ(t)=2
08/06/03    3605.859    4060.890     −0.126
08/06/04    3535.809    4034.350     −0.141
08/06/05    3516.219    4001.172     −0.138
08/06/06    3493.189    3968.767     −0.136
In Table 3, I stands for the average relative error, II stands for the average relative error of the first 1000 days in the data sets, and III stands for the average relative error of the latest 100 days in the data sets. In Table 3, the effect of the time effective function in the model is clearly expressed. Taking the relative error in SAI at σ(t)=1 as an example, the average relative error is 2.6%, the average relative error of the first 1000 days is 6.88% and the average relative error of the latest 100 days is 1.3%. So the latest data are more valuable than the historical data of the past in the stock market.
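The three summary measures can be computed in a few lines. The Python sketch below takes the relative error to be (actual − predictive)/actual, which reproduces the sign and magnitude of rows in Table 1, and averages absolute values; the use of absolute values and the function names `relative_error` and `error_summary` are assumptions for illustration.

```python
import numpy as np

def relative_error(actual, predicted):
    """Signed relative error (actual - predicted) / actual."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return (a - p) / a

def error_summary(actual, predicted, first=1000, last=100):
    """Average absolute relative error: overall (I), over the earliest
    `first` samples (II) and over the latest `last` samples (III)."""
    r = np.abs(relative_error(actual, predicted))
    return {"I": float(r.mean()),
            "II": float(r[:first].mean()),
            "III": float(r[-last:].mean())}
```

For instance, `relative_error(93.14, 87.64)` rounds to 0.059, the value listed for SBI on 93/12/24 in Table 1.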
Table 3. Average relative errors for SAI, SBI, DJI, IXIC, HSI and SP500 of different σ(t).
        SAI      SBI      DJI      IXIC     HSI      SP500
(a) σ(t)=1
I       0.026    0.0212   0.0388   0.0152   0.0104   0.0074
II      0.0688   0.0262   0.0697   0.0366   0.0112   0.0077
III     0.013    0.0141   0.0058   0.0078   0.0109   0.0078
(b) σ(t)=0
I       0.0766   0.0284   0.0128   0.2143   0.0322   0.0345
II      0.1637   0.0391   0.201    0.7247   0.0379   0.045
III     0.0514   0.0148   0.0103   0.0328   0.0247   0.0366
(c) σ(t)=2
I       0.2458   0.1041   0.043    0.1443   0.0636   0.0897
II      0.8686   0.1276   0.083    0.3466   0.0875   0.2256
III     0.0353   0.0985   0.034    0.1041   0.0347   0.0265
In Table 3, the errors of the global stock indices for the different values of σ(t) are also given. Taking the relative error in SAI as an example, when σ(t)=1 the average relative error is 2.6%; when σ(t)=0 it is 7.66%; and when σ(t)=2 it is 24.58%. This implies that an appropriate volatility parameter is beneficial for the prediction of the stochastic time effective neural network, and that the model is widely applicable to the global stock markets.
In Fig. 6, the comparison between the predictive values and the actual values in SAI, SBI, HSI and IXIC by the stochastic time effective neural network model is shown.
Fig. 6. Comparison of the predictive values and actual values.
In Fig. 7, by using the linear regression method, we compare the predictive values of the stochastic time effective neural network model with the actual values in SAI (a), SBI (b), NSDK (c) and SP500 (d). Through the regression analysis, a different linear equation is obtained for each of SAI (a), SBI (b), NSDK (c) and SP500 (d). The correlation coefficients are R = 0.9984 for SAI (a), R = 0.9978 for SBI (b), R = 0.9988 for NSDK (c) and R = 0.9991 for SP500 (d). In this way we test the accuracy of the forecasting results from another angle.
Fig. 7. Regression of the predictive values and actual values.
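A regression check of this kind can be reproduced with a least-squares line and the Pearson correlation coefficient. The Python sketch below is a generic illustration using NumPy; the function name `fit_line` is hypothetical, and regressing the predictive values on the actual values (rather than the reverse) is an assumption.

```python
import numpy as np

def fit_line(actual, predicted):
    """Least-squares line predicted ~ slope * actual + intercept,
    together with the Pearson correlation coefficient R."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    slope, intercept = np.polyfit(a, p, 1)
    r = np.corrcoef(a, p)[0, 1]
    return float(slope), float(intercept), float(r)
```

A slope near 1, an intercept near 0 and R near 1 would indicate that the predictive values track the actual values closely.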
4. Conclusions
This paper introduces a new stochastic time effective function to build a stochastic time effective neural network model. The effectiveness of the model has been analyzed by performing a numerical experiment on the data of SAI, SBI, HSI, DJI, IXIC and SP500, and the validity of the volatility parameters of the Brownian motion is tested. Further, the present paper shows some predictive results on the global stock indices using the stochastic time effective neural network model.
Acknowledgements
The authors were supported in part by National Natural Science Foundation of China Grant No. 70771006 and by BJTU Foundation No. 2006XM044.
References
Abhyankar et al., 1997
A. Abhyankar, L.S. Copeland and W. Wong, Uncovering nonlinear structure in real-time stock-market indexes: The SP500, the DAX, the Nikkei 225, and the FTSE-100, Journal of Business Economics and Statistics 15 (1997), pp. 1–14.
Austin et al., 1997
M. Austin, C. Looney and J. Zhuo, Security market timing using neural network models, Expert Systems 3 (1997), pp. 3–14.
Azoff, 1994
E.M. Azoff, Neural network time series forecasting of financial market, Wiley, New York (1994).
Balvers et al., 1990
R.J. Balvers, T.F. Cosimano and B. McDonald, Predicting stock returns in an efficient market, Journal of Finance 55 (1990), pp. 1109–1128.
Breen et al., 1990
W. Breen, L.R. Glosten and R. Jagannathan, Predictable variations in stock index returns, Journal of Finance 44 (1990), pp. 1177–1189.
Campbell, 1987
J. Campbell, Stock returns and the term structure, Journal of Financial Economics 18 (1987), pp. 373–399.
Chenoweth and Obradovic, 1996
T. Chenoweth and Z. Obradovic, A multi-component nonlinear prediction system for the SP500 index, Neurocomputing 10 (1996), pp. 275–290.
Chenoweth and Obradovic, 1996
T. Chenoweth and Z. Obradovic, Embedding technical analysis into neural network based trading systems, Applied Artificial Intelligence 10 (1996), pp. 523–541.
Cybenko, 1989
G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control Signals and Systems 2 (1989), pp. 303–314.
Demuth and Beale, 1998
H. Demuth and M. Beale, Neural network toolbox: For use with MATLAB (5th ed.), The Math Works, Inc, Natick, MA (1998).
Desai and Bharati, 1998
V.S. Desai and R. Bharati, The efficiency of neural networks in predicting returns on stock and bond indices, Decision Sciences 29 (1998), pp. 405–425.
Duda et al., 2001
R.O. Duda, P.E. Hart and D.G. Stork, Pattern classification, Wiley, New York (2001).
Elton and Gruber, 1991
E.J. Elton and M.J. Gruber, Modern portfolio theory and investment analysis (4th ed.), Wiley, New York (1991).
Ji and Wang, 2007
M.F. Ji and J. Wang, Data analysis and statistical properties of Shenzhen and Shanghai land indices, WSEAS Transactions on Business and Economics 4 (2007), pp. 33–39.
Li and Wang, 2006
Q.D. Li and J. Wang, Statistical properties of waiting times and returns in Chinese stock markets, WSEAS Transactions on Business and Economics 3 (2006), pp. 758–765.
Nakajima, 2000
Y. Nakajima, Are fluctuations in stock price unexpected, Artificial Intelligence and Knowledge Based Processing (AI) 100 (2000), pp. 37–42.
Pino et al., 2008
R. Pino, J. Parreno, A. Gomez and P. Priore, Forecasting next-day price of electricity in the Spanish energy market using artificial neural networks, Engineering Applications of Artificial Intelligence 21 (2008), pp. 53–62.
Shtub and Versano, 1999
A. Shtub and P. Versano, Estimating the cost of steel pipe bending, a comparison between neural networks and regression analysis, Journal of Production Economics 62 (1999), pp. 201–207.
Wang, 2007
J. Wang, Stochastic process and its application in finance, Tsinghua University Press and Beijing Jiaotong University Press (2007).
Corresponding author. Tel./fax: +86 10 51682867.
Expert Systems with Applications
Volume 37, Issue 1, January 2010, Pages 834–841
Improving returns on stock investment through neural network selection
Tong-Seng Quah (a) and Bobby Srinivasan (b)
(a) School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore
(b) School of Accountancy and Business, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore
Available online 1 November 1999.
Abstract
The Artificial Neural Network (ANN) is a heavily researched technique used in engineering and
scientific applications ranging from control systems to artificial intelligence. Its
generalization power has won admiration in the engineering and scientific fields, and in recent
years finance researchers and practitioners have also taken an interest in applying the ANN.
Bankruptcy prediction, debt-risk assessment and security market applications are the three
areas most heavily researched in the finance arena. The results thus far have been encouraging,
as the ANN displays better generalization power than conventional statistical tools or
benchmarks.
The intensive research on, and proven ability of, the ANN in security market applications,
together with the growing importance of equity securities in Singapore, motivated the
conceptual development of this project, which uses the ANN in stock selection. With its proven
generalization ability, the ANN is able to infer from historical patterns the characteristics of
performing stocks. The performance of stocks reflects the profitability and the quality of
management of the underlying company. Such information is reflected in financial and technical
variables. The ANN is therefore used as a tool to uncover the intricate relationships between
the performance of stocks and the related financial and technical variables. Historical data
such as financial variables (inputs) and performance of the stock (output) are used in this ANN
application. Experimental results obtained thus far have been very encouraging.
Author Keywords:
Technical analysis; Fundamental analysis; Neural network; Economic factors; Political factors;
Firm specific factors
Article Outline
1. Introduction
2. Application of neural network in financial and commercial domains
3. Neural architecture
3.1. Select the Appropriate Algorithm
3.2. Architecture of ANN
3.3. Selection of the Learning Rule
3.4. Selection of the Appropriate Learning Rates and Momentum
4. Variables selection
5. Experiment
5.1. Research design
5.2. Design 1 (Basic System)
5.3. Design 2 (Moving Window System)
5.4. Results
6. Conclusion and future works
References
1. Introduction
With the growing importance of equities to both international and local investors, the
selection of attractive stocks is of utmost importance to ensure a good return. A reliable tool
in the selection process can therefore be of great assistance to these investors. An effective
and efficient tool/system gives the investor a competitive edge over others, as he/she can
identify the performing stocks with minimum effort.
Trading strategies, rules and concepts based on fundamental and technical analysis have been
devised by both academics and practitioners to assist investors in their decision-making
process. Innovative investors opt to employ information technology to improve the efficiency of
the process. This is done by translating trading strategies into computer languages so as to
exploit the logical processing power of the computer, which greatly reduces the time and effort
spent in short-listing attractive stocks.
In this age where information technology is dominant, such computerized rule-based expert
systems have great limitations that affect their effectiveness and efficiency. However, with
the significant advancement in the field of Artificial Neural Networks (ANN), these limitations
have found a solution. In this research, the generalization ability of the ANN is harnessed to
create an effective and efficient tool for stock selection. Results of research in this field
have so far been very encouraging.
2. Application of neural network in financial and commercial domains
Research developments in bankruptcy prediction have shown that the ANN performs better than
conventional statistical methods such as Discriminant Analysis and Logistic Regression.
Alici (1996), using UK data, shows that an ANN with a three-layer multi-layer Perceptron
architecture (28 financial inputs, seven hidden neurons and two output neurons) achieves an
average of 71.38% (76.07%) for the failed firms (healthy firms). Discriminant Analysis and
Logistic Regression, on the other hand, achieved 60.12% (71.43%) and 65.29% (71.07%),
respectively, for the failed firms (healthy firms). The feasibility of applying the ANN to
bankruptcy prediction was also studied by Raghupathi, Schkade and Raju (1991), Odom and
Sharda (1993), and many others. Salchenberger, Cinar and Lash (1992) use the ANN to classify
failures among Savings and Loans (S&Ls) organizations in the US. In debt risk assessment
applications, Dutta and Shekhar (1993) use a neural network to classify bonds.
The generalization ability of the ANN has also been extended to commodity trading (copper).
Robles and Naylor (1996) show that the ANN outperforms the traditional Weighted Moving Average
rule and a “buy and hold” strategy. In equity, Gencay and Stengos (1996) show that the ANN
outperforms the [...] (linear model) under an identical research methodology. The ratios of the
average MSPE of the testing model to that of the benchmark are 0.961 and 1 for the ANN and the
[...], respectively. Burgess and Refenes (1996) show that the ANN has better generalization
ability than Ordinary Least Squares in predicting the daily return of the [...] index. Neural
networks have also been applied to predicting the trend of the Italian stock market ([...]).
Chinetti and Rossignoli (1993) attempted to use technical variables to build an accurate
prediction system to forecast the timing of buying and selling for the [...] index.
Westheider (1994) studied the predictive power of the ANN on stock index returns, using
economic and fundamental variables to predict monthly, quarterly and yearly returns.
3. Neural architecture
The computer software selected for training and testing the network is Neural Planner version
3.71, programmed by Stephen Wolstenholme. It is an ANN simulator designed strictly for a single
learning algorithm, Back Propagation.
There are four major issues in the selection of the appropriate network:
1. Select the Appropriate Algorithm.
2. Architecture of the ANN.
3. Selection of the Learning Rule.
4. Selection of the Appropriate Learning Rates and Momentum.
3.1. Select the Appropriate Algorithm
Since the sole purpose of this project is to identify the top performing stocks, and the
historical data used for the training process have a known outcome (whether a stock is
considered a top performer or otherwise), algorithms designed for supervised learning are
ideal. Among the available algorithms, the Back Propagation algorithm designed by Rumelhart,
Hinton and Williams (1986) is the most suitable, as it has been intensively tested in Finance.
Moreover, it is recognized as a good algorithm for generalization purposes.
3.2. Architecture of ANN
Architecture, in this context, refers to the entire structural design of the ANN (Input Layer,
Hidden Layer and Output Layer). It involves determining the appropriate number of neurons
required for each layer and also the appropriate number of layers within the Hidden Layer.
The Hidden Layer can be considered the crux of the Back Propagation method. This is because the
hidden layer can extract higher level features and facilitate generalization if the input
vectors have low level features of a problem domain or if the output/input relationship is
complex. The fewer the hidden units, the better the ANN is able to generalize. It is important
not to over-fit the ANN by giving it more hidden units than required, to the point where it can
memorize the data. Hidden units behave like a storage device: they learn the noise present in
the training set as well as the key structures. No generalization ability can be expected from
such a network, which is undesirable as it has little explanatory power in a different
situation/environment.
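To make the link between hidden units and storage capacity concrete, the following minimal
Python sketch counts the trainable weights of a fully connected network with one hidden layer.
The function name, the bias terms and the single output unit are illustrative assumptions and
are not taken from Neural Planner; the seven inputs and the 4, 8 and 14 hidden neurons are the
figures reported elsewhere in this paper.

```python
def count_weights(n_inputs, n_hidden, n_outputs=1):
    """Trainable weights of a one-hidden-layer network (bias terms are an assumption)."""
    hidden_weights = (n_inputs + 1) * n_hidden    # +1 for each hidden unit's bias
    output_weights = (n_hidden + 1) * n_outputs   # +1 for each output unit's bias
    return hidden_weights + output_weights


# Seven inputs, one output, and the three hidden-layer sizes mentioned in the paper.
for hidden in (4, 8, 14):
    print(hidden, "hidden units:", count_weights(7, hidden), "weights")
```

Under these assumptions the weight count grows roughly linearly with the number of hidden
units, which is one way to see why a larger hidden layer has more room to memorize noise.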
3.3. Selection of the Learning Rule
The learning rule is the rule that the network follows in its error reducing process. It
facilitates the derivation of the relationships between the input(s) and output(s). The
generalized delta rule developed by Rumelhart et al. (1986) is used in the calculation of
weights. This particular rule is selected because it is heavily used and has proven effective
in Finance research.
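As an illustration only, the textbook form of the generalized delta rule with a momentum term
updates each weight by the negative gradient scaled by the learning rate, plus a fraction of
the previous update. The sketch below applies one such step to a flat list of weights; the
function name, the default parameter values and the flat-list layout are assumptions made for
the example, not details of the software used in this study.

```python
def delta_rule_step(weights, grads, prev_updates, learning_rate=0.1, momentum=0.9):
    """One weight update: delta_w = -learning_rate * gradient + momentum * previous delta_w."""
    new_updates = [
        -learning_rate * g + momentum * du
        for g, du in zip(grads, prev_updates)
    ]
    new_weights = [w + du for w, du in zip(weights, new_updates)]
    return new_weights, new_updates


# Hypothetical single weight: gradient 2.0, previous update 0.5.
weights, updates = delta_rule_step([1.0], [2.0], [0.5])
print(weights, updates)
```

The returned updates are passed back in as `prev_updates` on the next step, which is what lets
the momentum term smooth successive changes.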
3.4. Selection of the Appropriate Learning Rates and Momentum
The Learning Rate and Momentum are parameters in the learning rule that aid the convergence of
error, so as to arrive at weights that are representative of the existing relationships between
the input(s) and the output(s).
As for the appropriate learning rate and momentum to use, the software has a feature, known as
“Smart Start”, that can determine an appropriate learning rate and momentum for the network to
start training with. Once this function is activated, the network is tested using different
values of learning rate and momentum to find the combination that yields the lowest average
error after a single learning cycle. These are the optimum starting values, as using them
improves error convergence and thus requires less processing time.
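The search described above can be sketched as a small grid search. This is a hedged
illustration of the idea, not the actual “Smart Start” implementation: the function name, the
candidate grids and the `one_cycle_error` callback (a hypothetical function that trains for one
cycle and returns the average error) are all assumptions.

```python
from itertools import product


def smart_start(one_cycle_error,
                rates=(0.1, 0.3, 0.5, 0.7, 0.9),
                momenta=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return the (learning_rate, momentum) pair with the lowest error after one cycle."""
    return min(product(rates, momenta), key=lambda pair: one_cycle_error(*pair))


# Toy stand-in for one training cycle; a real callback would train the network.
def toy_error(rate, momentum):
    return (rate - 0.3) ** 2 + (momentum - 0.7) ** 2


print(smart_start(toy_error))
```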
Another attractive feature is that the software comes with an “Auto Decay” function that can be
enabled or disabled. This function automatically adjusts the learning rate and momentum to
enable faster and more accurate convergence. The software samples the average error
periodically, and if it is higher than the previous sample, the learning rate is reduced by 1%.
The momentum is “decayed” using the same method, but its sampling rate is half of that used for
the learning rate. If both learning rate and momentum decay are enabled, the momentum will
therefore decay more slowly than the learning rate.
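A minimal sketch of this decay schedule follows. It is one reading of the description, not the
software's code: the sampling period of 10 cycles is an arbitrary assumption, “half the
sampling rate” is interpreted as sampling the momentum every second period, and the function
and parameter names are hypothetical.

```python
def auto_decay(errors, learning_rate, momentum, period=10, factor=0.99):
    """Apply 1% decays given a sequence of average errors, one entry per training cycle."""
    last_rate_sample = None
    last_momentum_sample = None
    for cycle, err in enumerate(errors, start=1):
        if cycle % period == 0:              # learning rate is sampled every period
            if last_rate_sample is not None and err > last_rate_sample:
                learning_rate *= factor      # error rose since last sample: reduce by 1%
            last_rate_sample = err
        if cycle % (2 * period) == 0:        # momentum is sampled half as often
            if last_momentum_sample is not None and err > last_momentum_sample:
                momentum *= factor
            last_momentum_sample = err
    return learning_rate, momentum


# Steadily rising error: the learning rate decays three times, the momentum once.
print(auto_decay([1.0, 2.0, 3.0, 4.0], 0.5, 0.9, period=1))
```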
In general cases where these features are not available, a high learning rate and momentum
(e.g. 0.9 for both) are recommended, as the network will converge faster than when lower
figures are used. However, too high a learning rate and momentum will cause the error to
oscillate and thus prevent convergence. Therefore, the choice of learning rate and momentum
depends on the structure of the data and the objective of using the ANN.
4. Variables selection
In general, the financial variables chosen are constrained by data availability. They are
chosen first for their significant influence on stock returns, based on past literature and
practitioners’ opinions, and then on the availability of such data. Most of the data used in
this research is provided by Credit Lyonnais Securities (Singapore) Pte Ltd. Stock prices are
extracted from a financial database called [...].
Broadly, factors that can affect stock prices can be classified into three categories:
economic factors, political factors and firm/stock specific factors. Economic factors have a
significant influence on the returns of individual stocks as well as the stock index in
general, as they have a significant impact on the growth and earnings prospects of the
underlying companies, thus affecting valuation and returns. Moreover, economic variables also
have a significant influence on the liquidity of the stock market. Some of the economic
variables used are: inflation rates, employment figures and producers’ price index.
Many researchers have found that it is difficult to account for more than one third of the
monthly variations in individual stock returns on the basis of systematic economic influences,
and have shown that political factors could help to explain some of the missing variations.
Political stability is vital to the existence of business activities and is the main driving
force in building a strong and stable economy. Therefore, it is only natural that political
factors such as fiscal policies, budget surplus/deficit etc. have effects on stock price
movements.
Firm specific factors affect only individual stock returns. Examples are financial ratios and
some technical information that affect the return structure of specific stocks, such as yield
factors, growth factors, momentum factors, risk factors and liquidity factors. As far as stock
selection is concerned, firm specific factors constitute important considerations, as it is
these factors that determine whether a firm is a bright star or a dim light in the industry.
Such firm specific factors can be classified into five major categories:
1. Yield factors: these include “historical P/E ratio” and “prospective P/E ratio”. The former
is computed as price/earnings per share. The latter is derived as price/consensus earnings per
share estimate. Another variable is the “cashflow yield”, which is basically price/operating
cashflow of the latest 12 months.
2. Liquidity factors: the most important variable is the “market capitalization”, which is
determined by “price of share × number of shares outstanding”.
3. Risk factors: the representative variable is the “earnings per share uncertainty”, which is
defined as “percentage deviation about the median EPS estimates”.
4. Growth factors: basically, this means the “return on equity (ROE)”, which is computed as
“net profit after tax before extraordinary items/shareholders’ equity”.
5. Momentum factors: a proxy is derived as the “average of the price appreciation over the
quarter with half of its weights on the last month and remaining weights being distributed
equally in the remaining two months”.
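The seven variables defined in the list above can be sketched in Python as follows. This is an
illustrative reading of the definitions rather than the authors' code: the function and field
names are hypothetical, per-share figures are assumed where the text does not say otherwise,
and “percentage deviation about the median” is interpreted here as the mean absolute deviation
from the median, expressed as a percentage of the median.

```python
from statistics import median


def firm_inputs(price, eps, consensus_eps, cashflow_per_share, shares_outstanding,
                eps_estimates, net_profit, shareholders_equity, monthly_returns):
    """Compute the seven firm specific inputs for one stock and one quarter."""
    med = median(eps_estimates)
    # Assumed reading: mean absolute deviation about the median, as a percentage.
    deviation = sum(abs(e - med) for e in eps_estimates) / len(eps_estimates)
    first, second, last = monthly_returns  # the three months of the quarter, in order
    return {
        "historical_pe": price / eps,
        "prospective_pe": price / consensus_eps,
        "cashflow_yield": price / cashflow_per_share,   # as defined in the text
        "market_cap": price * shares_outstanding,
        "eps_uncertainty": 100.0 * deviation / abs(med),
        "roe": net_profit / shareholders_equity,
        "momentum": 0.25 * first + 0.25 * second + 0.5 * last,
    }


# Hypothetical figures, chosen only to exercise the formulas.
example = firm_inputs(price=10.0, eps=2.0, consensus_eps=2.5, cashflow_per_share=4.0,
                      shares_outstanding=1000, eps_estimates=[2.0, 2.5, 3.0],
                      net_profit=50.0, shareholders_equity=200.0,
                      monthly_returns=(0.04, 0.0, 0.02))
print(example)
```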
The inputs of the neural network stock selection system are the above seven inputs, and the
output is the return difference between the stock and the market return (excess return). This
is to enable the neural network to establish the relationships between the inputs and the
output (excess returns).
The training data set will include all data available until the quarter before the testing
quarter. This is to ensure that the latest changes in the relationship between the inputs and
the output are captured in the training process.
5. Experiment
The quarterly data required by the project are generally stock prices and financial variables
(inputs to the ANN stock selection system) from 1/1/93 to 31/12/96. Stock prices, which are
used to calculate stock returns, are extracted from the financial database called [...]. These
stock returns, adjusted for dividends, stock splits and bonus issues, will be used as output
in the ANN training process.
One unique feature of this research is that the Prospective P/E ratio, measured as
(Price/Consensus Earnings Per Share Estimate), is used as a forecasting variable. This variable
has not received much attention in financial research. The Prospective P/E ratio is used among
practitioners as it reflects the perceived value of a stock with respect to EPS (earnings per
share) expectations. It is used as a value indicator, with implications similar to those of the
Historical P/E ratio. As such, a low Prospective P/E suggests that the stock is undervalued
with respect to its future earnings, and vice versa. With its explanatory power, the
Prospective P/E ratio qualifies as an input in the stock selection system.
Data on Earnings per Share estimates, which are used for the calculation of EPS Uncertainty and
Prospective P/E ratio, are available in [...]. This is a compilation of EPS estimates and
recommendations put forward by financial analysts. The coverage has estimates from countries
across the Asia Pacific Rim from January 1993.
5.1. Research design
The purpose of this ANN stock selection system is to select stocks that are top performers in
the market (stocks that outperformed the market by 5%) and to avoid selecting underperformers
(stocks that underperformed the market by 5%). More importantly, the aim is to beat the market
benchmark (quarterly return on the [...] index) on a portfolio basis.
This ANN stock selection system is a quarterly portfolio re-balancing strategy whereby it
selects stocks at the beginning of the quarter, and performance (the return of the portfolio)
is assessed at the end of the quarter.
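The top performer and underperformer definitions above can be sketched as a simple labelling
rule on quarterly excess returns. The function name and label strings are hypothetical, and
treating “by 5%” as a difference of at least five percentage points between the stock return
and the market return is an assumption of this example.

```python
def classify(stock_return, market_return, threshold=0.05):
    """Label a stock for one quarter from its excess return over the market."""
    excess = stock_return - market_return
    if excess >= threshold:
        return "top"
    if excess <= -threshold:
        return "under"
    return "neutral"


# Hypothetical quarterly returns (stock, market).
print(classify(0.12, 0.04), classify(0.00, 0.08), classify(0.05, 0.04))
```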
5.2. Design 1 (Basic System)
In this research design, the sample used for training consists of stocks that outperformed and
underperformed the market quarterly by 5% from 1/1/93 to 30/6/95.
The inputs of the ANN stock selection system are the seven inputs chosen in Section 4, and the
output is the return difference between the stock and the market return (excess return). This
is to enable the ANN to establish the relationships between the inputs and the output (excess
returns).
The training data set will include all data available until the quarter before the testing
quarter. This is to ensure that the latest changes in the relationship between the inputs and
the output are captured in the training process.
The generalization ability of the ANN in selecting top performing stocks, and whether the
system can perform, is tested across time. The data used for the selection process are from
the third quarter of 1995 (1/7/95–30/9/95), the fourth quarter of 1995 (1/10/95–31/12/95), the
first quarter of 1996 (1/1/1996–31/3/1996), the second quarter of 1996 (1/4/1996–30/6/1996),
the third quarter of 1996 (1/7/1996–30/9/1996) and the fourth quarter of 1996
(1/10/1996–31/12/1996). The test duration is limited by data availability.
The testing inputs are fed into the system and the predicted output is calculated using the
established weights. The 25 stocks with the highest output values are then selected to form a
portfolio. These are the top 25 stocks recommended for purchase at the beginning of the
quarter. The generalization ability of the ANN is determined by the performance of the
portfolio, measured by excess returns over the market as well as the percentage of top
performers in the portfolio compared with the benchmark portfolio (Testing Portfolio) at the
end of the month.
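The selection and evaluation step described above can be sketched as follows. The sketch is
illustrative: the function names are hypothetical, the portfolio is assumed to be equally
weighted (the paper does not state the weighting), and the example uses four made-up stocks
with a portfolio size of two instead of 25.

```python
def select_portfolio(predicted, n=25):
    """Pick the n stocks with the highest predicted output (predicted excess return)."""
    return sorted(predicted, key=predicted.get, reverse=True)[:n]


def evaluate(portfolio, actual_excess, threshold=0.05):
    """Return (mean excess return, share of top performers) for an equal-weight portfolio."""
    mean_excess = sum(actual_excess[s] for s in portfolio) / len(portfolio)
    top_share = sum(actual_excess[s] >= threshold for s in portfolio) / len(portfolio)
    return mean_excess, top_share


# Hypothetical predictions and realized excess returns for four stocks.
predicted = {"A": 0.10, "B": 0.30, "C": -0.20, "D": 0.20}
actual = {"A": 0.00, "B": 0.08, "C": -0.10, "D": 0.02}
chosen = select_portfolio(predicted, n=2)
print(chosen, evaluate(chosen, actual))
```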
5.3. Design 2 (Moving Window System)
The Basic System is constrained by the minimum sample size required for the training process.
This second design forgoes the recommended minimum sample size and introduces a Moving Window
concept, in order to analyze the ANN's ability to perform in a restricted sample size
environment.
The input and output variables are identical to those of the Basic System, but the training and
testing samples are different. The Moving Window System uses three quarters as the training
sample and the subsequent quarter as the testing sample. The selection criterion is also
identical to that of the Basic System in research Design 1.
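The two sampling schemes can be contrasted with a short sketch that generates the train/test
splits for each design. The function names and quarter labels are hypothetical; the 16 quarters
from 1993 to 1996, the first test quarter of Design 1 (third quarter of 1995) and the
three-quarter window of Design 2 follow the description in this paper.

```python
def expanding_splits(quarters, first_test_index):
    """Design 1: train on every quarter before the test quarter."""
    return [(quarters[:i], quarters[i]) for i in range(first_test_index, len(quarters))]


def moving_window_splits(quarters, window=3):
    """Design 2: train on the previous `window` quarters only."""
    return [(quarters[i - window:i], quarters[i]) for i in range(window, len(quarters))]


quarters = [f"{year}Q{q}" for year in range(1993, 1997) for q in range(1, 5)]
basic = expanding_splits(quarters, quarters.index("1995Q3"))
moving = moving_window_splits(quarters, window=3)
print(len(basic), "test quarters in Design 1;", len(moving), "in Design 2")
```

With these inputs the moving window yields 13 test quarters, the same number of testing
quarters reported for the Moving Window System in the results.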
5.4. Results
The ANN is trained for 10 000 and 15 000 cycles. These numbers of cycles are used because error
convergence is generally slow after 10 000 cycles, suggesting adequate training. Moreover, the
error does not converge further beyond 15 000 cycles, which is an indicator that the network is
over-trained.
Training with four hidden neurons for 10 000 cycles takes approximately 1.5 h, eight hidden
neurons about 3 h, and the most complex architecture (14 hidden neurons) about 6 h on a
Pentium 100 MHz PC. Architectures that require 15 000 cycles usually take about 1.5 times as
long as training the network for 10 000 cycles.
The results of the Basic Stock Selection System based on the training and testing schedules
mentioned are presented in two forms: (1) the excess return format and (2) the percentage of
top performers in the selected portfolio. These two measures are used to assess the performance
and generalization ability of the ANN.
Testing results show that the ANN is able to “beat” the market over time, as shown by the
positive compounded excess returns achieved consistently across all architectures and training
cycles. This implies that the ANN can generalize relationships over time. Even at the level of
individual quarters, the relationships between the inputs and the output established by the
training process prove successful, “beating” the market in 6 out of 8 possible quarters, which
is a reasonable 75% (Fig. 1).
Fig. 1. Graphical presentation of excess returns for 15 000 cycles.
The Basic Stock System has consistently performed better than the testing portfolio over time.
This is evident from the fact that the selected portfolios have a higher percentage of top
performing stocks (above 5%) than the testing portfolio over time. This ability has also
enabled the network to better the performance of the market ([...] index) presented earlier
(Fig. 2 and Fig. 3).
Fig. 2. Performance of portfolios with % of stocks with actual return of 5% above market for
10 000 cycles.
Fig. 3. Graphical presentation of excess returns of portfolio.
The Moving Window Selection System is designed to test the generalization power of the ANN in
an environment with limited data.
The generalization ability of the ANN is again evident in the Moving Window Stock Selection
System, as it outperformed the Testing portfolio in 9 out of 13 testing quarters (69.23%). This
can be seen in the graphical presentation, where the line representing the selected portfolio
is above the line representing the testing portfolio most of the time (Fig. 4). Moreover, the
compounded excess returns and the annualized compounded excess returns are twice those of the
testing portfolio. The selected portfolios outperformed the market in 10 out of 13 (76.92%)
testing quarters, with excess returns of 127.48% for the 13 quarters and 36.5% for the
annualized compounded return, which demonstrates their consistent performance over the market
([...] index) over time (Fig. 5).
Fig. 4. % of top performers in the portfolio – moving window stock selection system.
Fig. 5. Actual returns – moving window stock selection system.
The Selected Portfolios outperformed the Testing Portfolio in nine quarters (69.23%) and
equalled its performance on one occasion. This further demonstrates the generalization ability
of the ANN. Moreover, the ability to avoid selecting undesirable stocks is also evident from
the fact that the selected portfolios have fewer stocks of this kind than the testing portfolio
on 10 out of 13 occasions (76.92%).
From the experimental results, the Selected portfolios outperformed the Testing and Market
portfolios in terms of compounded actual returns over time. The reason is that the Selected
portfolios outperform the other two categories of portfolios in most of the testing quarters,
thus achieving a better overall position at the end of the testing period.
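For readers who want to reproduce this kind of summary from their own quarterly figures, the
sketch below compounds a series of quarterly returns and converts the result to an annualized
rate. Geometric compounding and four periods per year are assumptions of this example, since
the paper does not spell out its formula; the function names and the sample returns are
hypothetical.

```python
def compounded(returns):
    """Compound a sequence of periodic returns, e.g. 0.10 for 10%."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def annualized(returns, periods_per_year=4):
    """Geometric annualized return implied by the compounded return."""
    growth = 1.0 + compounded(returns)
    return growth ** (periods_per_year / len(returns)) - 1.0


# Hypothetical quarterly excess returns over one year.
sample = [0.10, 0.10, 0.10, 0.10]
print(compounded(sample), annualized(sample))
```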
6. Conclusion and future works
The ANN has displayed its generalization ability in this particular application. This is
evident from its ability to single out performing stock counters and to achieve excess returns
in the Basic Stock Selection System over time. Moreover, the neural network has also shown its
ability to derive relationships in a constrained environment in the Moving Window Stock
Selection System, making it even more attractive for applications in the field of Finance.
This paper is largely constrained by the availability of data. When more data is available, the
performance of neural networks can be better assessed under various market conditions, such as
bull, bear, high inflation, low inflation or even political conditions, all of which have
different impacts on stocks. Also, as more powerful neural architectures are being discovered
by researchers at a fast pace, it would be good to repeat the experiments using several
architectures and compare the results. The best performing structure may then be employed.
References
Alici, Y., 1996. Neural network in corporate failure prediction: the UK experience. In:
Proceedings of the Third International Conference on Neural Networks in the Capital Markets.
World Scientific, Singapore. London, 11–13 October 1995.
Burgess, A.N. and Refenes, A.N., 1996. Modelling non-linear co-integration in international
equity index futures. In: Proceedings of the Third International Conference on Neural Networks
in the Capital Markets. World Scientific, Singapore. London, 11–13 October 1995.
Chinetti, D. and Rossignoli, C., 1993. A neural network model for stock market prediction.
Tech. Rep., Universita Statale di Milano, Department of Computer Science, Milan, Italy.
Dutta, S. and Shekhar, S., 1993. Bond rating: a non-conservative application of neural
networks. Chapter 14, reprint. In: Robert R. Trippi and Efraim Turban, Editors, 1993. Neural
networks in finance and investing. Probus Publishing Company, pp. 257–273. Proceedings of the
IEEE International Conference on Neural Networks, July 1988 (pp. II443–II450).
Gencay, R. and Stengos, T., 1996. The predictability of stock returns with local versus global
nonparametric estimators. In: Proceeding of the Third Conference on Neural Networks in the
Capital Markets. World Scientific, Singapore. London, 11–13 October 1995.
Odom, M.D. and Sharda, R., 1993. Neural network model for bankruptcy prediction. Chapter 14,
reprint. In: Robert R. Trippi and Efraim Turban, Editors, 1993. Neural networks in finance and
investing. Probus Publishing Company, pp. 178–185. Proceedings of the IEEE International
Conference on Neural Networks (pp. II163–II168). San Diego, CA, IEEE.
Raghupathi, W., Schkade, L.L. and Raju, B.S., 1991. A neural network approach to bankruptcy
prediction. Proceedings of the IEEE 24th Annual Hawaii International Conference on System
Sciences.
Robles, J.V. and Naylor, C.D., 1996. Applying neural networks in copper trading: a technical
analysis simulation. In: Proceedings of the Third International Conference on Neural Networks
in Capital Markets. World Scientific, Singapore. London, 11–13 October 1995.
Rumelhart, D.E., Hinton, G.E. and Williams, R.J., 1986. Learning the internal representations
by error propagation. In: Parallel distributed processing vol 1 and 2. MIT Press,
Massachusetts.
Salchenberger, L.M., Cinar, E.M. and Lash, N.A., 1992. Neural networks: a new tool for
predicting thrift failures. Decision Sciences 23 4, pp. 899–916.
Westheider, O., 1994. Stock return predictability: a neural network approach. In: Information
systems working paper #8–93. The John E. Anderson Graduate School of Management at UCLA.
Corresponding author; email: itsquah@ntu.edu.sg
Expert Systems with Applications
Volume 17, Issue 4, November 1999, Pages 295–301