
Forecasting stock exchange movements using neural networks: Empirical evidence from Kuwait

Mohamed M. Mostafa (a)

(a) New York Institute of Technology, Global Program in Bahrain, Manama, Bahrain

Available online 23 February 2010.


Abstract

Financial time series are very complex and dynamic, as they are characterized by extreme volatility. The major aim of this research is to forecast the Kuwait stock exchange (KSE) closing price movements using data for the period 2001-2003. Two neural network architectures, multi-layer perceptron (MLP) neural networks and generalized regression neural networks (GRNN), are used to predict the KSE closing price movements. The results of this study show that neuro-computational models are useful tools for forecasting stock exchange movements in emerging markets. These results also indicate that the quasi-Newton training algorithm produces fewer forecasting errors than other training algorithms. Due to the robustness and flexibility of their modeling algorithms, neuro-computational models are expected to outperform traditional statistical techniques such as regression and ARIMA in forecasting stock exchanges' price movements.

Keywords: KSE; Neural networks; Forecasting

Article Outline

1. Introduction and related work
2. Methodology
2.1. Data
2.2. Multi-layer perceptron
2.3. Generalized regression neural network
3. Results
3.1. MLP-based forecasting
3.2. GRNN-based forecasting
4. Implications, limitations and future research
References

1. Introduction and related work

The increasing globalization of the financial markets has heightened interest in emerging markets. In the Middle East there are 11 formal stock markets monitored by Standard and Poor's Emerging Markets Database (Smith, 2007). Stock markets are usually classified as either developed or emerging. Three of the Middle East stock markets are categorized as developed: Kuwait, United Arab Emirates and Qatar. Middle East stock markets are, however, relatively small by world standards, as the 11 formal stock markets in the region account for around 0.9% of world stock market capitalization (Smith, 2007).

The Kuwait stock exchange (KSE) was officially established in 1984 after the crash of Almanakh (an over-the-counter market). In the post-liberation period (1992) the KSE witnessed many reforms, among which were the government privatization program to sell its holdings of shares in local shareholding companies, the launching of mutual funds, and the emergence of institutional investors as the dominant players in the market (Al-Loughani & Chappell, 2000).

Throughout its history, the KSE has been characterized by irregularity in trades and in the price formation process. However, the Kuwaiti parliament has recently introduced some measures that permit foreign investors to trade on the KSE (Al-Loughani & Chappell, 2000). This study aims to forecast KSE movements using neural network (NN) models.

NN models have been successfully used in prediction and forecasting studies across many disciplines. One of the first successful applications of the MLP is reported by Lapedes and Farber (1988). Using two deterministic chaotic time series generated by the logistic map and the Mackey-Glass equation, they designed an MLP that can accurately mimic and predict such dynamic non-linear systems. Another major application of the MLP is in electric load consumption (e.g. [Darbellay and Slama, 2000] and [McMenamin and Monforte, 1998]). Many other problems have been solved by the MLP. A short list includes air pollution forecasting (e.g. Videnova, Nedialkova, Dimitrova, & Popova, 2006), maritime traffic forecasting (Mostafa, 2004), airline passenger traffic forecasting (Nam & Yi, 1997), railway traffic forecasting (Zhuo, Li-Min, Yong, & Yan-hui, 2007), commodity prices (Kohzadi, Boyd, Kemlanshahi, & Kaastra, 1996), ozone levels (Ruiz-Suarez, Mayora-Ibarra, Torres-Jimenez, & Ruiz-Suarez, 1995), student grade point averages (Gorr, Nagin, & Szczypula, 1994), forecasting macroeconomic data (Aminian, Suarez, Aminian, & Walz, 2006), financial time series forecasting (Yu, Wang, & Lai, 2009), advertising (Poh, Yao, & Jasic, 1998), and market trends (Aiken & Bsat, 1999).

Due to its good performance in noisy environments, the generalized regression neural network (GRNN) has been extensively used in various prediction and forecasting tasks in the literature. For example, Gaetz, Weinberg, Rzempoluck, and Jantzen (1998) analyzed EEG activity of the brain using a GRNN. Chtioui, Panigrahi, and Francl (1999) used a GRNN for leaf wetness prediction. Ibric, Jovanovi, Djuri, Paroj, and Solomun (2002) used a GRNN in the design of extended-release aspirin tablets. Cigizoglu (2005) employed a GRNN to forecast monthly water flow in Turkey; in that study, the GRNN's forecasting performance was found to be superior to the MLP and other statistical and stochastic methods. Kim and Lee (2005) used a GRNN-based genetic algorithm to predict silicon oxynitride etching. Hanna, Ural, and Saygili (2007) developed a GRNN model to predict seismic conditions at sites susceptible to liquefaction. Shie (2008) used a hybrid method integrating a GRNN and a sequential quadratic programming method to determine an optimal parameter setting for an injection-molding process.

There is an extensive literature on financial applications of NNs (e.g. [Harvey et al., 2000] and [Kumar and Bhattacharya, 2006]). For example, Cao, Leggio, and Schniederjans (2005) used NNs to predict stock price movements for firms traded on the Shanghai stock exchange. The authors compared the predictive power of linear models to that of univariate and multivariate NN models. Results showed that the NN models outperform the linear models. These results were statistically significant across the sample firms and indicated that NN models are useful for stock price prediction. Kryzanowski, Galler, and Wright (1993) used NN models with historic accounting and macroeconomic data to identify stocks that will outperform the market. McGrath (2002) used market-to-book and price-earnings ratios in an NN model to rank stocks based on likelihood estimates. Ferson and Harvey (1993) and Kimoto et al. (1990) used a series of macroeconomic variables to capture predictable variation in stock price returns. McNelis (1996) used the Chilean stock market to predict returns in the Brazilian markets. Yumlu, Gurgen, and Okay (2005) used various NN architectures to model the performance of the Istanbul stock exchange over the period 1990-2002. Leigh, Hightower, and Modani (2005) used NN models and linear regression models to model the New York Stock Exchange Composite Index data for the period from 1981 to 1999. Results were robust and informative as to the role of trading volume in the stock market. Chen, Leung, and Daouk (2003) predicted the direction of return on the market index of the Taiwan stock exchange using a probabilistic NN model. The results were then compared to the generalized method of moments (GMM) with a Kalman filter. From this literature survey we find that no previous studies have attempted to predict the movements of the KSE. In this study we aim to fill this research gap through the application of MLP and GRNN models to forecast the movements of the KSE.

2. Methodology

2.1. Data

This study covers the time period of June 17, 2001 through November 30, 2003. Thus, the data set contained 612 data points in the time series. This data set is quite similar in length to data sets in previous studies of a similar nature. We used data for all listed companies traded on the KSE. The data consist of daily closing prices. The source of the closing price data is the KSE (www.Kuwaitse.com). Table 1 shows descriptive statistics of the open and closing prices of the data used in this study.

Table 1. Descriptive statistics of KSE opening and closing prices.

                       Open          Close
Mean                   2550.021242   2553.124346
Standard Error         36.7838047    36.81831376
Median                 2196.25       2194.55
Mode                   1746.7        1764.2
Standard Deviation     909.9810725   910.8347795
Sample Variance        828065.5523   829619.9955
Kurtosis               -0.63889504   -0.64911602
Skewness               0.891027464   0.88577702
Range                  2945.2        2942.8
Minimum                1578.7        1579
Maximum                4523.9        4521.8
Count                  612           612




2.2. Multi-layer perceptron

The MLP consists of sensory units that make up the input layer, one or more hidden layers of processing units (perceptrons), and one output layer of processing units (perceptrons). The MLP performs a functional mapping from the input space to the output space. An MLP with a single hidden layer having H hidden units and a single output, y, implements mappings of the form

(1)  y = W_0 + \sum_{h=1}^{H} W_h Z_h

(2)  Z_h = f( \beta_{0h} + \sum_{j=1}^{N} \beta_{jh} X_j )

where Z_h is the output of the hth hidden unit, W_h is the weight between the hth hidden unit and the output unit, and W_0 is the output bias. There are N sensory inputs, X_j. The jth input is weighted by an amount \beta_{jh} in the hth hidden unit. The output of an MLP is compared to a target output and an error is calculated. This error is back-propagated to the neural network and used to adjust the weights. This process aims at minimizing the mean square error between the network's prediction output and the target output.
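As an illustration only (not part of the original article), the mapping in Eqs. (1) and (2) can be written out in a few lines of NumPy; the layer sizes and random weights below are arbitrary placeholders:

    import numpy as np

    def mlp_forward(X, beta, beta0, W, W0):
        # Eq. (2): Z_h = f(beta_0h + sum_j beta_jh X_j), with f the logistic sigmoid
        Z = 1.0 / (1.0 + np.exp(-(beta @ X + beta0)))
        # Eq. (1): y = W_0 + sum_h W_h Z_h
        return W0 + W @ Z

    rng = np.random.default_rng(0)
    N, H = 5, 8                      # 5 inputs, 8 hidden units (illustrative sizes)
    X = rng.normal(size=N)
    y = mlp_forward(X, rng.normal(size=(H, N)), rng.normal(size=H),
                    rng.normal(size=H), rng.normal())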

The MLP was first developed to mimic the functioning of the brain. It consists of interconnected nodes, referred to as processing elements, that receive, process, and transmit information. The MLP consists of three types of layers: the first layer is known as the input layer and corresponds to the problem input variables, with one node for each input variable. The second layer is known as the hidden layer and is useful in capturing non-linear relationships among variables. The final layer is known as the output layer and corresponds to the classification being predicted (Baranoff, Sager, & Shively, 2000). Fig. 1 represents the typical structure of the MLP.



Fig. 1. General structure of MLP.



First of all, the network has to be trained to produce the correct output with minimum error. To achieve this, the network is trained until it produces a tolerable error, as follows. Input is fed to the input nodes, and the middle-layer nodes take the input values and begin to process them. These values are processed based on the randomly allocated initial weights of the links. The input travels from one layer to another, and every layer processes the values based on the weights of its links. When the values finally reach the output node, the actual output is compared with the expected output. The difference is calculated and propagated backwards, at which point the links adjust their weights. After the error has propagated all the way back to the first layer of middle-level nodes, the input is again fed to the input nodes. The cycle repeats and the weights are adjusted over and over again until the error is minimized. The key here is the weights of the different links: the weights of the links decide the output value.
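A hedged sketch of the training cycle just described, for the single-hidden-layer mapping of Eqs. (1) and (2): random initial weights, a forward pass, back-propagation of the mean square error, and batch weight updates after each epoch. The learning rate and epoch count are illustrative choices, not values from the article:

    import numpy as np

    def train_mlp(X, t, H=8, lr=0.01, epochs=500, seed=0):
        # X: (n, N) inputs; t: (n,) targets; both assumed pre-scaled
        rng = np.random.default_rng(seed)
        n, N = X.shape
        beta = rng.normal(scale=0.1, size=(H, N)); beta0 = np.zeros(H)
        W = rng.normal(scale=0.1, size=H); W0 = 0.0
        for _ in range(epochs):
            Z = 1.0 / (1.0 + np.exp(-(X @ beta.T + beta0)))    # hidden outputs, (n, H)
            y = W0 + Z @ W                                      # network outputs, (n,)
            err = y - t                                         # output-layer error
            dZ = (err[:, None] * W) * Z * (1.0 - Z)             # error propagated back
            W -= lr * (Z.T @ err) / n; W0 -= lr * err.mean()    # adjust output weights
            beta -= lr * (dZ.T @ X) / n; beta0 -= lr * dZ.mean(axis=0)
        return beta, beta0, W, W0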

The MLP is the most frequently used neural network technique in pattern recognition (Bishop, 1999) and classification problems (Sharda, 1994). However, numerous researchers document the disadvantages of the MLP approach. For example, Calderon and Cheh (2002) argue that the standard MLP network is subject to problems of local minima. Swicegood and Clark (2001) claim that there is no formal method of deriving an MLP network configuration for a given classification task; thus, there is no direct method of finding the ultimate structure for the modeling process. Consequently, the refining process can be lengthy, accomplished by iterative testing of various architectural parameters and keeping only the most successful structures. Wang (1995) argues that the standard MLP provides unpredictable solutions in terms of classifying statistical data.

2.3. Generalized regression neural network

The GRNN was devised by Specht (1991), casting a statistical method of function approximation into a neural network form. The GRNN, like the MLP, is able to approximate any functional relationship between inputs and outputs (Wasserman, 1993). Structurally, the GRNN resembles the MLP. However, unlike the MLP, the GRNN does not require an estimate of the number of hidden units to be made before training can take place. Furthermore, the GRNN differs from the classical MLP in that every weight is replaced by a distribution of weights, which minimizes the chance of ending up in local minima. Therefore, no test and verification sets are required, and in principle all available data can be used for the training of the network (Parojcic, Ibric, Djuric, Jovanovic, & Corrigan, 2005).

The GRNN is a method of estimating the joint probability density function (pdf) of x and y, given only a training set. The estimated value is the most probable value of y and is defined by

(3)  \hat{y}(x) = \frac{ \int_{-\infty}^{\infty} y f(x, y) \, dy }{ \int_{-\infty}^{\infty} f(x, y) \, dy }

The density function f(x, y) can be estimated from the training set using Parzen's estimator (Parzen, 1962):

(4)  \hat{f}(x, y) = \frac{1}{n (2\pi)^{(p+1)/2} \sigma^{p+1}} \sum_{i=1}^{n} \exp\left( -\frac{(x - x_i)^T (x - x_i)}{2\sigma^2} \right) \exp\left( -\frac{(y - y_i)^2}{2\sigma^2} \right)

where n is the number of training samples and p is the dimension of the input vector x.


The probability estimate f(x, y) assigns a sample probability of width σ to each sample x_i and y_i, and the probability estimate is the sum of these sample probabilities (Specht, 1991). Defining the scalar function

(5)  D_i^2 = (x - x_i)^T (x - x_i)

and performing the indicated integrations yields the following:

(6)  \hat{y}(x) = \frac{ \sum_{i=1}^{n} y_i \exp(-D_i^2 / 2\sigma^2) }{ \sum_{i=1}^{n} \exp(-D_i^2 / 2\sigma^2) }

The resulting regression (6) is directly applicable to problems involving numerical data.
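Equation (6) maps directly onto code. The following minimal sketch (an illustration, not the software implementation used later in this paper) predicts y for a query point x as the Gaussian-weighted average of the training targets, with the width sigma as the only free parameter:

    import numpy as np

    def grnn_predict(x, X_train, y_train, sigma=1.0):
        # Eq. (5): squared distances D_i^2 = (x - x_i)^T (x - x_i)
        D2 = np.sum((X_train - x) ** 2, axis=1)
        # Eq. (6): weighted average of the targets y_i
        w = np.exp(-D2 / (2.0 * sigma ** 2))
        return np.dot(w, y_train) / np.sum(w)

Every training case contributes one radial unit, which is why no architecture search is needed; only sigma has to be chosen.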

The first hidden layer in the GRNN contains the radial units. A second hidden layer contains units that help to estimate the weighted average. This is a specialized procedure: each output has a special unit assigned in this layer that forms the weighted sum for the corresponding output. To get the weighted average from the weighted sum, the weighted sum must be divided by the sum of the weighting factors. A single special unit in the second layer calculates the latter value. The output layer then performs the actual divisions (using special division units). Hence, the second hidden layer always has exactly one more unit than the output layer. In regression problems, typically only a single output is estimated, and so the second hidden layer usually has two units. The GRNN can be modified by assigning radial units that represent clusters rather than each individual training case: this reduces the size of the network and increases execution speed. Centers can be assigned using any appropriate algorithm (i.e., sub-sampling, K-means or Kohonen). Fig. 2 shows the general structure of the GRNN.



Fig. 2. General structure of the generalized regression neural network (GRNN).



3. Results

3.1. MLP-based forecasting

There are many software packages available for analyzing MLP models. We chose the NeuroIntelligence package (Alyuda Research Company, 2003). This software applies artificial intelligence techniques to automatically find an efficient MLP architecture. Typically, the application of the MLP requires a training data set and a testing data set (Lek & Guegan, 1999). The training data set is used to train the MLP and must have enough examples of data to be representative of the overall problem. The testing data set should be independent of the training set and is used to assess the classification/prediction accuracy of the MLP after training. Following Lim and Kirikoshi (2005), an error back-propagation algorithm with weight updates occurring after each epoch was used for MLP training.

Fig. 3 shows the actual versus the fitted closing price values for the whole series using the quick propagation training algorithm. Fig. 4 shows the negative exponential decay in the error rate. As can be seen from Fig. 4, the best network was obtained after around 500 epochs (trials). This figure is quite similar to the typical convergence of errors in MLP models (similar error graphs were obtained using other training algorithms).



Fig. 3. Actual versus fit using the quick propagation training algorithm.




Fig. 4. Error convergence and best network error.



Fig. 5 shows the actual versus the fitted closing price values for the whole series using the conjugate gradient descent training algorithm. Fig. 6 shows the actual versus the fitted closing price values for the whole series using the quasi-Newton training algorithm.



Fig. 5. Actual versus fit using the conjugate gradient descent training algorithm.




Fig. 6. Actual versus fit using the quasi-Newton training algorithm.



Based on the descriptive statistics of the different training algorithms reported in Table 2, along with the plots of the training algorithms used, it seems that the quasi-Newton training algorithm produces fewer forecasting errors than the other two methods. This is also evident from the target vs. output graph and from the error dependence graph using the quasi-Newton training algorithm (Fig. 7).

Table 2. MLP training algorithms: major statistics.

                                Target      Output      AE        ARE
Quick propagation (a)
  Mean                          2534.744    2537.564    48.450    0.021
  SD                            908.299     890.698     31.966    0.015
  Min                           1579.000    1708.253    0.336     0.000
  Max                           4521.800    4369.447    155.094   0.084
Conjugate gradient descent (b)
  Mean                          2534.744    2535.314    26.598    0.010
  SD                            908.299     902.767     22.605    0.009
  Min                           1579.000    1653.851    0.334     0.000
  Max                           4521.800    4433.294    170.075   0.049
Quasi-Newton algorithm (c)
  Mean                          2534.744    2535.313    24.956    0.010
  SD                            908.299     903.801     21.542    0.008
  Min                           1579.000    1646.779    0.303     0.000
  Max                           4521.800    4442.282    166.271   0.046

AE = absolute error; ARE = absolute relative error.

(a) Correlation = 0.998; R^2 = 0.996.
(b) Correlation = 0.999; R^2 = 0.999.
(c) Correlation = 0.999; R^2 = 0.999.




Fig. 7. Predicted vs. actual and error dependence graph using the quasi-Newton training algorithm.



3.2. GRNN-based forecasting

There are many computer software packages available for building and analyzing NNs. Because of its extensive capabilities for building networks based on a variety of training and learning methods, the NeuralTools Professional package (Palisade Corporation, 2005) was chosen to conduct the GRNN analysis in this study. This software automatically scales all input data, mapping each variable to a range with fixed minimum and maximum values. NeuralTools Professional uses a non-linear scaling function known as 'tanh', which scales inputs to a (-1, 1) range. This function tends to squeeze data together at the low and high ends of the original data range, and may thus be helpful in reducing the effects of outliers (Tam, Tong, Lau, & Chan, 2005). Table 3 shows the basic properties of the GRNN model used in this study. Fig. 8 and Fig. 9 show the error distribution and the predicted versus actual results using the GRNN network. These figures indicate the robustness of the GRNN and the normality of its error distributions.
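The 'tanh' scaling can be sketched as below; centering on the mean and dividing by the standard deviation before squashing is an assumption about how such a transform is typically parameterized, not NeuralTools' documented internals:

    import numpy as np

    def tanh_scale(x):
        # squash a variable into (-1, 1); extreme values are compressed,
        # which damps the influence of outliers
        x = np.asarray(x, dtype=float)
        return np.tanh((x - x.mean()) / x.std())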

Table 3. GRNN architecture.

Net Information
  Configuration                        GRNN Numeric Predictor

Training
  Number of Cases                      490
  Number of Trials                     61
  Reason Stopped                       Auto-Stopped
  % Bad Predictions (30% Tolerance)    0.0000%
  Root Mean Square Error               24.51
  Mean Absolute Error                  17.13
  Std. Deviation of Abs. Error         17.53

Testing
  Number of Cases                      122
  % Bad Predictions (30% Tolerance)    0.0000%
  Root Mean Square Error               32.12
  Mean Absolute Error                  22.30
  Std. Deviation of Abs. Error         23.12

Data Set
  Number of Rows                       612





Fig. 8. GRNN histogram of residuals (training/testing).




Fig. 9. GRNN predicted versus actual values (training/testing).



4. Implications, limitations and future research

Our results confirm the theoretical work by Hecht-Nielson (1989), who has shown that NNs can learn input-output relationships to the point of making perfect forecasts with the data on which the network is trained. The good performance of the NN models in predicting KSE closing price movements can be traced to their inherent non-linearity, which makes an NN ideal for dealing with non-linear relations that may exist in the data. Thus, neuro-computational models are needed to better understand the inner dynamics of stock markets. Our results are also in line with the findings of other researchers who have investigated the performance of NNs compared to other traditional statistical techniques, such as regression analysis, discriminant analysis, and logistic regression analysis. For example, in a study of credit-scoring models used in commercial and consumer lending decisions, Bensic, Sarlija, and Zekic-Susac (2005) compared the performance of logistic regression, neural networks and decision trees; the probabilistic neural network (PNN) model produced the highest hit rate and the lowest type I error. Similar findings have been reported in a study examining the performance of NNs in predicting bankruptcy (Anandarajan, Lee, & Anandarajan, 2001) and in the diagnosis of acute appendicitis (Sakai et al., 2007).

Despite the significant contributions of this study, it suffers from a number of limitations. First, despite the satisfactory performance of the NN models in this study, future research might improve their performance by integrating fuzzy discriminant analysis and genetic algorithms (GA) with NN models. Mirmirani and Li (2004) pointed out that traditional algorithms search for optimal weight vectors for a neural network with a given architecture, while GA can yield an efficient exploration of the search space when the modeler has little a priori knowledge of the structure of problem domains. Second, future research might use other NN architectures, such as self-organizing maps (SOMs), to classify movements in the KSE. Due to the unsupervised character of their learning algorithm and their excellent visualization ability, SOMs have recently been used in myriad classification tasks. Examples include classifying cognitive performance in schizophrenic patients and healthy individuals (Silver & Shmoish, 2008), mutual funds classification (Moreno, Marco, & Olmeda, 2006), crude oil classification (Fonseca, Biscaya, de Sousa, & Lobo, 2006), and classifying magnetic resonance brain images (Chaplot, Patnaik, & Jagannathan, 2006).

References

Aiken and Bsat, 1999
M. Aiken and M. Bsat, Forecasting market trends with neural networks, Information Systems Management 16 (1999), pp. 42-49.

Al-Loughani and Chappell, 2000
N. Al-Loughani and D. Chappell, Modeling the day-of-the-week effect in the Kuwait stock exchange: A non-linear Garch representation, Applied Financial Economics 11 (2000), pp. 353-359.

Alyuda Research Company, 2003
Alyuda Research Company (2003). NeuroIntelligence User Manual (Version 2.1).

Aminian et al., 2006
F. Aminian, E. Suarez, M. Aminian and D. Walz, Forecasting economic data with neural networks, Computational Economics 28 (2006), pp. 71-88.

Anandarajan et al., 2001
M. Anandarajan, P. Lee and A. Anandarajan, Bankruptcy prediction of financially stressed firms: An examination of the predictive accuracy of artificial neural networks, International Journal of Intelligent Systems in Accounting, Finance and Management 10 (2001), pp. 69-81.

Baranoff et al., 2000
E. Baranoff, T. Sager and T. Shively, A semi-parametric stochastic spline model as a managerial tool for potential insolvency, Journal of Risk and Insurance 67 (2000), pp. 369-396.

Bensic et al., 2005
M. Bensic, N. Sarlija and M. Zekic-Susac, Modelling small-business credit scoring by using logistic regression, neural networks and decision trees, Intelligent Systems in Accounting, Finance and Management 13 (2005), pp. 133-150.

Bishop, 1999
C. Bishop, Neural networks for pattern recognition, Oxford University Press, New York (1999).

Calderon and Cheh, 2002
T. Calderon and J. Cheh, A roadmap for future neural networks research in auditing and risk assessment, International Journal of Accounting Information Systems 3 (2002), pp. 203-236.

Cao et al., 2005
Q. Cao, K. Leggio and M. Schniederjans, A comparison between Fama and French's model and artificial neural networks in predicting the Chinese stock market, Computers and Operations Research 32 (2005), pp. 2499-2512.

Chaplot et al., 2006
S. Chaplot, L. Patnaik and N. Jagannathan, Classification of magnetic resonance brain images using wavelets as input to support vector machines and neural network, Biomedical Signal Processing and Control 1 (2006), pp. 86-92.

Chen et al., 2003
A. Chen, M. Leung and H. Daouk, Application of neural networks to an emerging financial market: Forecasting and trading the Taiwan stock index, Computers and Operations Research 30 (2003), pp. 901-923.

Chtioui et al., 1999
Y. Chtioui, S. Panigrahi and L. Francl, A generalized regression neural network and its application for leaf wetness prediction to forecast plant disease, Chemometrics and Intelligent Laboratory Systems 48 (1999), pp. 47-58.

Cigizoglu, 2005
H. Cigizoglu, Generalized regression neural network in monthly flow forecasting, Civil Engineering and Environmental Systems 22 (2005), pp. 71-84.

Darbellay and Slama, 2000
G. Darbellay and M. Slama, Forecasting the short-term demand for electricity: Do neural networks stand a better chance?, International Journal of Forecasting 16 (2000), pp. 71-83.

Ferson and Harvey, 1993
W. Ferson and C. Harvey, The risk and predictability of international equity returns, Review of Financial Studies 6 (1993), pp. 527-566.

Fonseca et al., 2006
A. Fonseca, J. Biscaya, J. de Sousa and A. Lobo, Geographical classification of crude oils by Kohonen self-organizing maps, Analytica Chimica Acta 556 (2006), pp. 374-382.

Gaetz et al., 1998
M. Gaetz, H. Weinberg, E. Rzempoluck and K. Jantzen, Neural network classification and correlation analysis of EEG and MEG activity accompanying spontaneous reversals of the Necker cube, Cognitive Brain Research 6 (1998), pp. 335-346.

Gorr et al., 1994
W. Gorr, D. Nagin and J. Szczypula, Comparative study of artificial neural network and statistical models for predicting student grade point averages, International Journal of Forecasting 10 (1994), pp. 17-34.

Hanna et al., 2007
A. Hanna, D. Ural and G. Saygili, Evaluation of liquefaction potential of soil deposits using artificial neural networks, Engineering Computations 24 (2007), pp. 5-16.

Harvey et al., 2000
C. Harvey, K. Travers and M. Costa, Forecasting emerging market returns using neural networks, Emerging Markets Quarterly 4 (2000), pp. 43-55.

Hecht-Nielson, 1989
Hecht-Nielson, R. (1989). Theory of the back-propagation neural network. In International joint conference on neural networks (pp. 593-605). Washington, DC.

Ibric et al., 2002
S. Ibric, M. Jovanovi, C. Djuri, J. Paroj and L. Solomun, The application of generalized regression neural network in the modeling and optimization of aspirin extended release tablets with Eudragit(r) RSPO as matrix substance, Journal of Controlled Release 82 (2002), pp. 213-222.

Kim and Lee, 2005
B. Kim and B. Lee, Prediction of silicon oxynitride plasma etching using a generalized regression neural network, Journal of Applied Physics 98 (2005), pp. 1-6.

Kimoto et al., 1990
Kimoto, T., Asakawa, K., Yoda, M., & Takeoka, M. (1990). Stock market prediction system with modular neural networks. In Proceedings of the IEEE international conference on neural networks (pp. 1-16).

Kohzadi et al., 1996
N. Kohzadi, M. Boyd, B. Kemlanshahi and I. Kaastra, A comparison of artificial neural network and time series models for forecasting commodity prices, Neurocomputing 10 (1996), pp. 169-181.

Kryzanowski et al., 1993
L. Kryzanowski, M. Galler and D. Wright, Using artificial neural networks to pick stocks, Financial Analysts Journal 49 (1993), pp. 21-27.

KSE, 2008
KSE (www.Kuwaitse.com), visited on November 10, 2008.

Kumar and Bhattacharya, 2006
K. Kumar and S. Bhattacharya, Artificial neural network vs. linear discriminant analysis in credit ratings forecast, Review of Accounting and Finance 5 (2006), pp. 216-227.

Lapedes and Farber, 1988
A. Lapedes and R. Farber, How neural nets work? In: D. Anderson (Ed.), Neural information processing systems, American Institute of Physics, New York (1988), pp. 442-456.

Leigh et al., 2005
W. Leigh, R. Hightower and N. Modani, Forecasting the New York stock exchange composite index with past price and interest rate on condition of volume spike, Expert Systems with Applications 28 (2005), pp. 1-8.

Lek and Guegan, 1999
S. Lek and J. Guegan, Artificial neural networks as a tool in ecological modelling: An introduction, Ecological Modelling 120 (1999), pp. 65-73.

Lim and Kirikoshi, 2005
C. Lim and T. Kirikoshi, Predicting the effects of physician-directed promotion on prescription yield and sales uptake using neural networks, Journal of Targeting, Measurement and Analysis for Marketing 13 (2005), pp. 158-167.

McGrath, 2002
C. McGrath, Terminator portfolio, Kiplinger's Personal Finance 56 (2002), pp. 56-57.

McMenamin and Monforte, 1998
J. McMenamin and F. Monforte, Short term energy forecasting with neural networks, Energy Journal 19 (1998), pp. 43-52.

McNelis, 1996
P. McNelis, A neural network analysis of Brazilian stock prices: Tequila effects vs. pisco sour effects, Journal of Emerging Markets 1 (1996), pp. 29-44.

Mirmirani and Li, 2004
S. Mirmirani and H. Li, Gold price, neural networks and genetic algorithm, Computational Economics 23 (2004), pp. 193-200.

Moreno et al., 2006
D. Moreno, P. Marco and I. Olmeda, Self-organizing maps could improve the classification of Spanish mutual funds, European Journal of Operational Research 147 (2006), pp. 1039-1054.

Mostafa, 2004
M. Mostafa, Forecasting the Suez Canal traffic: A neural network analysis, Maritime Policy and Management 31 (2004), pp. 139-156.

Nam and Yi, 1997
K. Nam and J. Yi, Predicting airline passenger volume, Journal of Business Forecasting Methods and Systems 16 (1997), pp. 14-17.

Palisade, 2005
Palisade Corporation (2005). NeuralTools professional user guide (version 1.0). Ithaca, New York: Palisade Corporation.

Parojcic et al., 2005
J. Parojcic, S. Ibric, Z. Djuric, M. Jovanovic and O. Corrigan, An investigation into the usefulness of generalized regression neural network analysis in the development of level A in vitro-in vivo correlation, European Journal of Pharmaceutical Sciences 30 (2005), pp. 264-272.

Parzen, 1962
E. Parzen, On the estimation of a probability density function and mode, Annals of Mathematical Statistics 33 (1962), pp. 1065-1076.

Poh et al., 1998
H. Poh, J. Yao and T. Jasic, Neural networks for the analysis and forecasting of advertising impact, International Journal of Intelligent Systems in Accounting, Finance and Management 7 (1998), pp. 253-268.

Ruiz-Suarez et al., 1995
J. Ruiz-Suarez, O. Mayora-Ibarra, J. Torres-Jimenez and L. Ruiz-Suarez, Short-term ozone forecasting by artificial neural network, Advances in Engineering Software 23 (1995), pp. 143-149.

Sakai et al., 2007
S. Sakai, K. Kobayashi, S. Toyabe, N. Mandai, T. Kanda and K. Akazawa, Comparison of the levels of accuracy of an artificial neural network model and a logistic regression model for the diagnosis of acute appendicitis, Journal of Medical Systems 31 (2007), pp. 357-364.

Sharda, 1994
R. Sharda, Neural networks for the MS/OR analyst: An application bibliography, Interfaces 24 (1994), pp. 116-130.

Shie, 2008
J. Shie, Optimization of injection-molding process for mechanical properties of polypropylene components via a generalized regression neural network, Polymers for Advanced Technologies 19 (2008), pp. 73-83.

Silver and Shmoish, 2008
H. Silver and M. Shmoish, Analysis of cognitive performance in schizophrenia patients and healthy individuals with unsupervised clustering models, Psychiatry Research 159 (2008), pp. 167-179.

Smith, 2007
G. Smith, Random walks in Middle Eastern stock markets, Applied Financial Economics 17 (2007), pp. 587-596.

Specht, 1991
D. Specht, A general regression neural network, IEEE Transactions on Neural Networks 2 (1991), pp. 568-576.

Swicegood and Clark, 2001
P. Swicegood and J. Clark, Off-site monitoring systems for predicting bank underperformance: A comparison of neural networks, discriminant analysis, and professional human judgment, International Journal of Intelligent Systems in Accounting, Finance and Management 10 (2001), pp. 169-186.

Tam et al., 2005
C. Tam, T. Tong, T. Lau and K. Chan, Selection of vertical framework system by probabilistic neural network models, Construction Management and Economics 23 (2005), pp. 245-254.

Videnova et al., 2006
I. Videnova, D. Nedialkova, M. Dimitrova and S. Popova, Neural networks for air pollution forecasting, Applied Artificial Intelligence 20 (2006), pp. 493-506.

Wang, 1995
S. Wang, The unpredictability of standard back propagation neural networks in classification applications, Management Science 41 (1995), pp. 555-559.

Wasserman, 1993
P. Wasserman, Advanced methods in neural computing, Van Nostrand-Reinhold, New York (1993).

Yu et al., 2009
L. Yu, S. Wang and K.K. Lai, A neural-network-based nonlinear metamodeling approach to financial time series forecasting, Applied Soft Computing 9 (2009), pp. 563-574.

Yumlu et al., 2005
S. Yumlu, F. Gurgen and N. Okay, A comparison of global, recurrent and smoothed-piecewise neural models for Istanbul stock exchange (ISE) prediction, Pattern Recognition Letters 26 (2005), pp. 2093-2103.

Zhuo et al., 2007
W. Zhuo, J. Li-Min, Q. Yong and W. Yan-hui, Railway passenger traffic volume prediction based on neural network, Applied Artificial Intelligence 21 (2007), pp. 1-10.


Expert Systems with Applications, Volume 37, Issue 9, September 2010, Pages 6302-6309.





Forecasting model of global stock index by stochastic time effective neural network

Zhe Liao (a) and Jun Wang (a)

(a) Institute of Financial Mathematics and Financial Engineering, College of Science, Beijing Jiaotong University, Beijing 100044, PR China

Available online 8 June 2009.


Abstract

In this paper, we investigate the statistical properties of the fluctuations of the Chinese Stock Index, and we study the statistical properties of HSI, DJI, IXIC and SP500 by comparison. Following the theory of artificial neural networks, a stochastic time effective function is introduced in the forecasting model of the indices, which gives an improved neural network: the stochastic time effective neural network model. In this model, a promising data mining technique in machine learning is proposed to uncover the predictive relationships of numerous financial and economic variables. We suppose that investors decide their investment positions by analyzing the historical data of the stock market, and the historical data are given weights depending on their time: the nearer the time of the historical data is to the present, the stronger the impact the data have on the predictive model. We also introduce Brownian motion in order to give the model the effect of random movement while maintaining the original trend. In the last part of the paper, we test the forecasting performance of the model using different volatility parameters, and we show some results of the analysis of the fluctuations of the global stock indices using the model.

Keywords: Brownian motion; Stochastic time effective function; Data analysis; Neural network; Returns; Predict

Article Outline

1. Introduction
2. Methodology for stochastic time effective function
2.1. Introduction of stochastic time effective function
2.2. Procedure followed in the stochastic time effective model
3. Experiment analysis
3.1. Selection and preprocessing of data
3.2. Training stochastic time effective neural network
4. Conclusions
Acknowledgements
References

1. Introduction

Recently, some progress has been made in work on the fluctuations of the Chinese stock market; for example, see (Ji and Wang, 2007) and (Li and Wang, 2006). In the present paper, we investigate the statistical properties of the fluctuations of the indices of the Chinese Stock Exchange, and study the statistical properties of HSI, DJI, IXIC and SP500 by comparison. A predictive model of the stock prices is constructed using the theory of artificial neural networks and a stochastic time effective function; further, the data from the Chinese stock markets are analyzed in the model. China has Stock A and Stock B in its stock markets. The indices of Stock A and Stock B play an important role in the Chinese stock markets, and the database of the indices is from the website www.sse.com.cn.

Recently, the properties of the fluctuations of the stock markets have been studied in many research fields; for example, see (Azoff, 1994), (Nakajima, 2000), (Pino et al., 2008), (Shtub and Versano, 1999) and (Wang, 2007). Artificial neural networks are one of the technologies that have made great progress in the study of the stock markets. Usually stock prices can be seen as a random time sequence with noise. Artificial neural networks, as large-scale parallel processing nonlinear systems that depend on their own intrinsic link data, provide methods and techniques that can approximate any nonlinear continuous function, without a priori assumptions about the nature of the generating process; see (Pino et al., 2008) and (Shtub and Versano, 1999). They have good self-learning ability and a strong anti-jamming capability, and have been widely used in financial fields such as stock prices, profits, exchange rates, and risk analysis and prediction. Although the historical data have a great influence on investors' positions, the degree of impact of the data depends on the date (or time) at which they occur: the data have a high-level effect when they are very near the current state. Furthermore, we also introduce Brownian motion into the model (see Wang, 2007), in order to give the model the effect of random movement while maintaining the original trend. We test the forecasting performance of the model using different volatility parameters, and we show some results of the analysis of the fluctuations of the global stock indices using the model. In this work, the forecasting model is developed to estimate the level of returns on the Stock A Index of China. In Section 3, the results show that the forecasting model can predict the index behavior better in a short time interval than in a long time interval, and show the different performances of HSI, DJI, IXIC and SP500 by comparison; see (Abhyankar et al., 1997), (Austin et al., 1997) and (Balvers et al., 1990).

In this paper, forecasting based on a neural network involves two major steps: data preprocessing and structure design. In the pretreatment stage, the collected data should be normalized and properly adjusted, in order to reduce the impact of noise in the stock markets. At the design stage, different data training sets, validations and data processing will lead to different results; see (Breen et al., 1990) and (Campbell, 1987). In stock markets, the environment and behavior of the markets may change greatly; for example, see the Chinese stock markets in 2007. As a result, the data in the data training set should be time-variant, reflecting the different behavior patterns of the markets at different times. If all the data are used to train the network equivalently, the network system may not be consistent with the development of the stock markets; see (Chenoweth and Obradovic, 1996), (Cybenko, 1989) and (Demuth and Beale, 1998). Especially in the current Chinese stock markets, stock market trading rules and management systems are changing rapidly, for example, the daily price limit (now 10%), shareholding reformation, the direct investment of Hong Kong stock markets, the reorganization of A shares, B shares and H shares, and the establishment of financial derivatives such as futures and options. Therefore, using only the historical data of the past makes it difficult to reflect the current development of the Chinese stock markets. However, if only the recent data are selected, a lot of useful information that the early data hold will be lost. In the financial model of the present paper, a promising data mining technique in machine learning is proposed to uncover the predictive relationships of numerous financial and economic variables. Considering the above-mentioned financial situation, this paper presents an improved neural network model, the stochastic time effective series neural network model: each historical datum is given a weight depending on the time at which it occurs in the model, and we also use probability density functions to classify the various variables from the training samples; see (Desai and Bharati, 1998), (Duda et al., 2001) and (Elton and Gruber, 1991).

2. Methodology for stochastic time effective function

2.1. Introduction of stochastic time effective function

In this section, we first describe a three-layer BP neural network model (see Azoff, 1994), which is shown in Fig. 1. The neural network model includes three layers: an input layer, a hidden layer and an output layer. In the model, choosing the proper number of hidden layer nodes requires validation techniques to avoid under-fitting (too few neurons) and over-fitting (too many neurons). Generally, too many neurons in the hidden layers, and hence too many connections, produce a neural network that memorizes the data and lacks the ability to generalize.



Fig. 1. Three-layer neural network.




Suppose that a three-layer neural network has N neurons, and for any fixed neuron n (n = 1, 2, ..., N), the model has the following structure: let {x_i(n): i = 1, 2, ..., p} denote the set of inputs of the neurons, and {y_j(n): j = 1, 2, ..., m} denote the set of outputs of the hidden layer neurons; V_i is the weight that connects node i in the input layer to node j in the hidden layer, W_j is the weight that connects node j in the hidden layer to node k in the output layer; and {o_k(n): k = 1, 2, ..., q} denotes the set of outputs of the neurons. Then the output value for a unit is given by the following function

o_k(n) = f( \sum_j W_j y_j(n) - \theta_k ),    y_j(n) = f( \sum_i V_i x_i(n) - \theta_j )

where \theta_k, \theta_j are the neural thresholds and f(x) = 1 / (1 + e^{-x}) is the sigmoid activation function. Let T_k(n) be the actual value of the data sets; then the error of the corresponding neuron k with respect to the output is defined as \varepsilon_k = T_k - o_k. In this paper, the error of the output is defined as

\varepsilon(n) = \frac{1}{2} \sum_{k=1}^{q} (T_k(n) - o_k(n))^2

and the error of the sample n (n = 1, 2, ..., N) is defined as

e(n, t_n) = \varphi(t_n) \cdot \frac{1}{2} \sum_{k=1}^{q} (T_k(n) - o_k(n))^2

where \varphi(t) is the stochastic time effective function. Now we define \varphi(t) as follows:

\varphi(t_n) = \frac{1}{\tau} \exp\left( -\int_{t_n}^{t_1} \mu(t) \, dt + \int_{t_n}^{t_1} \sigma(t) \, dB(t) \right)

where \tau (> 0) is the time strength coefficient, t_1 is the current time or the time of the newest data in the data set, and t_n is an arbitrary time point in the data set. \mu(t) is the drift function (or the trend term), \sigma(t) is the volatility function, and B(t) is the standard Brownian motion (Wang, 2007). The stochastic time effective function implies that recent information has a stronger effect on the investors than old information: the more recently an event happened, the more strongly it affects investors and markets, and the impact of data follows an exponential decay in time; see (Nakajima, 2000) and (Pino et al., 2008). Then the total error of all the data training sets in the output layer with the stochastic time effective function is defined as

E = \sum_{n=1}^{N} e(n, t_n) = \sum_{n=1}^{N} \varphi(t_n) \cdot \frac{1}{2} \sum_{k=1}^{q} (T_k(n) - o_k(n))^2
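A minimal sketch of the time weights phi(t_n), assuming constant mu and sigma so that the drift integral reduces to -mu(t_1 - t_n) and the stochastic integral to a Gaussian increment with standard deviation sigma*sqrt(t_1 - t_n) (a property of standard Brownian motion); tau, the time unit, and the decay orientation (weights shrinking for older data, as the text describes) are illustrative choices:

    import numpy as np

    def time_weights(n_samples, mu=0.01, sigma=0.01, tau=1.0, seed=0):
        # phi(t_n) for samples ordered oldest-to-newest; the last sample is t_1
        rng = np.random.default_rng(seed)
        gap = np.arange(n_samples - 1, -1, -1, dtype=float)   # t_1 - t_n
        drift = -mu * gap                                      # integrated drift term
        noise = sigma * np.sqrt(gap) * rng.standard_normal(n_samples)
        return np.exp(drift + noise) / tau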
2.2. Procedure followed in the stochastic time effective model

Note that the training objective of the stochastic time effective neural network is to modify the weights so as to minimize the error between the network's prediction and the actual target. When all the training data are current (that is, t_1 = t_n), the stochastic time effective neural network reduces to the general neural network model. In Fig. 2, the training procedure of the stochastic time effective neural network is shown, as follows:

Step 1: Train a stochastic time effective neural network by choosing five kinds of stock prices in the input layer: daily opening price, daily closing price, daily highest price, daily lowest price and daily trade volume, and one price of the stock prices in the output layer: the closing price of the next trading day. Then set the connective weights, and input the training data sets.

Step 2: At the beginning of data processing, the connective weights V_i and W_j follow the uniform distribution on (-1, 1), and the neural thresholds \theta_k, \theta_j are set to 0.

Step 3: Introduce the stochastic time effective function \varphi(t) in the error function e(n, t). Choose different volatility parameters. Give the transfer function from the input layer to the hidden layer and the transfer function from the hidden layer to the output layer.

Step 4: Establish an error acceptance model and set a pre-set minimum error. If the output error is below the pre-set minimum error, go to Step 6; otherwise go to Step 5.

Step 5: Modify the connective weights. Calculate \delta backward for the node in the output layer:

\delta_k = o_k(n) [1 - o_k(n)] (T_k(n) - o_k(n))

Calculate \delta backward for the node in the hidden layer:

\delta_j = y_j(n) [1 - y_j(n)] \sum_h \delta_h W_h

where o(n) is the output of the neuron n, T(n) is the actual value of the neuron n in the data sets, o(n)[1 - o(n)] is the derivative of the sigmoid activation function, and the sum runs over the nodes h in the layer following the hidden node that connect with it. Modify the weights from each layer to the previous layer:

W_j \leftarrow W_j + \eta \, \delta_k \, y_j(n),    V_i \leftarrow V_i + \eta \, \delta_j \, x_i(n)

where \eta is the learning step, which usually takes a constant value between 0 and 1.

Step 6: Output the predictive value.
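Steps 3-5 amount to ordinary back-propagation with each sample's error scaled by its time weight. A hedged sketch of one update for the 5-20-1 network of Section 3.2, using sigmoid activations as in Section 2.1 (the helper time_weights above supplies phi(t_n); eta and the initialization follow Step 2, the rest is illustrative):

    import numpy as np

    def train_step(V, W, x, target, phi_n, eta=0.1):
        # forward pass
        y_hid = 1.0 / (1.0 + np.exp(-(V @ x)))     # hidden outputs y_j(n)
        o = 1.0 / (1.0 + np.exp(-(W @ y_hid)))     # network output o(n)
        # time-weighted error, then the deltas of Step 5
        err = phi_n * (target - o)
        delta_out = err * o * (1.0 - o)
        delta_hid = y_hid * (1.0 - y_hid) * (W * delta_out)
        # modify the connective weights layer by layer
        W += eta * delta_out * y_hid
        V += eta * np.outer(delta_hid, x)
        return o

    rng = np.random.default_rng(0)                 # Step 2: uniform weights on (-1, 1)
    V = rng.uniform(-1, 1, size=(20, 5)); W = rng.uniform(-1, 1, size=20)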



Fig. 2. Procedure followed in the stochastic time effective model.



3. Experiment analysis

3.1. Selection and preprocessing of data

In this paper, we select the data of the Stock A Index (SAI) and Stock B Index (SBI) of China for each trading day in an 18-year period, from December 19, 1990 to June 7, 2008, obtained from the Shanghai and Shenzhen Stock Exchanges. And we also choose the data of HSI, DJI, IXIC and SP500 by contrast. First, we study the statistical properties of the returns of the index using the stochastic time effective neural network model, and then study the relativity between the Chinese stock indices and the foreign stock indices.

Fig. 3 presents the related coefficients between SAI, SBI, HSI, DJI, IXIC and SP500. In Fig. 3, we can see that the related coefficients between SAI and SBI, HSI, DJI, IXIC and SP500 are 0.885, 0.629, 0.463, 0.108, 0.329; and the related coefficients between SBI and SAI, HSI, DJI, IXIC and SP500 are 0.885, 0.817, 0.688, 0.458, 0.623.



Fig. 3. Related coefficients.



In the model of this paper, we suppose that the network inputs include five kinds of data, daily opening price, daily closing price, daily highest price, daily lowest price and daily trade volume, and that the network outputs include the closing price of the next trading day.

Fig. 4 presents the plot of the time sequence of log returns of SAI, SBI, IXIC and SP500. We denote the price sequence of SAI, SBI, IXIC and SP500 at time t by S(t); then R(t) denotes the logarithm of the return rate, given by

R(t) = \ln S(t) - \ln S(t - 1)

In Fig. 4, we can see that the prices of the index fluctuate wildly, which indicates that there is large noise in the data that causes difficulty in forecasting. Thus, we should carry out data preprocessing before forecasting, so the data are normalized as follows:

(5)  S'(t) = \frac{ S(t) - \min_t S(t) }{ \max_t S(t) - \min_t S(t) }

Similarly to (5), the normalized values of the above-mentioned five kinds of data can also be given.
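A short preprocessing sketch for the log returns R(t) and a min-max normalization of the form assumed in Eq. (5) above (the exact original equation was an image; min-max scaling is a reconstruction):

    import numpy as np

    def log_returns(prices):
        # R(t) = ln S(t) - ln S(t-1)
        prices = np.asarray(prices, dtype=float)
        return np.diff(np.log(prices))

    def normalize(series):
        # Eq. (5) as reconstructed: map a series into [0, 1] by its min and max
        s = np.asarray(series, dtype=float)
        return (s - s.min()) / (s.max() - s.min())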



Fig. 4. The plot of log returns of SAI (a), SBI (b), IXIC (c), and SP500 (d).



3.2. Training stochastic time effective neural network

Data sets are divided into two parts: a data training set and a data testing set. We collect the data of SAI in 1990-2006 as the training set and the data of SAI in 2007-2008 as the testing set. Following the procedure for the three-layer network introduced in Section 2, the number of neural nodes in the input layer is 5, the number of neural nodes in the hidden layer is 20, and the number of neural nodes in the output layer is 1; the threshold of the maximum number of training cycles is 100 and the threshold of the minimum error is 0.0001. We take \mu(t) (the drift parameter) and \sigma(t) (the volatility parameter), (\mu(t), \sigma(t)), to be (1, 0), (1, 1) and (1, 2). When (\mu(t), \sigma(t)) is (1, 0), the model has the effect of only the time effective function; when (\mu(t), \sigma(t)) is (1, 1), the model has the effect of both the time effective function and normal randomization; and when (\mu(t), \sigma(t)) is (1, 2), the model has the effect of intensive randomization.

In Table 1, parts of the training errors at different trading dates for (a) SAI, (b) SBI, (c) HSI, (d) DJI, (e) IXIC and (f) SP500 are given. It can be clearly seen that during the years 1991 and 1992 the relative error is larger than in the year 2007; this clearly shows the effect of the time effective function. Furthermore, the gap between the relative errors of SAI and SBI is much greater than the gap between the relative errors of the foreign stock markets. So we can conclude that the value of the historical data in the foreign stock markets is greater than that in the Chinese stock markets, which means that the Chinese stock markets fluctuate more sharply than the foreign markets.

Table 1. Comparison of errors of different dates for SAI, SBI, HSI, DJI, IXIC and SP500 ((\mu(t), \sigma(t)) is (1, 1)).

Time        Actual     Predictive   Error
(a) SAI
90/12/20    104.39     144.57       -0.384
91/05/10    108.53     149.06       -0.373
06/09/20    1820.8     1830.3       -0.005
07/02/13    2973.4     2974.0       -0.0001
(b) SBI
92/02/24    124.65     128.88       -0.034
93/12/24    93.14      87.64        0.059
05/07/11    59.88      60.8308      -0.02
06/11/15    108.27     106.37       0.017
(c) HSI
01/07/31    12316.7    12123.8      0.016
02/01/29    11014.2    10712.2      0.027
06/10/11    17862.8    17849.5      0.001
07/09/26    26430.2    26230.8      0.008
(d) DJI
91/12/23    3022.6     2948.1       0.025
92/07/07    3295.2     3340.8       -0.014
06/07/26    11102.5    11125.5      -0.002
07/12/28    13365.9    13412.9      -0.004
(e) IXIC
91/11/08    548.08     540.3        0.014
92/03/05    621.97     635.6        -0.022
06/03/07    2268.38    2290.0       -0.010
07/05/29    2572.06    2559.6       0.005
(f) SP500
91/01/03    321.91     339.06       -0.053
92/02/04    413.85     408.38       0.013
06/10/31    1377.94    1376.05      0.001
07/02/05    1446.99    1448.47      -0.001




Fig. 5 shows the fluctuations of the time sequence of relative errors during the years 1991 to 2007 for the prices of SAI, SBI, HSI, DJI, IXIC and SP500. In these plots, 0 represents the data farthest from the current date, and larger t (date) represents dates nearer to the current date. Fig. 5 also clearly indicates that the stochastic time effective neural network model can be realized by assigning different weights to the data of different times. The time sequences of relative errors of (b) SBI and (d) DJI in Fig. 5 clearly reflect the randomization introduced by the effect of the Brownian motion. And we also conclude that SBI is similar to DJI, and that SAI is similar to HSI. By coincidence, this conclusion is supported by the relativity analysis shown in Fig. 3.



Fig. 5. Relative errors of SAI, SBI, HSI, DJI, IXIC, and SP500 ((\mu(t), \sigma(t)) is (1, 1)).



In order to test the validity of the volatility parameter \sigma(t), we take \sigma(t) to be 0 or 2. If \sigma(t) is 0, then the model has no randomization effect but only the effect of the time effective function. If \sigma(t) is 2, then the model has an intensive effect of wild fluctuation.

In Table 2, the predictive values and relative errors of SAI for the different values of \sigma(t) are given. This shows that the relative error is smallest when \sigma(t) = 1 (almost always below 1%) and largest when \sigma(t) = 2 (almost always over 10%). So we can conclude that adding the Brownian motion to this financial model is beneficial for increasing the precision of prediction. However, the volatility parameter should not be too large, otherwise it will increase the error of the prediction.

Table 2. Predictive values and errors of the time effective neural network model of SAI by different \sigma(t).

Time        Actual      Predictive   Error
(a) \sigma(t) = 1
08/06/03    3605.859    3627.168     -0.0059
08/06/04    3535.809    3604.064     -0.0193
08/06/05    3516.219    3530.171     -0.0039
08/06/06    3493.189    3507.505     -0.0040
(b) \sigma(t) = 0
08/06/03    3605.859    3655.273     -0.014
08/06/04    3535.809    3689.849     -0.044
08/06/05    3516.219    3559.069     -0.012
08/06/06    3493.189    3502.137     -0.003
(c) \sigma(t) = 2
08/06/03    3605.859    4060.890     -0.126
08/06/04    3535.809    4034.350     -0.141
08/06/05    3516.219    4001.172     -0.138
08/06/06    3493.189    3968.767     -0.136



In Table 3, I stands for the average relative error over the whole data set, II stands for the average relative error of the first 1000 days in the data set, and III stands for that of the latest 100 days. The effect of the time effective function in the model is clearly expressed in Table 3. Take the relative error of SAI at σ(t) = 1 for example: the average relative error is 2.6%, the average relative error of the first 1000 days is 6.88%, and the average relative error of the latest 100 days is 1.3%. So the latest data are more valuable than the historical data of the past in the stock market.
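As a hedged sketch, the three averages I, II and III could be computed as follows (the use of absolute values and the array names are assumptions):

    import numpy as np

    def average_relative_errors(actual, predicted):
        """Average relative errors as reported in Table 3.

        Returns (I, II, III): the mean absolute relative error over the
        whole series, over the first 1000 days, and over the latest 100 days.
        """
        actual = np.asarray(actual, dtype=float)
        predicted = np.asarray(predicted, dtype=float)
        rel_err = np.abs((actual - predicted) / actual)
        return rel_err.mean(), rel_err[:1000].mean(), rel_err[-100:].mean()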

Table 3. Average relative errors for SAI, SBI, DJI, IXIC, HSI and SP500 for different σ(t).

       SAI     SBI     DJI     IXIC    HSI     SP500

(a) σ(t) = 1
I      0.026   0.0212  0.0388  0.0152  0.0104  0.0074
II     0.0688  0.0262  0.0697  0.0366  0.0112  0.0077
III    0.013   0.0141  0.0058  0.0078  0.0109  0.0078

(b) σ(t) = 0
I      0.0766  0.0284  0.0128  0.2143  0.0322  0.0345
II     0.1637  0.0391  0.201   0.7247  0.0379  0.045
III    0.0514  0.0148  0.0103  0.0328  0.0247  0.0366

(c) σ(t) = 2
I      0.2458  0.1041  0.043   0.1443  0.0636  0.0897
II     0.8686  0.1276  0.083   0.3466  0.0875  0.2256
III    0.0353  0.0985  0.034   0.1041  0.0347  0.0265




In Table 3, the errors for the global stock indices at the different values of σ(t) are also given. Take the relative error of SAI for example: when σ(t) = 1, the average relative error is 2.6%; when σ(t) = 0, it is 7.66%; and when σ(t) = 2, it is 24.58%. This implies that an appropriate volatility parameter is beneficial for the prediction of the stochastic time effective neural network, and that the model is widely applicable to the global stock markets.

In Fig. 6, the comparison between the predictive values and the actual values of SAI, SBI, HSI and IXIC produced by the stochastic time effective neural network model is shown.

Fig. 6. Comparison of the predictive values and actual values.



In Fig. 7, using the linear regression method, we compare the predictive values of the stochastic time effective neural network model with the actual values of SAI (a), SBI (b), NSDK (c) and SP500 (d). Through the regression analysis, a different linear equation is obtained for each of SAI (a), SBI (b), NSDK (c) and SP500 (d); the corresponding correlation coefficients are R = 0.9984 for SAI, R = 0.9978 for SBI, R = 0.9988 for NSDK and R = 0.9991 for SP500. In this way we test the accuracy of the forecasting results from another angle.
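A hedged sketch of this regression check: fit actual against predicted and report the correlation coefficient R (scipy is used for illustration; the paper does not specify its software):

    from scipy.stats import linregress

    def regression_check(predicted, actual):
        """Fit actual = slope * predicted + intercept and report R.

        A correlation coefficient close to 1 (e.g. R = 0.9984 for SAI)
        indicates that the forecasts track the actual prices closely.
        """
        fit = linregress(predicted, actual)
        print(f"actual = {fit.slope:.4f} * predicted + {fit.intercept:.4f}, "
              f"R = {fit.rvalue:.4f}")
        return fit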



Fig. 7. Regression of the predictive values and actual values.



4. Conclusions

This paper introduces a new stochastic time effective function with which to construct a stochastic time effective neural network model. The effectiveness of the model has been analyzed by performing a numerical experiment on the data of SAI, SBI, HSI, DJI, IXIC and SP500, and the validity of the volatility parameter of the Brownian motion has been tested. Further, the present paper shows some predictive results for the global stock indices using the stochastic time effective neural network model.

Acknowledgements

The authors were supported in part by National Natural Science Foundation of China Grant No. 70771006 and by BJTU Foundation No. 2006XM044.



Corresponding author. Tel./fax: +86 10 51682867.


Expert Systems with Applications, Volume 37, Issue 1, January 2010, Pages 834–841





Improving returns on stock investment through neural network selection

Tong-Seng Quah a and Bobby Srinivasan b

a School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore

b School of Accountancy and Business, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore


Available online 1 November 1999.


Abstract

The Artificial Neural Network (ANN) is a heavily researched technique used in engineering and scientific applications for purposes ranging from control systems to artificial intelligence. Its generalization power has not only received admiration in the engineering and scientific fields; in recent years, finance researchers and practitioners have also taken an interest in applying the ANN. Bankruptcy prediction, debt-risk assessment and security market applications are the three areas most heavily researched in the finance arena. The results, thus far, have been encouraging, as the ANN displays better generalization power than conventional statistical tools or benchmarks.

Such intensive research, the proven ability of the ANN in security market applications, and the growing importance of the role of equity securities in Singapore motivated the conceptual development of this project, which uses the ANN for stock selection. With its proven generalization ability, the ANN is able to infer from historical patterns the characteristics of performing stocks. The performance of stocks reflects the profitability and the quality of management of the underlying companies. Such information is reflected in financial and technical variables. As such, the ANN is used as a tool to uncover the intricate relationships between the performance of stocks and the related financial and technical variables. Historical data such as financial variables (inputs) and performance of the stock (output) are used in this ANN application. Experimental results obtained thus far have been very encouraging.

Author Keywords: Technical analysis; Fundamental analysis; Neural network; Economic factors; Political factors; Firm specific factors

Article Outline

1. Introduction
2. Application of neural network in financial and commercial domains
3. Neural architecture
3.1. Select the Appropriate Algorithm
3.2. Architecture of ANN
3.3. Selection of the Learning Rule
3.4. Selection of the Appropriate Learning Rates and Momentum
4. Variables selection
5. Experiment
5.1. Research design
5.2. Design 1 (Basic System)
5.3. Design 2 (Moving Window System)
5.4. Results
6. Conclusion and future works
References

1. Introduction

With the growing importance of the role of equities to both international and local investors, the selection of attractive stocks is of utmost importance to ensure a good return. A reliable tool for the selection process can therefore be of great assistance to these investors. An effective and efficient tool/system gives the investor a competitive edge, as he/she can identify the performing stocks with minimum effort.

Trading strategies, rules and concepts based on fundamental and technical analysis have been devised by both academics and practitioners to assist investors in their decision-making process. Innovative investors opt to employ information technology to improve the efficiency of the process. This is done by transforming trading strategies into computer languages so as to exploit the logical processing power of the computer, which greatly reduces the time and effort needed to short-list attractive stocks.

In this age where information technology is dominant, such computerized rule-based expert systems have great limitations that affect their effectiveness and efficiency. However, with the significant advancement in the field of the Artificial Neural Network (ANN), these limitations have found a solution. In this research, the generalization ability of the ANN is harnessed to create an effective and efficient tool for stock selection. Results of research in this field have so far been very encouraging.

2. Application of neural network in financial and commercial domains

Research developments in bankruptcy prediction have shown that the ANN performs better than conventional statistical methods such as Discriminant Analysis and Logistic Regression. Alici (1996), using UK data, shows that an ANN with an architecture consisting of a three-layer multi-layer perceptron (28 financial inputs, seven hidden neurons and two output neurons) achieves an average of 71.38% (76.07%) for the failed firms (healthy firms). On the other hand, Discriminant Analysis and Logistic Regression achieved 60.12% (71.43%) and 65.29% (71.07%), respectively, for the failed firms (healthy firms). The feasibility of applying the ANN to bankruptcy prediction was also studied by Raghupathi, Schkade and Raju (1991) and Odom and Sharda (1993), among many others.

Salchenberger, Cinar and Lash (1992) use the ANN to classify failures of Savings and Loans (S&Ls) organizations in the US. In debt-risk assessment applications, Dutta and Shekhar (1993) use neural networks to classify bonds.

The generalization ability of the ANN also extends to commodity trading (copper). Robles and Naylor (1996) show that the ANN outperforms the traditional Weighted Moving Average rule and a "buy and hold" strategy. In equities, Gencay and Stengos (1996) show that the ANN outperforms the benchmark linear model under an identical research methodology: the ratios of the average MSPE of the testing model to the benchmark are 0.961 and 1 for the ANN and the linear model, respectively.

Burgess and Refenes (1996) prove that the ANN has better generalization ability than Ordinary Least Squares in predicting the daily returns of an equity index. Neural networks have also been applied to predicting the trend of the Italian stock market. Chinetti and Rossignoli (1993) attempted to use technical variables to build an accurate prediction system to forecast the timing of buying and selling for the index. Westheider (1994) studied the predictive power of the ANN on stock index returns, using economic and fundamental variables to predict monthly, quarterly and yearly returns.

3. Neural architecture

The computer software selected for training and testing the network is Neural Planner version 3.71, programmed by Stephen Wolstenholme. It is an ANN simulator designed strictly for a single learning algorithm, Back Propagation.

There are four major issues in the selection of the appropriate network:

1. Select the Appropriate Algorithm.
2. Architecture of the ANN.
3. Selection of the Learning Rule.
4. Selection of the Appropriate Learning Rates and Momentum.

3.1. Select the Appropriate Algorithm

Since the sole purpose of this project is to identify the top performing stocks, and the historical data used for the training process have a known outcome (whether a stock is considered a top performer or otherwise), algorithms designed for supervised learning are ideal. Among the available algorithms, the Back Propagation algorithm designed by Rumelhart, Hinton and Williams (1986) is the most suitable, as it has been intensively tested in finance. Moreover, it is recognized as a good algorithm for generalization purposes.

3.2. Architecture of ANN

Architecture, in this context, refers to the entire structural design of the ANN (input layer, hidden layer and output layer). It involves determining the appropriate number of neurons required for each layer and also the appropriate number of layers within the hidden layer. The hidden layer can be considered the crux of the Back Propagation method, because it can extract higher-level features and facilitate generalization when the input vectors encode low-level features of a problem domain or when the output/input relationship is complex. The fewer the hidden units, the better the ANN is able to generalize. It is important not to over-fit the ANN with more hidden units than required, to the point where it can memorize the data. The hidden units behave like a storage device: an over-fitted network learns the noise present in the training set as well as the key structures, and no generalization ability can be expected from it. This is undesirable, as such a network has little explanatory power in a different situation/environment.

3.3. Selection of the Learning Rule

The learning rule is the rule that the network follows in its error-reducing process; it facilitates the derivation of the relationships between the input(s) and the output(s). The generalized delta rule developed by Rumelhart et al. (1986) is used in the calculation of the weights. This particular rule is selected because it is heavily used, and proven effective, in finance research; a minimal sketch of the update appears below.
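A minimal numpy sketch of one generalized delta rule update with a momentum term. The network sizes, the sigmoid activation and all names here are illustrative assumptions; Neural Planner's internals are not documented in the paper.

    import numpy as np

    def train_step(x, target, W1, W2, dW1, dW2, lr=0.9, momentum=0.9):
        """One generalized-delta-rule update with momentum.

        x: input vector; target: desired output; W1, W2: weight matrices
        for the hidden and output layers; dW1, dW2: previous weight
        changes, carried forward for the momentum term.
        """
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

        # forward pass
        h = sigmoid(W1 @ x)            # hidden activations
        y = sigmoid(W2 @ h)            # network output

        # backward pass: delta = error * derivative of the sigmoid
        delta_out = (target - y) * y * (1.0 - y)
        delta_hid = (W2.T @ delta_out) * h * (1.0 - h)

        # weight change = lr * delta * input + momentum * previous change
        dW2 = lr * np.outer(delta_out, h) + momentum * dW2
        dW1 = lr * np.outer(delta_hid, x) + momentum * dW1
        return W1 + dW1, W2 + dW2, dW1, dW2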

3.4. Selection of the Appropriate Learning Rates and Momentum

The learning rate and momentum are parameters in the learning rule that aid the convergence of the error, so as to arrive at weights that are representative of the existing relationships between the input(s) and the output(s).

As for the appropriate learning rate and momentum to use, the software has a feature, known as "Smart Start", that determines an appropriate learning rate and momentum for the network to start training with. Once this function is activated, the network is tested using different values of learning rate and momentum to find a combination that yields the lowest average error after a single learning cycle. These are the optimum starting values, as using them improves the error-converging process and thus requires less processing time.
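A hedged sketch of a Smart Start style search. The candidate grid and the one-cycle error function are assumptions; Neural Planner's actual procedure is not documented here.

    import itertools

    def smart_start(one_cycle_error, rates=(0.1, 0.3, 0.5, 0.7, 0.9)):
        """Pick the (learning rate, momentum) pair with the lowest
        average error after a single learning cycle.

        one_cycle_error(lr, momentum) is assumed to train the network
        for one cycle from fresh weights and return the average error.
        """
        return min(itertools.product(rates, rates),
                   key=lambda pair: one_cycle_error(*pair))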

Another attractive feature is the "Auto Decay" function, which can be enabled or disabled. This function automatically adjusts the learning rate and momentum to enable faster and more accurate convergence. The software samples the average error periodically; if it is higher than the previous sample, the learning rate is reduced by 1%. The momentum is "decayed" using the same method, but the sampling rate is half of that used for the learning rate. If both learning rate and momentum decay are enabled, the momentum will therefore decay more slowly than the learning rate.
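A hedged sketch of the Auto Decay rule just described; the sample bookkeeping and all names are assumptions.

    def auto_decay(lr, momentum, err_samples, decay=0.99):
        """Apply the Auto Decay rule over a series of periodic samples
        of the average error.

        The learning rate is cut by 1% whenever a sample exceeds the
        previous one; the momentum uses the same rule but is sampled at
        half that rate (every second sample), so it decays more slowly.
        """
        for i in range(1, len(err_samples)):
            if err_samples[i] > err_samples[i - 1]:
                lr *= decay
        for i in range(2, len(err_samples), 2):
            if err_samples[i] > err_samples[i - 2]:
                momentum *= decay
        return lr, momentum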

In general cases, where these features are not available, a high learning rate and momentum (e.g. 0.9 for both) are recommended, as the network will converge at a faster rate than when lower figures are used. However, too high a learning rate and momentum will cause the error to oscillate and thus prevent convergence. The choice of learning rate and momentum therefore depends on the structure of the data and the objective of using the ANN.

4. Variables selection

In general, the financial variables chosen are constrained by data availability. They are chosen first for their significant influence over stock returns, based on past literature searches and practitioners' opinions, and then on the availability of such data. Most of the data used in this research is provided by Credit Lyonnais Securities (Singapore) Pte Ltd. Stock prices are extracted from a financial database.

Broadly, factors that can affect stock prices can be classified into three categories: economic factors, political factors and firm/stock specific factors. Economic factors have a significant influence on the returns of individual stocks, as well as on the stock index in general, as they have a significant impact on the growth and earnings prospects of the underlying companies, thus affecting valuations and returns. Moreover, economic variables also have a significant influence on the liquidity of the stock market. Some of the economic variables used are: inflation rates, employment figures and the producers' price index.

Many researchers have found that it is difficult to account for more than one third of the monthly variations in individual stock returns on the basis of systematic economic influences, and have shown that political factors can help to explain some of the missing variation. Political stability is vital to the existence of business activities and is the main driving force in building a strong and stable economy. Therefore, it is only natural that political factors such as fiscal policies and budget surpluses/deficits have effects on stock price movements.

Firm specific factors affect only the individual stock's return: for example, financial ratios and technical information that affect the return structure of specific stocks, such as yield factors, growth factors, momentum factors, risk factors and liquidity factors. As far as stock selection is concerned, firm specific factors constitute important considerations, as it is these factors that determine whether a firm is a bright star or a dim light in the industry. Such firm specific factors can be classified into five major categories:

1. Yield factors: these include the "historical P/E ratio" and the "prospective P/E ratio". The former is computed as price/earnings per share; the latter is derived as price/consensus earnings per share estimate. Another variable is the "cashflow yield", which is basically price/operating cashflow of the latest 12 months.

2. Liquidity factors: the most important variable is the "market capitalization", which is determined by price of share × number of shares outstanding.

3. Risk factors: the representative variable is the "earnings per share uncertainty", which is defined as the percentage deviation about the median EPS estimate.

4. Growth factors: basically, this means the "return on equity (ROE)", computed as net profit after tax before extraordinary items/shareholders' equity.

5. Momentum factors: a proxy is derived as the average of the price appreciation over the quarter, with half of its weight on the last month and the remaining weight distributed equally over the other two months (see the sketch after this list).
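Hedged one-liners for some of the factor definitions above; all function and argument names are illustrative assumptions.

    def momentum_proxy(r_month1, r_month2, r_month3):
        """Quarterly momentum proxy: half the weight on the last month
        of the quarter (r_month3), the rest split equally over the first
        two. The r_month* arguments are monthly price appreciations."""
        return 0.5 * r_month3 + 0.25 * r_month2 + 0.25 * r_month1

    def prospective_pe(price, consensus_eps_estimate):
        """Prospective P/E ratio: price over the consensus EPS estimate."""
        return price / consensus_eps_estimate

    def cashflow_yield(price, operating_cashflow_12m):
        """Cashflow yield as defined above: price over the operating
        cashflow of the latest 12 months."""
        return price / operating_cashflow_12m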

The inputs of the neural network stock selection system are the above seven variables, and the output is the return difference between the stock and the market return (the excess return). This enables the neural network to establish the relationships between the inputs and the output (excess returns).

The training data set includes all data available up to the quarter before the testing quarter. This ensures that the latest changes in the relationship between the inputs and the output are captured in the training process.

5. Experiment

The quarterly data required by the project are generally stock prices and financial variables (inputs to the ANN stock selection system) from 1/1/93 to 31/12/96.

Stock prices, which are used to calculate stock returns, are extracted from the financial database. These stock returns, adjusted for dividends, stock splits and bonus issues, are used as the output in the ANN training process.

One unique feature of this research is that the prospective P/E ratio, measured as price/consensus earnings per share estimate, is used as a forecasting variable. This variable has not received much attention in financial research. The prospective P/E ratio is used among practitioners as it can reflect the perceived value of a stock with respect to EPS (earnings per share) expectations. It is used as a value indicator, with implications similar to those of the historical P/E ratio. As such, a low prospective P/E suggests that the stock is undervalued with respect to its future earnings, and vice versa. With its explanatory power, the prospective P/E ratio qualifies as an input to the stock selection system. Data on earnings per share estimates, which are used for the calculation of the EPS uncertainty and the prospective P/E ratio, are available in a compilation of EPS estimates and recommendations put forward by financial analysts. The coverage includes estimates from countries around the Asia Pacific Rim from January 1993.

5.1. Research design

The purpose of this ANN stock selection system is to select stocks that are top performers in the market (stocks that outperformed the market by 5%) and to avoid selecting under performers (stocks that underperformed the market by 5%). More importantly, the aim is to beat the market benchmark (the quarterly return on the market index) on a portfolio basis.

This ANN stock selection system is a quarterly portfolio re-balancing strategy: it selects stocks at the beginning of the quarter, and performance (the return of the portfolio) is assessed at the end of the quarter.

5.2. Design 1 (Basic System)

In this research design, the sample used for training consists of stocks that outperformed and underperformed the market quarterly by 5% from 1/1/93 to 30/6/95.

The inputs of the ANN stock selection system are the seven inputs chosen in Section 4, and the output is the return difference between the stock and the market return (the excess return). This enables the ANN to establish the relationships between the inputs and the output (excess returns).

The training data set includes all data available until the quarter before the testing quarter. This ensures that the latest changes in the relationship between the inputs and the output are captured in the training process.

The generalization ability of the ANN in selecting top performing stocks, and whether the system can perform, is tested across time. The data used for the selection process are from the third quarter of 1995 (1/7/95–30/9/95), the fourth quarter of 1995 (1/10/95–31/12/95), the first quarter of 1996 (1/1/96–31/3/96), the second quarter of 1996 (1/4/96–30/6/96), the third quarter of 1996 (1/7/96–30/9/96) and the fourth quarter of 1996 (1/10/96–31/12/96). The limited test duration is constrained by data availability.

The testing inputs are injected into the system and the predicted output is calculated using the established weights. After that, the top 25 stocks with the highest output value are selected to form a portfolio. These 25 stocks are the ones recommended for purchase at the beginning of the quarter. The generalization ability of the ANN is determined by the performance of the portfolio, measured by the excess return over the market as well as by the percentage of top performers in the portfolio, compared to the benchmark portfolio (testing portfolio) at the end of the quarter.
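A hedged sketch of the selection step; array and function names are illustrative assumptions.

    import numpy as np

    def select_portfolio(predicted_excess, tickers, k=25):
        """Rank stocks by the network's predicted excess return and keep
        the top k (25 in the Basic System) for the quarter's portfolio."""
        order = np.argsort(predicted_excess)[::-1]  # highest output first
        return [tickers[i] for i in order[:k]]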

5.3. Design 2 (Moving Window System)

The Basic System is constrained by the need to meet the minimum sample size required for the training process. This second design forgoes the recommended minimum sample size and introduces a moving window concept, in order to analyze the ANN's ability to perform in a restricted sample size environment.

The input and output variables are identical to those of the Basic System, but the training and testing samples are different. The Moving Window System uses three quarters as the training sample and the subsequent quarter as the testing sample. The selection criterion is also identical to that of the Basic System in research Design 1.
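A minimal sketch of the moving window split; the quarter labels are illustrative.

    def moving_windows(quarters, train_len=3):
        """Yield (training quarters, testing quarter) pairs: train on
        three consecutive quarters, test on the next, then slide forward
        by one quarter."""
        for i in range(len(quarters) - train_len):
            yield quarters[i:i + train_len], quarters[i + train_len]

With the 16 quarters from 1/1/93 to 31/12/96 this yields 13 testing quarters, consistent with the 13 testing quarters reported in Section 5.4.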

5.4. Results

The ANN is trained with 10,000 and 15,000 cycles. These numbers of cycles are used because error convergence is generally slow after 10,000 cycles, suggesting adequate training, and the error does not converge beyond 15,000 cycles, an indicator that the network is over-trained.

Training four hidden neurons for 10,000 cycles takes approximately 1.5 h, eight hidden neurons takes about 3 h, and the most complex architecture (14 hidden neurons) takes about 6 h on a Pentium 100 MHz PC. Architectures that require 15,000 cycles usually take about 1.5 times as long as training the network for 10,000 cycles.

The results of the Basic Stock Selection System, based on the training and testing schedules mentioned, are presented in two forms: (1) the excess return format and (2) the percentage of top performers in the selected portfolio. These two measures are used to assess the performance and generalization ability of the ANN.
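A hedged sketch of the two measures; the names and the compounding convention are assumptions.

    import numpy as np

    def performance_measures(portfolio_returns, market_returns, top_flags):
        """The two assessment measures described above.

        portfolio_returns / market_returns: quarterly returns;
        top_flags: booleans marking stocks that beat the market by 5%.
        """
        excess = np.asarray(portfolio_returns) - np.asarray(market_returns)
        compounded_excess = np.prod(1.0 + excess) - 1.0   # excess return format
        pct_top = np.mean(top_flags)                      # share of top performers
        return compounded_excess, pct_top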

Testing results show that the ANN is able to "beat" the market over time, as shown by the positive compounded excess returns achieved consistently across all architectures and training cycles. This implies that the ANN can generalize relationships over time. Even at the level of individual quarters, the relationships between the inputs and the output established by the training process proved successful, "beating" the market in 6 out of 8 possible quarters, a reasonable 75%. (Fig. 1.)



Fig. 1. Graphical presentation of excess returns for 15,000 cycles.



The Basic Stock Selection System has consistently performed better than the testing portfolio over time. This is evident from the fact that the selected portfolios have a higher percentage of top performing stocks (above 5%) than the testing portfolio over time. This ability has also enabled the network to better the performance of the market index presented earlier. (Fig. 2 and Fig. 3.)



Fig. 2. Performance of portfolios: % of stocks with actual return of 5% above the market, for 10,000 cycles.




Fig. 3. Graphical presentation of excess returns of portfolio.



The Moving Window Selection System is designed to test the generalization power of the ANN in an environment with limited data.

The generalization ability of the ANN is again evident in the Moving Window Stock Selection System, as it outperformed the testing portfolio in 9 out of 13 testing quarters (69.23%). This can be seen in the graphical presentation: the line representing the selected portfolio lies above the line representing the testing portfolio most of the time (Fig. 4). Moreover, the compounded excess returns and the annualized compounded excess returns are more than twice those of the testing portfolio. The selected portfolios have outperformed the market in 10 out of 13 (76.92%) testing quarters, with excess returns of 127.48% over the 13 quarters and 36.5% on an annualized compounded basis, which proves their consistent performance over the market index over time (Fig. 5).



Fig. 4. % of top performers in the portfolio: moving window stock selection system.




Fig. 5. Actual returns: moving window stock selection system.



The selected portfolios outperformed the testing portfolio on nine occasions (69.23%) and equalled its performance on one occasion, which further proves the generalization ability of the ANN. Moreover, the ability to avoid selecting undesirable stocks is also evident from the fact that the selected portfolios contain fewer such stocks than the testing portfolio on 10 out of 13 occasions (76.92%).

From the experimental results, the selected portfolios outperformed the testing and market portfolios in terms of compounded actual returns over time. The reason is that the selected portfolios outperform the other two categories of portfolios in most of the testing quarters, thus achieving a better overall position at the end of the testing period.

6. Conclusion and future works

The ANN has displayed its generalization ability in this particular application. This is evident in its ability to single out performing stock counters and to deliver excess returns in the Basic Stock Selection System over time. Moreover, the neural network has also shown its ability to derive relationships in a constrained environment in the Moving Window Stock Selection System, making it even more attractive for applications in the field of finance.

This paper is largely constrained by the availability of data. When more data become available, the performance of the neural networks can be better assessed under various kinds of market conditions, such as bull, bear, high inflation, low inflation or even different political conditions, all of which have different impacts on stocks.

Also, as more powerful neural architectures are being discovered by researchers at a fast pace, it would be worthwhile to repeat the experiments using several architectures and compare the results. The best performing structure may then be employed.

References

Alici, Y. (1996). Neural networks in corporate failure prediction: the UK experience. In Proceedings of the Third International Conference on Neural Networks in the Capital Markets, London, 11–13 October 1995. Singapore: World Scientific.

Burgess, A. N., & Refenes, A. N. (1996). Modelling non-linear co-integration in international equity index futures. In Proceedings of the Third International Conference on Neural Networks in the Capital Markets, London, 11–13 October 1995. Singapore: World Scientific.

Chinetti, D., & Rossignoli, C. (1993). A neural network model for stock market prediction. Tech. Rep., Universita Statale di Milano, Department of Computer Science, Milan, Italy.

Dutta, S., & Shekhar, S. (1993). Bond rating: a non-conservative application of neural networks. In R. R. Trippi & E. Turban (Eds.), Neural networks in finance and investing (pp. 257–273). Probus Publishing Company. Reprinted from Proceedings of the IEEE International Conference on Neural Networks, July 1988 (pp. II443–II450).

Gencay, R., & Stengos, T. (1996). The predictability of stock returns with local versus global nonparametric estimators. In Proceedings of the Third Conference on Neural Networks in the Capital Markets, London, 11–13 October 1995. Singapore: World Scientific.

Odom, M. D., & Sharda, R. (1993). A neural network model for bankruptcy prediction. In R. R. Trippi & E. Turban (Eds.), Neural networks in finance and investing (pp. 178–185). Probus Publishing Company. Reprinted from Proceedings of the IEEE International Conference on Neural Networks (pp. II163–II168). San Diego, CA: IEEE.

Raghupathi, W., Schkade, L. L., & Raju, B. S. (1991). A neural network approach to bankruptcy prediction. In Proceedings of the IEEE 24th Annual Hawaii International Conference on System Sciences.

Robles, J. V., & Naylor, C. D. (1996). Applying neural networks in copper trading: a technical analysis simulation. In Proceedings of the Third International Conference on Neural Networks in Capital Markets, London, 11–13 October 1995. Singapore: World Scientific.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In Parallel distributed processing (Vols. 1 and 2). Cambridge, MA: MIT Press.

Salchenberger, L. M., Cinar, E. M., & Lash, N. A. (1992). Neural networks: a new tool for predicting thrift failures. Decision Sciences, 23(4), 899–916.

Westheider, O. (1994). Stock return predictability: a neural network approach. Information Systems Working Paper #8-93, The John E. Anderson Graduate School of Management at UCLA.

Corresponding author; email: itsquah@ntu.edu.sg


Expert Systems with Applications, Volume 17, Issue 4, November 1999, Pages 295–301