Forecasting the Insolvency of U.S. Banks Using Support Vector Machines (SVM) Based on Local Learning Feature Selection

Theophilos Papadimitriou*, Periklis Gogas+, Vasilios Plakandaras1, John C. Mourmourisa

Department of International Economic Relations and Development, Democritus University of Thrace.

Abstract

We propose a Support Vector Machine (SVM) based structural model to forecast the collapse of banking institutions in the U.S. using publicly disclosed information from their financial statements over a four-year rolling window. In our approach, the optimal input variable set is selected from a large dataset using an iterative relevance-based selection procedure. We train an SVM model to classify banks as solvent or insolvent. The resulting model exhibits significant ability in bank default forecasting.

JEL Code: G21

Key words: Bank insolvency, SVM, local learning, feature selection.

1. Introduction

Forecasting bank failures is of utmost importance to bank supervision agencies and is also a key issue in risk management, especially when it comes to asset allocation and diversification. Thus, forecasting bank insolvency is an active field of research with intense interest from market participants and policy makers.

The negative spillovers of a possible bank failure, through contagion at both the microeconomic and the macroeconomic level, may be significant. This renders the early identification of banking institutions that may be in financial distress an important goal for the supervising authorities.




* papadimi@ierd.duth.gr
+ pgkogkas@ierd.duth.gr
1 vplakand@ierd.duth.gr
a jomour@eexi.gr

There is a vast literature on bank failure prediction. Researchers from various scientific fields have applied several methodologies to predict bank collapses, such as neural networks (Sietsma & Dow, 1991; Huang & Wu, 2011), expert systems (Zopounidis et al., 1998; Lane et al., 1986), rough set theory (McKee, 2000) and various econometric techniques (Martin, 1977; Pouran, 1991; Kolari et al., 2002; Cole et al., 1995). Recently, machine learning techniques have been used extensively due to their ability to reveal data patterns while dealing efficiently with highly non-linear systems (Min & Lee, 2005; Kim et al., 2010; Lin et al., 2009).

The scope of this paper is to develop a structural SVM model that can efficiently forecast banking failures using a regulatory-efficient, minimum-sized variable set.

2. Methodology and Dataset

2.1 Support Vector Machine (SVM)

The Support Vector Machine is a classification and regression technique proposed by Vapnik (1992). When it comes to classification, the basic idea is to find an optimal hyperplane that separates the data points into two or more classes; the hyperplane is defined by a small subset of the points, called Support Vectors. The support vector set is located in the dataset through a minimization procedure. Of course, not all datasets are linearly separable (even when using the error-tolerant SVM model), so with the use of a non-linear kernel function the system can be projected to a higher-dimensional space where linear separation is possible (Cortes & Vapnik, 1995).

One of the main advantages of SVM over other machine learning methods is that it reaches the global minimum and avoids getting trapped in local ones when computing the optimal solution. This aspect is crucial to the generalization ability of the SVM, as it produces accurate and reliable forecasts. The model is built in two steps: the training step and the testing step. In the training step, the largest part of the dataset is used for the estimation of the separating hyperplane (i.e. the detection of the Support Vectors that define the hyperplane); in the testing step, the generalization ability of the model is evaluated by checking the model's performance on the small subset that was left aside in the first step.
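The two-step scheme above can be sketched as follows, assuming scikit-learn (which builds on the LIBSVM library listed in the references) and a synthetic dataset standing in for the bank data:

```python
# Two-step SVM scheme: fit the separating hyperplane on a training split,
# then evaluate generalization on the held-out testing split.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data: 300 cases, 6 input variables.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # largest part goes to training

model = SVC(kernel="rbf")               # RBF kernel, as used in the paper
model.fit(X_train, y_train)             # training step: locate Support Vectors
accuracy = model.score(X_test, y_test)  # testing step: generalization check
print(len(model.support_), accuracy)
```

`model.support_` holds the indices of the training points selected as Support Vectors, i.e. the small subset that defines the hyperplane.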





2.2 Overfitting

In a machine learning scheme, training results in overfitting when the produced model is significantly affected by possible noise or short-run dynamics in the sample at hand instead of the true underlying relationship that describes the phenomenon. Usually, overfitting yields high performance in the training step and a significant accuracy drop in the testing step. Overfitting can be avoided using k-fold cross validation: the dataset is cut into k subsets and the training-testing steps are repeated k times. In each turn a different subset is used as the test dataset, while the remaining k-1 subsets are grouped together to form the training dataset. The procedure is called dataset folding. The model is evaluated by averaging its performance over every fold. In Figure 1 we present a graphical representation of a 3-fold cross-validation system. The model's generalization ability is further tested using an out-of-sample set (a subset that did not participate in the cross-validation procedure).
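A minimal sketch of the k-fold procedure for k = 3, again assuming scikit-learn and synthetic stand-in data:

```python
# 3-fold cross validation: each fold serves once as the test set while the
# remaining k-1 folds train the model; the fold accuracies are averaged.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=6, random_state=0)

scores = []
for train_idx, test_idx in KFold(n_splits=3, shuffle=True, random_state=0).split(X):
    model = SVC(kernel="rbf").fit(X[train_idx], y[train_idx])  # k-1 folds train
    scores.append(model.score(X[test_idx], y[test_idx]))       # 1 fold tests

mean_accuracy = np.mean(scores)  # model evaluation: average over folds
print(len(scores), mean_accuracy)
```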



2.3 Dataset

The data used in this paper come from the annual financial statements of 300 U.S. banks, spanning four subsequent years for each bank. The overall collection window covers the period from 2003 to 2011. For these banks we collected data from t-1 to t-4 in order to forecast their state in year t. The data were gathered from the Federal Deposit Insurance Corporation (FDIC). We selected 37 variables for each year, resulting in a 148-variable dataset in our cross-section.


[Figure 1: Overview of a 3-fold cross-validation scheme. The initial dataset is folded three ways into training and testing subsets; each fold produces a model, and model evaluation averages the performance of the three models.]



In order to eliminate variables that are redundant or irrelevant to our purpose from the dataset, we performed the variable selection procedure introduced by Sun et al. (2010)b. The method calculates a relevance factor for every variable in the dataset, aiming to create an input vector set with only the most relevant variables. Starting from the variable with the highest relevance factor, we define the input variable set iteratively. In every step, we add the variable with the next highest relevance to the input set and test the forecasting accuracy of the model. If the model accuracy increases with the current vector set, we continue the expansion procedure. If the model accuracy drops, the algorithm ends and we keep the input set that yielded the highest forecasting accuracy.
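The iterative expansion loop can be sketched as follows. Note that the local-learning relevance factors of Sun et al. (2010) are assumed to be given; the univariate F-score used below is purely an illustrative stand-in for that ranking, and the data are synthetic:

```python
# Greedy forward selection over a relevance ranking: keep adding the next
# most relevant variable while cross-validated accuracy keeps improving.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

relevance, _ = f_classif(X, y)         # stand-in for the relevance factors
ranking = np.argsort(relevance)[::-1]  # highest relevance first

selected, best_acc = [], 0.0
for idx in ranking:
    candidate = selected + [idx]
    acc = cross_val_score(SVC(kernel="rbf"), X[:, candidate], y, cv=3).mean()
    if acc <= best_acc:                # accuracy dropped: stop expanding
        break
    selected, best_acc = candidate, acc

print(len(selected), best_acc)
```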

The described scheme was performed for four kernels: the linear, the radial basis function (RBF), the sigmoid and the polynomial. The mathematical representation of each kernel is:

Linear:      K(x_i, x_j) = x_i^T x_j                    (1)

RBF:         K(x_i, x_j) = exp(-γ ||x_i - x_j||^2)      (2)

Polynomial:  K(x_i, x_j) = (γ x_i^T x_j + r)^d          (3)

Sigmoid:     K(x_i, x_j) = tanh(γ x_i^T x_j + r)        (4)

with d, r and γ representing the kernel parameters.
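The four kernels can be written out directly in NumPy as a sanity check of the formulas, here following the standard LIBSVM definitions with illustrative parameter values for γ, r and d:

```python
# Direct NumPy implementations of the four kernel functions (Eqs. 1-4).
import numpy as np

def linear(xi, xj):
    return xi @ xj                                   # Eq. (1)

def rbf(xi, xj, gamma=0.5):
    return np.exp(-gamma * np.sum((xi - xj) ** 2))   # Eq. (2)

def polynomial(xi, xj, gamma=0.5, r=1.0, d=3):
    return (gamma * (xi @ xj) + r) ** d              # Eq. (3)

def sigmoid(xi, xj, gamma=0.5, r=1.0):
    return np.tanh(gamma * (xi @ xj) + r)            # Eq. (4)

xi, xj = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print(linear(xi, xj), rbf(xi, xj), polynomial(xi, xj), sigmoid(xi, xj))
```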

The ratio of solvent to insolvent banks in our dataset was chosen to be 2 to 1. In order to balance the dataset we attributed a weight to every case: 1 for the solvent banks and 2 for the insolvent banks. Consequently, inside the core of the SVM training, in the minimization procedure, a failure to forecast an insolvent bank adds double the amount to the objective function compared to a failure to forecast a solvent bank.
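This kind of per-class cost weighting can be expressed in scikit-learn through the `class_weight` argument of `SVC`, which scales the error penalty per class inside the objective function. A minimal sketch, with synthetic 2:1 data standing in for the bank sample and class 1 playing the role of the insolvent banks:

```python
# Weighted SVM: misclassifying a minority-class (insolvent) case costs
# double what misclassifying a majority-class (solvent) case costs.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in dataset with roughly a 2:1 class ratio.
X, y = make_classification(n_samples=150, weights=[2 / 3], random_state=0)

model = SVC(kernel="rbf", class_weight={0: 1, 1: 2})  # insolvent errors cost 2x
model.fit(X, y)
print(model.class_weight_)
```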

3. Empirical Results

After selecting the optimum input variable set iteratively with the variable selection technique described in Section 2.3, the model with the best generalization ability on the test set consists of 6 input variables. The out-of-sample forecasting accuracy reaches 92% on the unknown dataset, a considerably accurate prediction performance. The best results are achieved using the RBF kernel and are illustrated in Table 1.




b The detailed analysis of the variable selection method is beyond the scope of this correspondence. The interested reader is referred to Sun et al. (2010).


Table 1: Results of successive addition of variables for the RBF kernel

Iteration   Variable                                    Out-of-sample forecasting accuracy
1           Equity capital to assets ratio (year 4)     78%
2           Yield on earning assets (year 2)            86%
3           Core capital ratio (year 4)                 86%
4           Net interest margin (year 4)                84%
5           Returns on earning assets (year 1)          80%
6           Goodwill and other intangibles (year 4)     92%


Equity capital to assets measures the proportion of assets funded by shareholders or, inversely, the amount of assets funded by debt; it is thus a measure of leverage. A bank with a high equity/asset ratio is exposed to less financial risk. Moreover, return on earning assets measures the total interest, dividends and fee income as a fraction of earning assets and is commonly used by banking regulators for evaluating bank solvency. The core capital ratio is the percentage of the bank's total capital to risk-weighted assets and is also commonly monitored by regulating authorities. Overall, the first three variables can be regarded as proxies for bank exposure to risk under the Risk Coverage and Containing Leverage provisions of Basel III (Bank for International Settlements, 2012).

On a similar path, net interest margin measures the difference between interest earned (on assets) and interest paid (on liabilities) by the bank. Net interest margin is a straightforward measure of the efficiency of the core banking function, i.e. borrowing and lending.

On the contrary, goodwill and other intangibles accounts for earnings from non-asset revenues and can be viewed as a measure of the difference between accounting records and market valuation. Indeed, the addition of this variable raised the overall accuracy, leading to the conclusion that the market's appreciation of a banking institution does affect its financial future. After all, the keystone of every healthy banking institution is the faith of its shareholders in its solvency and its standing with the public and market participants regarding its profitability and ability to pay high dividends. All of the above is concentrated in this one variable.

However, as mentioned before, apart from overall performance we are also interested in insolvency forecasting. Consequently, it is important for our purpose to separate the forecasting accuracy of our model on the solvent and the insolvent states. Thus, in Table 2 we decompose the forecasting results into the two cases under examination.


Table 2: Comparison of predicted to real incidents

                        Real Solvent    Real Insolvent
Predicted Solvent            33                4
Predicted Insolvent           0               13



We remind the reader that the overall forecasting accuracy was 92%. In the insolvent bank cases the out-of-sample forecasting accuracy of our model is 76.4% (13 out of 17); in the solvent case the forecasting accuracy is 100% (33 out of 33).
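These per-class figures follow directly from the Table 2 confusion matrix; a quick check of the arithmetic:

```python
# Per-class accuracies from the Table 2 confusion matrix entries.
TP, FN = 13, 4   # insolvent banks: correctly / wrongly predicted
TN, FP = 33, 0   # solvent banks:   correctly / wrongly predicted

overall = (TP + TN) / (TP + TN + FP + FN)  # 46 of 50 out-of-sample banks
solvent_acc = TN / (TN + FP)               # 33 of 33 solvent banks
insolvent_acc = TP / (TP + FN)             # 13 of 17 insolvent banks
print(overall, solvent_acc, insolvent_acc)
```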

4. Conclusion

Forecasting bank insolvency is of great interest to both policy makers and market participants. The aim of this paper was to forecast bank failures using information that is publicly available in the financial statements of banking institutions. In doing so, within a machine learning framework we employed a structural SVM model and a variable selection procedure. The proposed scheme resulted in a model with just six input variables (out of one hundred forty-eight tested), yielding an out-of-sample overall forecasting accuracy of 92% and an insolvency forecasting accuracy of 76.4%.


Acknowledgments

This research has been co-financed by the European Union (European Social Fund, ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF), Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.


References

Chang Chih-Chung and Lin Chih-Jen (2011), LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 27:1-27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

Cole Rebel A. and Gunther Jeffery W. (1995), Separating the likelihood and timing of bank failure, Journal of Banking & Finance, vol. 19, pp. 1073-1089.

Cortes Corinna and Vapnik Vladimir (1995), Support-Vector Networks, Machine Learning, vol. 20, pp. 273-297.

Huang Shian-Chang and Wu Cheng-Feng (2011), Customer credit quality assessments using data mining methods for banking industries, African Journal of Business Management, vol. 5(11), pp. 4438-4445.

Kim Hong Sik and Sohn So Young (2010), Support vector machines for default prediction of SMEs based on technology credit, European Journal of Operational Research, vol. 201, pp. 838-846.

Kolari James, Glennon Dennis, Shin Hwan and Caputo Michele (2002), Predicting large US commercial bank failures, Journal of Economics and Business, vol. 54, pp. 361-387.

Lane William R., Looney Stephen W. and Wansley James W. (1986), An application of the Cox proportional hazards model to bank failure, Journal of Banking and Finance, vol. 10, pp. 511-531.

Lin Shih-Wei, Shiue Yeou-Ren, Chen Shih-Chi and Cheng Hui-Miao (2009), Applying enhanced data mining approaches in predicting bank performance: A case of Taiwanese commercial banks, Expert Systems with Applications, vol. 36, pp. 11543-11551.

Martin Daniel (1977), Early warning of bank failure: A logit regression approach, Journal of Banking and Finance, vol. 1, pp. 249-276.

McKee Thomas E. (2000), Developing a bankruptcy prediction model via rough sets theory, International Journal of Intelligent Systems in Accounting, Finance & Management, vol. 9, pp. 159-173.

Min Jae H. and Lee Young-Chan (2005), Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters, Expert Systems with Applications, vol. 28, pp. 603-614.

Pouran Espahbodi (1991), Identification of problem banks and binary choice models, Journal of Banking and Finance, vol. 15, pp. 53-71.

Sietsma Jocelyn and Dow Robert J. F. (1991), Creating artificial neural networks that generalize, Neural Networks, vol. 4, pp. 67-79.

Sun Yijun, Todorovic Sinisa and Goodison Steve (2010), Local learning based feature selection for high dimensional data analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1610-1626.

Zhao Huimin, Sinha Atish P. and Ge Wei (2009), Effects of feature construction on classification performance: An empirical study in bank failure prediction, Expert Systems with Applications, vol. 36, pp. 2633-2644.

Zopounidis C. and Dimitras A. (1998), Multicriteria Decision Aid Methods for the Prediction of Business Failure, Springer, 1st edition.