The ability of an artificial neural network and of polynomial fitting to model the horizontal deformation field was compared. The comparison uses the horizontal components of the GPS solutions in the Cascadia Subduction Zone. One set of the data is used to calculate the unknown parameters of the model, and the other is used only for testing the accuracy of the model. The problem of overfitting (i.e., the substantial oscillation of the model between the training points) can be avoided by restricting the flexibility of the neural model. This can be done using an independent data set, namely the validation data, which has not been used to determine the parameters of the model. The proposed method is the so-called "stopped search method", which can be used to obtain a smooth and precise fitting model. When fitting a high-order polynomial, however, it is hard to overcome the negative effect of overfitting. The computations are performed with Mathematica, and the results are given in a symbolic form that can be used in the analysis of crustal deformation, e.g. strain analysis.
Crustal velocity field modelling with neural network and polynomials
Piroska Zaletnyik (1), Khosro Moghtased-Azar (2)
(1) Department of Geodesy and Surveying, Budapest University of Technology and Economics, Hungary, piri@agt.bme.hu
(2) Institute of Geodesy, University of Stuttgart, Germany, moghtased@gis.uni-stuttgart.de
The adaptation of neural networks to the modeling of the deformation field offers geodesists a suitable tool for describing structural deformation. Overfitting can occur with higher-order polynomials, but the neural network overcomes the problem thanks to the stopped search method. The greatest advantage of this method is that the solution can be given as an analytical function, which can be used to compute the derivatives of the velocity vectors for strain analysis.
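As a small illustration of the last point, the sketch below shows how the derivatives of an analytic velocity model lead to strain rates. It is not taken from the poster: the linear velocity field, its coefficients `a`, `b`, `c`, `d`, and the function name `strain_rate` are hypothetical and chosen only to keep the arithmetic easy to follow. The symmetric part of the velocity gradient is the strain-rate tensor and the antisymmetric part is the rotation rate.

```python
import numpy as np

def strain_rate(grad_v):
    """Split a 2x2 horizontal velocity gradient into its symmetric part
    (strain rate) and its antisymmetric part (rotation rate)."""
    grad_v = np.asarray(grad_v, dtype=float)
    strain = 0.5 * (grad_v + grad_v.T)
    rotation = 0.5 * (grad_v - grad_v.T)
    return strain, rotation

# Hypothetical analytic model (assumption, not the poster's result):
#   v_east  = a*x + b*y
#   v_north = c*x + d*y
# For a linear field the gradient is constant; for a polynomial or neural
# network model it would be evaluated from the analytic derivatives at (x, y).
a, b, c, d = 2.0, 1.0, 3.0, -1.0
grad_v = [[a, b],   # d v_east / dx,  d v_east / dy
          [c, d]]   # d v_north / dx, d v_north / dy

strain, rotation = strain_rate(grad_v)
principal = np.linalg.eigvalsh(strain)  # principal strain rates, ascending
dilatation = np.trace(strain)           # areal dilatation rate

print(strain, rotation, principal, dilatation)
```

The units of the result follow from the inputs: velocities in mm/year over coordinates in km would give strain rates in units of 10^-6 per year.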
The first author wishes to thank the Hungarian Eötvös Fellowship for supporting her visit to the Department of Geodesy and Geoinformatics of the University of Stuttgart (Germany), where this work was accomplished.
Abstract
Results
Fig. 1. GPS-determined horizontal velocity field from the Pacific Northwest Geodetic Array (PANGA), plotted relative to the North American Plate.
Introduction
Polynomial fitting
When there are more points than parameters, an adjustment calculation, i.e. polynomial fitting, is possible. In this case 62 points were used for the adjustment. The calculations were carried out with Mathematica.
Figure 4: Northing velocity model by polynomials
Figure 5: Northing velocity model by neural networks
Conclusion
Acknowledgement
Neural network with stopped search method
Testing set residuals (mm/year)
                       min     max    mean    Std.
Northing velocities  -11.1    36.3     1.4     9.0
Easting velocities    -9.3    17.3     1.8     6.1
Teaching set residuals (mm/year)
                       min     max    mean    Std.
Northing velocities   -4.1     3.9     0.0     1.2
Easting velocities    -7.9     8.2     0.0     2.8
Table 1. The statistics of the differences between the polynomial model and real velocities in the 62 teaching points.
Testing set residuals (mm/year)
                       min     max    mean    Std.
Northing velocities   -6.2     8.3    -0.5     2.8
Easting velocities    -4.6     8.9     0.2     3.8
Teaching set residuals (mm/year)
                       min     max    mean    Std.
Northing velocities   -4.9     5.5     0.0     1.6
Easting velocities    -6.8     9.5     0.2     3.1
GPS measurements to determine crustal strain rates were initiated in the Cascadia region (US Pacific Northwest and south-western British Columbia, Canada) more than a decade ago, with the first campaign measurements in 1986 and the establishment of permanent stations in 1991. Nowadays, continuous GPS data from the Pacific Northwest Geodetic Array (PANGA) are processed by the geodesy laboratory that serves as the array's data analysis facility. This organization has deployed an extensive network of continuous GPS sites that measure crustal deformation along the Cascadia Subduction Zone (CSZ). Fig. 1 illustrates the horizontal velocity field along the Cascadia margin, assuming the North American plate to be stable.
Table 2. The statistics of the differences between the 3D polynomial model and real velocities in the 20 testing points.
Overfitting means that the error of the teaching set keeps decreasing while the error of the testing set grows. In other words, the network fits the teaching points excessively, as illustrated by Fig. 3.
Overfitting problem
Fig. 2. Overfitting problem
Comparing the results of Table 1 and Table 2, we recognize a significant difference between the deviations of the teaching set and the testing set. The model determined by the polynomial works well only at the teaching points; between them it does not work as well. The testing set, which was not used during the determination of the model, is therefore also needed in order to qualify the results.
As a classical approximation model, the 3D polynomial fitting technique is used to build a continuous velocity field as a function of geodetic coordinates. The displacement vector, which can be derived from GPS observations, has east, north and up components in topocentric coordinates. For modeling the horizontal displacement field we use only the north and the east components. The accuracy of the modeling is determined by the differences between the true values and the values estimated by the 3D polynomial fitting.

When we increase the degree of the polynomial, the accuracy increases up to the 6th degree, but above that it starts to decrease because the conditioning of the equations deteriorates (ill-conditioned equations). The 6th-order polynomial was the best-fitting model. In this case at least 28 points are needed, because a two-variable 6th-order polynomial has 28 parameters.
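The parameter count and the adjustment can be illustrated with a short sketch. The poster's computations were done in Mathematica; the Python/NumPy code below is only an assumed equivalent, and the synthetic velocity field, the noise level, and all function names (`n_terms`, `design_matrix`, `fit_poly2d`, `residual_stats`) are hypothetical stand-ins rather than the PANGA data or the authors' code. A full two-variable polynomial of degree n has (n + 1)(n + 2)/2 coefficients, which gives 28 for n = 6.

```python
import numpy as np

def n_terms(degree):
    """Number of coefficients of a full two-variable polynomial."""
    return (degree + 1) * (degree + 2) // 2

def design_matrix(x, y, degree):
    """One column x**i * y**j for every pair with i + j <= degree."""
    cols = [x**i * y**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)

def fit_poly2d(x, y, v, degree):
    """Least-squares adjustment; also returns the condition number
    of the design matrix as an indicator of ill-conditioning."""
    A = design_matrix(x, y, degree)
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return coef, np.linalg.cond(A)

def residual_stats(res):
    """min / max / mean / standard deviation, as in the residual tables."""
    return {"min": res.min(), "max": res.max(),
            "mean": res.mean(), "std": res.std(ddof=1)}

# Synthetic stand-in data (assumption): 62 teaching and 20 testing points
# with coordinates scaled to [-1, 1] and an arbitrary smooth field plus noise.
rng = np.random.default_rng(0)
x_t, y_t = rng.uniform(-1, 1, 62), rng.uniform(-1, 1, 62)
x_s, y_s = rng.uniform(-1, 1, 20), rng.uniform(-1, 1, 20)

def field(x, y):
    return 5.0 * np.sin(2 * x) + 3.0 * x * y

v_t = field(x_t, y_t) + rng.normal(0, 0.5, 62)
v_s = field(x_s, y_s) + rng.normal(0, 0.5, 20)

coef, cond = fit_poly2d(x_t, y_t, v_t, degree=6)
teach = residual_stats(v_t - design_matrix(x_t, y_t, 6) @ coef)
test = residual_stats(v_s - design_matrix(x_s, y_s, 6) @ coef)

print(n_terms(6), f"condition number {cond:.1e}")
print("teaching:", teach)
print("testing: ", test)
```

Repeating the fit for increasing degrees and watching the condition number is one way to see the ill-conditioning mentioned above; scaling the coordinates before building the design matrix, as assumed here, usually delays it.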
A central issue in choosing the most suitable model for a given problem is selecting the right structural complexity. Clearly, a model that contains too few parameters will not be flexible enough to approximate important features in the data. If the model contains too many parameters, it will approximate not only the data but also the noise in the data.
Overfitting may be avoided by restricting the flexibility of the neural model in some way. The Neural Networks package in Mathematica offers a few ways to handle the overfitting problem. All of them rely on the use of a second, independent data set, the so-called validation data, which has not been used to train the model. One way to handle this problem is the stopped search method.

Stopped search refers to taking the network's parameters from some intermediate iteration of the training process and not from the final iteration, as is normally done. During training the parameter values change so as to minimize the mean square error (MSE) on the training data. Using the validation data, it is possible to identify the intermediate iteration at which the parameter values yield the minimum MSE on the validation data. At the end of the training process, the parameter values at this minimum are the ones used in the delivered network model.
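The procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Mathematica Neural Networks package: plain gradient descent is swapped in as the training algorithm, the network is a single tanh hidden layer with a linear output, and the data, the learning rate, and the function names (`init_params`, `predict`, `mse`, `gradient_step`, `train_stopped_search`) are hypothetical. Training runs for the full number of iterations, and the parameters with the lowest validation MSE are the ones returned.

```python
import numpy as np

def init_params(n_in, n_hidden, rng):
    """Random initial weights for one tanh hidden layer, linear output."""
    return {"W1": rng.normal(0, 0.5, (n_in, n_hidden)),
            "b1": np.zeros(n_hidden),
            "W2": rng.normal(0, 0.5, n_hidden),
            "b2": 0.0}

def predict(p, X):
    return np.tanh(X @ p["W1"] + p["b1"]) @ p["W2"] + p["b2"]

def mse(p, X, v):
    return float(np.mean((predict(p, X) - v) ** 2))

def gradient_step(p, X, v, lr):
    """One gradient-descent step on the MSE; returns a new parameter dict."""
    n = len(v)
    H = np.tanh(X @ p["W1"] + p["b1"])           # hidden activations
    err = H @ p["W2"] + p["b2"] - v              # prediction errors
    dH = np.outer(err, p["W2"]) * (1.0 - H**2)   # back-propagated errors
    grads = {"W1": 2.0 * X.T @ dH / n,
             "b1": 2.0 * dH.sum(axis=0) / n,
             "W2": 2.0 * H.T @ err / n,
             "b2": 2.0 * err.sum() / n}
    return {k: p[k] - lr * grads[k] for k in p}

def train_stopped_search(X_learn, v_learn, X_val, v_val,
                         n_hidden=7, max_iter=500, lr=0.02, seed=0):
    """Train on the learning set for max_iter steps and keep the
    parameters of the iteration with the smallest validation MSE."""
    rng = np.random.default_rng(seed)
    p = init_params(X_learn.shape[1], n_hidden, rng)
    best_p, best_mse, best_iter = p, mse(p, X_val, v_val), 0
    for it in range(1, max_iter + 1):
        p = gradient_step(p, X_learn, v_learn, lr)
        val = mse(p, X_val, v_val)
        if val < best_mse:
            best_p, best_mse, best_iter = p, val, it
    return best_p, best_iter, best_mse

# Synthetic stand-in data (assumption): 42 learning and 20 validation points.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (62, 2))
v = 5.0 * np.sin(2 * X[:, 0]) + 3.0 * X[:, 0] * X[:, 1] + rng.normal(0, 0.5, 62)
X_learn, v_learn, X_val, v_val = X[:42], v[:42], X[42:], v[42:]

best_p, best_iter, best_mse = train_stopped_search(X_learn, v_learn, X_val, v_val)
print("best iteration:", best_iter, "validation MSE:", best_mse)
```

Because each step builds a new parameter dictionary, keeping a reference to the best one is enough and no copy is needed. With a faster-converging optimizer the validation minimum would typically appear at a different iteration than with this simple sketch.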
To avoid overfitting by means of the stopped search method, we need two data sets: a learning set and a validation set. Hence, we divide the teaching set (62 points) into two sets: a learning set with 42 points and a validation set with the remaining 20 points.
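A possible way to make such a split is sketched below. The poster does not say how the 42 and 20 points were chosen, so the random permutation, the seed, and the function name `split_teaching_set` are assumptions for illustration only.

```python
import numpy as np

def split_teaching_set(n_points=62, n_learn=42, seed=0):
    """Randomly divide point indices into a learning and a validation part."""
    idx = np.random.default_rng(seed).permutation(n_points)
    return np.sort(idx[:n_learn]), np.sort(idx[n_learn:])

learn_idx, val_idx = split_teaching_set()
print(len(learn_idx), len(val_idx))
```

For spatial data it may be preferable to check that both subsets cover the study area reasonably evenly, since a validation set clustered in one region would only monitor the fit there.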
Fig. 3 shows the errors of the learning and the validation sets during the learning procedure of the neural network model for the northing velocities. The errors of the learning set (continuous line) decrease during the whole procedure, but the errors of the validation set (dashed line) decrease only until the 262nd iteration step and grow from that point on. The maximum number of iterations was 500, but the best parameter set is the one calculated at the 262nd iteration step.
The hidden layer of the model contained 7 neurons (nodes). The selection of the number of neurons depends basically on the number of known points: with more known data we can increase the number of neurons.
Let us check the statistics of the residuals for the whole teaching set (62 points) in Table 3.
Fig. 3. Errors of the learning set (continuous line) and the validation set (dashed line) during the stopped search method
Table 3. The statistics of the differences between the neural network model and real velocities in the 62 teaching points.
Table 4. The statistics of the differences between the neural network model and real velocities in the 20 testing points.
Let us see the differences between the neural network model and the real velocities in the 20 testing points (Table 4).
Using a neural network model with the stopped search technique we can obtain a smooth and well-fitting model, while a high-order polynomial model shows substantial oscillations between the teaching points (see Figs. 4 and 5).