# Optimized Weather Pattern Identification Via Unsupervised Neural Network Techniques

Jeffrey Copeland

Work supported by

U.S. Army National Ground Intelligence Center (NGIC)

also DHS, BRGM, & JPEO-CBD

March 9, 2012

Motivation


How can we make better sense of very large data sets?

[…]-order” statistics do not reveal much.

What we are really interested in is the patterns, their
frequencies of occurrence, and the changes with time.

Classification method should not require input from a
subject matter expert

Self-Organizing Maps

Each node is aware of what is
happening in its neighborhood
and responds to changes in the
neighborhood

1. Nodes are initialized and trained

Randomly selected from training vectors

Randomly chosen to span range of training data

Etc.
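The two initialization options listed above could be sketched as follows. This is a minimal illustration in plain Python (no external libraries); the function names, the toy training vectors, and the fixed random seed are assumptions made for the example, not part of the original work.

```python
import random

def init_from_samples(training, n_nodes, rng):
    """Option 1: use randomly selected training vectors as initial node weights."""
    return [list(v) for v in rng.sample(training, n_nodes)]

def init_spanning_range(training, n_nodes, rng):
    """Option 2: draw each weight component uniformly between the minimum and
    maximum of that component over the training data."""
    dims = len(training[0])
    lo = [min(v[k] for v in training) for k in range(dims)]
    hi = [max(v[k] for v in training) for k in range(dims)]
    return [[rng.uniform(lo[k], hi[k]) for k in range(dims)]
            for _ in range(n_nodes)]

# Hypothetical toy data: four 2-component training vectors.
rng = random.Random(0)
training = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
nodes_a = init_from_samples(training, 3, rng)
nodes_b = init_spanning_range(training, 3, rng)
```

Either way, the starting weights lie inside the range of the data, so early training steps do not have to pull nodes in from far outside it.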


[Figure: training vectors and the SOM]


1. A regular mapping of nodes is initialized and trained

2. For each training vector, find the best matching node

Based on a distance measure (Euclidean, etc.)
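A minimal sketch of the best-matching-node search, assuming a Euclidean distance measure. The function names and the toy node weights are illustrative assumptions; any other distance function could be passed in its place.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_matching_node(nodes, vector, distance=euclidean):
    """Return the index of the node whose weights are closest to `vector`."""
    return min(range(len(nodes)), key=lambda i: distance(nodes[i], vector))

# Hypothetical node weights and one training vector.
nodes = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
bmu = best_matching_node(nodes, [0.9, 1.2])
print(bmu)  # 1
```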


3. Best matching node and its neighbors are made
more like the training vector

$$W_{t+1} = W_t + \Theta_t L_t (V_t - W_t)$$

$\Theta$: neighborhood function

$L$: learning rate
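The update rule can be sketched as a single pass over all nodes. This is an illustrative example only: the function name is hypothetical, and the neighborhood weights in `theta` are assumed values chosen so that node 1 plays the role of the best match.

```python
def update_nodes(nodes, vector, theta, rate):
    """Apply W <- W + theta * rate * (V - W) to every node.

    theta[i] is the neighborhood weight of node i relative to the best
    matching node; rate is the current learning rate L.
    """
    return [[w + theta[i] * rate * (v - w) for w, v in zip(node, vector)]
            for i, node in enumerate(nodes)]

nodes = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
vector = [2.0, 2.0]
theta = [0.5, 1.0, 0.0]  # assumed: node 1 is the best match, node 2 is out of range
updated = update_nodes(nodes, vector, theta, rate=0.5)
print(updated)  # [[0.5, 0.5], [1.5, 1.5], [4.0, 4.0]]
```

The best match moves halfway toward the training vector, its neighbor moves a quarter of the way, and the node outside the neighborhood is unchanged.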


4. Neighborhood decreases with successive iterations

Exact form of the weights is not critical

Examples:

$$L_t = L_0 e^{-t/\lambda}$$

$$\rho_t = \rho_0 e^{-t/\lambda}$$

$$\Theta_t = \exp\left(-d^2 / 2\rho_t^2\right)$$
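The example schedules above translate directly into code. The sketch below assumes illustrative starting values (L0 = 0.5, ρ0 = 3, λ = 50) and interprets d as the distance on the map grid between a node and the best matching node; none of these values come from the original study.

```python
import math

def learning_rate(t, l0, lam):
    """L_t = L_0 * exp(-t / lambda)."""
    return l0 * math.exp(-t / lam)

def radius(t, rho0, lam):
    """rho_t = rho_0 * exp(-t / lambda)."""
    return rho0 * math.exp(-t / lam)

def neighborhood(d, rho_t):
    """Theta_t = exp(-d^2 / (2 * rho_t^2)) for a node at grid distance d."""
    return math.exp(-d ** 2 / (2.0 * rho_t ** 2))

# Assumed illustrative parameters; print how each quantity shrinks with t.
for t in (0, 50, 100):
    rho_t = radius(t, rho0=3.0, lam=50.0)
    print(t, learning_rate(t, l0=0.5, lam=50.0), rho_t, neighborhood(2.0, rho_t))
```

As t grows, both the learning rate and the radius shrink, so a node two grid cells away from the best match is influenced less and less.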


$$W_{\text{final}} \sim \sum_i w_i V_i$$

Similar to, but not, the true average of the members
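One way to see where the weighted sum comes from is to unroll the update rule for a single node. The shorthand $a_t = \Theta_t L_t$ and the indexing from $t = 0$ to $T - 1$ are notation introduced here for the sketch, assuming $0 \le a_t \le 1$:

$$
\begin{aligned}
W_{t+1} &= (1 - a_t)\, W_t + a_t V_t \\
W_T &= \Big[\prod_{t=0}^{T-1} (1 - a_t)\Big] W_0 + \sum_{i=0}^{T-1} a_i \Big[\prod_{j=i+1}^{T-1} (1 - a_j)\Big] V_i
\end{aligned}
$$

Under these assumptions the weights are $w_i = a_i \prod_{j=i+1}^{T-1} (1 - a_j)$. They depend on when each training vector was presented and how close the node was to the best match at that step, whereas a true average would weight every member equally.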


Neighboring nodes bear
a strong similarity to
each other.

Difficulty in interpreting the large number of resulting clusters. How many?

Too many: reducing the importance of individual clusters

Too few: overestimating within-cluster variance and errors in selecting a typical day

[Figure: wind speed]

The goal is to develop an objective method to determine the
clusters that the eye can readily
identify


Optimize SOM Patterns

Optimize number of patterns by:

Perform hierarchical clustering for each permutation of SOM patterns

Compute Davies-Bouldin metric (mean cluster scatter / cluster separation) for each hierarchical cluster

Optimal number of clusters defined by minimum of DB curve
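The procedure above might be sketched as follows: agglomerate the SOM patterns step by step, score each level of the hierarchy with the Davies-Bouldin index, and keep the level with the lowest score. The centroid-based merge rule, the function names, and the toy patterns are assumptions for illustration; the slides do not state which linkage was used. The sketch also assumes no two cluster centroids coincide.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    return [sum(c) / len(points) for c in zip(*points)]

def agglomerate(patterns):
    """Yield the partition at every level of the hierarchy, merging the two
    clusters with the closest centroids at each step (assumed linkage)."""
    clusters = [[p] for p in patterns]
    yield [list(c) for c in clusters]
    while len(clusters) > 1:
        pairs = [(dist(centroid(clusters[i]), centroid(clusters[j])), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        yield [list(c) for c in clusters]

def davies_bouldin(clusters):
    """Mean over clusters of the worst (scatter_i + scatter_j) / separation_ij."""
    cents = [centroid(c) for c in clusters]
    scatter = [sum(dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    return sum(max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                   for j in range(k) if j != i)
               for i in range(k)) / k

def optimal_partition(patterns):
    """Partition with the minimum DB index. The all-singletons level is
    skipped because its scatter, and hence its DB index, is trivially zero."""
    levels = [c for c in agglomerate(patterns) if 2 <= len(c) < len(patterns)]
    return min(levels, key=davies_bouldin)

# Hypothetical SOM patterns forming three visually obvious groups.
patterns = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
            [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
            [20.0, 0.0], [20.0, 1.0], [21.0, 0.0]]
best = optimal_partition(patterns)
print(len(best))  # 3
```

On this toy input the minimum of the DB curve falls at three clusters, matching the grouping the eye would pick.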


Use of hierarchical optimization stage more clearly defines relationship between climate patterns and allows for refinement of number of cases based upon analyst


Looking Forward

Analyzing the climate reanalysis assumes a stationary
climate (but current ≠ historical)

NOAA Climate Forecast System (CFS) provides 4-member ensemble forecasts with up to 9-

Can we use these short-term climate forecasts to re-estimate the frequency of occurrence of the historic patterns?

Is there reasonable stability between forecast leads for use as a planning tool?
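One possible way to approach the re-estimation question is to assign each forecast field to its nearest historic pattern and compare the resulting frequencies with the historical ones. The sketch below is an assumption about how that could look, not the method used in the study; the function names, patterns, and data vectors are all hypothetical.

```python
import math
from collections import Counter

def nearest_pattern(patterns, vector):
    """Index of the historic pattern closest (Euclidean) to `vector`."""
    def d(p):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, vector)))
    return min(range(len(patterns)), key=lambda i: d(patterns[i]))

def pattern_frequencies(patterns, vectors):
    """Relative frequency of occurrence of each pattern among `vectors`."""
    counts = Counter(nearest_pattern(patterns, v) for v in vectors)
    return [counts[i] / len(vectors) for i in range(len(patterns))]

def max_shift(freq_a, freq_b):
    """Largest change in any pattern's frequency between two estimates."""
    return max(abs(a - b) for a, b in zip(freq_a, freq_b))

# Hypothetical patterns, historic fields, and forecast fields.
patterns = [[0.0, 0.0], [10.0, 10.0]]
historic = [[0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [1.0, 1.0]]
forecast = [[9.0, 10.0], [10.0, 9.0], [0.0, 1.0], [11.0, 11.0]]

hist_freq = pattern_frequencies(patterns, historic)
fcst_freq = pattern_frequencies(patterns, forecast)
print(hist_freq, fcst_freq, max_shift(hist_freq, fcst_freq))
# [0.75, 0.25] [0.25, 0.75] 0.5
```

The same `max_shift` comparison could be applied between frequencies estimated from different forecast leads as a rough stability check.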


Summary

Ongoing

NGIC: identification of relevant patterns for T&D case
studies

NGIC: climate forecast of frequency of occurrence of
relevant patterns

Future

FAA: prediction of probability of convection on trans-oceanic air traffic routes

Past

DHS: identification of relevant patterns for environmental impact assessment

BRGM: identification of relevant precipitation patterns for surface trafficability

JPEO-CBD: identification of relevant patterns for instrument siting case studies at DPG


Questions?


How to deal with the volume of data produced by GCAT?

Typically over 20,000 hourly output volumes produced per
run (30-year simulation of 30-day period)

Requirement for some form of intelligent data reduction (i.e.
pattern classification not bulk statistics)

Classification method should not require input from subject
matter expert (i.e. unsupervised learning)

[…] (k-means, hierarchical) can be
computationally expensive for large N problems

Classify on model quantities that are relevant to the
problem (training vectors)

We make use of Self-Organizing Maps (SOMs) applied to
transport and dispersion environmental impact studies
using climate reanalysis and forecasts.
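As a quick check on the data volume quoted above, the count can be reproduced with simple arithmetic, assuming one output volume per simulated hour (an assumption; the slide only says "hourly output volumes"):

```python
years = 30            # length of the simulation, in years
days_per_period = 30  # simulated period within each year, in days
hours_per_day = 24    # assumed: one output volume per hour

volumes = years * days_per_period * hours_per_day
print(volumes)  # 21600
```

That gives 21,600 volumes per run, consistent with "over 20,000".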

Self-Organizing Maps

An artificial neural network technique used for pattern recognition and classification

The SOM consists of components called nodes or neurons

Nodes are usually arranged in a hexagonal or rectangular grid

The SOM describes a mapping from a higher dimensional input space to a lower dimensional map space

SOMs with a small number of nodes behave in a way that is similar to k-means

SOMs may be considered a nonlinear generalization of PCA