Spatial Data Mining


8 Nov 2013



hari agung

What is Spatial Data?

Used in/for:
  GIS - Geographic Information Systems
  Meteorology
  Astronomy
  Environmental studies, etc.


The data related to objects that occupy space
  traffic, bird habitats, global climate, logistics, ...

Object types:
  Points, Lines, Polygons, etc.

Why do we need Data Mining?


Large number of records (cases) (10^8 - 10^12 bytes)
  One thousand (10^3) bytes = 1 kilobyte (KB)
  One million (10^6) bytes = 1 megabyte (MB)
  One billion (10^9) bytes = 1 gigabyte (GB)
  One trillion (10^12) bytes = 1 terabyte (TB)

High dimensional data (variables)
  10 - 10^4 attributes


Only a small portion, typically 5% to 10%, of the collected data is ever analyzed

We are drowning in data, but starving for knowledge!

Spatial Data Mining


Spatial Patterns
  Spatial outliers
  Location prediction
  Associations, co-locations
  Hotspots, clustering, trends, ...

Primary Tasks
  Mining Spatial Association Rules
  Spatial Classification and Prediction
  Spatial Data Clustering Analysis
  Spatial Outlier Analysis

Spatial Classification


Use spatial information at different (coarse/fine) levels (different indexing trees) for data focusing
Determine relevant spatial or non-spatial features
Apply standard supervised learning algorithms
  e.g., decision trees, ...
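The steps above can be sketched in a few lines of Python: derive a spatial feature for each object, then hand it to an ordinary supervised learner. This is only an illustration under stated assumptions. The house and park coordinates, the labels, the feature "distance to the nearest park", and the helper names `nearest_distance` and `fit_stump` are all hypothetical, and a one-level decision tree (a stump) stands in for a full decision-tree learner.

```python
from math import dist

def nearest_distance(point, landmarks):
    """Spatial feature: distance from a point to its nearest landmark."""
    return min(dist(point, m) for m in landmarks)

def fit_stump(features, labels):
    """One-level decision tree: find the threshold t such that predicting
    True when feature <= t makes the fewest training errors."""
    best_t, best_err = None, len(labels) + 1
    for t in sorted(set(features)):
        err = sum((f <= t) != y for f, y in zip(features, labels))
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

# Hypothetical toy data: one park, four houses, and a "highly valued" label.
parks = [(0, 0)]
houses = [(1, 0), (0, 2), (5, 5), (8, 1)]
highly_valued = [True, True, False, False]

features = [nearest_distance(h, parks) for h in houses]
threshold, errors = fit_stump(features, highly_valued)
print(threshold, errors)  # 2.0 0
```

On this toy data the stump learns "highly valued if within 2.0 units of a park". Real data would call for a proper learner and several spatial and non-spatial features.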

Spatial Clustering


Use tree structures to index spatial data
  DBSCAN: R-tree
  CLIQUE: grid or quad tree
Clustering with spatial constraints (obstacles ⇒ need to adjust the notion of distance)
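As a rough illustration of density-based clustering, here is a minimal DBSCAN sketch in plain Python. It deliberately replaces the R-tree named on the slide with a linear scan in `region_query`, so it shows the clustering logic rather than the indexing. The sample points and the values of `eps` and `min_pts` are assumptions chosen for the example.

```python
from math import dist

def region_query(points, i, eps):
    """Indices of all points within eps of points[i].
    Linear scan; a spatial index such as an R-tree would speed this up."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # points[i] is a core point
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels

# Two tight groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

To handle obstacles, one would swap `dist` for a distance function that accounts for them; the rest of the sketch would stay the same.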


Spatial Association Rules


Spatial objects are of major interest, not transactions
A ⇒ B
  A, B can be either spatial or non-spatial (3 combinations)
  What is the fourth combination?
Association rules can be found w.r.t. the 3 types


Spatial Data Mining Results


Understanding spatial data, discovering relationships between spatial and nonspatial data, construction of spatial knowledge bases, etc.

In various forms
  The description of the general weather patterns in a set of geographic regions is a spatial characteristic rule.
  The comparison of two weather patterns in two geographic regions is a spatial discriminant rule.
  A rule like "most cities in Canada are close to the Canada-US border" is a spatial association rule.
    near(x, coast) ∧ southeast(x, USA) ⇒ hurricane(x) (70%)
  Others: spatial clusters, ...
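The percentage attached to the hurricane rule above is presumably its confidence: among the objects that satisfy the left-hand side, the fraction that also satisfy the right-hand side. The sketch below computes support and confidence for such a rule. The eight regions and their attribute values are invented for illustration, and `rule_stats` is a hypothetical helper, not something from the slides.

```python
def rule_stats(rows, antecedent, consequent):
    """Support and confidence of the rule antecedent => consequent."""
    matching = [r for r in rows if antecedent(r)]
    both = [r for r in matching if consequent(r)]
    support = len(both) / len(rows)
    confidence = len(both) / len(matching) if matching else 0.0
    return support, confidence

# Hypothetical regions; each spatial predicate is precomputed as a boolean.
regions = [
    {"near_coast": True,  "southeast": True,  "hurricane": True},
    {"near_coast": True,  "southeast": True,  "hurricane": True},
    {"near_coast": True,  "southeast": True,  "hurricane": False},
    {"near_coast": True,  "southeast": False, "hurricane": False},
    {"near_coast": False, "southeast": True,  "hurricane": False},
    {"near_coast": False, "southeast": False, "hurricane": True},
    {"near_coast": True,  "southeast": True,  "hurricane": True},
    {"near_coast": False, "southeast": False, "hurricane": False},
]

support, confidence = rule_stats(
    regions,
    antecedent=lambda r: r["near_coast"] and r["southeast"],
    consequent=lambda r: r["hurricane"],
)
print(support, confidence)  # 0.375 0.75
```

In practice the expensive part is evaluating the spatial predicates (for example, testing `near` against a coastline geometry), not the counting shown here.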


Basic Concepts (1)


Spatial data mining performs the same functions as general data mining, with the end objective of finding patterns in geography, meteorology, etc.

The main difference: spatial autocorrelation
  The neighbors of a spatial object may have an influence on it and therefore have to be considered as well

Spatial attributes
  Topological: adjacency or inclusion information
  Geometric: position (longitude/latitude), area, perimeter, boundary polygon
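Spatial autocorrelation can be quantified. One common statistic is Moran's I; the slides do not name a measure, so its use here is an assumption. The sketch below computes it with binary adjacency weights for values laid out along a line. Values near +1 indicate that neighbors tend to be similar, values near -1 that they tend to differ. The helper names and sample values are illustrative.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights.
    neighbors maps each index to the list of indices adjacent to it."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of cross-products over all ordered neighbor pairs (i, j).
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    total_weight = sum(len(nbrs) for nbrs in neighbors.values())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)

def chain_neighbors(n):
    """Adjacency for n cells in a row: each cell touches its left/right cell."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

smooth = [1, 2, 3, 4, 5]        # neighbors have similar values
alternating = [1, 5, 1, 5, 1]   # neighbors have opposite values

print(morans_i(smooth, chain_neighbors(5)))       # positive (about 0.5)
print(morans_i(alternating, chain_neighbors(5)))  # negative (about -1.0)
```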

Basic Concepts (2)


Spatial neighborhood
  Topological relation: "intersect", "overlap", "disjoint", ...
  Distance relation: "close_to", "far_away", ...
  Direction/orientation relation: "left_of", "west_of", ...
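A few of these relations can be written as simple predicates. The sketch below is a simplification under stated assumptions: objects are either points `(x, y)` with x increasing eastward, or axis-aligned rectangles `(xmin, ymin, xmax, ymax)`, and the distance thresholds are arbitrary defaults. Real systems evaluate these relations on full geometries.

```python
from math import dist

def close_to(a, b, threshold=10.0):
    """Distance relation: points a and b are within threshold of each other."""
    return dist(a, b) <= threshold

def far_away(a, b, threshold=100.0):
    """Distance relation: points a and b are at least threshold apart."""
    return dist(a, b) >= threshold

def west_of(a, b):
    """Direction relation: point a lies west of point b (smaller x)."""
    return a[0] < b[0]

def intersect(r, s):
    """Topological relation: rectangles r and s share at least one point."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]

def disjoint(r, s):
    """Topological relation: rectangles r and s share no point."""
    return not intersect(r, s)

print(close_to((0, 0), (3, 4)))                # True, distance is 5
print(west_of((0, 0), (3, 4)))                 # True
print(intersect((0, 0, 2, 2), (1, 1, 3, 3)))   # True
print(disjoint((0, 0, 1, 1), (5, 5, 6, 6)))    # True
```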



Global model might be inconsistent with regional models

[Figure: Global Model vs. Local Model]

Applications


NASA Earth Observing System (EOS): Earth science data
National Inst. of Justice: crime mapping
Census Bureau, Dept. of Commerce: census data
Dept. of Transportation (DOT): traffic data
National Inst. of Health (NIH): cancer clusters


Example: What Kind of Houses Are Highly Valued?

Associative Classification


Data
  ERA-15 using a T106L31 model (from 1978 to 1994) with 1.125° resolution
  Terabytes
  Comprises data from approx. 20 variables (such as temperature, humidity, pressure, etc.) at 30 pressure levels on a 360x360-node grid


SOM Application for Data Mining

Downscaling Weather Forecasts

Adaptive Competitive Learning

Sub-grid details escape from numerical models
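For readers unfamiliar with competitive learning, the core update can be sketched as follows: find the prototype closest to a sample, then move it part of the way toward that sample. This is a winner-take-all step only. A full self-organizing map (SOM) would also update the winner's grid neighbors and decay the learning rate over time, neither of which is shown here. The function name, the learning rate, and the sample values are assumptions.

```python
from math import dist

def competitive_step(prototypes, sample, learning_rate=0.5):
    """Move the prototype nearest to sample toward it; return its index.
    prototypes is a list of lists and is modified in place."""
    winner = min(range(len(prototypes)),
                 key=lambda k: dist(prototypes[k], sample))
    prototypes[winner] = [w + learning_rate * (s - w)
                          for w, s in zip(prototypes[winner], sample)]
    return winner

prototypes = [[0.0, 0.0], [10.0, 10.0]]
winner = competitive_step(prototypes, [8.0, 8.0])
print(winner, prototypes)  # 1 [[0.0, 0.0], [9.0, 9.0]]
```

Repeating this step over many samples pulls each prototype toward the center of the data it wins, which is what lets the prototypes summarize a large dataset.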


Dept. of Applied Mathematics, Universidad de Cantabria, Santander, Spain

And now, discussion