Data Mining and Spatial Data Mining

Nov 8, 2013

Important Notes

Please read all the information on the following website for this subject.

Go to http://elearning.lsgi.org/ and click on “Geospatial Data Mining and Knowledge Discovery”.

Your username: LSGI Username

Your password: LSGI Password

References


Longley, P., Brooks, S. M., and McDonnell, R. (1998): Geocomputation: A Primer, John Wiley.

Atkinson, P. M. and Martin, D. (2000): GIS and Geocomputation, Taylor & Francis.

Openshaw, S. and Abrahart, R. J. (2000): Geocomputation, Taylor & Francis.

Roddick, J. F. and Hornsby, K. (2001): Temporal, Spatial, and Spatio-Temporal Data Mining, Springer.

Miller, H. J. and Han, J. (2001): Geographic Data Mining & Knowledge Discovery, Taylor & Francis.

Data Mining


The exploration and analysis, mostly by automatic means, of large quantities of data in order to discover meaningful patterns and rules.

Consequently, data mining emphasizes computational approaches that are useful for analyzing such data.

Data Mining as Understood by Different Practitioners

To statisticians and other quantitative researchers, it is selectively scrutinizing data that will support a hypothesis.

To a lawyer, it means finding something for a lawsuit (actionable information).

AI researchers seek features and attributes.

Business people see records and fields for actionable market information.

For us, it means discovering relationships between spatial and aspatial data, constructing spatial knowledge bases…

Data Mining Techniques

1. Market Basket Analysis

2. Memory-based Reasoning

3. Cluster Detection

4. Link Analysis

5. Decision Trees and Rule Induction

6. Artificial Neural Networks

7. Genetic Algorithms

8. On-line Analytic Processing (OLAP)

1. Market Basket Analysis

A form of clustering used for finding groups of items that tend to occur together in a transaction (or market basket).
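
As a rough illustration of finding items that occur together, the sketch below counts how often each pair of items shares a basket and keeps the pairs that reach a minimum support. The baskets, the function name `frequent_pairs` and the 0.5 threshold are all made up for this example; real market basket analysis typically goes on to derive rules from such frequent itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one market basket.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
]

def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs found in at least min_support of the baskets."""
    counts = Counter()
    for basket in transactions:
        # Sort the items so that each pair has one canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

pairs = frequent_pairs(baskets, min_support=0.5)
```

With these four baskets, only (bread, milk) and (butter, milk) appear together in half of the transactions, so only those two pairs are kept.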

2. Memory-based Reasoning

It uses known instances as a model to make predictions about unknown instances.

Uses concepts such as nearest neighbors.

Uses SQL statements on a relational database.
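
The nearest-neighbor idea can be sketched as follows: an unknown instance receives the majority label of the k known instances closest to it. The data, the Euclidean distance measure and the name `knn_predict` are assumptions chosen for illustration, and the sketch works in memory rather than through SQL.

```python
import math
from collections import Counter

# Hypothetical known instances: (feature vector, label).
known = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
    ((5.5, 6.5), "B"),
]

def knn_predict(instances, query, k=3):
    """Predict the label of query by majority vote among its k nearest known instances."""
    nearest = sorted(instances, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

prediction = knn_predict(known, (5.2, 5.1))
```

Here the three nearest neighbors of (5.2, 5.1) all carry label "B", so that is the prediction.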

3. Cluster Detection

Uses the concept of similarity

Undirected data mining

Uses statistical methods and neural networks
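
One common statistical method for cluster detection is k-means, which the slide does not name but which shows the similarity idea compactly: points are grouped with the centroid they are most similar (closest) to, and centroids are then recomputed. The points and starting centroids below are invented for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster; leave it in place if the cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

No class labels are supplied, which is what makes this an undirected method: the two groups emerge from the distances alone.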

4. Link Analysis

Uses graph theory

Develops models based on patterns in the relationships

Used frequently in telecommunications.
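
A minimal sketch of the graph view, using invented call records: each call becomes an edge, and simple graph measures such as node degree and connected components expose patterns in who is linked to whom. The names `build_graph` and `connected_components` are illustrative only.

```python
from collections import defaultdict, deque

# Hypothetical call records: (caller, callee).
calls = [("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("dan", "eve")]

def build_graph(edges):
    """Build an undirected adjacency map from pairwise links."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def connected_components(graph):
    """Group nodes that are linked directly or indirectly, using breadth-first search."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components

graph = build_graph(calls)
groups = connected_components(graph)
degrees = {node: len(neighbours) for node, neighbours in graph.items()}
```

In this toy data, ann, bob and cat form one calling circle and dan and eve form another.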

5. Decision Trees and Rule Induction

It is a directed data mining method

Used for classification

The records are divided into disjoint subsets, each of which is described by a simple rule on one or more fields.
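
The splitting idea can be sketched with a single-level tree (a "decision stump"): try each threshold on one field and keep the one whose two disjoint subsets are most pure. The records, field layout and the misclassification-count criterion are assumptions for this example; full decision tree learners apply such splits recursively and usually use other purity measures.

```python
# Hypothetical records: (age, income in thousands, class label).
records = [
    (25, 30, "no"), (32, 45, "no"), (41, 52, "yes"),
    (47, 80, "yes"), (29, 75, "yes"), (36, 38, "no"),
]

def best_split(rows, field):
    """Find the threshold on one field that misclassifies the fewest records."""
    best = None
    for threshold in sorted({row[field] for row in rows}):
        left = [row[-1] for row in rows if row[field] <= threshold]
        right = [row[-1] for row in rows if row[field] > threshold]
        if not right:
            continue  # a split must produce two non-empty subsets
        # Each side predicts its majority class; errors are the minority counts.
        errors = sum(len(side) - max(side.count(c) for c in set(side)) for side in (left, right))
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return best

errors, threshold = best_split(records, field=1)
```

On this data the income field yields the rule "income <= 45 gives no, otherwise yes" with no errors, while the best split on age still misclassifies one record.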

6. Artificial Neural Networks

They are simple models of neural interconnections in brains, adapted for use in computers.

They learn from training sets

Wide application area

But it is difficult to deduce rules or explain results.
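
To make "learning from training sets" concrete, here is a sketch of the smallest possible network, a single artificial neuron trained with the classic perceptron rule on the logical AND function. The training set, learning rate and function names are choices made for this example, not something the slides prescribe.

```python
def train_perceptron(samples, epochs=20, rate=1.0):
    """Train a single artificial neuron with the perceptron learning rule."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            error = target - output
            # Nudge the weights toward the target whenever the output is wrong.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else stay silent (0)."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0

# Training set for logical AND: (inputs, expected output).
training_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(training_set)
outputs = [predict(weights, bias, x) for x, _ in training_set]
```

The learned weights reproduce AND, but they are just numbers; even in this tiny case the "rule" has to be read off indirectly, which hints at why larger networks are hard to explain.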

7. Genetic Algorithms

Application of genetics and natural selection to a search used for finding an optimal set of parameters to describe a predictive function.

Uses selection, crossover and mutation algorithms to evolve successive generations of solutions.
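
The three operators can be sketched on a toy problem in which a solution is a string of bits and fitness is the number of ones. The tournament selection, single-point crossover, elitism and all parameter values below are assumptions made for illustration; in practice the bit string would encode the parameters of the predictive function being tuned.

```python
import random

def evolve(fitness, length=12, pop_size=20, generations=40, mutation_rate=0.05, seed=0):
    """Evolve bit-string solutions with selection, crossover and mutation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best solution forward unchanged
        while len(next_gen) < pop_size:
            # Selection: each parent is the fitter of two randomly drawn candidates.
            mother = max(rng.sample(population, 2), key=fitness)
            father = max(rng.sample(population, 2), key=fitness)
            # Crossover: splice the parents at a random cut point.
            cut = rng.randint(1, length - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, history + [fitness(best)]

best, history = evolve(fitness=sum)
```

Because the best solution of each generation is copied into the next, the best fitness recorded in `history` never decreases from one generation to the next.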

8. On-line Analytic Processing (OLAP)

A way of presenting relational data to users for understanding data and its patterns

Uses Multidimensional Databases (MDB)
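
A multidimensional view can be imitated in a few lines: each fact is indexed by several dimensions, and a "roll-up" sums a measure over the dimensions the user is not interested in. The fact table, dimension names and `roll_up` helper are hypothetical; a real multidimensional database would precompute and index such aggregates.

```python
from collections import defaultdict

# Hypothetical fact table: (region, product, quarter, sales).
facts = [
    ("north", "tea", "Q1", 10),
    ("north", "coffee", "Q1", 20),
    ("south", "tea", "Q1", 5),
    ("north", "tea", "Q2", 15),
    ("south", "coffee", "Q2", 25),
]
DIMENSIONS = ("region", "product", "quarter")

def roll_up(rows, keep):
    """Aggregate the sales measure over every dimension not listed in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)

by_region = roll_up(facts, keep=("region",))
by_region_quarter = roll_up(facts, keep=("region", "quarter"))
grand_total = roll_up(facts, keep=())
```

The same facts thus answer questions at different levels of detail, from sales per region and quarter down to a single grand total.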

Data Warehousing

Collection and organization of information in a consistent and useful way, followed by its analysis, understanding and conversion into actionable information through data mining.

Data Mining Tasks


Classification


Estimation


Prediction


Affinity grouping


Clustering


Description

Approaches to Various Data Mining Tasks

Top-down approach: hypothesis testing based on preconceived ideas, notions, hunches, etc. concerning relationships in the data.

Bottom-up approach: knowledge discovery from the data.

Directed approach: explain or categorize some particular data field.

Undirected approach: find patterns and similarities among groups of records without the use of a particular field or collection of predefined classes.

Spatial Data Mining


SDM is a set of methods and tools for exploratory spatial analysis.

It is an extension of ordinary data mining, which is often used in marketing science (as discussed earlier), management science, etc.

SDM treats a huge amount of information.

SDM emphasizes a computational approach.

Spatial Patterns


Spatial outliers


Location prediction


Associations, co-locations

Hotspots, clustering, trends, …

Primary Tasks



Mining Spatial Association Rules


Spatial Classification and Prediction


Spatial Data Clustering Analysis


Spatial Outlier Analysis


Exploratory Data Analysis (Statistical), EDA

Exploratory Spatial Data Analysis, ESDA

Spatial Data Mining Results: Examples

The description of the general weather patterns in a set of geographic regions is a spatial characteristic rule.

The comparison of two weather patterns in two geographic regions is a spatial discriminant rule.

A rule like “most cities in Canada are close to the Canada-US border” is a spatial association rule.

near(x, coast) ∧ southeast(x, USA) ⇒ hurricane(x) (70%)

Others: spatial clusters, …
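
Assuming the percentage attached to the hurricane rule is its confidence (the share of locations satisfying the left-hand side that also satisfy the right-hand side), such a figure could be computed as sketched below. The location records and predicate names are invented, and the spatial predicates are taken as already evaluated for each location.

```python
# Hypothetical observations: one record per location x, with precomputed spatial predicates.
locations = [
    {"name": "a", "near_coast": True,  "southeast_usa": True,  "hurricane": True},
    {"name": "b", "near_coast": True,  "southeast_usa": True,  "hurricane": True},
    {"name": "c", "near_coast": True,  "southeast_usa": True,  "hurricane": False},
    {"name": "d", "near_coast": False, "southeast_usa": True,  "hurricane": False},
    {"name": "e", "near_coast": True,  "southeast_usa": False, "hurricane": False},
]

def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    # Rows for which every predicate on the left-hand side holds.
    matching = [row for row in rows if all(row[p] for p in antecedent)]
    # Of those, the rows for which the right-hand side also holds.
    satisfied = [row for row in matching if row[consequent]]
    support = len(satisfied) / len(rows)
    confidence = len(satisfied) / len(matching) if matching else 0.0
    return support, confidence

support, confidence = rule_stats(locations, ("near_coast", "southeast_usa"), "hurricane")
```

In this toy data three locations satisfy both spatial predicates and two of them experience hurricanes, giving a confidence of about 67% and a support of 40%.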

Spatial Neighborhood

Topological relation: “intersect”, “overlap”, “disjoint”, …

Distance relation: “close_to”, “far_away”, …

Direction/orientation relation: “left_of”, “west_of”, …
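
The three kinds of neighborhood relation can be sketched as predicates over simplified geometries. Here every spatial object is reduced to an axis-aligned bounding box, distance is measured between box centres, and x is assumed to increase eastwards; the `Box` class, the threshold and the example objects are all hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box standing in for a spatial object."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def center(self):
        return ((self.xmin + self.xmax) / 2, (self.ymin + self.ymax) / 2)

def intersect(a, b):
    """Topological relation: the boxes share at least one point."""
    return a.xmin <= b.xmax and b.xmin <= a.xmax and a.ymin <= b.ymax and b.ymin <= a.ymax

def disjoint(a, b):
    """Topological relation: the boxes share no point."""
    return not intersect(a, b)

def close_to(a, b, threshold):
    """Distance relation: the centres lie within threshold of each other."""
    return math.dist(a.center, b.center) <= threshold

def west_of(a, b):
    """Direction relation: a lies entirely to the west (smaller x) of b."""
    return a.xmax < b.xmin

park = Box(0, 0, 2, 2)
lake = Box(1, 1, 3, 3)
mall = Box(10, 0, 12, 2)
```

With these objects, the park intersects and is close to the lake, while it is disjoint from, far away from and west of the mall.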