Using formal ontology for integrated spatial data mining


Julie Sungsoon Hwang

Department of Geography

State University of New York at Buffalo

ICCSA04

Perugia, Italy

May 14, 2004

Research purposes


Clarify the role of formal ontology in KDD

Propose a conceptual framework for ontology-based spatial data mining

Case study: ontology-based spatial clustering algorithms

Problems in focus

[Figure: Algorithms A, B, C, D and D’ positioned by Domain and Task, contrasting developing new algorithms with re-using existing algorithms suited to domain and task]

How can algorithms be customized to varying domains and tasks?

Problems in focus (cont.)

No single algorithm is best suited to all research purposes and application domains.

Without domain knowledge, the same algorithm can yield results inconsistent with fact.

The same data may have to be analyzed in different ways depending on the user’s goal.

Relation between data mining and ontology construction

[Figure: two processes along a "Level of abstraction" axis. Data Mining (knowledge discovery): Data, Information, Knowledge. Ontology Construction (knowledge acquisition): Knowledge, Ontology.]

Role of formal ontology in KDD


Provide the context in which the knowledge extracted from data is interpreted and evaluated

Guide algorithms so that they suit domain-specific and task-oriented concepts

KDD Process Diagram

Using ontology for spatial data mining

Ontology formalizes how knowledge is conceptualized, thereby making implicit meaning explicit

Data mining extracts high-level knowledge from low-level data, thereby enhancing the level of understanding

[Figure: Ontology, comprising a Domain Model and a Task Model, linked to Spatial Data Mining, which takes low-level data to high-level knowledge]

Domain-specific spatial data mining

Let’s compare two domains: traffic accidents versus retailers

Domain              Is-a               Spatial constraints
Traffic accident    Event              In road network
Retailers           Physical object    Outside of road network

Spatial data mining algorithms should take into account different conceptualizations (domain-specific properties)

Task-oriented spatial data mining

Let’s compare two tasks: detecting hot spots of traffic accidents versus partitioning market areas based on retail locations

Task                                     Number of clusters k               Level of detail
Detect hot spots of traffic accidents    Depends on spatial distribution    Varies with scale (depends on the user’s area of interest)
Partition market areas for a retailer    Given (resource constraint)        Doesn’t vary with scale

Spatial data mining algorithms should take into account different tasks and users’ needs

Ontology as an active component of an information system

[Figure, from Guarino, 1998:
Top-level Ontology (e.g. space, time, matter, object, event)
Domain Ontology (e.g. medicine)
Task Ontology (e.g. diagnosing)
Application Ontology
Other labels in the figure: dependence, subject]

Conceptual framework for ontology-based spatial data mining (OBSDM)

Component of OBSDM


OBSDM:: Input:: Metadata


The tag structure of XML metadata can be used to inform the domain ontology of the semantics of the data


Component of OBSDM

OBSDM:: OBSDMM:: Domain Ont.


Terms within the “theme” tag in the metadata are used as tokens to locate the appropriate domain ontology

The domain ontology specifies definitions, classes, and properties

    Class example: Accident is a Subclass-Of Temporal-Thing

    Property example: Road has Geographic-Region as a Value-Type

Class properties are inherited from the top-level ontology
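
The lookup from the “theme” tag to a domain ontology can be sketched in Python. This is a minimal illustration under stated assumptions: every tag name other than "theme", the `DOMAIN_ONTOLOGIES` registry, and the `locate_domain_ontology` helper are hypothetical, and the dictionary layout is only one possible encoding of the class and property examples given in the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record; only the "theme" tag comes from the slides.
METADATA_XML = """
<metadata>
  <title>Crash records</title>
  <theme>Traffic Accident</theme>
</metadata>
"""

# Hypothetical registry of domain ontologies, keyed by normalized theme term.
DOMAIN_ONTOLOGIES = {
    "traffic accident": {
        "classes": {"Accident": {"Subclass-Of": "Temporal-Thing"}},
        "properties": {"Road": {"Value-Type": "Geographic-Region"}},
    },
}


def locate_domain_ontology(metadata_xml, registry):
    """Use the text of the <theme> tag as a token to pick a domain ontology."""
    root = ET.fromstring(metadata_xml.strip())
    theme = root.findtext("theme")
    if theme is None:
        return None  # no theme tag, so no ontology can be located
    return registry.get(theme.strip().lower())


ontology = locate_domain_ontology(METADATA_XML, DOMAIN_ONTOLOGIES)
print(ontology["classes"]["Accident"]["Subclass-Of"])
```

Normalizing the theme term before the lookup is a design choice of this sketch, so that "Traffic Accident" and "traffic accident" resolve to the same ontology.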

Domain ontology := Traffic accident

Theory TRAFFIC-ACCIDENT-DOMAIN

As a spatial thing,
    Point(x), On(x, y), Roadway(y), Line(y), In(y, z), Geographic-Region(z)

As a temporal thing,
    Point(x), At(x, y), Time(y)
    Event(x) <=> Occurrence(x), Notification(x), Response(x), Arrival(x)
    Before(Occurrence(x), Notification(x))

As an intangible thing,
    Accident(x), RelatedTo(x, y), Vehicle(y)
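
To make the theory concrete, a few of its statements can be checked against an accident record. The sketch below is only an illustration: the `AccidentRecord` fields and the `violations` helper are hypothetical, and reading the listed predicates as conditions that every record must satisfy is an assumption, since the slide does not state how they are combined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AccidentRecord:
    """One accident x; all field names are hypothetical."""
    roadway: Optional[str]    # y with On(x, y) and Roadway(y), if known
    region: Optional[str]     # z with In(y, z) and Geographic-Region(z), if known
    occurrence_time: float    # time of Occurrence(x)
    notification_time: float  # time of Notification(x)


def violations(record: AccidentRecord) -> List[str]:
    """Return the domain statements that the record fails to satisfy."""
    failed = []
    if record.roadway is None:
        failed.append("On(x, y)")
    if record.region is None:
        failed.append("In(y, z)")
    if not record.occurrence_time < record.notification_time:
        failed.append("Before(Occurrence(x), Notification(x))")
    return failed


# Made-up records for illustration only.
consistent = AccidentRecord("Route-1", "Region-A", 10.0, 12.5)
inconsistent = AccidentRecord(None, "Region-A", 12.5, 10.0)
print(violations(consistent))
print(violations(inconsistent))
```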



Component of OBSDM

OBSDM:: Input:: User Interface


Users can specify a goal, a level of detail, and a geographic area of interest through the user interface


Component of OBSDM

OBSDM:: OBSDMM:: Task Ont.


The inputs specified by users in the user interface are translated into the task ontology

The task ontology explicitly specifies the goal, methods, requirements, and constraints
Task ontology := Spatial clustering

Theory SPATIAL-CLUSTERING-TASK

Documentation:
    This theory defines a task ontology for the spatial clustering task. The spatial clustering task, which is a kind of clustering task, is the problem of grouping similar spatial objects into classes.

Super classes: Clustering

Subclasses:

Sub goal:
    “Find hot spots”
    “Group similar patterns”
    “Partition into k-clusters”

Requirement:
    Assignment-Object
        Source: Spatial Objects
        Target: Clusters
    Geographic-Scale
    Detail-Level

Constraint:
    Spatial Objects
    Operational Constraints
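
The translation from user-interface inputs into this task ontology could look roughly like the following. It is a sketch under assumptions: the `SpatialClusteringTask` class and `task_from_user_input` function are hypothetical names, mapping the area of interest to Geographic-Scale is a guess at the intended correspondence, and requiring `k` for the partitioning goal is an assumption suggested by the task comparison earlier in the deck.

```python
from dataclasses import dataclass
from typing import Optional

# Sub goals listed in the SPATIAL-CLUSTERING-TASK theory, lower-cased.
SUB_GOALS = ("find hot spots", "group similar patterns", "partition into k-clusters")


@dataclass(frozen=True)
class SpatialClusteringTask:
    goal: str
    geographic_scale: Optional[str]  # area of interest chosen by the user
    detail_level: Optional[str]      # level of detail chosen by the user
    k: Optional[int] = None          # number of clusters, if the user gives one


def task_from_user_input(goal, place_name=None, level_of_detail=None, k=None):
    """Validate the user's inputs and express them as a task description."""
    normalized = goal.strip().lower()
    if normalized not in SUB_GOALS:
        raise ValueError(f"unknown sub goal: {goal!r}")
    if normalized == "partition into k-clusters" and k is None:
        raise ValueError("partitioning needs the number of clusters k")
    return SpatialClusteringTask(normalized, place_name, level_of_detail, k)


task = task_from_user_input("Find hot spots", place_name="New York",
                            level_of_detail="County")
print(task)
```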



Component of OBSDM

OBSDM:: OBSDMM:: Alg. Builder

OBSDM:: Output:: GVis tool


The algorithm builder assembles the requirements for building the algorithm best suited to the domain of the data and the user’s input (task).

Data content is filtered through the domain ontology, and the user’s requirements are filtered through the task ontology.

The geographic visualization tool displays the results (discovered patterns)
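
A rough sketch of what the algorithm builder does is to merge what the domain ontology says about the data with what the task ontology says about the user’s request into one parameter set. The function name `build_algorithm_spec`, the dictionary keys, and the rule that a roadway constraint implies network distance are all assumptions made for illustration; the slides do not specify the builder’s internals.

```python
def build_algorithm_spec(domain, task):
    """Merge domain constraints and task requirements into one parameter set."""
    constraint = domain.get("constraint")
    return {
        # Domain side: a spatial constraint suggests measuring distance
        # along the constraining network rather than in a straight line.
        "constraint": constraint,
        "distance": "network" if constraint else "euclidean",
        # Task side: where to look, at what level of detail, and how many
        # clusters (None means the number is left to the data).
        "extent": task.get("place_name"),
        "detail_level": task.get("level_of_detail"),
        "k": task.get("k"),
    }


# Values taken from the case-study setting; the key names are hypothetical.
domain = {"theme": "Traffic Accident", "constraint": "Named-Roadway"}
task = {"goal": "identify hot spots", "level_of_detail": "State",
        "place_name": "New York"}
print(build_algorithm_spec(domain, task))
```

Keeping the domain and task inputs as separate arguments mirrors the two filters described above, so either side can change without touching the other.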

Case study: ontology-based spatial clustering (OBSC) of traffic accidents

Input: 353 features in Erie

Setting

    Metadata
        Theme := Traffic Accident

    User interface
        Goal := “identify hot spots”
        LevelOfDetail := State
        PlaceName := New York

    Method
        Algorithm := SMTIN
        Constraint := Named-Roadway

Output: 18 clusters in Erie County

Case study: Effect of scale (Task ontology)

OBSC clusters reflect the spatial distribution specific to the scale of the user’s interest

Control Algorithm
    TASK
        LevelOfDetail := Null
        PlaceName := Null
    DOMAIN
        Constraint := Roadway

OBSC Algorithm
    TASK
        LevelOfDetail := County
        PlaceName := New York
    DOMAIN
        Constraint := Roadway

[Figure annotation: Specifying area of interest doesn’t mask details]

Case study: Effect of constraint (Domain ontology)

OBSC clusters identify physical barriers that follow from concepts implicit in the domain

Control Algorithm
    TASK
        LevelOfDetail := State
        PlaceName := New York
    DOMAIN
        Constraint := Null

OBSC Algorithm
    TASK
        LevelOfDetail := State
        PlaceName := New York
    DOMAIN
        Constraint := Roadway

[Figure annotation: Separated by body of water]
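
One way to see why a roadway constraint can separate points that lie on opposite sides of water is to compare straight-line distance with distance along the road network. The sketch below is not the SMTIN algorithm named in the case study; it is a toy comparison on made-up coordinates, and the node names, the `roads` graph, and the grouping threshold `eps` are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.dist(a, b)


def network_distance(graph, source, target):
    """Shortest-path length over a weighted undirected graph (Dijkstra)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf  # target not reachable by road


# Accidents A and B face each other across water; the only bridge joins S and N.
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "S": (3.0, 0.0), "N": (3.0, 1.0)}
roads = {
    "A": {"S": 3.0},
    "S": {"A": 3.0, "N": 1.0},
    "N": {"S": 1.0, "B": 3.0},
    "B": {"N": 3.0},
}

eps = 2.0  # hypothetical threshold for placing two points in one cluster
print(euclidean(coords["A"], coords["B"]) <= eps)   # unconstrained view
print(network_distance(roads, "A", "B") <= eps)     # roadway-constrained view
```

With these numbers the straight-line distance is 1.0 and the road distance is 7.0, so the two accidents fall within the threshold only when the roadway constraint is ignored.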

Case study: Benefit of using ontology in spatial clustering

Incorporating ontology in spatial clustering algorithms enhances the quality of spatial clustering results

Task ontology makes clusters usable

    Responsive to the user’s view

Domain ontology makes clusters natural

    Dictated by concepts implicit in the domain

Conclusion

Ontology is examined as a means to customize algorithms to varying domains and tasks

    Ontology enables algorithms to reflect concepts implicit in the domain and to adapt to the user’s view

    Ontology provides a semantically plausible way to re-use existing algorithms

Ontology provides a systematic way of organizing the various factors that dictate the mechanisms underlying the data mining process

Conclusion (cont.)

Presents how ontologies are incorporated in spatial data mining algorithms

    Semantic linkage between ontologies and algorithms through parameterization

        Scale as a task-oriented property

        Constraint as a domain-specific property