Clustering Trajectories of Moving Objects in an Uncertain World


Nikos Pelekis (1), Ioannis Kopanakis (2), Evangelos E. Kotsifakos (1), Elias Frentzos (1), Yannis Theodoridis (1)

(1) Dept. of Informatics, Univ. of Piraeus, Greece
(2) Tech. Educational Institute of Crete, Greece

IEEE International Conference on Data Mining (ICDM 2009), Miami, FL, USA, 6-9 December 2009

Pelekis et al. "Clustering Trajectories of Moving Objects in an Uncertain World"


Outline


Related work


Motivation


Our contribution


From Trajectories to Intuitionistic Fuzzy Sets


A similarity metric for Uncertain Trajectories (Un-Tra)

Cen-Tra: The Centroid Trajectory of a bunch of trajectories

TR-I-FCM: A novel clustering algorithm for Un-Tra


Experimental study


Conclusions & future work

Related Work on Mobility Data Mining

Trajectory clustering


Trajectory Clustering


Questions:


Which distance between trajectories?


Which kind of clustering?


What is a cluster ‘mean’ or ‘centroid’?


A representative trajectory?



Which distance?

Average Euclidean distance

$$D(\tau_1, \tau_2)|_T = \frac{\int_T d(\tau_1(t), \tau_2(t))\, dt}{|T|}$$

where $d(\tau_1(t), \tau_2(t))$ is the distance between moving objects $\tau_1$ and $\tau_2$ at time $t$.

“Synchronized” behaviour distance

Similar objects = almost always in the same place at the same time

Computed on the whole trajectory

Computational aspects:

Cost = O(|τ1| + |τ2|) (|τ| = number of points in τ)

It is a metric => efficient indexing methods allowed, e.g. [Frentzos et al. 2007]

Timeseries-based approaches: LCSS, DTW, ERP, EDR

Trajectory-oriented approach:

(time-relaxed) route similarity vs. (time-aware) trajectory similarity and variations (speed-pattern based similarity; directional similarity; …) [Pelekis et al. 2007]
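The average Euclidean distance can be approximated numerically, as in the minimal sketch below. It assumes each trajectory is a time-sorted list of `(t, x, y)` samples with linear movement between samples; the function names `position_at` and `average_distance`, the clamping outside the lifespan and the trapezoidal integration with `steps` sub-intervals are illustrative choices, not part of the slides (the O(|τ1| + |τ2|) cost quoted there refers to an exact computation, not to this approximation).

```python
from bisect import bisect_right
from math import hypot


def position_at(traj, t):
    """Linearly interpolate the (x, y) position of a trajectory at time t.

    traj is a list of (t, x, y) samples sorted by t. Outside the lifespan
    the first/last position is returned (an assumption of this sketch).
    """
    times = [p[0] for p in traj]
    if t <= times[0]:
        return traj[0][1], traj[0][2]
    if t >= times[-1]:
        return traj[-1][1], traj[-1][2]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    (t0, x0, y0), (t1, x1, y1) = traj[i - 1], traj[i]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


def average_distance(traj1, traj2, t_start, t_end, steps=1000):
    """Approximate the time-averaged Euclidean distance on [t_start, t_end]."""
    h = (t_end - t_start) / steps
    total = 0.0
    for s in range(steps + 1):
        t = t_start + s * h
        x1, y1 = position_at(traj1, t)
        x2, y2 = position_at(traj2, t)
        weight = 0.5 if s in (0, steps) else 1.0  # trapezoidal rule
        total += weight * hypot(x1 - x2, y1 - y2)
    return total * h / (t_end - t_start)
```

Restricting `t_start` and `t_end` to a sub-interval gives the same measure on a shorter time window.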










Which kind of clustering?

K-means

HAC-average

T-OPTICS [Nanni & Pedreschi, 2006]

[Figure: reachability plot (= objects reordering for distance distribution) with a threshold]



TRACLUS: A Partition-and-Group Framework [Lee et al. 2007]

Discovers similar portions of trajectories (sub-trajectories)

Two phases: partitioning and grouping

What about usage of Mobility Patterns?

Visual analytics for mobility data


Visual analytics for mobility data

[Andrienko et al. 2007]


What is an appropriate way to visualize groups of trajectories?


Summarizing a bunch of trajectories

1) Trajectories → sequences of “moves” between “places”

2) For each pair of “places”, compute the number of “moves”

3) Represent “moves” by arrows (with proportional widths)

[Figure: flow map showing a major flow, minor variations and many small moves]
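The three summarization steps can be sketched as follows. This is a minimal illustration that assumes each trajectory has already been mapped to a sequence of place labels; the names `count_moves` and `arrow_widths` and the `max_width` scaling are hypothetical and do not come from the cited visual-analytics tools.

```python
from collections import Counter


def count_moves(place_sequences):
    """Count how often each ordered pair of places is traversed.

    place_sequences: iterable of place-label sequences, one per trajectory.
    Staying in the same place is not counted as a move.
    """
    moves = Counter()
    for seq in place_sequences:
        for src, dst in zip(seq, seq[1:]):
            if src != dst:
                moves[(src, dst)] += 1
    return moves


def arrow_widths(moves, max_width=10.0):
    """Scale arrow widths proportionally to the number of moves."""
    if not moves:
        return {}
    largest = max(moves.values())
    return {pair: max_width * n / largest for pair, n in moves.items()}
```

For example, `arrow_widths(count_moves([["A", "B", "C"], ["A", "B"]]))` gives the A→B arrow twice the width of the B→C arrow.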

A word on uncertainty


Handling Uncertainty


Handling uncertainty is a relatively new topic!

A lot of research effort has been devoted to developing models for representing uncertainty in trajectories.

The most popular one [Trajcevski et al. 2004]: a trajectory of an object is modeled as a 3D cylindrical volume around the tracked trajectory (polyline)

Various degrees of uncertainty

Coming back to our approach



Motivation

Challenge 1: Introduce trajectory fuzziness in spatial clustering techniques

The application of spatial clustering algorithms (k-means, BIRCH, DBSCAN, STING, …) to Trajectory Databases (TD) is not straightforward

Fuzzy clustering algorithms (Fuzzy C-Means and its variants) quantify the degree of membership of each data vector to a cluster

The inherent uncertainty in TD should be taken into account.

Challenge 2: Study the nature of the centroid / mean / representative trajectory in a cluster of trajectories.

Is it a ‘trajectory’ itself?



Our contribution

I-Un-Tra: An intuitionistic fuzzy vector representation of trajectories

enables clustering of trajectories by existing (fuzzy or not) clustering algorithms

D_UnTra: A distance metric of uncertain trajectories

Cen-Tra: The centroid of a bunch of trajectories

using density and local similarity properties

TR-I-FCM: A novel modification of the FCM algorithm for clustering complex trajectory datasets

exploiting D_UnTra and Cen-Tra.


From Fuzzy sets to Intuitionistic fuzzy sets


Definition 1 (Zadeh, 1965). Let a set E be fixed. A fuzzy set on E is an object of the form

$$A = \{ \langle x, \mu_A(x) \rangle \mid x \in E \}$$

where $\mu_A : E \to [0,1]$.

Definition 2 (Atanassov, 1986; Atanassov, 1994). An intuitionistic fuzzy set (IFS) A is an object of the form

$$A = \{ \langle x, \mu_A(x), \gamma_A(x) \rangle \mid x \in E \}$$

where $\mu_A : E \to [0,1]$ and $\gamma_A : E \to [0,1]$.


Hesitancy


For every element $x \in E$:

$$0 \le \mu_A(x) \le 1, \qquad 0 \le \gamma_A(x) \le 1, \qquad 0 \le \mu_A(x) + \gamma_A(x) \le 1$$

The hesitancy of the element x to the set A is

$$\pi_A(x) = 1 - \mu_A(x) - \gamma_A(x)$$
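As a small illustration of the hesitancy of an element, the sketch below computes it as one minus membership minus non-membership, after checking the constraints on the two degrees. The function name `hesitancy` and the tolerance `tol` are illustrative assumptions.

```python
def hesitancy(mu, gamma, tol=1e-12):
    """Return the hesitancy 1 - mu - gamma of an element of an IFS.

    mu: membership degree, gamma: non-membership degree.
    Raises ValueError if the degrees violate the IFS constraints.
    """
    if not (0.0 <= mu <= 1.0 and 0.0 <= gamma <= 1.0):
        raise ValueError("mu and gamma must lie in [0, 1]")
    if mu + gamma > 1.0 + tol:
        raise ValueError("mu + gamma must not exceed 1")
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - mu - gamma)
```

An ordinary fuzzy set is the special case where the hesitancy is zero for every element, e.g. `hesitancy(0.5, 0.5)`.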

Vector representation of trajectories


Assume a regular grid $G(m \times n)$ consisting of cells $c_{k,l}$, a trajectory

$$T_i = \langle (x_{i,0}, y_{i,0}, t_{i,0}), \ldots, (x_{i,n_i}, y_{i,n_i}, t_{i,n_i}) \rangle$$

and a target dimension $p \ll n_i$.

The “approximate trajectory”

$$\bar{T}_i = \langle r_{i,1}, \ldots, r_{i,p} \rangle$$

consists of p regions (i.e. sets of cells) crossed by $T_i$ during period $p_j$, where

$$p_j = \left[ \frac{(j-1)\, ls}{p},\ \frac{j\, ls}{p} \right]$$

The “Uncertain Trajectory” is the ε-buffer of $\bar{T}_i$:

$$UnTra(T_i) = \langle ur_{i,1}, \ldots, ur_{i,p} \rangle$$
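A minimal sketch of this vector representation is given below. It rests on several simplifying assumptions that are not in the slides: the grid has square cells of side `cell_size`, the p periods split the trajectory's own time span into equal parts, only the sampled points (not the segments between them) mark a cell as crossed, and the ε-buffer is discretized as a square neighbourhood of `eps_cells` whole cells. The function names are hypothetical.

```python
def approximate_trajectory(samples, cell_size, p):
    """Map a trajectory to p regions, each a set of (k, l) grid cells.

    samples: list of (x, y, t) tuples sorted by t.
    """
    t_first, t_last = samples[0][2], samples[-1][2]
    span = t_last - t_first
    regions = [set() for _ in range(p)]
    for x, y, t in samples:
        if span > 0:
            # Period index; the final instant is assigned to the last period.
            j = min(int((t - t_first) / span * p), p - 1)
        else:
            j = 0
        regions[j].add((int(x // cell_size), int(y // cell_size)))
    return regions


def buffer_region(region, eps_cells):
    """Grow a region by eps_cells cells in every direction."""
    offsets = range(-eps_cells, eps_cells + 1)
    return {(k + dk, l + dl) for (k, l) in region
            for dk in offsets for dl in offsets}


def uncertain_trajectory(samples, cell_size, p, eps_cells):
    """Buffered regions, one per period (a discretized stand-in for UnTra)."""
    return [buffer_region(r, eps_cells)
            for r in approximate_trajectory(samples, cell_size, p)]
```

With `eps_cells=0` the uncertain trajectory coincides with the approximate one, which matches the intuition that ε controls the amount of uncertainty.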

Intuitionistic Uncertain Trajectories

membership = inside cell with 100% probability (i.e. thick portions)

non-membership = outside cell with 100% probability (i.e. dotted portions)

hesitancy = ignorance whether inside or outside the cell (i.e. solid thin portions)

[Figure labels: a cell $c_{k,l}$; ε]


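Intuitionistic Uncertain Trajectories

$$I\text{-}UnTra(T_i) = \langle (ur_{i,1}, \mu_A(ur_{i,1}), \gamma_A(ur_{i,1})), \ldots, (ur_{i,p}, \mu_A(ur_{i,p}), \gamma_A(ur_{i,p})) \rangle$$

$$\mu_A(ur_{i,j}) = \frac{|r_{i,j}|}{|UnTra(T_i)|}$$

$$\gamma_A(ur_{i,j}) = \frac{|UnTra(T_i) \setminus ur_{i,j}|}{|UnTra(T_i)|}$$

$$\pi_A(ur_{i,j}) = \frac{|ur_{i,j} \setminus r_{i,j}|}{|UnTra(T_i)|}$$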

Proposed similarity metric (1/2)


The distance between two I-UnTra A and B is:

$$D_{total}(A, B) = \lVert A - B \rVert^{UnTra}_{IFS} = \frac{D_{UnTra}(A, B) + D_{IFS}(A, B)}{2}$$

where

$$D_{UnTra}(UnTra(T_i), UnTra(T_j)) = \min \begin{cases} D_{UnTra}(Rst(UnTra(T_i)), Rst(UnTra(T_j))) + \lVert ur_{i,1} - ur_{j,1} \rVert_{ext} \\ D_{UnTra}(Rst(UnTra(T_i)), UnTra(T_j)) + \lVert ur_{i,1} - gap \rVert_{ext} \\ D_{UnTra}(UnTra(T_i), Rst(UnTra(T_j))) + \lVert ur_{j,1} - gap \rVert_{ext} \end{cases}$$

and $\lVert ur_i - ur_j \rVert_{ext}$ is a distance between two regions computed from the extents $ext_x(\cdot)$ and $ext_y(\cdot)$ of $mbr(ur_i)$, $mbr(ur_j)$ and $mbr(ur_i \cup ur_j)$.

[Figure labels: $ext_x(mbr(ur_i))$, $ext_y(mbr(ur_i))$, $ext_x(mbr(ur_i \cup ur_j))$, $ext_y(mbr(ur_i \cup ur_j))$]


Proposed similarity metric (2/2)


Assuming a universe E and two intuitionistic fuzzy sets on it, $A = (M_A, \Gamma_A, \Pi_A)$ and $B = (M_B, \Gamma_B, \Pi_B)$, with the same cardinality n, the similarity measure Z between A and B is given by the following equation:

$$Z(A, B) = \frac{1}{3} \left( z(M_A, M_B) + z(\Gamma_A, \Gamma_B) + z(\Pi_A, \Pi_B) \right)$$

where z(A', B') for fuzzy sets A' and B' (e.g. for $M_A$, $M_B$) is defined as:

$$z(A', B') = \begin{cases} \dfrac{\sum_{i=1}^{n} \min(A'(x_i), B'(x_i))}{\sum_{i=1}^{n} \max(A'(x_i), B'(x_i))}, & A' \cup B' \neq \emptyset \\ 1, & A' \cup B' = \emptyset \end{cases}$$

and similarly for $\Gamma_A$, $\Gamma_B$ and $\Pi_A$, $\Pi_B$.





An example


$A, B, C \in IFSs(E)$, with A = {x, 0.4, 0.2}, B = {x, 0.5, 0.3}, C = {x, 0.5, 0.2}

$$Z(A, B) = \frac{\frac{0.4}{0.5} + \frac{0.2}{0.3} + \frac{0.2}{0.4}}{3} \approx 0.65$$

$$Z(A, C) = \frac{\frac{0.4}{0.5} + \frac{0.2}{0.2} + \frac{0.3}{0.4}}{3} = 0.85$$

C is more similar to A than B
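The worked example can be checked with a short script. This is a sketch under the assumption that each intuitionistic fuzzy set is given as a list of `(membership, non_membership)` pairs over the same elements, with the hesitancy derived as one minus their sum; the names `z` and `similarity_Z` are illustrative.

```python
def z(a, b):
    """Sum of element-wise minima over sum of element-wise maxima."""
    denom = sum(max(x, y) for x, y in zip(a, b))
    if denom == 0:  # both fuzzy sets are empty
        return 1.0
    return sum(min(x, y) for x, y in zip(a, b)) / denom


def similarity_Z(A, B):
    """Similarity of two IFSs given as lists of (mu, gamma) pairs."""
    if len(A) != len(B):
        raise ValueError("A and B must have the same cardinality")
    mu_a, mu_b = [m for m, _ in A], [m for m, _ in B]
    ga_a, ga_b = [g for _, g in A], [g for _, g in B]
    pi_a = [1.0 - m - g for m, g in A]  # hesitancy of each element
    pi_b = [1.0 - m - g for m, g in B]
    return (z(mu_a, mu_b) + z(ga_a, ga_b) + z(pi_a, pi_b)) / 3.0


A = [(0.4, 0.2)]
B = [(0.5, 0.3)]
C = [(0.5, 0.2)]
print(similarity_Z(A, B), similarity_Z(A, C))
```

The script gives roughly 0.656 for Z(A, B), which the slide reports as 0.65, and 0.85 for Z(A, C), so C ranks closer to A than B does.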

The Centroid Trajectory (CenTra)

The idea (similarity-density-based approach):

adopt some local similarity function to identify common sub-trajectories (concurrent existence in space-time),

follow a region growing approach according to density

Algorithm CenTra: An example

[Figure: trajectories T1, T2 and T3]


The Centroid Trajectory

[Figure: trajectories T1, T2 and T3]



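Fuzzy C-Means algorithm

The FCM objective function:

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m}\, d_{ik}^{2}$$

Given that

$$\sum_{i=1}^{c} u_{ik} = 1, \qquad 0 < \sum_{k=1}^{N} u_{ik} < N,$$

$J_m$ to be minimized requires:

$$u_{ik} = \begin{cases} \left[ \sum_{j=1}^{c} \left( \dfrac{d_{ik}}{d_{jk}} \right)^{\frac{2}{m-1}} \right]^{-1}, & I_k = \emptyset \\ 0, & i \notin I_k,\ I_k \neq \emptyset \\ \text{such that } \sum_{i \in I_k} u_{ik} = 1, & i \in I_k,\ I_k \neq \emptyset \end{cases} \qquad 1 \le i \le c,\ 1 \le k \le N$$

where $I_k = \{ i \mid d_{ik} = 0 \}$, and

$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad 1 \le i \le c.$$

1. Determine c (1 < c < N), and initialize V(0), j=1,

2. Calculate the membership matrix U(j),

3. Update the centroids’ matrix V(j),

4. If |U(j+1) - U(j)| > ε then j=j+1 and go to Step 2.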
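The four steps of Fuzzy C-Means can be sketched with NumPy as follows. This is an illustrative implementation for plain vector data with Euclidean distance, not the trajectory variant; the names `fuzzy_c_means` and `update_memberships`, the optional `V0` initial centroids, the random initialization from data points, the equal split of membership among coincident centroids and the use of the largest absolute change in U as the stopping test are all assumptions of the sketch.

```python
import numpy as np


def update_memberships(X, V, m):
    """Membership matrix U (c x N) for data X (N x dim) and centroids V."""
    d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)  # d[i, k]
    U = np.zeros_like(d)
    for k in range(d.shape[1]):
        zero = d[:, k] == 0
        if zero.any():
            # Singular case: the point coincides with one or more centroids.
            U[zero, k] = 1.0 / zero.sum()
        else:
            ratio = d[:, k][:, None] / d[:, k][None, :]  # d_ik / d_jk
            U[:, k] = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=1)
    return U


def fuzzy_c_means(X, c, m=2.0, eps=1e-5, max_iter=300, V0=None, seed=0):
    X = np.asarray(X, dtype=float)
    if V0 is None:
        rng = np.random.default_rng(seed)
        V = X[rng.choice(len(X), size=c, replace=False)].copy()
    else:
        V = np.asarray(V0, dtype=float).copy()
    U = np.zeros((c, len(X)))
    for _ in range(max_iter):
        U_new = update_memberships(X, V, m)            # step 2
        W = U_new ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)     # step 3
        converged = np.abs(U_new - U).max() <= eps     # step 4
        U = U_new
        if converged:
            break
    return U, V
```

Each column of the returned `U` sums to one, and `U.argmax(axis=0)` gives a crisp labelling when one is needed.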



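CenTR-I-FCM algorithm

The FCM objective function:

$$J_m^{CenTR\text{-}I\text{-}FCM}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m}\, \lVert x_k - v_i \rVert^{UnTra}_{IFS}$$

Given that

$$\sum_{i=1}^{c} u_{ik} = 1, \qquad 0 < \sum_{k=1}^{N} u_{ik} < N,$$

$J_m^{CenTR\text{-}I\text{-}FCM}$ to be minimized requires:

$$u_{ik} = \begin{cases} \left[ \sum_{j=1}^{c} \left( \dfrac{\lVert x_k - v_i \rVert^{UnTra}_{IFS}}{\lVert x_k - v_j \rVert^{UnTra}_{IFS}} \right)^{\frac{1}{m-1}} \right]^{-1}, & I_k = \emptyset \\ 0, & i \notin I_k,\ I_k \neq \emptyset \\ \text{such that } \sum_{i \in I_k} u_{ik} = 1, & i \in I_k,\ I_k \neq \emptyset \end{cases} \qquad 1 \le i \le c,\ 1 \le k \le N$$

where $I_k = \{ i \mid \lVert x_k - v_i \rVert^{UnTra}_{IFS} = 0 \}$, and

$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad 1 \le i \le c.$$

Ignore this update-centroid step and instead use CenTra:

1. V(0) = c random I-UnTra; j=1;

2. repeat

3. Calculate membership matrix U(j)

4. Update the centroids’ matrix V(j) using CenTra;

5. Compute membership and non-membership degrees of V(j)

6. Until $\lVert U_{j+1} - U_j \rVert_F \le \varepsilon$; j=j+1;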
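The control flow of the CenTR-I-FCM loop can be sketched generically, with the distance and the centroid computation passed in as functions. This is only a skeleton under explicit assumptions: `distance` stands in for the combined UnTra/IFS distance, `update_centroid` stands in for CenTra, and `weighted_medoid` is a simple placeholder used here for demonstration, not the CenTra algorithm. Step 5 of the slide (computing the membership and non-membership degrees of the new centroids) is omitted, and all function and parameter names are hypothetical.

```python
import numpy as np


def memberships(D, m):
    """U (c x N) from a distance matrix D[i, k] (distances not squared)."""
    U = np.zeros_like(D, dtype=float)
    for k in range(D.shape[1]):
        zero = D[:, k] == 0
        if zero.any():
            U[zero, k] = 1.0 / zero.sum()  # object coincides with a centroid
        else:
            ratio = D[:, k][:, None] / D[:, k][None, :]
            U[:, k] = 1.0 / (ratio ** (1.0 / (m - 1.0))).sum(axis=1)
    return U


def weighted_medoid(objects, weights, distance):
    """Placeholder centroid: the object with the least weighted distance."""
    costs = [sum(w * distance(x, cand) for x, w in zip(objects, weights))
             for cand in objects]
    return objects[int(np.argmin(costs))]


def centr_i_fcm_sketch(objects, c, distance, update_centroid,
                       m=2.0, eps=1e-4, max_iter=100, V0=None, seed=0):
    if V0 is None:
        rng = np.random.default_rng(seed)
        V = [objects[i] for i in rng.choice(len(objects), size=c, replace=False)]
    else:
        V = list(V0)
    U_prev = np.zeros((c, len(objects)))
    for _ in range(max_iter):
        D = np.array([[distance(x, v) for x in objects] for v in V], dtype=float)
        U = memberships(D, m)                                      # step 3
        V = [update_centroid(objects, U[i] ** m) for i in range(c)]  # step 4
        if np.linalg.norm(U - U_prev) <= eps:  # Frobenius norm, step 6
            break
        U_prev = U
    return U, V
```

A toy run on numbers, with absolute difference as the distance, looks like `centr_i_fcm_sketch(xs, 2, dist, lambda objs, w: weighted_medoid(objs, w, dist))`; for trajectories both callables would be replaced by the distance and centroid procedures described in the slides.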

Experiments (1/2)


Dataset: ‘Athens trucks’ MOD (www.rtreeportal.org)

50 trucks, 1100 trajectories, 112,300 position records


Experiments (2/2)


Use CommonGIS [Andrienko et al., 2007] to identify real clusters

“Round trips” clusters

“Linear” clusters


Results (Clustering accuracy scaling cell size, ε)

[Figure: Success (0 to 100) for cell sizes 0.80%, 1.00%, 1.33%, 2.00%, 4.00% and 6.67% and for ε = 0, 1, 2, 3]

Fix density threshold to δ = 2% of the total number of trajectories


Results (Clustering accuracy scaling density threshold, δ)

[Figure: Success (0 to 100) for cell sizes 0.80%, 1.00%, 1.33%, 2.00%, 4.00% and 6.67% and for δ = 0.02, 0.04, 0.06]

Fix uncertainty to ε = 1


Results (scaling the number of clusters)

[Figure: Success (0 to 100) vs. # Clusters (2, 3, 4) for CenTR-I-FCM and TR-FCM]

Results (scaling the dataset cardinality)

[Figure: Execution time (sec), 0 to 20, vs. # Trajectories, 0 to 1200, for 2, 3, 4 and 5 clusters]

Results (Quality of CenTra)

Representative Trajectories vs. Centroid Trajectories

[Figure panels: cell size = 1.3%, ε = 0, δ = 0.09; cell size = 1.3%, ε = 0, δ = 0.09; cell size = 2.8%, ε = 0, δ = 0.02]


Conclusions


We proposed a three-step approach for clustering trajectories of moving objects, motivated by the observation that clustering and representation issues in TD are inherently subject to uncertainty.

1st step: an intuitionistic fuzzy vector representation of trajectories plus a distance metric consisting of

a metric for sequences of regions and

a metric for intuitionistic fuzzy sets

2nd step: Algorithm CenTra, a novel technique for discovering the centroid of a bundle of trajectories

3rd step: Algorithm CenTR-I-FCM, for clustering trajectories under uncertainty


Future Work


Devise a clever sampling technique for multi-dimensional data so as to diminish the effect of initialization in the algorithm;

Exploit the metric properties of the proposed distance function by using a distance-based index structure (for efficiency purposes);

Perform extensive experimental evaluation using large trajectory datasets


Acknowledgements


Research partially supported by the FP7 ICT/FET Project MODAP (Mobility, Data Mining, and Privacy) funded by the European Union. URL: www.modap.org

a continuation of the FP6-14915 IST/FET Project GeoPKDD (Geographic Privacy-aware Knowledge Discovery and Delivery) funded by the European Union. URL: www.geopkdd.eu

Some slides are from:

Fosca Giannotti, Dino Pedreschi, and Yannis Theodoridis, “Geographic Privacy-aware Knowledge Discovery and Delivery”, EDBT Tutorial, 2009.

Back up slides


Examples of mobility patterns exploitation


Trajectory Density-based queries

Find hot-spots (popular places) [Giannotti et al. 2007]

Find T-Patterns [Giannotti et al. 2007]

Find hot motion paths [Sacharidis et al. 2008]

Find typical trajectories [Lee et al. 2007]

Identify flocks & leaders [Benkert et al. 2008]

[Figure labels: δt, ε; axes X, Y, T]


Which kind of clustering?


General requirements:

Non-spherical clusters should be allowed

E.g.: A traffic jam along a road = “snake-shaped” cluster

Tolerance to noise

Low computational cost

Applicability to complex, possibly non-vectorial data


Temporal focusing


Different time intervals can show different behaviours

E.g.: objects that are close to each other within a time interval can be far apart in other periods of time

The time interval becomes a parameter

E.g.: rush hours vs. low traffic times

Already supported by the distance measure

Just compute $D(\tau_1, \tau_2)|_{T'}$ on a time interval $T' \subseteq T$

Problem: significant T’ are not always known a priori

An automated mechanism is needed to find them



TRACLUS: representative trajectory

The representative trajectory of the cluster:

Compute the average direction vector and rotate the axes temporarily.

Sort the starting and ending points by the coordinate of the rotated axis.

While scanning the starting and ending points in the sorted order, count the number of line segments and compute the average coordinate of those line segments.
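The sweep described in these steps can be sketched as follows. This is a simplified illustration, assuming a cluster is given as a list of 2D line segments; the function name, the `min_lns` threshold on the number of segments hit by the sweep line and the absence of any smoothing between consecutive sweep positions are assumptions of the sketch rather than details taken from the slide.

```python
import math


def representative_trajectory(segments, min_lns=2):
    """segments: list of ((x1, y1), (x2, y2)); returns a list of (x, y)."""
    # Average direction vector of the cluster and the matching rotation.
    vx = sum(e[0] - s[0] for s, e in segments)
    vy = sum(e[1] - s[1] for s, e in segments)
    phi = math.atan2(vy, vx)
    cos_p, sin_p = math.cos(phi), math.sin(phi)

    def rotate(pt):  # into temporary axes, X' along the average direction
        x, y = pt
        return (x * cos_p + y * sin_p, -x * sin_p + y * cos_p)

    def rotate_back(pt):
        x, y = pt
        return (x * cos_p - y * sin_p, x * sin_p + y * cos_p)

    rotated = []
    for s, e in segments:
        a, b = rotate(s), rotate(e)
        rotated.append((a, b) if a[0] <= b[0] else (b, a))

    # Sort the starting and ending points by their X' coordinate.
    sweep_positions = sorted({p[0] for seg in rotated for p in seg})

    result = []
    for x in sweep_positions:
        ys = []
        for (x1, y1), (x2, y2) in rotated:
            if x1 <= x <= x2:  # the sweep line hits this segment
                if x2 == x1:
                    ys.append((y1 + y2) / 2.0)
                else:
                    ys.append(y1 + (y2 - y1) * (x - x1) / (x2 - x1))
        if len(ys) >= min_lns:
            # Average Y' of the hit segments, mapped back to the original axes.
            result.append(rotate_back((x, sum(ys) / len(ys))))
    return result
```

For two parallel horizontal segments the result is the line running midway between them.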


Trajectory Uncertainty vs. Anonymization


Never Walk Alone [Bonchi et al. 2008]

Trade uncertainty for anonymity: trajectories that are close up to the uncertainty threshold are indistinguishable

Combine k-anonymity and perturbation

Two steps:

Cluster trajectories into groups of k similar ones (removing outliers)

Perturb trajectories in a cluster so that each one is close to each other up to the uncertainty threshold


Qualitative evaluation of Z

| No | Measure | Counter-intuitive cases | Measure values | Proposed measure value |
|---|---|---|---|---|
| I. | S_C, S_DC | A = {(x, 0, 0, 1)}, B = {(x, 0.5, 0.5, 0)} | S_C(A,B) = S_DC(A,B) = 1 | Z = 0 |
| II. | S_H, S_HB, S_e^p | A = {(x, 0.3, 0.3, 0.4)}, B = {(x, 0.4, 0.4, 0.2)}, C = {(x, 0.3, 0.4, 0.3)}, D = {(x, 0.4, 0.3, 0.3)} | S_H(A,B) = S_HB(A,B) = S_e^p(A,B) = 0.9; S_H(C,D) = S_HB(C,D) = S_e^p(C,D) = 0.9 | Z(A,B) = 0.66; Z(C,D) = 0.83 |
| III. | S_H, S_HB, S_e^p | A = {(x, 1, 0, 0)}, B = {(x, 0, 0, 1)}, C = {(x, 0.5, 0.5, 0)} | S_H(A,B) = S_HB(A,B) = S_e^p(A,B) = 0.5; S_H(B,C) = S_HB(B,C) = S_e^p(B,C) = 0.5 | Z(A,B) = Z(B,C) = 0 |
| IV. | S_L and S_s^p | A = {(x, 0.4, 0.2, 0.4)}, B = {(x, 0.5, 0.3, 0.2)}, C = {(x, 0.5, 0.2, 0.3)} | S_L(A,B) = S_s^p(A,B) = 0.95; S_L(A,C) = S_s^p(A,C) = 0.95 | Z(A,B) = 0.65; Z(A,C) = 0.85 |
| V. | S_HY^1, S_HY^2, S_HY^3 | A = {(x, 1, 0, 0)}, B = {(x, 0, 0, 1)} | S_HY^1(A,B) = S_HY^2(A,B) = S_HY^3(A,B) = 0 | Z(A,B) = 0 |
| VI. | S_HY^1, S_HY^2, S_HY^3 | A = {(x, 0.3, 0.3, 0.4)}, B = {(x, 0.4, 0.4, 0.2)}, C = {(x, 0.3, 0.4, 0.3)}, D = {(x, 0.4, 0.3, 0.3)} | S_HY^1(A,B) = S_HY^1(C,D) = 0.9; S_HY^2(A,B) = S_HY^2(C,D) = 0.85; S_HY^3(A,B) = S_HY^3(C,D) = 0.82 | Z(A,B) = 0.66; Z(C,D) = 0.83 |