Clustering Trajectories of Moving Objects in an Uncertain World
Nikos Pelekis (1), Ioannis Kopanakis (2), Evangelos E. Kotsifakos (1), Elias Frentzos (1), Yannis Theodoridis (1)
(1) Dept. of Informatics, Univ. of Piraeus, Greece
(2) Tech. Educational Institute of Crete, Greece
IEEE International Conference on Data Mining (ICDM 2009), Miami, FL, USA, 6-9 December, 2009
Pelekis et al. "Clustering Trajectories of Moving Objects in an Uncertain World"
Outline
Related work
Motivation
Our contribution
From Trajectories to Intuitionistic Fuzzy Sets
A similarity metric for Uncertain Trajectories (Un-Tra)
Cen-Tra: The Centroid Trajectory of a bunch of trajectories
TR-I-FCM: A novel clustering algorithm for Un-Tra
Experimental study
Conclusions & future work
Related Work on Mobility Data Mining
Trajectory clustering
Trajectory Clustering
Questions:
Which distance between trajectories?
Which kind of clustering?
What is a cluster ‘mean’ or ‘centroid’?
A representative trajectory?
Which distance?
Average Euclidean distance
“Synchronized” behaviour distance
Similar objects = almost always in the same place at the same time
Computed on the whole trajectory
$$D(\tau_1, \tau_2)|_T = \frac{\int_T d(\tau_1(t), \tau_2(t))\,dt}{|T|}$$
where $d(\tau_1(t), \tau_2(t))$ is the distance between moving objects $\tau_1$ and $\tau_2$ at time $t$
Computational aspects:
Cost = $O(|\tau_1| + |\tau_2|)$ ($|\tau|$ = number of points in $\tau$)
It is a metric => efficient indexing methods allowed, e.g. [Frentzos et al. 2007]
Timeseries-based approaches: LCSS, DTW, ERP, EDR
Trajectory-oriented approach:
(time-relaxed) route similarity vs. (time-aware) trajectory similarity and variations (speed-pattern based similarity; directional similarity; …) [Pelekis et al. 2007]
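A minimal sketch of the synchronized average Euclidean distance, assuming each trajectory is a list of (x, y, t) samples with increasing t, that positions between samples are linearly interpolated, and that the integral is approximated by the trapezoidal rule on a fixed number of time samples. The function names and the `samples` parameter are illustrative, not taken from the slides.

```python
import numpy as np

def position_at(traj, ts):
    """Linearly interpolate (x, y) at the times ts from rows of (x, y, t)."""
    traj = np.asarray(traj, dtype=float)
    x = np.interp(ts, traj[:, 2], traj[:, 0])
    y = np.interp(ts, traj[:, 2], traj[:, 1])
    return np.stack([x, y], axis=-1)

def avg_sync_distance(traj1, traj2, t_start, t_end, samples=1001):
    """Approximate (1/|T|) * integral over T of d(tau1(t), tau2(t)) dt."""
    ts = np.linspace(t_start, t_end, samples)
    d = np.linalg.norm(position_at(traj1, ts) - position_at(traj2, ts), axis=1)
    # Trapezoidal rule written out explicitly.
    integral = np.sum((d[1:] + d[:-1]) * np.diff(ts)) / 2.0
    return integral / (t_end - t_start)

# Two objects moving in parallel, 3 units apart, over T = [0, 10].
a = [(0, 0, 0), (10, 0, 10)]
b = [(0, 3, 0), (10, 3, 10)]
dist_full = avg_sync_distance(a, b, 0, 10)
dist_sub = avg_sync_distance(a, b, 2, 5)   # same measure on a sub-interval
```

Restricting `t_start` and `t_end` to a sub-interval gives the same measure on a narrower time window.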
Which kind of clustering?
K-means
HAC-average
T-OPTICS [Nanni & Pedreschi, 2006]
Reachability plot (= objects reordering for distance distribution), with a threshold
TRACLUS: A Partition-and-Group Framework [Lee et al. 2007]
Discovers similar portions of trajectories (sub-trajectories)
Two phases: partitioning and grouping
What about usage of Mobility Patterns?
Visual analytics for mobility data
Visual analytics for mobility data
[Andrienko et al. 2007]
What is an appropriate way to visualize groups of trajectories?
Summarizing a bunch of trajectories
1) Trajectories → sequences of “moves” between “places”
2) For each pair of “places”, compute the number of “moves”
3) Represent “moves” by arrows (with proportional widths)
Figure labels: Major flow; Minor variations; Many small moves
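The three summarization steps can be sketched as follows, assuming each trajectory has already been reduced to a sequence of place labels (how places are extracted is not covered here). The names `count_moves` and `arrow_widths`, and the `max_width` parameter, are illustrative.

```python
from collections import Counter

def count_moves(place_sequences):
    """Count moves between consecutive, distinct places over all trajectories."""
    moves = Counter()
    for seq in place_sequences:
        for src, dst in zip(seq, seq[1:]):
            if src != dst:              # staying in the same place is not a move
                moves[(src, dst)] += 1
    return moves

def arrow_widths(moves, max_width=10.0):
    """Scale arrow widths proportionally to the number of moves."""
    top = max(moves.values())
    return {pair: max_width * n / top for pair, n in moves.items()}

trips = [["A", "B", "C"], ["A", "B", "D"], ["B", "C"]]
flows = count_moves(trips)
widths = arrow_widths(flows)
```

Pairs with the largest counts correspond to the major flows; pairs with small counts are the minor variations.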
A word on uncertainty
Handling Uncertainty
Handling uncertainty is a relatively new topic!
A lot of research effort has been devoted to developing models for representing uncertainty in trajectories.
The most popular one [Trajcevski et al. 2004]: a trajectory of an object is modeled as a 3D cylindrical volume around the tracked trajectory (polyline)
Various degrees of uncertainty
Coming back to our approach
Motivation
Challenge 1: Introduce trajectory fuzziness in spatial clustering techniques
The application of spatial clustering algorithms (k-means, BIRCH, DBSCAN, STING, …) to Trajectory Databases (TD) is not straightforward
Fuzzy clustering algorithms (Fuzzy C-Means and its variants) quantify the degree of membership of each data vector to a cluster
The inherent uncertainty in TD should be taken into account.
Challenge 2: Study the nature of the centroid / mean / representative trajectory in a cluster of trajectories.
Is it a ‘trajectory’ itself?
Our contribution
I-Un-Tra: An intuitionistic fuzzy vector representation of trajectories that enables clustering of trajectories by existing (fuzzy or not) clustering algorithms
D_UnTra: A distance metric of uncertain trajectories
Cen-Tra: The centroid of a bunch of trajectories, using density and local similarity properties
TR-I-FCM: A novel modification of the FCM algorithm for clustering complex trajectory datasets, exploiting D_UnTra and Cen-Tra.
From Fuzzy sets to Intuitionistic fuzzy sets
Definition 1 (Zadeh, 1965). Let a set E be fixed. A fuzzy set on E is an object of the form
$$A = \{ \langle x, \mu_A(x) \rangle \mid x \in E \}$$
where $\mu_A : E \to [0,1]$
Definition 2 (Atanassov, 1986; Atanassov, 1994). An intuitionistic fuzzy set (IFS) A is an object of the form
$$A = \{ \langle x, \mu_A(x), \gamma_A(x) \rangle \mid x \in E \}$$
where $\mu_A : E \to [0,1]$ and $\gamma_A : E \to [0,1]$
Hesitancy
For every element $x \in E$:
$$0 \le \mu_A(x) \le 1, \qquad 0 \le \gamma_A(x) \le 1, \qquad 0 \le \mu_A(x) + \gamma_A(x) \le 1$$
The hesitancy of the element x to the set A is
$$\pi_A(x) = 1 - \mu_A(x) - \gamma_A(x)$$
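As a small illustration of the constraints and the hesitancy formula, the sketch below validates a (membership, non-membership) pair and returns the hesitancy. The function name and the tolerance are illustrative choices.

```python
def hesitancy(mu, gamma, tol=1e-12):
    """Return pi = 1 - mu - gamma for a valid (membership, non-membership) pair."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= gamma <= 1.0):
        raise ValueError("mu and gamma must lie in [0, 1]")
    if mu + gamma > 1.0 + tol:
        raise ValueError("mu + gamma must not exceed 1")
    return 1.0 - mu - gamma

pi_a = hesitancy(0.4, 0.2)   # element with membership 0.4, non-membership 0.2
pi_b = hesitancy(0.5, 0.5)   # an ordinary fuzzy set element has no hesitancy
```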
Vector representation of trajectories
Assume a regular grid $G(m \times n)$ consisting of cells $c_{k,l}$, a trajectory
$$T_i = \langle (x_{i,0}, y_{i,0}, t_{i,0}), \ldots, (x_{i,n_i}, y_{i,n_i}, t_{i,n_i}) \rangle$$
and a target dimension $p \ll n_i$,
The “approximate trajectory” $\bar{T}_i = \langle r_{i,1}, \ldots, r_{i,p} \rangle$ consists of p regions (i.e. sets of cells) crossed by $T_i$ during period $p_j$, where period $p_j$ is delimited by $p_{ls,j-1}$ and $p_{ls,j}$
The “Uncertain Trajectory” $UnTra(T_i) = \langle ur_{i,1}, \ldots, ur_{i,p} \rangle$ is the ε-buffer of $\bar{T}_i$
Intuitionistic Uncertain Trajectories
membership = inside cell with 100% probability (i.e. thick portions)
non-membership = outside cell with 100% probability (i.e. dotted portions)
hesitancy = ignorance whether inside or outside the cell (i.e. solid thin portions)
Figure labels: a cell $c_{k,l}$; buffer width ε
Intuitionistic Uncertain Trajectories
$$IUnTra(T_i) = \langle (ur_{i,1}, \mu_A(ur_{i,1}), \gamma_A(ur_{i,1})), \ldots, (ur_{i,p}, \mu_A(ur_{i,p}), \gamma_A(ur_{i,p})) \rangle$$
$$\mu_A(ur_{i,j}) = \frac{|r_{i,j}|}{|UnTra(T_i)|}$$
$$\gamma_A(ur_{i,j}) = \frac{|UnTra(T_i) - ur_{i,j}|}{|UnTra(T_i)|}$$
$$\pi_A(ur_{i,j}) = \frac{|ur_{i,j} - r_{i,j}|}{|UnTra(T_i)|}$$
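A hedged sketch of how the three degrees could be computed on a grid. It assumes that each region is a set of (k, l) cell indices, that |·| counts cells, that the ε-buffer grows a region by `eps` cells in every direction, and that the degrees are the cell-count ratios |r| / |UnTra|, |UnTra − ur| / |UnTra| and |ur − r| / |UnTra|. All names are illustrative, and grid bounds are ignored for brevity.

```python
def buffer_cells(cells, eps):
    """Grow a set of (k, l) cells by eps cells in every direction."""
    return {(k + dk, l + dl)
            for (k, l) in cells
            for dk in range(-eps, eps + 1)
            for dl in range(-eps, eps + 1)}

def iuntra_degrees(regions, eps):
    """Return one (membership, non-membership, hesitancy) triple per region."""
    uncertain = [buffer_cells(r, eps) for r in regions]
    whole = set().union(*uncertain)       # all cells of the uncertain trajectory
    total = len(whole)
    degrees = []
    for r, ur in zip(regions, uncertain):
        mu = len(r) / total               # cells certainly crossed
        gamma = len(whole - ur) / total   # cells certainly outside this region
        pi = len(ur - r) / total          # buffer cells: inside or outside unknown
        degrees.append((mu, gamma, pi))
    return degrees

regions = [{(0, 0), (0, 1)}, {(0, 2), (1, 2)}]
degrees = iuntra_degrees(regions, eps=1)
```

Because each region is contained in its buffer and each buffer in the whole uncertain trajectory, the three degrees of every region add up to 1.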
Proposed similarity metric (1/2)
The distance between two I-UnTra A and B is:
$$D_{total}(A,B) = \|A - B\|^{UnTra}_{IFS} = \frac{D_{UnTra}(A,B) + D_{IFS}(A,B)}{2}$$
where
$$D_{UnTra}(UnTra(T_i), UnTra(T_j)) = \min \begin{cases} D_{UnTra}(Rst(UnTra(T_i)), Rst(UnTra(T_j))) + |ur_{i,1} - ur_{j,1}|_{ext} \\ D_{UnTra}(Rst(UnTra(T_i)), UnTra(T_j)) + |ur_{i,1} - gap|_{ext} \\ D_{UnTra}(UnTra(T_i), Rst(UnTra(T_j))) + |ur_{j,1} - gap|_{ext} \end{cases}$$
and $|ur_i - ur_j|_{ext}$ is computed from the extents of the extended MBRs along each axis: $ext\_mbr_x(ur_i)$, $ext\_mbr_x(ur_j)$, $ext\_mbr_x(ur_i, ur_j)$ and $ext\_mbr_y(ur_i)$, $ext\_mbr_y(ur_j)$, $ext\_mbr_y(ur_i, ur_j)$
Figure labels: $ext\_mbr_x(ur_i)$; $ext\_mbr_y(ur_i)$; $ext\_mbr_x(ur_i, ur_j)$; $ext\_mbr_y(ur_i, ur_j)$
Proposed similarity metric (2/2)
Assuming a universe E and two intuitionistic fuzzy sets on it, $A = (M_A, \Gamma_A, \Pi_A)$ and $B = (M_B, \Gamma_B, \Pi_B)$, with the same cardinality n, the similarity measure Z between A and B is given by the following equation:
$$Z(A,B) = \frac{1}{3}\left( z(M_A, M_B) + z(\Gamma_A, \Gamma_B) + z(\Pi_A, \Pi_B) \right)$$
where $z(A', B')$ for fuzzy sets A' and B' (e.g. for $M_A$, $M_B$) is defined as:
$$z(A', B') = \begin{cases} \dfrac{\sum_{i=1}^{n} \min(\mu_{A'}(x_i), \mu_{B'}(x_i))}{\sum_{i=1}^{n} \max(\mu_{A'}(x_i), \mu_{B'}(x_i))}, & A' \cup B' \neq \emptyset \\ 1, & A' \cup B' = \emptyset \end{cases}$$
and similarly for $\Gamma_A$, $\Gamma_B$ and $\Pi_A$, $\Pi_B$.
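An example
$A, B, C \in IFSs(E)$: A = {x, 0.4, 0.2}, B = {x, 0.5, 0.3}, C = {x, 0.5, 0.2}
$$Z(A,B) = \frac{1}{3}\left( \frac{0.4}{0.5} + \frac{0.2}{0.3} + \frac{0.2}{0.4} \right) \approx 0.65$$
$$Z(A,C) = \frac{1}{3}\left( \frac{0.4}{0.5} + \frac{0.2}{0.2} + \frac{0.3}{0.4} \right) = 0.85$$
C is more similar to A than B is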
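The similarity measure can be checked numerically with a short sketch. It assumes an IFS is given as a list of (membership, non-membership) pairs over the same elements, with hesitancy derived as 1 minus their sum, and that z is the ratio of summed minima to summed maxima, equal to 1 when both fuzzy sets are empty. The function names are illustrative.

```python
def z(a, b):
    """Ratio of summed minima to summed maxima of two membership vectors."""
    denominator = sum(max(x, y) for x, y in zip(a, b))
    if denominator == 0:          # both fuzzy sets are empty
        return 1.0
    return sum(min(x, y) for x, y in zip(a, b)) / denominator

def components(ifs):
    """Split a list of (mu, gamma) pairs into the M, Gamma and Pi vectors."""
    mu = [m for m, g in ifs]
    gamma = [g for m, g in ifs]
    pi = [1.0 - m - g for m, g in ifs]
    return mu, gamma, pi

def similarity_z(A, B):
    """Average of z over the membership, non-membership and hesitancy parts."""
    return sum(z(p, q) for p, q in zip(components(A), components(B))) / 3.0

A = [(0.4, 0.2)]
B = [(0.5, 0.3)]
C = [(0.5, 0.2)]
z_ab = similarity_z(A, B)
z_ac = similarity_z(A, C)
```

Here `z_ab` is about 0.656 and `z_ac` is 0.85, which agrees with the slide values once `z_ab` is truncated to two decimals.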
The Centroid Trajectory (CenTra)
The idea (similarity-density-based approach):
adopt some local similarity function to identify common sub-trajectories (concurrent existence in space-time),
follow a region growing approach according to density
Algorithm CenTra: An example
Figure: trajectories T1, T2, T3
The Centroid Trajectory (CenTra)
Figure: trajectories T1, T2, T3
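Fuzzy C-Means algorithm
The FCM objective function:
$$J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, d_{ik}^{2}$$
Given that
$$\sum_{i=1}^{c} u_{ik} = 1 \;\; \forall k, \qquad 0 < \sum_{k=1}^{N} u_{ik} < N \;\; \forall i$$
$J_m$ to be minimized requires:
$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{d_{ik}}{d_{jk}} \right)^{\frac{2}{m-1}} \right]^{-1} \text{ if } I_k = \emptyset; \qquad u_{ik} = 0 \;\; \forall i \notin I_k \text{ and } \sum_{i \in I_k} u_{ik} = 1 \text{ if } I_k \neq \emptyset$$
with $I_k = \{ i \mid 1 \le i \le c, \; d_{ik} = 0 \}$, $1 \le i \le c$, $1 \le k \le N$,
and
$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad 1 \le i \le c.$$
1. Determine c (1 < c < N), and initialize V(0), j=1,
2. Calculate the membership matrix U(j),
3. Update the centroids’ matrix V(j),
4. If ‖U(j+1) − U(j)‖ > ε then j=j+1 and go to Step 2.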
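The four steps above can be sketched in NumPy for ordinary point data with Euclidean distance. This is a minimal sketch, not the authors' implementation: initial centroids are drawn at random from the data, zero distances are clamped to a tiny value instead of handling the $I_k \neq \emptyset$ case exactly, and the parameters `max_iter` and `seed` are additions for the example.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy C-Means on the rows of X; returns (U, V) with U of shape (c, N)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: initialise the centroids with c distinct data points.
    V = X[rng.choice(len(X), size=c, replace=False)]
    U = None
    for _ in range(max_iter):
        # Step 2: memberships u_ik proportional to d_ik^(-2/(m-1)).
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)              # clamp to avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        # Step 3: centroids as membership-weighted means.
        Um = U_new ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Step 4: stop once the membership matrix no longer changes.
        converged = U is not None and np.linalg.norm(U_new - U) <= eps
        U = U_new
        if converged:
            break
    return U, V

points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
U, V = fcm(points, c=2)
labels = U.argmax(axis=0)   # hard assignment derived from fuzzy memberships
```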
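CenTR-I-FCM algorithm
The FCM objective function:
$$J_m^{CenTR\text{-}I\text{-}FCM}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \| x_k - v_i \|^{UnTra}_{IFS}$$
Given that
$$\sum_{i=1}^{c} u_{ik} = 1 \;\; \forall k, \qquad 0 < \sum_{k=1}^{N} u_{ik} < N \;\; \forall i$$
$J_m$ to be minimized requires:
$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\| x_k - v_i \|^{UnTra}_{IFS}}{\| x_k - v_j \|^{UnTra}_{IFS}} \right)^{\frac{1}{m-1}} \right]^{-1} \text{ if } I_k = \emptyset; \qquad u_{ik} = 0 \;\; \forall i \notin I_k \text{ and } \sum_{i \in I_k} u_{ik} = 1 \text{ if } I_k \neq \emptyset$$
with $1 \le i \le c$, $1 \le k \le N$,
and
$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad 1 \le i \le c.$$
Callout on the centroid formula: Ignore update centroid step and instead use CenTra
1. V(0) = c random I-UnTra; j=1;
2. repeat
3. Calculate membership matrix U(j)
4. Update the centroids’ matrix V(j) using CenTra;
5. Compute membership and non-membership degrees of V(j)
6. Until ‖U(j+1) − U(j)‖_F ≤ ε; j=j+1;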
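Since the trajectory distance is not Euclidean, the membership step only needs a matrix of centroid-to-trajectory distances. The sketch below assumes such a matrix has already been computed by some distance function (not shown), uses the exponent 1/(m−1) that follows from an objective with non-squared distances, and splits membership equally among centroids at distance zero, which is one of several choices satisfying the constraint for non-empty $I_k$. The function name is illustrative.

```python
import numpy as np

def memberships(D, m=2.0):
    """D[i, k] is the distance between centroid v_i and trajectory x_k."""
    D = np.asarray(D, dtype=float)
    U = np.zeros_like(D)
    for k in range(D.shape[1]):
        zero = D[:, k] == 0
        if zero.any():
            # I_k is non-empty: only the coinciding centroids get membership.
            U[zero, k] = 1.0 / zero.sum()
        else:
            inv = D[:, k] ** (-1.0 / (m - 1.0))
            U[:, k] = inv / inv.sum()
    return U

# Two centroids, three trajectories; the second trajectory coincides
# with the first centroid.
D = [[1.0, 0.0, 3.0],
     [3.0, 2.0, 1.0]]
U_traj = memberships(D)
```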
Experiments (1/2)
Dataset:
’Athens trucks’ MOD (www.rtreeportal.org)
50 trucks, 1100 trajectories, 112,300 position records
Experiments (2/2)
Use CommonGIS [Andrienko et al., 2007] to identify real clusters
“Round trips” clusters
“Linear” clusters
Results (Clustering accuracy scaling cell size, ε)
Figure: Success (0 to 100) by Cell Size (0.80%, 1.00%, 1.33%, 2.00%, 4.00%, 6.67%) and ε (0, 1, 2, 3)
Fix density threshold to δ = 2% of the total number of trajectories
Results (Clustering accuracy scaling density threshold, δ)
Figure: Success (0 to 100) by Cell Size (0.80%, 1.00%, 1.33%, 2.00%, 4.00%, 6.67%) and δ (0.02, 0.04, 0.06)
Fix uncertainty to ε = 1
Results (scaling the number of clusters)
Figure: Success (0 to 100) vs. # Clusters (2, 3, 4); series: CenTRIFCM, TRFCM
Results (scaling the dataset cardinality)
Figure: Execution time (sec, 0 to 20) vs. # Trajectories (0 to 1200); series: 2 clusters, 3 clusters, 4 clusters, 5 clusters
Results (Quality of CenTra)
Representative Trajectories vs. Centroid Trajectories
Figure panels:
cell size = 1.3%, ε = 0, δ = 0.09
cell size = 1.3%, ε = 0, δ = 0.09
cell size = 2.8%, ε = 0, δ = 0.02
Conclusions
We proposed a three-step approach for clustering trajectories of moving objects, motivated by the observation that clustering and representation issues in TD are inherently subject to uncertainty.
1st step: an intuitionistic fuzzy vector representation of trajectories plus a distance metric consisting of
a metric for sequences of regions and
a metric for intuitionistic fuzzy sets
2nd step: Algorithm CenTra, a novel technique for discovering the centroid of a bundle of trajectories
3rd step: Algorithm CenTR-I-FCM, for clustering trajectories under uncertainty
Future Work
Devise a clever sampling technique for multi-dimensional data so as to diminish the effect of initialization in the algorithm;
Exploit the metric properties of the proposed distance function by using a distance-based index structure (for efficiency purposes);
Perform extensive experimental evaluation using large trajectory datasets
Acknowledgements
Research partially supported by the FP7 ICT/FET Project MODAP (Mobility, Data Mining, and Privacy) funded by the European Union. URL: www.modap.org
MODAP is a continuation of the FP6-14915 IST/FET Project GeoPKDD (Geographic Privacy-aware Knowledge Discovery and Delivery) funded by the European Union. URL: www.geopkdd.eu
Some slides are from:
Fosca Giannotti, Dino Pedreschi, and Yannis Theodoridis, “Geographic Privacy-aware Knowledge Discovery and Delivery”, EDBT Tutorial, 2009.
Back up slides
Examples of mobility patterns exploitation
Trajectory Density-based queries
Find hot-spots (popular places) [Giannotti et al. 2007]
Find T-Patterns [Giannotti et al. 2007]
Find hot motion paths [Sacharidis et al. 2008]
Find typical trajectories [Lee et al. 2007]
Identify flocks & leaders [Benkert et al. 2008]
Figure labels: δt, ε; axes X, Y, T
Which kind of clustering?
General requirements:
Non-spherical clusters should be allowed
E.g.: A traffic jam along a road = “snake-shaped” cluster
Tolerance to noise
Low computational cost
Applicability to complex, possibly non-vectorial data
Temporal focusing
Different time intervals can show different behaviours
E.g.: objects that are close to each other within a time interval can be far apart in other periods of time
The time interval becomes a parameter
E.g.: rush hours vs. low traffic times
Already supported by the distance measure
Just compute $D(\tau_1, \tau_2)|_{T'}$ on a time interval $T' \subseteq T$
Problem: significant T’ are not always known a priori
An automated mechanism is needed to find them
TRACLUS: representative trajectory
The representative trajectory of the cluster:
Compute the average direction vector and rotate the axes temporarily.
Sort the starting and ending points by the coordinate of the rotated axis.
While scanning the starting and ending points in the sorted order, count the number of line segments and compute the average coordinate of those line segments.
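The sweep described above can be sketched as follows. This is a simplified illustration under stated assumptions: a cluster is a list of 2D line segments, the sweep position is emitted only where at least `min_lns` segments are crossed (the MinLns threshold of the TRACLUS paper, which the slide does not mention), and the smoothing step of the original algorithm is omitted.

```python
import math

def representative_trajectory(segments, min_lns=2):
    """segments: list of ((x1, y1), (x2, y2)); returns a list of (x, y) points."""
    # Average direction vector (the sum has the same direction as the mean).
    vx = sum(e[0] - s[0] for s, e in segments)
    vy = sum(e[1] - s[1] for s, e in segments)
    phi = math.atan2(vy, vx)
    cos, sin = math.cos(phi), math.sin(phi)

    def rotate(p):      # express p in axes aligned with the average direction
        return (p[0] * cos + p[1] * sin, -p[0] * sin + p[1] * cos)

    def unrotate(p):    # back to the original axes
        return (p[0] * cos - p[1] * sin, p[0] * sin + p[1] * cos)

    # Rotated segments, each ordered along the rotated x axis.
    rotated = [tuple(sorted((rotate(s), rotate(e)))) for s, e in segments]
    # Sweep over the sorted starting and ending coordinates.
    sweep_positions = sorted({p[0] for seg in rotated for p in seg})
    result = []
    for x in sweep_positions:
        ys = []
        for (x1, y1), (x2, y2) in rotated:
            if x1 <= x <= x2:
                if x2 == x1:
                    ys.append((y1 + y2) / 2.0)
                else:
                    ys.append(y1 + (y2 - y1) * (x - x1) / (x2 - x1))
        if len(ys) >= min_lns:   # enough segments crossed at this position
            result.append(unrotate((x, sum(ys) / len(ys))))
    return result

cluster = [((0, 0), (10, 0)), ((0, 2), (10, 2))]
rep = representative_trajectory(cluster)
```

For the two parallel segments in the example, the representative trajectory runs midway between them.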
Trajectory Uncertainty vs. Anonymization
Never Walk Alone [Bonchi et al. 2008]
Trade uncertainty for anonymity: trajectories that are close up to the uncertainty threshold are indistinguishable
Combine k-anonymity and perturbation
Two steps:
Cluster trajectories into groups of k similar ones (removing outliers)
Perturb trajectories in a cluster so that each one is close to each other up to the uncertainty threshold
Qualitative evaluation of Z
Columns: No; Measure; Counter-intuitive cases; Measure values; Proposed measure value

I. Measure: $S_C$, $S_{DC}$
Counter-intuitive cases: A = {(x, 0, 0, 1)}, B = {(x, 0.5, 0.5, 0)}
Measure values: $S_C(A,B) = S_{DC}(A,B) = 1$
Proposed measure value: Z = 0

II. Measure: $S_H$, $S_{HB}$, $S_e^p$
Counter-intuitive cases: A = {(x, 0.3, 0.3, 0.4)}, B = {(x, 0.4, 0.4, 0.2)}, C = {(x, 0.3, 0.4, 0.3)}, D = {(x, 0.4, 0.3, 0.3)}
Measure values: $S_H(A,B) = S_{HB}(A,B) = S_e^p(A,B) = 0.9$; $S_H(C,D) = S_{HB}(C,D) = S_e^p(C,D) = 0.9$
Proposed measure value: Z(A,B) = 0.66, Z(C,D) = 0.83

III. Measure: $S_H$, $S_{HB}$, $S_e^p$
Counter-intuitive cases: A = {(x, 1, 0, 0)}, B = {(x, 0, 0, 1)}, C = {(x, 0.5, 0.5, 0)}
Measure values: $S_H(A,B) = S_{HB}(A,B) = S_e^p(A,B) = 0.5$; $S_H(B,C) = S_{HB}(B,C) = S_e^p(B,C) = 0.5$
Proposed measure value: Z(A,B) = Z(B,C) = 0

IV. Measure: $S_L$ and $S_S^p$
Counter-intuitive cases: A = {(x, 0.4, 0.2, 0.4)}, B = {(x, 0.5, 0.3, 0.2)}, C = {(x, 0.5, 0.2, 0.3)}
Measure values: $S_L(A,B) = S_S^p(A,B) = 0.95$; $S_L(A,C) = S_S^p(A,C) = 0.95$
Proposed measure value: Z(A,B) = 0.65, Z(A,C) = 0.85

V. Measure: $S_{HY}^1$, $S_{HY}^2$, $S_{HY}^3$
Counter-intuitive cases: A = {(x, 1, 0, 0)}, B = {(x, 0, 0, 1)}
Measure values: $S_{HY}^1(A,B) = S_{HY}^2(A,B) = S_{HY}^3(A,B) = 0$
Proposed measure value: Z(A,B) = 0

VI. Measure: $S_{HY}^1$, $S_{HY}^2$, $S_{HY}^3$
Counter-intuitive cases: A = {(x, 0.3, 0.3, 0.4)}, B = {(x, 0.4, 0.4, 0.2)}, C = {(x, 0.3, 0.4, 0.3)}, D = {(x, 0.4, 0.3, 0.3)}
Measure values: $S_{HY}^1(A,B) = S_{HY}^1(C,D) = 0.9$; $S_{HY}^2(A,B) = S_{HY}^2(C,D) = 0.85$; $S_{HY}^3(A,B) = S_{HY}^3(C,D) = 0.82$
Proposed measure value: Z(A,B) = 0.66, Z(C,D) = 0.83