Prediction of Atomic Web Services Reliability Based on K-means Clustering

Marin Silic, Goran Delac and Sinisa Srbljic

Consumer Computing Laboratory


Faculty of Electrical Engineering and Computing


University of Zagreb, Croatia

http://ccl.fer.hr/


http://www.fer.hr/



http://www.unizg.hr/

ESEC/FSE, Saint Petersburg, Russia, 2013.


Outline


Motivation


Reliability in SOA


State-of-the-art


CLUS Approach


Evaluation


Conclusion

Motivation


Contemporary web applications: SOA

[Figure: a web application built from atomic services A 1 to A 5; services A 1 to A 4 form a composite service]

Process of candidate selection

[Figure: a repository of candidate services A 1 to A 6 and the service-oriented system that selects among them]

Functional properties: ensure the desired functionality

Nonfunctional properties: reliability, availability

Nonfunctional properties impact QoS & QoE

“Reliability on demand” definition

[Figure: an application sends requests (REQ) and receives responses (RES); the outcomes make up the past invocation sample]

$$p_r = \frac{N_{\text{successful}}}{N_{\text{total}}}$$

The ratio of successful to total number of invocations
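As a minimal sketch of the definition above, the ratio can be computed directly from a past invocation sample. The function name `reliability` and the boolean encoding of outcomes are assumptions made for illustration; they do not come from the slides.

```python
def reliability(outcomes):
    """Return the ratio of successful to total invocations.

    outcomes: iterable of booleans, True for a successful invocation
    (this encoding is an assumption for the example).
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("empty invocation sample")
    return sum(1 for ok in outcomes if ok) / len(outcomes)


# Hypothetical past invocation sample: three successes out of four calls.
sample = [True, True, False, True]
print(reliability(sample))  # 0.75
```

An empty sample raises an error rather than returning a value, which mirrors the point that the estimate depends on having a sample at all.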

Drawbacks/Obstacles


Client’s vs. provider’s perspective


Service invocation context


Depends on the quality of the sample


Acquiring a sample proves to be a difficult task


[Figure: services A 1 to A 6, each annotated with QoS; for service A 1, the service provider and two clients are shown with differing values QoS, QoS 1, and QoS 2]

Insight to the Solution


To overcome the drawbacks and obstacles:

Collect a partial, but relevant, past invocation sample

Utilize prediction methods to estimate the reliability for the missing records

State-of-the-art


Collaborative filtering



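User-service reliability matrix: rows are the m users, columns are the n services, entry p_ui is the reliability user u observed for service i, and ? marks a missing value.

$$
\begin{bmatrix}
p_{11} & ? & \cdots & p_{1i} & \cdots & p_{1n} \\
? & p_{22} & \cdots & ? & \cdots & ? \\
\vdots & \vdots & & \vdots & & \vdots \\
p_{u1} & ? & \cdots & p_{ui} & \cdots & p_{un} \\
\vdots & \vdots & & \vdots & & \vdots \\
p_{m1} & ? & \cdots & p_{mi} & \cdots & p_{mn}
\end{bmatrix}
$$

For large m and n the matrix is extremely sparse, leaving a large number of values to predict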

Collaborative filtering


Computes the similarity using PCC


Matrix can be employed in two different ways

UPCC approach

IPCC approach

Hybrid approach
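To make the PCC-based idea concrete, here is a minimal sketch of a user-based (UPCC-style) prediction over a sparse user-service matrix. It follows the textbook collaborative filtering scheme and is not necessarily the exact variant compared in the talk; the names `pcc` and `upcc_predict` and the sample values are assumptions for illustration.

```python
from math import sqrt


def pcc(a, b):
    """Pearson correlation of two users over the services both have invoked.

    a and b map service id -> observed reliability.
    """
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[s] for s in common) / len(common)
    mean_b = sum(b[s] for s in common) / len(common)
    num = sum((a[s] - mean_a) * (b[s] - mean_b) for s in common)
    den = sqrt(sum((a[s] - mean_a) ** 2 for s in common)) * sqrt(
        sum((b[s] - mean_b) ** 2 for s in common)
    )
    return num / den if den else 0.0


def upcc_predict(matrix, user, service):
    """Predict a missing entry from positively correlated users."""
    target = matrix[user]
    mean_t = sum(target.values()) / len(target)
    num = den = 0.0
    for other, row in matrix.items():
        if other == user or service not in row:
            continue
        w = pcc(target, row)
        if w <= 0:
            continue  # ignore dissimilar users
        mean_o = sum(row.values()) / len(row)
        num += w * (row[service] - mean_o)
        den += w
    return mean_t + num / den if den else mean_t


# Hypothetical sparse matrix: u1 has never invoked s3.
matrix = {
    "u1": {"s1": 0.9, "s2": 0.8},
    "u2": {"s1": 0.9, "s2": 0.8, "s3": 0.7},
    "u3": {"s1": 0.5, "s2": 0.9, "s3": 0.95},
}
print(round(upcc_predict(matrix, "u1", "s3"), 2))  # 0.75
```

Every prediction scans all other users and recomputes their similarities, which hints at why such approaches struggle once the number of users and services grows large.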

Disadvantages of Collaborative Filtering

Scalability

With millions of users and services, these approaches do not scale

Accuracy in dynamic environments

The Internet is a highly dynamic system

These approaches do not consider environment conditions

CLUStering

To address scalability:

Applies the principle of aggregation

Reduces the redundant data by clustering users and services using K-means

To improve the accuracy:

Introduces environment-specific parameters

Disperses the collected data across the additional dimension

CLUS Overview

[Figure: CLUS overview. Data Clustering Phase: raw data records r(u, s, t) with reliabilities p(r) pass through environment clustering, users clustering, services clustering, and the creation of D, yielding clustered data. Prediction Phase: prediction. Steps are labelled (1c) to (5c).]

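Environment-specific Clustering

Set of environment conditions

$$E = \{e_1, e_2, \ldots, e_i, \ldots, e_n\}$$

[Figure: a day is split at time points t_0, t_1, ..., t_{i-1}, t_i, ..., t_{c-1}, t_c into time windows w_1, w_2, ..., w_i, ..., w_c; K-means clustering groups the window reliabilities p_{w_1}, p_{w_2}, ..., p_{w_i}, ..., p_{w_c} into the environment conditions e_1, e_2, ..., e_i, ..., e_n]

$$p_{w_i} = \frac{1}{|W_i|} \sum_{r \in W_i} p_r$$

W_i: the records that fall in time window w_i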
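The environment-specific step can be sketched as follows: average the reliabilities per time window, then run K-means on the window averages. This is an illustrative sketch only; the function names, the evenly spread initial centroids, the seconds-since-midnight time encoding, and the sample records are all assumptions, not details taken from the slides.

```python
def window_averages(records, num_windows, day_seconds=86400):
    """Average reliability per time window of one day.

    records: iterable of (t, p_r), with t in seconds since midnight.
    Returns one value per window, or None for a window with no records.
    """
    width = day_seconds / num_windows
    buckets = [[] for _ in range(num_windows)]
    for t, p in records:
        index = min(int(t // width), num_windows - 1)
        buckets[index].append(p)
    return [sum(b) / len(b) if b else None for b in buckets]


def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's K-means on scalars; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Assumption: initial centroids are spread evenly over the value range.
    centroids = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: abs(v - centroids[j])) for v in values
        ]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels


# Hypothetical records (time of day in seconds, observed reliability).
records = [
    (3600, 0.99), (7200, 0.97),    # window w_1
    (30000, 0.90), (40000, 0.92),  # window w_2
    (50000, 0.60), (60000, 0.70),  # window w_3
    (80000, 0.95),                 # window w_4
]
p_w = window_averages(records, num_windows=4)
centroids, labels = kmeans_1d(p_w, k=2)
print(labels)  # [1, 1, 0, 1]
```

In this toy run the third window, with its noticeably lower average reliability, ends up in its own environment condition while the other three share one.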

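User-specific Clustering

Set of users clusters

$$U = \{u_1, u_2, \ldots, u_i, \ldots, u_m\}$$

$$p_r = \{p_{e_1}, p_{e_2}, \ldots, p_{e_i}, \ldots, p_{e_n}\}$$

[Figure: each user is a point p_r with coordinates p_{e_1}, p_{e_2}, ..., p_{e_i}, ..., p_{e_n} along the environment conditions e_1, e_2, ..., e_i, ..., e_n; K-means clustering groups the points into the user clusters u_1, u_2, ..., u_i, ..., u_m]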

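Service-specific Clustering

Set of services clusters

$$S = \{s_1, s_2, \ldots, s_i, \ldots, s_l\}$$

$$p_r = \{p_{e_1}, p_{e_2}, \ldots, p_{e_i}, \ldots, p_{e_n}\}$$

[Figure: each service is a point p_r with coordinates p_{e_1}, p_{e_2}, ..., p_{e_i}, ..., p_{e_n} along the environment conditions e_1, e_2, ..., e_i, ..., e_n; K-means clustering groups the points into the service clusters s_1, s_2, ..., s_i, ..., s_l]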
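User-specific and service-specific clustering both group vectors of per-environment reliabilities, so one K-means routine can serve both. The sketch below is a plain Lloyd's K-means written for illustration; seeding the centroids with the first k vectors, the function name `kmeans`, and the sample vectors are assumptions rather than details from the slides.

```python
def kmeans(vectors, k, iterations=50):
    """Lloyd's K-means with squared Euclidean distance; returns one label per vector."""
    # Assumption: the first k vectors seed the centroids (simple and deterministic).
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(
                range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(v, centroids[j])),
            )
            for v in vectors
        ]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical reliability vectors (p_e1, p_e2, p_e3), one per user.
users = {
    "u1": (0.99, 0.95, 0.90),
    "u2": (0.60, 0.55, 0.40),
    "u3": (0.98, 0.96, 0.91),
    "u4": (0.62, 0.50, 0.45),
}
labels = kmeans(list(users.values()), k=2)
user_cluster = dict(zip(users, labels))
print(user_cluster)  # {'u1': 0, 'u2': 1, 'u3': 0, 'u4': 1}
```

Calling the same function on per-service vectors would give the service clusters; only the input dictionary changes.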

Creation of Space D

Each record, r(u, s, t), is associated with the clusters it belongs to: u_k, s_j, e_i

Each entry in D is computed as follows:

$$D[u_k, s_j, e_i] = \frac{1}{|R|} \sum_{r \in R} p_r$$

R contains all the records that belong to clusters u_k, s_j, e_i
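A minimal sketch of how the space D could be filled: group the records by their (user cluster, service cluster, environment condition) triple and average the reliabilities in each group. The function name `build_space`, the dictionary representation of D, and the toy cluster assignments are assumptions for illustration.

```python
from collections import defaultdict


def build_space(records, user_cluster, service_cluster, env_of_time):
    """Return {(u_k, s_j, e_i): mean p_r over the records in that cell}.

    records: iterable of (u, s, t, p_r).
    user_cluster / service_cluster: dicts mapping ids to cluster labels.
    env_of_time: function mapping a time t to an environment condition.
    """
    groups = defaultdict(list)
    for u, s, t, p in records:
        key = (user_cluster[u], service_cluster[s], env_of_time(t))
        groups[key].append(p)
    return {key: sum(ps) / len(ps) for key, ps in groups.items()}


# Hypothetical cluster assignments; time is the hour of the day.
user_cluster = {"u1": 0, "u2": 0}
service_cluster = {"s1": 0, "s2": 1}
env_of_time = lambda t: 0 if t < 12 else 1

records = [
    ("u1", "s1", 3, 1.0),
    ("u2", "s1", 5, 0.5),
    ("u1", "s2", 14, 0.8),
]
D = build_space(records, user_cluster, service_cluster, env_of_time)
print(D)  # {(0, 0, 0): 0.75, (0, 1, 1): 0.8}
```

Cells with no records are simply absent from the dictionary, so D stays as small as the number of populated cluster combinations.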



Prediction



Assuming an ongoing r_c = (u_c, s_c, t_c)

First, it checks the collected sample:

$$H = \{ r_h \mid u_c = u_h \land s_c = s_h \land t_c, t_h \in w_i \}$$

If H is not empty:

$$p_c = \frac{1}{|H|} \sum_{r \in H} p_r$$

Otherwise,

$$p_c = D[u_k, s_j, e_i], \quad u_c \in u_k \land s_c \in s_j \land t_c \in e_i$$
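The two-case prediction rule can be sketched as a single function: use the average of matching past records when there are any, otherwise fall back to the clustered space D. The names `predict`, `window_of`, and `env_of_time`, the dictionary form of D, and all sample values are assumptions made for illustration.

```python
def predict(current, sample, space, user_cluster, service_cluster,
            window_of, env_of_time):
    """Predict reliability p_c for an ongoing record (u_c, s_c, t_c)."""
    u_c, s_c, t_c = current
    w_c = window_of(t_c)
    # H: past records of the same user and service in the same time window.
    H = [p for u, s, t, p in sample
         if u == u_c and s == s_c and window_of(t) == w_c]
    if H:
        return sum(H) / len(H)
    # Otherwise read the entry of D for the clusters the record belongs to.
    return space[(user_cluster[u_c], service_cluster[s_c], env_of_time(t_c))]


# Hypothetical setup; time is the hour of the day, four 6-hour windows.
window_of = lambda t: t // 6
env_of_time = lambda t: 0 if window_of(t) in (0, 3) else 1
user_cluster = {"u1": 0, "u2": 0}
service_cluster = {"s1": 0}
space = {(0, 0, 0): 0.9, (0, 0, 1): 0.6}
sample = [("u1", "s1", 3, 1.0), ("u1", "s1", 4, 0.5)]

args = (sample, space, user_cluster, service_cluster, window_of, env_of_time)
print(predict(("u1", "s1", 5), *args))   # 0.75, averaged from H
print(predict(("u2", "s1", 13), *args))  # 0.6, read from D
```

In this sketch a lookup for a cell that D does not contain raises a `KeyError`; how such empty cells should be handled is not covered by the slides.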



Evaluation


Comparison with the state-of-the-art


UPCC


IPCC


Hybrid


Evaluation measures


Prediction accuracy

o
MAE, RMSE


Prediction performance

o
Aggregated prediction time
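For reference, the two accuracy measures follow their standard definitions: MAE is the mean of the absolute errors and RMSE is the square root of the mean squared error. The sketch below uses hypothetical predicted and measured values, not data from the evaluation.

```python
from math import sqrt


def mae(predicted, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Hypothetical values: one prediction is badly off, three are exact.
predicted = [0.0, 1.0, 1.0, 1.0]
actual = [1.0, 1.0, 1.0, 1.0]
print(mae(predicted, actual))   # 0.25
print(rmse(predicted, actual))  # 0.5
```

The single large error raises RMSE more than MAE, which is why the two measures are usually reported together.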

Evaluation


Experiment setup


Amazon EC2 Cloud

Data

Evaluation


Results


Impact of data density


Prediction accuracy


with load intensity

Evaluation


Results


Impact of data density


Prediction performance

with load intensity

Evaluation


Results


Impact of number of clusters


Prediction accuracy, Data density = 20%

Evaluation


Results


Impact of number of clusters


Prediction performance, Data density = 20%

Conclusion


Proposed a CLUS approach

Improved the prediction accuracy

By introducing environment-specific parameters

At least 56% lower RMSE value than the state-of-the-art

Improved the prediction performance

By applying the principle of aggregation

Execution time reduced by two orders of magnitude when compared to the state-of-the-art

Flexibility of approach

Trade-off between accuracy and scalability

Can be applied in different environments

Q&A


Thank you for listening.