Resource Allocation Algorithms for Event-Based Enterprise Systems




PhD Candidate: Alex K. Y. Cheung

Supervisor: Hans-Arno Jacobsen


PhD Thesis Presentation

University of Toronto

March 28, 2011

MIDDLEWARE SYSTEMS RESEARCH GROUP


PhD Thesis Presentation, Alex Cheung © 2011

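Introduction to Distributed Content-based Publish/Subscribe

[Figure: a broker overlay connecting one publisher and two subscribers, with the advertisement path, subscription path, and publication path marked; the brokers multicast the publication to the matching subscribers]

Advertisement: brand = ‘Honda’, cashback >= $0

Publication (publisher): brand = ‘Honda’, cashback = $6000

Subscription (first subscriber): brand = ‘Honda’, cashback > $2000

Subscription (second subscriber): brand = ‘Honda’, cashback > $4000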
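
The matching step in the example above can be sketched in Python. This is only an illustration under stated assumptions: the tuple-based predicate format, the `matches` function, and the subscriber names are hypothetical and are not taken from PADRES or any other system.

```python
import operator

# Comparison operators allowed in a predicate (an assumed, minimal set).
OPS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(subscription, publication):
    """Return True if the publication satisfies every predicate."""
    for attribute, op, value in subscription:
        if attribute not in publication:
            return False
        if not OPS[op](publication[attribute], value):
            return False
    return True


# The subscriptions and publication from the slide.
subscriptions = {
    "subscriber_1": [("brand", "=", "Honda"), ("cashback", ">", 2000)],
    "subscriber_2": [("brand", "=", "Honda"), ("cashback", ">", 4000)],
}
publication = {"brand": "Honda", "cashback": 6000}

# A broker would forward the publication only toward matching subscribers.
recipients = [
    name for name, sub in subscriptions.items() if matches(sub, publication)
]
```

With the values from the slide, both subscribers match, so the publication is multicast to both. A publication with a cashback of $3000 would reach only the first subscriber.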


Desirable Properties of Distributed Content-based Publish/Subscribe



Decoupling of data sources and sinks


Ease of component addition and removal



Flexible routing based on message content


Efficient use of network resources



Distributed broker overlay network


Scalable


Fault tolerant





Applications of Publish/Subscribe


Network and systems monitoring [Mukherjee 1994]

Business activity monitoring [Fawcett et al. 1999]

Business process execution [Schuler et al. 2001]

Workflow management [Cugola et al. 2001]

Multiplayer online games [Bharambe et al. 2002]

RSS filtering [Petrovic et al. 2005; Rose et al. 2007]

Automated service composition [Hu et al. 2008]

Resource discovery [Yan et al. 2009]


Real Deployments of Distributed Publish/Subscribe

GooPS

Google’s pub/sub messaging middleware that integrates web applications (such as Gmail, Google Docs, and Google Calendar) on a worldwide scale, supporting millions of users

Hundreds of brokers with tens of thousands of pub/sub clients

Yahoo Message Broker

Yahoo’s pub/sub middleware that integrates applications with its database system, PNUTS

SuperMontage

Tibco’s pub/sub distribution network for Nasdaq’s quote and order-processing system

GDSN (Global Data Synchronization Network)

A global pub/sub network that allows retailers and suppliers (e.g., Walmart, Target, Metro) to exchange timely and accurate supply chain data


Contributions


Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)

Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)

Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)


Problem


Brokers located in different geographical areas may suffer from uneven load distribution due to

Heterogeneous servers

Network congestion

Different densities and interests of end-users

Consequences

Overloaded brokers introduce high delivery delays and may ultimately crash from running out of memory

The system does not scale with the added resources


Visualizing the Problem

[Figure: one publisher (P) and five subscribers (S)]


Overview of Load Balancing Approach

[Figure: one publisher (P) and five subscribers (S); labels: Local Load Balancing, Global Load Balancing, offloading broker, load-accepting broker]


Evaluation


Implemented on a real open source pub/sub system called PADRES

PlanetLab and a cluster testbed

Local and global load balancing

Homogeneous and heterogeneous servers

Compared against a naive approach


[Figure: Global LB Setup, with brokers B10 to B12, B20 to B22, B30 to B32, B40 to B42, B50 to B52, and B60 to B62, three publishers (P), and three subscribers (S)]


Summary


Load balancing enables the pub/sub system to scale with the number of resources

Load balancing solutions that are unaware of subscription load and relationships are ineffective


Long response time


Unstable system



Contributions


Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)

Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)

Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)


Problem


Publishers can join anywhere in the overlay or connect to the closest broker


Consequences


High delivery delay


Sluggish system


High resource usage in terms of matching, network bandwidth, and subscription storage


High IT costs



Approach


Adaptively move publisher to area of matching subscribers

Two unique solutions

POP (Publisher Optimistic Placement)

Decision is based on the average number of downstream publication deliveries

GRAPE (Greedy Relocation Algorithm for Publishers of Events)

Decision is based on the end-to-end delivery delay, total broker message rate, and user-specified inputs, including the minimization metric (load or delivery delay) and weight
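
To make the role of GRAPE's inputs concrete, here is a rough Python sketch of a weighted placement decision. It is not the GRAPE algorithm itself: the slides do not give its scoring rule, so the normalization, the linear weighting, the function name `choose_broker`, and the sample numbers are all assumptions for illustration.

```python
def choose_broker(candidates, metric="delay", weight=1.0):
    """Pick the candidate broker with the lowest weighted score.

    candidates: dict mapping broker name to a pair
        (average end-to-end delivery delay, total broker message rate)
        that would result from placing the publisher at that broker.
    metric: the metric to prioritize, "delay" or "load".
    weight: share in [0, 1] given to the prioritized metric
        (an assumed interpretation of the user-specified weight).
    """
    if metric not in ("delay", "load"):
        raise ValueError("metric must be 'delay' or 'load'")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")

    # Normalize each metric by its maximum so the two are comparable.
    max_delay = max(delay for delay, _ in candidates.values()) or 1.0
    max_rate = max(rate for _, rate in candidates.values()) or 1.0

    def score(broker):
        delay, rate = candidates[broker]
        norm_delay, norm_rate = delay / max_delay, rate / max_rate
        if metric == "delay":
            primary, secondary = norm_delay, norm_rate
        else:
            primary, secondary = norm_rate, norm_delay
        return weight * primary + (1.0 - weight) * secondary

    return min(candidates, key=score)


# Hypothetical estimates for three candidate brokers.
estimates = {
    "B1": (120.0, 300.0),
    "B2": (80.0, 500.0),
    "B3": (100.0, 350.0),
}
lowest_delay = choose_broker(estimates, metric="delay", weight=1.0)
lowest_load = choose_broker(estimates, metric="load", weight=1.0)
balanced = choose_broker(estimates, metric="delay", weight=0.5)
```

With a weight of 1.0 the sketch minimizes only the chosen metric, while intermediate weights trade one metric against the other.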





Evaluation


Implemented on the open source pub/sub system called PADRES

PlanetLab and a cluster testbed

Enterprise and random workloads

Reduced delivery delay by up to 68%

Reduced message rate by up to 85%


Summary


POP is suitable for pub/sub systems that strive for simplicity, such as GooPS

GRAPE is suitable for systems that strive to minimize one metric to the extreme, such as system load in sensor networks or delivery delay in SuperMontage



Contributions


Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)

Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)

Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)


Problem


What deployment strategy for the broker overlay, publisher assignment, and subscriber assignment minimizes the broker message rate and the number of allocated brokers?

Proven to be an NP-complete problem

Benefits

Increase capacity of the system

More efficient energy usage of the allocated servers

Fewer servers mean lower investment and maintenance costs

In line with Green IT, which enterprises such as Google and Yahoo are currently pursuing



Approach


3-phase design

Phase 1: Record the publications delivered to each subscription into bit vectors

Phase 2: Use information from the bit vectors to allocate subscriptions to brokers using one of 10 algorithms

Phase 3: Construct the broker overlay with 3 optimization techniques and deploy the new configuration

Most compelling properties

Language independent

Content-based (XPath, regex, ranged, SQL, composite subscriptions, etc.) and topic-based, such as GooPS

Works effectively under any workload (defined or undefined)


Phase 1: Subscription Profiling


[Figure: bit vector profile of a subscription, showing the start of the bit vector, the message ID of the first index, and the publications delivered to the subscription (message IDs B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, B34-M226)]

Profile of each subscriber per advertisement maintained at the subscriber’s first broker

Cardinality of bit vector corresponds to bandwidth requirement of the subscription

Used to compute “closeness” between any two subscriptions in the clustering algorithm: closeness = |s_i ∩ s_j|

Fixed size, so shift left if the next publication is out of the bit vector range
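
The profiling scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the PADRES implementation: the class name `SubscriptionProfile`, the use of integer sequence numbers in place of message IDs such as B34-M213, and the choice of set intersection as the closeness measure are all assumptions made for the example.

```python
class SubscriptionProfile:
    """Fixed-size bit vector of publications delivered to one subscription."""

    def __init__(self, size, first_id):
        self.size = size
        self.first_id = first_id  # message ID that index 0 stands for
        self.bits = [0] * size

    def record(self, msg_id):
        """Set the bit for msg_id, shifting left if it is out of range."""
        offset = msg_id - self.first_id
        if offset < 0:
            return  # older than the window; ignored in this sketch
        if offset >= self.size:
            # Drop the oldest entries so that msg_id lands on the last index.
            shift = offset - self.size + 1
            kept = self.bits[shift:]
            self.bits = kept + [0] * (self.size - len(kept))
            self.first_id += shift
            offset = self.size - 1
        self.bits[offset] = 1

    def cardinality(self):
        """Number of set bits: a proxy for the bandwidth requirement."""
        return sum(self.bits)

    def message_ids(self):
        """Message IDs currently recorded in the window."""
        return {self.first_id + i for i, bit in enumerate(self.bits) if bit}


def closeness(a, b):
    """Count of publications delivered to both subscriptions (assumed metric)."""
    return len(a.message_ids() & b.message_ids())


s_i = SubscriptionProfile(size=8, first_id=213)
for msg in (213, 215, 216, 217, 220, 222):
    s_i.record(msg)  # recording 222 forces a left shift by two positions

s_j = SubscriptionProfile(size=8, first_id=213)
for msg in (215, 220, 221):
    s_j.record(msg)
```

Comparing by message ID rather than by raw index keeps the closeness computation valid even when two profiles have shifted by different amounts.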

PhD Thesis Presentation, Alex Cheung © 2011

Phase 2: Subscription Allocation Algorithms


MANUAL/(AUTOMATIC)


Tree with fanout of 2, manual (random) placement of clients


Fastest Broker First (FBF)


Assign subscriptions randomly to the next most powerful broker


Bin Packing


Like FBF, but assigns the next highest traffic subscription


PAIRWISE-N, PAIRWISE-K (related approaches in ICDCS’02)


Subscription clustering where the number of clusters is given


CRAM (Clustering with Resource Awareness and Minimization)


Dynamically determines the number of clusters


Utilizes a new clustering algorithm that is more effective


Evaluated with 4 different subscription closeness metrics, with one derived from Banavar et al. in ICDCS '99
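
The difference between Fastest Broker First and Bin Packing can be sketched as follows. This is a simplified first-fit version written for illustration only: the `allocate` function, the load and capacity units, and the sample values are assumptions, and the thesis algorithms may differ in detail (for example in how a broker is judged full).

```python
import random


def allocate(subscriptions, brokers, by_traffic, seed=0):
    """Assign each subscription to the most powerful broker that can hold it.

    subscriptions: dict of subscription name -> load (for example, the
        cardinality of its bit vector).
    brokers: dict of broker name -> capacity in the same units.
    by_traffic: True for Bin Packing order (highest traffic first),
        False for FBF order (random).
    """
    order = list(subscriptions)
    if by_traffic:
        order.sort(key=lambda s: subscriptions[s], reverse=True)
    else:
        random.Random(seed).shuffle(order)

    # Most powerful brokers are tried first.
    broker_order = sorted(brokers, key=lambda b: brokers[b], reverse=True)
    remaining = dict(brokers)
    assignment = {}
    for sub in order:
        for broker in broker_order:
            if subscriptions[sub] <= remaining[broker]:
                assignment[sub] = broker
                remaining[broker] -= subscriptions[sub]
                break
        else:
            raise ValueError(f"no broker has capacity for {sub}")
    return assignment


loads = {"s1": 5, "s2": 4, "s3": 3, "s4": 2}
capacities = {"fast": 10, "slow": 6, "spare": 6}

bin_packing = allocate(loads, capacities, by_traffic=True)
fbf = allocate(loads, capacities, by_traffic=False)
brokers_used = set(bin_packing.values())
```

In this example Bin Packing leaves the third broker unallocated, which is the kind of saving the resource allocation phase aims for.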



Bin Packing

[Figure: six subscribers (S)]


Bin Packing’s Allocation Result

[Figure: six subscribers (S)]


Phase 3: Broker Overlay Construction

[Figure: nine subscribers (S)]


Bin Packing’s Final Overlay

[Figure: nine subscribers (S) and two publishers (P), with two GRAPE labels]


Evaluation


Implemented on the PADRES open source content-based pub/sub project

Evaluated on a cluster testbed with 80 brokers

Evaluated on SciNet, an HPC with 1000 brokers

Comparison against two related works (Riabov et al. ICDCS’02, Banavar et al. ICDCS’99)

Homogeneous and heterogeneous scenarios

Workload saturates the initial deployment (MANUAL)



Evaluation Results on SciNet

Reduced message rate by up to 92%

Reduced number of allocated brokers by up to 91%


Summary


CRAM combines the benefits of


Subscription clustering


Resource awareness from Bin Packing


by simultaneously reducing both


Broker message rates


Number of allocated brokers


Bit vectors are powerful


Language independent (XPath, regex, topics)


Effective with any workload distribution
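
The idea of combining subscription clustering with resource awareness can be illustrated with a small greedy sketch. This is not the CRAM algorithm, whose details are not given in the slides. It assumes that each profile is the set of message IDs delivered to a subscription, that closeness is the size of the intersection, and that a cluster fits on a broker when its number of distinct publications stays within a hypothetical `capacity`.

```python
from itertools import combinations


def cluster_subscriptions(profiles, capacity):
    """Greedily merge the closest clusters while a broker can still hold them.

    profiles: dict of subscription name -> set of delivered message IDs.
    capacity: maximum number of distinct publications per cluster.
    Returns a list of (subscription names, merged message IDs) pairs.
    """
    clusters = [({name}, set(msgs)) for name, msgs in profiles.items()]
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            shared = len(clusters[i][1] & clusters[j][1])
            merged_load = len(clusters[i][1] | clusters[j][1])
            # Only merge clusters that share traffic and still fit.
            if shared > 0 and merged_load <= capacity:
                if best is None or shared > best[0]:
                    best = (shared, i, j)
        if best is None:
            return clusters  # the number of clusters emerges dynamically
        _, i, j = best
        merged = (
            clusters[i][0] | clusters[j][0],
            clusters[i][1] | clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)


profiles = {
    "s1": {1, 2, 3},
    "s2": {2, 3, 4},
    "s3": {10, 11},
    "s4": {11, 12},
}
result = cluster_subscriptions(profiles, capacity=4)
groups = sorted(sorted(names) for names, _ in result)
```

Subscriptions with overlapping traffic end up on the same broker, so a shared publication is delivered to that broker once, and the number of clusters is not fixed in advance.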


Conclusions


Load balancing increases


Availability by circumventing overloads


Scalability of the system


Publisher placement algorithms reduce


Broker input load by up to 68%


Broker message rate by up to 85%


Delivery delay by up to 68%


Resource allocation algorithms reduce


Average broker message rate by up to 92%


Number of allocated brokers by up to 91%



Future Work


Self-tuning of load balancing parameters

React dynamically by growing and shrinking the network in incremental steps

Improve runtime of the CRAM algorithm by parallelization or reducing its computational complexity

Model workload with more sophisticated methods, such as stochastic processes, to improve accuracy of load estimation


Address fault resiliency in each approach


Q&A
