possehastyMechanics

Nov 5, 2013

Network Aware Resource
Allocation in Distributed Clouds

Contribution


Develops efficient resource allocation algorithms


The developed 2-approximation algorithm for
optimum Data Center (DC) selection is found to be
quite efficient


Develops a heuristic for partitioning the requested
resources among the chosen DCs and racks


Minimizes distance (latency) between the selected
DCs


Simulations show that this approach yields
significant gains

Introduction


Resource allocation: a key function of cloud
management and automation


Resource allocation algorithms have high impact on
performance of applications


Also affects the efficiency of DCs in accommodating
requests


User requests require allocation of Virtual
Machines (VMs)


To satisfy these requests, the resource allocator
maintains an updated list of resources available at DCs,
current allocations, and future requirements.

Introduction


User requests include number of VMs and the
communication links required between the VMs


Automation software’s objective is to choose the DC
and rack such that overall resource usage is
minimized and optimal performance is achieved


These two goals are complementary


Usually involves attempting to allocate all requested
resources onto a single rack, which is not always possible


Thus, for best results, resource allocation algorithms
that are capable of handling many scenarios are
required

Introduction


Fragmentation of user requests reduces
performance


Difficult to solve fragmentation


This paper focuses on resource allocation problem in
distributed cloud systems spread out geographically
over WAN

Target : latency




System Architecture

Distributed Cloud


Requests should be handled by DCs close to them,
which helps improve performance


Racks consist of blade servers, each containing
many cores


Communication between multiple blade servers
within the same rack happens via a top-of-rack (ToR) switch


Two different racks communicate using an aggregator
switch


DC networks designed with assumption of locality of
communication

System Architecture

Distributed Cloud


As distance between machines increases, the
bandwidth decreases


Bandwidth depends on physical machines that the
Virtual Machines(VM) are assigned to


Overall efficiency of a DC also depends on this


Number of requests serviceable by the DC also
depends on this

System Architecture

Cloud Management and Automation S/W


Prior knowledge about communication links may not
be available


Automation S/W has to assign resources based on
worst-case conditions and then re-optimize


There are also other conditions that need to be
satisfied


Number of VMs / DC (for fault tolerance)


Automation S/W computes mapping of user requests
to physical machines

System Architecture

Cloud Management and Automation S/W


The output of the cloud automation software is a
mapping of VMs to physical resources


The software interacts with Network Management
System (NMS) and the local Cloud Management
System (CMS)


The cloud optimization software has two
functionalities


Track resource usage


Optimize assignment of user requests


Assignment of user requests consists of identifying
DCs and machines


Goal: To reduce inter-DC and intra-DC traffic

System Architecture

Cloud Management and Automation S/W


Assignment of DCs is done in 4 steps

I. DC Selection


Identify DCs based on user constraints and availability


Identify subset of DCs that minimize latency

II. Partitioning Across DCs


Minimize inter-DC traffic


Adhere to given constraints and partition VMs accordingly

III. Rack, Blade, Processor Selection


Identify physical computational resources in the DCs


Goal: Identify machines with low inter-DC traffic

IV. VM Placement


Assign individual VMs to physical resources


Minimize inter-rack traffic





System Architecture

Data Center Selection


Select DCs that meet


All specifications and constraints


Optimize network resources


Maximize application performance


Use an algorithm that selects a subset of DCs with
least hops


Handle other constraints such as maximum or
minimum VMs / DC

System Architecture

Data Center Selection


DC selection problem: a sub-graph selection problem


Given G = (V, E, w, l)


V: Data Centers


E: Paths between DCs


w: number of available VMs at each DC


l: distance of these paths


Note:


If there is a constraint on the maximum number of VMs / DC,
w takes this value instead


If there is a constraint on the minimum number of VMs / DC,
DCs with fewer VMs are omitted

System Architecture

Data Center Selection


Let ‘s’ be the number of VMs requested


Problem: Find a sub-graph of G whose total weight is at least
‘s’ with minimum diameter


Goal: Find the sub-graph with the minimum length of its
longest edge


NP-hard problem
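
One way to write the selection problem as a single optimization, assuming G is a complete graph so that l(u, v) is defined for every pair of DCs. The symbol V' for the chosen subset is notation introduced here, not taken from the slides:

```latex
\min_{V' \subseteq V} \; \max_{u, v \in V'} l(u, v)
\quad \text{subject to} \quad \sum_{v \in V'} w(v) \ge s
```

Under that assumption the diameter of the chosen sub-graph equals the length of its longest edge, which is why the Problem and Goal lines describe the same objective.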


System Architecture

Data Center Selection


The algorithm (FindMinStar) finds a star topology
centered at a given vertex v


Diameter of the output sub-graph is at most
2x the diameter of the optimal sub-graph
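
A minimal Python sketch of the star-based idea, consistent with the description on these slides but not copied from the paper's pseudocode. The function names, the distance-matrix input `dist`, and the capacity list `w` are assumptions made for illustration:

```python
def find_min_star(dist, w, s, v):
    """Grow a star around DC v: add DCs nearest-first until capacity >= s."""
    order = sorted(range(len(w)), key=lambda u: dist[v][u])
    chosen, total = [], 0
    for u in order:
        chosen.append(u)
        total += w[u]
        if total >= s:
            return chosen
    return None  # the whole cloud cannot hold s VMs


def diameter(dist, nodes):
    """Largest pairwise distance among the chosen DCs."""
    return max(dist[a][b] for a in nodes for b in nodes)


def find_min_graph(dist, w, s):
    """Try a star around every DC and keep the one with the smallest diameter."""
    best = None
    for v in range(len(w)):
        star = find_min_star(dist, w, s, v)
        if star is None:
            continue
        d = diameter(dist, star)
        if best is None or d < best[0]:
            best = (d, star)
    return best  # (diameter, list of DC indices) or None


# Four DCs on a line at positions 0, 1, 2 and 10 (hypothetical data).
positions = [0, 1, 2, 10]
dist = [[abs(p - q) for q in positions] for p in positions]
capacity = [5, 5, 5, 100]
print(find_min_graph(dist, capacity, 12))
```

Every DC in a star lies within the star's radius of the center, so any two of them are at most twice that radius apart. This is the intuition behind the factor of 2.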

System Architecture

Data Center Selection

Running Time


FindMinStar: nodes have to be sorted, $O(n \log n)$


n: number of DCs


Computing diameter: $O(n^2)$


$O(\text{FindMinGraph}) = n \cdot O(\text{FindMinStar}) = O(n^3)$


System Architecture

Machine Selection within DC


Goal: Find machines that reduce inter-rack traffic


DC topology is a tree topology


Root: core switch


Children: top-level switches


Leaf: racks


Given the tree representation of the DC (T) and total
number of VMs (s) to be placed


Find sub-tree with minimum height that has weight
at least equal to ‘s’
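
The sub-tree search can be sketched in Python as a single post-order traversal. The `Node` class, its `free_vms` field, and the function name are hypothetical. The sketch assumes the weight of a sub-tree is the sum of free VM slots in its racks:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_vms: int = 0                              # only meaningful for racks (leaves)
    children: list = field(default_factory=list)   # switches have children


def min_height_subtree(root, s):
    """Return (height, node) for the lowest sub-tree holding at least s VMs."""
    best = None

    def visit(node):
        nonlocal best
        if not node.children:
            height, weight = 0, node.free_vms
        else:
            results = [visit(child) for child in node.children]
            height = 1 + max(h for h, _ in results)
            weight = sum(wt for _, wt in results)
        if weight >= s and (best is None or height < best[0]):
            best = (height, node)
        return height, weight

    visit(root)
    return best  # None if even the whole DC is too small


# Hypothetical DC: one core switch, two top-level switches, three racks.
dc = Node("core", children=[
    Node("sw1", children=[Node("rack1", 10), Node("rack2", 15)]),
    Node("sw2", children=[Node("rack3", 40)]),
])
height, node = min_height_subtree(dc, 20)
print(height, node.name)
```

A lower sub-tree keeps the VMs behind fewer switch levels, which matches the goal of reducing inter-rack traffic.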

System Architecture

Virtual Machine Placement


Heuristic algorithms required for assigning individual
VMs to DCs and CPUs within DCs


Problem is a variant of the graph partitioning and k-cut
problems


User request represented as graph G = (V,E)


Nodes represent VMs to be placed


Edges represent connections between them


Goal: Partition G into disjoint sets $c_1, c_2, \ldots, c_m$ such
that communication between the sets is minimized


If traffic is asymmetric, take the average
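
To make the objective concrete, the sketch below scores a given partition by the traffic that crosses between sets, averaging the two directions when traffic is asymmetric. The matrix layout and function names are assumptions for illustration. The sketch only evaluates a partition and does not search for one:

```python
def symmetrize(traffic):
    """Average the two directions of each VM pair (asymmetric traffic case)."""
    n = len(traffic)
    return [[(traffic[i][j] + traffic[j][i]) / 2 for j in range(n)]
            for i in range(n)]


def cut_traffic(traffic, part):
    """Total traffic between VMs that were placed in different sets.

    traffic[i][j] is the demand from VM i to VM j; part[i] is the index
    of the set (DC or rack) that VM i was assigned to.
    """
    sym = symmetrize(traffic)
    n = len(sym)
    return sum(sym[i][j]
               for i in range(n) for j in range(i + 1, n)
               if part[i] != part[j])


# Three VMs with asymmetric traffic between VM 0 and VM 1 (hypothetical data).
traffic = [[0, 4, 0],
           [2, 0, 1],
           [0, 1, 0]]
print(cut_traffic(traffic, [0, 0, 1]))  # VMs 0 and 1 together
print(cut_traffic(traffic, [0, 1, 1]))  # VMs 1 and 2 together
```

Keeping the heavily communicating pair (VMs 0 and 1) in the same set gives the smaller cut.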

System Architecture

Virtual Machine Placement


Algorithms 4 and 5 give a heuristic solution to the partition
problem


Optimized using Kernighan-Lin heuristics


Runtime: $O(n^2 \log n)$

Simulation Results


Results compared to random approach and greedy
algorithm


Random approach selects random DC and places as
many VMs as possible in the DC


Greedy selects DC with maximum VMs


To measure performances


Random topology created


Random user requests generated


Maximum distance between any two VMs measured
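
The two baselines can be sketched as follows, assuming each keeps adding DCs until the request fits. The slides do not spell out that loop, so the function names and the stopping rule are assumptions:

```python
import random


def _take_until(order, w, s):
    """Fill DCs in the given order until at least s VMs are covered."""
    chosen, total = [], 0
    for v in order:
        if total >= s:
            break
        chosen.append(v)
        total += w[v]
    return chosen if total >= s else None


def greedy_select(w, s):
    """Greedy baseline: always take the DC with the most available VMs next."""
    order = sorted(range(len(w)), key=lambda v: w[v], reverse=True)
    return _take_until(order, w, s)


def random_select(w, s, seed=0):
    """Random baseline: take DCs in random order, filling each completely."""
    order = list(range(len(w)))
    random.Random(seed).shuffle(order)
    return _take_until(order, w, s)


capacity = [5, 5, 5, 100]  # hypothetical available VMs per DC
print(greedy_select(capacity, 103))
print(random_select(capacity, 12))
```

Neither baseline looks at distances between DCs, which is the property the maximum-distance measurement is meant to expose.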


Simulation Results


Location of DCs randomly selected within a
1000x1000 grid


Distance between DCs is the Euclidean distance between points


Five different distributed cloud scenarios


100 DCs


75 DCs


50 DCs


25 DCs


10 DCs


However, the average number of machines in each cloud is the
same
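
A small sketch of how such a topology could be generated. The function name, the seed parameter, and the use of uniform sampling are assumptions, since the slides only say the locations are random:

```python
import math
import random


def random_topology(num_dcs, grid=1000.0, seed=0):
    """Place DCs at random points on a grid and return points and distances."""
    rng = random.Random(seed)
    points = [(rng.uniform(0, grid), rng.uniform(0, grid))
              for _ in range(num_dcs)]
    # Euclidean distance between every pair of DC locations.
    dist = [[math.dist(p, q) for q in points] for p in points]
    return points, dist


# One of the five scenario sizes listed above.
points, dist = random_topology(10)
print(len(points), max(max(row) for row in dist))
```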

Simulation Results

I Experiment


Measuring diameter of placement for a single
request of 1000 VMs


Approximation algorithm performs 79% better


Note : Diameter decreases as number of DCs
decreases


Simulation Results

II Experiment


Study cloud systems with series of user requests


Two experiments

I. 100 requests for 50 to 100 VMs


Requests are uniformly distributed


Large requests

II. 500 requests for 10 to 20 VMs


Small requests


Note: In both experiments, the average total number of VMs
requested is the same (about 100 × 75 = 500 × 15 = 7,500,
taking the midpoints of the ranges)



Simulation Results

II Experiment


Greedy performs better than random by 32.6% and
66.5%


Approximation algorithm performs better than
greedy by 83.4% and 86.4%


Why do larger requests require higher diameter?


Simulation Results

III Experiment


Studies performance of cloud system when
additional constraints are given


Same requests as previous experiment


Resilience is defined as the ratio of total VMs to
maximum VMs at any DC


Requests need to be placed in at least resilience
number of DCs
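
The step from the definition to the "at least resilience number of DCs" requirement can be written out as below. Here $x_v$ (the number of VMs of the request placed at DC $v$), $s$ (total VMs in the request), and $r$ (resilience) are notation introduced for this sketch:

```latex
r = \frac{s}{\max_{v} x_v}
\;\Longrightarrow\;
x_v \le \frac{s}{r} \ \text{for every DC } v
\;\Longrightarrow\;
\bigl|\{\, v : x_v > 0 \,\}\bigr| \;\ge\; \frac{s}{\max_{v} x_v} = r
```

For example, a request of 100 VMs with resilience 4 allows at most 25 VMs at any one DC, so it must span at least 4 DCs.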



Simulation Results

III Experiment


Larger requests have longer diameter


As resilience increases, diameter increases


What is different about these results?


Simulation Results

III Experiment


Performance of heuristic algorithm


Given communication requirements and available
capacity of DCs, the algorithm computes a placement
of VMs that aims to minimize inter-DC traffic


Comparison of heuristic algorithm with greedy and
random algorithms


Random assigns random DC to each VM


Greedy selects DCs in decreasing order of
availability


While selecting VMs, it chooses VMs with maximum
total traffic first
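
A sketch of the greedy baseline as described in the last two bullets: DCs are filled in decreasing order of availability, and VMs are taken in decreasing order of total traffic. The function name and input layout are assumptions:

```python
def greedy_placement(traffic, capacity):
    """Assign each VM to a DC; returns a list where entry i is VM i's DC.

    traffic[i][j] is the demand from VM i to VM j; capacity[d] is the
    number of VMs that DC d can still host.
    """
    n = len(traffic)
    if n > sum(capacity):
        raise ValueError("not enough capacity for all VMs")
    # Total traffic of a VM counts both what it sends and what it receives.
    total = [sum(traffic[i][j] + traffic[j][i] for j in range(n))
             for i in range(n)]
    vms = sorted(range(n), key=lambda i: total[i], reverse=True)
    dcs = sorted(range(len(capacity)), key=lambda d: capacity[d], reverse=True)
    assignment = [None] * n
    next_vm = 0
    for d in dcs:
        for _ in range(capacity[d]):
            if next_vm == n:
                return assignment
            assignment[vms[next_vm]] = d
            next_vm += 1
    return assignment


# Four VMs and two DCs (hypothetical data).
traffic = [[0, 5, 0, 0],
           [5, 0, 1, 0],
           [0, 1, 0, 2],
           [0, 0, 2, 0]]
print(greedy_placement(traffic, [1, 3]))
```

The baseline never checks which VMs talk to each other, only how much each one talks in total.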


Simulation Results

III Experiment


Experiment assigns a request of 100 VMs to DCs


Bandwidth fixed randomly between 0 and 1 Mbps


Inter-DC traffic for assignment of these VMs to
k DCs (k = 2,…,8) was studied


Available resources at each DC were between 100/k
and 200/k


Hence 100 VMs were being assigned to DCs
consisting of 100 to 200 VMs



Simulation Results

III Experiment


For all algorithms, inter-DC traffic increases as the
number of DCs increases…Why?


Greedy algorithm performs better than random by
10.2%


Heuristic algorithm performs better than greedy by
4.6%

Simulation Results

III Experiment


When the DCs did not have excess capacity, inter-DC
traffic was higher for heuristic algorithm by
28.2%


Heuristic algorithm performed better than the other
two algorithms by 4.8%


Greedy and Random had similar performances


Simulation Results

IV Experiment


In this experiment, effect of VM traffic on inter-DC
traffic is studied


The percentage of links with traffic is varied between
20% and 100% and inter-DC traffic is measured


The DCs have no excess capacity in these
experiments


Result: inter-DC traffic grows linearly with
percentage of links with traffic for all algorithms

Conclusions


Main contribution is development of algorithms for
network-aware resource allocation of VMs in
distributed cloud systems


Need for these efficient algorithms:

Inter-DC traffic may be very expensive


2-approximation algorithm provided for selection of
DCs


This algorithm can also be used for rack selection
within DC but using prior knowledge about network
topology within DC gives better results


Heuristic algorithm for mapping VMs to resources
within DC

Related Work


Graph partitioning problems


K-cut problem


Maximum sub-graph problem


Assigning VMs inside DCs studied in
"Improving the scalability of data center networks
with traffic-aware virtual machine placement"