Analysis of Resource Allocation Techniques for Virtualized Data Centers


Anusha Damodaran

Computer Science Department

San Jose State University

San Jose, CA 95192




Parallel data processing has emerged in recent years as one of
the main concepts with a variety of applications in big data
analytics, data mining, etc. The emergence of the cloud has led to
running parallel data processing frameworks in IaaS clouds. The
main concern here is resource allocation for efficient parallel
data processing. Since the data processing frameworks were designed
for homogeneous environments, optimization needs to be done at both
the infrastructure level and the application level to increase
performance for data intensive tasks. The scalability of clouds and
the dynamic allocation in the cloud are the most important factors
on which the performance of parallel data processing frameworks
depends. In this paper, we discuss the various resource allocation
techniques that have been proposed for efficient parallel data
processing in the cloud. We explore each of these techniques and
explain the importance of the technique for achieving high
performance parallel data processing.



Today, as the amount of data explodes, mining and processing of
data becomes a necessity. Several companies need to do large
amounts of data processing in a cost efficient way. With social
networking at its peak, the amount of data that needs to be
processed grows exponentially, and the significance of parallel
data processing becomes more prominent. Companies started to
develop new frameworks that support parallel data processing, such
as Google's MapReduce, Yahoo's Map-Reduce-Merge, etc. These
frameworks let programmers write code sequentially; the framework
distributes the program as subtasks with subsets of data among the
available nodes, which execute instances of the code on the
appropriate fragments of data. The programming models of these
frameworks are designed such that programmers are saved from the
difficulty of parallel programming methods and execution
optimizations. The framework takes care of parallelizing the job
submitted by the developer. Companies needed to rely on their own
physical infrastructure and their traditional databases to do data
intensive tasks, which have become more expensive. Small businesses
found it difficult to run data intensive applications in a cost
efficient way.
With the evolution of cloud computing, parallel data processing has
taken a new route. Instead of relying on their own physical
infrastructure, these companies can rely on pay-per-use cloud
services that provide Infrastructure as a Service (IaaS). Such
services allocate a set of virtual machines (VMs) to the customer,
let the customer control the virtual machines, and charge according
to the amount of time the machines were allocated. These virtual
machines reside in the virtualized data centers owned by the cloud
services. A typical virtual machine varies in its characteristics
based on the user's needs: the type of operating system, the number
of CPU cores and the memory capacity. This utility model of cloud
computing easily satisfies companies that need supercomputing power
to run a data intensive job in parallel.

The usage of virtual machines in IaaS clouds fits the architecture
of parallel data processing frameworks such as Hadoop and
MapReduce [ ]. These frameworks were initially built for static
clustered environments. When these frameworks are run in clouds,
resource allocation and virtual machine placement become very
important in deciding the performance of a data intensive
application.

Ultimately, when parallel data processing is done in an IaaS cloud,
the data processing frameworks initially assume that they are
working on static, clustered, homogeneous nodes. This largely
neglects the heterogeneous flexibility offered by IaaS clouds. To
talk more about the flexibility of IaaS clouds, we need to study in
detail how the resource allocation strategies for virtualized data
centers work. Each of these resource allocation strategies is a
type of dynamic resource allocation. Resource allocation of the VMs
becomes more important for the performance of data processing
programming models such as MapReduce [ ].

In this paper we talk about the importance of resource allocation
and several dynamic resource allocation methods, which focus on
demand policies, response time and locality [4], and heterogeneity.
We study the architectures of the above models and discuss how
these dynamic resource allocation strategies help in improving the
performance of parallel data processing frameworks and programming
models. We discuss the first data processing framework, Nephele
[14], which exploited the dynamic resource allocation strategy to
do efficient data processing in the cloud in parallel. We conclude
by saying how important it is to come up with new scheduling
algorithms and new frameworks to overcome the current performance
degradation of parallel data processing due to heterogeneous nodes.
Although there is a lot of research going on, we need many more
models for the optimization of MapReduce-like frameworks and also
at the infrastructure level.




Virtualized Data Center

High performance virtualization in an IaaS cloud is enabled by
virtualized data centers. A virtualized data center consists of
virtualized versions of servers, routers, switches and links. It
incorporates the concept of virtualizing the device itself.
Physical hardware is virtualized using hypervisor technology to
create multiple independent logical instances called virtual
machines. Virtual machines created using a hypervisor can have
different capacities, i.e. different memory capacities and
different CPU speeds, and can also run different operating systems
and applications. A virtual machine is hence a software
implementation of a computing environment where an OS and programs
can run [ ]. A virtual data center contains virtual machines,
virtual routers and virtual switches, which are connected by
virtual links. A virtualized data center is the physical data
center which deploys resource virtualization techniques, whereas a
virtual data center is the logical instance of a virtualized data
center. The virtualization layer can be used to create multiple,
isolated and independent VM environments. A data center is
responsible for managing data center activities such as creating a
VM, destroying a VM and routing requests from users to the VMs.

Figure 1: Virtualization of Data Center. Image from [ ]


Resource Allocation

Resource allocation is defined as the process of allocating
available resources to the users that need them [11]. Resource
allocation happens in two steps. 1. The initial planning step,
where VMs are grouped into clusters and deployed onto physical
hosts. 2. Dynamic resource allocation, where, depending on the
incoming requests, VMs are created and reallocated or migrated to
balance the workload [11], thus optimizing the allocation. A
resource allocation strategy (RAS) focuses on how efficiently
resource allocation is done so that the resources are utilized
completely. An optimal resource allocation strategy helps in proper
utilization of resources and in turn helps process data faster.
The response time and the time taken to execute a job become
shorter when an optimal resource allocation strategy is used. From
[11], we know that a typical RAS provides a solution to the
following problems.


Resource scarcity

Resource contention

Over/Under provisioning of resources.

Scarcity of resources arises when there are more users requesting
resources than there are available resources. Resource contention
arises when two applications try to access the same resource at the
same time. Over-provisioning of resources arises when the
application is allocated more resources than required.
Under-provisioning occurs when the allocated resources are not
enough for a particular application.
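The definitions above can be expressed as simple checks. The following is a minimal sketch, not something taken from [11]: the function names and the `tolerance` slack that separates an adequate allocation from an over-provisioned one are assumptions made for illustration.

```python
def is_scarce(requested_units: int, available_units: int) -> bool:
    """Scarcity: more resources are requested than are available."""
    return requested_units > available_units


def provisioning_state(allocated: float, required: float,
                       tolerance: float = 0.1) -> str:
    """Classify an allocation relative to what the application needs.

    `tolerance` is a hypothetical slack: allocations up to 10% above the
    requirement are still treated as adequate.
    """
    if allocated < required:
        return "under-provisioned"
    if allocated > required * (1 + tolerance):
        return "over-provisioned"
    return "adequate"
```

For example, an application that needs 4 cores but receives 2 is under-provisioned, while one that receives 8 is over-provisioned.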

For efficient parallel data processing, speed with respect to
execution time and response time is very important. Jobs should
also complete in less time, in a cost efficient manner, and with
accurate results. For data intensive tasks to be done in IaaS
clouds, we need an optimized resource allocation strategy which
helps provide faster execution of jobs.



A resource allocation strategy has to address four main challenges.

1. Resource modeling

2. Resource offering

3. Resource discovery/Resource monitoring

4. Resource selection and optimization.

The first challenge deals with the resource model that the cloud
provider chooses, depending on the services the provider offers.
The service provider describes each resource with a certain
description, whether it be a virtual storage device or a network
device. There are existing frameworks like the Resource Description
Framework and the Network Description Framework for describing the
resources, and several modeling languages are also used for
resource modeling. Resource offering takes the application
requirements as input at the higher level and handles the resources
at the lower level. Depending on the application requirements, the
resources should be handled and managed by the resource management
and control solutions. The handling of resources is done by a cloud
controller or a cloud driver, which controls the virtual devices
and the networks in a data center. The signaling between virtual
devices, i.e. the signals sent from one virtual machine to another,
is governed by a set of signaling protocols. A dynamic resource
allocation method has to address the third challenge, which is
resource discovery and monitoring. Resources should be monitored
constantly in order to select an ideal resource. Monitoring the
utilization level of the physical machines and the congestion in
the network helps in discovering an ideal physical machine on which
to place the VM. Thus monitoring resources helps in resource
selection, which is the fourth challenge. Resource selection
depends on resource monitoring and resource discovery. Resource
selection then leads to resource optimization by optimizing the
resource allocation strategy.

We talked about the two steps of resource allocation in the
previous section. There are several resource allocation strategies
on both levels. Although there are various allocation strategies,
like priority based resource allocation, adaptive resource
allocation, location aware resource allocation, topology aware
resource allocation, energy efficient resource allocation, storage
resource allocation, etc., we focus on a few techniques which are
used for data intensive applications and are thus more useful for
efficient parallel processing. The second level is the dynamic
resource allocation that is done through live VM migration and
several other techniques. We later study a data processing
framework which uses dynamic allocation to perform efficient data
processing.

Figure 2: Resource allocation system inputs. Image from [13]

In the above diagram, resource allocation is decided based upon
three factors. The cloud resources are the information on the
available resources, the virtual machines, the type of virtual
machines, etc. Resource modeling is the information on the
description of each virtual device. The developer requirements are
the user's requirements as agreed upon in the SLA.


Topology and Application Aware Resource Allocation

In [8], the authors talk about a topology aware resource allocation
strategy. In IaaS clouds, the service provider is not aware of the
application that is hosted. The service provider might not know
that the application is data intensive and hence does not take the
application's needs into account while allocating VMs. In one
technique, the users are asked about their resource requirements.
But the users are sometimes not aware of the infrastructure
underlying the IaaS cloud, which might lead the user to make a
wrong choice. If the user requirements are not accurate, then the
resulting resource allocation does not fit the actual requirement
of the data intensive job either. This leads to performance
degradation at both levels. The user might try to enhance the
performance by making optimizations at the application level. But
that would not utilize the resources efficiently, and might even
cause more performance degradation. When we deal with data
intensive workloads for applications such as data mining or pattern
mining, and when using MapReduce-like programming models, there
might be a lot of communication overhead. If the VMs are placed
without considering the network topology, this causes a lot of
network congestion, and the inter-VM traffic has to travel along
bottlenecked network paths. An optimal resource allocation strategy
solves the above problems by placing the VMs according to the
workload's resource usage characteristics, the topology and also
the utilization of the IaaS cloud.

3.1.1 Architecture

The architecture proposed by the authors follows a methodology in
which information is gathered without user input and the
performance of a particular resource allocation is predicted. The
prototype consists of two major components.

1. A prediction engine that estimates the performance of a resource
allocation

2. A genetic algorithm based search technique

All combinations of possible subsets of the available resources are
first identified. The prediction engine then iterates through the
combinations to find a resource allocation that optimizes the
estimated completion time of a job. The genetic algorithm helps the
prediction engine to search through the combinations of candidates.
The prediction engine takes three inputs.

1. The objective function

2. Application description

3. Available resources

The objective function here is the MapReduce job completion time.
The application description has three parts.

a. The framework type

b. Workload specific parameters

c. Resource specific parameters.

The framework type here is the Hadoop-based MapReduce framework.
The workload specific parameters consist of framework specific
parameters that define the configuration of the application level
environment on top of which the job executes. This information,
such as the number of map and reduce slots configured in the
MapReduce framework, is usually available in the config files. The
job specific resource parameters include the number of CPU cycles
required per input record for both the map and reduce tasks, and
also the CPU overhead for every task. These can be identified by
doing a test run on a small subset of data. The last component is
the information on available resources. This is obtained from the
IaaS monitoring service and also the virtualization layer. The
information obtained includes the number of available resources,
the current load, the available capacity on each server and also a
measurement of the available bandwidth on each network link. A
MapReduce simulator helps in finding the metric that is the output
of the objective function, i.e. the estimated job completion time.
In a very large IaaS system, if the user needs to be allocated r
VMs, then the number of combinations for possible resource
allocations is large. If there are n servers, each hosting one VM,
then there are a total of nCr combinations. To avoid iterating
through such a large number of combinations, the authors use a
genetic algorithm to generate possible candidates for the
prediction engine.
The architecture is then evaluated for scalability and performance.
It is compared with four other application-independent resource
allocation strategies, and the overall gain in performance ranges
from 8% to 41%. This is an unoptimized version of the architecture,
but it still yields a performance gain. We learn that application
aware resource allocation thus increases performance.

The performance gains obtained by using the above resource
allocation strategy play an important role in parallel data
processing. Optimization at the infrastructure level boosts the
performance of the actual application run on the infrastructure.
For all data intensive tasks, knowing about the application helps
the resource allocator come up with the best resource allocation.
Taking the topology into account likewise lets the resource
allocation increase the performance of a data intensive job.

Figure 3: Architecture of topology-based resource allocation.
Image from [8]


Locality Based Resource Allocation

Allocating a VM to a proper physical machine is very important for
the overall performance of the cloud computing environment. Proper
placement of the virtual machines not only increases performance
but is also cost efficient. The main challenge for IaaS systems is
to provide a nominal response time, which is achieved by placing
the virtual machine properly. Hence the location of the virtual
machines becomes an important factor in increasing performance. The
service provider usually has data centers across the world to
provide worldwide services to the customer. A user requesting a VM
in one location should be allocated a VM near that location to get
the fastest response time. If the physical machine hosting the
user's virtual machine is geographically far from the user, then
there is a significant decrease in performance with respect to
response time. If the data center controller allocates VMs based
only on the utilization level of the physical machine, then even if
the VM is allocated on a lightly utilized physical machine there
will be a decrease in performance if the location is not
considered. The authors in [4] come up with a dynamic resource
allocation strategy which considers two important factors.

Location-aware placement of the virtual machines

Dynamic management of resource utilization

3.2.1 Utility Function

To find the suitability of a physical machine for allocating a VM
or for VM migration, three factors need to be taken into account
when computing the utility function.

1. Utilization level

2. Location

3. Expected response time.

The utility function then decides how and where the VM should be
placed. For the utilization level, the utility function considers
the current utilization level, the expected utilization level and
the predefined threshold of the utilization level. We assume that
the service provider can calculate how much the utilization level
increases after allocating the VM. If the expected utilization
level is greater than the threshold, then the utility function
returns the minimum utility score for that particular physical
machine (PM).

For the utility score of response time, the utility function
considers the response time the user has agreed upon in the SLA and
the expected response time. Response time in turn depends on the
utilization level of the PM and the location. If the expected
response time is greater than the response time agreed upon by the
user in the SLA, then the minimum utility score is returned for the
response time utility.

To evaluate the location, the geographical distance between the PM
and the user is determined. After the utility function is
evaluated, the provider decides where to place the VM, or where to
migrate the VM. The model as proposed is shown in Figure 4.
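The three factors can be combined into a single score along the following lines. This is a minimal sketch under assumptions, not the formula from [4]: the text only states that a PM receives the minimum score when its expected utilization exceeds the threshold or its expected response time exceeds the SLA value, so the linear sub-scores, the equal weighting and the `max_distance_km` normalization constant are all hypothetical.

```python
from dataclasses import dataclass

MIN_SCORE = 0.0  # score returned when a hard limit is violated


@dataclass
class Candidate:
    """A physical machine (PM) considered for hosting the VM."""
    expected_utilization: float   # fraction in [0, 1] after placing the VM
    expected_response_ms: float   # predicted response time for the user
    distance_km: float            # geographical distance to the user


def utility(pm: Candidate, threshold: float, sla_response_ms: float,
            max_distance_km: float = 20000.0) -> float:
    """Score a PM; a higher score means a more suitable PM."""
    # Hard limits described in the text: exceed either one and the PM
    # receives the minimum utility score.
    if pm.expected_utilization > threshold:
        return MIN_SCORE
    if pm.expected_response_ms > sla_response_ms:
        return MIN_SCORE
    # Assumed linear sub-scores in [0, 1], averaged with equal weight.
    util_score = 1.0 - pm.expected_utilization / threshold
    resp_score = 1.0 - pm.expected_response_ms / sla_response_ms
    loc_score = max(0.0, 1.0 - pm.distance_km / max_distance_km)
    return (util_score + resp_score + loc_score) / 3.0
```

Under these assumptions, two PMs with the same load and response time are ranked by distance, so the one closer to the user receives the higher score.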

3.2.2 Decision M

When a new VM placement request comes in, the decision is made by
first selecting the closest data center. The utilization level of
each physical machine is then evaluated, and the suitability of the
physical machine is evaluated using the utility function. The
utility function gives a utility score for each physical machine.
Depending on the scores, the VM placement is done. After the VM
placement is done, PM monitoring continues and reports the
utilization level to the provider.

Figure 4: Architecture of locality-based resource allocation.
Image from [4]

When a PM exceeds the utilization threshold level, the migration
overhead is evaluated to determine whether a migration is feasible.
If the utility score is greater than the migration overhead, then
the VM is migrated from one PM to another. After the migration is
done, PM monitoring continues to check whether the utilization
level is exceeded and reports to the cloud provider.
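The placement and migration decisions described above can be sketched as two small functions. This is an illustrative sketch, not the decision model of [4]: data centers are placed on a one-dimensional `position` axis purely to keep the distance computation short, the dictionary layout is hypothetical, and `score` stands in for the utility function.

```python
from typing import Callable, Dict, List, Tuple


def place_vm(data_centers: List[Dict], user_position: float,
             score: Callable[[Dict], float]) -> Tuple[str, str]:
    """Pick the closest data center, then its best-scoring PM."""
    closest = min(data_centers,
                  key=lambda dc: abs(dc["position"] - user_position))
    best_pm = max(closest["pms"], key=score)
    return closest["name"], best_pm["name"]


def should_migrate(utilization: float, threshold: float,
                   target_utility: float,
                   migration_overhead: float) -> bool:
    """Migrate only from an overloaded PM, and only if it pays off."""
    if utilization <= threshold:
        return False  # the PM is within its limit, nothing to do
    return target_utility > migration_overhead
```

A provider would call `place_vm` for each incoming request and `should_migrate` whenever monitoring reports a PM above its threshold.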

The performance of location aware resource allocation was
evaluated, and it was found that the response time decreases if the
PM is near the user. For efficient parallel data processing, it is
imperative that the response time and the inter-VM response time
are as low as possible. Topology aware and application aware
resource allocation also aim at a shorter completion time for a
job. Location aware resource allocation, even if it sounds less
significant, does provide greater performance for data intensive
tasks.

Heterogeneity Based Resource Allocation

For data analytic workloads, some tasks might be CPU intensive and
some tasks might be I/O intensive. The cloud driver should know
which VM to allocate to which type of task. Given the demands of
the data analytic tasks, the resource allocation strategy should
also scale the cluster according to demand. The authors of [9]
propose a model for data analytic workloads as follows. The nodes
are the virtual machines in this case. The two steps proposed are:
1. Divide the nodes into two pools, the core nodes and the
accelerator nodes. 2. Dynamically adjust the size of each pool to
reduce cost and to improve utilization.

Figure 5: Architecture of Data Analytic cloud. Image from [9]

The first step is to divide the machines into two pools: the core
nodes and the accelerator nodes. The core nodes are used for
storage and computation. The accelerator nodes are used for
additional computational power. The cloud driver in the above
diagram decides when to add or remove nodes and what type of nodes
to add to which pool, and it also manages the nodes. The cloud
driver basically predicts the requirements of the job and allocates
accordingly. The cloud driver initially allocates a set of core
nodes. When there are important production queries or when a large
job is submitted, the cloud driver adds some accelerator nodes to
the pool to handle the job, instead of allocating more core nodes
and underutilizing some of the core nodes.
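The scaling decision of the cloud driver can be sketched as follows. This is a simplified illustration and not the policy from [9]: it assumes that work is measured in tasks, that every node handles the same number of tasks (`tasks_per_node`), and that the provider caps the accelerator pool at `max_accelerators`; all three are hypothetical parameters.

```python
import math


def accelerators_needed(core_nodes: int, pending_tasks: int,
                        tasks_per_node: int, max_accelerators: int) -> int:
    """Accelerator nodes to add so the pending work fits the cluster.

    The core pool stays fixed; only the work that exceeds its capacity
    is covered by temporary accelerator nodes.
    """
    core_capacity = core_nodes * tasks_per_node
    shortfall = max(0, pending_tasks - core_capacity)
    return min(max_accelerators, math.ceil(shortfall / tasks_per_node))
```

With 4 core nodes that each handle 10 tasks, a burst of 100 pending tasks leaves a shortfall of 60 tasks, so 6 accelerator nodes are added; once the burst is over the function returns 0 and the accelerators can be released.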

The cloud service provider also offers many instance types. A job
might run faster on one instance type than on another. An instance
here is a virtual machine. The job affinity rate for a particular
job is computed by comparing the computing rate of the job on every
instance. For example, if job j runs faster on instance ‘I’ than on
instance ‘i1’, then the computing rate of j is higher on ‘I’ than
on ‘i1’, and the job affinity rate is higher for job j and instance
‘I’. The cloud driver thus decides which instance to allocate for
which job depending on the job affinity rate. On top of the
resource allocation strategy mentioned above, the authors also
propose a job scheduling strategy to further increase performance.
We refrain from discussing the job scheduling strategy, since we
are discussing only resource allocation strategies in this paper.
The resource allocation discussed above makes sure that a data
intensive job like a MapReduce job takes the heterogeneity of the
nodes in the cloud into account. Even though a lot of optimizations
have been proposed for MapReduce to cope with heterogeneity [ ], a
lot of research still needs to be done so that a MapReduce job can
exploit all the features of a cloud and achieve the best results.
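The instance selection by job affinity can be illustrated with a short sketch. The text does not give a formula for the job affinity rate, so the normalization used here (the computing rate on each instance divided by the rate on the slowest instance) is an assumption, as are the function names and the sample rates.

```python
from typing import Dict


def affinity_rates(computing_rates: Dict[str, float]) -> Dict[str, float]:
    """Affinity of one job for each instance type.

    Assumed definition: the job's computing rate on an instance divided
    by its rate on the slowest instance, so the slowest type scores 1.0.
    """
    slowest = min(computing_rates.values())
    return {name: rate / slowest for name, rate in computing_rates.items()}


def best_instance(computing_rates: Dict[str, float]) -> str:
    """Instance type with the highest affinity for the job."""
    rates = affinity_rates(computing_rates)
    return max(rates, key=rates.get)
```

For a job j that processes 12 records per second on instance ‘I’ and 8 on instance ‘i1’, the affinity is 1.5 for ‘I’ and 1.0 for ‘i1’, so the cloud driver would pick ‘I’.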


On Nephele: A Parallel Data Processing Framework

A data processing framework called Nephele takes advantage of the
dynamic resource allocation of clouds. Nephele is the first
parallel data processing framework which successfully exploits the
dynamic resource allocation model in the cloud. The architecture of
Nephele is as follows.

Figure 6: Nephele in an IaaS cloud. Image from [14]

When the user first sends a job, a VM is started in the cloud to
serve as the job manager. The job manager receives the user's jobs,
schedules them and coordinates their execution. The job manager
communicates with the cloud controller to allocate or deallocate
instances depending on the job's execution. A Nephele job is
partitioned into tasks, and each task runs on a VM. Each VM runs a
task manager, which receives a set of tasks from the job manager,
executes the tasks and reports to the job manager. After the job
manager receives the job from the user, it decides how many virtual
machines are needed, what types of VMs are needed to run the job,
and at which phase of the job execution the VMs should be allocated
and deallocated, such that the utilization of the PMs increases.
Since VMs are deallocated as soon as the job is done with them,
this is also cost efficient. The persistent storage is where the
input data and the output data are stored. The job manager and the
task managers can access the persistent storage.
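The cost benefit of allocating and deallocating VMs per phase can be shown with a small calculation. This is a sketch under assumptions and does not use Nephele's API: the job is modeled as sequential stages, each a hypothetical `(vms, duration)` pair, and the cost is taken to be proportional to the VM time that is allocated.

```python
from typing import List, Tuple

Stage = Tuple[int, float]  # (number of VMs, duration of the stage)


def vm_time_dynamic(stages: List[Stage]) -> float:
    """VM time when VMs are allocated per stage and released afterwards."""
    return sum(vms * duration for vms, duration in stages)


def vm_time_static(stages: List[Stage]) -> float:
    """VM time when the peak number of VMs is held for the whole job."""
    peak_vms = max(vms for vms, _ in stages)
    total_duration = sum(duration for _, duration in stages)
    return peak_vms * total_duration
```

For stages `[(4, 10), (2, 5), (1, 20)]`, dynamic allocation uses 70 units of VM time, while holding 4 VMs for all 35 time units uses 140.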

When the above framework is compared with Hadoop and MapReduce,
there is a significant increase in performance. The increase is
mainly due to the deallocation of VMs after the task running on the
VM is done. MapReduce does not know whether it can deallocate a VM
during a reduce task, because it does not know if the VM holds any
intermediate results. Nephele, however, makes sure that the task
managers report to the job manager as soon as a task is done, and
the VM is then deallocated. The output from each task manager is
stored in persistent storage. Thus, the framework takes into
account the dynamic resource allocation capability of clouds and
hence increases performance. Several optimizations for MapReduce
are being developed to exploit the flexibility and the scalability
offered by clouds, both to run MapReduce jobs efficiently and to
increase the utilization levels of the PMs in a data center.



There are various resource allocation strategies that have been
proposed, like priority based resource allocation, energy efficient
resource allocation, on-demand resource allocation, topology aware
resource allocation, etc. Of all the resource allocation
strategies, the ones this paper focuses on are closely related to
parallel data processing. This paper focuses on data intensive
tasks run in IaaS clouds and on how to utilize the flexibility of
the cloud. Currently there is a lot of research going on into how
to improve MapReduce in heterogeneous environments. If a solution
is found for optimized resource allocation and job scheduling, then
data processing in the cloud would be efficient and would attain
performance gains.




[1] Amazon Elastic Compute Cloud (Amazon EC2)

[2] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz,
A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica,
M. Zaharia, Above the Clouds: A Berkeley View of Cloud Computing

[3] Md. F. Bari, R. Boutaba, R. Esteves, L. Zambenedetti
Granville, M. Podlesny, Md. G. Rabbani, Q. Zhang, and Md.
F. Zhani, Data Center Network Virtualization: A Survey,
IEEE Communications Surveys and Tutorials, Vol. 15, No. 2,
Second Quarter

[4] G. Jung and K. M. Sim, Location-Aware Dynamic Resource
Allocation Model for Cloud Computing Environment,
International Conference on Information and Computer
Applications (ICICA 2012)

[5] V. Josyula, M. Orr, G. Page, Cloud Computing: Automating
the Virtualized Data Center

[6] M. Kesavan, A. Gavrilovska, K. Schwan, Elastic Resource
Allocation in Datacenters: Gremlins in the Management Plane

[7] D. Logothetis, K. Yocum, Ad-Hoc Data Processing in the
Cloud

[8] G. Lee, N. Tolia, P. Ranganathan, R. H. Katz,
Topology-Aware Resource Allocation for Data-Intensive Workloads

[9] G. Lee, B. Chun, R. H. Katz, Heterogeneity-Aware Resource
Allocation and Scheduling in the Cloud

[10] NIST Definition of Cloud Computing

[11] R. Patel, S. Patel, Survey on Resource Allocation Strategies
in Cloud Computing, International Journal of Engineering
Research & Technology (IJERT), Vol. 2 Issue 2, February

[12] C. S. Pawar, R. B. Wagh, Priority Based Dynamic Resource
Allocation in Cloud Computing

[13] Textbook of Short Courses, Resource Allocation in Clouds:
Concepts, Tools and Research Challenges

[14] D. Warneke and O. Kao, Exploiting Dynamic Resource
Allocation for Efficient Parallel Data Processing in the Cloud,
IEEE Transactions on Parallel and Distributed Systems, January

[15] M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, I. Stoica,
Improving MapReduce Performance in Heterogeneous
Environments, 8th USENIX Symposium on Operating
Systems Design and Implementation