A Hierarchical Framework for Cross-Domain
MapReduce Execution
Yuan Luo¹, Zhenhua Guo¹, Yiming Sun¹, Beth Plale¹, Judy Qiu¹, Wilfred W. Li²

¹School of Informatics and Computing, Indiana University, Bloomington, IN, 47405
²San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, 92093

{yuanluo, zhguo, yimsun, plale, xqiu}@indiana.edu, wilfred@sdsc.edu

ABSTRACT
The MapReduce programming model provides an easy way to
execute pleasantly parallel applications. Many data-intensive life
science applications fit this programming model and benefit from
the scalability that can be delivered using this model. One such
application is AutoDock, which consists of a suite of automated
tools for predicting the bound conformations of flexible ligands to
macromolecular targets. However, researchers also need
sufficient computation and storage resources to fully enjoy the
benefit of MapReduce. For example, a typical AutoDock based
virtual screening experiment usually consists of a very large
number of docking processes from multiple ligands and is often
time consuming to run on a single MapReduce cluster. Although
commercial clouds can provide virtually unlimited computation
and storage resources on-demand, due to financial, security and
possibly other concerns, many researchers still run experiments on
a number of small clusters with limited number of nodes that
cannot unleash the full power of MapReduce. In this paper, we
present a hierarchical MapReduce framework that gathers
computation resources from different clusters and runs MapReduce
jobs across them. The global controller in our framework splits
the data set and dispatches the partitions to multiple “local” MapReduce
clusters, and balances the workload by assigning tasks in
accordance with the capabilities of each cluster and of each node.
The local results are then returned to the global controller for
global reduction. Our experimental evaluation using AutoDock
over MapReduce shows that our load-balancing algorithm achieves
a promising workload distribution across multiple clusters, and thus
minimizes the overall makespan of the entire MapReduce
execution.
Categories and Subject Descriptors
C.2.4 [Computer-Communication Networks]: Distributed
Systems – Distributed applications.
General Terms:
Design, Experimentation, Performance

Keywords:
AutoDock, Cloud, FutureGrid, Hierarchical
MapReduce, Multi-Cluster

1. INTRODUCTION
Life science applications are often both compute intensive and
data intensive. They consume large amounts of CPU cycles while
processing massive data sets that come either as large groups of small
files or as naturally splittable data. These kinds of applications fit well
in the MapReduce [2] programming model. MapReduce differs
from the traditional HPC model in that it does not distinguish
computation nodes and storage nodes so each node is responsible
for both computation and storage. Obvious advantages include
better fault tolerance, scalability and data locality scheduling. The
MapReduce model has been applied to life science applications by
many researchers. Qiu et al. [15] describe their work to implement
various clustering algorithms using MapReduce.
AutoDock [13] is a suite of automated docking tools for
predicting the bound conformations of flexible ligands to
macromolecular targets. It is designed to predict how small
molecules of substrates or drug candidates bind to a receptor of
known 3D structure. Running AutoDock requires several pre-
docking steps, e.g., ligand and receptor preparation, and grid map
calculations, before the actual docking process can take place.
There are desktop GUI tools for processing the individual
AutoDock steps, such as AutoDockTools (ADT) [13] and BDT
[19], but they do not have the capability to efficiently process
thousands to millions of docking processes. Ultimately, the goal
of a docking experiment is to illustrate the docked result in the
context of the macromolecule, explaining the docking in terms of the
overall energy landscape. Each AutoDock calculation results in a
docking log file containing information about the best docked
ligand conformation found from each of the docking runs
specified in the docking parameter file (dpf). The results can then
be summarized interactively using the desktop tools such as
AutoDockTools or with a python script. A typical AutoDock
based virtual screening consists of a large number of docking
processes from multiple targeted ligands and would take a large
amount of time to finish. However, the docking processes are
data independent, so if several CPU cores are available, these
processes can be carried out in parallel to shorten the overall
makespan of multiple AutoDock runs.
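Since the docking runs are mutually independent, they can be launched
concurrently even on a single multi-core node. The following minimal
Python sketch (not part of the paper's implementation) illustrates this,
assuming the standard autodock4 command-line interface with a docking
parameter file (-p) and a docking log file (-l); the file names are
illustrative only.

# Minimal sketch: run independent AutoDock dockings in parallel on one node.
# Assumes the autodock4 CLI ("-p <dpf>", "-l <dlg>"); paths are illustrative.
import subprocess
from concurrent.futures import ProcessPoolExecutor

def dock_one(ligand):
    """Run a single docking; each run reads its own .dpf and writes its own .dlg."""
    subprocess.run(
        ["autodock4", "-p", f"{ligand}.dpf", "-l", f"{ligand}.dlg"],
        check=True,
    )
    return ligand

def dock_all(ligands, workers=8):
    """Dock many ligands concurrently; 'workers' matches the available cores."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(dock_one, ligands))

A MapReduce runtime generalizes exactly this pattern across many nodes,
which is what the rest of the paper builds on.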
Workflow based approaches can also be used to run multiple
AutoDock instances; however, MapReduce runtime can automate
data partitioning for parallel execution. Therefore our paper
focuses on extending the MapReduce model for parallel execution
of applications across multiple clusters.
Cloud computing can provide scalable computational and storage
resources as needed. With the correct application model and
implementation, clouds enable applications to scale out with
relative ease. Because of the “pleasantly parallel” nature of the
MapReduce programming model, it has become a popular model
for deploying and executing applications in a cloud, and running
multiple AutoDock jobs certainly fits well for MapReduce.
However, many researchers still shy away from clouds for
different reasons. For example, some researchers may not feel
comfortable letting their data sit in shared storage space with
users worldwide, while others may have large amounts of data
and computation that would be financially too expensive to move
into the cloud. It is more typical for a researcher to have access to
several research clusters hosted at his/her lab or institute. These
clusters usually consist of only a few nodes, and the nodes in one
cluster may be very different from those in another cluster in
terms of various specifications including CPU frequency, number
of cores, cache size, memory size, and storage capacity.
Commonly a MapReduce framework is deployed in a single
cluster to run jobs, but any such individual cluster does not
provide enough resources to deliver significant performance gain.
For example, at Indiana University we have access to IU Quarry,
FutureGrid [5], and Teragrid [17] clusters but each cluster
imposes a limit on the maximum number of nodes a user can use at
any time. If these isolated clusters can work together, they
collectively become more powerful.
Unfortunately, users cannot directly deploy a MapReduce
framework such as Hadoop on top of these clusters to form a
single larger MapReduce cluster. Typically the internal nodes of a
cluster are not directly reachable from outside. However,
MapReduce requires the master node to directly communicate
with any slave node, which is also one of the reasons why
MapReduce frameworks are usually deployed within a single
cluster. Therefore, one challenge is to make multiple clusters act
collaboratively as one so that they can run MapReduce jobs more efficiently.
There are two possible approaches to address this challenge. One
is to unify the underlying physical clusters as a single virtual
cluster by adding a special infrastructure layer, and run
MapReduce on top of this virtual cluster. The other is to make the
MapReduce framework work directly with multiple clusters
without requiring any additional infrastructure layer.
We propose a hierarchical MapReduce framework which takes
the second approach to gather isolated cluster resources into a
more capable one for running MapReduce jobs. Kavulya et al.
characterize MapReduce jobs into four categories based on their
execution patterns: map-only, map-mostly, shuffle-mostly, and
reduce-mostly, and also find that 91% of the MapReduce jobs
they have surveyed fall into the map-only and map-mostly
categories [10]. Our framework partitions and distributes
MapReduce jobs from these two categories (map-only and map-
mostly) into multiple clusters to perform map-intensive
computation, and collects and combines the outputs in the global
node. Our framework also achieves load-balancing by assigning
different task loads to different clusters based on the cluster size,
current load, and specifications of the nodes. We have
implemented the prototype framework using Apache Hadoop.
The rest of the paper is organized as follows. Section 2 presents
some related works. Section 3 gives an overview of our
hierarchical MapReduce framework. Section 4 presents more
details on the multiple AutoDock runs using MapReduce. Section
5 gives experiment setup and result analysis. The conclusion and
future work are given in Section 6.
2. RELATED WORKS
Researchers have put significant effort into the easy submission
and optimal scheduling of massively parallel jobs in clusters, grids,
and clouds. Conventional job schedulers, such as Condor [12],
SGE [6], PBS [8], LSF [23], etc., aim to provide highly optimized
resource allocation, job scheduling, and load balancing, within a
single cluster environment. On the other hand, grid brokers and
metaschedulers, e.g., Condor-G [4], CSF [3], Nimrod/G[1],
GridWay [9], provide an entry point to multi-cluster grid
environments. They enable transparent job submission to various
distributed resource management systems, without worrying about
the locality of execution and available resources there. With
respect to the AutoDock based virtual screening, our earlier
efforts presented at National Biomedical Computation Resource
(NBCR) [14] Summer Institute 2009, addressed the performance
issue of massive docking processes by distributing the jobs to the
grid environment. We used the CSF4 [3] meta-scheduler to split
docking jobs to heterogeneous clusters where these jobs were
handled by local job schedulers including LSF, SGE and PBS.
Clouds give users a notion of virtually unlimited, on-demand
resources for computation and storage. Owing to its ease of
executing pleasantly parallel applications, MapReduce has
become a dominant programming model for running applications
in a cloud. Researchers are discovering new ways to make
MapReduce easier to deploy and manage, more efficient and
scalable, and also more able to accomplish complex data
processing tasks. Hadoop On Demand (HOD) [7] uses the
TORQUE resource manager [16] to provision and manage
independent MapReduce and HDFS instances on shared physical
nodes. The authors of [21] have identified some fundamental
performance limitation issues in Hadoop and in the MapReduce
model in general which make job response time unacceptably
long when multiple jobs are submitted; by substituting their own
scheduler implementation, they are able to overcome these
limitations and improve the job throughput. CloudBATCH [22] is
a prototype job queuing mechanism for managing and dispatching
MapReduce jobs and commandline serial jobs in a uniform way.
Traditionally a cluster must separate MapReduce-enabled nodes
because they are dedicated to MapReduce jobs and cannot run
serial jobs. But CloudBATCH uses HBase to keep various
metadata on each job and also uses Hadoop to wrap commandline
serial jobs as MapReduce jobs, so that both types of jobs can be
executed using the same set of cluster nodes. The Map-Reduce-
Merge model extends the conventional MapReduce model to
accomplish common relational algebra operations over distributed
heterogeneous data sets [20]. In this extension, the Merge phase
is a new concept that is more complex than the regular Map and
Reduce phases, and requires the learning and understanding of
several new components, including partition selector, processors,
merger, and configurable iterators. This extension also modifies
the standard MapReduce phase to expose data sources to support
some relational algebra operations in the Merge phase.
Sky Computing [11] provides end users with a virtual cluster
interconnected with ViNe [18] across different domains. It aims to
bring convenience by hiding the underlying details of the physical
clusters. However, this transparency may cause unbalanced
workload if a job is dispatched over heterogeneous compute nodes
among different physical domains.
Our hierarchical MapReduce framework aims to enable map-only
and map-mostly jobs to be run across a number of isolated clusters
(even virtual clusters), so these isolated resources can collectively
provide a more powerful resource for the computation. It can
easily achieve load balance because the different clusters are
visible to the scheduler in our framework.
3. HIERARCHICAL MAPREDUCE
The hierarchical MapReduce framework we present in this paper
consists of two layers. The top layer has a global controller that
accepts user-submitted MapReduce jobs and distributes them across
different local cluster domains. Upon receiving a user job, the global
controller divides the job into sub-jobs according to the capability of
each local cluster. If the input data has not been deployed onto the
clusters already, the global controller also partitions the input data
and transfers the partitions to these clusters. After the sub-jobs are
finished on all clusters, the global controller collects their outputs
and performs a final reduction using the global reducer, which is also
supplied by the user. The bottom layer consists of multiple local
clusters; each receives sub-jobs and input data partitions from the
global controller, performs the local MapReduce computation, and sends
the results back to the global controller.
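The paper does not give code for the controller, but the overall flow can
be sketched as follows. This is only an illustration under assumed helper
callables (stage_in, submit_job, collect_output, global_reduce) that would
wrap the actual data transfer and Hadoop job submission; none of these
names come from the paper, and the capability-proportional split is
formalized later in Section 3.3.

# Minimal sketch of the global controller flow described above.
# All names (Cluster, stage_in, submit_job, collect_output, global_reduce)
# are illustrative assumptions, not the paper's actual implementation.
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    capability: float   # e.g., available mappers scaled by a per-core speed factor

def split_records(records, clusters):
    """Partition input records among clusters in proportion to their capability."""
    total = sum(c.capability for c in clusters)
    partitions, start = {}, 0
    for i, c in enumerate(clusters):
        share = round(len(records) * c.capability / total)
        end = len(records) if i == len(clusters) - 1 else start + share
        partitions[c.name] = records[start:end]
        start = end
    return partitions

def run_job(records, clusters, stage_in, submit_job, collect_output, global_reduce):
    parts = split_records(records, clusters)
    for c in clusters:                 # stage data and launch local MapReduce sub-jobs
        stage_in(c, parts[c.name])
        submit_job(c)
    local_results = [collect_output(c) for c in clusters]   # wait and stage results out
    return global_reduce(local_results)                     # final reduction at the top layer

In the actual framework these steps correspond to the data transferer, the
job scheduler notifying each cluster's job manager, and the user-supplied
global reducer described in Section 3.1.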
Although on the surface our framework may appear structurally similar to
the Map-Reduce-Merge model presented in [20], our framework is very
different in nature. As discussed in the related work section, the Merge
phase is a new concept introduced in the Map-Reduce-Merge model which is
different and more complex than the conventional Map and Reduce, and
programmers implementing jobs under this model must not only learn this
new concept along with the components required by it, but also need to
modify the Mappers and Reducers to expose the data source. Our framework,
on the other hand, strictly uses the conventional Map and Reduce, and a
programmer just needs to supply two Reducers – one local Reducer and one
global Reducer – instead of just one for the regular MapReduce. The only
requirement is that the formats of the local Reducer output key/value
pairs must match those of the global Reducer input key/value pairs.
However, if the job is map-only, the programmer does not need to supply
any reducer, and the global controller simply collects the map results
from all clusters and places them under a common directory.

3.1 Architecture
Figure 1 is a high-level architecture diagram of our hierarchical
MapReduce framework. The top layer in our framework is the global
controller, which consists of a job scheduler, a data transferer, a
workload collector, and a user-supplied global reducer. The bottom layer
consists of multiple clusters for running the distributed local MapReduce
jobs, where each cluster has a workload reporter and a job manager. The
MapReduce master node and the compute nodes inside each of the clusters
are not accessible from the outside.

Figure 1. Hierarchical MapReduce Architecture

When a user submits a MapReduce job to the global controller, the job
scheduler splits the job into a number of sub-jobs and assigns them to
each local cluster based on several factors, including the current
workload reported by the workload reporter of each local cluster, as well
as the capability of the individual nodes making up each cluster. This is
done to achieve load balance by ensuring that all clusters will finish
their portion of the job in approximately the same time. The global
controller also partitions the input data in proportion to the sub-jobs
and sends the data partitions to the clusters. If the input data have not
been deployed before-hand, the data transferer transfers the user-supplied
MapReduce jar and job configuration files together with the input data
partitions to the clusters. As soon as the data transfer finishes for a
particular cluster, the job scheduler at the global controller notifies
the job manager of that cluster to start the local MapReduce job. Since
data transfer is very expensive, we recommend that users only use the
global controller to transfer data when the size of the input data is
small and the time spent transferring the data is insignificant compared
to the computation time. For large data sets, it would be more efficient
and effective to deploy them before-hand, so that the jobs get the full
benefit of parallelization and the overall time is not dominated by data
transfer. After the sub-jobs are finished on a local cluster, if the
application requires, the cluster transfers the output back to the global
controller. Upon receiving all the output data from all local clusters,
the global reducer is invoked to perform the final reduction task, unless
the original job is map-only.

3.2 Programming Model
The programming model of our hierarchical MapReduce framework is the
“Map-Reduce-GlobalReduce” model, where computations are expressed as three
functions: Map, Reduce, and Global Reduce. We use the term “Global Reduce”
to distinguish it from the “local” Reducer, but conceptually a Global
Reducer is just another Reducer, and syntactically it is a conventional
Reducer as well. The Mapper, just as a conventional Mapper does, takes an
input pair and produces intermediate key/value pairs; likewise, the
Reducer, just as a conventional Reducer does, takes an intermediate input
key and a set of corresponding values produced by the Map tasks, and
outputs a different set of key/value pairs. Both the Mapper and the
Reducer are executed on the local clusters. The Global Reducer is executed
on the global controller using the output from the local clusters.
Table 1 lists these 3 functions and their input and output data types.
The formats of the local Reducer output key/value pairs must match those
of the Global Reducer input key/value pairs.

Table 1. Input and output types of Map, Reduce, and Global Reduce
functions
Function Name  | Input                 | Output
Map            | (k1, v1)              | (k2, v2)
Reduce         | (k2, [v2, ..., v2])   | (k3, v3)
Global Reduce  | (k3, [v3, ..., v3])   | (k4, v4)

Figure 2 uses a tree-like structure to show the data flow sequence among
the Map, Reduce, and Global Reduce functions. In this diagram, the root
node is the global controller on which the Global Reduce takes place, and
the leaf nodes represent the local clusters that perform the Map and
Reduce functions. The circled numbers in Figure 2 indicate the order in
which the steps occur, and the arrows indicate the directions in which the
data sets (key/value pairs) flow. A job is submitted into the system in
Step 1, and the input key/value pairs are passed from the root node
(global controller) to the child nodes (local clusters) in Step 2, where
Map tasks are launched at the local clusters; each Map task consumes an
input key/value pair and produces a set of intermediate key/value pairs.
In Step 3, the sets of intermediate key/value pairs are passed to the
Reduce tasks, which are also launched at the local clusters. Each Reduce
task consumes an intermediate key with a set of corresponding values and
produces yet another set of key/value pairs. In Step 4, the local reduce
outputs are sent back to the global controller to perform the Global
Reduce task. The Global Reduce task takes a key and the set of
corresponding values that were originally produced by the local Reducers,
performs the computation, and produces the final result in Step 5.

Theoretically, the model we present can be extended to more than just two
hierarchical layers, i.e., the tree structure in Figure 2 can have more
depth by turning the leaf clusters into intermediate controllers similar
to the global controller, each of which would further divide its assigned
jobs and run them on its own set of children clusters. But for all
practical purposes, we do not see a need for more than two layers for the
foreseeable future, because each additional layer would increase the
complexity as well as the overhead. If a researcher has a large number of
small clusters available, it is most likely more efficient to use them all
to create a broader bottom layer than to increase the depth.

Figure 2. Programming Model

3.3 Job Scheduling and Data Partitioning
The main challenge of our work is how to balance the workload among the
local MapReduce clusters, which is closely tied to how the datasets are
partitioned. The input dataset for a particular MapReduce job may be
either submitted by the user to the global controller before execution,
or pre-deployed on the local clusters and exposed via a metadata catalog
to the user who runs the MapReduce job. The scheduler on the global
controller takes the data locality into consideration when partitioning
the datasets and scheduling the job.

In this paper, we focus on the situation where the input dataset is
submitted by the user. If the user manually split the dataset and ran
separate sub-jobs on different clusters, it would be time consuming and
error-prone. Our global controller is able to automatically count the
total number of records in the input dataset using the user-implemented
InputFormat and RecordReader, and it divides the dataset and assigns the
correct number of map tasks to each cluster.

We make the assumption that all map tasks of a MapReduce application are
computation intensive and take approximately the same amount of time to
run – this is a reasonable assumption, as we will see in the next section
that applying MapReduce to running multiple AutoDock instances displays
exactly this kind of behavior. The scheduling algorithm we use for our
framework is as follows. Let MaxMapper_i be the maximum number of Mappers
that can run concurrently on Cluster_i; MapperRun_i be the number of
Mappers currently running on Cluster_i; MapperAvail_i be the number of
available Mappers on Cluster_i; and NumCore_i be the total number of CPU
cores on Cluster_i, where i is the cluster number and i ∈ {1, ..., n}. We
also use ρ_i to define how many map tasks a user assigns to each core,
that is,

    MaxMapper_i = ρ_i × NumCore_i                                      (1)

Normally we set ρ_i = 1 in the local MapReduce clusters for computation
intensive jobs, so we get

    MapperAvail_i = MaxMapper_i − MapperRun_i                          (2)

For simplicity, let

    γ_i = MapperAvail_i                                                (3)

The weight of each sub-job can be calculated as

    Weight_i = (γ_i × θ_i) / Σ_{j=1}^{N} (γ_j × θ_j)                   (4)

where the factor θ_i is the computing power of each cluster, e.g., the CPU
speed, memory size, storage capacity, etc. The actual θ_i varies depending
on the characteristics of the jobs, i.e., whether they are computation
intensive or I/O intensive.

Let JobMap_x be the total number of Map tasks for a particular job x,
which can be calculated from the number of keys in the input to the Map
tasks, and JobMap_{x,i} be the number of Map tasks to be scheduled to
Cluster_i for job x, so that

    JobMap_{x,i} = Weight_i × JobMap_x                                 (5)

After partitioning the MapReduce job into sub-MapReduce jobs using
equation (5), we move the data items of the datasets accordingly, either
from the global controller to the local clusters, or from local cluster to
local cluster.

4. AUTODOCK MAPREDUCE
We apply the MapReduce paradigm to running multiple AutoDock instances
using the hierarchical MapReduce framework to prove the feasibility of our
approach. We take the outputs of AutoGrid (one tool in the AutoDock suite)
as input to AutoDock. The key/value pairs of the input of the Map tasks
are ligand names and the locations of the ligand files. We designed a
simple input file format for AutoDock MapReduce jobs. Each input record,
which contains the 7 fields shown in Table 2, corresponds to a map task.

Table 2. AutoDock MapReduce input fields and descriptions
Field                 | Description
ligand_name           | Name of the ligand
autodock_exe          | Path to AutoDock executable
input_files           | Input files of AutoDock
output_dir            | Output directory of AutoDock
autodock_parameters   | AutoDock parameters
summarize_exe         | Path to summarize script
summarize_parameters  | Summarize script parameters

For our AutoDock MapReduce, the Map, Reduce, and Global Reduce functions
are implemented as follows:
1) Map: The Map task takes a ligand, runs the AutoDock binary executable
against a shared receptor, and then runs a Python script
(summarize_result4.py) to output the lowest energy result using a constant
intermediate key.
2) Reduce: The Reduce task takes all the values corresponding to the
constant intermediate key, sorts the values by the energy from low to
high, and outputs the sorted results as the local reduce output.
3) Global Reduce: The Global Reduce finally takes all the values of the
local reduce outputs, sorts them, and combines them into a single file,
ordered by energy from low to high.

5. EVALUATIONS
We evaluate our model by prototyping a Hadoop-based hierarchical MapReduce
system. The system is written in Java and shell scripts. We use ssh and
scp scripts to perform the data stage-in and stage-out. On the local
clusters' side, the workload reporter is a component that exposes Hadoop
cluster load information to be accessed by the global scheduler. Our
original design was to make it a separate program without touching the
Hadoop code. Unfortunately, Hadoop does not expose the load information we
need to external applications, and we had to modify the Hadoop code to add
an additional daemon that collects load data using the Hadoop Java APIs.

In our evaluation, we use several clusters, including the IU Quarry
cluster and two clusters in FutureGrid. IU Quarry is a classic HPC cluster
which has several login nodes that are publicly accessible from outside.
After a user logs in, he/she can do various job-related tasks, including
job submission, job status query and job cancellation. The computation
nodes, however, cannot be accessed from outside. Several distributed file
systems (Lustre, GPFS) are mounted to each computation node for storing
input data accessed by the jobs. FutureGrid partitions the physical
cluster into several parts, each of which provides a different testbed
such as Eucalyptus, Nimbus, and HPC.

Table 3. Cluster Node Specification
Cluster | CPU                | Cache size | Memory
Hotel   | Intel Xeon 2.93GHz | 8192 KB    | 24GB
Alamo   | Intel Xeon 2.67GHz | 8192 KB    | 12GB
Quarry  | Intel Xeon 2.33GHz | 6144 KB    | 16GB

To deploy Hadoop to traditional HPC clusters, we first use the built-in
job scheduler (PBS) to allocate nodes. To balance maintainability and
performance, we install the Hadoop program in a shared directory while
storing data in local directories, because the Hadoop program (Java jar
files, etc.) is loaded only once by the Hadoop daemons whereas the HDFS
data is accessed multiple times.

We use three clusters for evaluations – IU Quarry, FutureGrid Hotel and
FutureGrid Alamo. Each cluster has 21 nodes, all running Linux 2.6.18 SMP.
Within each cluster, one node is the dedicated master node (HDFS namenode
and MapReduce jobtracker) and the other nodes are data nodes and task
trackers. Each node in these clusters has an 8-core CPU. The
specifications of these cluster nodes are listed in Table 3.

Considering AutoDock being a CPU-intensive application, we set ρ_i = 1 per
Section 3.3, so that the maximum number of map tasks on each node is equal
to the number of cores on the node. The version of AutoDock we use is 4.2,
which is the latest stable version. The global controller does not care
about low-level execution details because the local job managers hide the
complexity.

In our experiments, we use 6,000 ligands and 1 receptor. One of the most
important configuration parameters is ga_num_evals, the number of energy
evaluations: the larger its value, the higher the probability that better
results may be obtained. Based on prior experience, ga_num_evals is
typically set from 2,500,000 to 5,000,000. We configure it to 2,500,000 in
our experiments.

Figure 3. Number of running map tasks for an AutoDock MapReduce instance

Figure 3 plots the number of running map tasks within one cluster during
the job execution. The cluster has 20 data nodes and task trackers, so the
maximum number of running map tasks at any moment is 20 * 8 = 160. From
the plot, we can see that the number of running map tasks quickly grows to
160 in the beginning and stays approximately constant for a long time.
Towards the end of job execution, it drops to a small value quickly
(roughly 0 - 5). Notice there is a tail near the end, indicating that the
node usage ratio is low. At this moment, if new MapReduce tasks come in,
the available mappers will be occupied by those new tasks.

Table 4. MapReduce Execution Time on Three Clusters under different
numbers of map tasks
Number of Map Tasks Per Cluster | Hotel (seconds) | Alamo (seconds) | Quarry (seconds)
100                             | 1004            | 821             | 1179
500                             | 1763            | 1771            | 2529
1000                            | 2986            | 2962            | 4370
1500                            | 4304            | 4251            | 6344
2000                            | 5942            | 5849            | 8778

Test Case 1: Our first test case is a base test case without involving the
global controller, to find out how each of our local Hadoop clusters
performs under different numbers of map tasks. We ran AutoDock in Hadoop
to process 100, 500, 1000, 1500 and 2000 ligand/receptor pairs in each of
the three clusters. See Table 4 for the results.

As is reflected in Figure 4, the total execution time vs. the number of
map tasks in test case 1 is close to linear on each cluster, regardless of
the startup overhead of the MapReduce jobs. The total execution time of
the jobs running on the Quarry cluster is approximately 50% longer than on
Alamo and Hotel. The main reason is that the nodes of the Quarry cluster
have slower CPUs compared with those of Alamo and Hotel.

Figure 4. Local cluster MapReduce execution time based on different
numbers of map tasks

Test Case 2: Our second test case shows the performance of executing
MapReduce jobs with γ-weighted partitioned datasets on the different
clusters, based on the following parameter setup. For equation (4) from
Section 3.3, we set θ_i = C, where C is a constant and i ∈ {1,2,3} for our
three clusters. Our calculation shows γ_1 = γ_2 = γ_3 = 160, given no
MapReduce jobs running beforehand. Therefore, the weight of each cluster
is Weight_i = 1/3. We then equally partition the dataset (apart from the
shared dataset, which is staged together with the jar executable and job
configuration file) into 3 pieces and stage the data to the local clusters
for execution in parallel. After the local MapReduce executions, the
output files are staged back to the global controller for the final global
reduce. Figure 5 shows the data movement cost in the stage-in and
stage-out contexts.

Figure 5. Two-way data movement cost of γ-weighted partitioned datasets:
local MapReduce inputs and outputs

The input dataset of AutoDock contains 1 receptor and 6000 ligands. The
receptor is described as a set of gridmap files totaling 35MB in size, and
the 6000 ligands are stored in 600 separate directories, each of which is
approximately 5-6 KB large. In addition, the executable jar and job
configuration file together have a total of 300KB in size. For each
cluster, the global controller packs 1 receptor file set, 200 ligand
directories, the executable jar, and the job configuration file into a
tarball and transfers it to the destination cluster, where the tarball is
decompressed. We call this global-to-local procedure "data stage-in."
Similarly, when the local MapReduce jobs finish, the output files together
with control files (typically 300-500KB in all) are compressed into a
tarball and transferred back to the global controller. We call this
local-to-global procedure "data stage-out."

As we can see from Figure 5, the data stage-in procedure takes 13.88 to
17.3 seconds to finish, while the data stage-out procedure takes 2.28 to
2.52 seconds to finish. The Alamo cluster takes a little longer to
transfer the data, but the difference is insignificant compared to the
relatively long duration of the local MapReduce executions.

The time it takes to run 2000 map tasks on each of the local MapReduce
clusters varies due to the different specifications of the clusters. The
local MapReduce execution makespan, including the data movement costs
(both data stage-in and stage-out), is shown in Figure 6. The Hotel and
Alamo clusters take a similar amount of time to finish their jobs, but the
Quarry cluster takes approximately 3,000 more seconds to finish, about 50%
more than Hotel and Alamo. The Global Reduce task is invoked only after
all the local results are ready in the global controller, and it takes
only 16 seconds to finish. Thus, the relatively poor performance of the
Quarry cluster becomes the bottleneck of the current job distribution.

Figure 6. Local MapReduce turnaround time of γ-weighted partitioned
datasets, including data movement cost

Test Case 3: In our third test case, we evaluate the performance of
executing MapReduce jobs with γθ-weighted partitioned datasets on the
different clusters, based on the following setup. From test cases 1 and 2,
we have observed that although all clusters are assigned the same number
of compute nodes and cores and the same amount of data, they take
significantly different amounts of time to finish. Among the three
clusters, Quarry is much slower than Alamo and Hotel. The specifications
of the cores on Quarry, Alamo and Hotel are Intel(R) Xeon(R) E5410 2GHz,
Intel(R) Xeon(R) X5550 2.67GHz, and Intel(R) Xeon(R) X5570 2.93GHz,
respectively. The inverse ratio of the CPU frequencies and the ratio of
the processing times match roughly, so we hypothesize that the difference
in processing time is mainly due to the different core frequencies.
Therefore, it is not enough to merely factor in the number of cores for
load balancing; the computation capability of each core is also important.
We refine our scheduling policy to add CPU frequency as a factor when
setting θ_i. Here we set θ_i to 2.93 for Hotel, 2.67 for Alamo, and 2 for
Quarry. As in test case 2, we again have γ_1 = γ_2 = γ_3 = 160, given no
MapReduce jobs running beforehand. Thus, the weights are Weight_1 = 0.386,
Weight_2 = 0.3505, and Weight_3 = 0.2635 for Hotel, Alamo and Quarry
respectively. The dataset is also partitioned according to the new
weights. Table 5 shows how the dataset is partitioned.

Table 5. Number of Map Tasks and MapReduce Execution Time on Each Cluster
Cluster | Number of Map Tasks | Execution Time (Seconds)
Hotel   | 2316                | 5915
Alamo   | 2103                | 5888
Quarry  | 1581                | 6395

Figure 7 shows the data movement cost in the γθ-weighted scenario. The
variations in the sizes of the tarballs of the different ligand sets are
quite small (less than 2MB). As we can see from the graph, the data
stage-in procedure takes 12.34 to 17.64 seconds to finish, while the data
stage-out procedure takes 2.2 to 2.6 seconds to finish. Alamo takes a
little bit longer to transfer the data, but the difference is again
insignificant given the relatively long duration of the local MapReduce
executions, as in the previous test case.

Figure 7. Two-way data movement cost of γθ-weighted partitioned datasets:
local MapReduce inputs and outputs

Figure 8. Local MapReduce turnaround time of γθ-weighted partitioned
datasets, including data movement cost

With the weighted partition, the local MapReduce turnaround makespan,
including the data movement costs (both data stage-in and stage-out), is
shown in Figure 8. All three clusters take a similar amount of time to
finish the local MapReduce jobs. We can see that our refined scheduler
configuration improves performance by balancing the workload among the
clusters. The final global reduction combines the partial results from the
lower-level clusters and sorts the results. The average global reduce time
taken after processing 6000 map tasks (ligand/receptor dockings) is 16
seconds.

6. CONCLUSION AND FUTURE WORK
In this paper, we have presented a hierarchical MapReduce framework that
can gather computation resources from different clusters and run MapReduce
jobs across them. The applications implemented in this framework adopt the
“Map-Reduce-GlobalReduce” model, where computations are expressed as three
functions: Map, Reduce, and Global Reduce. The global controller in our
framework splits the data set and maps the partitions onto multiple
“local” MapReduce clusters to run the Map and Reduce functions, and the
local results are returned to the global controller to run the Global
Reduce function. We use a resource capacity-aware algorithm to balance the
workload among clusters. We use multiple AutoDock runs as a test case to
evaluate the performance of our framework. The results show that the
workloads are well balanced and the total makespan is kept to a minimum.

There are several potential improvements we will address in our future
work. Based on the compute-intensive nature of the application, our
scheduling algorithm only takes the CPU specifications into consideration.
This will not be the case when an application has larger data sets and
data movement becomes significant; other scheduling metrics such as disk
I/O and network I/O need to be considered. The remote job submission and
data movement in our current prototype are built upon the combination of
ssh and scp, which may not work well in a heterogeneous environment.
However, they can be replaced by other solutions. One possible solution
for remote job submission is to integrate our framework with a
meta-scheduler, e.g., CSF and Nimrod/G. Data movement can also be switched
to solutions that are more scalable and work well in heterogeneous
environments, such as gridftp. As an alternative to transferring data
explicitly from site to site, we will also explore the feasibility of
using a shared file system to share data sets among the global controller
and the local Hadoop clusters.

7. ACKNOWLEDGMENTS
This work is funded in part by the Pervasive Technology Institute and
Microsoft. Our special thanks to Dr. Geoffrey Fox for providing us early
access to FutureGrid resources and valuable feedback on our work. We also
would like to express our thanks to Chathura Herath for discussions.

8. REFERENCES
[1] Buyya, R., Abramson, D., Giddy, J. Nimrod/G: an architecture for a
resource management and scheduling system in a global computational
grid, in: Proceedings of HPC ASIA'2000, China, IEEE CS Press, USA,
2000.
[2] Dean, J. and Ghemawat, S. 2008. MapReduce: simplified data processing
on large clusters. Commun. ACM 51, 1 (January 2008), 107-113.
DOI=10.1145/1327452.1327492
http://doi.acm.org/10.1145/1327452.1327492
[3] Ding, Z., Wei, X., Luo, Y., Ma, D., Arzberger, P. W., Li, W. W.
Customized Plug-in Modules in Metascheduler CSF4 for Life Sciences
Applications, New Generation Computing, Volume 25, Number 4, 373-394,
2007. DOI=10.1007/s00354-007-0024-6
http://dx.doi.org/10.1007/s00354-007-0024-6
[4] Frey, J., Tannenbaum, T., Livny, M., Foster, I., Tuecke, S.
2002. Condor-G: A Computation Management Agent for
Multi-Institutional Grids. Cluster Computing 5, 3 (July
2002), 237-246. DOI=10.1023/A:1015617019423
http://dx.doi.org/10.1023/A:1015617019423
[5] FutureGrid, http://www.futuregrid.org
[6] Gentzsch, W. (Sun Microsystems). 2001. Sun Grid Engine:
Towards Creating a Compute Power Grid. In Proceedings of
the 1st International Symposium on Cluster Computing and
the Grid (CCGRID '01). IEEE Computer Society,
Washington, DC, USA, 35-39
[7] Hadoop On Demand,
http://hadoop.apache.org/common/docs/r0.17.2/hod.html
[8] Henderson, R. L.. 1995. Job Scheduling Under the Portable
Batch System. In Proceedings of the Workshop on Job
Scheduling Strategies for Parallel Processing (IPPS '95),
Dror G. Feitelson and Larry Rudolph (Eds.). Springer-
Verlag, London, UK, 279-294.
[9] Huedo, E., Montero, R. S., and Llorente, I. M. 2004. A
framework for adaptive execution in grids. Softw. Pract.
Exper. 34, 7 (June 2004), 631-651. DOI=10.1002/spe.584
http://dx.doi.org/10.1002/spe.584
[10] Kavulya, S., Tan, J., Gandhi, R., and Narasimhan, P. 2010.
An Analysis of Traces from a Production MapReduce
Cluster. In Proceedings of the 2010 10th IEEE/ACM
International Conference on Cluster, Cloud and Grid
Computing (CCGRID '10). IEEE Computer Society,
Washington, DC, USA, 94-103.
DOI=10.1109/CCGRID.2010.112
http://dx.doi.org/10.1109/CCGRID.2010.112
[11] Keahey, K., Tsugawa, M., Matsunaga, A., and Fortes, J.
2009. Sky Computing. IEEE Internet Computing 13, 5
(September 2009), 43-51. DOI=10.1109/MIC.2009.94
http://dx.doi.org/10.1109/MIC.2009.94
[12] Litzkow, M. J., Livny, M., Mutka, M. W. Condor - A Hunter
of Idle Workstations. ICDCS 1988:104-111
[13] Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F.,
Belew, R. K., Goodsell, D. S. and Olson, A. J. (2009),
AutoDock4 and AutoDockTools4: Automated docking with
selective receptor flexibility. Journal of Computational
Chemistry, 30: 2785–2791. doi: 10.1002/jcc.21256
[14] National Biomedical Computation Resource, http://nbcr.net
[15] Qiu, J., Ekanayake, J., Gunarathne, T., Choi, J. Y., Bae, S.
Ruan, Y., Ekanayake, S., Wu, S., Beason, S., Fox, G., Rho,
M., Tang, H., “Data Intensive Computing for
Bioinformatics”, In Data Intensive Distributed Computing,
IGI Publishers, 2010
[16] Staples, G. 2006. TORQUE resource manager. In
Proceedings of the 2006 ACM/IEEE Conference on
Supercomputing (SC '06). ACM, New York, NY, USA,
Article 8. DOI=10.1145/1188455.1188464
http://doi.acm.org/10.1145/1188455.1188464
[17] Teragrid, http://www.teragrid.org
[18] Tsugawa, M., and Fortes, J. A. B. 2006. A virtual network
(ViNe) architecture for grid computing. In Proceedings of the
20th International Conference on Parallel and Distributed
Processing (IPDPS'06). IEEE Computer Society,
Washington, DC, USA, 148-148.
[19] Vaqué, M., Arola, A., Aliagas, C., and Pujadas, G. 2006.
BDT: an easy-to-use front-end application for automation of
massive docking tasks and complex docking strategies with
AutoDock. Bioinformatics 22, 14 (July 2006), 1803-1804.
DOI=10.1093/bioinformatics/btl197
http://dx.doi.org/10.1093/bioinformatics/btl197
[20] Yang, H., Dasdan, A., Hsiao, R., and Parker, D. S. 2007.
Map-Reduce-Merge: Simplified Relational Data Processing
on Large Clusters. In Proceedings of the 2007 ACM
SIGMOD International Conference on Management of Data
(SIGMOD '07). ACM, New York, NY, USA, 1029-1040.
DOI=10.1145/1247480.1247602
http://doi.acm.org/10.1145/1247480.1247602
[21] Zaharia, M., Borthakur, D, Sarma, J. S., Elmeleegy, K.,
Shenker, S., and Stoica, I. Job Scheduling for Multi-User
MapReduce Clusters, Technical Report UCB/EECS-2009-
55, University of California at Berkeley, April 2009.
[22] Zhang, C., De Sterck, H., "CloudBATCH: A Batch Job
Queuing System on Clouds with Hadoop and HBase," Cloud
Computing Technology and Science, IEEE International
Conference on, pp. 368-375, 2010 IEEE Second
International Conference on Cloud Computing Technology
and Science, 2010
[23] Zhou, S, Zheng, X., Wang, J., and Delisle, P. 1993. Utopia: a
load sharing facility for large, heterogeneous distributed
computer systems. Softw. Pract. Exper. 23, 12 (December
1993), 1305-1336. DOI=10.1002/spe.4380231203
http://dx.doi.org/10.1002/spe.4380231203