Comparison of Cloud Providers


Comparison of Cloud Providers


Presented by Mi Wang

Motivation

Internet-based cloud computing has gained tremendous momentum. A growing number of companies provide public cloud computing services.

How do we compare their performance?

Help customers choose the provider that best fits their performance and cost needs.

Help cloud providers know the right direction for improvements.

A Benchmark for Clouds

Introduction

The goal of benchmarking a software system is to evaluate its average performance under a particular workload.

TPC benchmarks are widely used today in evaluating the performance of computer systems.

The Transaction Processing Performance Council (TPC) is a non-profit organization that defines transaction processing and database benchmarks.

However, these benchmarks are not sufficient for analyzing novel cloud services.

Requirements of a Cloud Benchmark

Features and Metrics

The main advantages of cloud computing are scalability, pay-per-use, and fault-tolerance.

A benchmark for the cloud should test these features and provide appropriate metrics for them.

Requirements of a Cloud Benchmark

Architectures

Clouds may have different service architectures.

A cloud benchmark should be general enough to cover the different architectural variants.

Problems of TPC-W

The TPC-W benchmark specifies an online bookstore that consists of 14 web interactions allowing users to browse, search, display, update, and order the products of the store.

The main measured parameter is the number of web interactions per second (WIPS) that the system can handle.

TPC-W measures cost as the ratio of total cost to maximum WIPS.

Problems of TPC-W

TPC-W is designed for transactional database systems. Cloud systems may not offer the strong consistency constraints it requires.

WIPS is not suited to adaptive and scalable systems. An ideal cloud would compensate for increasing load by adding new processing units, so it is not possible to report a maximum WIPS.

Problems of TPC-W

The cost metric is not applicable to clouds. Different price plans and lot sizes prevent calculating a single $/WIPS number.

TPC-W does not reflect the technical evolution of web applications.

TPC-W lacks adequate metrics for measuring the features of cloud systems such as scalability, pay-per-use, and fault-tolerance.


Ideas for a new benchmark

Features

Should analyze the ability of a dynamic system to adapt to a changing load in terms of scalability and costs.

Should run in different locations.

Should comprise web interactions that resemble the access patterns of Web 2.0-like applications. Multimedia content should also be included.

Ideas for a new benchmark

Configurations: A new benchmark can choose between three different levels of consistency.

Low: All web interactions use only BASE (Basically Available, Soft-State, Eventually Consistent) guarantees.

Medium: The web interactions use a mix of consistency guarantees ranging from BASE to ACID.

High: All web interactions use only ACID guarantees.

Ideas for a new benchmark

Metrics: Scalability

Ideally, clouds should scale linearly and infinitely with a constant cost per web interaction (WI).

Increase the issued WIPS over time and continuously count the WI answered within a given response time. Measure the deviation between issued WIPS and answered WI.

Ideas for a new benchmark

Metrics: Scalability

Correlation coefficient $R^2$:

$R^2 = 1 - \frac{\sum_i (y_i - f_i)^2}{\sum_i (y_i - \bar{y})^2}$

where $y_i$ are the measured (answered) values and $f_i$ the corresponding values of the linear reference function. $R^2 \in (0, 1)$; $R^2 = 1$ indicates perfect linear scaling.

(Figure: issued load $f_i$ versus answered load $y_i$ over time.)

Alternatively, fit a non-linear regression function $f(x) = x^b$, with $b \in (0, 1)$; $b = 1$ indicates perfect linear scaling.
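As a rough illustration (not part of the original slides), the sketch below computes both proposed scalability indicators from a hypothetical load trace: the coefficient of determination between issued and answered load, and the exponent b of a power-law fit. The variable names and sample numbers are made up; numpy is assumed to be available.

```python
# Minimal sketch, assuming per-interval measurements of issued WIPS and
# of web interactions answered within the target response time.
import numpy as np

def r_squared(issued, answered):
    """R^2 of the answered load against the ideal (issued) load."""
    f = np.asarray(issued, dtype=float)    # ideal, perfectly scaling values
    y = np.asarray(answered, dtype=float)  # measured values
    ss_res = np.sum((y - f) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def power_law_exponent(issued, answered):
    """Exponent b of answered ~ issued**b, fitted in log-log space."""
    x = np.log(np.asarray(issued, dtype=float))
    y = np.log(np.asarray(answered, dtype=float))
    b, _ = np.polyfit(x, y, 1)             # slope of the log-log fit is b
    return b

# Made-up example: a system that keeps up until ~800 WIPS, then saturates.
issued = [100, 200, 400, 800, 1600]
answered = [100, 198, 395, 760, 1100]
print(r_squared(issued, answered), power_law_exponent(issued, answered))
```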

Ideas for a new benchmark

Metrics: Cost

Measure the cost in dollars per WIPS.

Price plans might cause variations in the $/WIPS.

Measure the average and standard deviation of the cost per WIPS, assuming price plans are fully utilized.

Ideas for a new benchmark

Metrics: Fault tolerance

A failure is defined as shutting down a certain percentage of the resources used by the application.

Clouds should be able to replace these resources automatically.

Measure the ratio between WIPS in RT (answered within the response time) and issued WIPS.

Goals of CloudCmp

Provide performance and cost information about various cloud providers.

Help a provider identify its under-performing services compared to its competitors.

Provide a fair comparison:

Characterize all providers using the same set of workloads and metrics.

Skip specialized services that only a few providers offer.

Goals of CloudCmp

Reduce measurement overhead and monetary costs:

Periodically measure each provider at different times of day across all its locations.

Comply with cloud providers' use policies.

Cover a representative set of cloud providers.

Method: Select Provider


Amazon AWS



Microsoft Azure



Google
AppEngine



Rackspace
CloudServers

Method: Identify core functionality

Elastic compute cluster

The cluster includes a variable number of virtual instances that run application code.

Persistent storage

The storage service keeps the state and data of an application and can be accessed by application instances through API calls.

Intra-cloud network

The intra-cloud network connects application instances with each other and with shared services.

Wide-area network

The content of an application is delivered to end users through the wide-area network from multiple data centers (DCs) at different geographical locations.

Method: Identify core functionality

Services offered by the providers:

Provider               | Elastic Cluster     | Storage                                  | Wide-area Network
Amazon                 | Xen VM              | SimpleDB (table), S3 (blob), SQS (queue) | 3 Data Centers (2 in US, 1 in EU)
Microsoft              | Azure VM            | XStore (table, blob, queue)              | 6 Data Centers (2 each in US, EU, and Asia)
Google AppEngine       | Proprietary sandbox | DataStore (table)                        | Unpublished number of Google Data Centers
Rackspace CloudServers | Xen VM              | CloudFiles (blob)                        | 2 Data Centers (all in US)

Method: Choose Performance Metrics

Elastic compute cluster

Provides virtual instances that host and run a customer's application code.

Is charged per usage:

IaaS: the time an instance remains allocated

PaaS: the CPU cycles consumed

Elastic: can dynamically scale up and down the number of instances.

Method: Choose Performance Metrics

Metrics to compare the elastic compute cluster:

Benchmark finishing time: how long the instance takes to complete the benchmark tasks

Scaling latency: the time taken by a provider to allocate a new instance after a customer requests it

Cost per benchmark: the cost to complete each benchmark task

Method: Choose Performance Metrics

Persistent Storage

Three common types: table, blob, and queue.

Table storage is designed to store structured data, like a conventional database.

Blob storage is designed to store unstructured blobs such as binary objects.

Queue storage implements a global message queue to pass messages between different instances.

Two pricing models:

Based on the CPU cycles consumed by an operation

Fixed cost per operation

Method: Choose Performance Metrics

Metrics to compare persistent storage:

Operation response time: the time for a storage operation to finish

Time to consistency: the time between when data is written to the storage service and when all reads for the data return consistent and valid results

Cost per operation

Method: Choose Performance Metrics

Intra-cloud Network

Connects a customer's instances among themselves and with the shared services offered by a cloud.

None of the providers charge for traffic within their data centers. Inter-datacenter traffic is charged based on the amount of data.

Metrics to compare the intra-cloud network:

Path capacity: TCP throughput

Latency

Method: Choose Performance Metrics

Wide-area Network

The collection of network paths between a cloud's data centers and external hosts on the Internet.

Metrics to compare the wide-area network:

Optimal wide-area network latency: the minimum latency between testers' nodes and any data center owned by a provider

Implementation: Computation Metrics

Benchmark tasks

A modified set of Java-based benchmark tasks from SPECjvm2008 that satisfies the constraints of all the providers.

Metrics: Benchmark finishing time

Run the benchmark tasks on each of the virtual instance types provided by the clouds, and measure their finishing times.

Run instances of the same benchmark task in multiple threads to test multi-threading performance.
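Purely as an illustration (the slides describe a Java/SPECjvm2008 harness, not this code), here is a rough sketch of how a finishing-time measurement under parallel load could look. Python threads cannot run CPU-bound work in parallel, so this sketch uses worker processes; the prime-counting task and worker counts are made-up stand-ins for the actual benchmark tasks.

```python
# Minimal sketch: wall-clock finishing time of a CPU-bound task run on
# 1..N parallel workers, approximating the multi-threaded benchmark runs.
import time
from concurrent.futures import ProcessPoolExecutor

def cpu_task(limit: int = 200_000) -> int:
    """Stand-in benchmark task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def finishing_time(workers: int) -> float:
    """Time to complete one copy of the task per worker, in seconds."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(cpu_task, [200_000] * workers))
    return time.perf_counter() - start

if __name__ == "__main__":
    for w in (1, 2, 4):
        print(f"{w} workers: {finishing_time(w):.2f} s")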

Implementation: Computation Metrics

Metrics: Cost per benchmark

Multiply the published per-hour price by the finishing time for providers that charge by allocation time.

Use the billing API for providers that charge by CPU cycles.


Metrics: Scaling latency

Repeatedly request new instances and record the request time and the time the instance becomes available.

Divide the latency into two segments to locate the performance bottleneck:

Provisioning latency: request time to powered-on time

Booting latency: powered-on time to available time
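A minimal sketch of this measurement loop, assuming hypothetical hooks request_instance(), is_powered_on() and is_available() that wrap a provider's own API (launching a VM and probing its state); this is not CloudCmp's actual code.

```python
# Minimal sketch: split scaling latency into provisioning latency
# (request -> powered on) and booting latency (powered on -> available).
import time

def measure_scaling_latency(request_instance, is_powered_on, is_available,
                            poll_interval=1.0, timeout=1200.0):
    """Return (provisioning_latency, booting_latency) in seconds."""
    request_time = time.monotonic()
    handle = request_instance()                 # hypothetical asynchronous request

    powered_on_time = None
    while time.monotonic() - request_time < timeout:
        if powered_on_time is None and is_powered_on(handle):
            powered_on_time = time.monotonic()
        if powered_on_time is not None and is_available(handle):
            available_time = time.monotonic()
            return (powered_on_time - request_time,
                    available_time - powered_on_time)
        time.sleep(poll_interval)
    raise TimeoutError("instance did not become available in time")
```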

Implementation: Storage Metrics

Benchmark tasks

Use a Java-based client to test the APIs that get, put, or query data from the service.

Non-Java-based clients are also tested.

Mimic a streaming workload to avoid the potential impact of memory or disk bottlenecks at the client's side.

Implementation: Storage Metrics

Metrics: Response time

The time from when the client instance begins the operation to when the last byte reaches the client.

Metrics: Throughput

The maximum rate that a client instance obtains from the storage service.
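As a rough illustration only, the sketch below times a single download through a hypothetical fetch_blob() wrapper and derives a throughput figure from it; the real measurements stream the payload to avoid client-side bottlenecks.

```python
# Minimal sketch, assuming fetch_blob(key) is a hypothetical wrapper around
# a provider-specific "get" call that returns the object's bytes.
import time

def response_time_and_throughput(fetch_blob, key):
    """Return (seconds until the last byte arrives, bytes per second)."""
    start = time.perf_counter()
    data = fetch_blob(key)          # blocks until the full payload is read
    elapsed = time.perf_counter() - start
    return elapsed, len(data) / elapsed
```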


Implementation: Storage Metrics

Metrics: Time to Consistency

Write an object to a storage service, then repeatedly read the object and measure how long it takes before the correct result is returned.
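A minimal sketch of this probe, assuming hypothetical put_object() and get_object() wrappers around a provider's storage API; the key name and timing parameters are made up.

```python
# Minimal sketch: time between a write and the first read that returns
# the newly written value.
import time
import uuid

def time_to_consistency(put_object, get_object, key="consistency-probe",
                        poll_interval=0.05, timeout=60.0):
    """Seconds between the write and the first read returning the new value."""
    value = uuid.uuid4().hex                # unique payload for this probe
    write_time = time.monotonic()
    put_object(key, value)
    while time.monotonic() - write_time < timeout:
        if get_object(key) == value:
            return time.monotonic() - write_time
        time.sleep(poll_interval)
    raise TimeoutError("object never became consistent within the timeout")
```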


Metrics: Cost per operation

Via the billing API

Implementation: Network Metrics

Metrics: Intra-cloud Network Throughput and Latency

Allocate a pair of instances in the same or different data centers, and run standard tools such as iperf and ping between the two instances.
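A minimal sketch of how such a probe could be driven from one instance, assuming iperf3 is installed and already running in server mode ("iperf3 -s") on the peer; the peer address is made up, and only the ping output is parsed (the iperf3 report is printed as-is).

```python
# Minimal sketch: RTT via ping and TCP throughput via iperf3 between
# a pair of instances, invoked as external tools.
import re
import subprocess

def ping_rtts(peer_ip, count=10):
    """Round-trip times in milliseconds extracted from ping output."""
    out = subprocess.run(["ping", "-c", str(count), peer_ip],
                         capture_output=True, text=True, check=True).stdout
    return [float(ms) for ms in re.findall(r"time=([\d.]+)", out)]

def tcp_throughput_report(peer_ip, seconds=10):
    """Raw iperf3 client report for a TCP transfer to the peer instance."""
    return subprocess.run(["iperf3", "-c", peer_ip, "-t", str(seconds)],
                          capture_output=True, text=True, check=True).stdout

if __name__ == "__main__":
    peer = "10.0.0.2"               # hypothetical peer instance address
    print("RTT samples (ms):", ping_rtts(peer))
    print(tcp_throughput_report(peer))
```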


Metrics: Optimal Wide-area Network Latency

Run an instance in each data center owned by the provider and ping these instances from over 200 nodes on PlanetLab (a group of computers available as a testbed for computer networking and distributed systems research). Record the smallest RTT.

For AppEngine: collect the IP addresses of the instance from each of the PlanetLab nodes, then ping all of these IP addresses from each of the PlanetLab nodes.
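For illustration (not from the paper), the optimal latency per vantage point is simply the smallest RTT it sees to any of the provider's data centers; the node names, data-center labels, and RTT values below are made up.

```python
# Minimal sketch: vantage node -> data center -> list of RTT samples (ms).
samples = {
    "planetlab1.example.edu": {"dc-us-east": [22.1, 21.8, 23.0],
                               "dc-eu-west": [98.4, 97.9, 99.2]},
    "planetlab2.example.org": {"dc-us-east": [130.5, 131.0, 129.8],
                               "dc-eu-west": [41.2, 40.9, 42.0]},
}

# Optimal wide-area latency: smallest RTT each node sees to any data center.
optimal = {node: min(min(rtts) for rtts in per_dc.values())
           for node, per_dc in samples.items()}
print(optimal)   # {'planetlab1.example.edu': 21.8, 'planetlab2.example.org': 40.9}
```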

Results

The identities of the providers are anonymized in the results and referred to as C1 to C4 (but it is easy to see that C1 → AWS, C2 → Rackspace, C3 → AppEngine, C4 → Microsoft Azure).

Test all instance types offered by C2 and C4, and the general-purpose instances from C1.

Refer to instance types as provider.i, where i denotes the tier of service.

Compare instances from both Linux and Windows for experiments that depend on the type of OS; test Linux instances for the others.

Results


Cloud instances tested

Results: Elastic Compute Cluster

Metrics: Benchmark finishing time

Results: Elastic Compute Cluster

Price-comparable instances offered by different providers have widely different CPU and memory performance.

The instance types appear to be constructed in different ways:

For C1, the high-end instances may have faster CPUs.

For C4, all instances might share the same type of physical CPU.

C2 may have been lightly loaded during the test.

The disk I/O intensive task exhibits high variation on some C1 and C4 instances, probably due to interference from other colocated instances.

Results: Elastic Compute Cluster

Metrics: Performance at Cost

Results: Elastic Compute Cluster

For single-threaded tests, the smallest instances of most providers are the most cost-effective.

For multi-threaded tests, the high-end instances are not more cost-effective than the low-end ones:

The prices of high-end instances are proportional to the number of CPU cores.

Performance is bounded by memory bus and I/O bandwidth.

For parallel applications it might be more cost-effective to use more low-end instances.

Results: Elastic Compute Cluster

Metrics: Scaling Latency

Results: Elastic Compute Cluster

Metrics: Scaling Latency

All cloud providers can allocate new instances quickly, with average scaling latency below 10 minutes.

Windows instances appear to take longer to create than Linux ones:

For C1, the Windows instances have larger booting latency, possibly due to slower CPUs.

For C2, the provisioning latency of the Windows instances is much larger. It is likely that C2 has different infrastructures for provisioning Linux and Windows instances.

Results: Persistent Storage

Table Storage

Test the performance of three operations: get, put, and query.

Each operation runs against two pre-defined data tables: a small one with 1K entries, and a large one with 100K entries.

Repeat each operation several hundred times.

C2 is not tested because it does not provide a table service.

Results: Persistent Storage

Table Storage: Response Time

The three services perform similarly for both get and put operations.

For the query operation, C1 appears to have a better indexing strategy.

None of the services show noticeable performance degradation under multiple concurrent operations.

Results: Persistent Storage

Table Storage: Time to Consistency

40% of the get operations in C1 see inconsistency when triggered right after a put; the other providers exhibit no such inconsistency.

C1 does provide an API option to request strong consistency, but it is disabled by default.

Results: Persistent Storage

Table Storage: Cost per operation

Both C1 and C3 charge less for get/put than for query.

C4 charges the same across operations and could improve its charging model by accounting for the complexity of the operation.

Results: Persistent Storage

Blob Storage: Response time

Results: Persistent Storage

Blob Storage: Response time

Blobs of different sizes may stress different bottlenecks. The latency for small blobs can be dominated by one-off costs, whereas that for large blobs can be determined by service throughput, network bandwidth, or client-side contention.

C2's store may be tuned for read-heavy workloads.

Results: Persistent Storage

Blob Storage: Response time of multiple concurrent operations

C1 and C4's blob service throughput is well-tuned for multiple concurrent operations.

Results: Persistent Storage

Blob Storage: Maximum Throughput

C1 and C2's blob service throughput is close to their intra-datacenter network bandwidth.

C4's blob service throughput for a large instance also corresponds to the TCP throughput inside its datacenter, and may not be constrained by the instance itself.

Results: Persistent Storage

Blob Storage: Cost per operation

The charging models are similar for all three providers and are based on the number of operations and the size of the blob.

No significant differences.

Results: Persistent Storage

Queue Storage: Response time

Message size is 50 bytes.

Results: Persistent Storage

Queue Storage

No significant performance degradation is found when sending up to 32 concurrent messages.

The response time of the queue service is on the same order of magnitude as that of the table and blob services.

Both services charge similarly: 1 cent per 10K operations.

Results: Intra-cloud Network

Data Centers

C3 is not considered because it does not allow direct communication between instances.

Results: Intra-cloud Network

Intra-datacenter Network

C1 and C4 provide very high TCP throughput.

C2 has much lower throughput.

Results: Intra-cloud Network

Inter-datacenter Network

Only the results for data centers within the US are shown.

The throughput across datacenters is much smaller than that within a datacenter.

Results: Wide-area Network

Optimal wide-area latency

Results: Wide-area Network

Optimal wide-area latency

The latencies of C3 are lower than those of the other providers, possibly due to its widely dispersed presence.

C1 has a larger fraction of nodes with an optimal latency higher than 100 ms; these are mostly in Asia and South America, where C1 does not have a presence.

C2 has the worst latency distribution because it has the smallest number of data centers.

Using CloudCmp: Case Studies

Deploy three simple applications on the cloud to check whether the benchmark results from CloudCmp are consistent with the performance experienced by real applications:

a storage-intensive e-commerce website

a computation-intensive application for DNA alignment

a latency-sensitive website that serves static objects

Using CloudCmp: Case Studies

E-commerce Website

A Java implementation of TPC-W; database operations are redirected to each cloud's table storage APIs.

The major performance goal is to minimize the page generation time. The performance bottleneck lies in accessing table storage.

C1 offers the lowest table service response time among all providers, so it should have the best performance for TPC-W.

Using CloudCmp: Case Studies

E-commerce Website

C1 has the lowest page generation time.

C4 has lower generation time than C3 for most pages, especially pages 9 and 10, which contain many query operations, since C4 performs better than C3 in query operations.

Using CloudCmp: Case Studies

Parallel Scientific Computation

Blast, a parallel computation application for DNA alignment.

Blast instances communicate with each other through the queue storage service, and also use the blob storage service to store results.

The major performance goal is to reduce job execution time given a budget on the number of instances.

At a similar price point, C4.1 performs better than C1.1.

Using CloudCmp: Case Studies

Parallel Scientific Computation

Using CloudCmp: Case Studies

Latency Sensitive Website

Set up a simple web server to serve static pages, and download the pages from PlanetLab nodes around the world.

The performance goal is to minimize the page downloading time from many nodes.

The major bottleneck is the wide-area network latency.

C3 has the lowest wide-area network latency distribution.


Using CloudCmp: Case Studies

Latency Sensitive Website

C3 has the smallest page downloading time.

Limitations and Future Work

Limitations

On several occasions, CloudCmp sacrifices depth for breadth.

The results are only a snapshot comparison of cloud providers.

Future work

Use CloudCmp's measurement results to make application-specific performance predictions.

It could be promising to develop a meta-cloud that combines the diverse strengths of various providers.