Cloud Computing - Dashboard - University of Illinois - Engineering ...

5 Feb 2013

Cloud Computing

Reza Farivar

farivar2@illinois.edu


Slides adapted from cloud computing course CS 598

By Prof. Roy Campbell, Reza Farivar

Objectives and Syllabus

1


Introduction to some of the major developments in cloud computing


Teach how to re-think batch processing computational problems to fit the MapReduce programming paradigm, and streaming computations in terms of the Storm framework


Through hands-on experience in labs, reinforce and deepen your knowledge of Hadoop MapReduce and Storm


Overview


What is meant by


Cloud Computing


Utility Computing


{Infrastructure, Platform, Software} as a Service


Why do corporations need to pay attention


General principles


Research


2

Tremendous Buzz

“No less influential than e-business” (Gartner, 2008)

“Cloud computing achieves a quicker return on investment” (Lindsay Armstrong of salesforce.com, Dec 2008)

“Economic downturn, the appeal of that cost advantage will be greatly magnified” (IDC, 2008)

“Revolution, the biggest upheaval since the invention of the PC in the 1970s […] IT departments will have little left to do once the bulk of business computing shifts […] into the cloud” (Nicholas Carr, 2008)

“Not only is it faster and more flexible, it is cheaper. […] the emergence of cloud models radically alters the cost benefit decision” (FT Mar 6, 2009)

The economics are compelling, with business applications made three to five times cheaper and consumer applications five to 10 times cheaper (Merrill Lynch, May, 2008)

Domestic cloud computing estimated to grow at 53% (moneycontrol.com, June, 2011)

3

Gartner’s 2011 Hype Cycle

4

Cloud Computing

A computing paradigm where the boundaries of computing will be determined by economic rationale rather than technical limits

Professor Ramnath Chellappa, Emory University

It is not just Grid, Utility, or Autonomic computing.

5

NIST Definition

July 5, 2011:


The NIST Definition of Cloud Computing identified cloud computing as:

a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.


6

Cloud Characteristics

On-demand self-service

Ubiquitous network access

Location independent resource pooling

Rapid elasticity

Pay per use



7

Delivery Models

Software as a Service (SaaS)

Use provider’s applications over a network

SalesForce.com

Platform as a Service (PaaS)

Deploy customer-created applications to a cloud

Google App Engine

Infrastructure as a Service (IaaS)

Rent processing, storage, network capacity, and other fundamental computing resources

EC2, S3

8

Software Stack

Client: Mobile (Android), Thin client (Zonbu), Thick client (Google Chrome)

Services: Identity, Integration, Payments, Mapping, Search, Video Games, Chat

Application: Peer-to-peer (BitTorrent), Web app (Twitter), SaaS (Google Apps, SAP)

Platform: Java Google Web Toolkit, Django, Ruby on Rails, .NET

Storage: S3, Nirvanix, Rackspace Cloud Files, Savvis

Infrastructure: Full virtualization (GoGrid), Management (RightScale), Compute (EC2), Platform (Force.com)

9

NIST: Interactions between Actors in Cloud Computing

10

Cloud Consumer

Cloud Provider

Cloud Broker

Cloud Auditor



The communication path between a cloud provider & a cloud consumer

The communication paths for a cloud auditor to collect auditing information

The communication paths for a cloud broker to provide service to a cloud consumer





Predictions


By 2015, those companies who have adopted Big Data and extreme information management (their term for this area) will begin to outperform their unprepared competitors by 20% in every available financial metric.

Gartner

11

Forbes Predictions 2011


Cloud Adopters Embrace Cloud For Both Innovation and Legacy Optimization

Replace most new procurement with cloud strategies.

Start with private clouds as a stepping stone to public clouds.

Get real about security.

Move to private clouds as a backup to public clouds.

The Bottom Line: Cloud Adoption Provides a Path to the Next Generation Enterprise

12

Google Trends: cloud computing vs. virtualization

[Figure: Google Trends search interest for “cloud computing” and “virtualization”]

13

Utility Computing

“Computing may someday be organized as a public utility, just as the telephone system is organized as a public utility”

John McCarthy, 1961

14

Perils of Corporate Computing


Own information systems



However


Capital investment



Heavy fixed costs



Redundant expenditures



High energy cost, low CPU utilization



Dealing with unreliable hardware



High levels of overcapacity (Technology and Labor)



NOT SUSTAINABLE

15

Google: CPU Utilization


Activity profile of a sample of 5,000 Google Servers over a period of 6 months

16

Google: Energy Overhead


17

Google: Service Disruptions


18

Utility Computing


Let economy of scale prevail


Outsource all the trouble to someone else


The utility provider will share the overhead costs among many customers, amortizing the costs


You only pay for:



the amortized overhead



Your real CPU / Storage / Bandwidth usage

19

Why Utility Computing Now


Large data stores


Fiber networks


Commodity computing


Multicore machines

+

Huge data sets

Utilization/Energy

Shared people

Utility Computing

20

Data Intensive Computing


Data collection too large to transmit economically over Internet: Petabyte data collections

Computation produces small data output containing a high density of information

Implemented in Clouds

Easy to write programs, fast turn around.

MapReduce:

Map(k1, v1) -> list(k2, v2)

Reduce(k2, list(v2)) -> list(v3)

Hadoop, PIG, HDFS, HBase

Sawzall, Google File System, BigTable
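The Map and Reduce signatures above can be illustrated with a word count. This is a minimal single-process sketch, not Hadoop code: `run_mapreduce`, `map_word_count` and `reduce_word_count` are hypothetical names introduced here, and the in-memory dictionary stands in for the shuffle step that a real framework performs across machines.

```python
from collections import defaultdict
from typing import Iterator


def map_word_count(doc_id: str, text: str) -> Iterator[tuple[str, int]]:
    """Map(k1, v1) -> list(k2, v2): emit (word, 1) for every word in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_word_count(word: str, counts: list[int]) -> list[int]:
    """Reduce(k2, list(v2)) -> list(v3): sum all counts seen for one word."""
    return [sum(counts)]


def run_mapreduce(records, mapper, reducer):
    """Toy driver: map every record, group values by key, reduce each group."""
    groups = defaultdict(list)
    for k1, v1 in records:
        for k2, v2 in mapper(k1, v1):
            groups[k2].append(v2)  # stand-in for the distributed shuffle
    return {k2: reducer(k2, values) for k2, values in groups.items()}


# Hypothetical input documents, keyed by document id.
docs = [("d1", "the cloud is the computer"), ("d2", "the network is the cloud")]
result = run_mapreduce(docs, map_word_count, reduce_word_count)
print(result)
```

Because each map call sees only one record and each reduce call sees only one key, both phases can be spread over many machines without changing the two functions.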


21


Open Cloud Computing Interface


Infrastructure


EC2 API


Simple Storage Service (S3) API


Windows Azure Storage Service REST APIs


Windows Azure Service Management REST APIs


Deltacloud API

Rackspace Cloud Servers API

Rackspace Cloud Files API

Cloud Data Management Interface

vCloud API

GlobusOnline REST API

22

Cloud Interoperability Standards

Cloudonomics

CLOUD from an economic viewpoint:

1. Common Infrastructure

pooled standardized resources, statistical multiplexing

2. Location-independence

ubiquitous availability meeting performance requirements

latency reduction and user experience enhancement

3. Online connectivity

an enabler of other attributes ensuring service access

(not discussed here)

4. Utility Pricing

usage-sensitive or pay-per-use pricing

benefits environments with variable demand levels

5. on-Demand Resources

scalable, elastic resources provisioned and de-provisioned without delay or costs associated with change

Sometimes in contrast with each other (as we will see)

Cloudonomics: A Rigorous Approach to Cloud Benefit Quantification, Joe Weinman, http://journal.thedacs.com/issue/62/199


23

1. The Value of Common Infrastructure


Economies of scale


Reduced overhead costs


Buyer power through volume purchasing


Statistics of Scale


For infrastructure built to peak requirements: multiplexing demand gives higher utilization

Lower cost per delivered resource than unconsolidated workloads

For infrastructure built to less than peak: multiplexing demand reduces the unserved demand

Lower loss of revenue or service-level agreement violation payouts



24

A useful Measure of “Smoothness”

The coefficient of variation $C_V$ (not the variance $\sigma^2$, nor the correlation coefficient)

Ratio of the standard deviation $\sigma$ to the absolute value of the mean $|\mu|$: $C_V = \sigma / |\mu|$

“Smoother” curves:

large mean for a given standard deviation

or smaller standard deviation for a given mean

Importance of smoothness:

a facility with fixed assets servicing highly variable demand will achieve lower utilization than a similar one servicing relatively smooth demand.

Multiplexing demand from multiple sources may reduce the coefficient of variation $C_V$

25

Coefficient of variation $C_V$

$X_1, X_2, \ldots, X_n$: independent random variables for demand

Identical standard deviation $\sigma$ and mean $\mu$

Aggregated demand:

Mean = sum of means: $n\mu$

Variance = sum of variances: $n\sigma^2$

Coefficient of variation: $\frac{\sqrt{n}\,\sigma}{n\mu} = \frac{\sigma}{\sqrt{n}\,\mu} = \frac{1}{\sqrt{n}} C_V$

Adding $n$ independent demands reduces the $C_V$ by a factor of $1/\sqrt{n}$

Penalty of insufficient/excess resources grows smaller

Aggregating 100 workloads brings the penalty to 10% of its single-workload level
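The $1/\sqrt{n}$ effect can be checked numerically. The sketch below is only an illustration under stated assumptions: each workload is drawn from a normal distribution with a hypothetical mean of 10 and standard deviation of 2, and `coefficient_of_variation` and `aggregated_cv` are names introduced here.

```python
import math
import random
import statistics


def coefficient_of_variation(samples):
    """Standard deviation divided by the absolute value of the mean."""
    return statistics.pstdev(samples) / abs(statistics.fmean(samples))


def aggregated_cv(sigma, mu, n):
    """C_V of the sum of n independent demands with identical sigma and mu."""
    return (math.sqrt(n) * sigma) / (n * abs(mu))


# Hypothetical workload parameters, chosen only for illustration.
MU, SIGMA, N_WORKLOADS, N_SAMPLES = 10.0, 2.0, 100, 2000

rng = random.Random(0)  # fixed seed so the run is repeatable
single = [rng.gauss(MU, SIGMA) for _ in range(N_SAMPLES)]
total = [
    sum(rng.gauss(MU, SIGMA) for _ in range(N_WORKLOADS))
    for _ in range(N_SAMPLES)
]

cv_single = coefficient_of_variation(single)
cv_total = coefficient_of_variation(total)
print(f"one workload:   C_V = {cv_single:.3f} (theory {SIGMA / MU:.3f})")
print(f"100 workloads:  C_V = {cv_total:.3f} "
      f"(theory {aggregated_cv(SIGMA, MU, N_WORKLOADS):.3f})")
```

With 100 independent workloads the simulated $C_V$ should land close to one tenth of the single-workload value, matching the 10% figure on the slide.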

26

But what about dependent workloads?


Negatively correlated demands

$X$ and $1 - X$

Sum is the constant random variable 1

Appropriate selection of customer segments

Perfectly correlated demands

Aggregated demand: $nX$, variance of sum: $n^2\sigma^2(X)$

Mean: $n\mu$, standard deviation: $n\sigma(X)$

Coefficient of variation remains constant

Simultaneous peaks
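Both extremes can be seen on a tiny data set. In this sketch the demand values are hypothetical, and the "1 - X" customer is scaled to a capacity of 100 units so that the arithmetic stays in integers.

```python
import math
import statistics


def cv(samples):
    """Coefficient of variation: standard deviation over |mean|."""
    return statistics.pstdev(samples) / abs(statistics.fmean(samples))


demand_x = [20, 50, 90, 40, 70]            # hypothetical demand of customer X
complement = [100 - x for x in demand_x]   # negatively correlated customer
summed = [x + y for x, y in zip(demand_x, complement)]

cv_x = cv(demand_x)
cv_negative = cv(summed)                   # the sum is constant, so C_V is 0

n = 4                                      # n perfectly correlated copies of X
cv_scaled = cv([n * x for x in demand_x])  # same C_V as X: nothing is smoothed

print(f"C_V(X) = {cv_x:.3f}")
print(f"C_V(X + complement) = {cv_negative:.3f}")
print(f"C_V({n} * X) = {cv_scaled:.3f}")
```

Pairing complementary customers removes the variation entirely, while stacking customers whose peaks coincide leaves the coefficient of variation unchanged.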


27

Common Infrastructure in Real World


Correlated demands:



Private, mid-size and large-size providers can experience similar statistics of scale

Independent demands:

Midsize providers can achieve similar statistical economies to an infinitely large provider

Available data on economy of scale for large providers is mixed

Use the same COTS computers and components

Locating near cheap power supplies: everyone can do that

Early entrant automation tools: 3rd parties take care of it

Take-away lesson: you don’t need to be as large as Amazon.com to compete!

At least according to “Value of Common Infrastructure”



28

2. Value of Location Independence


We used to go to the computers, but applications, services and contents now come to us!

Through networks: Wired, wireless, satellite, etc.

But what about latency?

Human response latency: 10s to 100s of milliseconds

Latency is correlated with:

Distance (strongly)

Routing algorithms of routers and switches (second order effects)

Speed of light in fiber: only 124 miles per millisecond

If the Google word suggestion took 2 seconds

VOIP with latency of 200ms or more

Supporting a global user base requires a dispersed service architecture

Coordination, consistency, availability, partition-tolerance


Investment implications
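The 124 miles per millisecond figure sets a floor on latency that no provider can beat. A minimal sketch, assuming a straight fiber path and ignoring routing, queueing and processing delays (so real latencies are higher); the function names and example distances are hypothetical.

```python
FIBER_MILES_PER_MS = 124.0  # propagation speed in fiber quoted on the slide


def one_way_latency_ms(distance_miles: float) -> float:
    """Lower bound on one-way propagation delay over fiber."""
    return distance_miles / FIBER_MILES_PER_MS


def round_trip_ms(distance_miles: float) -> float:
    """Lower bound on request plus response propagation delay."""
    return 2 * one_way_latency_ms(distance_miles)


def max_distance_miles(rtt_budget_ms: float) -> float:
    """Farthest a data center can be while staying inside a round-trip budget."""
    return rtt_budget_ms / 2 * FIBER_MILES_PER_MS


for miles in (124, 2480, 12400):  # hypothetical user-to-data-center distances
    print(f"{miles:6d} miles -> round trip >= {round_trip_ms(miles):6.1f} ms")
print(f"200 ms round-trip budget -> at most {max_distance_miles(200):.0f} miles")
```

A budget of a few tens of milliseconds therefore limits each data center to users within a few thousand miles, which is why a global user base needs dispersed facilities.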


29

Covering a large area with centers


Simple model: planar (a state or small country)






Realistic model: spherical (continents)



$r$ = radius, related to latency/distance

$n$ = number of service nodes

Exact for planar, almost exact for spherical with large $n$

(Details of this expression are in the paper)

Diminishing returns make private investment difficult

But clouds can help! What if you had only 5 users in a remote area? Use another cloud provider’s resources
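The diminishing returns can be made concrete with a simple idealization that is assumed here for illustration and is not taken from the slides: if $n$ service nodes each cover a disc of radius $r$, and the discs together cover an area $A$ with no overlap, then $n \pi r^2 \approx A$, so $r$ shrinks only as $1/\sqrt{n}$. The area value and function name below are hypothetical.

```python
import math


def coverage_radius(area: float, n_nodes: int) -> float:
    """Radius each node must cover if n discs of area pi*r^2 add up to `area`.

    Assumption: idealized planar coverage with no overlap between discs.
    """
    return math.sqrt(area / (n_nodes * math.pi))


AREA_SQ_MILES = 100_000.0  # hypothetical service region

for n in (1, 4, 16, 64):
    r = coverage_radius(AREA_SQ_MILES, n)
    print(f"{n:3d} nodes -> r = {r:6.1f} miles")
```

Under this assumption, every halving of the worst-case distance costs four times as many nodes, which is the sense in which the investment shows diminishing returns.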






30

4. Value of Utility Pricing


As mentioned before, economy of scale might not be very effective

But cloud services don’t need to be cheaper to be economical!

Consider a car

Buy or lease for $10 per day

Rent a car for $45 a day

If you need a car for 2 days in a trip, buying would be much more costly than renting

It depends on the demand


31

Utility Pricing in Detail


$D(t)$: demand for resources, $0 < t < T$

$P = \max(D(t))$: Peak Demand

$A = \mathrm{Avg}(D(t))$: Average Demand

$B$ = Baseline (owned) unit cost; $B_T$ = Total Baseline Cost

$C$ = Cloud unit cost; $C_T$ = Total Cloud Cost

$U = C / B$: Utility Premium

For the rental car example, $U = 4.5$

----------------------------------------------------------------


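$C_T = \int_0^T U \times B \times D(t)\,dt = A \times U \times B \times T$

$B_T = P \times B \times T$

Because the Baseline should handle peak demand

When is cloud cheaper than owning?

$C_T < B_T$

$A \times U \times B \times T < P \times B \times T$

$U < \frac{P}{A}$

When utility premium is less than ratio of Peak demand to Average demand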
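The comparison can be run on the rental car numbers ($B$ = $10 per day, $U$ = 4.5). The sketch below uses a discrete daily demand profile in place of the integral; the 30-day horizon, the two demand profiles and the function names are assumptions made for illustration.

```python
def total_cloud_cost(demand, unit_cost_owned, utility_premium, dt=1.0):
    """C_T: pay U * B per unit of demand actually used in each time step."""
    return sum(utility_premium * unit_cost_owned * d * dt for d in demand)


def total_owned_cost(demand, unit_cost_owned, dt=1.0):
    """B_T = P * B * T: owned capacity is sized for the peak over the whole period."""
    return max(demand) * unit_cost_owned * dt * len(demand)


def cloud_is_cheaper(demand, utility_premium):
    """Cloud wins when the utility premium U is below the peak-to-average ratio."""
    peak = max(demand)
    average = sum(demand) / len(demand)
    return utility_premium < peak / average


B, U = 10.0, 4.5                 # $10 per day owned, $45 per day rented
trip = [1, 1] + [0] * 28         # hypothetical: car needed 2 days out of 30
commute = [1] * 30               # hypothetical: car needed every day

for name, demand in (("trip", trip), ("commute", commute)):
    print(name,
          "rent:", total_cloud_cost(demand, B, U),
          "own:", total_owned_cost(demand, B),
          "rent cheaper:", cloud_is_cheaper(demand, U))
```

For the two-day trip the peak-to-average ratio is 15, well above 4.5, so renting wins; for the daily commute the ratio is 1, so owning wins.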

32

Utility Pricing in Real World

1. In practice demands are often highly spiky

News stories, marketing promotions, product launches, Internet flash floods (Slashdot effect), tax season, Christmas shopping, processing drone footage for a one-week border skirmish, etc.

2. Often a hybrid model is the best

You own a car for daily commute, and rent a car when traveling or when you need a van to move

Key factor is again the ratio of peak to average demand

But we should also consider other costs

Network cost (both fixed costs and usage costs)

Interoperability overhead

Consider reliability, accessibility


33

5. Value of on-Demand Services

Simple problem: when owning your resources, you will pay a penalty whenever your resources do not match the instantaneous demand

Either pay for unused resources, or suffer the penalty of missing service delivery

$D(t)$: Instantaneous Demand at time $t$

$R(t)$: Resources at time $t$

Penalty cost

If demand is flat, penalty = 0

If demand is linear, periodic provisioning is acceptable
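The flat and linear cases can be tried out numerically. This sketch assumes the penalty is proportional to the accumulated absolute gap between demand and resources, $c \sum |D(t) - R(t)|\,\Delta t$; that form, the function names and the demand profiles are assumptions for illustration, not a formula given above.

```python
def penalty(demand, resources, dt=1.0, c=1.0):
    """Assumed penalty: c times the accumulated |D(t) - R(t)| over all steps."""
    return c * sum(abs(d - r) * dt for d, r in zip(demand, resources))


def periodic_provisioning(demand, interval):
    """Reset resources to the current demand every `interval` steps, then hold."""
    resources = []
    level = demand[0]
    for step, d in enumerate(demand):
        if step % interval == 0:
            level = d  # provisioning point: match the demand seen right now
        resources.append(level)
    return resources


flat = [5] * 12          # hypothetical flat demand
linear = list(range(12)) # hypothetical linearly growing demand

flat_penalty = penalty(flat, periodic_provisioning(flat, 3))
linear_penalty = penalty(linear, periodic_provisioning(linear, 3))
print("flat demand penalty:  ", flat_penalty)
print("linear demand penalty:", linear_penalty)
```

With flat demand the gap is always zero; with linear demand each provisioning interval adds the same bounded gap, so the penalty grows only in proportion to elapsed time.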

34

E.g. Penalty Costs for exponential demand

Penalty cost

If demand is exponential ($D(t) = e^t$), any fixed provisioning interval ($t_p$) according to the current demands will fall exponentially behind

$R(t) = e^{t - t_p}$

$D(t) - R(t) = e^t - e^{t - t_p} = k_1 e^t$, with $k_1 = 1 - e^{-t_p}$

Penalty cost $\propto c \cdot k_1 e^t$
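A short numerical check shows how quickly the shortfall grows. The sketch assumes that resources track the demand observed one provisioning interval earlier, $R(t) = D(t - t_p)$ with $D(t) = e^t$; the interval value and the function name are hypothetical.

```python
import math


def lag_gap(t: float, t_p: float) -> float:
    """D(t) - R(t) for D(t) = e^t when resources lag by t_p: R(t) = D(t - t_p)."""
    return math.exp(t) - math.exp(t - t_p)


T_P = 0.5                   # hypothetical provisioning interval
K1 = 1 - math.exp(-T_P)     # constant factor in front of e^t

gaps = [lag_gap(t, T_P) for t in range(5)]
for t, gap in enumerate(gaps):
    print(f"t = {t}: shortfall = {gap:8.3f}   k1 * e^t = {K1 * math.exp(t):8.3f}")
```

Under this assumption the shortfall is multiplied by $e$ for every unit of time, so no fixed provisioning interval keeps the penalty bounded.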


35

Behavioral Cloudonomics

Humans do not always make purely rational and quantitative decisions

Cons:

“Loss aversion”: people generally get less satisfaction from gaining a dollar than they feel pain from losing one

Customers may recognize the financial advantage of pay-per-use, but avoid it due to a “flat-rate” bias

E.g. fear of an unexpected large monthly cell phone bill

a flat-rate plan is preferred over measured service

Pros:

Special attraction of “free”

The lack of upfront investment in using the cloud becomes extremely attractive


36

Computational Complexity


Satisfying demands with constraints (e.g. distance) is computationally intractable

CLOUD COMPUTING DEMAND SATISFIABILITY is NP-complete

based on a transformation of BOOLEAN 3-SATISFIABILITY

Even if there is exactly the right aggregate capacity in a distributed cloud system, it may be impossible to find the right assignment of capacity to demand

E.g. map scheduling in Hadoop vs. file chunk locations in HDFS
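A toy instance shows how matching aggregate capacity can still leave demand unserved once distance constraints apply. This is a brute-force sketch for illustration only, not the reduction from 3-SATISFIABILITY; the customers, sites and the `feasible_assignment` helper are hypothetical, and each customer is assumed to need exactly one unit of capacity.

```python
from itertools import product


def feasible_assignment(customers, capacity, reachable):
    """Try every customer-to-site mapping; return the first valid one, else None.

    capacity:  site -> number of units available
    reachable: customer -> set of sites close enough to serve that customer
    """
    sites = list(capacity)
    for choice in product(sites, repeat=len(customers)):
        used = {site: 0 for site in sites}
        valid = True
        for customer, site in zip(customers, choice):
            if site not in reachable[customer]:
                valid = False  # violates the distance constraint
                break
            used[site] += 1
        if valid and all(used[s] <= capacity[s] for s in sites):
            return dict(zip(customers, choice))
    return None


customers = ["c1", "c2", "c3", "c4"]
capacity = {"east": 2, "west": 2}  # 4 units in total for 4 customers

# Three customers can only reach the east site, one only the west site.
tight = {"c1": {"east"}, "c2": {"east"}, "c3": {"east"}, "c4": {"west"}}
# Same capacity, but c3 is close enough to either site.
relaxed = {**tight, "c3": {"east", "west"}}

tight_result = feasible_assignment(customers, capacity, tight)
relaxed_result = feasible_assignment(customers, capacity, relaxed)
print("tight:  ", tight_result)
print("relaxed:", relaxed_result)
```

The search tries every combination, so its running time grows exponentially with the number of customers, which is consistent with the intractability claim for the general problem.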


Common Infrastructure and Location Independence (latency optimization) are usually a tradeoff

we can choose to optimize the statistics of scale by building fewer consolidated facilities, and we can choose to optimize user experience by building more, dispersed facilities

determining an optimal tradeoff is intractable

37

Questions


Weinman mentions a “classification” of clouds based on economic models


Compute Clouds


Hotel Clouds


Rental Car Clouds


Restaurant Clouds


Etc.


What do you think about each category? Can you come up with others?

38