X-Tracing Hadoop


UC Berkeley


Beyond the Hype:

A Berkeley View of Cloud Computing


Anthony D. Joseph*,
UC Berkeley

Reliable Adaptive Distributed Systems Lab




TNC 2010

31 May 2010


Image: John Curley http://www.flickr.com/photos/jay_que/1834540/

*Director, Intel Labs Berkeley

http://abovetheclouds.cs.berkeley.edu/

Outline


What is it?


Why now?


Cloud economics and opportunities


Challenges


Datacenter is new “server”



“Program” == Web search, email, map/GIS, …

“Computer” == 1000’s computers, storage, network

Warehouse-sized facilities and workloads

New datacenter ideas (2007-2008): truck container (Sun), floating (Google), datacenter-in-a-tent (Microsoft)


How to enable innovation in new services without first
building & capitalizing a large company?

photos: Sun Microsystems & datacenterknowledge.com

RAD Lab 5-year Mission

Enable 1 person to develop, deploy, operate next-generation Internet application


Key enabling technology: Statistical machine learning


debugging, power management, performance prediction, ...


Highly interdisciplinary faculty & students


PI’s: Patterson/Fox/Katz (systems/networks), Jordan (machine learning), Joseph (systems/networks/security), Stoica (networks/P2P), Franklin (databases)

2 postdocs, ~30 PhD students, ~5 undergrads


Examples


Predict performance of complex software system
when demand is scaled up


Automatically add/drop servers to fit demand, without violating SLO

Distill millions of lines of log messages into an operator-friendly “decision tree” that pinpoints “unusual” incidents/conditions

Recurring themes:

Cutting-edge SML methods work where simpler methods have failed

Demonstrate applicability on at least 1000’s of machines!


Utility Computing Arrives


Amazon Elastic Compute Cloud (EC2)


“Compute unit” rental: $0.10-0.80, $0.085-0.68/hour

1 CU ≈ 1.0-1.2 GHz 2007 AMD Opteron/Intel Xeon core











No up-front cost, no contract, no minimum

Billing rounded to nearest hour (also regional, spot pricing)

New paradigm(!) for deploying services?, HPC?

Instance            Price/hour      Platform  Units  Memory  Disk
Small               $0.10, $0.085   32-bit    1      1.7GB   160GB
Large               $0.40, $0.35    64-bit    4      7.5GB   850GB, 2 spindles
X Large             $0.80, $0.68    64-bit    8      15GB    1690GB, 4 spindles
High CPU Med        $0.20, $0.17    64-bit    5      1.7GB   350GB
High CPU Large      $0.80, $0.68    64-bit    20     7GB     1690GB
High Mem X Large    $0.50           64-bit    6.5    17.1GB  1690GB
High Mem XXL        $1.20           64-bit    13     34.2GB  1690GB
High Mem XXXL       $2.40           64-bit    26     68.4GB  1690GB

Northern VA cluster


Cloud as Major Enabler

Major enabler for SW as a Service (SaaS) startups

Animoto traffic doubled every 12 hours for 3 days when released as Facebook plug-in in April 2008

Peak EC2 instances: Mon 50, Tues 400, Wed 900, Friday 3400
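
As a rough sanity check on the Animoto figures, doubling every 12 hours for 3 days amounts to six doublings, which is in the same range as the reported rise in peak EC2 instances from Monday to Friday. The sketch below works through that arithmetic; the function name is illustrative and the steady-doubling model is a simplifying assumption.

```python
def growth_factor(doubling_period_hours: float, duration_hours: float) -> float:
    """Multiplicative growth after `duration_hours` of steady doubling."""
    return 2 ** (duration_hours / doubling_period_hours)


# Traffic doubling every 12 hours for 3 days (figures from the slide).
traffic_growth = growth_factor(12, 3 * 24)

# Peak EC2 instances as reported on the slide.
peak_instances = {"Mon": 50, "Tues": 400, "Wed": 900, "Friday": 3400}
instance_growth = peak_instances["Friday"] / peak_instances["Mon"]

print(f"traffic growth over 3 days:  {traffic_growth:.0f}x")
print(f"instance growth Mon to Fri:  {instance_growth:.0f}x")
```

Six doublings give 64x, against a 68x increase in peak instances over the week.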

But...

What is cloud computing, exactly?


“It’s nothing new”

“...we’ve redefined Cloud Computing to
include everything that we already do... I
don’t understand what we would do
differently ... other than change the
wording of some of our ads.”


Larry Ellison, CEO, Oracle (Wall Street
Journal, Sept. 26, 2008)


“It’s a trap”

“It’s worse than stupidity: it’s marketing hype. Somebody is saying this is inevitable - and whenever you hear that, it’s very likely to be a set of businesses campaigning to make it true.”

Richard Stallman, Founder, Free Software Foundation (The Guardian, Sept. 29, 2008)


Above the Clouds:

A Berkeley View of Cloud Computing

abovetheclouds.cs.berkeley.edu


White paper by RAD Lab PI’s and students


Clarify terminology around Cloud Computing


Quantify comparison with conventional computing


Identify Cloud Computing challenges & opportunities


Why can we offer new perspective?


Strong engagement with industry


Users of cloud computing in our own research and
teaching in last 3 years


Goal: stimulate discussion on what’s really new

without resorting to weather analogies ad nauseam


Cloud Computing:

True Utility

Cloud Computing: App and Infrastructure over Internet

Software as a Service: Applications over the Internet

Utility Computing: “Pay-as-You-Go” Datacenter Hardware and Software

Three New Aspects to Cloud Computing

The Illusion of Infinite Computing Resources Available on Demand

The Elimination of an Upfront Commitment by Cloud Users

The Ability to Pay for Use of Computing Resources on a Short-Term Basis as Needed

Classifying Clouds

App Model for Utility Computing

Amazon EC2: Close to Physical Hardware; User Controls Most of Stack; Hard to Auto Scale and Failover

Windows Azure: .NET and CLR… ASP.NET Support; More Constraints on User Stack; Auto Provisioning of Stateless App

Google AppEngine: App Specific Traditional Web App Model; Constrained Stateless/Stateful Tiers; Auto Scaling and Auto High-Availability

Something New: ???; ???; ???

Constraints on App Model Offer Tradeoffs… Lots of Ongoing Innovation…

Lower-level, less managed (“flexibility/portability”) to higher-level, more managed (“more built-in functionality”):

Instruction Set VM (Amazon EC2, 3Tera)

Managed runtime VM (Microsoft Azure)

Framework VM (Google AppEngine, Force.com)


Why Now (not then)?


Economies of Scale for Humongous Datacenters (1,000’s to 10,000’s of commodity computers)

Electricity: Put Datacenters at Cheap Power

Network: Put Datacenters on Main Trunks

Operations: Standardize and Automate Ops

Hardware: Containerized Low-Cost Servers

Technology      Cost in Medium-Sized DC        Cost in Very Large DC           Ratio
Network         $95 per Mbit/sec/month         $13 per Mbit/sec/month          7.1
Storage         $2.20 per GByte/month          $0.40 per GByte/month           5.7
Administration  ≈ 140 Servers / Administrator  > 1000 Servers / Administrator  7.1

James Hamilton, Internet Scale Service Efficiency, Large-Scale Distributed Systems and Middleware (LADIS) Workshop, Sept. ’08

Huge DCs 5-7X as Cost Effective as Medium-Scale DCs

Why Now (not then)?


Common HW & SW platform


x86 as universal ISA, plus fast virtualization


Standard software stack, largely open source (LAMP)


Bet: Can statistically multiplex multiple instances onto a single box without interference between instances

Novel economic model: fine grain billing

Earlier examples: Sun, Intel Computing Services (longer commitment, more $$$/hour)

Infrastructure software: e.g., Google File System

Operational expertise: failover, DDoS, firewalls...


More pervasive broadband Internet



Cloud Economics 101

Static provisioning for peak: wasteful, but necessary for SLO

[Figure: Resources vs. Time, with Demand and Capacity curves, for a “statically provisioned” data center (with “Unused resources” marked) and a “virtual” data center in the cloud]


Risk of underutilization if peak predictions are too optimistic

Wasted CapEx

Risks of underprovisioning

Lost revenue

Lost users

[Figures: three panels of Resources vs. Time (days 1-3), each with Demand and Capacity curves]
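
The trade-off between underutilization and underprovisioning can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the demand series and the capacity of 50 are hypothetical numbers chosen for illustration, not data from the talk, and `provisioning_outcome` is an illustrative helper name.

```python
def provisioning_outcome(demand, capacity):
    """Return (unused, unmet) resource-periods for matching demand/capacity series.

    unused: capacity that was paid for but not needed (underutilization)
    unmet:  demand that exceeded capacity (underprovisioning)
    """
    unused = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    unmet = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return unused, unmet


# Hypothetical demand (servers needed) over six periods; not from the slides.
demand = [20, 30, 60, 100, 70, 40]
periods = len(demand)

sized_for_peak = provisioning_outcome(demand, [max(demand)] * periods)
sized_too_small = provisioning_outcome(demand, [50] * periods)
elastic = provisioning_outcome(demand, demand)  # capacity tracks demand

print("static, sized for peak:", sized_for_peak)
print("static, sized too small:", sized_too_small)
print("elastic:", elastic)
```

In this toy series, provisioning for the peak leaves 280 server-periods idle, a fixed capacity of 50 still idles 60 while turning away 80, and capacity that follows demand does neither.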


New Scenarios Enabled by “Risk Transfer”

Not (just) CapEx vs. OpEx!

“Cost associativity”: 1,000 computers for 1 hour same price as 1 computer for 1,000 hours

Washington Post converted Hillary Clinton’s travel documents to post on WWW <1 day after released

RAD Lab graduate students demonstrate improved Hadoop (batch job) scheduler on 1,000 servers
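
Cost associativity follows directly from per-machine-hour pricing with no up-front commitment. The sketch below illustrates it using the $0.10/hour figure quoted earlier for a small EC2 instance; the function name is illustrative, and prices are kept in integer cents to avoid floating-point rounding.

```python
def rental_cost_cents(machines: int, hours: int, cents_per_machine_hour: int) -> int:
    """Total rental cost when every machine-hour is billed at the same rate."""
    return machines * hours * cents_per_machine_hour


PRICE_CENTS = 10  # $0.10 per machine-hour, used here purely as an example rate

burst = rental_cost_cents(machines=1000, hours=1, cents_per_machine_hour=PRICE_CENTS)
serial = rental_cost_cents(machines=1, hours=1000, cents_per_machine_hour=PRICE_CENTS)

print(f"1,000 machines for 1 hour:  ${burst / 100:.2f}")
print(f"1 machine for 1,000 hours:  ${serial / 100:.2f}")
```

Both cases come to the same bill, so the only thing that changes is how long the job takes to finish.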


Obstacles and Opportunities

#   Obstacle                               Opportunity
1   Availability of Service                Use Multiple Cloud Providers; Use Elasticity to Prevent DDoS
2   Data Lock-In                           Standardized APIs; Compatible Software to Enable Surge Computing
3   Data Confidentiality and Auditability  Deploy Encryption, VLANs, Firewalls; Geographical Data Storage
4   Data Transfer Bottlenecks              FedExing Disks; Data Backup/Archival; Higher Bandwidth Switches
5   Performance Unpredictability           Improved VM Support; Flash Memory; Gang Scheduling VMs
6   Scalable Storage                       Invent Scalable Store
7   Bugs in Large Distributed Systems      Invent Debugger that Relies on Dist VMs
8   Scaling quickly                        Auto-Scaler; Snapshots for Conservation
9   Reputation Fate Sharing                Reputation Guarding Services
10  Software Licensing                     Pay-for-Use Licenses; Bulk Use Sales

Open source reimplementations of Google AppEngine (AppScale), EC2 API (Eucalyptus), BigTable (HyperTable)

Freedom OSS partnership with Amazon to allow FedEx-ing disks into their datacenters, Amazon hosting free public datasets to “attract” cycles

2/11/09: IBM WebSphere™ and other service-delivery software available on AWS with pay-as-you-go pricing

Cloud or Earthbound: “Should I Move to the Cloud?”

Some app types made more compelling

Surge computing: overflow into the cloud

Extend desktop apps into cloud: Matlab, Mathematica; soon productivity apps?

Batch processing to exploit cost associativity, e.g. for business analytics

Other apps more challenged

Bulk data movement expensive, slow

Jitter-sensitive apps (long-haul latency & transient performance distortion due to virtualization)

What about HPC/Grid apps?


2008 Observed EC2 Topology

Caveat: only shows routers, not switches

EC2 Large Instance Network Performance

Measurement: Amount of data transferred between 2 instances in 10 seconds

Network quirks

96% RTT < 1ms

Occasional route changes lead to weird paths and increased latency

Effects on MPI Performance

Many research opportunities to address issues

Collocated rack-level allocation schemes

Gang scheduling for clouds/virtual clouds

New network topologies/implementations


[Walker, ;login: 2008-10]

Summary


Many Cloud Computing Benefits:

Shift CapEx to OpEx, Scale OpEx to demand

Startups and prototyping, One-off tasks (Wash. Post)

Cost associativity

Research at scale

Many Cloud Computing Challenges:

Availability

Data in cloud may be a “gravity well” ($$$ to move)

Not ready for HPC applications yet

Opportunities to address performance issues

More: http://abovetheclouds.cs.berkeley.edu/


UC Berkeley

Thank you!

adj@eecs.berkeley.edu

http://abovetheclouds.cs.berkeley.edu/



BACKUP SLIDES



Performance for Money Spent on EC2

LINPACK cost effectiveness on EC2

[Napper and Bientinesi]

Heterogeneity in Virtualized Environments

VM technology isolates CPU and memory, but disk and network are shared

Full bandwidth when no contention

Equal shares when there is contention

2.5x performance difference

[Figure: IO Performance per VM (MB/s) vs. VMs on Physical Host (1-7), EC2 small instances]
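
The sharing behaviour described on this slide (full bandwidth without contention, equal shares under contention) can be sketched as a very simple model. The 60 MB/s host bandwidth and the function name are hypothetical, chosen only for illustration; the slides do not state a per-host figure, and real hypervisor I/O scheduling is more complicated than an even split.

```python
def io_share_mb_s(host_bandwidth_mb_s: float, active_vms: int) -> float:
    """Per-VM I/O bandwidth when `active_vms` VMs contend for one shared device.

    A lone VM gets the full bandwidth; contending VMs split it equally.
    """
    if active_vms < 1:
        raise ValueError("need at least one active VM")
    return host_bandwidth_mb_s / active_vms


HOST_BANDWIDTH_MB_S = 60.0  # hypothetical figure, not taken from the slides

shares = {n: io_share_mb_s(HOST_BANDWIDTH_MB_S, n) for n in range(1, 4)}
for n, share in shares.items():
    print(f"{n} active VM(s): {share:.1f} MB/s each")
```

Under this model a VM's throughput depends on how many co-located VMs happen to be doing I/O at the same time, which is one way identically priced instances can end up performing differently.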

Power and Cooling Is Expensive!

Belady, C., “In the Data Center, Power and Cooling Costs More than IT Equipment it Supports”, Electronics Cooling Magazine (Feb 2007)

The Infrastructure for Power and Cooling Costs a LOT

Infrastructure PLUS Energy > Server Cost Since 2001

Infrastructure Alone > Server Cost Since 2004

Energy Alone > Server Cost Since 2008

Cost Effective to Discard Inefficient Servers

Power Savings

Infrastructure Savings!

Like Airlines Retiring Fuel-Guzzling Airplanes

Willing to pay more $/server for more power efficient servers