Presentation - UTPA Faculty Web

Web Search for a Planet: The Google Cluster Architecture

Eugenio De Hoyos


6175 Computer Science Seminar

October 4, 2011

introduction

… a single query on Google reads hundreds of megabytes of data and consumes tens of billions of CPU cycles…

IO: 500 MB @ 20 MB/s → 25 sec

CPU: 10×10^9 cycles @ 2 GHz → 5 sec
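The two single-machine figures are plain divisions. A minimal sketch in Python reproduces them and shows how the time shrinks if the work is split evenly across machines. The constant names, the even-split model, and the cluster sizes are illustrative assumptions, not part of the slides; real clusters also pay coordination overhead that this ignores.

```python
# Back-of-the-envelope cost of one query on a single machine,
# using the figures quoted on the slide.
DATA_MB = 500          # data read per query
DISK_MB_PER_S = 20     # disk throughput
CYCLES = 10e9          # CPU cycles per query
CLOCK_HZ = 2e9         # 2 GHz processor

io_seconds = DATA_MB / DISK_MB_PER_S
cpu_seconds = CYCLES / CLOCK_HZ


def per_machine_seconds(total_seconds: float, machines: int) -> float:
    """Idealized time when the work splits evenly across machines (assumption)."""
    return total_seconds / machines


if __name__ == "__main__":
    print(f"IO:  {io_seconds:.0f} s")
    print(f"CPU: {cpu_seconds:.0f} s")
    for n in (1, 10, 100):  # hypothetical cluster sizes
        print(f"{n:>3} machines: {per_machine_seconds(io_seconds, n):.2f} s of IO each")
```

Under these assumptions, neither 25 seconds of IO nor 5 seconds of CPU is acceptable for an interactive query on one machine, which is the motivation for partitioning the work.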

outline

A Single Query

Philosophy

Power

Index Hardware

Index Memory

Conclusion

http://www.googlefalle.com

a single query

[Figure: a hardware load balancer in front of several Google Web Servers]

[Figure: one Google Web Server sends a query to the index servers and then to the document servers. Each pool is drawn as four shards with four PCs per shard. The steps are numbered 1 to 4.]
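The query path in the diagram can be sketched in a few lines of Python: a web server asks every index shard for matching document IDs, merges the answers, and then fetches the documents from the document shards. All data, function names, and the random replica choice below are hypothetical stand-ins for illustration; ranking and snippet generation are left out.

```python
import random
from typing import Dict, List, Set

# Hypothetical toy data: each index shard maps a word to the document IDs
# (from its slice of the corpus) that contain that word.
INDEX_SHARDS: List[Dict[str, Set[int]]] = [
    {"google": {1, 2}, "cluster": {2}},
    {"google": {5}, "cluster": {5, 6}},
]
# Each document shard stores the text for its slice of document IDs.
DOC_SHARDS: List[Dict[int, str]] = [
    {1: "Google search", 2: "Google cluster design"},
    {5: "Cluster built by Google", 6: "Cluster computing"},
]
REPLICAS_PER_SHARD = 4  # the figure draws four PCs per shard


def pick_replica(shard_id: int) -> int:
    """Choose one of the identical PCs serving a shard (stand-in for load balancing)."""
    return random.randrange(REPLICAS_PER_SHARD)


def search_index(words: List[str]) -> List[int]:
    """Ask every index shard for documents containing all words, then merge."""
    hits: Set[int] = set()
    for shard_id, shard in enumerate(INDEX_SHARDS):
        pick_replica(shard_id)  # any replica gives the same answer
        postings = [shard.get(word, set()) for word in words]
        if postings:
            hits |= set.intersection(*postings)
    return sorted(hits)


def fetch_documents(doc_ids: List[int]) -> List[str]:
    """Look up each document on the shard that stores it."""
    results = []
    for doc_id in doc_ids:
        for shard in DOC_SHARDS:
            if doc_id in shard:
                results.append(shard[doc_id])
    return results


def handle_query(query: str) -> List[str]:
    """What a web server does in this sketch: index lookup, then document lookup."""
    return fetch_documents(search_index(query.lower().split()))
```

Because every replica of a shard holds the same data, a failed PC only reduces capacity for that shard; the other replicas can still answer.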

philosophy

[Figure: Service A, Service B, Service C]

          Rack of commodity PCs    High-end multiprocessor server
CPUs      176                      8
RAM       176 GB                   64 GB
Disk      7 TB                     8 TB
Cost      $278,000                 $758,000
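The price/performance argument behind these two configurations comes down to cost per unit of resource. A minimal sketch in Python, using only the numbers on the slide; the class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Config:
    name: str
    cpus: int
    ram_gb: int
    cost_usd: int


# Figures from the slide: the 176-CPU configuration and the 8-CPU configuration.
rack = Config("176-CPU configuration", cpus=176, ram_gb=176, cost_usd=278_000)
server = Config("8-CPU configuration", cpus=8, ram_gb=64, cost_usd=758_000)


def cost_per_cpu(config: Config) -> float:
    """Dollars paid per CPU."""
    return config.cost_usd / config.cpus


def cost_per_ram_gb(config: Config) -> float:
    """Dollars paid per gigabyte of RAM."""
    return config.cost_usd / config.ram_gb


if __name__ == "__main__":
    for c in (rack, server):
        print(f"{c.name}: ${cost_per_cpu(c):,.0f} per CPU, "
              f"${cost_per_ram_gb(c):,.0f} per GB of RAM")
    print(f"CPU cost ratio: {cost_per_cpu(server) / cost_per_cpu(rack):.0f}x")
```

On these numbers the 8-CPU configuration costs roughly 60 times more per CPU and about 7.5 times more per gigabyte of RAM.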

the power problem

[Figure: server power by component: CPU, RAM/board, HD]

A Google data center, circa 2000. Note the fan on the floor to cool servers.

(Credit: Stephen Shankland - CNET News.com / Jeff Dean - Google)

their observation

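are their numbers right?

[Figure: cost versus scale, for equipment and for power & cooling]

[Equation residue: the operators and one label did not survive extraction. Recoverable terms: $7,700 + $1,500; $7,700 + $1,300 + $200; Performance / Cost; Amortization + (Power + Cooling)]

Cost of inefficiency

Min. Cost: requires $20,000 amortization

Min. Amortization: requires $1,500 operating costs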
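The $7,700 and $1,500 figures can be sanity-checked with a short calculation. The sketch below reads $7,700 as the monthly amortization of the $278,000 rack quoted earlier in the deck and $1,500 as the monthly power and cooling bill; the three-year straight-line write-off is an assumption taken from the paper the talk presents, not from the slide itself.

```python
RACK_PRICE_USD = 278_000     # rack price quoted earlier in the deck
AMORTIZATION_MONTHS = 36     # assumption: three-year straight-line write-off
POWER_COOLING_USD = 1_500    # monthly power and cooling figure from the slide

monthly_amortization = RACK_PRICE_USD / AMORTIZATION_MONTHS
monthly_total = monthly_amortization + POWER_COOLING_USD
power_share = POWER_COOLING_USD / monthly_total

if __name__ == "__main__":
    print(f"Amortization: ${monthly_amortization:,.0f} per month")
    print(f"Total:        ${monthly_total:,.0f} per month")
    print(f"Power share:  {power_share:.0%}")
```

Under that assumption the amortization comes to about $7,722 per month, consistent with the rounded $7,700, and power and cooling make up roughly one sixth of the monthly total.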

hardware

[Figure: index server components: RAM, CPU, hard drive]

[Figure: instruction pipelines drawn against one clock cycle, comparing a short pipeline (Pentium III) with a long pipeline (Pentium IV)]
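One reason pipeline length matters is the cost of a branch misprediction, which grows with the number of stages that must be flushed. The sketch below uses a textbook-style model, not a measurement: it assumes the flush penalty equals the pipeline depth in cycles, and the stage counts, branch fraction, and misprediction rate are all hypothetical values chosen for illustration rather than figures for the real processors.

```python
def effective_cpi(base_cpi: float, branch_fraction: float,
                  mispredict_rate: float, pipeline_depth: int) -> float:
    """Cycles per instruction once misprediction flushes are included.

    Simplifying assumption: each mispredicted branch costs one cycle
    per pipeline stage.
    """
    flush_penalty = pipeline_depth
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty


# Hypothetical workload: 20% of instructions are branches, 5% of those mispredict.
short_pipeline = effective_cpi(1.0, 0.20, 0.05, pipeline_depth=5)
long_pipeline = effective_cpi(1.0, 0.20, 0.05, pipeline_depth=10)

if __name__ == "__main__":
    print(f"short pipeline: {short_pipeline:.2f} CPI")
    print(f"long pipeline:  {long_pipeline:.2f} CPI")
```

In this model, doubling the depth doubles the cycles lost to mispredictions, so a branch-heavy workload gains less from a long pipeline than its clock rate suggests.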

[Figure: thread-level parallelism versus instruction-level parallelism]
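Thread-level parallelism here means that separate queries share no state and can be handled by separate threads. A minimal sketch with Python's standard `concurrent.futures` module follows; `score_query` is a hypothetical stand-in for the real per-query work, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def score_query(query: str) -> int:
    """Hypothetical stand-in for the work done to serve one query."""
    return sum(len(word) for word in query.split())


def serve_all(queries: List[str], workers: int = 4) -> List[int]:
    """Handle independent queries on a pool of threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_query, queries))


if __name__ == "__main__":
    print(serve_all(["web search", "google cluster architecture"]))
```

In CPython the global interpreter lock keeps CPU-bound threads from running at the same instant, so this shows the structure of the idea rather than a speedup. On hardware with several cores or hardware threads, independent queries like these are the work that can run side by side.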

[Figure: simultaneous multithreading (SMT): one CPU with L1 and L2 caches]

[Figure: chip multiprocessor (CMP): two CPUs, each with its own L1 cache, and one L2 cache]

memory & scalability

Unpredictable memory access

Large cache lines and prefetching help

Memory bandwidth: OK

[Figure: CPU, cache, and RAM, with line length and cache length marked]
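The effect of cache line length can be illustrated with a toy simulation. The sketch below models a direct-mapped cache of fixed capacity and counts misses for a sequential scan and for scattered reads. The cache size, line sizes, address ranges, and function names are all assumptions for illustration and do not describe any real processor.

```python
import random
from typing import Iterable, List, Optional


def count_misses(addresses: Iterable[int], line_bytes: int,
                 capacity_bytes: int = 2048) -> int:
    """Count misses in a direct-mapped cache of fixed capacity."""
    num_lines = capacity_bytes // line_bytes
    tags: List[Optional[int]] = [None] * num_lines
    misses = 0
    for addr in addresses:
        line_no = addr // line_bytes   # which memory line holds this address
        slot = line_no % num_lines     # the one cache slot that line may use
        if tags[slot] != line_no:
            tags[slot] = line_no       # load the line, evicting the old one
            misses += 1
    return misses


# 1024 four-byte reads in address order, like scanning a block of index data.
sequential = list(range(0, 4096, 4))

# 1024 reads scattered over 1 MiB, like unpredictable accesses.
rng = random.Random(0)
scattered = [rng.randrange(0, 1 << 20) for _ in range(1024)]

if __name__ == "__main__":
    for line_bytes in (32, 128):
        print(f"{line_bytes:>3}-byte lines: "
              f"{count_misses(sequential, line_bytes)} sequential misses, "
              f"{count_misses(scattered, line_bytes)} scattered misses")
```

In this model, quadrupling the line length cuts sequential misses from 128 to 32, because each miss brings in four times as many of the bytes that will be read next. The scattered reads miss almost every time at either line length.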

conclusion

Cluster architecture is ideal and least expensive

Maximize throughput

Software reliability

[Figure: Service A, Service B, Service C]

a discussion question…

HDMI Monitor

USB Keyboard

700 MHz ARM11

128 MB RAM

OpenGL ES 2.0, 1080p

-- David Braben, UK game developer

questions?