Thermal Aware Resource Management Framework

Xi He, Gregor von Laszewski, Lizhe Wang

Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Rochester, NY 14623
xi.he@mail.rit.edu

Outline

- Introduction
- Motivation
- Thermal-aware Resource Management Framework
- Motivational Examples
- System Model and Problem Definition
- Thermal-aware Task Scheduling Algorithm
- Conclusion
Introduction

Distributed Collaborative Experiment

Introduction

- Data centers consumed 61 billion kilowatt-hours of power in 2006, 1.5 percent of all US electricity use, costing around $4.5 billion.
- Energy usage doubled between 2000 and 2006.
- Energy usage is projected to double again by 2011 [1].

[1] http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf

Introduction

- Hardware Level: Dynamic Voltage Scaling, Dynamic Frequency Scaling
- Software Level: Virtualization
- Middleware Level: Job Scheduling, Virtual Machine Scheduling
- Data Center Level: Cooling System
Motivation

Why a thermal-aware resource management framework?

- To allow end users to collaborate easily with each other and to access remote resources.
- To implement Green Computing.
- To monitor temperatures in the data center.

Architecture Overview


Motivational Examples

[Figure: Different types of task-temperature profiles]

[Figure: Task-temperature profile (Buffalo Data Center)]

Motivational Examples

job1 = (0, 2, 20, f(job1))
job2 = (0, 1, 40, f(job2))

Initial temperatures: node1 = 40°C, node2 = 32°C, node3 = 34°C, node4 = 32°C

Schedule A: job1 → node2 and node4; job2 → node3
Result: node1 = 40°C, node2 = 40°C, node3 = 40°C, node4 = 40°C (max = 40°C, σ = 0)

Schedule B: job1 → node1 and node2; job2 → node3
Result: node1 = 48°C, node2 = 40°C, node3 = 40°C, node4 = 32°C (max = 48°C, σ = 5.6)
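The two placements in this example can be checked with a few lines of Python. This is only a sketch under stated assumptions: the per-node temperature rises (8°C for job1, 6°C for job2) are inferred from the slide's before and after values rather than given explicitly, the population standard deviation is assumed for σ, and the names `RISE`, `apply_schedule`, and `summarize` are hypothetical.

```python
from statistics import pstdev

# Initial node temperatures in °C, taken from the example.
initial = {"node1": 40, "node2": 32, "node3": 34, "node4": 32}

# Assumed per-node temperature rise in °C, inferred from the slide's numbers.
RISE = {"job1": 8, "job2": 6}


def apply_schedule(temps, schedule):
    """Return node temperatures after adding each job's rise to its nodes."""
    result = dict(temps)
    for job, nodes in schedule.items():
        for node in nodes:
            result[node] += RISE[job]
    return result


def summarize(temps):
    """Return (maximum temperature, population standard deviation)."""
    values = list(temps.values())
    return max(values), pstdev(values)


# Placement that puts job1 on the two coolest nodes.
balanced = apply_schedule(initial, {"job1": ["node2", "node4"], "job2": ["node3"]})
# Placement that puts job1 on node1, which is already the hottest node.
unbalanced = apply_schedule(initial, {"job1": ["node1", "node2"], "job2": ["node3"]})

print(summarize(balanced))
print(summarize(unbalanced))
```

Under these assumptions the first placement leaves every node at 40°C, while the second yields a 48°C hot spot and a population standard deviation of about 5.66, in line with the 5.6 reported on the slide.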

System Model

Here, node_i denotes the i-th node in the data center. Each node has a temperature-time profile that gives the node's temperature over time.

System Model

job = (t_start, node_num, t_exe, f_temp(t))

where t_start is the starting time of the job, node_num is the number of processors the job needs, t_exe is how long the job lasts, and f_temp(t) is the temperature change caused by executing the job, as a function of its execution time.

Problem Definition

Given a set of jobs, find an optimal schedule that assigns each job to nodes so as to minimize the temperature deviation across the computing nodes.

Here, ΔTemp is the temperature increase that job_k causes.

Problem Definition

We use the standard deviation as the metric for measuring the temperature distribution.
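One way to write this metric down, assuming the population form of the standard deviation over the n nodes (the symbols T_i, \bar{T}, and n are introduced here for illustration and do not appear on the slides):

```latex
\bar{T} = \frac{1}{n}\sum_{i=1}^{n} T_i,
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(T_i - \bar{T}\right)^{2}}
```

Here T_i is the temperature of node_i after scheduling. The population form is an assumption, but it is consistent with the motivational example: temperatures of 48, 40, 40, and 32 give σ = \sqrt{32} ≈ 5.66.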

Algorithm

1. Select the node with the lowest “current” temperature.
2. Sort jobs in descending order of the temperature rise they cause.
3. For each job
4.     Assign the job to the selected node.
5.     Update the node’s temperature-time profile.
6.     Select the node with the lowest “current” temperature.
7. End For
8. If a node’s temperature exceeds the threshold, do not choose it in the next round and let it cool down.
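The steps above can be sketched in Python roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes each job occupies a single node, that a job's effect is a constant temperature rise, and that nodes do not cool during the run. The `Node`, `Job`, and `schedule` names and the 45°C threshold are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    temp: float                                   # "current" temperature in °C
    profile: list = field(default_factory=list)   # recorded temperatures over time


@dataclass
class Job:
    name: str
    rise: float   # assumed constant temperature rise caused by the job, in °C


def schedule(jobs, nodes, threshold):
    """Greedy assignment: the hottest job goes to the coolest eligible node."""
    assignment = {}
    # Step 2: sort jobs by the temperature rise they cause, largest first.
    for job in sorted(jobs, key=lambda j: j.rise, reverse=True):
        # Step 8: skip nodes above the threshold so they can cool down.
        eligible = [n for n in nodes if n.temp <= threshold]
        if not eligible:
            raise RuntimeError("no node below the temperature threshold")
        # Steps 1 and 6: pick the node with the lowest current temperature.
        coolest = min(eligible, key=lambda n: n.temp)
        # Step 4: assign the job to the selected node.
        assignment[job.name] = coolest.name
        # Step 5: update the node's temperature-time profile.
        coolest.temp += job.rise
        coolest.profile.append(coolest.temp)
    return assignment


# Usage with the numbers from the motivational example; the two-node job1
# is modeled as two single-node tasks (an assumption of this sketch).
nodes = [Node("node1", 40), Node("node2", 32), Node("node3", 34), Node("node4", 32)]
jobs = [Job("job1a", 8), Job("job1b", 8), Job("job2", 6)]
assignment = schedule(jobs, nodes, threshold=45)
print(assignment)
print([n.temp for n in nodes])
```

With these inputs the sketch places the two job1 tasks on node2 and node4 and job2 on node3, leaving all four nodes at 40°C.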



Experiment

[Figure: Task temperature profile, with logarithmic and polynomial trend lines. x-axis: Execution Time (s); y-axis: Temperature. Polynomial trend line: y = -0.0005x^2 + 0.17x - 0.543]
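As a quick numerical check, the quadratic trend line from this plot can be evaluated directly. The sketch below assumes x is the execution time in seconds and y is the plotted temperature value (the slide does not state its unit); the function name `task_temperature` is hypothetical.

```python
def task_temperature(x):
    """Evaluate the slide's quadratic trend line at execution time x (seconds)."""
    return -0.0005 * x**2 + 0.17 * x - 0.543


# The vertex of a*x^2 + b*x + c lies at x = -b / (2a).
peak_time = -0.17 / (2 * -0.0005)

for t in (0, 50, 100, 150, 200):
    print(t, round(task_temperature(t), 3))
print("peak at", peak_time, "s:", round(task_temperature(peak_time), 3))
```

The fitted curve peaks near x = 170 s at a value of about 13.9, which stays inside the plot's axis ranges (0 to 200 on x, -2 to 16 on y).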

Experiment

[Figure: iCore7 cooling profile, with polynomial trend line. x-axis: Time (s); y-axis: Temperature]

Result

Setting        σ (Thermal-aware task scheduling)   σ (Random task scheduling)
N=10, M=30     6.2                                 13.4
N=20, M=30     5.3                                 11.1
N=20, M=40     7.3                                 16.5

N indicates the number of job groups.
M indicates the number of jobs in each group.
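To put the reported σ values in perspective, the relative reduction compared with random scheduling can be computed from the figures on this slide. The percentages below are derived here and are not stated in the original; the helper name `relative_reduction` is hypothetical.

```python
# (N, M) -> (σ with thermal-aware scheduling, σ with random scheduling), from the slide.
results = {
    (10, 30): (6.2, 13.4),
    (20, 30): (5.3, 11.1),
    (20, 40): (7.3, 16.5),
}


def relative_reduction(aware_sigma, random_sigma):
    """Fraction by which σ drops relative to random scheduling."""
    return 1 - aware_sigma / random_sigma


reductions = {key: relative_reduction(*pair) for key, pair in results.items()}
for (n, m), r in reductions.items():
    print(f"N={n}, M={m}: {r:.1%}")
```

For all three settings the thermal-aware schedule lowers σ by a little more than half, between roughly 52% and 56%.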

Related Work

In [1], [2], power reduction is achieved by power-aware task scheduling on DVS-enabled commodity systems, which can adjust the supply voltage and support multiple operating points.

[1] K. H. Kim, R. Buyya, and J. Kim, “Power aware scheduling of bag-of-tasks applications with deadline constraints on DVS-enabled clusters,” in CCGRID, 2007, pp. 541-548.

[2] R. Ge, X. Feng, and K. W. Cameron, “Performance-constrained distributed DVS scheduling for scientific applications on power-aware clusters,” in SC, 2005, p. 34.

Related Work

In [3], [4], a thermodynamic formulation of steady-state hot spots and cold spots in data centers is examined, and several task scheduling algorithms based on that formulation are presented to reduce cooling energy consumption.

[3] Q. Tang, S. K. S. Gupta, and G. Varsamopoulos, “Thermal-aware task scheduling for data centers through minimizing heat recirculation,” in CLUSTER, 2007, pp. 129-138.

[4] J. D. Moore, J. S. Chase, P. Ranganathan, and R. K. Sharma, “Making scheduling ‘cool’: Temperature-aware workload placement in data centers,” in USENIX Annual Technical Conference, General Track, 2005, pp. 61-75.

CONCLUSION

My accomplishments in the research:

- Literature review of Grid computing and Cloud computing.
- An analytical study of the Buffalo data center operation.
- Literature review of scheduling algorithms.

Conclusion

- A novel framework to solve the resource management problem.
- A thermal-aware task scheduling algorithm for data centers, which can reduce cooling energy costs.

Future work

- Investigate other thermal characteristics of data centers.
- Continue the development of the thermal-aware resource management framework.

PUBLICATION

G. von Laszewski, F. Wang, A. Younge, X. He, Z. Guo, and M. Pierce, “Cyberaide javascript: A javascript commodity grid kit,” in GCE08 at SC’08. Austin, TX: IEEE, Nov. 16 2008. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/08-javascript/vonLaszewski-08-javascript.pdf

G. von Laszewski, A. Younge, X. He, K. Mahinthakumar, and L. Wang, “Experiment and workflow management using cyberaide shell,” in 4th International Workshop on Workflow Systems in e-Science (WSES 09) in conjunction with 9th IEEE International Symposium on Cluster Computing and the Grid. IEEE, 2009.

Appendix
