The Parallel Computation of a 3D_Steady Conduction




Peking University 政学者 Paper Collection (2001)
Parallel Computation of a Three-Dimensional Steady-State Heat Conduction Problem



The Parallel Computation of a 3D Steady Conduction Problem with Gauss-Seidel Method




Cheng MuLin

Mechanical and Engineering Science Department, Peking University



Abstract




In this paper, I use MPICH to implement the parallel computation of a 3D steady conduction problem. The parallel performance of the program is tested by running cases with different meshes and numbers of processors. The tests show a linear speedup, which indicates that the program has high parallel efficiency.










Introduction

History



Single-processor supercomputers have achieved speeds beyond 100 million instructions per second and pushed hardware technology to the physical limits of chip building. This progress cannot continue indefinitely, because physical and architectural bounds limit the computational power that a single-processor system can achieve. Meanwhile, computing tasks from scientific fields such as CFD (Computational Fluid Dynamics) and nuclear physics are becoming more and more complex, and they demand huge memory and high computing speed. Parallel computer systems are designed to meet this need. The whole task is split into small pieces or steps, and each processor runs one or more of them; because different pieces are processed at the same time, the whole task can be finished more quickly than on a single-processor computer. In most cases, however, the processors in a parallel computation are not independent of each other, so data and message exchange is unavoidable, and it is very slow compared with the CPU speed. This data and message passing is the most important factor that limits the speed of parallel computers.

In recent years, different paradigms of parallelism have been developed to suit different application fields. Table 1 shows a classification system that is not complete but includes the major approaches taken by scientists, engineers, and researchers who apply parallel computing in a variety of fields. Vector/array processing was the parallelism paradigm of the early period of parallel computation research. Now MIMD (Multiple-Instruction-Multiple-Data) is the most general form, while SIMD (Single-Instruction-Multiple-Data) and SPMD (Single-Program-Multiple-Data) forms of parallelism appear to be appropriate for scientific problems whose data are regular and whose calculations are uniform and repetitious.





Table 1 Classification of parallelism paradigms



During this summer holiday, I studied MPI and MPICH and then developed a parallel program with MPICH for a 3D steady conduction problem under the guidance of Prof. Lin. This paper covers most of that work.


Basic Idea of Parallel Computation



MPI and MPICH


Message passing is a paradigm used widely on certain classes of parallel machines, especially those with distributed memory. To reduce the repetitious work of vendors who apply parallel computing, MPI (Message Passing Interface) was defined. It tries to specify both the syntax and the semantics of a core of library routines that are useful to a wide range of users and efficiently implementable on a wide range of computers. MPI describes every MPI function in a language-independent notation and also provides ANSI C and FORTRAN 77 versions of the same functions. MPICH is a portable implementation of the full MPI specification for a wide variety of parallel and distributed computing environments.



Measure of Performance


For a single-processor computer, MIPS (Million Instructions Per Second) and MFLOPS (Million Floating-Point Operations Per Second) are the traditional measures of performance. For a parallel computer system, speedup is an often-quoted measure of parallel performance, although it is also a controversial one. Speedup is defined as follows:


$$\mathrm{Speedup} = \frac{T_0}{T(N)} \qquad (1)$$


where $T_0$ is the time to compute a certain problem using a serial program on one processor, and $T(N)$ is the time to compute the same problem using a parallel program on $N$ processors. That is, speedup is the solution time on one processor divided by the solution time on $N$ processors. In practice, $T(1)$ is used in place of $T_0$ for simplicity, so speedup is computed as follows:


$$\mathrm{Speedup} = \frac{T(1)}{T(N)} \qquad (2)$$

However, we should remember the slight difference between $T(1)$ and $T_0$: $T_0$ is measured with a serial program, whereas $T(1)$ is measured with the parallel program running on one processor.
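As a worked illustration of Eq. (2), the solution times reported in Fig. 4 for the 120*30*30 mesh give the following speedup on five processors. The parallel efficiency $E$ (speedup divided by $N$) is a standard companion measure added here only for illustration; it is not a quantity defined in this paper.

```latex
\mathrm{Speedup} = \frac{T(1)}{T(5)}
                 = \frac{1.6379\times10^{4}\ \mathrm{s}}{4.4930\times10^{3}\ \mathrm{s}}
                 \approx 3.65,
\qquad
E = \frac{\mathrm{Speedup}}{N} \approx \frac{3.65}{5} \approx 0.73
```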


Parallel Computation


Problem Description



A 3D steady conduction problem is considered in this paper, as shown in Figure 1. The length (L) of the bar is 0.4 m, the width (D) is 0.1 m, and the height (H) is also 0.1 m. Aluminum is selected as the material of the bar, and the material is homogeneous throughout the bar. The aluminum properties used are as follows:

    Density                              ρ   = 2702 kg/m^3
    Specific heat at constant pressure   C_p = 903 J/(kg·K)
    Thermal conductivity                 k   = 237 W/(m·K)

There is a temperature difference between the two ends of the bar: the left end is heated to 100 K and the right end is kept at 0 K, so heat flows from left to right and the temperature reaches a steady distribution throughout the bar. On the other four faces of the bar an adiabatic boundary condition is set, that is, no heat escapes through these faces. A heat source S is also considered, where S is a function of the temperature T. S(T) can represent many cases in which the bar gains or loses heat through non-mechanical processes, such as radiation or chemical reaction.



fig.1 Problem description


Equations



Because this is a conduction problem without fluid motion, the governing equation is a Poisson equation:

$$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2} + S(T) = 0 \qquad (3)$$


At the two ends, the boundary conditions are:

$$T = T_0 = 100\ \mathrm{K} \quad \text{at } x = 0 \qquad (4)$$

$$T = T_1 = 0\ \mathrm{K} \quad \text{at } x = L \qquad (5)$$

On the four side faces of the bar, the adiabatic condition is expressed as follows:





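$$\frac{\partial T}{\partial y} = 0 \quad \text{at } y = 0 \text{ or } y = D \qquad (6)$$

$$\frac{\partial T}{\partial z} = 0 \quad \text{at } z = 0 \text{ or } z = H \qquad (7)$$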



Equations 3 to 7 determine the distribution of the temperature T throughout the bar. Because my focus is the performance of the parallel computation, the boundary conditions are designed so that the problem can be solved analytically when S(T) is set to zero. In that case the solution is linear in x:


$$T(x, y, z) = T_0 + \frac{T_1 - T_0}{L}\,x \qquad (8)$$



This equation is used to check the numerical result of the parallel computation.


Discretization and Solution Method



A structured mesh is used, as shown in Figure 1, together with the finite-difference method. First, S(T) is linearized as

$$S(T) = S_C + S_P\,T \qquad (9)$$



where $S_C$ and $S_P$ are not constants but vary with the temperature T. Second, Eq. 3 is integrated over the control volume around each grid point. Finally, the temperatures at the grid points are substituted into the integrated equation, and the finite-difference equation can be expressed in this form:





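$$
\begin{aligned}
a_P T_P &= a_E T_E + a_W T_W + a_N T_N + a_S T_S + a_T T_T + a_B T_B + b \\
a_E &= \frac{k_e\,\Delta y\,\Delta z}{(\delta x)_e}, \qquad
a_W = \frac{k_w\,\Delta y\,\Delta z}{(\delta x)_w} \\
a_N &= \frac{k_n\,\Delta z\,\Delta x}{(\delta y)_n}, \qquad
a_S = \frac{k_s\,\Delta z\,\Delta x}{(\delta y)_s} \\
a_T &= \frac{k_t\,\Delta x\,\Delta y}{(\delta z)_t}, \qquad
a_B = \frac{k_b\,\Delta x\,\Delta y}{(\delta z)_b} \\
b &= S_C\,\Delta x\,\Delta y\,\Delta z \\
a_P &= a_E + a_W + a_N + a_S + a_T + a_B - S_P\,\Delta x\,\Delta y\,\Delta z
\end{aligned}
$$
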

where $T_P$, $T_E$, $T_W$, $T_N$, $T_S$, $T_T$, and $T_B$ are the values at the center point and at its east, west, north, south, top, and bottom neighbors, respectively, and $\Delta x$, $\Delta y$, $\Delta z$ are the dimensions of the control volume. The boundary conditions need some careful treatment, but it does not differ fundamentally from the above.


Although the Gauss-Seidel line-by-line method makes the iteration converge more quickly than the Gauss-Seidel point-by-point method, we still use the point-by-point method because it is simpler to program in parallel.
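The point-by-point iteration can be sketched in serial form as follows. This is a minimal illustration and not the author's program: it assumes S(T) = 0, a uniform mesh with grid points placed on the two fixed-temperature ends, control volumes filling the cross-section, and the largest change per sweep as the convergence measure. The names `gauss_seidel_bar` and `exact_temperature` are hypothetical.

```python
def gauss_seidel_bar(nx, ny, nz, length=0.4, width=0.1, height=0.1,
                     k=237.0, t_left=100.0, t_right=0.0,
                     tol=1.0e-6, max_sweeps=100000):
    """Point-by-point Gauss-Seidel for the bar with S(T) = 0 (assumed)."""
    dx = length / (nx - 1)   # grid points sit on both x ends
    dy = width / ny          # control volumes fill the width
    dz = height / nz         # control volumes fill the height
    a_x = k * dy * dz / dx   # a_E = a_W on a uniform mesh
    a_y = k * dz * dx / dy   # a_N = a_S
    a_z = k * dx * dy / dz   # a_T = a_B

    temp = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for j in range(ny):
        for m in range(nz):
            temp[0][j][m] = t_left        # Eq. (4)
            temp[nx - 1][j][m] = t_right  # Eq. (5)

    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in range(1, nx - 1):
            for j in range(ny):
                for m in range(nz):
                    total = a_x * (temp[i - 1][j][m] + temp[i + 1][j][m])
                    a_p = 2.0 * a_x
                    # An adiabatic face carries no flux, so a missing
                    # neighbor contributes no coefficient.
                    if j > 0:
                        total += a_y * temp[i][j - 1][m]
                        a_p += a_y
                    if j < ny - 1:
                        total += a_y * temp[i][j + 1][m]
                        a_p += a_y
                    if m > 0:
                        total += a_z * temp[i][j][m - 1]
                        a_p += a_z
                    if m < nz - 1:
                        total += a_z * temp[i][j][m + 1]
                        a_p += a_z
                    new_value = total / a_p
                    max_change = max(max_change,
                                     abs(new_value - temp[i][j][m]))
                    temp[i][j][m] = new_value  # used at once by later points
        if max_change < tol:
            return temp, sweep
    raise RuntimeError("Gauss-Seidel did not converge")


def exact_temperature(x, length=0.4, t_left=100.0, t_right=0.0):
    """Linear analytical solution, Eq. (8)."""
    return t_left + (t_right - t_left) * x / length


# Small demonstration mesh (11*3*3), chosen only to keep the run short.
field, sweeps = gauss_seidel_bar(11, 3, 3)
worst = max(abs(field[i][j][m] - exact_temperature(i * 0.4 / 10))
            for i in range(11) for j in range(3) for m in range(3))
print(sweeps, worst)
```

Each new value is written back immediately, so the points visited later in the same sweep already use it; this is what distinguishes Gauss-Seidel from a Jacobi sweep.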


For parallel computing, the mesh is split by several planes perpendicular to the x direction into blocks of approximately equal size. Each processor computes one block, and the values at the grid points on the splitting faces must be passed between processors. The computing process is split, but the data resource is not. That is, at the beginning of the parallel computation all processors complete the initialization at the same time, then each computes its own block, and finally each computing node sends its result to node 0. Node 0 collects the results and writes them to a file.
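The splitting along x described above can be sketched as follows. This is an illustration under stated assumptions rather than the author's code: the function names are hypothetical, and the exchange count assumes that every splitting face passes one plane of values in each direction per sweep.

```python
def split_along_x(nx, nprocs):
    """Return one (start, stop) pair of x-plane indices per process.

    Block sizes differ by at most one plane.
    """
    if nprocs < 1 or nprocs > nx:
        raise ValueError("need 1 <= nprocs <= nx")
    base, extra = divmod(nx, nprocs)
    blocks = []
    start = 0
    for rank in range(nprocs):
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def values_exchanged_per_sweep(ny, nz, nprocs):
    """Grid values crossing all splitting faces in one sweep (assumed pattern)."""
    return 2 * (nprocs - 1) * ny * nz


print(split_along_x(20, 3))                 # [(0, 7), (7, 14), (14, 20)]
print(values_exchanged_per_sweep(5, 5, 5))  # 200
```

Under these assumptions, a 20*5*5 mesh on five processes exchanges 200 values per sweep while updating only 500 grid points, which gives a sense of how heavy the communication is for small meshes.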


Result



Three computers, each with two CPUs, are used to build the parallel computer. Using a mesh with 20*5*5 grid points, the iteration converges to a numerical solution with a residual of 1.0E-6 after 3142 iterations. Figure 2 shows the distribution of temperature through the bar.



fig.2 Temperature Distribution


The figure shows that the temperature is constant wherever x is constant and varies linearly along the x direction. This numerical result is consistent with the analytical result (Eq. 8), which confirms the correctness of the parallel program.


To test the parallel performance of my program, more cases with different meshes and numbers of processors were run on the same parallel computer. The number of iterations and the solution time on each processor were recorded for every case. The number of iterations increases only slightly when the number of processors increases from 1 to 5 (Figure 3).





fig.3 Relative increase in iteration count versus number of processors, for the 20*5*5, 60*15*15 and 120*30*30 meshes (vertical axis from 0% to 6%)


Comparing the parallel program with the serial one makes the reason for this increase in the number of iterations easy to find.
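The source does not spell the reason out. One plausible explanation, offered here only as an assumption, is that in the parallel program the first plane of each block has to use the previous sweep's value from its neighboring block, whereas the serial sweep always uses the newest value. The 1D sketch below mimics that effect; `sweeps_to_converge` is a hypothetical name and the setup is deliberately simplified.

```python
def sweeps_to_converge(n_points, n_blocks, t_left=100.0, t_right=0.0,
                       tol=1.0e-6, max_sweeps=1000000):
    """1D Gauss-Seidel in which every block start sees a lagged left value.

    With n_blocks = 1 this is the ordinary serial sweep.
    """
    temp = [0.0] * n_points
    temp[0] = t_left
    temp[-1] = t_right
    interior = n_points - 2
    # First interior index of every block except the first one.
    block_starts = {1 + (interior * b) // n_blocks
                    for b in range(1, n_blocks)}
    for sweep in range(1, max_sweeps + 1):
        old = temp[:]  # values as they were after the previous sweep
        max_change = 0.0
        for i in range(1, n_points - 1):
            # A block start cannot see its left neighbor's new value yet.
            left = old[i - 1] if i in block_starts else temp[i - 1]
            new_value = 0.5 * (left + temp[i + 1])
            max_change = max(max_change, abs(new_value - temp[i]))
            temp[i] = new_value
        if max_change < tol:
            return sweep, temp
    raise RuntimeError("did not converge")


serial_sweeps, _ = sweeps_to_converge(41, 1)
block_sweeps, _ = sweeps_to_converge(41, 4)
print(serial_sweeps, block_sweeps)
```

Both runs converge to the same linear profile; the run with lagged interface values is not expected to need fewer sweeps than the serial one, which matches the direction of the trend in Figure 3.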


The solution time on node 0 is slightly larger than that on the other nodes because of the reduction operation in the last step of the parallel computation, so the solution time on node 0 is taken as the overall solution time. With the residual fixed, the solution time increases with the number of grid points, as shown in Figure 4.




Fig.4 Computation time in seconds versus number of processors (N)

    Mesh         N=1          N=2          N=3          N=4          N=5
    20*5*5       2.0268E+00   4.3419E+00   5.0346E+00   6.6446E+00   8.1674E+00
    60*15*15     5.7377E+02   3.3798E+02   2.5750E+02   2.2130E+02   2.0940E+02
    90*25*25     3.3529E+03   2.0993E+03   1.5585E+03   1.2754E+03   1.0859E+03
    120*30*30    1.6379E+04   9.0460E+03   6.5874E+03   5.3099E+03   4.4930E+03



fig.5 Speedup versus number of processors, for the 20*5*5, 60*15*15, 90*25*25 and 120*30*30 meshes

Figure 5 shows the speedup curves. Each curve represents one mesh with a different number of grid points. When the number of grid points is small, such as 20*5*5 in Figure 5, the speedup is less than 1 and decreases as the number of processors increases. Because the amount of data passed between processors is large relative to the number of grid points, the parallel computation is slowed so much that it is slower than the single-processor computation. When the number of grid points is large enough, such as 60*15*15, 90*25*25 or 120*30*30, the speedup is greater than 1 and increases as more processors are added to the parallel computation. Moreover, the speedup curve approaches a straight line when the number of grid points is large enough, such as 120*30*30. A linear speedup curve with a slope of approximately 0.66 shows that the program has good parallel computation performance.
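The speedup values and the quoted slope can be reproduced from the times listed in Fig. 4. The sketch below applies Eq. (2) to each mesh and fits a straight line to the 120*30*30 speedups; the least-squares fit is an assumption about how the slope was obtained, and the function names are hypothetical.

```python
# Solution times in seconds from Fig. 4, for 1 to 5 processors.
TIMES = {
    "20*5*5": [2.0268e+00, 4.3419e+00, 5.0346e+00, 6.6446e+00, 8.1674e+00],
    "60*15*15": [5.7377e+02, 3.3798e+02, 2.5750e+02, 2.2130e+02, 2.0940e+02],
    "90*25*25": [3.3529e+03, 2.0993e+03, 1.5585e+03, 1.2754e+03, 1.0859e+03],
    "120*30*30": [1.6379e+04, 9.0460e+03, 6.5874e+03, 5.3099e+03, 4.4930e+03],
}


def speedups(times):
    """Speedup T(1) / T(N) for N = 1, 2, ..., following Eq. (2)."""
    return [times[0] / t for t in times]


def least_squares_slope(xs, ys):
    """Slope of the best-fit straight line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


processors = [1, 2, 3, 4, 5]
for mesh, times in TIMES.items():
    print(mesh, [round(s, 3) for s in speedups(times)])

slope = least_squares_slope(processors, speedups(TIMES["120*30*30"]))
print(round(slope, 2))  # 0.66
```

For the 20*5*5 mesh the same calculation gives speedups below 1 for every processor count above one, in line with the discussion of Figure 5.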

Because the speedup is a function of both the mesh and the number of processors, another set of speedup curves is given in Figure 6, which shows the relation between the number of grid points and the speedup.

fig.6 Speedup versus grid points, with one curve for each number of processors from 1 to 5



Discussion




Another kind of mesh partition was also tried, but it gave less speedup because more data needs to be passed between processors. All the results show that the time spent on communication between processors greatly limits the parallel computation speed. There are three traditional ways to overcome this limitation. The first is to improve the hardware of the parallel computer, but this is always expensive. The second is to change the interconnection network (IN) topology of the parallel computer. The last is to develop new algorithms that differ from the present ones for serial programs and are suited to parallel computation.



Acknowledgement

During this summer holiday, I came to Taiwan for research and exchange. My teacher Prof. Lin not only gave me much useful guidance but also helped me overcome some difficulties in daily life. My lab mates, including LoWei, Weng PeiShen, Lin ZhengWei, Li NongMing and other students, also gave me a lot of help, and I could not have finished this paper without them. Finally, I give my most earnest thanks to Prof. Shen JunShan and Prof. Li ZhengDao for giving me this chance.









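Advisor: 林昭, professor, Department of Power Mechanical Engineering, Tsing Hua University, Hsinchu, Taiwan. Main research interests: numerical simulation, large eddy simulation (LES), and parallel computing.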