Algorithm and Scaling (Issues) for Aerospace (CFD) Codes

Sukumar Chakravarthy

src@metacomptech.com

www.metacomptech.com

1

Scope of Presentation


Range of aerospace CFD and related
applications


Hierarchy of simulation approaches


Hierarchy of algorithmic approaches


Algorithm and scalability issues and
considerations


2

Presentation Approach & Goals


A picture is worth a thousand words


We will use ten thousand words and 1
picture


== eleven thousand word-equivalents



Catalog issues, serve as a collective conscience


Discuss relationship between application
needs, algorithms, modeling approaches
and HPC issues and possibilities

3

CFD++ Aerospace Applications


External aerodynamics


Propulsion integration


Component integration


Systems


Cabin airflow


FADEC


Icing


Fuel tank purge


Thrust reverser


Propulsion


Nozzle design


Jet noise

CFD++ Aerospace Applications


Plumes


Trajectory


Aerodynamic coefficients


Drag polar


Dynamic derivatives


Store separation


Canopy separation


Sabot separation


Stage separation


Pilot seat ejection


Projectiles


Spinning projectiles

CFD++ Aerospace Applications


Synthetic jets


Turbomachinery


Blade design


Blade cooling


Pulsed detonation


Flapping wings


Flexible wings


Entomopters


Helicopters


Propellers, rotors


Parachutes


Parachutists, sky-diving

CFD++ Aerospace Applications


Spacecraft launch


Reentry vehicles


Rocket assisted landings (Earth, Mars, Venus)


X-Prize vehicles


Land speed record vehicles


Bullets, artillery rounds


Liquid fuel breakup


Liquid fuel sloshing, feed


Acceleration, deceleration effects


Aeroacoustics


Flow Structure Interaction (FSI)

What’s special about Aerospace CFD?


Extremes of scales, operating conditions, physics and chemistry, speeds, application-specific needs (extraction of useful information)


Nonlinearity is most often inherent


It is not just the simulation itself that
counts


If there is no information output required,
no need to do the simulation


Hierarchy of problem classes


Steady state/unsteady problems


Small, medium and large scale problems


Entire configurations as well as analysis of
components


Engineering analysis, scientific analysis, troubleshooting


All speeds, atmospheric conditions, diverse
fluids and their properties


10

Common Elements of Simulations

Physics (nature)

Math Model of Physics

Numerical Model of Math Model

Computational Model

Human(s) in the loop

Simulation Results

Common Underlying Physical Processes

Convection: \partial ( u_k \, \overline{u_i' u_j'} ) / \partial x_k


Production: P_{ij}


Dissipation: \varepsilon_{ij}


Redistribution: \Pi_{ij}


Diffusion: d_{ij}


Evolution: \partial \overline{u_i' u_j'} / \partial t


11
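Collecting these terms gives the standard form of the Reynolds-stress transport equation; the following is a hedged reconstruction consistent with the labels above (the exact symbols on the original slide may differ):

\frac{\partial \overline{u_i' u_j'}}{\partial t}
  + \frac{\partial}{\partial x_k}\left( u_k \, \overline{u_i' u_j'} \right)
  = P_{ij} - \varepsilon_{ij} + \Pi_{ij} + d_{ij}

where P_{ij} is production, \varepsilon_{ij} dissipation, \Pi_{ij} redistribution (pressure-strain), and d_{ij} diffusion.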
Summary of some HPC issues


Loading the problem, saving final results


Checkpointing


Computational vs. communications
performance (scalability)


Data extraction issues


Robustness (10000-way parallel should be as robust as the serial algorithm)


Data-center issues (throughput, storage)


Visualization, interaction with running case

12

Modeling Hierarchy


Potential flow assumption


Small-disturbance approaches


Inviscid flows taken separately, and hybridized with boundary layer theory


Reynolds/Favre-averaged N-S equations with phenomenological turbulence models


LES and hybrid RANS-LES approaches


Special equations and models

13

Mesh possibilities


Surface mesh only (panel methods)


Cartesian mesh, almost Cartesian mesh


Structured mesh


hex (3D) & quad (2D)


Unstructured


all cell types


Hybrid structured and unstructured meshes, hex-core meshes


Patched and overset meshes


Moving (dynamic) meshes


Flexible boundaries and meshes


14

“Extreme Grids”


Aspect ratios of 10000 to 1 or more (boundary layer resolution with y+ < 1; see the estimate after this slide)


Mesh sizes of hundreds of millions of cells and more


Extreme grid spacings present in the mesh

15
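To make the aspect-ratio point concrete, here is a minimal sketch of the usual first-cell-height estimate for y+ ≈ 1, assuming a common flat-plate skin-friction correlation and illustrative freestream values (none of these numbers are from the talk):

#include <cmath>
#include <cstdio>

int main() {
    const double U     = 250.0;    // freestream speed [m/s] (assumed)
    const double rho   = 0.41;     // density [kg/m^3], roughly 10 km altitude (assumed)
    const double mu    = 1.46e-5;  // dynamic viscosity [kg/(m s)] (assumed)
    const double x     = 1.0;      // distance from leading edge [m] (assumed)
    const double yplus = 1.0;      // target y+

    const double nu   = mu / rho;
    const double Rex  = U * x / nu;
    const double Cf   = 0.058 * std::pow(Rex, -0.2);  // turbulent flat-plate estimate (assumed correlation)
    const double tauw = 0.5 * Cf * rho * U * U;       // wall shear stress
    const double utau = std::sqrt(tauw / rho);        // friction velocity
    const double y1   = yplus * nu / utau;            // first-cell height off the wall

    std::printf("Re_x = %.3e  ->  first-cell height for y+ = %.0f: %.3e m\n",
                Rex, yplus, y1);
    return 0;
}

For these assumed values the first-cell height comes out at a few microns; combined with streamwise spacings of millimeters, that is exactly where cell aspect ratios of 1000:1 to 10000:1 come from.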

Numerical approaches


Explicit and implicit


Fractional steps and factored schemes


Finite volume, finite difference schemes


Finite element schemes


Spectral and spectral element schemes


“Local” schemes and “global” schemes

16

Some HPC algorithmic challenges


Challenges of making implicit schemes truly implicit in multi-CPU computations


Ensuring insensitivity of results to variations in the number of parallel processes used


How to make the 10000-way parallel computation as robust as the serial algorithm


How to make the 10000-way parallel computation converge as well, but in much less time

17

Adaptive meshes


Adaptive elements (cells)


Adaptive grids


H-adaptation, P-adaptation, H-P-adaptation



18

Classification of Algorithms


Low information density schemes: expand the stencil to improve accuracy


High information density schemes: expand the information content per cell (e.g. use values and derivatives, or values at multiple collocation points)


Homogeneity (or lack thereof) of discretization and solution methodology


Homogeneity (or lack thereof) of underlying physics models

19

The usual scalability considerations


Computation and communication


Computation versus communication


Overlap of computation and communication (a sketch follows this slide)


Bulk of communication for local schemes can follow a one-to-a-few connectivity pattern


Global operations: global reductions often determine scalability

20
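A minimal sketch of the overlap idea for a "local" scheme, assuming MPI and a 1-D partition with at most two neighbors per rank; the kernel and sizes are illustrative placeholders, not CFD++ code. Interior cells are updated while halo messages are in flight, and the single global reduction per step is the part that typically limits scalability:

#include <mpi.h>
#include <cmath>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n = 1000;                            // local cells per rank (assumed)
    std::vector<double> u(n + 2, double(rank));    // u[0] and u[n+1] are halo cells
    std::vector<double> unew(u);
    const int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    const int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    // Post the halo exchange with the (at most) two mesh neighbors.
    MPI_Request reqs[4];
    MPI_Irecv(&u[0],     1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(&u[n + 1], 1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &reqs[1]);
    MPI_Isend(&u[1],     1, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &reqs[2]);
    MPI_Isend(&u[n],     1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[3]);

    // Interior cells need no halo data: compute them while messages are in flight.
    double local_res = 0.0;
    for (int i = 2; i <= n - 1; ++i) {
        unew[i] = 0.5 * (u[i - 1] + u[i + 1]);     // stand-in for a local update
        local_res += std::fabs(unew[i] - u[i]);
    }

    // Finish communication, then update the two cells that touch partition faces.
    MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
    const int face[2] = {1, n};
    for (int k = 0; k < 2; ++k) {
        const int i = face[k];
        unew[i] = 0.5 * (u[i - 1] + u[i + 1]);
        local_res += std::fabs(unew[i] - u[i]);
    }

    // The one global operation per step: a reduction for the residual norm.
    double global_res = 0.0;
    MPI_Allreduce(&local_res, &global_res, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0) std::printf("global residual = %e\n", global_res);

    MPI_Finalize();
    return 0;
}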

21

Recent Scalability Improvements


CFD++ now scales well to a very large number of cores


The scalability improvements are universal: they apply to all modern HPC platforms from all vendors


Tests have shown effective performance all the way up to 4096 cores


Even relatively small grids (e.g. 16 million cells) scale well to 2048 or even 4096 cores, depending on computer and type of case run


Goal: to demonstrate similar performance on 10000 to 40000 cores


[Plots: scaling performance vs. number of CPU cores. Ex 1: 33M cells, Computer 1, Case 1 (up to ~1200 cores). Ex 2: 16M cells, Computer 2, Case 2 (up to ~5000 cores).]

Some Influences on Scalability


Effect of physics: increased sophistication means more computation, often more scalability


Effect of numerics: increased accuracy means more computation, and more communication, often more scalability


Effect of grid: more grid means more computation and less communication for "local" algorithms (see the estimate after this slide)

22
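A rough back-of-the-envelope version of the grid effect, assuming a local stencil and a roughly cubic partition with N cells per edge: the work on each rank grows with the N^3 interior cells, while the halo traffic grows with the roughly 6N^2 face cells, so

\frac{\text{communication}}{\text{computation}} \sim \frac{6N^2}{N^3} = \frac{6}{N}

i.e. putting more grid on each rank improves the ratio, which is why larger meshes often scale further.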

Additional thoughts on Parallel Processing


Two ways of using multiple compute
engines


Parallel computations


Pipelined computations


Pipelined algorithms have not been exploited much at the HPC level


Process-level and thread-level parallelism are beginning to be combined (e.g. to exploit GPGPUs); a hybrid sketch follows this slide

23
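A minimal sketch of combining process-level and thread-level parallelism (MPI across ranks, OpenMP threads within each rank); the array and the per-cell update are illustrative placeholders, not CFD++ internals:

#include <mpi.h>
#include <omp.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    int provided = 0;
    // Request a threading level that allows OpenMP inside each MPI process.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    std::vector<double> u(1000000, 1.0);   // local field (size is illustrative)
    double local_sum = 0.0;

    // Thread-level parallelism within the process.
    #pragma omp parallel for reduction(+ : local_sum)
    for (long i = 0; i < (long)u.size(); ++i) {
        u[i] = 0.5 * u[i] + 1.0;           // stand-in for a per-cell update
        local_sum += u[i];
    }

    // Process-level parallelism across the machine.
    double global_sum = 0.0;
    MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0) std::printf("global sum = %e\n", global_sum);

    MPI_Finalize();
    return 0;
}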

Load balancing issues


Structured vs. unstructured grids (usually solved by weighted domain decomposition; see the sketch after this slide)


Adaptive algorithms and adaptive meshes


Different physics in different regions


Moving meshes and overset meshes

24
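A minimal sketch of the weighted decomposition idea in one dimension: cells carry unequal costs (e.g. extra physics in part of the domain), and cut points are chosen so each rank receives a roughly equal share of the total weight. The weights and part count are made up for illustration; a production code would typically use a graph partitioner instead:

#include <cstdio>
#include <vector>

// Return cut points so that rank p owns cells [cuts[p], cuts[p+1]).
std::vector<int> split_by_weight(const std::vector<double>& w, int nparts) {
    double total = 0.0;
    for (double wi : w) total += wi;

    std::vector<int> cuts(nparts + 1, (int)w.size());
    cuts[0] = 0;
    const double target = total / nparts;
    double accum = 0.0;
    int part = 1;
    for (int i = 0; i < (int)w.size() && part < nparts; ++i) {
        accum += w[i];
        if (accum >= part * target) cuts[part++] = i + 1;
    }
    return cuts;
}

int main() {
    // 100 cheap cells followed by 100 expensive ones (e.g. a reacting region).
    std::vector<double> w(200, 1.0);
    for (int i = 100; i < 200; ++i) w[i] = 4.0;

    std::vector<int> cuts = split_by_weight(w, 4);
    for (int p = 0; p < 4; ++p)
        std::printf("rank %d: cells [%d, %d)\n", p, cuts[p], cuts[p + 1]);
    return 0;
}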

Optimization considerations


Parallel algorithms for optimization


How to use large numbers of processors


E.g. do many cases in parallel (see the sketch after this slide)


Pre-compute a case matrix, sensitivities, etc., and then train neural networks or tabulate sensitivities before applying the optimization procedure

25
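One simple way to use a very large processor count for a case matrix is to split the allocation into independent sub-communicators, one case per group; a minimal sketch (run_case and the case count are hypothetical placeholders):

#include <mpi.h>
#include <cstdio>

static void run_case(int case_id, MPI_Comm comm) {
    int rank;
    MPI_Comm_rank(comm, &rank);
    if (rank == 0) std::printf("case %d running on its own communicator\n", case_id);
    // ... a full CFD solve for this design point would go here ...
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    const int n_cases = 4;                       // assumed size of the case matrix
    const int color   = world_rank % n_cases;    // which case this rank works on

    MPI_Comm case_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &case_comm);
    run_case(color, case_comm);

    MPI_Comm_free(&case_comm);
    MPI_Finalize();
    return 0;
}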

Multi-physics considerations


Communications between non-homogeneous simulation tools


Communications between diverse hardware
platforms


Tight coupling vs. loose coupling
considerations

26

Need for Parallel I/O and File systems


Very large scale problems


Very large number of processors


Initial load and final save + intermediate
data output


Asymmetric data extraction needs


27

Typical “post-processing” needs


Global information (forces and moments,
lift, drag, torque)


Semi-global information (forces and moments along wing span, along fuselage)


Reduced subsets: iso-surfaces, surface data, cut-planes


Time-averages versus instantaneous values


In-situ “post”-processing can be very useful

28

Single and Distributed File Parallel I/O


Parallel I/O (PIO) can be accomplished in two
ways


In Single-File mode, PIO reads and writes from the current full-mesh/full-solution files (see the sketch after this slide).


In Distributed-File mode, PIO reads and writes from a set of files (e.g. placed in subdirectories) associated with each parallel process.

29
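A minimal sketch of the Single-File flavor using MPI-IO, assuming every rank owns an equal-sized, contiguous block of the solution (file name and sizes are illustrative; Distributed-File mode would instead open one file per process):

#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n_local = 100000;                     // cells owned by this rank (assumed)
    std::vector<double> solution(n_local, double(rank));

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "solution.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    // Each rank writes at its own offset into the one shared file; the
    // collective call lets the MPI library aggregate requests for the
    // parallel file system.
    MPI_Offset offset = (MPI_Offset)rank * n_local * (MPI_Offset)sizeof(double);
    MPI_File_write_at_all(fh, offset, solution.data(), n_local, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}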

Interactive massively parallel computing


Steady state versus Transient (unsteady)
computations


Links with front-end and graphical processing


Even post processing of large scale
problems may require substantial parallel
computing resources


One should not just focus on the “batch”
computing model

30

Some elements of the balancing act


Computation


Communication


Memory requirements


I/O requirements


Accuracy requirements


Robustness requirements


In-situ solution processing requirements

31

Bandwidths to consider


Number of cores vs. number of I/O
channels


Memory bandwidth from core to memory


Memory
access conflicts

32

Some old ideas revisited


Paying more attention to connectivity
architecture


Minimization of hops


Domain decomposition that minimizes
traffic between switches


How many switches or hops (groups of
nodes), how many nodes, how many
processors in a node, how many cores per
processor

33

Final thoughts


The challenge of producing codes that work in
the user’s hands and computing facilities


Ease of use


Scalability and effectiveness vs. just scalability


Resource maximization versus minimization


What can be done with less


What can be done with more


What more can be
done with less

Thank you

34