Leveraging Parallel Computing to Advance Computational Solid Mechanics


1 Dec 2013


US Army Corps of Engineers
BUILDING STRONG®
Unclassified//Unlimited Distribution


Kent T. Danielson, Ph.D., Director
Shock and Vibration Information Analysis Center
Research Civil Engineer
Engineer Research and Development Center
Vicksburg, MS USA

OUTLINE

- Explicit Dynamic Approaches
  - High-Rate Short Duration Events
- Current Software
  - Legacy Elements/Material Models
  - Coarse-Grain Parallel w/ MPI
- Current R&D → Nano & Small-Scale Emphasis
- In-Between Advances are Feasible via HPC
- Next Generation of Parallel Paradigms?

CSM/CSD

- Long Term R&D → Nano & Multiscale Methods
- Helped by HPC, Infeasible for Practical Problems

[Figure: Multiscale Model Levels. Labels: Macroscale, Molecular, Nanoscale, Microscale, Mesoscale, Atomistic, Baryons & Beyond. Method 1: Ladder Approach; Method 2: Direct Approach]

CSM/CSD

- In-Between Advances are More Reasonable
  - Mesoscale w/ Bridging → Not Inherently Parallel
  - Inelastic/Damage Models Beyond Plasticity
- Predominantly 1st Order Elements
  - 1970s, 1980s "Speed Over Accuracy"
  - New Elements & Meshfree by Leveraging HPC
Parallel Computational Testbed

Developed over last 15+ yrs at ERDC

ParaAble Framework

- ParaMount → Installation
  - IBM, SGI, Compaq, Cray Platforms, Linux Clusters
  - Parallel or Serial MPI Library, Unix (e.g., g95), Windows
- PrePara → Serial METIS/RCB Based Partitioning
- ParaAble → Parallel Explicit Solver
- ParaGraph → Serial Gather/Manipulation for Visualization
  - Separate/Concurrent
  - Becoming Less Necessary/Obsolescent?

ParaAble Characteristics

- FORTRAN 95 (except for METIS)
  - Strict ANSI, MPI-1 Standards (Except Options)
- Coarse Grain MPI or SHMEM Parallelism
- Scalable I/O Via Individual Processor Files
- Weighted Partitioning (New METIS Features)
  - Element/Material Costs
- Overlapping Communications/Computations
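
The idea of weighting a partition by element/material cost can be sketched with a small recursive coordinate bisection (RCB) routine. This is only an illustration under stated assumptions: it is not the PrePara code and does not call METIS, and the function name `weighted_rcb`, the 2-D centroids and the cost values are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def weighted_rcb(centroids: Sequence[Point], costs: Sequence[float],
                 n_parts: int) -> List[int]:
    """Assign elements to partitions by cost-weighted recursive coordinate bisection."""
    part = [0] * len(centroids)

    def split(ids: List[int], first: int, count: int) -> None:
        if count == 1 or len(ids) <= 1:
            for i in ids:
                part[i] = first
            return
        # Cut across the longer side of this group's bounding box.
        xs = [centroids[i][0] for i in ids]
        ys = [centroids[i][1] for i in ids]
        axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
        ids = sorted(ids, key=lambda i: centroids[i][axis])
        left_parts = count // 2
        target = sum(costs[i] for i in ids) * left_parts / count
        # Advance the cut until the left side holds its share of the total cost.
        running, cut = 0.0, 0
        while cut < len(ids) - 1 and running + costs[ids[cut]] / 2 < target:
            running += costs[ids[cut]]
            cut += 1
        cut = max(cut, 1)  # keep both sides non-empty
        split(ids[:cut], first, left_parts)
        split(ids[cut:], first + left_parts, count - left_parts)

    split(list(range(len(centroids))), 0, n_parts)
    return part


# Hypothetical strip of 8 elements: 4 cheap ones, then 4 that cost 3x as much.
centroids = [(i + 0.5, 0.5) for i in range(8)]
costs = [1.0] * 4 + [3.0] * 4
parts = weighted_rcb(centroids, costs, 2)
loads = [sum(c for c, p in zip(costs, parts) if p == k) for k in range(2)]
print(parts, loads)  # [0, 0, 0, 0, 0, 1, 1, 1] [7.0, 9.0]
```

An unweighted split would put four elements on each side, with loads of 4 and 12; weighting by cost moves the cut so the two loads are much closer.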

Overlapping Communications/Computations

- Edge Elements
- Overlap (Synchronize)
- Nonblocking Send/Recv
- Interior Elements

Critical Time Step from Entire Mesh
Duplicate Nodes on Boundaries
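
The ordering on this slide can be emulated serially to show why it gives the same nodal forces as a single-domain assembly. The sketch below uses no MPI: a plain dictionary (`mailbox`) stands in for the nonblocking send/receive, and the 1-D bar, stiffness, displacements, element lengths and wave speed are all made-up values chosen for illustration.

```python
def element_forces(u, elems, k):
    """Sum internal nodal forces of 2-node bar elements into a {node: force} dict."""
    f = {}
    for a, b in elems:
        axial = k * (u[b] - u[a])
        f[a] = f.get(a, 0.0) + axial
        f[b] = f.get(b, 0.0) - axial
    return f


def partition_step(u, elems, shared, k, mailbox, rank):
    """One force evaluation on one partition, in edge-first order."""
    edge = [e for e in elems if shared in e]
    interior = [e for e in elems if shared not in e]
    f = element_forces(u, edge, k)        # 1. edge elements first
    mailbox[rank] = f.get(shared, 0.0)    # 2. stand-in for a nonblocking send
    for node, val in element_forces(u, interior, k).items():
        f[node] = f.get(node, 0.0) + val  # 3. interior work overlaps the exchange
    return f


u = [0.0, 0.1, 0.3, 0.6, 1.0, 1.5, 2.1, 2.8, 3.6]  # nodal displacements
k, shared = 2.0, 4                                  # node 4 is duplicated
elems0 = [(0, 1), (1, 2), (2, 3), (3, 4)]           # partition 0
elems1 = [(4, 5), (5, 6), (6, 7), (7, 8)]           # partition 1

mailbox = {}
f0 = partition_step(u, elems0, shared, k, mailbox, rank=0)
f1 = partition_step(u, elems1, shared, k, mailbox, rank=1)
f0[shared] += mailbox[1]  # 4. synchronize: add the neighbor's contribution
f1[shared] += mailbox[0]

serial = element_forces(u, elems0 + elems1, k)      # single-domain reference

# Critical time step from the entire mesh: local minima, then a global minimum
# (the role a MIN reduction across processors would play).
lengths = {0: [1.0, 1.0, 0.5, 1.0], 1: [1.0, 0.25, 1.0, 1.0]}
wave_speed = 5000.0
dt_local = {r: min(L / wave_speed for L in ls) for r, ls in lengths.items()}
dt_global = min(dt_local.values())
```

Both copies of the duplicated node end up with the force the serial assembly gives, and every partition advances with the same global time step even though partition 0 alone would allow a larger one.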

COMPARISON of 1st to 2nd ORDER HEXs

- Pros
  - Effective in Flexure
  - No Hourglass Control
  - Curved Shapes
  - Contact
  - Fracture Mechanics
  - Less Diffusion
- Cons
  - More Complex
  - Smoother
  - Expensive
  - Historically Inaccurate in Explicit Methods

CONS OF 1st ORDER HEXs

- 1st Order Choices
  - Full Integration: Shear Locks
  - Reduced Integration: Hourglasses
  - Hourglass Control Can Be Troublesome
  - Incompatible Modes: Incompatibility
  - Slow Convergence in Flexure

2nd ORDER EXPLICIT HISTORY

- DYNA3D (Late 1970s)
  - 20 Node Serendipity
  - 2x2x2 Gauss Quadrature
  - Ad Hoc Lumping
  - Inaccurate and Impractical
  - NEGATIVE CORNER LOADS
- Cook, Malkus, Plesha, Fried (1980s)
  - 9 Node Quad (2-D)
  - 3x3 Gauss Quadrature
  - "Optimal" Lumping (Simpson's Rule)
  - Simple Idealized Problems
- ABAQUS (21st Century)
  - 10 Node Tetrahedron
  - Proprietary Formulation
  - Hourglass Control
- Ortiz, … Tet ???

[Figure: 8 Node Face]
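
"NEGATIVE CORNER LOADS" presumably refers to the consistent nodal loads on the 8-node face of a 20-node serendipity element; that reading of the slide is an assumption. The property itself is standard and easy to check: integrating the serendipity shape functions over a flat square face under uniform pressure gives corner loads of opposite sign to the applied pressure. A minimal sketch, with hypothetical names throughout:

```python
import math

# 3-point Gauss rule on [-1, 1]
GAUSS3 = [(-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9)]

# Parent coordinates of the 8-node serendipity face: 4 corners, then 4 midsides
NODES = [(-1, -1), (1, -1), (1, 1), (-1, 1), (0, -1), (1, 0), (0, 1), (-1, 0)]


def shape(i, xi, eta):
    """Standard 8-node serendipity shape function for node i."""
    xn, yn = NODES[i]
    if xn != 0 and yn != 0:  # corner node
        return 0.25 * (1 + xi * xn) * (1 + eta * yn) * (xi * xn + eta * yn - 1)
    if xn == 0:              # midside node on an edge eta = +/-1
        return 0.5 * (1 - xi * xi) * (1 + eta * yn)
    return 0.5 * (1 + xi * xn) * (1 - eta * eta)  # midside on an edge xi = +/-1


def consistent_loads(pressure=1.0):
    """Integrate pressure * N_i over the 2 x 2 parent square (unit Jacobian)."""
    return [pressure * sum(wx * wy * shape(i, x, y)
                           for x, wx in GAUSS3 for y, wy in GAUSS3)
            for i in range(8)]


loads = consistent_loads()
fractions = [f / sum(loads) for f in loads]
```

Each corner carries -1/12 of the total load and each midside node +1/3, so the corner entries are negative even though the pressure acts in one direction.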

Parallel Computational Testbed

- Reliable Formulations
  - 27 Node Hexahedrons
  - 3x3x3 Gauss Quadrature or 3x3x3, 2x2x2 Selective Reduced Integration
  - Row-Summation Lumping
  - Type I (Classical)
  - Type II (New)

$$M_{II} = \sum_{J=1}^{\#\,\mathrm{Nodes}} \int_{V_o} \rho_o\, h_I h_J \, dV_o = \int_{V_o} \rho_o\, h_I \, dV_o, \qquad M_{IJ} = 0,\ I \neq J$$
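
Row-summation can be illustrated for an undistorted 27-node hexahedron, where the shape functions are products of 1-D quadratic Lagrange functions and the Jacobian is constant, so the lumped masses are products of 1-D row sums. The sketch assumes unit density and a 2 x 2 x 2 parent cube; the function names are hypothetical and this is not the testbed implementation.

```python
import math
from itertools import product

# 3-point Gauss rule on [-1, 1] (exact for polynomials up to degree 5)
GAUSS3 = [(-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9)]


def h1d(xi):
    """1-D quadratic Lagrange shape functions for nodes at xi = -1, 0, +1."""
    return [0.5 * xi * (xi - 1.0), 1.0 - xi * xi, 0.5 * xi * (xi + 1.0)]


def consistent_mass_1d(rho=1.0):
    """Consistent mass M_IJ = integral of rho * h_I * h_J over [-1, 1]."""
    m = [[0.0] * 3 for _ in range(3)]
    for xi, w in GAUSS3:
        h = h1d(xi)
        for i in range(3):
            for j in range(3):
                m[i][j] += rho * w * h[i] * h[j]
    return m


def row_sum(m):
    """Row-summation lumping: the diagonal entry is the sum of its row."""
    return [sum(row) for row in m]


def hex27_row_sum(rho=1.0):
    """Lumped masses of an undistorted 27-node hex as products of 1-D row sums."""
    r = row_sum(consistent_mass_1d())
    return {(i, j, k): rho * r[i] * r[j] * r[k]
            for i, j, k in product(range(3), repeat=3)}


r1d = row_sum(consistent_mass_1d())  # about [1/3, 4/3, 1/3]
m27 = hex27_row_sum()
```

Every one of the 27 lumped masses is positive (corner 1/27, mid-edge 4/27, face center 16/27, body center 64/27 for this cube), and they add up to the element mass of 8.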

Parallel Computational Testbed

- Evaluated as Unreliable
  - 20 Node Hexahedrons
  - 2x2x2, Directional, and Several 14 pt Reduced Gauss Quadrature Rules: Sometimes Fine, but Usually Unstable; Lack of Stiffness Matrix (Lumping) Permits Propagation of Instabilities
  - Nodal Integration or "Optimal" (Simpson's Rule) Lumping is Inaccurate w/ Poorly Shaped Elements

Nodal Integration "Optimal" Lumping

$$M_{IJ} = \int_{V_o} \rho_o\, h_I h_J \, dV_o, \qquad M_{IJ} = 0,\ I \neq J, \quad \text{since } h_I = 1 \text{ and } h_J = 0\ (J \neq I) \text{ at Node } I$$

- Simpson's Rule (Newton-Cotes n=2) → Quadrature Points are Nodes
- For Perfect Cube, Yields Identical Mass Lumping as Row-Summation
- Accuracy of 2x2x2 Gauss Quadrature ("Full" 3x3x3 Used in Row-Summation)
- Accuracy is Reduced w/ Element Distortion
- Does not Provide the Reliability of Row-Summation
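
The first, second and fourth bullets can be checked numerically on the 2-D analogue, a 9-node quadrilateral. This is a sketch under assumptions: unit density, a hypothetical distortion that shifts one midside node along its edge, and illustrative function names. With Simpson's rule the quadrature points coincide with the nodes, so the same routine returns either row-summation masses (3x3 Gauss) or nodal-integration masses (Simpson).

```python
import math
from itertools import product

XI = (-1.0, 0.0, 1.0)
GAUSS3 = [(-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9)]
SIMPSON = [(-1.0, 1 / 3), (0.0, 4 / 3), (1.0, 1 / 3)]  # points are the nodes


def h1d(t):
    """1-D quadratic Lagrange shape functions for nodes at -1, 0, +1."""
    return (0.5 * t * (t - 1.0), 1.0 - t * t, 0.5 * t * (t + 1.0))


def dh1d(t):
    """Derivatives of the 1-D shape functions."""
    return (t - 0.5, -2.0 * t, t + 0.5)


def det_j(coords, xi, eta):
    """Jacobian determinant of the 9-node isoparametric map at (xi, eta)."""
    hx, hy, dx, dy = h1d(xi), h1d(eta), dh1d(xi), dh1d(eta)
    x_xi = x_eta = y_xi = y_eta = 0.0
    for (i, j), (x, y) in coords.items():
        x_xi += dx[i] * hy[j] * x
        x_eta += hx[i] * dy[j] * x
        y_xi += dx[i] * hy[j] * y
        y_eta += hx[i] * dy[j] * y
    return x_xi * y_eta - x_eta * y_xi


def lumped_mass(coords, rule, rho=1.0):
    """M_I = sum over quadrature points of w * rho * h_I * detJ."""
    mass = {ij: 0.0 for ij in coords}
    for (xi, wx), (eta, wy) in product(rule, rule):
        hx, hy = h1d(xi), h1d(eta)
        dj = det_j(coords, xi, eta)
        for i, j in coords:
            mass[i, j] += rho * wx * wy * hx[i] * hy[j] * dj
    return mass


square = {(i, j): (XI[i], XI[j]) for i, j in product(range(3), repeat=2)}
distorted = dict(square)
distorted[(1, 0)] = (0.3, -1.0)  # bottom midside node moved off center

row_sq, nod_sq = lumped_mass(square, GAUSS3), lumped_mass(square, SIMPSON)
row_di, nod_di = lumped_mass(distorted, GAUSS3), lumped_mass(distorted, SIMPSON)
```

On the undistorted square the two lumpings coincide. With the shifted midside node both still sum to the element mass, but individual nodal masses differ (about 0.164 from row-summation versus about 0.178 from nodal integration at the lower-left corner).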

APPLICATIONS

- No Theoretical Basis for Lumping → Numerical Testing

PATCH TEST/1-D WAVE

3 Identical Distributions
"Optimal" Lumping Fails Patch Test

ELASTIC/INELASTIC BENDING

Belytschko-Bindeman Example Compared to Puso Element in DYNA3D
[Figure panels: Elastic; Elastic-Plastic]

STRESS EXTRAPOLATION

Multiple Points within Element Facilitate Extrapolation
(Gradient can be Computed without a Patch of Elements)
TWISTED BEAM 3-D BENDING

IMPACT APPLICATIONS

[Figure panels: ELASTIC-PLASTIC; HYPERELASTIC]

REBAR MODELING

Elastica Problem
Single Element Through Thickness
[Figure: Rebar Cross-Section]

Elastica Solution for #4 Rebar

Single Element Through Thickness
Loaded Slowly w/o Mass Damping or Scaling

Material Model Developments

- Classical Plasticity Models Frequently Assume Idealized Failure Surfaces, Uncoupled Deviatoric-Volumetric Responses, Ad Hoc Damage Models, etc.
- Again, Capabilities in Current Software Predominantly Developed in "Speed Over Accuracy" Era of the 1970s and 1980s
- Advanced Methods that Don't Go to Extreme Scales Offer Advancements for Practical Applications that were not Feasible in the Past: Damage Models, Microplane, etc.

Basic Multi-Scale Material Model

More Confidence in Arbitrary 3-D Strain Paths
Microplane Inelasticity/Damage Brought to Macro-Scale
Homogenization Via Virtual Work & Free Energy Equivalency

Bridging:

$$\Psi^{macro} = \frac{3}{4\pi} \int_{\Omega} \Psi^{micro} \, d\Omega, \qquad \Psi = \text{Helmholtz Free Energy Density}$$

[Figure: Microplane - Macro/Meso Scale]
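
The bridging integral can be evaluated numerically to see what the 3/(4π) factor does. The sketch below assumes Ω is the full unit sphere of microplane normals, which is how the slide's garbled formula is read here, and uses a simple midpoint rule rather than whatever quadrature the actual model uses; `macro_from_micro` is a hypothetical name.

```python
import math


def macro_from_micro(psi_micro, n_theta=100, n_phi=200):
    """Midpoint-rule estimate of (3 / (4*pi)) * integral of psi_micro(n) over the unit sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        st, ct = math.sin(theta), math.cos(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            n = (st * math.cos(phi), st * math.sin(phi), ct)  # microplane normal
            total += psi_micro(n) * st * d_theta * d_phi      # dOmega = sin(theta) dtheta dphi
    return 3.0 / (4.0 * math.pi) * total


const_energy = macro_from_micro(lambda n: 1.0)         # close to 3
nx_nx = macro_from_micro(lambda n: n[0] * n[0])        # close to 1
nx_ny = macro_from_micro(lambda n: n[0] * n[1])        # close to 0
nz_nz = macro_from_micro(lambda n: n[2] * n[2])        # close to 1
```

With this normalization the orientation average of n_i n_j returns the identity tensor, so a strain projected onto the microplanes and integrated back recovers its macroscopic counterpart.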

Inherent Anisotropic Damage

Eckhart & Ramm

- Many State Variables (100+)
- ~Order of Magnitude More Computations than Plasticity

Penetrator Simulations

Microplane Model for Concrete → Erosion

Penetrator Simulation Performance

[Plot: CPU Time, sec. (axis 0 to 140,000) vs. Number of Cores (75, 150, 300, 600, 900, 1200)]

Explosive Detonation in Wall

First Order
- 995,192 HEX8FB Elements, 1,030,890 Nodes
- 1.0 CPU hrs on 64 cores

Second Order
- 110,896 HEX27 Elements, 920,157 Nodes
- 1.0 CPU hrs on 128 cores
- 30 CPU min on 256 cores
- No Hourglass Control
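
The timings on this slide can be turned into speedup, parallel efficiency and core-hours with a few lines of arithmetic. The sketch assumes the quoted "CPU hrs" and "CPU min" are elapsed run times on the stated core counts; that interpretation is not confirmed by the slide, and the function names are illustrative.

```python
def speedup_and_efficiency(cores_ref, time_ref, cores, time):
    """Speedup relative to a reference run, and efficiency relative to ideal scaling."""
    speedup = time_ref / time
    efficiency = speedup / (cores / cores_ref)
    return speedup, efficiency


def core_hours(cores, hours):
    """Total resource use of one run."""
    return cores * hours


# Second-order (HEX27) runs: 1.0 hr on 128 cores, 30 min on 256 cores
s, e = speedup_and_efficiency(128, 60.0, 256, 30.0)  # times in minutes

first_order_cost = core_hours(64, 1.0)        # HEX8FB run
second_order_cost_128 = core_hours(128, 1.0)  # HEX27 run on 128 cores
second_order_cost_256 = core_hours(256, 0.5)  # HEX27 run on 256 cores
```

Under that assumption, doubling the cores halves the second-order run time (speedup 2, efficiency 1), and the second-order model uses about twice the core-hours of the first-order model (128 versus 64) with no hourglass control.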
Truck Bomb Retrofit Application

Internal Detonation in a Bridge Pier Application

~15 Million Hexahedral Elements
~12 min. on 8192 PEs, Cray XT4
Microplane Model for Concrete

Bridge Pier Internal Detonation Performance

[Plot: CPU Time, sec. (axis 0 to 45,000) vs. Number of Cores (256, 512, 1024, 2048, 4096, 8192)]

Summary: HPC, MPI and Beyond

- MPI
  - Scales Well for 1000s of PEs in Many Design Applications
  - Scales Well for 100,000s of PEs in Some R&D Applications
  - Some Areas Still are Troublesome
  - Not inherently parallel or tedious programming → CAF, UPC
  - Designed in 1990s → 10s to 100s of PEs
  - Syntax may be functional, but implementation may not be useful
  - ALL_TO_ALL for 100,000 PEs is much different than for 100
    - May need nesting, but performance and message counts can still be issues
  - No inherent way to recover from PE failure
  - I/O is difficult for large problems
- Multi-threaded Operating Systems (Parallex, etc.)
  - OS to Control All Processes for Redistribute, …
- Leverage HPC to Modernize CSM/CSD Software
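
The ALL_TO_ALL remark can be made concrete by counting messages. The first function counts point-to-point messages for a naive exchange in which every PE sends one message to every other PE. The second is a purely hypothetical two-level ("nested") scheme, not taken from the slides or from any MPI implementation: PEs forward data to a group leader, leaders exchange among themselves, and leaders redistribute within their group.

```python
def naive_all_to_all_messages(p):
    """Messages when each of p PEs sends one message to each of the others."""
    return p * (p - 1)


def two_level_messages(p, group_size):
    """Messages for a hypothetical gather / leader exchange / scatter scheme."""
    if p % group_size != 0:
        raise ValueError("p must be a multiple of group_size in this sketch")
    groups = p // group_size
    gather = groups * (group_size - 1)   # members send to their leader
    exchange = groups * (groups - 1)     # leaders exchange all-to-all
    scatter = groups * (group_size - 1)  # leaders redistribute to members
    return gather + exchange + scatter


small = naive_all_to_all_messages(100)          # 9,900
large = naive_all_to_all_messages(100_000)      # 9,999,900,000
nested = two_level_messages(100_000, 100)       # 1,197,000
```

Going from 100 to 100,000 PEs multiplies the naive message count by roughly a million. Nesting reduces the count, but each leader message then carries far more data, which is consistent with the slide's note that performance and message counts can still be issues.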
Thank You!