
Scalable Scientific Applications

Characteristics & Future Directions



Douglas B. Kothe


with Richard Barrett, Ricky Kendall, Bronson Messer, and Trey White

Leadership Computing Facility

National Center for Computational Sciences

Oak Ridge National Laboratory

CSRI Scalable Apps Workshop, Santa Fe, NM, June 3-5, 2008
Managed by UT-Battelle for the Department of Energy

Science Teams Have Specific PF Objectives

Combustion (S3D)
  Science driver: Predictive engineering simulation tool for new engine design
  Science objective: Understand flame stabilization in lifted, autoigniting diesel fuel jets relevant to low-temperature combustion at realistic engine operating conditions
  Impact: Potential for a 50% increase in efficiency and 20% savings in petroleum consumption with lower-emission, leaner-burning engines

Fusion (GTC)
  Science driver: Understand and quantify the physics and properties of ITER scaling and H-mode confinement
  Science objective: Strongly coupled, consistent wall-to-edge-to-core modeling of ITER plasmas; attain a realistic assessment of ignition margins
  Impact: ITER design and operation

Chemistry (MADNESS)
  Science driver: Computational catalysis
  Science objective: Describe large systems accurately with modern hybrid and meta density functional theory functionals
  Impact: Generate quantitative catalytic reaction rates and guide small-system calibration

Nanoscale Science (DCA++)
  Science driver: Material-specific understanding of high-temperature superconductivity theory
  Science objective: Understand the quantitative differences in the transition temperatures of high-temperature superconductors
  Impact: Macroscopic quantum effects at elevated temperatures (>150 K); new materials for power transmission and oxide electronics

Climate (POP)
  Science driver: Accurate representation of ocean circulation
  Science objective: Fully coupled, eddy-resolving ocean and sea-ice model to reduce coupled-model biases where ice and deep-water parameters are governed by the accurate representation of current systems
  Impact: Reduce current uncertainties in the coupled ocean-sea ice system model

Geoscience (PFLOTRAN)
  Science driver: Multiscale, multiphase, multicomponent modeling of a 3-D field CO2 injection scenario
  Science objective: Include an oil phase and a four-phase liquid-gas-aqueous-oil system to describe dissipation of the supercritical CO2 phase and escape of CO2 to the surface
  Impact: Demonstrate the viability of and potential for sequestration of anthropogenic CO2 in deep geologic formations

Astrophysics (CHIMERA)
  Science driver: Understand the core-collapse supernova mechanism for a range of progenitor star masses
  Science objective: Perform core-collapse simulations with sophisticated spectral neutrino transport, detailed nuclear burning, and general relativistic gravity
  Impact: Understand the origin of many elements in the Periodic Table and the creation of neutron stars and black holes


Application Requirements at the PF

Application categories analyzed
- Science motivation and impact
- Science quality and productivity
- Application models, algorithms, software
- Application footprint on platform
- Data management and analysis
- Early-access science-at-scale scenarios

Results
- 100+ page Application Requirements Document published in July 2007
- New methods for categorizing platform and application attributes devised and used in the analysis, guiding tactical infrastructure purchase and deployment
- But still too qualitative! More work to do...


Application Codes in 2008
An Incomplete List

Astrophysics: CHIMERA, GenASiS, 3DHFEOS, Hahndol, SNe, MPA-FT, SEDONA, MAESTRO, AstroGK
Biology: NAMD, LAMMPS
Chemistry: CPMD, CP2K, MADNESS, NWChem, Parsec, Quantum ESPRESSO, RMG, GAMESS
Nuclear Physics: ANGFMC, MFDn, NUCCOR, HFODD
Engineering: Fasel, S3D, Raptor, MFIX, Truchas, BCFD, CFL3D, OVERFLOW, MDOPT
High Energy Physics: CPS, Chroma, MILC
Fusion: AORSA, GYRO, GTC, XGC
Materials Science: VASP, LS3DF, DCA++, QMCPACK, RMG, WL-LSMS, WL-AMBER, QMC
Accelerator Physics: Omega3P, T3P
Atomic Physics: TDCC, RMPS, TDL
Space Physics: Pogorelov
Climate & Geosciences: MITgcm, PFLOTRAN, POP, CCSM (CAM, CICE, CLM, POP)
Computer Science (Tools): Active Harmony, IPM, KOJAK, mpiP, PAPI, PMaC, Sca/LAPACK, SvPablo, TAU


Apps Teams Are Reasonably Adept at Using Our Current Systems*

*Is the "field of dreams" approach inadequate (too little, too late)?

What is "effective utilization"? Scaling? Percent of peak (Jacobi vs. MG: a Jacobi solve can post a high percent of peak yet converge far more slowly than multigrid)?

Current SC apps range from 2-70% of peak: what's the goal?

Remember, we improve what we measure, so let's have the right metrics and measures.

My $0.02: the science and engineering achievements on these systems are the legacy.


Science Workload
Job Sizes and Resource Usage of Key Applications

Code       2007 usage           Projected 2008 usage   Typical job size,       Anticipated job size,
           (M core-hours)       (M core-hours)         2006-2007 (K cores)     2008 (K cores)
CHIMERA    2 (under devel.)     16                     0.25 (under devel.)     >10
GTC        8                    7                      8                       12
S3D        6.5                  18                     8-12                    >15
POP        4.8                  4.7                    4                       8
MADNESS    1 (under devel.)     4                      0.25 (under devel.)     >8
DCA++      N/A (under devel.)   3-8                    N/A (under devel.)      4-16 (w/o disorder), >40 (with disorder)
PFLOTRAN   0.37 (under devel.)  >2                     1-2 (under devel.)      >10
AORSA      0.61                 1                      15-20                   >20


Preparing for the Exascale
Long-Term Science Drivers and Requirements

- We have recently surveyed, analyzed, and documented the science drivers and application requirements envisioned for exascale leadership systems in the 2020 timeframe
- These studies help to
  - Provide a roadmap for the ORNL Leadership Computing Facility
  - Uncover application needs and requirements
  - Focus our efforts on those disruptive technologies and research areas in need of our and the HPC community's attention



What Will an EF System Look Like?

- All projections are daunting
- Based on projections of existing technology, both with and without "disruptive technologies"
  - Assumed to arrive in the 2016-2020 timeframe
- Example 1
  - 115K nodes @ 10 TF per node (about 1.15 EF peak), 50-100 PB, optical interconnect, 150-200 GB/s injection bandwidth per node, 50 MW
- Examples 2-4 (DOE "Townhall" report*)

*www.er.doe.gov/ASCR/ProgramDocuments/TownHall.pdf


Science Prospects and Benefits with High-End Computing (EF?) in the Next Decade

Materials science
  Key application areas: Nanoscale science, manufacturing, and material lifecycles, response, and failure
  Goal and benefit: Design, characterize, and manufacture materials, down to the nanoscale, tailored and optimized for specific applications

Earth science
  Key application areas: Weather, carbon management, climate change mitigation and adaptation, environment
  Goal and benefit: Understand the complex biogeochemical cycles that underpin global ecosystems and control the sustainability of life on Earth

Energy assurance
  Key application areas: Fossil, fusion, combustion, nuclear fuel cycle, chemical catalysis, renewables (wind, solar, hydro), bioenergy, energy efficiency, power grid, transportation, buildings
  Goal and benefit: Attain, without costly disruption, the energy required by the United States in guaranteed and economically viable ways to satisfy residential, commercial, and transportation requirements

Fundamental science
  Key application areas: High energy physics, nuclear physics, astrophysics, accelerator physics
  Goal and benefit: Decipher and comprehend the core laws governing the Universe and unravel its origins

Biology and medicine
  Key application areas: Proteomics, drug design, systems biology
  Goal and benefit: Understand connections from individual proteins through whole cells into ecosystems and environments

National security
  Key application areas: Disaster management, homeland security, defense systems, public policy
  Goal and benefit: Analyze, design, stress-test, and optimize critical systems such as communications, homeland security, and defense systems; understand and uncover the human behavioral systems underlying asymmetric operational environments

Engineering design
  Key application areas: Industrial and manufacturing processes
  Goal and benefit: Design, deploy, and operate safe and economical structures, machines, processes, and systems with reduced concept-to-deployment time


Science Case: Climate

250 TF
- Mitigation: Initial simulations with dynamic carbon cycle and limited chemistry
- Adaptation: Decadal simulations with high-resolution ocean (1/10°)

1 PF
- Mitigation: Full chemistry, carbon/nitrogen/sulfur cycles, ice-sheet model, multiple ensembles
- Adaptation: High-resolution atmosphere (1/4°), land, and sea ice, as well as ocean

Sustained PF
- Mitigation: Increased resolution, longer simulations, more ensembles for reliable projections; coupling with socio-economic and biodiversity models
- Adaptation: Limited cloud-resolving simulations, large-scale data assimilation

1 EF
- Mitigation: Multi-century ensemble projections for detailed comparisons of mitigation strategies
- Adaptation: Full cloud-resolving simulations, decadal forecasts of regional impacts, and extreme-event statistics

Resolve clouds, forecast weather and extreme events, and provide quantitative mitigation strategies
- Mitigation: Evaluate strategies and inform policy decisions for climate stabilization; 100-1000 year simulations
- Adaptation: Decadal forecasts and regional impacts; prepare for committed climate change; 10-100 year simulations


Barriers in Ultrascale Climate Simulation
Attacking the Fourth Dimension: Parallel in Time

Problem
- Climate models use explicit time stepping
- The time step must go down as resolution goes up
- Time stepping is serial
- Single-process performance is stagnating
- More parallel processes do not help!

Possible complementary solutions
- Implicit time stepping
- High-order in time
- "Fast" bases: curvelets and multi-wavelets
- "Parareal" parallel in time (see the code sketch below)

Progress
- Implicit version of HOMME for the global shallow-water equations: 10x speedup for a steady-state test case
- High-order single-step time integration
- Single-cycle multigrid linear solver for 1D
- Pure advection with curvelets and multi-wavelets

Near-term plans
- Scale, tune, and precondition implicit HOMME
- Single-cycle multigrid linear solver for 2D
- "Parareal" for Burgers' equation (1D nonlinear)

Ref: Trey White (ORNL)
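
To make the "Parareal" item above concrete, here is a minimal, serial C sketch of the parareal iteration for the scalar model problem dy/dt = lambda*y, with backward Euler used for both the coarse propagator G and the fine propagator F. It is illustrative only and is not drawn from the HOMME or Burgers' work cited on this slide; the slice count, step counts, and lambda are placeholder values. In a parallel implementation, the fine solves over the time slices run concurrently (for example, one MPI rank per slice), leaving only the cheap coarse sweep serial.

```c
/* Minimal serial sketch of the parareal iteration for dy/dt = lambda*y. */
#include <stdio.h>
#include <math.h>

#define N_SLICES 8        /* time slices = available parallelism in time */
#define N_ITER   4        /* parareal correction iterations */
#define LAMBDA  (-1.0)
#define T_END    2.0

/* Coarse propagator: one backward-Euler step across a whole slice. */
static double G(double y, double dt) { return y / (1.0 - LAMBDA * dt); }

/* Fine propagator: many small backward-Euler steps across a slice. */
static double F(double y, double dt) {
    int m = 100;
    double h = dt / m;
    for (int i = 0; i < m; ++i) y = y / (1.0 - LAMBDA * h);
    return y;
}

int main(void) {
    double dt = T_END / N_SLICES;
    double U[N_SLICES + 1], Fk[N_SLICES + 1], Gk[N_SLICES + 1];

    U[0] = 1.0;
    for (int n = 0; n < N_SLICES; ++n)         /* cheap coarse initial guess */
        U[n + 1] = G(U[n], dt);

    for (int k = 0; k < N_ITER; ++k) {
        /* Fine (and coarse) solves on every slice use the previous iterate
           and are independent: this is the loop that would run in parallel. */
        for (int n = 0; n < N_SLICES; ++n) {
            Fk[n + 1] = F(U[n], dt);
            Gk[n + 1] = G(U[n], dt);
        }
        /* Serial coarse correction sweep: U_{n+1} = G(U_n) + F_old - G_old. */
        for (int n = 0; n < N_SLICES; ++n)
            U[n + 1] = G(U[n], dt) + Fk[n + 1] - Gk[n + 1];
    }
    printf("parareal: %.6f   analytic exp(lambda*T): %.6f\n",
           U[N_SLICES], exp(LAMBDA * T_END));
    return 0;
}
```

After at most one correction per slice the iterate reproduces the serial fine solution exactly, which is what makes the extra parallel dimension in time attractive once spatial scaling saturates.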


Science Case: Astrophysics

250 TF
- The interplay of several important phenomena: hydrodynamic instabilities, role of nuclear burning, neutrino transport

1 PF
- Determine the nature of the core-collapse supernova explosion mechanism
- Fully integrated, 3D neutrino radiation hydrodynamics simulations with nuclear burning

Sustained PF
- Detailed nucleosynthesis (element production) from core-collapse SNe
- Large nuclear network capable of isotopic prediction (along with energy production)

1 EF
- Precision prediction of the complete observable set from core-collapse SNe: nucleosynthesis, gravitational waves, neutrino signatures, light output
- Tests general relativity and yields information about the dense matter equation of state, along with detailed knowledge of stellar evolution
- Full 3D Boltzmann neutrino transport, 3D MHD/RHD, nuclear burning

CHIMERA
Explanation and prediction of core-collapse SNe; put general relativity, dense EOS, and stellar evolution theories to the test


Requirements Gathering

- Consult the literature and existing documentation
- Construct a survey eliciting speculative requirements for scientific applications on HPC platforms in 2010-2020
- Pass the survey to leading computational scientists in a broad range of scientific domains
- Analyze and validate the survey results (hard)
- Make informed decisions and take action


Survey Questions

- What are some possible science drivers and urgent problems that would require Leadership Computing in 2010-2020?
- What are some looming computational challenges that will need resolution in 2010-2020?
- What are some science objectives and outcomes that Leadership Computing could enable in 2010-2020?
- What are some improvement goals for science-simulation fidelity that Leadership Computing could enable in 2010-2020?
- What are some possible changes in physical model attributes for Leadership-Computing applications in 2010-2020?
- What major software-development projects could occur in your application area in 2010-2020?
- What major algorithm changes could occur for your applications in 2010-2020?
- What libraries and development tools may need to be developed or significantly improved for Leadership Computing in 2010-2020?
- How might system-attribute priorities change for Leadership Computing for your application?
- In what ways might or should your workflow in 2010-2020 be different from today?
- Are there any disruptive technologies that might affect your applications?


Findings in Models and Algorithms

- The seven algorithm types are scattered broadly among science domains, with no one particular algorithm being ubiquitous and no one algorithm going unused
  - Structured grids and dense linear algebra continue to dominate, but other algorithm categories will become more common
- Compared to the Seven Dwarfs for current applications, we project a significant increase in Monte Carlo and increases in unstructured grids, sparse linear algebra, and particle methods, as well as a relative decrease in FFTs
  - These projections reflect the expectation of much greater parallelism in architectures and the resulting need for very high scalability
  - Load balancing, scalable sparse solvers, and random number generator algorithms will become more important
- Some important algorithms are not captured in the Seven Dwarfs
  - Categories expected by application scientists to be of growing importance in 2010-2020 include adaptive mesh refinement, implicit nonlinear systems, data assimilation, agent-based methods, parameter continuation, and optimization


Findings in Software

- "Hero developer" mode is a dead end
  - It does not scale, and no single person can adequately understand the breadth and depth of the issues
  - Progress comes only from computer scientists, algorithm developers, application developers, and end-user scientists working together in a tightly integrated manner
  - We must develop a means of interface between the heterogeneous computer, the developer, and the end-user scientist
- We must raise the level of abstraction
  - The current approach, based on low-level constructs, places constraints on performance: it over-constrains the compiler and runtime system
  - Raising the abstraction level allows for increased algorithm experimentation, incorporation of intent in data structures, flexible memory organization, and inclusion of fault tolerance constructs
  - It enables exploration of power-aware algorithms
  - It frees heroic software efforts from having to be the norm


Findings in Software (continued)

- Application development and maintenance tools and practices need to change fundamentally
- Productivity improvement is an important metric and guide for tool and software choices
- Fault tolerance and V&V software components must be used to improve the reliability and robustness of application software
- Knowledge discovery techniques and tools should be explored to help with bug detection, simulation steering, and data feature extraction and correlation
- A holistic view of application data (from input to archival) is needed to most effectively deliver tools for the end-to-end workflow performed by scientists


Applications Analyzed

- CHIMERA (astrophysics): core-collapse supernova explosion mechanism
- S3D (turbulent combustion): lifted flame stabilization in diesel and gas turbine engines
- GTC (fusion): analyze and validate CTEM and ETG core turbulence
- POP (global ocean circulation): eddy-resolved flow with biogeochemistry
- DCA++ (high-temperature superconductivity): effect of charge and spin inhomogeneities in the Hubbard model superconducting state
- MADNESS (chemistry): neutron and x-ray spectra of cuprates; dynamics of few-electron systems; metal oxide surfaces in catalytic processes
- PFLOTRAN (reactive flows in porous media): uranium migration and CO2 sequestration in subsurface geologic formations


Application Requirements and Workload Reinforce a Balanced-System Assertion

- The applications analyzed represent almost one half of our 2008 allocation
- A broad range of compute/communicate workloads must be supported
  - Depends upon the science, the application within that science, and the problem tackled by the application
- Application requirements call for breadth in models, algorithms, software, and scaling type
  - Physical models: coupled continuum conservation laws, radiation transport, many-body Schrödinger, plasma physics, Maxwell's equations, turbulence
  - Numerical algorithms: each of the "7 dwarfs" is required
  - Software implementation: all popular languages are required
  - Science drivers: strong scaling (time to solution) and weak scaling (bigger problems)
- Application readiness action plans are in place and being followed

[Pie chart: 2008 allocation by domain - Materials Science 16.0%, Combustion 14.4%, Astrophysics 14.1%, Climate 13.6%, Chemistry 7.4%, Fusion 7.2%, Nuclear Physics 5.2%, QCD 4.9%, Biology 4.8%, Solar Physics 3.3%, Accelerator Physics 3.1%, Computer Science 2.8%, Atomic Physics 1.4%, Geosciences 1.2%, Engineering 0.56%]

[Chart: computation vs. communication fraction (0-100% on each axis) for GTC, S3D, POP, CHIMERA, DCA++, MADNESS, and PFLOTRAN; the distribution in this space depends upon the application and the problem being simulated for a given application]


Resource Utilization by Science Applications

Science Dictates the Requirements


Example: PF Performance Observations and Readiness Plan for Some of Our Key Apps

S3D (science scaling need: larger problem)
- Compute-bound with minimal communication overhead
- Reduce memory contention with hybrid parallelism (see the MPI+OpenMP sketch below)
- Increase cache reuse

GTC (science scaling need: larger problem)
- Compute-bound with minimal communication overhead
- Use radial domain decomposition to eliminate cross-core collective calls
- Reduce the size of the problem per core and get better cache reuse
- Increase SSE factor

DCA++ (science scaling need: solution time)
- Heavily compute-bound, benefiting from Level 3 BLAS routines (DGEMM, ZGEMM)
- Very good use of SSE (50%) with no changes
- Include the disorder model for an additional level of parallelism: a 10x need for more processors
- Multithreaded linear algebra will allow additional parallelism at a lower level

MADNESS (science scaling need: solution time)
- Fully asynchronous algorithm with communication hidden by the model
- Nicely positioned to exploit Gemini
- Good SSE factor, but still room for improvement

POP (science scaling need: solution time)
- Sizeable communication component
- Reduce memory contention time and increase SSE factor
- Minimize synchronous behavior; better cache blocking
- New physics (biogeochemistry) increases the compute fraction

CHIMERA (science scaling need: larger problem)
- Communication dominated by collectives
- Production-level physics increases the compute fraction
- Reasonable SSE factor, but room for improvement
- 20% raw speedup from Gemini without enhancements

PFLOTRAN (science scaling need: solution time)
- Communication dominated by collectives
- Poor SSE factor; some room for improvement
- Additional phases and chemical species will reduce memory contention (the natural block structure of the Jacobian enables more efficient use of the memory hierarchy)
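
As a hedged illustration of the "hybrid parallelism" item in the S3D row (not S3D's actual implementation), the following minimal C sketch shows the MPI+OpenMP pattern: one MPI rank per node or socket with OpenMP threads sharing that rank's arrays, so per-core MPI buffers no longer compete for memory bandwidth and collectives involve one rank per node instead of one per core. The array size and kernel are placeholders.

```c
/* Minimal hybrid MPI+OpenMP sketch: threads share one rank's memory. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define NLOCAL 1000000   /* cells owned by this rank (placeholder) */

int main(int argc, char **argv) {
    int provided, rank;
    /* Ask for thread support; only the master thread makes MPI calls here. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *u = malloc(NLOCAL * sizeof(double));
    double local_sum = 0.0, global_sum = 0.0;

    /* Threads share the rank's arrays: no per-core halo or buffer copies. */
    #pragma omp parallel for reduction(+:local_sum)
    for (long i = 0; i < NLOCAL; ++i) {
        u[i] = (double)(i % 7);        /* stand-in for a physics update */
        local_sum += u[i];
    }

    /* One collective participant per node instead of one per core. */
    MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE,
                  MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("threads/rank = %d, global sum = %g\n",
               omp_get_max_threads(), global_sum);
    free(u);
    MPI_Finalize();
    return 0;
}
```

Built with an MPI compiler wrapper and OpenMP enabled (for example, mpicc -fopenmp), with OMP_NUM_THREADS set to the number of cores per node.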


Accelerating Development & Readiness

- Automated diagnostics
  - Drivers: performance analysis, application verification, S/W debugging, H/W-fault detection and correction, failure prediction and avoidance, system tuning, and requirements analysis
- Hardware latency
  - Won't improve nearly as much as flop rate, parallelism, and bandwidth in the coming years
  - Can S/W strategies mitigate high H/W latencies?
- Hierarchical algorithms
  - Applications will require algorithms aware of the system hierarchy (compute/memory)
  - In addition to hybrid data parallelism and file-based checkpointing, algorithms may need to include dynamic decisions between recomputing and storing, fine-scale task-data hybrid parallelism, and in-memory checkpointing (see the buddy-checkpoint sketch after this list)
- Parallel programming models
  - Improved programming models are needed to allow developers to identify an arbitrary number of levels of parallelism and map them onto hardware hierarchies at runtime
  - Models continue to be coupled into larger models, driving the need for arbitrary hierarchies of task and data parallelism
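
The sketch below illustrates the "in-memory checkpointing" idea mentioned under hierarchical algorithms, assuming a simple buddy scheme: each MPI rank keeps a copy of a partner rank's solver state in memory so a single-node failure can be recovered without touching the parallel file system. The pairing rule (rank XOR 1, which assumes an even number of ranks), the state size, and the recovery path are placeholder assumptions, not a description of any particular application's fault-tolerance design.

```c
/* Minimal sketch of in-memory (buddy) checkpointing with MPI. */
#include <mpi.h>
#include <stdio.h>

#define STATE_N 4096   /* doubles of solver state per rank (placeholder) */

static void buddy_checkpoint(double *state, double *partner_copy,
                             MPI_Comm comm) {
    int rank;
    MPI_Comm_rank(comm, &rank);
    /* Pair ranks i <-> i^1 (assumes an even number of ranks). */
    int partner = rank ^ 1;
    /* Exchange state: I hold my partner's checkpoint, it holds mine. */
    MPI_Sendrecv(state, STATE_N, MPI_DOUBLE, partner, 0,
                 partner_copy, STATE_N, MPI_DOUBLE, partner, 0,
                 comm, MPI_STATUS_IGNORE);
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    double state[STATE_N], partner_ckpt[STATE_N];
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < STATE_N; ++i) state[i] = rank + 0.001 * i;

    /* Taken periodically (e.g., every k timesteps), much more cheaply
       than a file-based defensive checkpoint. */
    buddy_checkpoint(state, partner_ckpt, MPI_COMM_WORLD);

    if (rank == 0) printf("in-memory checkpoint exchanged\n");
    MPI_Finalize();
    return 0;
}
```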


Accelerating Development & Readiness (continued)

- Solver technology and innovative solution techniques
  - Global communication operations across 10^6-10^8 processors will be prohibitively expensive; solvers will have to eliminate global communication where feasible and mitigate its effects where it cannot be avoided. Research on more effective local preconditioners will become a very high priority (see the sketch after this list)
  - If increases in memory bandwidth continue to lag the number of cores added to each socket, further research is needed into ways to effectively trade flops for memory loads/stores
- Accelerated time integration
  - Are we ignoring the time dimension along which to exploit parallelism? (Example: climate)
- Model coupling
  - Coupled models require effective methods to implement, verify, and validate the couplings, which can occur across wide spatial and temporal scales. The coupling requirements drive the need for robust methods for downscaling, upscaling, and coupled nonlinear solving
  - Evaluation of the accuracy and importance of couplings drives the need for methods for validation, uncertainty analysis, and sensitivity analysis of these complex models
- Maintaining current libraries
  - The reliance of current HPC applications on libraries will grow
  - Libraries must perform as HPC systems grow in parallelism and complexity
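
To ground the solver point, here is a small, illustrative C sketch (placeholder names, a point-Jacobi "block" of size one) contrasting the two kinds of kernels involved: a purely local preconditioner apply that requires no communication, and the global inner product whose allreduce latency on 10^6-10^8 processors is what drives the push to eliminate, aggregate, or hide global operations.

```c
/* Local preconditioning vs. global reduction in a Krylov solver (sketch). */
#include <mpi.h>
#include <stddef.h>

/* Local preconditioner: z = M^{-1} r using only rank-owned data.
   diag_inv holds precomputed inverse diagonal entries; no MPI calls.
   A real local preconditioner would apply a block ILU/LU instead. */
void precond_apply_local(size_t n, const double *diag_inv,
                         const double *r, double *z) {
    for (size_t i = 0; i < n; ++i)
        z[i] = diag_inv[i] * r[i];          /* no communication at all */
}

/* Global inner product: the O(log P)-latency collective that solvers
   try to minimize or overlap at extreme scale. */
double dot_global(size_t n, const double *x, const double *y,
                  MPI_Comm comm) {
    double local = 0.0, global = 0.0;
    for (size_t i = 0; i < n; ++i) local += x[i] * y[i];
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
    return global;
}
```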


PF Survey Findings (with some opinion)

- A rigorous and evolving apps-requirements process pays dividends
  - Needs to be quantitative: apps cannot "lie" with performance analysis
- Algorithm development is evolutionary
  - Can we break this mold?
  - Example: explore new parallel dimensions (time, energy)
- Hybrid/multi-level programming models are virtually nonexistent
- No algorithm "sweet spots" (one size fits all)
  - But algorithm footprints share characteristics
- V&V and SQA are not in good standing
  - Ramifications for compute systems as well as for the application results generated
- No one is really clamoring for new languages
  - MPI until the water gets too hot (the boiling-frog analogy)
- Application lifetimes are >3-5x machine lifetimes
  - Refactoring is a way of life
- Fault tolerance via defensive checkpointing is the de facto standard
  - Won't this eventually bite us? It artificially drives I/O demands
- Weak or strong scaling or both (no winner)
- The data analytics paradigm must change
- The middleware layer is surprisingly stable and agnostic across apps (and should expand!)




Summary & Recommendations: EF Survey

- We are in danger of failing because of a software crisis unless concerted investments are undertaken to close the H/W-S/W gap
  - H/W has gotten way ahead of the S/W (same ole, same ole?)
- Structured grids and dense linear algebra continue to dominate, but ...
  - Increases projected for Monte Carlo algorithms, unstructured grids, sparse linear algebra, and particle methods (relative decrease in FFTs)
  - Increasing importance for AMR, implicit nonlinear systems, data assimilation, agent-based methods, parameter continuation, and optimization
- Priority of computing system attributes
  - Increase: interconnect bandwidth, memory bandwidth, mean time to interrupt, memory latency, and interconnect latency (reflects the desire to increase computational efficiency to use peak flops)
  - Decrease: disk latency, archival storage capacity, disk bandwidth, wide area network bandwidth, and local storage capacity (reflects the expectation that computational efficiency will not increase)
  - Per-core requirements are relatively static, while aggregate requirements will grow with the system


Summary & Recommendations: EF Survey (continued)

- System software must possess more stability, reliability, and fault tolerance during application execution
  - New fault tolerance paradigms must be developed and integrated into applications
  - Job management and efficient scheduling of those resources will be a major obstacle faced by computing centers
- Systems must be much better "science producers"
  - Strong software engineering practices must be applied to systems to ensure good end-to-end productivity
  - Data analytics must empower scientists to ask "what-if" questions, providing S/W and H/W infrastructure capable of answering these questions in a timely fashion (Google desktop)
  - Strong data management will become an absolute necessity at the exascale
- Just as H/W requires disruptive technologies to accelerate its natural evolutionary path, so too will algorithm, software, and physical model development efforts need disruptive technologies (invest now!)


Fusion Simulation Project: Where to Find 12 Orders in 10 Years?
From David Keyes, FSP Review, ASCAC, 30 April 2008

Hardware: 3 orders
- 1.5 orders: increased processor speed and efficiency
- 1.5 orders: increased concurrency

Software: 9 orders
- 1 order: higher-order discretizations
  - The same accuracy can be achieved with many fewer elements
- 1 order: flux-surface-following gridding
  - Less resolution is required along field lines than across them
- 4 orders: adaptive gridding
  - Zones requiring refinement are <1% of the ITER volume, and resolution requirements away from them are ~10^2 less severe
- 3 orders: implicit solvers
  - Mode growth time is 9 orders longer than the Alfvén-limited CFL


A View from Berkeley (John Shalf)*

- Need better benchmarks and better performance models
  - For reliable extrapolated code requirements
- Power is driving daunting concurrency
- Scalable programming models
  - Need to exploit hierarchical machine architecture
- Hybrid processors
  - More concurrency; need a more generalized approach
- Apps must deal with platform reliability
- Don't forget autotuning (see the sketch after this list)
  - Shows the value of good compilers and associated R&D
- Fast, robust I/O is hard
- Scaling and concurrency are outstripping our ability to do rigorous V&V
- Application code complexity has outgrown the available tools
- Frameworks and community codes can work, but with certain "rules of engagement"

*ASCAC Fusion Simulation Project Review panel presentation (4/30/08)
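
As a hedged illustration of the autotuning point (not a description of any particular autotuner), the sketch below times a blocked placeholder kernel at several candidate block sizes on the target machine and keeps the fastest. Production autotuners search far larger parameter spaces and persist the chosen variant, but the mechanism is the same.

```c
/* Minimal empirical-autotuning sketch: time candidates, keep the fastest. */
#include <stdio.h>
#include <time.h>

#define N 2048
static double A[N][N];

/* Candidate kernel parameterized by block size; a stand-in for a real
   stencil or matrix kernel whose best blocking is machine-dependent. */
static void kernel(int bs) {
    for (int ii = 0; ii < N; ii += bs)
        for (int jj = 0; jj < N; jj += bs)
            for (int i = ii; i < ii + bs && i < N; ++i)
                for (int j = jj; j < jj + bs && j < N; ++j)
                    A[i][j] = 0.5 * A[i][j] + 1.0;   /* placeholder update */
}

int main(void) {
    int candidates[] = {16, 32, 64, 128, 256};
    int ncand = (int)(sizeof candidates / sizeof candidates[0]);
    int best = candidates[0];
    double best_t = 1e30;

    for (int c = 0; c < ncand; ++c) {
        clock_t t0 = clock();
        kernel(candidates[c]);                        /* one timed trial */
        double t = (double)(clock() - t0) / CLOCKS_PER_SEC;
        printf("block %4d : %.3f s\n", candidates[c], t);
        if (t < best_t) { best_t = t; best = candidates[c]; }
    }
    printf("selected block size: %d\n", best);        /* reuse in production */
    return 0;
}
```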


Questions?

Doug Kothe (kothe@ornl.gov)