A survival toolkit for the technology outback
Michael Aivazis (aivazis@caltech.edu)
California Institute of Technology
NOBUGS (ok, maybe a few bugs)
Sydney, 3-5 November 2008
Lay of the land
Evolutionary pressures
• Alternatives: evolve, uplift or perish…
Debunking stereotypes
[Figure: Dymaxion map (B. Fuller)]
User stereotypes
• End-user
  – occasional user of prepackaged and specialized analysis tools
• Application author
  – author of prepackaged specialized tools
• Expert user
  – investigator with a specific scientific goal
• Domain expert
  – author of analysis, modeling or simulation software
• Software integrator
  – responsible for extending software with new technology
• Framework maintainer
  – responsible for maintaining and extending the infrastructure
Technical challenges
Sources of complexity
• Project size:
  – asset complexity: number of lines of code, files, entry points
  – dependencies: number of modules, third-party libraries
  – runtime complexity: number of object types and instances
• Problem size:
  – number of processors needed, amount of memory, CPU time
• Project longevity:
  – life cycle, duty cycle
  – cost/benefit of reuse
  – managing change: people, hardware, technologies
• Locality of needed resources
  – compute/persist: where, how, when, who
• User interfaces
  – younger users will have no tolerance for either “bad” or “ugly”
• Adapting to new technology
• Turning craft into: science, engineering, … art
Past successes
• Projects:
  – Caltech ASC Center (DOE)
  – GeoFramework (NSF)
  – Computational Infrastructure in Geodynamics (NSF)
  – DANSE (NSF)
  – Caltech PSAAP Center (DOE)
• Large collaborations
  – faculty, post-docs, students
  – geographically distributed
• Challenges
  – independent but coherent evolution
  – integration
Leveraging
Flexibility through scripting
• Scripting enables us to
  – organize the large number of simulation parameters
  – allow the simulation environment to discover new capabilities without the need for recompilation or re-linking
• Integration framework is written in Python
  – The interpreter
    • modern object oriented language
    • robust, portable, mature, well supported, well documented
    • easily extensible
    • rapid application development
  – Support for parallel programming
    • trivial embedding of the interpreter in an MPI compliant manner
    • a python interpreter on each compute node
    • MPI is fully integrated: bindings + OO layer
  – No measurable impact on either performance or scalability
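
The point about discovering new capabilities without recompilation can be illustrated with a small sketch. It is plain Python 3 and does not use pyre; the helper load_capability and the table of requested names are hypothetical, and two standard library functions stand in for simulation capabilities.

```python
import importlib


def load_capability(module_name, factory_name):
    """Import a module by name at run time and return the named callable."""
    module = importlib.import_module(module_name)
    return getattr(module, factory_name)


# The names would normally come from a script or a configuration file;
# here two standard library functions stand in for simulation capabilities.
requested = {"norm": ("math", "hypot"), "mean": ("statistics", "mean")}
capabilities = {
    label: load_capability(mod, fn) for label, (mod, fn) in requested.items()
}

norm_value = capabilities["norm"](3.0, 4.0)
mean_value = capabilities["mean"]([1, 2, 3])
```

Because the module and function names are ordinary strings, adding a capability means editing a script or a configuration entry, with nothing to rebuild.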
Pyre
• Pyre is a software architecture:
  – a specification of the organization of the software system
  – a description of the crucial structural elements and their interfaces
  – a specification for the possible collaborations of these elements
  – a strategy for the composition of structural and behavioral elements
• Pyre is multi-layered
  – flexibility
  – complexity management
  – robustness under evolutionary pressures
• Pyre is a component framework
[Figure: layer diagram with labels application-general, application-specific, framework, computational engines]
Choosing your gear
Some pyre services
• journal
  – flexible control over the generation and delivery of simulation diagnostics from the compute nodes to the workstation
• monitor
  – a distributed service for low bandwidth, on the fly visualizations
  – currently used mostly for status monitoring and debugging
• timer: embedded performance monitor
• ipa: user authentication
  – passwords, SSL certificates, … Grid authentication
• weaver
  – a general source code generation facility
  – support for many languages
    • FORTRAN, C, C++, python, HTML, XML
  – automatic web page creation for cgi scripts
• blade: a toolkit-independent UI generator
• opal: web based UI and application hosting
  – pyre based cgi scripts
  – auto-generation of html/javascript
  – ajax support in progress (jQuery)
Distributed computing
• gsl:
  – a package that completely encapsulates the middleware
  – provides both user space and grid-enabled solutions
• User space:
  – ssh, scp
  – pyre service factories and component management
• Web services
  – full pyre/opal support for “science gateways”
  – pyGridWare from Keith Jackson’s group
• Advanced features
  – dynamic discovery for optimized deployment
  – reservation system for computational resources
Pyre components
• Component based solutions are ideal for complex systems
  – encourage the decomposition of the problem into manageable functional units
  – expose the interaction mechanisms between these units
  – enable the nearly independent evolution of the parts
• Component frameworks enable an incremental and evolutionary approach
  – existing codes can start producing results immediately
  – new services can be incorporated incrementally
[Figure: component diagram with labels input ports, output ports, properties, component core, name, control]
Component anatomy
• Core: encapsulation of computational engines
  – middleware that manages the interaction between the framework and codes written in low level languages
• Harness: an intermediary between a component’s core and the external world
  – framework services:
    • control
    • port deployment
  – core services:
    • deployment
    • launching
    • teardown
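
The split between harness and core might look roughly like the sketch below. It is plain Python 3, not the pyre API: the Component class, its attributes and the scale_core function are all hypothetical names chosen to mirror the terms on the slide (name, properties, input ports, output ports, core).

```python
class Component:
    """Hypothetical harness around a core; not the pyre API."""

    def __init__(self, name, core, **properties):
        self.name = name
        self.core = core              # the wrapped computational engine
        self.properties = properties  # user-configurable settings
        self.inputs = {}              # input ports, filled by the framework
        self.outputs = {}             # output ports, read by the framework

    def run(self):
        # The harness hands port data and properties to the core,
        # then publishes whatever the core returns on the output ports.
        self.outputs = self.core(self.inputs, **self.properties)
        return self.outputs


def scale_core(inputs, factor=1.0):
    """Stand-in for an engine written in a low level language."""
    return {"scaled": [factor * x for x in inputs["samples"]]}


scaler = Component("scaler", scale_core, factor=2.0)
scaler.inputs["samples"] = [1.0, 2.0, 3.0]
result = scaler.run()
```

The core never sees the framework; it only receives data and settings, which is what lets the two evolve separately.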
Component cores
• Three tier encapsulation of access to computational engines
  – engine
  – bindings
  – facility implementation by extending abstract framework services
• Cores enable the lowest integration level available
  – suitable for integrating large codes that interact with one another by exchanging complex data structures
  – UI: text editor
[Figure: core diagram with labels public interface, bindings, computational engine, core]
Application archiving
• Produce a fully repeatable execution by recording
  – scripts
  – user choices
  – sources (cvs/svn tags or even the files themselves)
  – build procedure
  – required third party libraries
  – version of as many runtime components as can be determined
  – generated data sets (urls, actual files)
• Implementation
  – meta-data in PostgreSQL
  – HDF5
  – embed XML meta-data
    • parsed for deducing the layout of the file as format evolves
    • can be extracted for easy indexing
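
A rough idea of what such a record could contain is sketched below. This is not the pyre implementation: JSON stands in for the PostgreSQL, HDF5 and XML storage named on the slide, and the function run_record, its field names and the library entry are hypothetical.

```python
import json
import platform
import sys


def run_record(script, choices, libraries):
    """Collect what is needed to repeat a run into a plain dictionary."""
    return {
        "script": script,
        "choices": dict(choices),
        "python": platform.python_version(),
        "platform": sys.platform,
        "libraries": dict(libraries),
    }


record = run_record(
    script="hello.py",
    choices={"friend": "Michael"},
    libraries={"example-lib": "1.0"},  # hypothetical library and version
)

# Serialize for storage, then read it back as an indexing tool would.
archived = json.dumps(record, sort_keys=True, indent=2)
restored = json.loads(archived)
```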
Services for computational engines
• Normal engine life cycle:
  – deployment
    • staging, instantiation, static initialization, dynamic initialization, resource allocation
  – launching
    • input delivery, execution control, hauling of output
  – teardown
    • resource de-allocation, archiving, execution statistics
• Exceptional events
  – core dumps, resource allocation failures
  – diagnostics: errors, warnings, informational messages
  – monitoring: debugging information, self consistency checks
• Distributed computing
• Parallel processing
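
The life cycle above, including the handling of exceptional events, can be sketched in a few lines. This is plain Python 3 under stated assumptions: EngineRun and its methods are hypothetical names that follow the slide's vocabulary, and a log list stands in for the diagnostics service.

```python
class EngineRun:
    """Hypothetical life cycle driver; the method names follow the slide."""

    def __init__(self, engine):
        self.engine = engine
        self.log = []

    def deploy(self):
        self.log.append("deployment")

    def launch(self, data):
        self.log.append("launching")
        return self.engine(data)

    def teardown(self):
        self.log.append("teardown")

    def execute(self, data):
        self.deploy()
        try:
            return self.launch(data)
        except Exception as error:
            # exceptional events become diagnostics instead of being lost
            self.log.append("error: %s" % error)
            return None
        finally:
            # resources are released whether or not the engine failed
            self.teardown()


good = EngineRun(sum)
total = good.execute([1, 2, 3])

bad = EngineRun(lambda data: 1 / 0)
failed = bad.execute([])
```

Teardown runs in both cases, so a failing engine still releases what deployment allocated.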
HelloApp: hello world
    # access to the base class
    from pyre.application.Application import Application

    class HelloApp(Application):

        def main(self):
            print "Hello world!"
            return

        def __init__(self):
            Application.__init__(self, "hello")
            return

    # main
    if __name__ == "__main__":
        app = HelloApp()
        app.run()

• Output

    > ./hello.py
    Hello world!
Properties
• Named attributes that are under direct user control
  – automatic conversions from strings to all supported types
• Properties have
  – name
  – default value
  – optional validator functions
• Accessible from pyre.inventory
  – factory methods: str, bool, int, float, sequence, dimensional
  – validators: less, greater, range, choice

    import pyre.inventory

    flag = pyre.inventory.bool(name="some-flag", default=True)
    style = pyre.inventory.str(name="my-style", default="boring")
    scale = pyre.inventory.float(
        name="scale", default=1.0,
        validator=pyre.inventory.greater(0))

• You can derive your own property type from pyre.inventory.Property
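
To make the three ingredients concrete (name, default value, optional validator, plus conversion from strings), here is a minimal sketch. It is plain Python 3 and deliberately does not import pyre: the Property class and the greater helper below are hypothetical stand-ins, not the framework's own classes.

```python
class Property:
    """Hypothetical stand-in for a framework property; not pyre.inventory."""

    def __init__(self, name, default, convert, validator=None):
        self.name = name
        self.default = default
        self.convert = convert
        self.validator = validator

    def value(self, text=None):
        """Return the default, or the converted and validated user string."""
        if text is None:
            return self.default
        converted = self.convert(text)
        if self.validator is not None and not self.validator(converted):
            raise ValueError("invalid value for %r: %r" % (self.name, text))
        return converted


def greater(bound):
    """Validator factory: accept values strictly greater than bound."""
    return lambda candidate: candidate > bound


scale = Property("scale", default=1.0, convert=float, validator=greater(0))
default_scale = scale.value()
user_scale = scale.value("2.5")
```

A string such as "-1" converts cleanly to a float but fails the validator, so scale.value("-1") raises ValueError.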
HelloApp: adding properties
    from pyre.application.Application import Application

    class HelloApp(Application):
        …
        class Inventory(Application.Inventory):
            import pyre.inventory
            friend = pyre.inventory.str("friend", default="world")
        …
HelloApp: using properties
• Now you can say hello to your friend…
    from pyre.application.Application import Application

    class HelloApp(Application):
        …
        def main(self):
            print "Hello %s!" % self.inventory.friend
            return

        def __init__(self):
            Application.__init__(self, "hello")
            return

    > ./hello.py --friend="Michael"
    Hello Michael!
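
How a command line setting ends up overriding a declared default can be sketched as follows. This is plain Python 3, not the pyre mechanism: parse_overrides and greeting are hypothetical helpers, and the sketch assumes the option name matches the property name.

```python
def parse_overrides(argv):
    """Turn arguments of the form --name=value into a dictionary."""
    overrides = {}
    for argument in argv:
        if argument.startswith("--") and "=" in argument:
            name, value = argument[2:].split("=", 1)
            overrides[name] = value
    return overrides


def greeting(argv, defaults):
    """Apply command line overrides on top of the declared defaults."""
    settings = dict(defaults)
    settings.update(parse_overrides(argv))
    return "Hello %s!" % settings["friend"]


defaults = {"friend": "world"}
plain = greeting([], defaults)
custom = greeting(["--friend=Michael"], defaults)
```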
Units
• Properties can have units:
  – the framework provides the type dimensional
• Support for units is in pyre.units
  – all SI base and derived units
  – most common abbreviations and alternative unit systems
  – correct handling of all arithmetic operations
    • addition, multiplication, functions from math
  – parsing expressions from the command line

    import pyre.inventory
    from pyre.units.time import s, hour
    from pyre.units.length import m, km, mile

    speed = pyre.inventory.dimensional(
        name="speed", default=50*mile/hour)
    v = pyre.inventory.dimensional(
        name="velocity", default=(0.0*m/s, 0.0*m/s, 10*km/s))
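
What "correct handling of arithmetic" means for dimensional values can be shown with a toy model. This is plain Python 3 and not pyre.units: the Quantity class is hypothetical and tracks only length and time exponents, with values stored in metres and seconds.

```python
class Quantity:
    """Hypothetical dimensional value: a number plus (length, time) exponents."""

    def __init__(self, value, dims=(0, 0)):
        self.value = value
        self.dims = dims

    def __mul__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a + b for a, b in zip(self.dims, other.dims))
            return Quantity(self.value * other.value, dims)
        return Quantity(self.value * other, self.dims)

    __rmul__ = __mul__

    def __truediv__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a - b for a, b in zip(self.dims, other.dims))
            return Quantity(self.value / other.value, dims)
        return Quantity(self.value / other, self.dims)

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different units")
        return Quantity(self.value + other.value, self.dims)


# values are stored in SI units: metres and seconds
m = Quantity(1.0, (1, 0))
s = Quantity(1.0, (0, 1))
km = 1000 * m
hour = 3600 * s
mile = 1609.344 * m

speed = 50 * mile / hour
```

Multiplication and division combine the exponents, while addition refuses operands whose exponents differ, so m + s raises TypeError.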
Parallel HelloApp
    from mpi.Application import Application

    class HelloApp(Application):

        def main(self):
            import mpi
            world = mpi.world()
            print "[%03d/%03d] Hello world" % (world.rank, world.size)
            return

        def __init__(self):
            Application.__init__(self, "hello")
            return

    # main
    if __name__ == "__main__":
        app = HelloApp()
        app.run()
Facilities and components
• A design pattern that enables the assembly of application components at run time under user control
• Facilities are named abstract application requirements
• Components are concrete named engines that satisfy the requirements
• Dynamic control:
  – the application script author provides
    • a specification of application facilities as part of the Application definition
    • a component to be used as the default
  – the user can construct scripts that create alternative components that comply with the facility interface
  – the end user can
    • configure the properties of the component
    • select which component is to be bound to a given facility at runtime
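
The pattern can be sketched without the framework. In the plain Python 3 example below, every name is hypothetical: HelloApp declares one facility called greeter with a default component, and the caller may bind a different component by name when the application is created.

```python
class Formal:
    def greet(self, friend):
        return "Good day, %s." % friend


class Casual:
    def greet(self, friend):
        return "Hi %s!" % friend


# components known to the application, keyed by the name a user would type
COMPONENTS = {"formal": Formal, "casual": Casual}


class HelloApp:
    # facility name -> name of the default component
    facilities = {"greeter": "casual"}

    def __init__(self, **bindings):
        # bind each facility to the user's choice, or to the default
        for facility, default in self.facilities.items():
            choice = bindings.get(facility, default)
            setattr(self, facility, COMPONENTS[choice]())

    def main(self, friend):
        return self.greeter.greet(friend)


default_app = HelloApp()
formal_app = HelloApp(greeter="formal")
```

The application code in main only knows the facility name; which component answers is decided when the bindings are resolved.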
Inversion of control
• A feature of component frameworks
  – applications require facilities and invoke the services they promise
  – component instances that satisfy these requirements are injected at the latest possible time
• The pyre solution to this problem
  – eliminates the complexity by using "service locators"
  – takes advantage of the dynamic programming possible in python
  – treats components and their initialization state fully symmetrically
  – provides simple but acceptable persistence (performance, scalability)
    • XML files, python scripts
    • databases (PostgreSQL, MySQL)
  – can easily take advantage of other object stores
  – is ideally suited for both parallel and distributed applications
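
A service locator with late instantiation can be as small as the sketch below. It is plain Python 3 and only illustrates the general idea, not how pyre implements it; the ServiceLocator class and the "storage" facility are hypothetical, and dict and list stand in for real components.

```python
class ServiceLocator:
    """Hypothetical registry that maps facility names to component factories."""

    def __init__(self):
        self._factories = {}
        self._instances = {}

    def register(self, facility, factory):
        self._factories[facility] = factory
        self._instances.pop(facility, None)  # forget any stale instance

    def locate(self, facility):
        # instantiate on first use, so the binding can change until then
        if facility not in self._instances:
            self._instances[facility] = self._factories[facility]()
        return self._instances[facility]


locator = ServiceLocator()
locator.register("storage", dict)  # default component
locator.register("storage", list)  # rebound before first use
storage = locator.locate("storage")
```

Since nothing is created until locate is called, the last registration before first use wins, which is one way to read "injected at the latest possible time".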
Are we there yet?
Cost/benefit … rationalizations
• Drawbacks
  – some reengineering required
  – paradigm shift
  – learning curve
  – not helped by the (current) lack of documentation…
• Benefits
  – clear path forward for “legacy” applications
  – easy, normalized access to a large number of facilities
  – structured way for enabling engines in modern computational environments
  – rigorous separation of UI from computational engines
  – easy re-hosting of compliant applications
Forecast
• Some things change
  – languages, platforms, networks
  – algorithms, tools, processes
  – …
• Some things don’t
  – people: your colleagues, your customers, your boss, his boss…
  – money/time always short
  – there will always be bugs
  – users never know what they want
    • but they know how you should implement it…
  – security always annoying but necessary
  – collaboration is difficult
    • see items above
• Design for “constrained change”