SGW-AUS-01-28-10 - TeraGrid Forum

Gateway Update for AUS

Nancy Wilkins-Diehr

TeraGrid Area Director for Science Gateways

wilkinsn@sdsc.edu

AUS telecon, January 28, 2010

How did the Gateway program develop?

A natural result of the impact of the internet on worldwide
communication and information retrieval





Implications on the conduct of science are still evolving


1980’s, Early gateways, National Center for Biotechnology Information BLAST
server, search results sent by email, still a working portal today


1989 World Wide Web developed at CERN


1992 Mosaic web browser developed


1995 “International Protein Data Bank Enhanced by Computer Browser”


2004 TeraGrid project director Rick Stevens recognized growth in scientific
portal development and proposed the Science Gateway Program


Today, Web 3.0 and programmatic exchange of data between web pages


Simultaneous explosion of digital information


Growing analysis needs in many, many scientific areas


Sensors, telescopes, satellites, digital images and video


#1 machine on Top500 today over 1000x more powerful than all combined entries on the first list in 1993




Only 18 years since the release of Mosaic!

Why are gateways worth the effort?


Increasing range of
expertise needed to tackle
the most challenging
scientific problems


How many details do you want
each individual scientist to need
to know?


PBS, RSL, Condor


Coupling multi-scale codes


Assembling data from multiple
sources


Collaboration frameworks



#!/bin/sh
#PBS -q dque
#PBS -l nodes=1:ppn=2
#PBS -l walltime=00:02:00
#PBS -o pbs.out
#PBS -e pbs.err
#PBS -V
cd /users/wilkinsn/tutorial/exercise_3
../bin/mcell nmj_recon.main.mdl

+(
  &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
   (executable="/users/birnbaum/tutorial/bin/mcell")
   (arguments=nmj_recon.main.mdl)
   (count=128)
   (hostCount=10)
   (maxtime=2)
   (directory="/users/birnbaum/tutorial/exercise_3")
   (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
   (stderr="/users/birnbaum/tutorial/exercise_3/globus.err")
)

=======

# Full path to executable
executable=/users/wilkinsn/tutorial/bin/mcell

# Working directory, where Condor-G will write
# its output and error files on the local machine.
initialdir=/users/wilkinsn/tutorial/exercise_3

# To set the working directory of the remote job, we
# specify it in this globus RSL, which will be appended
# to the RSL that Condor-G generates
globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')

# Arguments to pass to executable.
arguments=nmj_recon.main.mdl

# Condor-G can stage the executable, but staging is disabled here
transfer_executable=false

# Specify the globus resource to execute the job
globusscheduler=tg-login1.sdsc.teragrid.org/jobmanager-pbs

# Condor has multiple universes, but Condor-G always uses globus
universe=globus

# Files to receive stdout and stderr.
output=condor.out
error=condor.err

# Specify the number of copies of the job to submit to the condor queue.
queue 1
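
The PBS script, Globus RSL, and Condor-G submit file above all describe the same MCell run, which is the kind of detail a gateway can hide from the scientist. As a minimal sketch (the function name `render_pbs_script` and its defaults are hypothetical, not part of any TeraGrid tool), a gateway back end might fill in a PBS script from a handful of user-facing inputs:

```python
def render_pbs_script(workdir, executable, arguments,
                      nodes=1, ppn=2, walltime="00:02:00", queue="dque"):
    """Return the text of a PBS batch script for one job (hypothetical helper)."""
    lines = [
        "#!/bin/sh",
        f"#PBS -q {queue}",                  # queue name
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # wall-clock limit
        "#PBS -o pbs.out",                   # stdout file
        "#PBS -e pbs.err",                   # stderr file
        "#PBS -V",                           # export the submitting environment
        f"cd {workdir}",
        f"{executable} {arguments}",
    ]
    return "\n".join(lines) + "\n"


# Values copied from the tutorial job shown in the slides.
script = render_pbs_script(
    workdir="/users/wilkinsn/tutorial/exercise_3",
    executable="../bin/mcell",
    arguments="nmj_recon.main.mdl",
)
print(script)
```

The scientist supplies only the input file and perhaps a size; a real gateway would also validate inputs and pick queues and limits per resource.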

Gateways democratize access to high
end resources


Almost anyone can investigate scientific questions using high
end resources


Not just those in the research groups of those who request allocations


Gateways allow anyone with a web browser to explore


Opportunities can be uncovered via Google

At 11, my son discovered nanoHUB.org while his class was studying buckyballs in science class

Foster new ideas, cross-disciplinary approaches

Encourage students to experiment

Multi-disciplinary computational linguistics course at U Chicago uses Social Informatics DataGrid (SIDGrid) gateway


But used in production too


Significant number of papers resulting from gateways including
GridChem, nanoHUB


Scientists can focus on challenging science problems rather than
challenging infrastructure problems


Today, there are approximately 35
gateways using the TeraGrid


Not just ease of use

What can scientists do that they
couldn’t do previously?


Linked Environments for Atmospheric Discovery (LEAD): access to radar data

National Virtual Observatory (NVO): access to sky surveys

Ocean Observatories Initiative (OOI): access to sensor data

PolarGrid: access to polar ice sheet data

SIDGrid: expensive datasets, analysis tools

GridChem: coupling multiscale codes



How would this have been done before gateways?


What makes a gateway a TeraGrid gateway?

TeraGrid gateways use TeraGrid resources

Are they all developed by TeraGrid?

No, we don’t make the gateways you use, we make the gateways you use better


The strength of the program lies in the development of end user
interfaces by those in the community


TeraGrid does provide staff to assist with gateway use of the
resources


Anyone can request support via the same peer review process used to
request CPU hours or a data allocation


Works just like AUS


Staff assigned to incoming projects


Gateway-AUS crossover


Several projects requesting multiple types of support today


GridChem


Ultrascan


Request for code porting/compiling, performance, parallelization, but also
database and web server hosting and improved fault tolerance for grid
software


Please let me know if any of the researchers you work with
have gateway needs


We can evaluate needs and assign staff with the right expertise to help


We’ll do the same if we see any requests for porting, scaling,
optimization support


Some history behind gateway
allocations



Individual and community allocations written into policy in
2002!


Dick Crutcher, John Towns, Phil Andrews, Nancy author white paper


Today, the xRAC accepts and reviews proposals in four
general categories


Individual investigators


Large research collaborations (e.g., MILC consortium)


Community Projects (e.g., NEES)


Community Services (e.g., ROBETTA)


The general requirements for proposals of all four types
remain largely the same.


I. Research Objectives


Traditional proposals


Describe the research activities to be pursued


Keep it short: You only need enough detail to support the methods and
computational plan being proposed.


Community proposals


Describe the classes of research activities that the proposed effort will support.


Keep it short, but provide enough detail to support the rest of the
proposal


TIP: Reviewers don’t want to read the proposal you
submitted to NSF/NIH/etc, but they will notice whether you
have funding to pursue these activities.


II. Codes and Methods


Very similar between traditional and community proposals.


More significant if using ‘home-grown’ codes.


If using widely known third-party codes (e.g., NAMD, CHARMM, AMBER), you can cut some corners here, although you should explain why you chose this code over alternatives.


Provide performance and scaling data on problems and test
cases similar to those you plan to pursue.


Or that you expect the community to pursue.


Describe why this code is a good fit for the resource(s) requested
and/or list acceptable alternatives.


Ideally, provide performance and scaling data collected by
you for the specific resource(s) you are requesting


This gives reviewers additional confidence that you know what you’re
doing.




III. Computational Plan


Traditional proposals


Explicitly describe the problem cases you will examine


BAD: “…a dozen or so important proteins under various conditions…”


GOOD: “…7 proteins [listed here; include scientific importance of these
selections somewhere, too]. Each protein will require [X] number of runs,
varying 3 parameters [listed here] [in very specific and scientifically
meaningful ways]…”


Community proposals


Explicitly describe the typical use-case(s) that the gateway supports and the type of runs that you expect individual users to make


Describe how you will help ensure that the community will make
scientifically meaningful runs (if applicable)


BAD: “…the gateway lets users run NAMD on TeraGrid resources…”


BETTER: “…we expect most users to run NAMD jobs on [systems like this]…”


BETTER STILL: “…the gateway allows users to run NAMD jobs on up to 128
processors on problem sizes limited [in some fashion]…”


IV. Justification of SUs (Traditional)


Traditional proposals


If you’ve done sections II and III well, this section should be a
straightforward math problem


For each research problem, calculate the SUs required based on runs
defined in III and the timings in section II, broken out appropriately by
resource


Reasonable scaling estimates from test-case timing runs to full-scale production runs are acceptable.


Clear presentation here will allow reviewers to cut time in a rational
fashion
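
The "straightforward math problem" above can be sketched as a small tabulation. This is only an illustration, not an xRAC-prescribed method: the problem names, resource names, run counts, core counts, and timings are made-up placeholders, and it assumes 1 SU equals one core-hour, which differs between resources.

```python
from collections import defaultdict

# Hypothetical plan: (problem, resource, number_of_runs, cores_per_run, hours_per_run).
# In a real proposal the runs come from section III and the timings from section II.
planned_runs = [
    ("protein_A", "resource_X", 12, 128, 6.0),
    ("protein_B", "resource_X", 12, 256, 4.5),
    ("protein_C", "resource_Y", 8, 64, 10.0),
]


def su_by_resource(runs):
    """Sum the requested SUs per resource, assuming 1 SU = 1 core-hour."""
    totals = defaultdict(float)
    for _problem, resource, n_runs, cores, hours in runs:
        totals[resource] += n_runs * cores * hours
    return dict(totals)


totals = su_by_resource(planned_runs)
for resource, sus in sorted(totals.items()):
    print(f"{resource}: {sus:,.0f} SUs")
```

Keeping one row per problem case makes it easy for a reviewer to see which runs a reduced award would drop.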


IV. Justification of SUs (Community)


Community proposals


The first big trick: Calculating SUs when you don’t know the precise runs to be made a priori.


In Year 2 and beyond


Start with an estimate of total usage based on prior year’s usage patterns
and estimate for coming year’s usage patterns (justify in Section V).


From this information, along with data from sections II and III, you can
come up with a tabulation of SU estimates.


Year 1 requires bootstrapping


Pick conservative values (and justify them) for the size of the community
and runs to be made, and calculate SUs.


TIP: Start modestly. If you have ~0 users, don’t expect the xRAC to believe
that you will get thousands (or even hundreds) in the next year.
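
A rough sketch of the two cases above, bootstrapping in Year 1 and projecting from prior usage in later years. All numbers are placeholders chosen for illustration, the function names are hypothetical, and 1 SU is assumed to equal one core-hour.

```python
def estimate_year1_sus(n_users, runs_per_user, cores_per_run, hours_per_run):
    """Year 1 bootstrap: users x runs x cores x hours (1 SU = 1 core-hour assumed)."""
    return n_users * runs_per_user * cores_per_run * hours_per_run


def project_next_year(prior_year_sus, growth_factor):
    """Year 2 and beyond: scale last year's measured usage by a justified growth factor."""
    return prior_year_sus * growth_factor


# Conservative placeholder values; each one needs its own justification in the proposal.
year1 = estimate_year1_sus(n_users=20, runs_per_user=10, cores_per_run=128, hours_per_run=2.0)
year2 = project_next_year(year1, growth_factor=1.5)

print(f"Year 1 request: {year1:,.0f} SUs")
print(f"Year 2 request: {year2:,.0f} SUs")
```

In practice the Year 2 input would be the usage actually recorded in the prior award period rather than the Year 1 estimate.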



V. Add’l Considerations (Community)


For community proposals, these components can provide
key details:


Community Support and Management Plan


Instead of staff/experience


You may want to include a brief description of the gateway interface, the fact that it has been used for production work, and relevant development effort, all in terms of how it helps the community burn SUs.


If you have a plan for growing the user community, for “graduating” users
from the gateway to their own MRAC awards, it would be good to mention.
If you somehow regulate “gateway hogs,” describe that.


Progress report: Provide details of the actual user community and usage patterns seen in the prior award period.


List manuscripts published, accepted, submitted or in preparation, thanks to
this service. Helps convince xRAC that SUs haven’t gone down a black hole.


Local computing environment, Other HPC resources: Same as for traditional proposals.


3 steps to connect a gateway to TeraGrid


Request an allocation

Only a 1 paragraph abstract required for up to 200k CPU hours


Register your gateway

Visibility on public TeraGrid page


Request a community account

Run jobs for others via your portal


Staff support is available!


www.teragrid.org/gateways


Tremendous Opportunities Using the Largest Shared Resources - Challenges too!


What’s different when the resource doesn’t belong just to me?


Peer-reviewed requests for resources


Resource discovery, fault tolerance


Accounting


Must keep track of who’s used what on the gateway


Attribute-based authentication for TeraGrid jobs


Security


Software registry for TeraGrid


Tremendous benefits at the high end, but even more work
for the developers


Potential impact on science is huge


Small number of developers can impact thousands of scientists


But need a way to train and fund those developers
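
The accounting challenge above arises because gateway jobs run under a shared community account, so the resource provider sees one account while the gateway has to record the real end user. A minimal sketch of that bookkeeping follows; the record fields, user names, and resource names are hypothetical, and 1 SU is assumed to equal one core-hour.

```python
from collections import defaultdict

# Hypothetical job log kept by the gateway itself. Every job was submitted
# under the same community account, so "gateway_user" is the only per-person key.
job_log = [
    {"gateway_user": "alice", "resource": "resource_X", "cores": 64, "hours": 2.0},
    {"gateway_user": "bob", "resource": "resource_X", "cores": 128, "hours": 1.5},
    {"gateway_user": "alice", "resource": "resource_Y", "cores": 32, "hours": 4.0},
]


def usage_by_user(jobs):
    """Total SUs per gateway user, assuming 1 SU = 1 core-hour."""
    totals = defaultdict(float)
    for job in jobs:
        totals[job["gateway_user"]] += job["cores"] * job["hours"]
    return dict(totals)


usage = usage_by_user(job_log)
heaviest_user = max(usage, key=usage.get)  # candidate for a per-user limit
print(usage, heaviest_user)
```

A tally like this also supplies the user-community details a progress report asks for.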


What are we working on now?


Arroyo


Adaptive optics corrections for
telescopes


GridChem


Ultrascan


Analysis of ultracentrifugation
experiments


Earth System Grid - Community Climate System Model


GISolve


SimpleGrid


Social Informatics DataGrid


Open Life Sciences Gateway


Pyrosequencing


RENCI Bioportal


Asteroseismology Modeling Portal


Uintah


Gateway software listing


Investigate use of TG for overflow OSG jobs


RENCI Bioportal, nanoHUB, using both resources


Common treatment of community accounts


Attribute-based authentication




From the Condor flock to Kraken

From simple interfaces to complex video analysis

Diverse goals keep program interesting


AMP gateway, http://amp.ucar.edu


Derive the properties of Sun-like stars from observations of their pulsation frequencies


Kepler mission will use asteroseismology to determine precise absolute sizes of the potentially habitable Earth-like planets


Simple interface, few input parameters, very large CPU consumption on Kraken, database kept so simulations aren’t re-run


Robetta gateway, http://www.robetta.org


Protein structure prediction


Provides access to Dr. David Baker’s award-winning Rosetta code


2M hours on Purdue’s Condor pool; using this time very successfully, recently reduced the calculation backlog to zero by running 300 jobs simultaneously


Social Informatics DataGrid, https://sidgrid.ci.uchicago.edu


Slide later


Gateway activities in PY6 aka the
extension


Helpdesk support expanded


From .2 FTE in PY5 to 1.7 in Extension [NCSA, Purdue]


Helpdesk and Condor support, new GIS communities, SimpleGrid extensions


Accounting


Improved views for gateways now that we have attributes [TACC]


Community accounts


Continued work toward improved standardization [NICS]


Prebuilt VMs with gateway software


OGCE, SimpleGrid [IU, NCSA]


Online tutorials with CI Tutor and the EOT team


OGCE, SimpleGrid [IU, NCSA]


More example-based documentation


Less talk, more action, short videos, based on user feedback [NCSA, SDSC]


What else is exciting?


SGW funding a Cyber-GIS workshop in conjunction with the UCGIS meeting in DC in February


Co-led by Shaowen Wang at NCSA and Nancy


Approved by UCGIS board after a lengthy voting process


Winter UCGIS meeting will focus on CI and includes a White House
briefing


Workshop attendees welcome to attend the briefing


Briefing will include very short recap of the workshop


Expected outcome of the workshop


Increased awareness of CI resources for GIS researchers


Increased visibility for TeraGrid


New partnerships for TeraGrid, UCGIS, and other pertinent organizations


Workshop report


Interesting collaborative proposal ideas


Potential future publications


A few gateways in detail



SCEC Gateway used to produce realistic
hazard map


Probabilistic Seismic Hazard
Analysis (PSHA) map for California


Created from Earthquake Rupture Forecasts (ERF)


~7000 ruptures can have 415,000
variations


Warm colors indicate regions with
a high probability of experiencing
strong ground motion in the next
50 years


Ground motion calculated using full 3-D waveform modeling for improved accuracy


Results in significant CPU use





SCEC: Why a gateway?


Calculations need to be done for each of the hundreds of
thousands of rupture variations


SCEC has developed the “CyberShake computational platform”


Hardware, software and people which combine to produce a useful scientific
result


For each site of interest: two large-scale MPI calculations and hundreds of thousands of independent post-processing jobs with significant data generation

» Jobs aggregated to appear as a single job to the TeraGrid

» Workflow throughput optimizations and use of SCEC’s gateway “platform” reduced time to solution by a factor of three


Computationally intensive tasks, plus the priority on reduced time to solution, make TeraGrid a good fit


Source: S. Callahan et al., “Reducing Time-to-Solution Using Distributed High-Throughput Mega-Workflows - Experiences from SCEC CyberShake”.
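
The idea of aggregating many small post-processing jobs so they appear as a single job can be illustrated with a simple bundling step. This is a generic sketch, not the CyberShake workflow tooling; the function name, job names, and bundle size are hypothetical.

```python
def bundle_jobs(job_ids, bundle_size):
    """Group small independent jobs into bundles; each bundle is submitted as one job."""
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    return [job_ids[i:i + bundle_size] for i in range(0, len(job_ids), bundle_size)]


# Ten placeholder post-processing jobs, bundled four at a time.
post_processing_jobs = [f"rupture_variation_{n}" for n in range(10)]
bundles = bundle_jobs(post_processing_jobs, bundle_size=4)

for index, bundle in enumerate(bundles):
    print(f"bundle {index}: {len(bundle)} jobs")
```

With hundreds of thousands of jobs, a larger bundle size means far fewer entries in the batch queue, at the cost of coarser-grained retries when a bundle fails.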



GridChem


Understanding molecular structure and function increasingly important in
many fields


Materials for electronics, biotechnology, medical devices, pharmaceutical
design


GridChem provides reliable infrastructure for computational chemists


NSF Middleware Initiative (NMI) project


Requested and received advanced support from TeraGrid


Addressing issues which benefit all gateways, support team led by IU


Common user environments for domain software access


Standardized licensing


Application performance characteristics


Incorporation of additional data handling tools and data resources


Fault tolerant workflows


Scheduling policies for community users


Remote visualization



GridChem: Why a gateway?


Integrates high end resources in a
desktop environment


Client-server approach allows work to continue while disconnected (plane flights)


Ability to monitor jobs across sites


Access to individual allocations


In the future, linkage of multi-scale packages


Focus on chemistry research
without learning the intricacies of
each system


Time limits, nodes, processors,
memory, disk space, etc



Robetta Gateway

Protein structure prediction with an award-winning code


Protein structure prediction is among many important
problems in bioinformatics.


The Rosetta code, from the David Baker laboratory, has
performed very well at CASP (Critical Assessment of
Techniques for Protein Structure Prediction) competitions


Available for use by any academic scientist via the Robetta server


Robetta developers able to use TeraGrid’s existing gateway
infrastructure, including community accounts and Globus


This very successful group needed no additional TeraGrid assistance to
incorporate TeraGrid resources into the Robetta gateway


Google scholar reports 601 references to the Robetta
gateway, including many PubMed publications



Robetta: Why a gateway?


Bioinformatics has long history of web-based services


NCBI Blast server from the
1990s


Easy input from the
web


Access to top modeling
code for all researchers


Social Informatics Data Grid

Collaborative access to large, complex datasets


SIDGrid is unique among
social science data archive
projects


Focused on streaming data
which change over time


Voice, video, images (e.g. fMRI),
text, numerical (e.g. heartrate,
eye movement)


Provides the ability to
investigate multiple datasets,
collected at different time
scales, simultaneously


Large datasets result


Sophisticated analysis tools


http://www.ci.uchicago.edu/research/files/sidgrid.mov

SIDGrid: Why a gateway?


Social scientists have traditionally
worked in isolated labs without the
capability to share data or insights
with others.


Data that is expensive to collect can
now be shared with others


Geographically distant researchers can
collaborate


Complex analysis tools and workflows
available for all


Researchers have access to high
performance computational resources


TeraGrid used for computationally intensive tasks such as media transcoding, algorithms for pitch analysis of audio tracks, and fMRI image analysis


Source: Dr. Steven Boker, Notre Dame

Viewing multimodal data like a
symphony conductor


“Music-score” display and synchronized playback of video and audio files


Pitch tracks


Text


Head nods, pause, gesture
references


Central archive of multi-modal data, annotations, and analyses


Distributed annotation efforts by
multiple researchers working on a
common data set


History of updates


Computational tools


Distributed acoustic analysis using
Praat


Statistical analysis using R


Matrix computations using Matlab
and Octave


Source: Studying Discourse and Dialog with SIDGrid, Levow, 2008

Uintah


Product of Utah's DOE ASC center, C-SAFE


Component based framework for solving PDEs on structured
AMR grids.


Computations are expressed as tasks based on inputs and
outputs for each patch in the structured grid.


Tasks are organized in a task graph and assigned processing
resources by the built in scheduler.


Load balancing is achieved by a fast space filling curve
algorithm for the patches.


Primary components are CFD (Arches and ICE), Solid Mechanics (MPM) and Fluid-Structures (MPM-ICE)


www.uintah.utah.edu


Source: John Schmidt, U Utah
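
The task-graph idea on the Uintah slide, where tasks are described by their inputs and outputs and then ordered by a scheduler, can be illustrated generically. The sketch below is not Uintah's API or scheduler: the task and variable names are invented, and ordering is done with Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical tasks for one patch: each lists the variables it requires
# and the variables it computes.
tasks = {
    "interpolate": {"requires": {"particles"}, "computes": {"grid_mass"}},
    "compute_forces": {"requires": {"grid_mass"}, "computes": {"grid_force"}},
    "integrate": {"requires": {"grid_force", "grid_mass"}, "computes": {"grid_velocity"}},
    "update_particles": {"requires": {"grid_velocity", "particles"}, "computes": {"new_particles"}},
}


def build_task_graph(task_table):
    """Map each task to the set of tasks that produce its required variables."""
    producer = {var: name for name, task in task_table.items() for var in task["computes"]}
    return {
        name: {producer[var] for var in task["requires"] if var in producer}
        for name, task in task_table.items()
    }


graph = build_task_graph(tasks)
# static_order() yields every task after all of the tasks it depends on.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

Variables with no producer, such as `particles` here, are treated as external inputs. A production scheduler would additionally assign patches to processors and overlap communication with computation.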

Uintah CFD Components


Industrial Flare
Simulation using Arches
Component


Prediction of flame shape
and tilt using LES


Prediction of pollutant
emissions


Source: John Schmidt, U Utah

Fluid Structure Interaction


Microscale Fluid Structure Interaction using MPM-ICE Component


Array of pins undergoing
deformation which
influences heat transfer


Potential application in
CPU cooling


Source: John Schmidt, U Utah

Solid Mechanics Simulation


Shaped charge detonating and forming a jet of particles that penetrates a steel target, simulated using the MPM solid mechanics component.



Source: John Schmidt, U Utah

Uintah Science Gateway


Manage the end to end job submission and data
management for TeraGrid resources.


Uses the Django framework and PostgreSQL for the front end, which calls Globus scripts to interact with the back-end TeraGrid machines. Strictly web based.


Target new and existing users.


New users will take advantage of web front end for quickly
getting up to speed on TeraGrid resources.


Provide existing users with various data management and
work flow features.


Source: John Schmidt, U Utah

Scaling on Kraken


Uintah scales to very
large processor core
counts


Scaling for an AMR MPM-ICE (Fluid Structure Interaction) problem demonstrating both fixed and increasing problem size


Source: John Schmidt, U Utah

Scaling on Kraken


An AMR CFD example for
the ICE component for
large processor core
counts


Scales fairly well on
Kraken both for fixed and
increasing problem size


Source: John Schmidt, U Utah

Future Technical Areas


Web technologies change fast


Must be able to adapt quickly


Gateways and gadgets


Gateway components incorporated
into any social networking page


75% of 18- to 24-year-olds have social networking websites


iPhone apps?


Web 3.0


Beyond social networking and
sharing content


Standards and querying interfaces
to programmatically share data
across sites


Resource Description Framework (RDF),
SPARQL


Gateways can further investments in
other projects


Increase access


To instruments, expensive data collections


Increase capabilities


To analyze data


Improve workforce development


Can prepare students to function in today’s cross-disciplinary world


Increase outreach


Increase public awareness


Public sees value in investments in large facilities


Pew 2006 study indicates that half of all internet users have been to a
site specializing in science


Those who seek out science information on the internet are more likely
to believe that scientific pursuits have a positive impact on society


Sustainability is the key though


Scientists won’t tie their research to a tool that they aren’t convinced has a long life


But not all projects can be funded for the long term


Nancy currently leading small 2-year EAGER study to look at the characteristics of gateways that warrant sustained funding


Working with Katherine Lawrence, U Michigan School of Information


Tremendous Potential for Gateways


In only 18 years, the Web has fundamentally changed
human communication


Science Gateways can leverage this amazingly powerful tool
to:


Transform the way scientists collaborate


Streamline conduct of science


Influence the public’s perception of science


Reliability, trust, continuity are fundamental to truly change
the conduct of science through the use of gateways


High end resources can have a profound impact


The future is very exciting!


Please let us know if you see gateway interest
from researchers you work with.


Thanks for the opportunity to present.




Nancy Wilkins-Diehr, wilkinsn@sdsc.edu

www.teragrid.org