SAN DIEGO SUPERCOMPUTER CENTER

The Integration of 2 Science Gateways: CyberGIS + OpenTopography

Choonhan Youn, Nancy Wilkins-Diehr, SDSC
Christopher Crosby, UNAVCO (formerly SDSC)
Anand Padmanabhan, Myunghwa Hwang, Yan Liu, Shaowen Wang, University of Illinois at Urbana-Champaign

What are Science Gateways?

Community-designed applications, often Web-based, used to conduct science
Commonly known as web portals
The term "gateways" was coined in 2003 in the TeraGrid program
Many examples in many fields
  CyberGIS
  Protein Data Bank
  nanoHUB

A natural result of the impact of the Internet on worldwide communication and information retrieval

Implications for the conduct of science are still evolving
  1980s: early gateways such as the National Center for Biotechnology Information BLAST server, with search results sent by email; still a working portal today
  1989: World Wide Web developed at CERN
  1992: Mosaic web browser developed
  1995: “International Protein Data Bank Enhanced by Computer Browser”
  2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program
  Today: Web 3.0 and programmatic exchange of data between web pages
Simultaneous explosion of digital information
  Growing analysis needs in many, many scientific areas
  Sensors, telescopes, satellites, digital images, video, genome sequencers
  The #1 machine on the Top500 today is over 10,000x more powerful than all entries on the first list in 1993 combined
Only 20 years since the release of Mosaic!

vt100 in the 1980s and a login window on Ranger today

Why gateways?

Increasing utility of the Web for science
And increased need to deal with big data
  From sensors, instruments (telescopes, genome sequencers), supercomputers
Community-designed interfaces directly address community needs
Complex tasks best not re-addressed by every scientist
  Coupling multi-scale codes
  Keeping large numbers of bioinformatics programs up to date
  Managing thousands of ensemble jobs
Democratized access to supercomputers
  Anyone, regardless of location, can have access to top-quality resources
Scalable support: questions on gateway use go to gateway developers

Gateways on NSF’s front page

7/16/12

Today, there are approximately 35 gateways using XSEDE

The Problem

Coupling of two independent geospatial software environments
  OpenTopography (OT)
  CyberGIS Gateway
Demonstrate this coupling in action, driven by an application
  The viewshed application on the CyberGIS Gateway is a good candidate
  It consumes high-resolution Digital Elevation Model (DEM) data
Disconnect between data-driven and analytics-driven gateways
  Seamless fusion of large spatial data and upscale analytics tools without losing usability
Abstract away the technical complexity of software integration

Goals

Improve usability
  Data need to be easily available to users when CyberGIS analytics is being planned
Seamless access to OpenTopography (OT) data through the CyberGIS gateway
  Access OT data through a common user interface
Service integration and chaining
  Allow gateway users to directly apply viewshed analysis to OT datasets
Reuse existing user interfaces when possible
Benefits both communities

OpenTopography

NSF facility funded by
  Earth Sciences Instrumentation and Facilities
  Office of Cyberinfrastructure
Aims to increase the amount of science-oriented LiDAR data available online
Enhanced Web-based processing capabilities
  With a focus on computationally intensive tasks
Community support
  Software tools, tutorials, short courses, and workshops
www.opentopography.org

OpenTopography Service-Oriented Architecture

[Architecture diagram; labels: OGC Catalogue Interface, CSW Server, Metadata Management Server]

CyberGIS Gateway

Online collaborative geospatial problem-solving environment
Enables easy access to CyberGIS analytics and data sources
Provides transparent access to a rich set of cyberinfrastructure environments
Represents a broad approach to CyberGIS
Widely accessible by the general public

Application Driver - Viewshed Analysis

Given terrain data, viewshed computes visible regions
Well-known spatial analysis method
High-resolution data for improved quality of the analysis
  OT as a data source
Viewshed analysis on the CyberGIS Gateway
  Computation done on the Forge GPU cluster at NCSA and the cloud infrastructure of the CyberInfrastructure and Geospatial Laboratory
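
To make "viewshed computes visible regions" concrete, the following is a minimal sketch of a brute-force line-of-sight test on a small DEM grid. It is a teaching example in plain Python, not the GPU implementation run on the CyberGIS Gateway; the function names, the observer height, and the sample terrain are assumptions made for illustration.

```python
def line_of_sight(dem, observer, target, observer_height=1.8):
    """Return True if the target cell can be seen from the observer cell."""
    (r0, c0), (r1, c1) = observer, target
    eye = dem[r0][c0] + observer_height
    steps = max(abs(r1 - r0), abs(c1 - c0))
    if steps <= 1:
        return True  # the observer cell and its direct neighbours
    dz = dem[r1][c1] - eye
    for i in range(1, steps):
        t = i / steps
        # Nearest-cell sampling along the straight sight line.
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        if dem[r][c] > eye + t * dz:
            return False  # terrain rises above the sight line
    return True


def viewshed(dem, observer, observer_height=1.8):
    """Boolean grid with the same shape as dem: True where visible."""
    return [
        [line_of_sight(dem, observer, (r, c), observer_height)
         for c in range(len(dem[0]))]
        for r in range(len(dem))
    ]


# Flat terrain with a single 9 m obstacle in the middle (made-up data).
dem = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
visible = viewshed(dem, observer=(2, 0))
```

The cost grows with the number of cells times the length of each sight line, which is one reason high-resolution DEMs push this analysis toward GPU clusters.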

Integration Challenges

User interfaces
  Separately developed interfaces need to be bridged
Data discovery
  Capabilities for interactive data discovery needed
Service chaining
  Services are to be integrated to give users the illusion of a single service
Security
  Connecting multiple security domains

Integration Approach

GISolve Open Service APIs

Gateway service integration level: approach
  Security: token-based single sign-on
  Service chaining: workflow for composing and interacting with composite services
  Data discovery: metadata services
  User interface: shared user interface components via libraries

Security

Opal used by OT to wrap applications as Web services
  Opal itself comes from a third gateway!
Opal modified to work with the GISolve Open Service Security API
  REST-based API
CyberGIS deploys a token-based identity server
  Authentication and authorization
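
The idea behind token-based sign-on can be sketched as follows: an identity service signs a short-lived token, and any other service that trusts it can verify the token without asking for the password again. This is a generic HMAC-based illustration, not the GISolve Open Service Security API; the shared secret, token layout, and function names are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-shared-secret"  # hypothetical key known to the identity service


def issue_token(user, ttl_seconds=3600, now=None):
    """Identity service side: sign a payload naming the user and an expiry."""
    now = time.time() if now is None else now
    payload = json.dumps({"user": user, "exp": now + ttl_seconds}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def verify_token(token, now=None):
    """Service side: return the user name, or None if invalid or expired."""
    now = time.time() if now is None else now
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return None  # no signature part at all
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["user"] if claims["exp"] > now else None


token = issue_token("alice", now=1000)
```

In a two-gateway setting, the wrapped OT services would play the role of `verify_token`, checking each incoming request against the CyberGIS identity service rather than keeping their own user database.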

OT Services used in CyberGIS

Count Cloud
  Estimate the number of points in a selected bounding box
Data Selection
  Given a bounding box, retrieve LiDAR point cloud data
Points2Grid
  Generate DEMs from point cloud data using a variety of gridding functions (min, max, mean, IDW)
FormatTranslation
  Conversion between data formats
  ARC Grid files to GeoTIFF
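
The four gridding functions named above can be illustrated with a small sketch that turns (x, y, z) points into a DEM by aggregating the points within a search radius of each cell centre. This is not the Points2Grid source code; the function signature, the default radius, and the sample points are assumptions, and empty cells are simply reported as None.

```python
import math


def points_to_grid(points, x_min, y_min, cell_size, n_cols, n_rows,
                   method="mean", radius=None, power=2):
    """Grid (x, y, z) points; each cell aggregates points within `radius`."""
    radius = cell_size if radius is None else radius
    grid = []
    for row in range(n_rows):
        grid_row = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size  # cell centre
            cy = y_min + (row + 0.5) * cell_size
            near = [(math.hypot(x - cx, y - cy), z) for x, y, z in points]
            near = [(d, z) for d, z in near if d <= radius]
            grid_row.append(_aggregate(near, method, power))
        grid.append(grid_row)
    return grid


def _aggregate(near, method, power):
    if not near:
        return None  # no data for this cell
    zs = [z for _, z in near]
    if method == "min":
        return min(zs)
    if method == "max":
        return max(zs)
    if method == "mean":
        return sum(zs) / len(zs)
    if method == "idw":
        for d, z in near:
            if d == 0:
                return z  # a point exactly at the centre wins outright
        weights = [1 / d ** power for d, _ in near]
        return sum(w * z for w, (_, z) in zip(weights, near)) / sum(weights)
    raise ValueError(f"unknown gridding method: {method}")


# Three made-up LiDAR returns over a 2 x 1 grid of 1 m cells.
points = [(0.5, 0.5, 10.0), (0.25, 0.5, 20.0), (1.5, 0.5, 30.0)]
dem_mean = points_to_grid(points, 0, 0, 1, 2, 1, method="mean", radius=0.6)
dem_min = points_to_grid(points, 0, 0, 1, 2, 1, method="min", radius=0.6)
dem_max = points_to_grid(points, 0, 0, 1, 2, 1, method="max", radius=0.6)
dem_idw = points_to_grid(points, 0, 0, 1, 2, 1, method="idw", radius=0.6)
```

Min and max are useful for approximating bare ground and canopy or structure tops respectively, while mean and IDW give smoother surfaces.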

Service Chaining

OT services used to generate the Digital Elevation Models (DEMs) needed by the viewshed analysis application
Services chained and invoked as part of a pre-processing step by the GISolve middleware
  Submit, check status, and get results steps
  Workflow to streamline service invocations
Services use GISolve Open Service APIs to authenticate user requests
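
The submit, check status, and get results pattern, chained across several services, might look roughly like the sketch below. It uses an in-memory stand-in instead of real Opal or GISolve calls; the class, the status strings, and the file names are hypothetical and only illustrate the control flow.

```python
import itertools


class FakeService:
    """In-memory stand-in for a remote job service (hypothetical)."""

    def __init__(self, name, transform, polls_until_done=2):
        self.name = name
        self.transform = transform
        self.polls_until_done = polls_until_done
        self._jobs = {}
        self._ids = itertools.count(1)

    def submit(self, payload):
        job_id = f"{self.name}-{next(self._ids)}"
        self._jobs[job_id] = {"payload": payload, "polls": 0}
        return job_id

    def status(self, job_id):
        job = self._jobs[job_id]
        job["polls"] += 1
        return "DONE" if job["polls"] >= self.polls_until_done else "RUNNING"

    def result(self, job_id):
        return self.transform(self._jobs[job_id]["payload"])


def run_job(service, payload, max_polls=10):
    """Submit a job, poll its status, then fetch the result."""
    job_id = service.submit(payload)
    for _ in range(max_polls):
        if service.status(job_id) == "DONE":
            return service.result(job_id)
        # A real client would sleep between polls here.
    raise TimeoutError(f"{job_id} did not finish in {max_polls} polls")


def run_chain(services, payload):
    """Feed the output of each service into the next one."""
    for service in services:
        payload = run_job(service, payload)
    return payload


chain = [
    FakeService("select", lambda bbox: {"bbox": bbox, "points": "points.las"}),
    FakeService("grid", lambda d: {**d, "dem": "dem.asc"}),
    FakeService("translate",
                lambda d: {**d, "dem": d["dem"].replace(".asc", ".tif")}),
]
result = run_chain(chain, (-117.3, 32.8, -117.2, 32.9))
```

Hiding the polling loop inside `run_job` is what lets the middleware present the three OT steps to the viewshed application as one pre-processing call.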

Data Discovery

Enabled through metadata services
  Facilitate the discovery of and access to OT data sources
Two distinct metadata sources used
  Google Fusion Tables
    Vector (polygon) boundaries of OT datasets
  CSW (Catalogue Service for the Web) metadata
    CSW service APIs enable users to publish, browse, and search for specific metadata using the CSW protocol
    Supports HTTP binding
    OGC standard catalogue service
    Metadata schema: ISO 19139
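
As an illustration of searching a catalogue over the CSW HTTP binding, the sketch below builds a CSW 2.0.2 GetRecords request as a key-value URL that asks for ISO 19139 records intersecting a bounding box. The endpoint is a placeholder rather than the OpenTopography catalogue address, and the queryable name and axis order in the CQL constraint are assumptions that vary between CSW servers.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getrecords_url(endpoint, bbox, max_records=10):
    """Build a CSW 2.0.2 GetRecords KVP URL filtered by a bounding box."""
    min_x, min_y, max_x, max_y = bbox
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "full",
        # Ask for ISO 19139 (gmd) encoded metadata records.
        "outputSchema": "http://www.isotc211.org/2005/gmd",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        # Queryable name and axis order are server-dependent assumptions.
        "constraint": (
            f"BBOX(ows:BoundingBox, {min_x}, {min_y}, {max_x}, {max_y})"
        ),
        "maxRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint; no request is actually sent in this sketch.
url = build_getrecords_url("https://csw.example.org/csw",
                           (-117.3, 32.8, -117.2, 32.9))
query = parse_qs(urlsplit(url).query)
```

The response to such a request would be an XML document of matching records, from which the gateway can extract dataset IDs, coordinate systems, and extents.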

Workflow

[Workflow diagram; labels: CyberGIS Gateway, Viewshed Interface, GISolve Middleware, Open Service APIs, Service Chaining, CyberGIS Identity Service, OpenTopography Opal Web Services, Metadata Service, Count Service, Data Access Service, Data Processing Service, Service Infrastructure, CyberInfrastructure]

Existing User Interfaces - CyberGIS & OT

[Screenshots; captions: Data Selection & Viewshed Analysis; LiDAR Data Search & DEM Generation]

Reusing OT user interface

User interface components shared
  Via OT libraries


Link user interfaces

Validate input and collect metadata

Restrictions on the number of cloud points that can be retrieved and the number of cells in a DEM
Viewpoints must be within the spatial extent of datasets
Metadata necessary for data transformation
  ID, coordinate system, bounding box
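
The checks listed above could be expressed as a small validation routine run before any job is submitted. The sketch below is only an illustration: the limit values, class, and function names are hypothetical, since the slides do not state the actual thresholds.

```python
import math
from dataclasses import dataclass

MAX_POINTS = 50_000_000     # hypothetical limit on retrieved cloud points
MAX_DEM_CELLS = 25_000_000  # hypothetical limit on DEM size


@dataclass(frozen=True)
class BoundingBox:
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x, y):
        return (self.min_x <= x <= self.max_x
                and self.min_y <= y <= self.max_y)


def validate_request(estimated_points, selection, cell_size, viewpoints,
                     dataset_extent):
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    if estimated_points > MAX_POINTS:
        errors.append(f"too many points: {estimated_points} > {MAX_POINTS}")
    n_cols = math.ceil((selection.max_x - selection.min_x) / cell_size)
    n_rows = math.ceil((selection.max_y - selection.min_y) / cell_size)
    if n_cols * n_rows > MAX_DEM_CELLS:
        errors.append(f"DEM too large: {n_cols * n_rows} cells")
    for x, y in viewpoints:
        if not dataset_extent.contains(x, y):
            errors.append(f"viewpoint ({x}, {y}) is outside the dataset")
    return errors


extent = BoundingBox(0, 0, 1000, 1000)
selection = BoundingBox(0, 0, 100, 100)
problems = validate_request(1000, selection, 1.0, [(50, 50)], extent)
```

In the workflow described here, `estimated_points` would come from the Count Cloud service and `dataset_extent` from the collected metadata, so invalid requests can be rejected before any expensive processing starts.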

Interaction with OT Web Services

[Diagram; labels: Google Fusion Table & CSW Metadata Services; Count Cloud Service]

What did we achieve?

Concluding Remarks

Gateways as a means to democratize science
Importance of interoperation of gateways
  Especially in GIS, where layering of data is so useful
Application-driven
  High-res LiDAR and high-end computing
Standard-based
  Enables interoperability
Principles
  Reusability
  Extensibility
  Reliability
  Scalability
Groundbreaking knowledge gained for integrating service-oriented geospatial software environments

Acknowledgements

National Science Foundation
  BCS-0846655
  OCI-1047916
  OCI-0503697
  TeraGrid SES070004N
Colleagues
  http://www.cigi.illinois.edu/doku.php/people/index

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation

Thank you


Questions?


Comments?


Discussion?