Pierce-ScienceGateways-CGB - Indiana University

Feb 5, 2013 (4 years and 8 months ago)

Building Science Gateways

Marlon Pierce

Community Grids Laboratory

Indiana University

What Is a Web Portal?


Web container that
aggregates content from
multiple sources into a
single display.


“Start Pages”


Typically consume
RSS/Atom news feeds.


More powerful versions
these days support Flickr,
calendars, games, etc.


Gadgets, widgets


Examples: iGoogle,
Netvibes, My Yahoo!


Grid Computing Overview


Grid computing software is designed to integrate large
supercomputing facilities.


TeraGrid, Open Science Grid, EGEE, etc.


This is done via network services


Key Service Components


Authentication and authorization framework (MyProxy)


Remote process access and control (GRAM, Condor)


Remote file, I/O access (GridFTP)


Additional Services


Information services, replica management, database federation,
storage management, schedulers, etc.


Example Grid Software Stacks: CTSS and VDT



TeraGrid Supercomputing Resources (GPIR)

Science Portals and Gateways


Science Gateways adapt Web portal
technology to build user interfaces to the
Grid.


Science portals resemble standard portals, but
must also


Support access to computing and storage resources.


Allow users remote, Unix-like access to these resources.


Provide access to science applications and data sets.


We must also provide value-added services as
well as user interfaces.

[Figure: My 2002 “octopus” SOA diagram, from the archives. Labels: Browser Interface; Portlets + Client Stubs; DB Service (JDBC, DB); Job Sub/Mon and File Services (Operating and Queuing Systems); Visualization Service (DB); Host 1, Host 2, Host 3. Links are labeled HTTP(S) and SOAP/HTTP, and each service interface is labeled WSDL.]

Terminology


Portlet: a standard Java component that generates
HTML and can also act as a client to a remote service.


Lives in a portal container.


I will also use this term generically.


Web Service: a remotely invocable function on the
Internet.


SOAP: the XML message envelope for carrying commands over
HTTP.


WSDL: describes the service’s API in XML.


REST: a variation of this approach that uses plain HTTP operations on URLs in place of SOAP envelopes.


Lots more info:
http://grids.ucs.indiana.edu/ptliupages/presentations/I590WebService.ppt
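
A minimal sketch of how the SOAP and REST styles differ on the wire, using only the Python standard library. The operation name runDisloc, its parameters, and the example.org addresses are hypothetical and are not taken from any service described in this talk.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/disloc"  # hypothetical service namespace


def build_soap_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def build_rest_url(base_url, resource, params):
    """Express the same request as a resource URL with query parameters."""
    return f"{base_url}/{resource}?{urlencode(params)}"


params = {"faultName": "SanAndreas", "gridSpacing": 5}  # made-up inputs
soap_xml = build_soap_request("runDisloc", params)
rest_url = build_rest_url("http://example.org/services", "disloc", params)
print(soap_xml)
print(rest_url)
```

In the SOAP style the XML document would be POSTed to a single service endpoint described by WSDL. In the REST style the URL itself names the resource and the HTTP verb names the action.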


But Why?


Three-tiered Service Oriented Architecture is the
network equivalent of the famous Model-View-Controller
design pattern.


View: the user interface components.


Controller: Web service middleware


Model: the backend resources.


Independence of tiers gives flexibility


Services can be reused with alternative user interfaces


Workflow composers like Taverna


User interfaces can work with different service
implementations.


Drawback: reliability and robustness are issues.
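
A minimal sketch of the tier independence described above, assuming a toy in-memory backend. The class and function names are hypothetical; in a real gateway the service would sit behind a network interface, which is where the reliability concerns come in.

```python
class QueueBackend:
    """Model tier: stands in for a backend resource such as a batch queue."""

    def __init__(self):
        self._jobs = {}

    def submit(self, executable):
        job_id = len(self._jobs) + 1
        self._jobs[job_id] = {"executable": executable, "status": "QUEUED"}
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]["status"]


class JobService:
    """Controller tier: the only code that talks to the backend."""

    def __init__(self, backend):
        self._backend = backend

    def run(self, executable):
        job_id = self._backend.submit(executable)
        return {"id": job_id, "status": self._backend.status(job_id)}


def html_view(result):
    """View tier, variant 1: what a portlet might render."""
    return f"<p>Job {result['id']}: {result['status']}</p>"


def text_view(result):
    """View tier, variant 2: what a command-line or workflow client might print."""
    return f"job={result['id']} status={result['status']}"


service = JobService(QueueBackend())
result = service.run("/bin/ls")
print(html_view(result))
print(text_view(result))
```

Both views consume the same service result, so either one can be replaced without touching the service or the backend.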

Two Approaches to the Middle Tier

[Figure: fat-client and thin-client designs. Fat Client: Portal Client with Grid Client, Grid Protocol (SOAP), Grid Service, Backend Resource. Thin Client: Portal Client, HTTP + SOAP, Web Service with Grid Client, Grid Protocol (SOAP), Grid Service, Backend Resource.]
[Screenshot: Disloc output converted to KML and plotted.]

[Screenshot: GeoFEST Finite Element Modeling portlet and plotting tools.]

What’s In the Screenshots?


GeoFEST and Disloc Portlets


Live on gf7.ucs.indiana.edu


Manage the user’s display: Web forms, links to output,
graphics.


Save user session state persistently.


QuakeTables Fault DB Web Service


Lives on gf2.ucs.indiana.edu


Contains geometric fault models.


GeoFEST and Disloc Execution Web Services


Live on gf19.ucs.indiana.edu


Generate input files from fault models.


Run and manage codes.

Best Practice for Scientific Web
Services


There are many tools to choose from.


.NET, Apache Axis, Sun WS, Ruby on Rails, etc.


Make them self-contained.


If possible, generate input files within the service.


Or have an input file generating service.


Remember that they may be used by other people with other
client tools.


Communicate data files with URLs.


Be very careful about exposing the state of the service.


Don’t assume persistent connections.
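
A minimal sketch of these practices, assuming a toy in-process service. All names and the example.org address are hypothetical. The service returns a job ticket and URLs rather than file contents, keeps its internal bookkeeping private, and is polled with independent calls instead of a held-open connection.

```python
import itertools


class ExecutionService:
    """Toy stand-in for a code-execution Web service (all names hypothetical)."""

    def __init__(self, base_url):
        self._base_url = base_url
        self._ids = itertools.count(1)
        self._polls_left = {}  # internal state, never returned to callers

    def submit(self, fault_model):
        """Accept a fault model and return a job ticket.

        A real service would generate the input file from fault_model here.
        """
        job_id = f"job-{next(self._ids)}"
        self._polls_left[job_id] = 2  # pretend the job needs two polls to finish
        return {"job_id": job_id,
                "status_url": f"{self._base_url}/{job_id}/status"}

    def status(self, job_id):
        """Each call is independent; nothing is held open between calls."""
        if self._polls_left[job_id] > 0:
            self._polls_left[job_id] -= 1
            return {"state": "RUNNING"}
        return {"state": "DONE",
                "output_url": f"{self._base_url}/{job_id}/output.kml"}


service = ExecutionService("http://example.org/disloc")
ticket = service.submit({"fault": "hypothetical-fault"})
states = []
while True:
    reply = service.status(ticket["job_id"])
    states.append(reply["state"])
    if reply["state"] == "DONE":
        break
print(states, reply["output_url"])
```

Because the reply carries only a state and a URL, any client tool that can fetch a URL can pick up the output.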

Components for Portals

Open Grid Computing Environments Examples.
See http://www.collab-ogce.org/


Components for Science Portals


OGCE is founded on the principle that portals
should be built out of reusable parts.


Key standard in our first phase: the JSR 168 portlet
specification.


Portlets can run in multiple containers


uPortal, Sakai, GridSphere, Liferay, etc.


Allows us to build Grid-specific components and
deploy them alongside other goodies: Sakai
collaboration tools, contributed portlets, etc.


Future: OpenSocial-compliant Google Gadgets

OGCE GPIR portlet can interoperate
with TeraGrid and your own GPIR
services.

Manage TeraGrid MyProxy
credentials with the OGCE
ProxyManager portlets.

OGCE file management client
portlets interact with TeraGrid
GridFTP servers.

General-purpose batch and interactive job
submission to GRAM and WS-GRAM is supported.

Dashboard Portlet

The dashboard portlet allows users to track jobs on the
selected resource. Users can view either their own
jobs or information on all submitted jobs.

Queue forecasting portlets work
with the NWS QBETS to predict
wait times and deadlines.

PURSe portlets manage user requests for
portal accounts and Grid credentials.

Condor and Condor-G

OGCE IFrame Portlet can be
used to integrate external
sites.

Client Libraries for Grid
Computing

Two Major Grid Client Efforts


The Java CoG Kit


Supports several versions of Globus and SSH.


Condor-G


Has a Web Service interface (BirdBath) and Java
client libraries.


Supports Globus (v2 and v4) and several other Grid
middleware systems.


You can build either portlets or Web services with
either of these.


OGCE portlets primarily use the CoG Kit.


We prefer Condor-G based Web services for long-running
jobs.

[Figure: Java CoG Kit layers. Applications (Nanomaterials, Bio-Informatics, Disaster Management, Portals) and Development Support use the CoG Gridfaces Layer and CoG GridIDE, which sit on the CoG Data and Task Management Layer and the CoG Abstraction Layer. Below the abstraction layer are CoG providers for GT2, GT3 (X), GT4 WS-RF, Condor, Unicore, SSH, and others.]

CoG Abstraction Layers

[Figure: class diagram with Task, Task Handler, Service, Task Specification, Security Context, and Service Contact.]

The class diagram is the same for all grid tasks (running jobs,
modifying files, moving data).

Classes also abstract toolkit provider differences. You set
these as parameters: GT2, GT4, etc.

Coupling CoG Tasks


The CoG abstractions also simplify creating coupled tasks.


Tasks can be assembled into task graphs with dependencies.


“Do Task B after successful Task A”


Graphs can be nested.
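
A minimal sketch of the task-graph idea, written with Python's standard graphlib module rather than the CoG Kit API. The Task and TaskGraph classes and the task names are hypothetical. A graph exposes the same name and run interface as a task, so graphs can be nested.

```python
from graphlib import TopologicalSorter


class Task:
    def __init__(self, name, action):
        self.name = name
        self.action = action  # callable returning True on success

    def run(self, log):
        log.append(self.name)
        return self.action()


class TaskGraph:
    """A set of tasks with dependencies; usable as a task inside another graph."""

    def __init__(self, name):
        self.name = name
        self._tasks = {}
        self._deps = {}

    def add(self, task, after=()):
        self._tasks[task.name] = task
        self._deps[task.name] = set(after)

    def run(self, log):
        # static_order() yields each task only after all of its dependencies.
        for name in TopologicalSorter(self._deps).static_order():
            if not self._tasks[name].run(log):
                # Simplification: stop the whole graph on the first failure,
                # so no dependent task runs after a failed prerequisite.
                return False
        return True


# Inner graph: fetch a credential, then stage the input file.
stage = TaskGraph("stage-in")
stage.add(Task("get-proxy", lambda: True))
stage.add(Task("copy-input", lambda: True), after=["get-proxy"])

# Outer graph: the inner graph is just another task.
workflow = TaskGraph("workflow")
workflow.add(stage)
workflow.add(Task("run-job", lambda: True), after=["stage-in"])
workflow.add(Task("copy-output", lambda: True), after=["run-job"])

log = []
ok = workflow.run(log)
print(ok, log)
```

Stopping at the first failure is the simplest way to express "do Task B after successful Task A"; a fuller implementation would let independent branches continue.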

Problems with Grid Client
Development


Grid portlets typically wrap each single Grid capability in a
separate portlet.


The problem is that Grid portlets need to combine these operations.


Portlets are entire web applications, so we need a component model for
portlets: reusable portlet parts.


Even with the CoG Abstraction Layer, we must still do a lot of
coding to build new applications.


To address these problems we have adopted JavaServer Faces (JSF).


Provides several nice Model-View-Controller features.


JSF provides an extensible framework (tag libraries) for making
reusable components.


The Apache JSF portlet bridge allows you to convert standalone JSF
applications (development phase) into portlets (deployment phase).

GTLAB Example

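<html>
  <body>
    <f:form>
      <!-- The submit tag wraps the Grid tags it should execute. -->
      <o:submit id="test" action="next_page">
        <!-- Retrieve a MyProxy credential for the user. -->
        <o:myproxy id="pr"
                   hostname="gf1.ucs.indiana.edu"
                   port="7512" lifetime="2" username="mnacar"
                   password="***" />
        <!-- Run /bin/ls on the remote host through the GT4 provider. -->
        <o:jobsubmit id="task"
                     hostname="cobalt.ncsa.teragrid.org"
                     provider="GT4" executable="/bin/ls"
                     stdout="tmp/result"
                     stderr="tmp/error" />
      </o:submit>
    </f:form>
  </body>
</html>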

Grid Tags          Associated Grid Beans   Features

<submit/>          ComponentBuilderBean    Creating components, job handlers, submitting jobs
<handler/>         MonitorBean             Handling monitoring page actions
<multitask/>       MultitaskBean           Constructing simple workflow
<dependency/>      MultitaskBean           Defining dependencies among sub jobs
<myproxy/>         MyproxyBean             Retrieving myproxy credential
<fileoperation/>   FileOprationBean        Providing Gridftp operations
<jobsubmission/>   JobSubmitBean           Providing GRAM job submissions
<filetransfer/>    FileTransferBean        Providing Gridftp file transfer
(no tag)           ResourceBean            Describes common properties among all tags and beans. Passing values given by standard visual JSF components.

Managing Scientific
Workflows

Scientific Workflows


Portal interfaces encode scientific use cases.


If you have a rich set of services, it is a lot of work
to make portlets for all possible use cases.


And power users will always want
something more.


Example: our CICC project has dozens of chemical
informatics Web services.


http://www.chembiogrid.org/wiki/


Workflow composers can simplify this.


Allow users to encode and execute their own use
cases.

Web Services and Workflows


Perform a similarity
search on the NIH DTP
Human Tumor data.


Filter the results based
on Pharmacokinetic
properties (FILTER)


Convert to 3D (OMEGA)


Docking into a pre-defined protein (FRED)



Visualize (JMOL).


Taverna workflow connects
remote services.
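
A minimal sketch of what a workflow composer does with steps like these: it chains independent services so the output of one becomes the input of the next. The functions below are local stand-ins with made-up placeholder records, not the real CICC, FILTER, OMEGA, or FRED interfaces.

```python
from functools import reduce


def similarity_search(query):
    # Stand-in for a remote search service; returns made-up candidate records.
    return [{"id": "c1", "score": 0.9, "ok_pk": True},
            {"id": "c2", "score": 0.8, "ok_pk": False}]


def pk_filter(candidates):
    # Stand-in for the filtering step: keep candidates flagged as acceptable.
    return [c for c in candidates if c["ok_pk"]]


def to_3d(candidates):
    # Stand-in for the 3D conversion step.
    return [dict(c, coords="3D") for c in candidates]


def dock(candidates):
    # Stand-in for the docking step.
    return [dict(c, docked=True) for c in candidates]


def run_workflow(steps, initial):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda data, step: step(data), steps, initial)


result = run_workflow([similarity_search, pk_filter, to_3d, dock],
                      "query-structure")
print(result)
```

A user who wants a different use case only reorders or swaps entries in the step list; no new portlet has to be written.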

OGCE’s XBaya
Workflow Composer

Future of Science
Gateways

Updating the Octopus

[Figure: the octopus diagram, updated. Labels: Browser Interface; Social Gadgets+AJAX; DB Service (JDBC, DB); Job Sub/Mon and File Services (Operating and Queuing Systems); Visualization Service (DB); Host 1, Host 2, Host 3. Links are labeled HTTP(S) and RSS, JSON/HTTP. Service interfaces are now labeled REST, with one still labeled WSDL.]

Enterprise Approach                        Web 2.0 Approach

JSR 168 Portlets                           Gadgets, Widgets
Server-side integration and processing     AJAX, client-side integration and processing, JavaScript
SOAP                                       RSS, Atom, JSON
WSDL                                       REST (GET, PUT, DELETE, POST)
Portlet Containers                         Open Social Containers (Orkut, LinkedIn, Shindig); Facebook; StartPages
User Centric Gateways                      Social Networking Portals
Workflow managers (Taverna, Kepler, etc)   Mash-ups
Grid computing: Globus, condor, etc        Cloud computing: Amazon WS Suite, Xen Virtualization
Semantic Web: RDF, OWL, ontologies         Microformats, folksonomies

[Screenshot: Microformats, KML, and GeoRSS feeds used to deliver SAR data to multiple clients.]

More Information


Contact me:
mpierce@cs.indiana.edu


See what I’m up to:
http://communitygrids.blogspot.com/


OGCE software:
http://collab-ogce.org/


QuakeSim:
http://www.quakesim.org/


CICC:
http://www.chembiogrid.org/wiki/


Lots of people worked on all of these.