Elder Matias Presentation to EFD Software - Canadian Light Source ...


Oct 31, 2013 (3 years and 9 months ago)


Web 2.0


Elder Matias

CLS


09-04-28

What Is Web 2.0?









In plain English ….


Automating tedious tasks using web technology


Tools to help people and software collaborate







Scientific American, May 2008

Science 2.0

The Risk and Reward of Web-Based Research





---------------------------------




“Our real mission isn’t to publish journals but to facilitate scientific communication”

Timo Hannay, Head of Web Publishing at Nature Publishing Group




ScienceStudio




User Access to Synchrotrons


Who is the community that will use your
platform?


Synchrotrons are electron storage rings that emit high intensity
photons that are used for experiments by a large scientific
community (tens of thousands worldwide).


Access is normally granted for single periods of 1-3 days in a half-year cycle.


What couldn’t your community do without the
platform?



Physical distances and episodic access prevent rapid scientific
progress and limit scientific collaboration.


Why was that a problem or limitation?


Governments worldwide have invested >$2B in these facilities, yet the scientific outcomes could be optimised.



User Access to Synchrotrons




What middleware was needed to resolve the
limitations?


Workflow management Engine for the User Office


Web Portal for remote data access (during and post
experiment)


Enterprise Service Bus and SOA to integrate internal and
external data analysis services


How do your plans meet the needs?


Users will have frequent remote access to the VESPERS
beamline at the Canadian Light Source under conditions
where many collaborators can participate in the
experiment.



Science Studio serves three purposes:


Management of all aspects of a scientific experiment
including data storage, collaboration with others,
processing of data;



Control of, or interaction with, remote experiments on the CLSI VESPERS Beamline and UWO Nanofabrication Laboratory; and



User Services (sample management, scheduling, peer
review, user training)



Team: People and Orgs


Remote Control


User Services


System Deployment


Integration



System Architecture


System Requirements


Testing


Data Analysis/Grid Computing


User Office Software


Scientific Workflow Engines



Team: People and Orgs

Dionisio Medrano

Dylan Maxwell

Daron Chabot

Elder Matias

Chris Armstrong

John Haley

Mike Bauer

Stewart McIntyre

Marina Suominen Fuller

Jinhui Qin

Nathaniel Sherry

Yuhong Yan

Zahid Anwar

Ludeng (Eric) Zhao

Dan Ni

Yaofeng Xu


System Architecture


Figure: architecture diagram showing the Web Application, Beamline Control Module, DB, SAN and VESPERS, linked by HTTP, JMS and CA.

1. VESPERS Beamline

2. EPICS control system

3. Beamline Control Module (BCM)

4. Web Application

5. Database

6. File Storage

7. Web Interface

VESPERS Beamline


VESPERS


Very Sensitive Elemental and Structural Probe
Employing Radiation from a Synchrotron



A bending magnet beamline on sector 6 at the Canadian Light
Source synchrotron in Saskatoon, Saskatchewan.



A hard x-ray microprobe with an energy range of 6 to 30 keV.



Techniques: X-Ray Fluorescence (XRF) & X-Ray Diffraction (XRD)



VESPERS Endstation

CCD Detector (XRD)

Microscope

MCA Detector (XRF)

Sample

EPICS Low-level Control System


EPICS


Experimental Physics and Industrial Control System



The standard control system at the CLS.



EPICS consists of a network of Input/Output Controllers (IOCs), which are connected directly to devices.



An IOC provides many Process Variables (PVs) which relate to either an input or
output from a device and have a unique name.



Channel Access (CA) is used to read or write to any PV without knowing which IOC
provides the PV.



More than 50,000 PVs in the CLS control system.
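
The naming idea behind Channel Access can be illustrated with a toy model. The sketch below is not EPICS code and uses no EPICS library: the `IOC` and `ChannelAccess` classes and the `DEMO:` PV names are hypothetical, and real CA locates PVs over the network rather than through an in-process dictionary. It only shows that a client reads and writes by PV name, without knowing which IOC provides the PV.

```python
class IOC:
    """Toy stand-in for an Input/Output Controller that owns some PVs."""

    def __init__(self, name, pvs):
        self.name = name
        self.pvs = dict(pvs)  # PV name -> current value


class ChannelAccess:
    """Toy name service: clients address PVs by name, never by IOC."""

    def __init__(self, iocs):
        self._owner = {}
        for ioc in iocs:
            for pv in ioc.pvs:
                if pv in self._owner:
                    # PV names must be unique across the whole system.
                    raise ValueError(f"duplicate PV name: {pv}")
                self._owner[pv] = ioc

    def get(self, pv):
        return self._owner[pv].pvs[pv]

    def put(self, pv, value):
        self._owner[pv].pvs[pv] = value


# Hypothetical PV names, for illustration only.
motor_ioc = IOC("ioc-motors", {"DEMO:stage:x": 0.0, "DEMO:stage:y": 0.0})
detector_ioc = IOC("ioc-detector", {"DEMO:mca:deadtime": 2.5})
ca = ChannelAccess([motor_ioc, detector_ioc])

ca.put("DEMO:stage:x", 1.25)              # caller never names the IOC
x_position = ca.get("DEMO:stage:x")
dead_time = ca.get("DEMO:mca:deadtime")
```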



Beamline Control Module (BCM)


The BCM provides a high-level interface to the low-level control system (EPICS).



Logical and physical separation of business logic and control logic.



Virtual device abstraction that provides independence from the low-level control system.



Virtual devices can be logically organized into a device hierarchy.



Basic devices can be combined to build more functional devices.



Communication with external applications using two message queues (ActiveMQ).
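
The virtual-device ideas listed above (abstraction from the control system, a device hierarchy, basic devices combined into more functional ones) can be sketched as follows. This is a minimal illustration, not the BCM's actual design: the class names, the path syntax and the dictionary standing in for the low-level control system are all assumptions.

```python
class VirtualDevice:
    """Base class: a named node in the device hierarchy."""

    def __init__(self, name):
        self.name = name
        self.children = {}

    def add(self, device):
        self.children[device.name] = device
        return device

    def find(self, path):
        """Resolve a path such as 'stage/x' relative to this device."""
        node = self
        for part in path.split("/"):
            node = node.children[part]
        return node


class Motor(VirtualDevice):
    """Basic device. `backend` hides the low-level control system."""

    def __init__(self, name, backend, pv):
        super().__init__(name)
        self._backend = backend
        self._pv = pv

    def move_to(self, position):
        self._backend[self._pv] = position

    def position(self):
        return self._backend[self._pv]


class SampleStage(VirtualDevice):
    """More functional device built from two basic motors."""

    def __init__(self, name, x_motor, y_motor):
        super().__init__(name)
        self.add(x_motor)
        self.add(y_motor)

    def move_to(self, x, y):
        self.children["x"].move_to(x)
        self.children["y"].move_to(y)

    def position(self):
        return (self.children["x"].position(), self.children["y"].position())


# A plain dict stands in for the low-level control system in this sketch.
backend = {"DEMO:stage:x": 0.0, "DEMO:stage:y": 0.0}
beamline = VirtualDevice("beamline")
stage = beamline.add(
    SampleStage("stage", Motor("x", backend, "DEMO:stage:x"),
                Motor("y", backend, "DEMO:stage:y"))
)
stage.move_to(1.5, -0.5)
```

Callers work only with `stage.move_to(...)` or `beamline.find("stage/x")`; swapping the dictionary for a different backend would not change that calling code.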



Web Application


A J2EE Servlet application that provides a web-based interface to Science Studio.



Tools: Spring (MVC), iBATIS (ORM), JSecurity (Apache Ki), Apache Tomcat



Divided into two parts: the Core application and the VESPERS beamline application.



Core application is responsible for providing access to the business objects.



VESPERS application is responsible for remote control of the VESPERS beamline.



Database


Metadata associated with the operation of a remote controlled beamline and the
organization of experimental data collected on that beamline.



A project is the top level organizational unit and is associated with a project team.



A session defines a period of time allocated to a project team to conduct experiments.



An experiment relates a sample and the technique being applied to that sample.



A scan records the location of the acquired experimental data.
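
The four organizational units can be pictured as nested records. The sketch below is an illustration only: the field names, and the assumption that experiments hang off a session and scans off an experiment, are not taken from the actual Science Studio schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Scan:
    name: str
    data_location: str  # where the acquired data files live


@dataclass
class Experiment:
    sample: str
    technique: str  # e.g. "XRF" or "XRD"
    scans: List[Scan] = field(default_factory=list)


@dataclass
class Session:
    start: datetime
    end: datetime
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class Project:
    name: str
    team: List[str]
    sessions: List[Session] = field(default_factory=list)


# Made-up example data.
project = Project(name="Demo project", team=["alice", "bob"])
session = Session(start=datetime(2009, 5, 1, 8), end=datetime(2009, 5, 3, 8))
experiment = Experiment(sample="demo-sample-1", technique="XRF")
experiment.scans.append(Scan(name="map-001", data_location="/data/demo/map-001"))
session.experiments.append(experiment)
project.sessions.append(session)

# Walk the hierarchy to list where every scan's data is stored.
scan_locations = [
    scan.data_location
    for s in project.sessions
    for e in s.experiments
    for scan in e.scans
]
```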



Database Schema

person

project_person

project_role

project

session

laboratory

sample

experiment

scan

technique

instrument

Instrument_technque

facility

Experimental Data Storage


Experimental data is stored at the CLS.



Common directory structure shared with other beamlines.




A large data storage facility is now operational at the
University of Saskatchewan as part of WestGrid.



VESPERS Web Interface


Rich web interface to Science Studio and the VESPERS beamline.



Designed to be used over commodity broadband internet.



Developed for the Firefox web browser without any additional plugins or extensions.



Known to work with other browsers, but requires the Canvas HTML tag.



AJAX is used for the VESPERS interface to provide device values in pseudo real time.



ExtJS, a JavaScript framework, provides many advanced GUI elements.




Beamline Setup

Experiment Setup

XRF (X-Ray Fluorescence)

Beamline Hutch Cameras

Experimental Data Viewer

User Office Workflow


Goal: Many tasks in proposal & sample management at CLS


To develop a workflow management system that


manages the ordering of tasks (e.g. training before shipping)


tracks manual as well as SS task progression
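
Task ordering of this kind (for example, training before shipping) is a dependency-ordering problem. The sketch below is independent of any particular workflow engine; the task names and the prerequisite table are hypothetical, and it relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# task -> set of tasks that must be completed first (hypothetical task names)
prerequisites = {
    "submit_proposal": set(),
    "peer_review": {"submit_proposal"},
    "online_training": {"peer_review"},
    "ship_sample": {"online_training"},
    "perform_experiment": {"ship_sample"},
}

# One valid overall ordering of the tasks.
order = list(TopologicalSorter(prerequisites).static_order())


def ready_tasks(done):
    """Tasks whose prerequisites are all complete and that are not yet done."""
    return sorted(
        task for task, needs in prerequisites.items()
        if task not in done and needs <= done
    )
```

Calling `ready_tasks` with the set of finished tasks gives the next tasks a user may start, which is also a simple way to track progress whether a task was completed manually or online.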


Mar

6-month cycle

CLS call for proposals

Proposal submission to CLS

CLS gathers proposals

CLS reviews proposals

CLS grants scientist beamline time

Scientist packs sample

“I wonder if CLS received my sample yet?”

Scientist must complete online SS training

CLS health & safety inspection

Many other tasks

Perform Experiment

Return Sample

Take Survey



User office Workflow Status


Workflow Management Engine


Beamline User


User Office


Task: Training


Completed


Notify


Approved


Notify


Record Progress


Features


Open source, based on Petri nets


Direct support for workflow control flow patterns


Ability to interact with web services declared in WSDL


Relies on XML standards (e.g. XPath and XQuery) for data and doesn’t use proprietary languages


Architecture


System Core: YAWL engine.


Engine instantiates specifications


designed using the YAWL designer


managed by the YAWL repository


Environment composed of YAWL services


inspired by the “web services” paradigm


end-users, applications, and organizations are all services in YAWL.

Screenshot: User Training Test Creation

Screenshot: User Survey Taking Page

Screenshot: User Survey Edit Page

Screenshot: Workflow Sample Management

Screenshot: Workflow Call for Proposals

User Office Workflow Example


Prototype Implementation

1. CLS issues a call for proposals and gives a deadline

2. Beamline users submit proposals

3. User Office administrator ends registration or extends the deadline

4. User Office administrator assigns proposals to User Office reviewers

5. Reviewers look at proposals and rank them

6. User Office looks at the ranking and chooses the proposals to accept

7. Contact persons of accepted proposals are notified

8. Beamline user completes training (web service)

9. After training is completed (simulated by a delay), the CLS is notified

Scheduling Module


Goal: To automate the review process and the
method by which beam time is allocated and
scheduled to users depending on


the access mechanism chosen by the user and


the stage of operation (construction, commissioning or operation) of the beamline.


Side effects:


Facilitate the management of cycles, runs and modes of operation


Use automatic scheduling to handle more scheduling
conditions and constraints than human beings are able
to handle manually and identify optimal solutions.

Scheduling Module Features

INPUT: Users submit proposals

SEARCH AND CONSTRAINT SATISFIABILITY: Integer Programming and Heuristic Algorithm

OUTPUT: Schedule

Beamlines: 2

Experiments: 3

Release Times: [1,1,2]

Deadlines: [8,15,5]

Weights: [4,5,1]

Processing Times: [10,4,3]

Eligibility: [[0,1,0],[1,0,1]]

CONSTRAINTS

1. One beamline per experiment

2. Start time after release time

3. Only eligible beamlines can be selected

…

7. No overlap of experiments per beamline
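
The small instance above can be worked through by exhaustive search. The sketch below is an illustration rather than the scheduling module's algorithm: it uses brute force instead of integer programming or a heuristic, reads the eligibility matrix as rows of beamlines by columns of experiments, and assumes the objective is to minimise total weighted tardiness. Deadlines are treated as soft because the first experiment (release 1, processing time 10, deadline 8) cannot finish on time. Constraints 4 to 6 are not listed on the slide and are not modelled. Indices are zero-based.

```python
from itertools import permutations, product

release = [1, 1, 2]
deadline = [8, 15, 5]
weight = [4, 5, 1]
processing = [10, 4, 3]
eligible = [[0, 1, 0], [1, 0, 1]]  # eligible[beamline][experiment] (assumed layout)

n_beamlines = len(eligible)
n_experiments = len(release)


def cost_of_order(order):
    """Weighted tardiness of running `order` back to back on one beamline."""
    free_at, cost = 0, 0
    for e in order:
        start = max(free_at, release[e])   # constraint 2: not before release
        free_at = start + processing[e]    # constraint 7: no overlap
        cost += weight[e] * max(0, free_at - deadline[e])
    return cost


def best_schedule():
    # Constraint 3: only eligible beamlines are candidates for each experiment.
    choices = [
        [b for b in range(n_beamlines) if eligible[b][e]]
        for e in range(n_experiments)
    ]
    best = None
    for assignment in product(*choices):   # constraint 1: one beamline each
        total, plan = 0, []
        for b in range(n_beamlines):
            mine = [e for e in range(n_experiments) if assignment[e] == b]
            order = min(permutations(mine), key=cost_of_order)
            total += cost_of_order(order)
            plan.append(list(order))
        if best is None or total < best[0]:
            best = (total, plan)
    return best


total_cost, plan = best_schedule()  # plan[b] lists experiments in run order
```

Exhaustive search is only practical for tiny instances like this one, which is presumably why the module combines integer programming with a heuristic.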


X-Ray Fluorescence (XRF): Reveals Elemental Composition


Characteristic Element Lines Selected and Mapped Over a 2D Scan Area


Figure: 2D maps generated for selected elemental lines, with panels labelled S, Cr, “& Cr”, Fe, “& Fe”, Ni and “& Ni”.
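
One common way to turn a selected element line into a 2D map is to sum the detector counts inside a channel window at every scan point. The sketch below assumes that approach; it is not taken from the Science Studio code. The spectra are made-up numbers and the channel windows are hypothetical, since real windows depend on the detector's energy calibration.

```python
def roi_map(spectra, first_channel, last_channel):
    """Sum counts in channels [first_channel, last_channel] at every scan point.

    `spectra[row][col]` is the list of per-channel counts recorded at that point.
    """
    return [
        [sum(spectrum[first_channel:last_channel + 1]) for spectrum in row]
        for row in spectra
    ]


# Tiny synthetic 2 x 2 scan with 6-channel spectra (made-up numbers).
spectra = [
    [[0, 1, 9, 8, 1, 0], [0, 1, 2, 1, 7, 6]],
    [[1, 0, 5, 4, 0, 1], [0, 0, 1, 1, 3, 2]],
]

# Hypothetical channel windows for two element lines.
windows = {"line_A": (2, 3), "line_B": (4, 5)}
maps = {name: roi_map(spectra, lo, hi) for name, (lo, hi) in windows.items()}
```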


X-Ray Diffraction (XRD): Reveals Structural Information


Peak Fitting and Indexing of Image Set to Create a Grain Orientation Map

Peak Search

Figure: peak search results from the Old IDL Programme and the New C Programme, marking matched and expected peaks.


The XRD Indexing programme examines the locations of peaks in an image in order to determine the kind of lattice structure the sample's constituent atoms are arranged in. Shown here are the results of an older indexing programme written in IDL, and the new indexing programme, written in C. The new indexing programme is proving to be more versatile and more reliable than the old programme, often indexing sets of data that the old programme failed with.


Grain Orientations

Indexing Process

High Performance Computing



Is this about making processors faster?


“Moore’s Law” has limited us



There are also other fundamental limits



We need to look at parallel computers



What is High Performance Computing?


Special purpose machines, configured to solve
complex problems


Usually multi-processor (tens to thousands)


Requires parallel programming


Models


Grid: multiple machines, inter-connected, solving the same problem


Supercomputer: multi-processor with shared memory

Limitation of Parallel Programming

(Amdahl’s Law and Gustafson’s Law)


The degree to which a problem can be expressed
using a parallel algorithm will limit the speedup
achieved on a multi
-
processor machine.














Amdahl’s Law

P = % Parallelism

S = Speedup (x sequential)

N = number of processors
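
In its standard form, with P written as a fraction between 0 and 1, Amdahl's Law gives S = 1 / ((1 - P) + P/N). Gustafson's Law, for a problem that grows with the machine, gives S = (1 - P) + P·N, where P is the parallel fraction measured on the parallel run. The short sketch below evaluates both; the function names and the sample value P = 0.95 are chosen here for illustration.

```python
def amdahl_speedup(p, n):
    """Speedup for parallel fraction p (0..1) on n processors, fixed problem size."""
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p, n):
    """Scaled speedup when the parallel part grows with the number of processors."""
    return (1.0 - p) + p * n


# Compare the two laws for a 95% parallel workload.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2), round(gustafson_speedup(0.95, n), 2))
```

Under Amdahl's Law the speedup can never exceed 1 / (1 - P), so a 95% parallel program tops out below 20x however many processors are added.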

Examples …. LHC


LHC at CERN is an example of a grid application where no one country has sufficient processing capabilities


15 million gigabytes of data per year


In 2006 LHC Tier 1 Grid was tested


TRIUMF is the Canadian Tier 1 Centre for LHC Experiments










Courtesy: TRIUMF

How about in the synchrotron Community?


Many synchrotrons understand the need for HPC



Some CLS users make use of WestGrid for computation


The new WestGrid data storage facility is intended to support CLS experiments and is located on campus


UWO/ORNL/APS/CLS are working on a joint crystallography application on SharcNet using the Cell environment




Diamond - Racks layout



Courtesy: Nick Rees Diamond Oct/08




Diamond - Current situation

Water pipes

Cable Tray



Courtesy: Nick Rees Diamond Oct/08

How do I get access to a HPC Machine?


Compute Canada


Responsible for High Performance Computing in Canada


Each regional grid is a member of Compute Canada


ACEnet - Atlantic Canada


CLUMEQ - Quebec


SCINET - UofT


HPCVL - Queen's, Royal Military College, St. Lawrence, Carleton, Ottawa, …


RQCHP - Quebec


SHARCNET - Ontario


WESTGRID - Western Canada


Grid Data Storage?


UofS is the host for the new WestGrid data storage facility


Cost: $3.2 M


Includes on-line and archival storage


Two sites on campus



Photo: tape backup unit holding 6,000 tapes (1 TB each)

IBM Cell Processor (3.2 GHz)



ANISE



ANISE: Active Network for Information from
Synchrotron Experiments


“Active” means near-instantaneous stream processing of complex data during transfer to the user or to storage.


Cell processing using InfoSphere Streams software from IBM and a lightpath provided by the CANARIE network.


Distributed processing on facilities provided by SHARCNET and WESTGRID.


Objective: Develop such a network to provide processed results from experiments such as Laue diffraction at APS (34-ID) and VESPERS at CLS




The network would assist the integration of diffraction data from
multiple and large area detectors.




The network would facilitate faster resolution of research problems
and free up time for more users.




The network would encourage common data formats and protocols, leading to closer collaboration.


ANISE: Active Network for Information from
Synchrotron Experiments


Some project outcomes:


1. Accessibility of Laue diffraction methods to a greater number and variety of users could be achieved by reducing the time required to accumulate meaningful data.


2. The results of complex diffraction measurements involving a wider segment of angles could be assessed rapidly.


3. Data and experiment management processes of Science Studio could enable very brief follow-up experiments to answer crucial questions sometime later.


4. Distant collaborators could participate in, and learn from, experiments on samples of critical importance to a project.


5. User support software could mean more rapid publication.


6. Expansion to include APS and NSLS beamlines.






X-Ray Fluorescence (XRF): Reveals Elemental Composition


Characteristic Element Lines Selected and Mapped Over a 2D Scan Area


Figure: element map panels labelled S, Cr, “& Cr”, Fe, “& Fe”, Ni and “& Ni”.



X-Ray Diffraction (XRD): Reveals Structural Information


Peak Fitting and Indexing of Image Set to Create a Grain Orientation Map


The XRD Indexing programme examines the locations of peaks in an image in order to determine the kind of lattice structure the sample's constituent atoms are arranged in. Shown here are the results of an older indexing programme written in IDL, and the new indexing programme, written in C. The new indexing programme is proving to be more versatile and more reliable than the old programme, often indexing sets of data that the old programme failed with.


Peak Search


Indexing Process


Grain Orientations


Apply to Entire Data Set


2D Maps Generated for Selected Elemental Lines


VESPERS Beamline Experimental Setup


Sample


Beam


XRD Area Detector


XRF Output


XRD Output