Accessing Biodiversity Resources in Computational Environments from Workflow Applications

12 Nov. 2013

Accessing Biodiversity Resources in
Computational Environments from Workflow Applications

J. S. Pahwa, R. J. White, A. C. Jones, M. Burgess, W. A. Gray,
N. J. Fiddian, T. Sutton, P. Brewer, C. Yesson, N. Caithness,
Culham, F. A. Bisby, M. Scoble, P. Williams and S. Bhagwat

WORKS 2006, Paris


The Biodiversity World (BDW) Project

The three exemplars chosen for BDW

BDW Architectural Components

Resource Wrappers

GRID Interface (BGI) Communications Layer

BDW Datatypes

The Metadata Repository (MTR)

Using BDW for bioclimatic modelling

Access to computational resources in BDW environment

Further Work & Conclusions

The BDW System

A framework for biodiversity problem solving

Provides access to widely dispersed, disparate
data sources and analytical tools

Intended particularly for analysis and modelling of
biodiversity patterns

Provides access to resources originally
designed for use in isolation

Resources may be composed into complex workflows

BDW Exemplars

Biodiversity richness analysis and
conservation evaluation

Bioclimatic modelling and global climate change

Phylogenetic analysis and biogeography

Biodiversity Richness Analysis and
Conservation Evaluation


Analysis of biodiversity richness patterns for a
particular taxon (e.g. a group of species) around the
world
The BDW System enables:

Taxonomic verification using the Species 2000
Catalogue of Life service

Composition of distribution datasets for the chosen
taxon from various sources around the world

Use of the WorldMap System to

visualise the distribution datasets, and

help identify priority areas for biodiversity conservation

Bioclimatic Modelling and Global Climate Change


Understand impact of global climate change on
distribution and diversity of plant & animal species

Identify climatic & ecological conditions under which
a single species lives, extrapolating from known
localities

Calculate a potentially wider set of areas
where the species might occur, or predict future
distribution under anticipated climatic conditions

A bioclimatic modelling workflow example follows

Phylogenetic Analysis and Biogeography


Discover ancestral relationships between groups of
organisms using methods of
phylogenetic analysis

Estimate ages of species

Use estimates of historical climate to produce
plausible estimates of geographical distributions

Assess historical relationships between changing
climate and development of new species

The BDW System provides (1):

A flexible and extensible problem solving
environment (PSE)

Means of

bringing together heterogeneous, globally distributed,
related resources & analytical tools

assembling resources into workflows to perform complex
scientific analyses

Consistent mechanisms to achieve interoperability
of system components
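The idea of assembling resources into a workflow can be illustrated with a minimal Python sketch. It is not Triana and not the BDW code: the function run_workflow and the three placeholder steps are hypothetical, and in a real workflow each step would call a wrapped remote resource. The sketch only shows the output of one unit feeding the next.

```python
def run_workflow(steps, initial):
    """Feed the output of each step into the next, recording every result."""
    value, trace = initial, []
    for name, func in steps:
        value = func(value)
        trace.append((name, value))
    return value, trace


# Placeholder steps (hypothetical); real units would call wrapped resources.
steps = [
    ("verify_name", lambda name: name.strip().capitalize()),
    ("fetch_localities", lambda name: {"species": name, "points": [(45.1, 7.7)]}),
    ("count_points", lambda data: len(data["points"])),
]
final, trace = run_workflow(steps, "  trifolium patens ")
```

Keeping the trace of intermediate results makes it possible to inspect what each unit produced when a later step fails.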

The BDW System provides (2):

Uniform interfaces for heterogeneous
resources (resource wrappers)

Mechanism for data packaging & transfer

Compatibility with the Triana Workflow
System for assembling and executing workflows

Web Services-based Grid middleware for
accessing remote computational resources

The BDW System Architecture

BDW architectural components (1)

Resource Wrappers

Provide consistent interface to local & remote resources, and
standard resource access/invocation mechanism

Insulate the core BDWorld System from resource

Wrap various kinds of resources and analytical tools, and can
be deployed in a Grid/Web Services environment

Give consistent form to data retrieved by encapsulating them
into BDWorld data types

Resources wrapped include AVH, GBIF, OpenModeller, etc.
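A minimal Python sketch of the wrapper idea follows. It is not the BDW implementation: every name in it (ResourceWrapper, invoke, InMemoryOccurrenceWrapper, the "OccurrenceSet" label and the sample records) is hypothetical. It shows how a common invocation interface can hide resource-specific details and return data in one consistent shape.

```python
from abc import ABC, abstractmethod


class ResourceWrapper(ABC):
    """Hypothetical common interface that every wrapped resource exposes."""

    @abstractmethod
    def operations(self) -> list[str]:
        """Names of the operations this resource supports."""

    @abstractmethod
    def invoke(self, operation: str, **params) -> dict:
        """Run an operation and return the result in a uniform envelope."""


class InMemoryOccurrenceWrapper(ResourceWrapper):
    """Stand-in for a wrapped occurrence database (illustrative data only)."""

    def __init__(self, records):
        self._records = records

    def operations(self):
        return ["getOccurrences"]

    def invoke(self, operation, **params):
        if operation not in self.operations():
            raise ValueError(f"unsupported operation: {operation}")
        species = params["species"]
        points = [r for r in self._records if r["species"] == species]
        # Uniform envelope: callers never see the resource's native format.
        return {"datatype": "OccurrenceSet", "items": points}


wrapper = InMemoryOccurrenceWrapper([
    {"species": "Trifolium patens", "lat": 45.1, "lon": 7.7},
    {"species": "Trifolium repens", "lat": 51.5, "lon": -0.1},
])
result = wrapper.invoke("getOccurrences", species="Trifolium patens")
```

Because callers depend only on operations() and invoke(), a wrapper for a different resource can be swapped in without changing the calling code.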

Resource Wrapper Architecture

BDW architectural components (2)

GRID Interface (BGI) Layer

Provides standard mechanisms for invoking operations on
heterogeneous resources

Acts as an integrated mechanism for accessing all resources

Isolates resource wrapper implementation to a separate layer
to enable the use of web services/grid technologies

BDW architectural components (3)

BDW Datatypes

Encapsulate different types of data and sub-datatypes for
transporting data between endpoints

Can be transformed into XML representations which can be
easily serialised

Flexible enough to encapsulate user-defined XML documents
or data in a string representation

Extensible; new datatypes can be incorporated
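The XML round trip can be sketched in Python with the standard library. The classes Occurrence and OccurrenceSet and the element and attribute names are assumptions made for illustration; the slides do not give the actual BDW datatype schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    lat: float
    lon: float


@dataclass
class OccurrenceSet:
    """Hypothetical datatype: a species name plus a list of sub-datatypes."""
    species: str
    points: list = field(default_factory=list)

    def to_xml(self) -> str:
        # Serialise the datatype and its sub-datatypes to an XML string.
        root = ET.Element("OccurrenceSet", species=self.species)
        for p in self.points:
            ET.SubElement(root, "Occurrence", lat=str(p.lat), lon=str(p.lon))
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "OccurrenceSet":
        # Rebuild the datatype at the receiving end point.
        root = ET.fromstring(text)
        points = [Occurrence(float(e.get("lat")), float(e.get("lon")))
                  for e in root.findall("Occurrence")]
        return cls(root.get("species"), points)


original = OccurrenceSet("Trifolium patens", [Occurrence(45.1, 7.7)])
xml_text = original.to_xml()
restored = OccurrenceSet.from_xml(xml_text)
```

A new datatype would be added by defining another class with the same to_xml and from_xml pair.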

BDW Datatypes

BDW architectural components (4)

BDW Metadata Repository

A specialised BDWorld resource

Provides information such as:

Available resources

Operations supported by each resource

Data types used by operations

Location of resource wrapper

Stores semantic information in the BDWorld ontology,
to answer questions such as

‘Which resources can provide me with species data?’

‘Which available operations can accept the outputs from a
specific operation?’
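The two example questions can be sketched as lookups over a small registry. This is an illustration only: the slides say the semantic information is held in the BDWorld ontology, whereas the sketch substitutes a plain Python dictionary, and all resource, operation and datatype names in it are hypothetical.

```python
# Hypothetical registry: resource -> operation -> (input type, output type)
REGISTRY = {
    "SpeciesNameService": {"checkName": ("NameString", "SpeciesName")},
    "OccurrenceService": {"getOccurrences": ("SpeciesName", "OccurrenceSet")},
    "ModellingService": {"buildModel": ("OccurrenceSet", "ClimateModel")},
}


def resources_providing(datatype):
    """Resources with at least one operation whose output is `datatype`."""
    return sorted(
        name for name, ops in REGISTRY.items()
        if any(out == datatype for _, out in ops.values())
    )


def operations_accepting_output_of(resource, operation):
    """(resource, operation) pairs whose input type matches the given output."""
    _, produced = REGISTRY[resource][operation]
    return sorted(
        (name, op)
        for name, ops in REGISTRY.items()
        for op, (consumed, _) in ops.items()
        if consumed == produced
    )


providers = resources_providing("OccurrenceSet")
followers = operations_accepting_output_of("OccurrenceService", "getOccurrences")
```

The second query is the one a workflow editor would need in order to suggest which units can legally be connected to the output of another.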

Bioclimatic Modelling (1)

Using the known localities of a species, a
climate preference profile is produced by
referencing them against present-day climate

This climate preference profile is then used to
locate other areas where such a climate
exists, indicating areas climatically suitable for
the species
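The two steps above can be sketched with a simple rectangular climate envelope: record the minimum and maximum of each climate variable at the known localities, then keep every grid cell that falls inside those ranges. This is a deliberately simplified stand-in, not the GARP algorithm named elsewhere in these slides, and the grid cells and climate values are made up.

```python
def climate_profile(localities, climate):
    """Min/max of each climate variable over the cells where the species is known."""
    variables = next(iter(climate.values())).keys()
    return {
        v: (min(climate[c][v] for c in localities),
            max(climate[c][v] for c in localities))
        for v in variables
    }


def suitable_cells(profile, climate):
    """Cells whose climate falls inside the profile for every variable."""
    return sorted(
        cell for cell, values in climate.items()
        if all(lo <= values[v] <= hi for v, (lo, hi) in profile.items())
    )


# Illustrative grid: cell id -> climate variables (made-up numbers).
climate = {
    "A": {"temp_c": 12.0, "rain_mm": 700},
    "B": {"temp_c": 14.0, "rain_mm": 900},
    "C": {"temp_c": 13.0, "rain_mm": 800},
    "D": {"temp_c": 25.0, "rain_mm": 200},
}
known = ["A", "B"]
profile = climate_profile(known, climate)
predicted = suitable_cells(profile, climate)
```

Passing a different climate grid to suitable_cells while keeping the same profile gives the projection onto another climate, for example a predicted future one.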

Bioclimatic Modelling (2)

Using present-day climate:

assess areas under threat from invasive species, or

those that may benefit from the introduction of a
new crop

Using climate predictions for the future:

assess possible effects of global climate change
on the distribution of study species

Using climate predictions for the past:

assess changes caused by natural factors in the

Bioclimatic modelling workflow run with the
Triana workflow package in the BDW system

Example model output for the clover species Trifolium patens Schreber (a member of the bean
family). The map shows areas (shaded regions across Central and Eastern Europe, South America,
Asia and Australia) predicted to be suitable for the species in the 2050s, using the bioclimatic
modelling algorithm GARP and the Hadley Centre climate model under the SRES A1F climate
scenario
The Current BDW Architecture:

Enables execution of BDW workflow tasks on
remote nodes, but with limited scope.

Does not give the user sufficient control and
flexibility.

Does not support distributing user jobs across
several nodes.

Depends on libraries on the client side.

The new BDW System architecture (1):

Provides the user with access to:

Biodiversity resources.

Computational resources.

Uses the existing mechanism of invoking
operations on remote resources via resource
wrapper web services.

Also uses Condor middleware for utilising
computational resources and distributing
workload across available nodes.

The new BDW System architecture (2):

Provides access to the Condor pool via a web
service interface.

Gives the user the flexibility to choose an available
computational node using the Ganglia cluster
monitoring toolkit.

Enables matching of workflow tasks with preferred computational nodes
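One way such node selection and task matching could work is sketched below in Python. The slides do not describe how BDW reads Ganglia data or talks to Condor, so everything here is an assumption: the load figures are made up, the host names and file names are hypothetical, and the helper functions are illustrative. The only real syntax used is that of a Condor submit description, where a requirements expression on Machine pins a job to a named host.

```python
def pick_node(load_by_host):
    """Choose the host with the lowest reported load (ties broken by name)."""
    return min(sorted(load_by_host), key=lambda h: load_by_host[h])


def condor_submit_description(executable, arguments, host):
    """Build a Condor submit description that pins the job to one machine."""
    return "\n".join([
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        f'requirements = (Machine == "{host}")',
        "output = task.out",
        "error = task.err",
        "log = task.log",
        "queue",
    ])


# Made-up load figures standing in for values a monitoring tool would report.
loads = {
    "node1.example.org": 0.9,
    "node2.example.org": 0.2,
    "node3.example.org": 0.5,
}
chosen = pick_node(loads)
submit_text = condor_submit_description("run_model.sh", "input.xml", chosen)
```

Leaving out the requirements line would let the Condor matchmaker choose any suitable machine, which corresponds to distributing the workload across the available nodes rather than selecting one by hand.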

The new BDW System architecture (2):

Conclusions and Further Work

BDW brings together varied, distributed resources and
analytical tools for biodiversity researchers to analyse
biodiversity patterns

Disparate resources can be accessed in the
Web-enabled BDW PSE.

The BDW PSE has uniform access to heterogeneous resources

BDW allows linking of tools and resources in a workflow to
automate different activities of an experiment

Three current exemplar study areas

The new BDW architecture also provides access to
computational resources.




BDW team

Species 2000

OpenModeller Community (including CRIA)