The Strategies Web Development Kit (WDK)

A New Approach to Searching Omics Databases

Steve Fischer

The EuPathDB Project

www.gusdb.org/wdk

The EuPathDB Genome Resources

21 Annotated Eukaryotic Pathogen Species (parasites)
6 sites by genus
A EuPathDB portal that aggregates all data
Sample research applications:
  Find malaria drug targets
  Find sleeping sickness vaccine targets
  Find Toxoplasma proteins that transit to the host

Examples of Omics Data We Integrate
(beyond standard genome annotation)

RNA Sequencing
Transcript Expression (multiple platforms)
Protein Expression
ChIP-Chip
SNPs
Isolates

Genomics data is highly dimensional

[Figure: an annotated gene linked to its many data dimensions. Labels shown: transcript expression, protein expression, GO annotation, E.C. annotation, pathways, genome location, cellular location, gene structure, gene qualities, protein structure, protein qualities, protein interactions, phenotype, SNPs, BLAST similarity, motif similarity, transcript assembly, library, ORF, SAGE tag, differential expression, sequence, expression level, allele frequency, isolate comparison, isolate assay, phyletic pattern, ortholog group.]

90 Searches on PlasmoDB

51 Gene Searches

39 Other Data Type Searches

~250 parameters total

Challenge: help users ask hard questions of the integrated data to find targeted sets of genes

Solution: The Strategies Search Interface
  Induces users to move away from one-click searching into seamless use of Boolean set operations.
  Makes “advanced” easy.
  Portable to any omics database
Live demo…
  Build the strategy shown above
  Find parasite kinases that are likely exposed to the host

More Complex Malaria Drug Targets Strategy

This strategy has been shared. The URL is in the Abstract.

Enzymes
  union of EC and GO term
Two expression experiments showing expression in trophozoite life stage
Mass spec trophozoite experiment
Phylogenetic profile indicating presence in parasite but not host (mammals)
Under purifying selection
  essential to the parasite
Ortholog transform
  transform P. falciparum set into all Plasmodium species
Save and share
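The ortholog transform step can be pictured as a lookup through ortholog groups: each gene in the input set is replaced by every gene that shares its group. This is a minimal sketch under that assumption, not the WDK implementation. The group table, the gene IDs, and the function name `ortholog_transform` are hypothetical.

```python
# Illustrative only: ortholog groups and gene IDs below are invented.

def ortholog_transform(genes, groups):
    """Expand a gene set to every gene that shares an ortholog group with it.

    `groups` maps an ortholog group ID to the set of member gene IDs
    (members may come from several species).
    """
    # Invert the table: gene ID -> group ID.
    group_of = {gene: gid for gid, members in groups.items() for gene in members}
    expanded = set()
    for gene in genes:
        gid = group_of.get(gene)
        if gid is not None:  # genes without a group are dropped (a choice made here)
            expanded |= groups[gid]
    return expanded

groups = {
    "OG1": {"pfal_1", "pviv_1", "pber_1"},
    "OG2": {"pfal_2", "pviv_2"},
    "OG3": {"pfal_3"},
}
falciparum_hits = {"pfal_1", "pfal_3", "pfal_9"}  # pfal_9 belongs to no group
all_species = ortholog_transform(falciparum_hits, groups)
print(sorted(all_species))
```

Unlike a Boolean step, a transform takes one input set and returns a different set, so it sits in the strategy as a step with a single parent.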

Strategies WDK Architecture

[Figure: architecture diagram laid out as Model, View (web), and Controller, with a legend distinguishing what you provide from what WDK provides. Components shown: Genomics Database, Genomics Data, User Login and Search History, WDK Engine Query Cache, WDK Model (XML), WDK Model (Java Objects), WDK Query Engine (Java), Web Services Framework, Processes (e.g., BLAST), JavaBeans (JSP compatible), JSP Tag Library, Struts controller, WDK Sanity Test, JSP and CSS.]

Runs on any relational database
Model-View-Controller design
Model
  Configured in XML
  Abstracts Records and Searches
  Specifies columns
  Arbitrary data sources (e.g., BLAST) via web services
View
  JSP and CSS
  (JavaScript and Ajax)
Controller
  Struts

User Driven Development

Computer-human interaction (CHI) studies
  During prototyping
  Video and audio capture of workshop participants doing exercises
  Drove the design, and showed high user enthusiasm.
User feedback has been very positive.
Usage statistics
  show 3-fold increase in use of Boolean operations (in comparable two-month periods)

Upcoming Features

Genes Basket (delivered 1/7/10)
  Cherry pick genes
  Generate reports from the basket
  Send to a postprocessing tool (e.g., MSA)
  Add as step in a strategy (e.g., subtract known genes)
Web services (delivered 1/7/10)
  Run searches via RESTful web services
Weighted Searches
  Assign weights to searches for increased filtering discrimination
  For example, weight EST data more heavily than SAGE data
Span logic
  Transform a set of one type into another type based on genome span relations
  For example, find all ESTs that overlap with the set of Genes I found.
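The last two features in this list can be sketched in a few lines each. The code below is only a guess at the general idea, not how WDK implements either feature: the function names, the weights, the threshold, and the coordinates are all invented. Weighted searches are modeled as summing a per-search weight for each gene, and span logic as a same-sequence interval overlap test.

```python
# Illustrative only: names, weights and coordinates are invented.
from collections import defaultdict

def weighted_union(searches):
    """Sum a per-search weight for every gene a search returns.

    `searches` is a list of (weight, gene_id_set) pairs.
    """
    scores = defaultdict(int)
    for weight, genes in searches:
        for gene in genes:
            scores[gene] += weight
    return dict(scores)

def overlapping(features, spans):
    """Return IDs of features whose span overlaps any span in `spans`.

    Features and spans are (id, sequence, start, end) tuples with
    inclusive coordinates; overlap requires the same sequence.
    """
    hits = set()
    for fid, fseq, fstart, fend in features:
        for _, sseq, sstart, send in spans:
            if fseq == sseq and fstart <= send and sstart <= fend:
                hits.add(fid)
                break
    return hits

# Weight EST evidence more heavily than SAGE evidence.
est_genes = {"gene_a", "gene_b"}
sage_genes = {"gene_b", "gene_c"}
scores = weighted_union([(10, est_genes), (1, sage_genes)])
strong = {g for g, s in scores.items() if s >= 10}

# Find ESTs that overlap a set of genes.
genes = [("gene_a", "chr1", 100, 500), ("gene_b", "chr2", 50, 90)]
ests = [("est_1", "chr1", 450, 700), ("est_2", "chr1", 600, 800),
        ("est_3", "chr2", 10, 50)]
est_hits = overlapping(ests, genes)
print(sorted(strong), sorted(est_hits))
```

The nested loop is quadratic and is kept only for readability. Against a real database, the overlap test would more plausibly be pushed into SQL or an interval index.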



Quick Tour of Template Site

Simple demo site
  You can install it and use it as a template
  Has a simple Model
  And a simple View
Show
  Record page
  Searches
  Report maker

Quick Look at WDK Model

Defined in XML
Records
  Like “objects,” but data only. No methods.
  Two data types
    Attributes
      E.g., Gene Name
      Or Gene Location
    Tables
      E.g., Gene Xrefs
      Or Gene GO Associations
  Data acquired from DB through SQL
Searches
  Return sets of records
  Columns are attributes
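To make the record, attribute, and table distinction concrete, here is a small sketch that uses Python's built-in `sqlite3` module in place of a real genomics database. It does not show the WDK model format, which is XML. The schema, the SQL, the sample rows, and the function names are assumptions chosen for illustration.

```python
# Illustrative only: schema, SQL and names are invented; the real WDK
# model is declared in XML and is not shown here.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE gene (id TEXT PRIMARY KEY, name TEXT, location TEXT);
    CREATE TABLE gene_go (gene_id TEXT, go_term TEXT);
    INSERT INTO gene VALUES ('g1', 'kinase A', 'chr1:100-500'),
                            ('g2', 'protease B', 'chr2:50-90');
    INSERT INTO gene_go VALUES ('g1', 'GO:0004672'), ('g1', 'GO:0005524'),
                               ('g2', 'GO:0008233');
""")

# Attributes: one value per record. Tables: many rows per record.
ATTRIBUTE_SQL = "SELECT name, location FROM gene WHERE id = ?"
TABLE_SQL = "SELECT go_term FROM gene_go WHERE gene_id = ? ORDER BY go_term"

def load_record(gene_id):
    """Build a data-only record (no methods) from the two queries."""
    name, location = db.execute(ATTRIBUTE_SQL, (gene_id,)).fetchone()
    go_terms = [row[0] for row in db.execute(TABLE_SQL, (gene_id,))]
    return {"id": gene_id,
            "attributes": {"name": name, "location": location},
            "tables": {"go_associations": go_terms}}

def search_by_go_term(go_term):
    """A search returns a set of records; its columns are attributes."""
    sql = ("SELECT g.id, g.name, g.location FROM gene g "
           "JOIN gene_go x ON x.gene_id = g.id WHERE x.go_term = ? "
           "ORDER BY g.id")
    return db.execute(sql, (go_term,)).fetchall()

record = load_record("g1")
rows = search_by_go_term("GO:0004672")
print(record["tables"]["go_associations"], rows)
```

Keeping the record as plain data, with SQL as the only way in, mirrors the slide's point that records are "like objects, but data only."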


Acknowledgements

The EuPathDB User Interface Team and Principal Investigators

Cristina Aurrecoechea
Mark Heiges
Cary Pennington
Eileen Kramer
Jessica Kissinger
Brian Brunk
Jerric Gao
Omar Harb
Charles Treatman
David Roos
Chris Stoeckert

EuPathDB is an NIAID Bioinformatics Resource Center

Supported by
  NIAID Contract No. HHSN266200400037C
  The Bill & Melinda Gates Foundation

Following slides are demo backup