How You Can Benefit from the Bioinformatics Resource


2 Oct 2013


How can you benefit from the Bioinformatics Resource?


Can (John) Bruce, Ph.D.

Associate Director

Bioinformatics Resource

Keck Biotechnology Laboratory

The Bioinformatics Core


Created within the Keck Lab at the request of the Yale School of Medicine, July 2007.


Director: Hongyu Zhao, Ph.D.; Associate Directors: Can Bruce, Ph.D., and Yong Kong, Ph.D.


The facility is located at Sterling Hall of Medicine.


Commercial software packages provided free by the
Core are available to Yale researchers 24/7.


Services


Access to a large number of widely used commercial and open-source bioinformatics programs.

Fee-based consultation services for well-defined bioinformatics analyses.

Collaborative projects requiring a longer-term commitment of time and effort.

Available programs


DNA/protein sequence analysis: Lasergene and Gene Construction Kit.

Pathway analysis: Ingenuity Pathway Analysis and MetaCore.

Protein structure modeling: Sybyl, a protein structure modeling and visualization program.

Mass spectrometry data analysis: GPMAW.

Pipelining programs: Pipeline Pilot and VIBE.

Examples of Current Collaborations


Pathway analysis on proteomics data (Yale/NIDA
Proteomics Center Project and Yale/NHLBI
Proteomics Center Project investigators)


Development of an algorithm for identification of phosphorylation sites from tandem mass spectrometry data (E. Gulcicek in Keck Proteomics)

Molecular modeling of MAP kinase-ligand interactions (B. Turk in Pharmacology)

Sequence analysis for defining an invention claim for the Office of Collaborative Research



Microarray analysis software


GeneSpring GX provides visualization and advanced statistical analysis for gene expression data.

Partek Genomics Suite provides advanced statistics and interactive data visualization designed for gene expression analysis, exon expression analysis, promoter tiling array analysis, chromosomal copy number analysis, and SNP analysis.


Sequence Analysis Software


DNASTAR Lasergene, a comprehensive suite of programs for analysis of DNA/RNA/protein sequences, including sequence editing, sequence assembly, sequence alignment, primer design, protein structure prediction, and gene detection and annotation.

Gene Construction Kit 2.5, a tool for designing, drawing, and annotating DNA sequences, especially plasmid constructs.


PIPELINING PROGRAMS

This pipeline from Pipeline Pilot takes a Swiss-Prot sequence from a Web portal, then generates a results page with four tabs giving summary data, a sequence features map, chemical structures of substrates, and BLAST results.

PATHWAY ANALYSIS


MetaCore (from GeneGo)

Ingenuity Pathways Analysis 3.1 (from Ingenuity Systems)

Both are integrated software suites for functional analysis.

Each is based on a proprietary, manually curated database of human protein-protein, protein-DNA and protein-compound interactions, metabolic and signaling pathways, and the effects of bioactive molecules.

MetaCore can be integrated with other software packages such as GeneSpring, Resolver, Expressionist, Pipeline Pilot, EndNote and Cytoscape.

Ingenuity can be integrated with GeneSpring, Partek Genomics Suite, SAS JMP Genomics and Spotfire.


Why Pathway Analysis?

Pathway Creation Algorithms in MetaCore (1)

Direct Interactions Algorithm

Draws direct interactions between selected objects. No additional objects are added to the network.

Self-regulatory Networks

Finds the shortest directed paths containing transcription factors between your genes in the gene list.

(better used for a small number of targets)
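The idea of a shortest directed path that contains a transcription factor can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MetaCore's implementation: the node names, the edge list and the helper functions `build_successors`, `bfs_path` and `shortest_tf_path` are all hypothetical, and the sketch simply joins two breadth-first searches (start to TF, then TF to goal) and keeps the shortest result.

```python
from collections import deque

# Hypothetical toy network: directed edges between placeholder objects.
EDGES = [("G1", "R1"), ("R1", "TF1"), ("TF1", "TF2"), ("G1", "TF2"), ("TF2", "G2")]
# Hypothetical set of nodes annotated as transcription factors.
TRANSCRIPTION_FACTORS = {"TF1", "TF2"}

def build_successors(edges):
    """Map each node to the list of nodes it points to."""
    successors = {}
    for source, target in edges:
        successors.setdefault(source, []).append(target)
    return successors

def bfs_path(successors, start, goal):
    """Return one shortest directed path from start to goal, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in successors.get(node, ()):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

def shortest_tf_path(successors, start, goal, tfs):
    """Shortest start->TF->goal route over all TFs (may revisit a node)."""
    best = None
    for tf in sorted(tfs):
        first = bfs_path(successors, start, tf)
        second = bfs_path(successors, tf, goal)
        if first is None or second is None:
            continue
        candidate = first + second[1:]
        if best is None or len(candidate) < len(best):
            best = candidate
    return best

successors = build_successors(EDGES)
print(shortest_tf_path(successors, "G1", "G2", TRANSCRIPTION_FACTORS))
```

Running every ordered pair of listed genes through such a search grows quickly with the size of the gene list, which fits the note that this option is better used for a small number of targets.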

Expand by one

(not suitable for large collections of targets)

Auto expand

Draws sub-networks around the selected objects, stopping the expansion when the sub-networks intersect.
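A rough Python sketch may help to contrast Direct Interactions, Expand by one and Auto expand. It is an illustration on a made-up network, not the MetaCore code: the edge list and all function names are hypothetical, "Expand by one" is assumed here to mean adding every immediate neighbor of the selected objects, and "intersect" is assumed to mean that two sub-networks share at least one node.

```python
from collections import defaultdict

# Hypothetical toy interaction network (placeholder object names).
EDGES = [
    ("L1", "R1"), ("R1", "TF1"), ("TF1", "T1"),
    ("L2", "K1"), ("K1", "T1"), ("L1", "TF1"),
]

def build_neighbors(edges):
    """Map each node to the set of nodes it touches, ignoring direction."""
    neighbors = defaultdict(set)
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    return neighbors

def direct_interactions(edges, selected):
    """Keep only edges whose two endpoints are both selected objects."""
    selected = set(selected)
    return [(s, t) for s, t in edges if s in selected and t in selected]

def expand_by_one(neighbors, selected):
    """Assumed behavior: add every immediate neighbor of the selection."""
    expanded = set(selected)
    for node in selected:
        expanded |= neighbors[node]
    return expanded

def auto_expand(neighbors, seeds, max_steps=10):
    """Grow one sub-network per seed; stop once any two share a node."""
    subnets = {seed: {seed} for seed in seeds}
    for _ in range(max_steps):
        seen = set()
        for members in subnets.values():
            if seen & members:
                return subnets
            seen |= members
        subnets = {seed: expand_by_one(neighbors, members)
                   for seed, members in subnets.items()}
    return subnets

neighbors = build_neighbors(EDGES)
print(direct_interactions(EDGES, ["L1", "TF1", "T1"]))
print(auto_expand(neighbors, ["L1", "L2"]))
```

In this toy case the two sub-networks grown from L1 and L2 meet at T1 after two rounds of expansion. Because each round of `expand_by_one` can add many nodes, an unbounded expansion around a long gene list becomes unreadable quickly, which is one plausible reading of the "not suitable for large collections of targets" remark.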

Pathway Creation Algorithms in MetaCore (2)

Analyze Network: Creates a list of possible networks, ranked according to how many objects in the network correspond to the user's list of genes, how many nodes are in the network, and how many nodes are in each smaller network.

Analyze Transcription Network: Similar to the above, but the sub-networks created are centered on transcription factors (TFs).

Analyze Networks (Transcription Factors): Focuses on the presence of TFs at end nodes.

Analyze Networks (Receptors): Focuses on the presence of receptors at the end point of a network.

Analyze Network Algorithm

Example: a proteomics experiment (effect of drug infusion on plasma proteins), P < 1e-18.

Generates sub-networks highly saturated with selected objects. Sub-networks are ranked by a P-value and G-Score and interpreted in terms of Gene Ontology.
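The slides do not give the formulas behind the P-value or the G-Score. As a hedged illustration of how a sub-network could be ranked by how saturated it is with the user's objects, the sketch below uses a hypergeometric over-representation test, a common choice for this kind of question. The test itself, the function names and the toy data are assumptions, and the G-Score is not modeled.

```python
from math import comb

def hypergeom_pvalue(population, tagged, sample, hits):
    """P(X >= hits) when `sample` nodes are drawn from `population` nodes,
    of which `tagged` belong to the user's list."""
    upper = min(tagged, sample)
    total = comb(population, sample)
    favorable = sum(
        comb(tagged, k) * comb(population - tagged, sample - k)
        for k in range(hits, upper + 1)
    )
    return favorable / total

def rank_subnetworks(subnetworks, user_genes, population):
    """Sort sub-networks by ascending p-value (most saturated first)."""
    user_genes = set(user_genes)
    scored = []
    for name, nodes in subnetworks.items():
        hits = len(user_genes & set(nodes))
        p = hypergeom_pvalue(population, len(user_genes), len(nodes), hits)
        scored.append((p, name))
    return sorted(scored)

# Hypothetical toy data: 10 objects in the database, 3 in the user's list.
subnetworks = {
    "A": ["g1", "g2", "g3", "x1"],
    "B": ["g1", "x1", "x2", "x3"],
}
for p, name in rank_subnetworks(subnetworks, ["g1", "g2", "g3"], population=10):
    print(name, round(p, 4))
```

A sub-network that contains all three listed objects among its four nodes gets a much smaller p-value than one containing a single listed object, so it is ranked first.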

Analyze Networks (Transcription Factors) Algorithm - an example

Favors network construction where the end-nodes of transcriptionally regulated pathways are present in the original gene list.

Example from an mRNA expression analysis data set comparing healthy and lesional skin (P = 7.2e-46).

Analyze Networks (Receptors) Algorithm - an example

Favors network construction where the end-point of a pathway leads to a receptor (through “receptor binding”) and the starting point of a pathway (a transcription factor, a ligand, etc.) is present in the original gene list, regardless of the presence of the end-point receptor in the list.

Transcription Regulation Algorithm

Example network: 13 targets/14 nodes, P = 7.3e-31.

Generates sub-networks centered on transcription factors. Sub-networks are ranked by a P-value and interpreted in terms of Gene Ontology.

Immune response: Histamine H1 receptor signaling in immune response (p = 1e-4)

GeneGo process networks

WNT signaling (p = 1e-5)

Disease biomarker enrichment

Network-disease associations

1) Carcinoma (72% coverage, p = 3.3e-10)

2) Neoplasms, connective and soft tissue (42% coverage, p = 8e-10)

Use of Pathway Analysis in Candidate Gene Identification

1061 genes are located in the mapped region for the disease.

360 genes are up- or down-regulated by >2x.

17 receptor ligand genes are important “input” nodes to pathways formed by genes with changed expression: FGF2, WNT5A, Tenascin-C, EGF, IL1RN, BDNF, TGF-beta2, FGF2, OSF-2, CSPG4 (NG2), IL-8, ENA-78, GCP2, SLIT2, SLIT3, Activin beta A, Annexin I.

Other up- or down-regulated genes

Pathway analysis narrows down the number of candidate genes for the disease

Diagram nodes: ErbB2, PECAM1, DDX5, BCAS3, microRNA1, RARalpha, MUL, VHR, WIP, NIK, Plakoglobin, HEXIM1, Prohibitin, STAT5A, STAT3, Clathrin, PSME3, PSMC5.

Receptor ligands shown: FGF2, IL1RN.

Diagram labels: 360 genes up- or down-regulated by >2x; other up- or down-regulated genes.

These genes, from the mapped region of interest, are able to form interaction pathways going through the receptor ligands identified by the first analysis.
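The two-step narrowing described above can be sketched as follows. This is a simplified illustration, not the analysis actually performed: all gene names and values are placeholders, the function names are hypothetical, expression changes are assumed to be given as ratios, and "forming a pathway through a ligand" is reduced to having a direct interaction with one.

```python
def changed_genes(fold_changes, threshold=2.0):
    """Genes whose expression ratio is above `threshold` or below 1/threshold."""
    return {
        gene for gene, ratio in fold_changes.items()
        if ratio > threshold or ratio < 1.0 / threshold
    }

def candidates_through_ligands(edges, region_genes, ligands):
    """Region genes that interact directly with at least one ligand gene."""
    region_genes, ligands = set(region_genes), set(ligands)
    found = set()
    for a, b in edges:
        if a in region_genes and b in ligands:
            found.add(a)
        if b in region_genes and a in ligands:
            found.add(b)
    return found

# Hypothetical placeholder data.
fold_changes = {"LIG1": 3.5, "LIG2": 0.3, "GENE_X": 1.4, "GENE_Y": 2.0}
known_ligands = {"LIG1", "LIG2", "LIG3"}
region_genes = {"REG1", "REG2", "REG3", "REG4"}
edges = [("REG1", "LIG1"), ("REG2", "GENE_X"), ("LIG2", "REG3"), ("REG4", "REG1")]

# Step 1: ligand genes among the genes with changed expression.
input_ligands = changed_genes(fold_changes) & known_ligands
# Step 2: mapped-region genes connected to those ligands.
print(sorted(candidates_through_ligands(edges, region_genes, input_ligands)))
```

In this toy example four mapped-region genes are reduced to two candidates. A fuller version would follow multi-step paths in the interaction database rather than direct interactions only.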

A caveat

Not every gene belongs to a pathway in the database…

Why Pathway Analysis Software?


A learning tool


Study a group of gene products.


A data analysis tool.


Which pathways are particularly affected?


What disease has similar biomarkers?


A hypothesis generation tool


Can provide insight into mechanism of regulation of your genes.
Which is the likely causative agent for the observed changes?
What is likely to happen as a result of these changes?


Suggest effects of gene knock-ins or knock-outs.

Suggest side-effects of drugs.

Can highlight new phenomena that need further investigation. What does the program not explain?

Thank you.