Less is more

21 Oct 2013

Approaches to biologist-driven analysis and next-generation sequencing data

Paul Gordon

Genome Canada Bioinformatics Platform

University of Calgary


What am I doing here?



Next Generation Sequencing




Next Generation Web




Future challenges

Genome Canada Bioinformatics Platform

Better tech: less DNA, more sequence

44 μm

70nm

PhytoMetaSyn

Sprockets:

Hierarchical Gene Models from ESTs

Developed in collaboration with BASF Plant Sciences

Genozymes

Hydrocarbon Metagenomics

Exploring gene expression patterns

CAVEman


Java 3D-based, world-first complete 3D human body atlas (adult male)

2,335 organs, hierarchical organization following Terminologia Anatomica


Numerous applications involving mapping of genetic and disease data


More information: http://cave.ucalgary.ca/caveman

Patient MRI stack mapped onto atlas and registered by landmarks

Pharmacokinetics visualization (absorption-distribution-metabolism-excretion of Aspirin)

Basic Research


Archaeal UV-light response

Large-scale human genome organization



ING-protein interactions (cancer and ageing-related proteins)

Research Applications



Kidney transplants: improved rejection diagnostics in Edmonton

Mad cow disease/chronic wasting disease: live diagnostics

Desulf.: mechanisms of oil pipeline corrosion and its prevention

DNA Diagnostics Discovery for Mad Cow

Preclinical

Clinical

Preinoculation

Controls

Control

animal #6

Ball toy

Photo: S. Czub, CFIA Lethbridge

Next-gen

Motif finding (elk dataset)

61 blood samples

107 million base pairs

432 billion pairwise alignments (657431²)

1082019 25mers or smaller
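
The "432 billion" figure can be checked directly from the exponent on the slide. The sketch below assumes that 657431 is the number of sequences compared all-against-all; the slide gives only the number, so that reading, like the variable names, is an assumption. It also shows how many comparisons would remain if each unordered pair were aligned only once.

```python
# Assumption: 657431 is the count of sequences compared all-against-all.
n_sequences = 657_431

# Every sequence against every sequence, as written on the slide (n squared).
ordered_pairs = n_sequences ** 2

# If each unordered pair were aligned once and self-comparisons skipped.
unordered_pairs = n_sequences * (n_sequences - 1) // 2

print(f"all-against-all: {ordered_pairs:,}")   # 432,215,519,761
print(f"unique pairs:    {unordered_pairs:,}") # 216,107,431,165
```

Either way the count is in the hundreds of billions, which is the scale at which a hardware accelerator becomes attractive.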

Uninfected

152317

Infected

3 universal

Infected

132417

Thousands of animal coverage/timepoint combos (CPU intensive)

Decypher hardware accelerator


Motif Results


EVI1


PLZF

Retrovirus

PrPSc (+?)


PLZF-controlled genes

Infectious agent

Circulating Nucleic Acids

Endogenous Retrovirus?

Consistent with protein-only evidence…

Neurovirulent? (e.g. M.L. Labat 1999)

Possible mode of action?

Virus particles? ~25nm

PrP Amyloid fibres

Vacuole

Manuelidis et al., PNAS 2007

Protected promoters (Motifs A & B)

Feedback

PrP

Integration

Nucleoprotein complexes

Cell death

CNA Export

Carp et al., EMBO J. 2006

Leblanc et al., EMBO J. 2006

Stengel et al., Biochem. Biophys. Res. Commun. 2006

Lee et al., Biochem. Biophys. Res. Commun. 2006

Etc.

Activation

Better tech: less input, more results

Better tech: less DNA, more sequence

Generate

Manuscript

Now

Where are we at?

Bioinformatics

Web

Emerging Technologies

Life Sciences

Semantic Web

Source: Gartner Inc.

How software works…

Functions/Rules

Parameters/Input

Results/Output

(article, allele,…)

(Gene name, DNA sequence, QTL…)

The problem with the Web

Once you label me, you negate me.

Søren Kierkegaard


1998

Now

Bluejay

http://bluejay.ucalgary.ca


Comparative genomics

BioMoby linking

Waypoints

Gene expression integration

The task at hand
(biologist)

Sequencer Data File (Binary)

ACCGT…

Known Proteins

BLAST

Report (related proteins)

(computer scientist)

DNASequence

NCBI_gi

Sequence_Alignment

Audience

God

Amoeba

Self-perception of computer skills

The need for shoehorns


The current vision of the Semantic Web intends to create a new structure starting up with no reference to its vast, functioning, but more primitive predecessor … things just don’t happen like that

All the Web as Workflows

Seahawk

Proxied Web page

Drag ‘n’ drop

Seahawk prompting

What’s Ahead?

The more a man learns, the more he realizes how little he knows

Semantic Web

http://www.uniprot.org/tissues/229

http://purl.uniprot.org/po/0009009

Take home messages

As tech improves, we can ask better questions


We will need shoehorns to access existing resources for the foreseeable future