Dedication_BCC_posterx - MIT


4 Oct 2013


The Bioinformatics and Computing Core Facility 76-189

Sebastian Hoersch, AJ Bhutkar, Paola Favaretto, and Charlie Whittaker

Personnel

Charlie Whittaker (charliew@mit.edu)

Sebastian Hoersch (hoersch@mit.edu)

Arjun Bhutkar (arjun@mit.edu)

Paola Favaretto (paolaf@mit.edu)



Characterization of Complex Mixtures

Libraries of biologically active sequences are subjected to experimental conditions. NGS is then used to compare the composition of the mixtures derived from the different conditions.

Sequences that promote or inhibit tumor growth can be identified.
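
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the facility's pipeline: the sequence identifiers, the read counts, and the pseudocount are all hypothetical, and the score is a plain log2 ratio of relative abundances.

```python
import math

def log2_enrichment(control_counts, treated_counts, pseudocount=1.0):
    """Log2 ratio of each sequence's share of the mixture (treated vs. control).

    Counts are converted to fractions of the whole library so that libraries
    sequenced to different depths can be compared. The pseudocount (an
    assumption of this sketch) avoids division by zero for absent sequences.
    """
    ids = set(control_counts) | set(treated_counts)
    n = len(ids)
    total_c = sum(control_counts.values()) + pseudocount * n
    total_t = sum(treated_counts.values()) + pseudocount * n
    scores = {}
    for seq_id in ids:
        frac_c = (control_counts.get(seq_id, 0) + pseudocount) / total_c
        frac_t = (treated_counts.get(seq_id, 0) + pseudocount) / total_t
        scores[seq_id] = math.log2(frac_t / frac_c)
    return scores

# Hypothetical read counts for three library members under two conditions.
control = {"seqA": 100, "seqB": 100, "seqC": 100}
treated = {"seqA": 400, "seqB": 100, "seqC": 25}
scores = log2_enrichment(control, treated)
```

A positive score means the sequence makes up a larger share of the treated mixture than of the control, and a negative score means it was depleted.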

Comparison of the same sequence from different organisms facilitates the identification of features that are invariant over hundreds of millions of years of evolution.

Conservation indicates important biological function.
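
As a rough illustration of the idea, the sketch below finds the columns of a multiple alignment that are identical in every organism. It assumes the sequences are already aligned to equal length with "-" marking gaps; the organism names and sequences are invented for the example.

```python
def invariant_positions(aligned):
    """Return 0-based alignment columns where every sequence has the same base.

    `aligned` maps an organism name to its aligned sequence. Columns that
    contain a gap ("-") are never reported as invariant.
    """
    sequences = list(aligned.values())
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i, column in enumerate(zip(*sequences))
        if len(set(column)) == 1 and column[0] != "-"
    ]

# Hypothetical pre-aligned sequences from three organisms.
alignment = {
    "human": "ATGCCGTA",
    "mouse": "ATGACGTA",
    "fish":  "ATGTCG-A",
}
conserved = invariant_positions(alignment)  # [0, 1, 2, 4, 5, 7]
```

Real conservation analyses score columns against an evolutionary model rather than requiring exact identity; strict identity is used here only to keep the sketch short.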

Conservation

KI Computing Resources

KI Computing Resource Highlights



Storage - rowley.mit.edu

162 TB Isilon storage cluster (6 nodes)

Data Processing - rous.mit.edu

96-core Linux cluster (12 nodes)



Excellent networking

Dedicated local network (red arrows) and MIT-net (green arrows) at both 10 Gb/s (thick lines) and 1 Gb/s (thin lines).

Storage volumes are available for each lab (blue cylinders), core facility (orange cylinders) and headquarters (green cylinder).

All personnel can have their own storage space.

Flexible access and permission control based on group.

All instruments with networked computers can connect.

Storage is backed up to the MIT-TSM system.

Web-based data transfer and wiki pages are available.

All equipment is housed in 76-060, with space for expansion.



rous.mit.edu: named for Peyton Rous, who discovered tumor-inducing viruses

rowley.mit.edu: named for Janet Rowley, who established the link between chromosomal translocations and cancer


Bioinformatics and the Analysis of Biological Information

[Figure: massively parallel measurement by microarray technology and next-generation sequencing. Diagram labels: DNA, RNA, Protein; Replication, Transcription, Translation.]

Biological information passes from the DNA in genomes through RNA into proteins. Perturbations in this information or its flow can cause cancer and other diseases. Those perturbations and their effects can be studied using genomics technologies such as microarrays and next-generation sequencing (NGS). These tools produce huge amounts of data, and bioinformatics is used to relate those data to biological conditions.

The analysis and storage of data produced by the research techniques used in the Koch Institute require sophisticated computational resources. Many of the computing needs of the KI are met by a high-performance Isilon storage cluster (rowley.mit.edu) and a Linux compute cluster (rous.mit.edu).

Next generation sequencing is used to identify mutations or polymorphisms.



Sequences derived from experimental samples are compared to a reference sequence.



Positions that differ (arrowheads) may document alterations in biological function.
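
The comparison step can be illustrated with a deliberately simplified sketch. It assumes the sample sequence has already been aligned to the reference, without insertions or deletions, so that both strings have the same length; the two sequences are made up for the example.

```python
def find_differences(reference, sample):
    """List (position, reference_base, sample_base) for every mismatch.

    Positions are 1-based, the usual convention for genome coordinates.
    Both sequences must have the same length (no indels in this sketch).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, ref_base, smp_base)
        for i, (ref_base, smp_base) in enumerate(zip(reference, sample))
        if ref_base != smp_base
    ]

# Hypothetical reference and sample sequences.
reference_seq = "ACGTACGTAC"
sample_seq = "ACGTTCGTAA"
variants = find_differences(reference_seq, sample_seq)
# variants == [(5, "A", "T"), (10, "C", "A")]
```

In practice each position is covered by many reads, and a difference is only reported when enough of them support it; that evidence weighing is left out here.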



Identification of Mutations

Structural variations in tumors are characterized using both arrays (A) and sequencing (B).

Gains or losses of genetic material may lead to alterations in protein levels and subsequent acquisition of disease traits.
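
One simple way to look for gains and losses in sequencing data is to count reads in fixed-size windows along a chromosome and compare tumor to normal. The sketch below assumes that approach. The per-window counts, the total-count scaling, and the 0.5 threshold are illustrative assumptions, not the settings behind the figure.

```python
import math

def copy_number_calls(tumor_counts, normal_counts, threshold=0.5):
    """Label each window as 'gain', 'loss', or 'neutral'.

    Tumor counts are first scaled to the normal sample's total read count,
    then a log2 ratio is computed per window. A pseudocount of 1 keeps the
    ratio defined for empty windows.
    """
    if len(tumor_counts) != len(normal_counts):
        raise ValueError("both samples must cover the same windows")
    scale = sum(normal_counts) / sum(tumor_counts)
    calls = []
    for tumor, normal in zip(tumor_counts, normal_counts):
        log_ratio = math.log2((tumor * scale + 1) / (normal + 1))
        if log_ratio > threshold:
            calls.append("gain")
        elif log_ratio < -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Hypothetical read counts in six consecutive windows.
normal = [100, 100, 100, 100, 100, 100]
tumor = [100, 100, 200, 200, 100, 30]
calls = copy_number_calls(tumor, normal)
```

Smaller windows give finer resolution but noisier ratios, which is one reason to view the same region at more than one window size.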

Structural Alterations of Genomes

[Figure: panels A and B. Labels: chr 25, 100K windows; chr 25, 10K windows; Fold Change; Concentration.]

Analysis of Functional Annotations

Biobase TransPath

GeneGo MetaCore

Ingenuity Pathway Analysis

Proteins function in interconnected pathways to impart biological traits.



Pathway analysis is used to characterize the functional relationships between groups of genes identified by experimentation.
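
A common statistical core of such analyses is an over-representation test: given a list of genes from an experiment, how surprising is its overlap with a known pathway? The sketch below uses a hypergeometric test for that question. It is a generic illustration and is not claimed to be what any of the tools named above compute; all the numbers are hypothetical.

```python
from math import comb

def overrepresentation_pvalue(population, pathway_size, selected, overlap):
    """Hypergeometric probability of seeing at least `overlap` pathway genes.

    population   -- number of genes that could have been selected
    pathway_size -- number of those genes annotated to the pathway
    selected     -- number of genes identified by the experiment
    overlap      -- number of identified genes that are in the pathway
    """
    upper = min(pathway_size, selected)
    favorable = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(population, selected)

# Hypothetical case: 1000 genes measured, 50 in the pathway,
# 40 genes identified by the experiment, 8 of them in the pathway.
p_value = overrepresentation_pvalue(1000, 50, 40, 8)
```

With these numbers only about 2 pathway genes would be expected by chance, so an overlap of 8 yields a small p-value. When many pathways are tested, the p-values need a multiple-testing correction.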

Microarrays and sequencing can be used to study transcript diversity. Alternative splicing can result in many different versions of a protein being produced by the same gene.

Different proteins can have different functional characteristics.

The types of transcripts produced by a gene can be regulated according to disease state.

Analysis of Transcript Isoform Variation

Gene Expression Analysis

Gene expression analysis measures the amount of RNA present for each gene in a sample. It can be done using sequencing (RNAseq) and microarrays. RNA amounts are used as a proxy for protein levels.

The activity of biological processes in normal and disease traits can be characterized by studying RNA levels.
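
For RNAseq data, a first-pass comparison of two samples can be sketched as follows: raw read counts are rescaled to counts per million (CPM) to correct for sequencing depth, and a log2 ratio is then computed per gene. The gene names, counts, and the prior count of 1 are assumptions made for illustration. Real analyses also use replicates and a statistical model.

```python
import math

def counts_per_million(counts):
    """Rescale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: count * 1e6 / total for gene, count in counts.items()}

def log2_ratios(baseline, condition, prior=1.0):
    """Per-gene log2(condition / baseline) on depth-normalized counts.

    Both samples must contain the same genes. The small prior count keeps
    the ratio finite for genes with zero reads.
    """
    cpm_base = counts_per_million(baseline)
    cpm_cond = counts_per_million(condition)
    return {
        gene: math.log2((cpm_cond[gene] + prior) / (cpm_base[gene] + prior))
        for gene in cpm_base
    }

# Hypothetical raw counts; the tumor sample was sequenced twice as deeply.
normal_sample = {"geneX": 500, "geneY": 500, "geneZ": 1000}
tumor_sample = {"geneX": 2000, "geneY": 1000, "geneZ": 1000}
ratios = log2_ratios(normal_sample, tumor_sample)
```

Here geneY has twice as many raw reads in the tumor sample, yet its log ratio is zero, because the whole sample was sequenced twice as deeply. geneX comes out near +1 (doubled) and geneZ near -1 (halved).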

[Figure: RNAseq workflow. Labels: Differential gene expression (B statistic, Log ratio); Gene set enrichment analysis (Enrichment Score); Clustering Analysis; DATA.]