Bioinformatics Challenge Day


Peter Carr

2/2/2013

Bioinformatics Challenge Day

This work is sponsored by the Defense Threat Reduction Agency under Air Force Contract #FA8721-05-C-0002.

Opinions, interpretations, recommendations and conclusions are those of the authors and are not necessarily endorsed by the United States Government.


Bioinformatics Challenge Days

Sponsor: Defense Threat Reduction Agency (DTRA)
Organizer: MIT Lincoln Laboratory (MIT LL)

Approach: A one day hack-a-thon
  Innovate: tackle huge challenges in bioinformatics
  Educate: bring in specialists from diverse fields, participants in DoD bioinformatics interests
  Investigate: what this short format can accomplish
  Aggregate: bring people together

The Challenges:
  Can you determine the cause of an infection?
  Can you invent a new way to visualize complex bioinformatics data?
  Can you spot the signs of genetic engineering? Can you figure out what an engineered organism does?

The problem: drowning in complex data, very hard to make sense of it all

[Figure labels: DNA sequencing; MAGE engineering]


Cast of Characters

Darrell Ricke (MIT Lincoln Laboratory): Bioinformatics
Peter Carr (MIT Lincoln Laboratory): Synthetic Biology, Biochemistry
Anna Shcherbina (MIT Lincoln Laboratory): Bioengineering, Electrical Engineering
Nancy Burgess (Defense Threat Reduction Agency): Chemical and Biological Defense




Some Big Hammers

Sequencing
  Complete genome sequences
  Mixed populations
  Expression (RNA species)
  Interaction (ChIP-seq)
Mass spectrometry
  Protein/peptide fingerprinting
  Metabolites
  Interaction (cross-linking)
Other tools
  Microarrays
  High-throughput screening (e.g. fluorescence)



Now and Future

Data galore: Omics approaches are generating massive amounts of increasingly complex measurement data
How do we best make sense of this information?
Some fundamental development areas
  Processing
  Visualizing/analyzing
  Storing/accessing


The Challenges


1. Metagenomic Visual
   Developing visualization methods to facilitate analysis of metagenomic data with unknown numbers of genomes at varying concentrations

2. Genome Assembly for the Clinic
   Performing de novo assembly from clinical samples with an emphasis on pathogen identification

3. Genetic Engineering
   Identify and interpret the signatures of genetic engineering
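
One simple first look at a sample with an unknown number of genomes is to histogram per-read GC content, since genomes with different base composition tend to form separate peaks. This approach is an assumption made for illustration, not a method taken from the slides. The sketch below uses hypothetical toy reads and an arbitrary bin count.

```python
from collections import Counter


def gc_fraction(seq):
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_histogram(reads, bins=10):
    """Count reads per GC bin; bin i covers [i/bins, (i+1)/bins)."""
    counts = Counter()
    for read in reads:
        # A GC fraction of exactly 1.0 is folded into the last bin.
        index = min(int(gc_fraction(read) * bins), bins - 1)
        counts[index] += 1
    return [counts[i] for i in range(bins)]


# Hypothetical toy reads: two AT-rich and two GC-rich sequences.
toy_reads = ["ATATATATAA", "ATTTAATATA", "GCGCGGCCGC", "GGCCGCGCGA"]
histogram = gc_histogram(toy_reads, bins=5)
print(histogram)
```

Two well-separated peaks in such a histogram would hint at a mixed population, although genomes with similar GC content would not be resolved this way.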



What can your efforts today produce?


Analysis, answers to questions



Heuristics, algorithms



Specific software tools



Roadmap for future work



What to get out of this?


A deeper understanding of the field
  Tools
  Approaches
  Concerns/challenges
Ideas and experiences that may motivate future work
Connection to others with similar interests


What We Hope to See From You

Creativity (innovative ideas and efforts)
Energy (intensity and focus)
Communication (results, feedback)



Theme: Flexibility

You can work alone, come with a team, or team up on-site
You can use any of the resources we have provided, or any you have access to (including tools you code yourself ahead of time or today)
You keep what you make (DTRA and MIT LL make no claims to what you produce)


Schedule

 8:00 AM   Breakfast/check-in
 9:00 AM   Welcome (Pete)
 9:15 AM   Overview and logistics (Pete)
 9:45 AM   The Challenges:
             1. Metagenomic Visual (Anna)
             2. Genome Assembly for the Clinic (Darrell)
             3. Genetic Engineering (Pete)
10:45 AM   Coffee/Break into project groups
12:30 PM   Lunch served (groups can continue to work)
 3:30 PM   Snack (groups can continue to work)
 6:30 PM   Progress updates ready by dinnertime
 6:30 PM   Dinner and progress reports
 8:00 PM+  Groups can continue to work



Getting Started

On the USB sticks:
  Data for the three challenges (FASTA, FASTQ, CSV)
  Software (Mac, Windows, Linux)
Local wifi access
Teaming
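
For participants who want to read the FASTA and FASTQ challenge data without installing anything, a minimal parser sketch follows. It assumes well-formed input: FASTQ records of exactly four lines with no blank lines between them. The function names and toy records are hypothetical. An established library is the safer choice for real data.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fastq(handle):
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        sequence = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().strip()
        yield header[1:], sequence, quality


# Hypothetical toy records held in memory instead of files on disk.
fasta_records = list(read_fasta(io.StringIO(">seq1 toy\nACGT\nACGT\n>seq2\nGGCC\n")))
fastq_records = list(read_fastq(io.StringIO("@read1\nACGT\n+\nIIII\n")))
print(fasta_records, fastq_records)
```

The same functions accept an open file object, for example `open("reads.fastq")`, where the file name is a placeholder.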


Questions?



Challenge 3: Genetic Engineering

Background: a sample has been dug from the back of a lab freezer, and subjected to Ion Torrent sequencing
We would like to know what it is:
  Simple or complex?
  Natural or engineered?
  If engineered, how? (what techniques)
  For what purpose?
  Will the design work?
[No surprise: yes, there is an (in silico) engineered component. Find it! And figure out as much as you can about it.]
We have a lot of great questions, but may not have all the answers



What Do We Design For?

Investigation (answer a biological question)
Production (make a drug, a fuel)
Serve a specialized role
  Protect against infection
  Detect dangerous chemicals
  Environmental remediation
Creatively explore an interesting design space



How Do We Produce These?



Getting DNA In

Transformation/transfection can be via natural, chemical, or electrical methods



Old School: Conjugation

Transfer “in vivo” protects fragile DNA
An entire genome can be transferred
Transfer to other species
Requires an origin of replication, pilus protein

[Figure labels: donor (sender); recipient (receiver)]


Old School: Phage Transduction


Phage/virus can replicate independently, or integrate into genome
DNA or RNA, single- or double-stranded
Examples:
  Lentivirus (mammalian)
  Lambda, T4, T7, P1, M13 (E. coli)



Old School: Mutagenesis

Natural mutation rates (mutations accumulate slowly over time)
Exposure to damaging effects (chemicals, radiation)
Mutator strains: cells defective for one or more natural repair mechanisms



Revolution 1: Restriction Enzymes

Specific sites: often 6 bp, but can be longer or shorter
“Outside cutters” cut some distance away from recognition site
Homing nucleases (longer ~30 bp sites, can be unique in a genome)
Multiple Cloning Site (MCS) often engineered into cloning vector
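
Because a multiple cloning site packs several recognition sites into a short stretch, scanning a sequence for clustered sites is one plausible way to look for vector-derived DNA. The sketch below is illustrative only. It uses three well-known 6 bp palindromic sites (EcoRI, BamHI, HindIII), so scanning one strand is enough. The 60 bp window, the three-enzyme threshold, and the function names are assumptions.

```python
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        hits[enzyme] = positions
    return hits


def looks_like_mcs(hits, window=60, min_enzymes=3):
    """Heuristic: True if sites for min_enzymes distinct enzymes start within window bp."""
    tagged = sorted(
        (pos, enzyme) for enzyme, positions in hits.items() for pos in positions
    )
    for i, (start, _) in enumerate(tagged):
        nearby = {enzyme for pos, enzyme in tagged[i:] if pos - start <= window}
        if len(nearby) >= min_enzymes:
            return True
    return False


# Hypothetical toy sequence: three sites placed back to back, as in an MCS.
toy_sequence = "TTT" + "GAATTC" + "GGATCC" + "AAGCTT" + "TTT"
toy_hits = find_sites(toy_sequence)
print(toy_hits, looks_like_mcs(toy_hits))
```

A real scan would use a much larger enzyme table and would also need to handle non-palindromic sites on both strands.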



Plasmids

Circular
Contain origin of replication
Single copy
Low to high copy (hundreds)
Selection gene (1 or more)
MCS and other features common
Extension: BACs and YACs



Selection and Screening

Almost all approaches give a mix of successes and failures
Screening searches for what you want
Selection kills off what you don’t want


Revolution 2: PCR

Polymerase Chain Reaction
Simple scheme made it possible to manipulate DNA in new ways
Used not just to make more DNA, but to modify it
Dependent on oligonucleotide synthesis and enzyme (DNA polymerase)
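
The dependence of PCR on a pair of oligos can be illustrated with a small in-silico PCR sketch: the forward primer must match the template, and the reverse primer must match the opposite strand downstream. It is deliberately simplified: exact matches only, no mismatches, no 5' tails, no melting-temperature check. The function names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the predicted amplicon, or None if a primer has no exact match."""
    template = template.upper()
    forward_primer = forward_primer.upper()
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand's complement, so on the
    # top strand we look for its reverse complement downstream of the start.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical toy template with primer sites flanking an 8 bp insert.
toy_template = "AAAA" + "ACGTAC" + "GGGGCCCC" + "TTGACA" + "AAAA"
amplicon = in_silico_pcr(toy_template, "ACGTAC", "TGTCAA")
print(amplicon)
```

Modifying DNA by PCR, as the slide notes, usually means adding or changing bases in the primers themselves, which this exact-match sketch does not model.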



Site-Directed Mutagenesis

Performed on DNA in vitro (higher background error rates than in vivo)
Employs a synthetic oligo and an enzyme (polymerase)
Users typically screen clones with PCR or restriction, then sequencing
Rest of the plasmid typically not re-sequenced



Gibson Assembly

Can bring together many pieces of DNA at once
Based on identical sequence overlaps
3-enzyme reaction
Intrinsically scar-less
Often relies on PCR (& thus oligos) to produce each segment

http://www.youtube.com/watch?v=WCWjJFU1be8
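
The overlap-based joining described for Gibson Assembly can be sketched in silico: adjacent fragments are merged wherever the end of one is identical to the start of the next, and the shared sequence appears only once in the product, leaving no scar. This models only the sequence logic, not the enzymatic reaction. The function names, the default minimum overlap of 20 bp, and the toy fragments are assumptions.

```python
def shared_overlap(left, right, min_overlap=20):
    """Length of the longest suffix of left equal to a prefix of right, else 0."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def assemble(fragments, min_overlap=20):
    """Join fragments in the given order, keeping each shared overlap once."""
    product = fragments[0]
    for fragment in fragments[1:]:
        size = shared_overlap(product, fragment, min_overlap)
        if size == 0:
            raise ValueError("adjacent fragments share no sufficient overlap")
        product += fragment[size:]
    return product


# Hypothetical toy fragments with 4 bp overlaps (far shorter than real designs).
toy_fragments = ["AAAACCCCGGGG", "GGGGTTTTACGT", "ACGTCATG"]
toy_product = assemble(toy_fragments, min_overlap=4)
print(toy_product)
```

Running the same overlap search on sequencing reads in reverse, looking for junctions between sequences from unrelated sources, is one conceivable way to spot assembled constructs.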




Golden Gate Assembly

“Outside cutter” restriction enzymes
Little or no scar at joining point
Segments may or may not be produced by PCR


Recombination


Site-specific
  attB (Gateway)
  Cre/lox
Homologous
  Natural (B. subtilis, RecA)
  Engineered (lambda red)
Directed by double-stranded break repair
  Zn finger nucleases
  TALENs
  CRISPRs




DNA Synthesis to Genome Assembly

Oligo synthesis (building blocks) using organic chemistry
Assemble to genes using biochemistry (in vitro)
Assemble to genomes (small ones for starters) using biology (in vivo)
Each of these processes can carry its own error signature, but errors can also be counteracted by sequencing-based screening, post-repair, etc.


MAGE: Multiplexed Automatable Genome Engineering

Wang, Isaacs, Carr et al. (2009) Nature 460(7257):894-8

Generation of genome edits at many targeted chromosomal locations
Much like site-directed mutagenesis, but on a chromosome



MAGE

A lot like site-directed mutagenesis but on the genome of living cells
Uses long oligos
Does not require selection markers (but can use them)
Other than the desired change (as small as a DNA base, as large as a multi-gene deletion) there is no obvious sign
BUT there can be secondary signs:
  Oligo-mediated defects within 50-100 bp of the edited site
  Higher background mutation rates (mismatch repair deactivated)
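
The two secondary signs listed for MAGE suggest a simple bookkeeping step once variants have been called against a reference: separate variants that lie close to a suspected edit from those scattered elsewhere. The sketch below assumes variant and edit positions are already known as integer coordinates. The function name, the 100 bp window (the upper end of the range on the slide), and the toy coordinates are hypothetical.

```python
def classify_variants(variant_positions, edit_positions, window=100):
    """Split unintended variants into (near an edit, elsewhere) lists."""
    edits = set(edit_positions)
    near_edit, elsewhere = [], []
    # The intended edits themselves are excluded from both lists.
    for position in sorted(set(variant_positions) - edits):
        if any(abs(position - edit) <= window for edit in edits):
            near_edit.append(position)
        else:
            elsewhere.append(position)
    return near_edit, elsewhere


# Hypothetical toy coordinates: two intended edits and four other variants.
toy_variants = [1000, 1040, 5000, 7300, 7950]
toy_edits = [1000, 7900]
near, background = classify_variants(toy_variants, toy_edits)
print(near, background)
```

Variants in the first list are candidates for oligo-mediated defects, while an unusually long second list would be consistent with an elevated background mutation rate. Neither is proof on its own.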



CAGE: Conjugative Assembly Genome Engineering

Conjugation now employed with controlled precision
But DNA crossover points not always perfectly defined

Isaacs, Carr, Wang, ... (2011) Science



Genetic Circuits: DNA Parts

Make use of DNA “parts” libraries for constructing more advanced genetic designs
Fundamental concept in synthetic biology, inspired by electrical engineering
Basis of the iGEM competition (International Genetically Engineered Machine)



Genetic Circuits: Bacteria

Repressilator: an early example of synthetic biology circuits
Three inverters in series (circular) make a ring oscillator

Elowitz and Leibler (2000) Nature
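
The ring-oscillator behaviour of three inverters in a loop can be reproduced with a small simulation. The sketch below uses the commonly cited dimensionless form of the repressilator model, in which each mRNA is produced at a rate repressed by the previous protein in the ring and each protein tracks its own mRNA. The equations as written here, the parameter values, the initial conditions, and the plain Euler integration are illustrative assumptions, not values taken from the slides.

```python
def simulate_repressilator(steps=10000, dt=0.01, alpha=200.0, alpha0=0.2,
                           beta=5.0, n=2.0):
    """Euler-integrate a three-gene repressor ring; return protein levels per step."""
    m = [1.0, 0.0, 0.0]  # mRNA levels (arbitrary asymmetric start)
    p = [2.0, 1.0, 3.0]  # protein levels
    history = []
    for _ in range(steps):
        # Gene i is repressed by the protein of gene i-1 (indices wrap around).
        dm = [-m[i] + alpha / (1.0 + p[(i - 1) % 3] ** n) + alpha0
              for i in range(3)]
        dp = [-beta * (p[i] - m[i]) for i in range(3)]
        m = [m[i] + dt * dm[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        history.append(tuple(p))
    return history


trajectory = simulate_repressilator()
late_p0 = [state[0] for state in trajectory[len(trajectory) // 2:]]
print(min(late_p0), max(late_p0))
```

A wide gap between the late-time minimum and maximum of one protein indicates sustained oscillation rather than settling to a steady state. A proper ODE solver would be preferable for quantitative work.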



Genetic Circuits: Yeast

Adapted a signaling system from plants
Used to engineer communication between yeast cells
Basic features can be installed in a variety of organisms

Chen and Weiss (2005) Nature Biotechnology


Genetic Circuits: Mammalian

Xie et al. (2011) Science (Weiss, Benenson labs)

Genetic Circuits Overview

[Figure labels: DNA for classifier circuit; match: cell death; no match: no effect; cancer cell; normal cell]

Concept: insert DNA circuit into cells
ID cancer and/or kill it



Increasingly Alien

Codon usage
  Adapt how often codons are used to match target organism
  New amino acids (Tirrell, Schultz)
  New genetic codes (Church, Carr)
Minimal life
  Engineering by subtraction (Blattner)
  Compose from the ground up (Forster/Church)
New DNA bases
  Alternate hydrogen-bonding (Benner)
  Hydrophobic bases (Schultz)
Mirror-image life
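
The codon usage item above lends itself to a quick check: a gene whose codon frequencies look unlike those of its host may have been re-coded for a different organism. As a minimal sketch, the function below tabulates codon frequencies for one in-frame coding sequence. The function name and the toy sequence are hypothetical, and comparing against a host-wide table is left out.

```python
from collections import Counter


def codon_usage(coding_sequence):
    """Return {codon: fraction of all codons} for an in-frame coding sequence."""
    seq = coding_sequence.upper()
    # Trailing bases that do not fill a complete codon are ignored.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: count / total for codon, count in counts.items()}


# Hypothetical toy gene: ATG, CTG, CTG, CTC, TAA.
toy_usage = codon_usage("ATGCTGCTGCTCTAA")
print(toy_usage)
```

Here two of the three leucine codons are CTG, so its frequency is twice that of CTC. On real genes the same table can be compared across organisms.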