Topic 18

2 Oct 2013

Aka, The Inverse Folding Problem

Chapter 39, Gu and Bourne, “Structural Bioinformatics”

Protein Design is an Inverse Problem of Structure Prediction

MDVGQAVIFLGPPGAGKG
TQASRLAQELGFKKLSTG
DILRDHVARGTPLGERVR
PIMERGDLVPDDLILELI
REELAERVIFDGFPRTLA
QAEALDRLLSETGTRLLG
VVLVEVPEEELVRRIL…

Biology

Adapted from Amy Keating’s slides at MIT.

Different Types of Protein Design

Protein design

De novo design (grand challenge): design of new proteins

-- novel protein folds
-- binding interfaces
-- enzymatic activities
-- etc.

Redesign of existing proteins (immediate practical applications)

-- increased thermostability
-- altered binding specificity
-- improved binding affinity
-- enhanced enzymatic activity
-- altered substrate specificity

Current Opinion in Biotechnology 2007, 18:1-7.


Protein Design Problems

Annu. Rev. Biochem. 2008. 77:363-382.


Goal: design a protein that adopts a given structure

Open problems with assessment:

-- What resolution is required? (fold, sidechain, loop, etc.?)
-- Stability of the designed protein
-- Structural uniqueness
-- Must solve the structure to know how you did!

There are typically many sequences that adopt the fold, so you must try to find the one that is the most stable.

That is, minimize the quantity:

$$\Delta G_{\text{fold}} = G_{\text{folded}} - G_{\text{unfolded}}$$

Search through many possible sequences, and then pick the one with the best (lowest) $\Delta G_{\text{fold}}$.

Design target

Designed protein


The big challenges

Search

The search space is astronomical: $20^n$ sequences for a protein of $n$ residues.

Except in rare subspace search problems, this is computationally intractable.
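To get a feel for the $20^n$ growth, here is a small sketch that counts sequences for a few chain lengths. The function names and the chosen lengths are illustrative assumptions, not part of the original slides.

```python
import math

def sequence_space_size(n_residues: int, alphabet: int = 20) -> int:
    """Exact number of distinct sequences: alphabet ** n_residues."""
    return alphabet ** n_residues

def log10_sequence_space(n_residues: int, alphabet: int = 20) -> float:
    """Base-10 logarithm of the sequence count, easier to read for large n."""
    return n_residues * math.log10(alphabet)

# Illustrative chain lengths (assumed, chosen only to show the growth).
for n in (10, 50, 100):
    print(f"n = {n}: about 10^{log10_sequence_space(n):.0f} sequences")
```

Even at a billion sequences evaluated per second, a space of this size for a 100-residue protein cannot be enumerated, which is why the search methods later in the deck matter.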

Energy

It is practically impossible to calculate $\Delta G_{\text{fold}}$ because…

-- What is the structure of the folded state? (sidechain and loop positions)
-- How do we model the unfolded state?
-- Entropy?!

Instead, we focus on the energy of the folded protein, meaning native structure interactions. That is, replace $\Delta G_{\text{fold}}$ with $\Delta E_{\text{fold}}$, computed using molecular mechanics (MM) force fields.
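As one concrete example of an MM force field term, the sketch below evaluates a 12-6 Lennard-Jones pair energy, the same "12-6" form named later in the Rosetta Design outline. The parameter values for `epsilon` and `sigma` are placeholders assumed for illustration, not taken from any particular force field.

```python
def lennard_jones(r: float, epsilon: float = 0.2, sigma: float = 3.5) -> float:
    """12-6 Lennard-Jones energy for two atoms separated by distance r.

    epsilon is the well depth and sigma the distance at which the energy
    crosses zero. Both defaults are placeholder values (assumptions).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    # Repulsive r^-12 term minus attractive r^-6 term.
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# The minimum of this functional form sits at r = 2^(1/6) * sigma,
# where the energy equals -epsilon.
r_min = 2 ** (1 / 6) * 3.5
print(r_min, lennard_jones(r_min))
```

A full force field sums many such terms (plus electrostatics, hydrogen bonding, solvation, and so on) over all atom pairs, and that sum stands in for the free energy that cannot be computed directly.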


Sidechain packing

Design target

Designed protein

As we did with structure prediction in homology modeling, we will typically use a rotamer library-based approach.
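A rotamer library turns sidechain packing into a discrete choice problem: each residue picks one conformer from a short list. The sketch below shows one way such a library might be represented and how quickly the number of packings grows. The dictionary layout, the residue selection, and the chi-angle values are all made-up placeholders, not entries from a real rotamer library.

```python
import itertools

# Hypothetical toy library: residue type -> list of chi-angle tuples (degrees).
# The angles are placeholders for illustration only.
ROTAMER_LIBRARY = {
    "SER": [(60.0,), (180.0,), (-60.0,)],
    "VAL": [(60.0,), (180.0,), (-60.0,)],
    "LEU": [(-60.0, 180.0), (180.0, 60.0)],
}

def count_packings(sequence, library=ROTAMER_LIBRARY):
    """Number of distinct rotamer assignments for a fixed sequence."""
    count = 1
    for residue in sequence:
        count *= len(library[residue])
    return count

def enumerate_packings(sequence, library=ROTAMER_LIBRARY):
    """Yield every combination of one rotamer per position."""
    return itertools.product(*(library[residue] for residue in sequence))

sequence = ["SER", "VAL", "LEU"]
print(count_packings(sequence))
for packing in enumerate_packings(sequence):
    print(packing)
```

In design, the choice at each position also ranges over amino acid types, so the per-position list is the union of the rotamers of every allowed residue.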


Search algorithms for large spaces

Exhaustive search: too slow!

Stochastic methods

-- Monte Carlo
-- Genetic algorithms

Pruning algorithms (which are deterministic)

-- Branch and Bound
-- Dead End Elimination

For all-atom protein design, some amount of stochasticity is generally required. Purely deterministic approaches rarely succeed in designing complete proteins.
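A minimal sketch of the Monte Carlo option, assuming the energy can be written as per-rotamer self energies plus pairwise energies (an assumption of this example, as are the random toy energy tables and all function names). Moves change one rotamer at a time and are accepted with the Metropolis criterion.

```python
import math
import random

def total_energy(assign, e_self, e_pair):
    """Self energies plus pair energies for one rotamer index per position."""
    n = len(assign)
    energy = sum(e_self[i][assign[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += e_pair[i][j][assign[i]][assign[j]]
    return energy

def random_toy_energies(n_pos, n_rot, seed=1):
    """Random placeholder energy tables; only e_pair[i][j] with i < j is used."""
    rng = random.Random(seed)
    e_self = [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_pos)]
    e_pair = [[[[rng.uniform(-1, 1) for _ in range(n_rot)]
                for _ in range(n_rot)]
               for _ in range(n_pos)]
              for _ in range(n_pos)]
    return e_self, e_pair

def monte_carlo_search(e_self, e_pair, steps=2000, kT=1.0, seed=0):
    """Metropolis Monte Carlo over rotamer assignments; returns best seen."""
    rng = random.Random(seed)
    n = len(e_self)
    current = [rng.randrange(len(e_self[i])) for i in range(n)]
    e_cur = total_energy(current, e_self, e_pair)
    best, e_best = list(current), e_cur
    for _ in range(steps):
        i = rng.randrange(n)                      # pick a position
        trial = list(current)
        trial[i] = rng.randrange(len(e_self[i]))  # propose a new rotamer
        e_trial = total_energy(trial, e_self, e_pair)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / kT).
        if e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / kT):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best

e_self, e_pair = random_toy_energies(n_pos=4, n_rot=3)
print(monte_carlo_search(e_self, e_pair))
```

The search carries no guarantee of finding the global minimum, which is the trade-off against the deterministic pruning methods listed above.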


Dead End Elimination

Eliminate, one at a time, rotamer choices that cannot under any circumstance be part of the minimum energy solution.

From Wikipedia: DEE is a method for minimizing a function over a discrete set of independent variables. The basic idea is to identify "dead ends", i.e., "bad" combinations of variables that cannot possibly yield the global minimum and to refrain from searching such combinations further. Hence, dead-end elimination is a mirror image of dynamic programming techniques in which "good" combinations are identified and explored further. Although the method itself is general, it has been developed and applied mainly to the problems of predicting and designing the structures of proteins.


Dead End Elimination

Identify and eliminate rotamers that cannot be part of the best solution.

Note: Cannot afford to calculate energies for all of these configurations!


Dead End Elimination

What is the least energy it would cost to replace $i_s$ with $i_r$?

Note: Only need to do $p \times r$ comparisons (versus $r^p$), where:

$r$ = average # of rotamers/residue
$p$ = # residues.

DEE algorithm applied to protein design

If $\Delta E > 0$, then eliminate $i_r$.

Apply iteratively to all rotamer pairs.

The energy profile changes as rotamers are eliminated, leading to elimination of further rotamers.
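The elimination rule can be sketched in code. This example uses the Goldstein form of the DEE criterion, which matches the "least energy it would cost to replace $i_s$ with $i_r$" wording, though the slides do not say which variant they intend. It also assumes a pairwise-decomposable energy; the toy energy tables and all function names are illustrative assumptions.

```python
import itertools
import random

def pair_energy(e_pair, i, r, j, t):
    """Pair energy of rotamer r at position i with rotamer t at position j."""
    return e_pair[i][j][r][t] if i < j else e_pair[j][i][t][r]

def dee_goldstein(e_self, e_pair):
    """Return, per position, the set of rotamers that survive elimination."""
    n = len(e_self)
    alive = [set(range(len(e_self[i]))) for i in range(n)]
    changed = True
    while changed:  # repeat: each elimination can enable further ones
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                for s in sorted(alive[i]):
                    if s == r:
                        continue
                    # Least energy it could cost to use r instead of s,
                    # over every surviving choice at the other positions.
                    delta = e_self[i][r] - e_self[i][s]
                    for j in range(n):
                        if j != i:
                            delta += min(pair_energy(e_pair, i, r, j, t)
                                         - pair_energy(e_pair, i, s, j, t)
                                         for t in alive[j])
                    if delta > 0:  # r is always worse than s: a dead end
                        alive[i].discard(r)
                        changed = True
                        break
    return alive

def total_energy(assign, e_self, e_pair):
    """Self plus pair energies for one rotamer index per position."""
    n = len(assign)
    return (sum(e_self[i][assign[i]] for i in range(n))
            + sum(e_pair[i][j][assign[i]][assign[j]]
                  for i in range(n) for j in range(i + 1, n)))

def brute_force_minimum(e_self, e_pair):
    """Exhaustive reference answer; feasible only for tiny toy problems."""
    choices = [range(len(rots)) for rots in e_self]
    return min(itertools.product(*choices),
               key=lambda a: total_energy(a, e_self, e_pair))

def toy_energies(n_pos, n_rot, seed=0):
    """Random placeholder energies; only e_pair[i][j] with i < j is used."""
    rng = random.Random(seed)
    e_self = [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_pos)]
    e_pair = [[[[rng.uniform(-1, 1) for _ in range(n_rot)]
                for _ in range(n_rot)] for _ in range(n_pos)]
              for _ in range(n_pos)]
    return e_self, e_pair

e_self, e_pair = toy_energies(n_pos=4, n_rot=3)
print("surviving rotamers:", dee_goldstein(e_self, e_pair))
print("global minimum:", brute_force_minimum(e_self, e_pair))
```

Because a rotamer is removed only when some alternative beats it for every surviving choice elsewhere, the global minimum energy assignment always stays inside the surviving sets. DEE may not reduce each position to a single rotamer, in which case the remaining combinations still have to be searched.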

Coiled-coil design (Mayo et al.)

Biosensor design (Hellinga et al.)

The Hellinga lab has designed many different receptors based on the bPBP fold.

Protein-protein interface design (Love, Mayo, et al.)

Rosetta Design

-- Sketch input structure (the fold)
-- Initial sequence selection (primarily 12-6, HB, and Born terms)
-- Sequence optimization
-- Monte Carlo minimization (both at rotamer and backbone levels)
-- Repeat till convergence
-- Final structure

Note: this step is analogous to structure prediction!

Top7 (Baker, Kuhlman, et al.)

Conformational switch (Kuhlman, et al.)

unfolded

folded

Folded to unfolded transition as zinc is titrated in

The ideal: Designed sequences that meet both criteria


The Holy Grail

TS: transition state

Design model: purple

X-ray crystal structure: green

The dirty little secret of protein design…

For every high-impact success in the protein design literature, there are dozens (perhaps hundreds) of spectacular failures that go unreported.

Paraphrased from S. Mayo (Protein Society Meeting, 2006).

Scientific misconduct?

Design of a novel triosephosphate isomerase

DEE repacking around catalytic site

Lineweaver-Burk plots
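For reference, a Lineweaver-Burk plot is the double-reciprocal form of Michaelis-Menten kinetics, $1/v = (K_m/V_{max})(1/[S]) + 1/V_{max}$, so a straight-line fit yields $K_m$ and $V_{max}$. The sketch below fits that line by ordinary least squares. The data are synthetic, generated from assumed parameter values, and have nothing to do with the measurements shown on the slide.

```python
def lineweaver_burk_fit(substrate, velocity):
    """Estimate (Km, Vmax) from a least-squares line through (1/[S], 1/v)."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept   # y-intercept is 1/Vmax
    km = slope * vmax        # slope is Km/Vmax
    return km, vmax

# Synthetic, noise-free data from assumed values Km = 2.0 and Vmax = 10.0.
true_km, true_vmax = 2.0, 10.0
substrate = [0.5, 1.0, 2.0, 5.0, 10.0]
velocity = [true_vmax * s / (true_km + s) for s in substrate]
print(lineweaver_burk_fit(substrate, velocity))
```

With real measurements the reciprocal transform amplifies error at low substrate concentration, so a direct nonlinear fit of the Michaelis-Menten equation is often preferred for parameter estimates.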

As do I!
