Structural Bioinformatics - Academia Sinica


Forces and Prediction of Protein Structure

Ming-Jing Hwang (黃明經)

Institute of Biomedical Sciences

Academia Sinica


http://gln.ibms.sinica.edu.tw/

Sequence - Structure - Function

MADWVTGKVTKVQ
NWTDALFSLTVHAP
VLPFTAGQFTKLGLE
IDGERVQRAYSYVN
SPDNPDLEFYLVTVP
DGKLSPRLAALKPG
DEVQVVSEAAGFFV
LDEVPHCETLWMLA
TGTAIGPYLSILR






Sequence/Structure Gap


Current (May 26, 2005) entries in protein sequence and structure databases:

SWISS-PROT/TrEMBL: 181,821/1,748,002



PDB : 31,059


Sequence

Structure

Structure Prediction Methods

[Figure: ab initio, fold recognition, and homology modeling placed along a 0-100 % sequence identity axis]
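The axis of this slide is the percent sequence identity between a target and a known structure. As a minimal sketch, the quantity can be computed from a pairwise alignment as below. The function name and the convention of dividing by the number of gap-free aligned columns are assumptions made for illustration; other conventions (for example dividing by the shorter sequence length) are also in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned pair: 7 gap-free columns, 6 of them identical.
identity = percent_identity("MADWV-TGK", "MAEWVQTG-")
```

Which prediction method applies at a given identity value depends on the thresholds drawn in the figure, which are not reproduced here.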


Levinthal's paradox (1969)

If we assume three possible states for every flexible dihedral angle in the backbone of a 100-residue protein, the number of possible backbone configurations is 3^200. Even an incredibly fast computational or physical sampling in 10^-15 s would mean that a complete sampling would take 10^80 s, which exceeds the age of the universe by more than 60 orders of magnitude.


Yet proteins fold in seconds or less!
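The arithmetic behind the paradox can be checked in a few lines. This is a sketch under stated assumptions: two flexible backbone dihedrals (phi, psi) per residue, which gives the 200 angles quoted above, and an age of the universe of about 4.35e17 s (roughly 13.8 billion years), a value not given in the slide.

```python
import math

STATES_PER_DIHEDRAL = 3
N_DIHEDRALS = 200            # 100 residues x 2 backbone dihedrals (assumption: phi and psi)
SAMPLE_TIME_S = 1e-15        # time per sampled configuration, from the slide
AGE_OF_UNIVERSE_S = 4.35e17  # assumption: about 13.8 billion years


def levinthal_log10():
    """Return log10 of (configurations, total time in s, ratio to the age of the universe)."""
    log_configs = N_DIHEDRALS * math.log10(STATES_PER_DIHEDRAL)
    log_time = log_configs + math.log10(SAMPLE_TIME_S)
    log_ratio = log_time - math.log10(AGE_OF_UNIVERSE_S)
    return log_configs, log_time, log_ratio


log_configs, log_time, log_ratio = levinthal_log10()
print(f"configurations ~ 10^{log_configs:.1f}")
print(f"sampling time  ~ 10^{log_time:.1f} s")
print(f"exceeds age of universe by ~ {log_ratio:.1f} orders of magnitude")
```

Working in log10 avoids handling the very large integer 3^200 directly and makes the "orders of magnitude" comparison explicit.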

Berendsen

Energy landscapes of protein folding

Borman, C&E News, 1998

Levitt's lecture for S*


Other factors

Formation of secondary structure elements

Packing of secondary structure elements

Topologies of fold

Metal/cofactor binding

Disulfide bond




Ab initio/new fold prediction


Physics-based (laws of physics)

Knowledge-based (rules of evolution)


Molecular Mechanics (Force Field)
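As a generic illustration of what a molecular mechanics force field computes, the sketch below sums three common textbook terms: harmonic bond stretching, a 12-6 Lennard-Jones term, and Coulomb electrostatics. It is not taken from the lecture or from any particular force field. All function names and parameter values are hypothetical placeholders, real force fields also include angle and dihedral terms, and the Coulomb constant is an approximate value for units of kcal/mol, Angstrom, and elementary charge.

```python
import math

COULOMB_CONSTANT = 332.06  # approximate, kcal*Angstrom/(mol*e^2)


def bond_energy(r, r0, k):
    """Harmonic bond stretch around the equilibrium length r0."""
    return 0.5 * k * (r - r0) ** 2


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones term; minimum of -epsilon at r = 2**(1/6) * sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q_i, q_j):
    """Point-charge electrostatics in vacuum."""
    return COULOMB_CONSTANT * q_i * q_j / r


def total_energy(coords, bonds, nonbonded):
    """Sum bonded and nonbonded terms over explicit pair lists.

    bonds:     (i, j, r0, k) tuples
    nonbonded: (i, j, epsilon, sigma, q_i, q_j) tuples
    """
    energy = 0.0
    for i, j, r0, k in bonds:
        energy += bond_energy(math.dist(coords[i], coords[j]), r0, k)
    for i, j, epsilon, sigma, q_i, q_j in nonbonded:
        r = math.dist(coords[i], coords[j])
        energy += lennard_jones(r, epsilon, sigma) + coulomb(r, q_i, q_j)
    return energy


# Hypothetical three-atom system: one bond at its equilibrium length, one neutral nonbonded pair.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
energy = total_energy(coords, bonds=[(0, 1, 1.5, 300.0)],
                      nonbonded=[(0, 2, 0.1, 3.4, 0.0, 0.0)])
```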


1-microsecond MD simulation

980 ns

- villin headpiece
- 36 a.a.
- 3000 H2O
- 12,000 atoms
- 256 CPUs (CRAY)
- ~4 months
- single trajectory


Duan & Kollman, 1998
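To put the numbers on this slide in perspective, the sketch below converts them into integration steps, throughput, and CPU-days. Two inputs are assumptions not stated on the slide: a 2 fs integration timestep and "~4 months" taken as 120 days. The function name is hypothetical.

```python
def md_run_summary(sim_time_ns, timestep_fs, wall_days, n_cpus):
    """Back-of-the-envelope cost figures for a single MD trajectory."""
    steps = sim_time_ns * 1e6 / timestep_fs  # 1 ns = 1e6 fs
    return {
        "steps": steps,
        "ns_per_day": sim_time_ns / wall_days,
        "cpu_days": wall_days * n_cpus,
    }


# 1 microsecond = 1000 ns; timestep and wall time are assumptions (see above).
summary = md_run_summary(sim_time_ns=1000.0, timestep_fs=2.0, wall_days=120.0, n_cpus=256)
print(summary)
```

Under these assumptions the run needs about half a billion force evaluations and advances by only a few nanoseconds per day.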

Protein folding by MD

PROTEIN FOLDING: A Glimpse of the Holy Grail?

Herman J. C. Berendsen

"The Grail had many different manifestations throughout its long history, and many have claimed to possess it or its like". We might have seen a glimpse of it, but the brave knights must prepare for a long pursuit.


Massively distributed computing


SETI@home


Folding@home


Distributed folding


Sengent’s drug design


FightAIDS@home




Letters to Nature (2002)

- engineered protein (BBA5)
- zinc finger fold (w/o metal)
- 23 a.a.
- solvation model
- thousands of trajectories each of 5-20 ns, totaling 700 μs
- Folding@home
- 30,000 internet volunteers
- several months, or ~a million CPU days of simulation

Massively distributed computing

Energy landscapes of protein folding

Borman, C&E News, 1998

Protein-folding prediction technique

CGU: Convex Global Underestimation

- K. Dill's group

Challenges of physics-based methods


Simulation time scale


Computing power


Sampling


Accuracy of energy functions

Structure Prediction Methods

[Figure: ab initio, fold recognition, and homology modeling placed along a 0-100 % sequence identity axis]


Flowchart of homology (comparative) modeling

From Marti-Renom et al.

Fold recognition

Find, from a library of folds, the 3D template that accommodates the target sequence best.

Also known as "threading" or "inverse folding"

Useful for twilight-zone sequences
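The idea of finding the template that best accommodates a sequence can be sketched with a toy threading scorer. Everything here is a simplifying assumption made for illustration: each template is reduced to a string of environment classes ("B" buried, "E" exposed), the score rewards hydrophobic residues in buried positions and polar residues in exposed ones, and the sequence is slid along the template without gaps. Practical threading methods use gapped alignment and much richer potentials. The library contents, score values, and function names are hypothetical.

```python
HYDROPHOBIC = set("AVLIMFWC")  # crude hydrophobic class (assumption)


def residue_env_score(residue, env):
    """Toy compatibility of one residue with one structural environment."""
    hydrophobic = residue in HYDROPHOBIC
    if env == "B":  # buried position
        return 1.0 if hydrophobic else -1.0
    return -0.5 if hydrophobic else 0.5  # exposed position


def thread(sequence, env_string):
    """Best ungapped placement of the sequence on a template; returns (score, offset)."""
    best = (float("-inf"), -1)
    for offset in range(len(env_string) - len(sequence) + 1):
        score = sum(residue_env_score(r, e)
                    for r, e in zip(sequence, env_string[offset:]))
        best = max(best, (score, offset))
    return best


def recognize_fold(sequence, library):
    """Score the sequence against every template and return the best-scoring name."""
    scores = {name: thread(sequence, envs)[0] for name, envs in library.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-template fold library.
library = {"alternating": "BEBEBE", "all_buried": "BBBBBB"}
best_fold, fold_scores = recognize_fold("LKLKLK", library)
```

In this toy case the alternating hydrophobic/polar sequence fits the alternating buried/exposed template, so that template is selected.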

Fold recognition (aligning sequence to structure)

(David Shortle, 2000)

3D->1D score

On X-ray, NMR, and computed models

(Rost, 1996)

Marti-Renom et al. (2000)

Reliability and uses of comparative models

Pitfalls of comparative modeling


Cannot correct alignment errors


More similar to template than to true structure


Cannot predict novel folds


Ab initio/new fold prediction


Physics-based (laws of physics)

Knowledge-based (rules of evolution)

From 1D -> 2D -> 3D

LGINCRGSSQCGLSGGNLMVRIRDQACGNQGQTWCPGERRAKVCGTGN
SISAY
VQ
STNNCIS
GTEACRHLTNLVNH
GCRVCGSDPLYAGNDVSRGQLTVNYVNSC

Tertiary

Primary

Secondary (fragment)

fragment assembly

seq. to str. mapping
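The fragment assembly step labeled above can be illustrated with a toy Monte Carlo search. This is a sketch under heavy assumptions, not the procedure of any specific program: a conformation is a list of (phi, psi) pairs, the fragment library offers a helix-like and a strand-like 3-residue fragment at every window, and the score is a made-up stand-in (count of non-helical residues) for a knowledge-based energy. All names and values are hypothetical.

```python
import math
import random

HELIX = (-60.0, -45.0)    # approximate alpha-helical (phi, psi) in degrees
STRAND = (-120.0, 130.0)  # approximate beta-strand (phi, psi) in degrees


def toy_score(conformation):
    """Hypothetical score: number of residues outside a helical torsion window."""
    return sum(1 for phi, psi in conformation
               if not (-90.0 < phi < -30.0 and -75.0 < psi < -15.0))


def assemble(n_residues, fragment_library, score,
             n_steps=2000, temperature=0.5, seed=0):
    """Metropolis Monte Carlo over fragment insertions; returns (best model, best score)."""
    rng = random.Random(seed)
    positions = sorted(fragment_library)
    conformation = [STRAND] * n_residues  # start from an extended chain
    current = score(conformation)
    best, best_score = list(conformation), current
    for _ in range(n_steps):
        start = rng.choice(positions)
        fragment = rng.choice(fragment_library[start])
        # Replace the torsions in the chosen window with those of the fragment.
        trial = (conformation[:start] + list(fragment)
                 + conformation[start + len(fragment):])
        delta = score(trial) - current
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            conformation, current = trial, current + delta
            if current < best_score:
                best, best_score = list(conformation), current
    return best, best_score


N_RESIDUES = 9
LIBRARY = {start: [[HELIX] * 3, [STRAND] * 3] for start in range(N_RESIDUES - 2)}
model, model_score = assemble(N_RESIDUES, LIBRARY, toy_score)
```

Because the search only ever inserts fragments taken from the library, the local structure of every trial model is restricted to conformations the library contains, while the score decides how the fragments are combined.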

CASP Experiments

One lab dominated in CASP4

One group dominates the ab initio (knowledge-based) prediction

Some CASP4 successes

Baker's group

Ab initio structure prediction server

Science 2003

A computer-designed protein (93 aa) whose crystal structure agrees with the design model to about 1.2 Å RMSD

Structure prediction servers

http://bioinfo.pl/cafasp/list.html

Thank You!