1
Molecular Modeling Methods &
Ab Initio
Protein Structure Prediction
By Haiyan Jiang
Oct. 16, 2006
2
About me
2003, Ph.D. in Computational Chemistry, University of Science and Technology of China
Research: New algorithms in molecular structure optimization
2004~2006, Postdoc, Computational Biology, Dalhousie University
Research: Protein loop structure and the evolution of protein domains
3
Publications
Haiyan Jiang, Christian Blouin, Ab Initio Construction of All-atom Loop Conformations, Journal of Molecular Modeling, 2006, 12, 221-228.
Ferhan Siddiqi, Jennifer R. Bourque, Haiyan Jiang, Marieke Gardner, Martin St. Maurice, Christian Blouin, and Stephen L. Bearne, Perturbing the Hydrophobic Pocket of Mandelate Racemase to Probe Phenyl Motion During Catalysis, Biochemistry, 2005, 44, 9013-9021. (Responsible for building the simulation model and performing molecular dynamics study)
Yuhong Xiang, Haiyan Jiang, Wensheng Cai, and Xueguang Shao, An Efficient Method Based on Lattice Construction and the Genetic Algorithm for Optimization of Large Lennard-Jones Clusters, Journal of Physical Chemistry A, 2004, 108, 3586-3592.
Xueguang Shao, Haiyan Jiang, Wensheng Cai, Parallel Random Tunneling Algorithm for Structural Optimization of Lennard-Jones Clusters up to N=330, Journal of Chemical Information and Computer Sciences, 2004, 44, 193-199.
4
Publications
Haiyan Jiang, Wensheng Cai, Xueguang Shao, New Lowest Energy Sequence of Marks’ Decahedral Lennard-Jones Clusters Containing up to 10000 atoms, Journal of Physical Chemistry A, 2003, 107, 4238-4243.
Wensheng Cai, Haiyan Jiang, Xueguang Shao, Global Optimization of Lennard-Jones Clusters by a Parallel Fast Annealing Evolutionary Algorithm, Journal of Chemical Information and Computer Sciences, 2002, 42, 1099-1103.
Haiyan Jiang, Wensheng Cai, Xueguang Shao, A Random Tunneling Algorithm for Structural Optimization Problem, Physical Chemistry and Chemical Physics, 2002, 4, 4782-4788.
Xueguang Shao, Haiyan Jiang, Wensheng Cai, Advances in Biomolecular Computing, Progress in Chemistry (Chinese), 2002, 14, 37-46.
Haiyan Jiang, Longjiu Cheng, Wensheng Cai, Xueguang Shao, The Geometry Optimization of Argon Atom Clusters Using a Parallel Genetic Algorithm, Computers and Applied Chemistry (Chinese), 2002, 19, 9-12.
5
Unpublished work
Haiyan Jiang, Christian Blouin, The Emergence of Protein Novel Fold and Insertions: A Large Scale Structure-based Phylogenetic Study of Insertions in SCOP Families, Protein Science, 2006. (under review)
6
Contents
Molecular modeling methods and applications in ab initio protein structure prediction
Potential energy function
Energy Minimization
Monte Carlo
Molecular Dynamics
Ab initio protein loop modeling
Challenge
Recent progress
CLOOP
7
Molecular Modeling Methods
Molecular modeling methods are the theoretical methods and computational techniques used to simulate the behavior of molecules and molecular systems.
Molecular Forcefields
Conformational Search methods
Energy Minimization
Molecular Dynamics
Monte Carlo simulation
Genetic Algorithm
8
Ab Initio
Protein Structure Prediction
Ab initio protein structure prediction methods build protein 3D structures from sequence based on physical principles.
Importance
Ab initio methods are important even though they are computationally demanding.
Because they predict protein structure from physical models, they are an indispensable complement to knowledge-based approaches.
E.g., knowledge-based approaches fail under the following conditions:
Structural homologues are not available
A possibly undiscovered new fold exists
9
Applications of MM in Ab Initio PSP
Basic idea
Anfinsen’s theory: The native structure of a protein corresponds to the state with the lowest free energy of the protein-solvent system.
General procedures
Potential function
Evaluate the energy of each protein conformation
Select the native structure
Conformational search algorithm
Produce new conformations
Search the potential energy surface and locate the global minimum (native conformation)
10
Protein Folding Funnel
Local minima
Global minimum
Native Structure
11
Potential Functions for PSP
Potential function
Physics-based energy function
Empirical all-atom forcefields: CHARMM, AMBER, ECEPP-3, GROMOS, OPLS
Parameterization: quantum mechanical calculations, experimental data
Simplified potential: UNRES (united residue)
Solvation energy
Implicit solvation model: Generalized Born (GB) model, surface-area-based model
Explicit solvation model: TIP3P (computationally expensive)
12
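General Form of All-atom Forcefields
$$
V_{\text{total}} = \sum_{\text{bonds}} K_b (r - r_0)^2
+ \sum_{\text{angles}} K_\theta (\theta - \theta_0)^2
+ \sum_{\text{dihedrals}} K_\phi \left[ 1 + \cos(n\phi - \gamma) \right]
+ \sum_{\text{Hbonds}} \left[ \frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}} \right]
+ \sum_{\text{van der Waals pairs},\, i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right]
+ \sum_{\text{electrostatic pairs},\, i<j} \frac{q_i q_j}{\varepsilon\, r_{ij}}
$$
Terms, in order: bond stretching term, angle bending term, dihedral term, H-bonding term, van der Waals term, electrostatic term.
The non-bonded pair terms are the most time-demanding part.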
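A minimal sketch of why the pair terms dominate the cost: the van der Waals and electrostatic sums run over all atom pairs, so their cost grows as N(N-1)/2. The function names, the single `A`/`B` coefficient pair shared by all atoms, and the dielectric constant `eps` are simplifying assumptions for illustration, not the parameterization of any real forcefield.

```python
import math
from itertools import combinations


def nonbonded_energy(coords, charges, A, B, eps=1.0):
    """Return (E_vdw, E_elec) summed over all atom pairs i < j.

    E_vdw  = sum A / r^12 - B / r^6      (one A, B for every pair: an assumption)
    E_elec = sum q_i q_j / (eps * r)
    """
    e_vdw = 0.0
    e_elec = 0.0
    for i, j in combinations(range(len(coords)), 2):
        r = math.dist(coords[i], coords[j])
        e_vdw += A / r**12 - B / r**6
        e_elec += charges[i] * charges[j] / (eps * r)
    return e_vdw, e_elec


def n_pairs(n_atoms):
    """Number of pair interactions evaluated for n_atoms atoms."""
    return n_atoms * (n_atoms - 1) // 2
```

For example, `n_pairs(1000)` gives 499500 pair evaluations per energy call, whereas the number of bond, angle and dihedral terms grows only linearly with the number of atoms.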
13
Search Potential Energy Surface
We are interested in minimum points on Potential Energy Surface (PES)
Conformational search techniques
Energy Minimization
Monte Carlo
Molecular Dynamics
Others: Genetic Algorithm,
Simulated Annealing
14
Energy Minimization
Energy minimization
Methods
First-order minimization: steepest descent, conjugate gradient minimization
Second-derivative methods: Newton-Raphson method
Quasi-Newton methods: L-BFGS
Local minimum
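A minimal steepest-descent sketch showing that energy minimization only reaches the nearest local minimum. The asymmetric double-well test function, the adaptive step rule and all names here are illustrative assumptions, not taken from any particular package.

```python
def steepest_descent(energy, gradient, x0, step=0.1, tol=1e-6, max_iter=10000):
    """Follow the negative gradient downhill until the gradient norm is below tol."""
    x = list(x0)
    e = energy(x)
    for _ in range(max_iter):
        g = gradient(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        trial = [xi - step * gi for xi, gi in zip(x, g)]
        e_trial = energy(trial)
        if e_trial < e:
            # Downhill move: accept it and try a slightly larger step next time.
            x, e = trial, e_trial
            step *= 1.2
        else:
            # Overshoot: reject the move and shrink the step.
            step *= 0.5
    return x, e


def double_well(x):
    """Toy 1D energy with a local minimum near x = 0.96 and a lower one near x = -1.04."""
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]


def double_well_grad(x):
    return [4.0 * x[0] * (x[0] ** 2 - 1.0) + 0.3]


x_right, e_right = steepest_descent(double_well, double_well_grad, [0.5])
x_left, e_left = steepest_descent(double_well, double_well_grad, [-0.5])
```

Starting at 0.5 the search stops in the higher-energy well on the right; starting at -0.5 it finds the lower well. The result depends on the starting point, which is why minimization is combined with a conformational search.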
15
Monte Carlo
Monte Carlo
In molecular simulations, ‘Monte Carlo’ is an importance sampling technique.
1. Make a random move and produce a new conformation
2. Calculate the energy change ΔE for the new conformation
3. Accept or reject the move based on the Metropolis criterion
$$P = \exp\left(-\frac{\Delta E}{kT}\right)$$
Boltzmann factor
If ΔE < 0, then P > 1: accept the new conformation.
Otherwise, accept if P > rand(0,1), else reject.
16
Monte Carlo
Monte Carlo (MC) algorithm
Generate initial structure R and calculate E(R);
Modify structure R to R' and calculate E(R');
Calculate ΔE = E(R') - E(R);
IF ΔE < 0, then R ← R';
ELSE
    Generate random number RAND = rand(0,1);
    IF exp(-ΔE/kT) > RAND, then R ← R';
    ENDIF
ENDIF
Repeat for N steps;
Monte Carlo Minimization (MCM) algorithm
Parallel Replica Exchange Monte Carlo algorithm
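The basic Metropolis MC loop can be sketched in a few lines. This is a toy version on a one-dimensional energy function; the move size, the temperature `kT`, the fixed seed and the double-well test function are assumptions chosen only for illustration.

```python
import math
import random


def metropolis_mc(energy, x0, n_steps, kT, max_move=0.5, seed=0):
    """Metropolis Monte Carlo on a scalar coordinate.

    Returns the lowest-energy point visited, its energy, and the acceptance ratio.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    n_accepted = 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_move, max_move)   # random move R -> R'
        delta_e = energy(x_new) - e                    # dE = E(R') - E(R)
        # Downhill moves are always accepted; uphill moves with probability exp(-dE/kT).
        if delta_e < 0 or math.exp(-delta_e / kT) > rng.random():
            x, e = x_new, e + delta_e
            n_accepted += 1
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e, n_accepted / n_steps


def double_well(x):
    """Toy energy: local minimum near x = 0.96, global minimum near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x


best_x, best_e, acceptance = metropolis_mc(double_well, x0=1.0, n_steps=20000, kT=0.3)
```

Because uphill moves are sometimes accepted, the walk started in the higher well at x = 1.0 can cross the barrier and visit the lower well, which a pure minimizer started at the same point would not do.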
17
Molecular Dynamics
Molecular Dynamics (MD)
MD simulates the movements of all the particles in a molecular system by iteratively solving Newton’s equations of motion.
MC is like viewing many frozen butterflies in a museum; MD is like watching the butterfly fly.
18
Molecular Dynamics
Algorithm
For atom i, Newton’s equation of motion is given by
$$\mathbf{F}_i = m_i \mathbf{a}_i \qquad (1)$$
$$\mathbf{F}_i(t) = m_i \frac{\mathrm{d}^2 \mathbf{r}_i(t)}{\mathrm{d}t^2} \qquad (2)$$
Here, r_i and m_i represent the position and mass of atom i, and F_i(t) is the force on atom i at time t.
F_i(t) can also be expressed as the gradient of the potential energy
$$\mathbf{F}_i = -\nabla_i V \qquad (3)$$
V is the potential energy. Newton’s equation of motion can then relate the derivative of the potential energy to the changes in position as a function of time.
$$-\nabla_i V = m_i \frac{\mathrm{d}^2 \mathbf{r}_i}{\mathrm{d}t^2} \qquad (4)$$
19
Molecular Dynamics
Algorithm (continued)
To obtain the trajectory of each atom, numerous numerical algorithms have been developed for integrating the equations of motion (e.g. the Verlet algorithm and the leap-frog algorithm).
Verlet algorithm
The algorithm uses the positions and accelerations at time t, and the positions from the previous step, r(t - δt), to calculate the new positions:
$$\mathbf{r}(t + \delta t) = 2\,\mathbf{r}(t) - \mathbf{r}(t - \delta t) + \delta t^{2}\,\mathbf{a}(t)$$
Selection of time step
The time step is approximately one order of magnitude smaller than the period of the fastest motion
Hydrogen vibration ~ 10 fs (1 fs = 10^-15 s), time step = 1 fs
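A minimal sketch of the Verlet update, r(t + δt) = 2 r(t) - r(t - δt) + δt² a(t), applied to a one-dimensional harmonic oscillator. The oscillator, the unit mass and force constant, and the Taylor-expansion start-up step for r(-δt) are assumptions for illustration; a real MD code applies the same update to every atomic coordinate, with a(t) obtained from the forcefield gradient.

```python
import math


def verlet_trajectory(accel, r0, v0, dt, n_steps):
    """Integrate one coordinate with the position Verlet scheme.

    accel(r) returns the acceleration F(r)/m at position r.
    """
    # Start-up: Verlet needs r(t - dt); estimate it from a Taylor expansion.
    r_prev = r0 - v0 * dt + 0.5 * accel(r0) * dt * dt
    r = r0
    trajectory = [r0]
    for _ in range(n_steps):
        r_next = 2.0 * r - r_prev + accel(r) * dt * dt
        r_prev, r = r, r_next
        trajectory.append(r)
    return trajectory


# Harmonic oscillator with unit mass and force constant: a = -r, period = 2*pi.
dt = 0.01  # far below one tenth of the period, following the time-step rule above
traj = verlet_trajectory(lambda r: -r, r0=1.0, v0=0.0, dt=dt, n_steps=628)
```

After 628 steps (t = 6.28, about one period) the coordinate is back close to its starting value and tracks the analytical solution cos(t) closely, which illustrates the good long-time stability of the scheme at a sufficiently small time step.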
20
Molecular Dynamics
MD Software
CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a program for macromolecular simulations, including energy minimization, molecular dynamics and Monte Carlo simulations.
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
http://www.ks.uiuc.edu/Research/namd/
Application in PSP
Advantage: Deterministic; provides details of the folding process
Limitation: Protein folding reactions take place at the μs level, which is at the limit of accessible simulation times
It is still difficult to simulate the whole folding process of a protein using the conventional MD method.
21
Time Scales of Protein Motions and MD
[Figure: time axis in seconds with ticks at 10^-15 (fs), 10^-12 (ps), 10^-9 (ns), 10^-6 (μs), 10^-3 (ms) and 10^0 (s), marking the MD time scale against bond stretching, elastic vibrations of proteins, α-helix folding, β-hairpin folding and protein folding.]
22
MD is fun!
A small protein folding movie, simulated with NAMD/VMD
23
Other Conformational Search Algorithms
Global optimization algorithms
“Optimization” refers to trying to find the global energy minimum of a potential surface.
Genetic Algorithm (GA)
Simulated Annealing (SA)
Tabu Search (TS)
Ant Colony Optimization (ACO)
A model system: Lennard-Jones clusters
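For reference, the energy of a Lennard-Jones cluster is a sum of pair terms, which makes it a convenient test problem for global optimization algorithms. The sketch below uses reduced units (ε = σ = 1 by default); the function name and the small test geometries are illustrative choices.

```python
import math
from itertools import combinations


def lj_energy(coords, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy: sum over pairs of 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    total = 0.0
    for a, b in combinations(coords, 2):
        sr6 = (sigma / math.dist(a, b)) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total


# Pair equilibrium distance in reduced units, where each pair contributes -epsilon.
R_EQ = 2.0 ** (1.0 / 6.0)

dimer = [(0.0, 0.0, 0.0), (R_EQ, 0.0, 0.0)]

# Regular tetrahedron with edge length R_EQ (the base vertices have edge 2*sqrt(2)).
s = R_EQ / (2.0 * math.sqrt(2.0))
tetrahedron = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
```

In these geometries every pair sits at the pair minimum, so the dimer has energy -1 and the tetrahedron (6 pairs) has energy -6. For larger clusters not all pairs can be at the optimal distance at once, and the number of local minima grows rapidly with cluster size, which is what makes the problem hard.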
24
Applications of MM methods in PSP
Application in PSP
Combination of several conformational search techniques
Recent developments
Simplified force field: united residue force field
Segment assembly
Secondary structure predictions are quite reliable, so conformations can be produced by assembling the segments
Ab initio PSP software
Rosetta is a five-stage fragment insertion Metropolis Monte Carlo method
ASTRO-FOLD is a combination of the deterministic αBB global optimization algorithm and a molecular dynamics approach in torsion angle space
LINUS uses a Metropolis Monte Carlo algorithm and a simplified physics-based force field
25
ASTRO-FOLD
26
References
Hardin C, et al. Ab initio protein structure prediction. Curr Opin Struct Biol. 2002, 12(2): 176-81.
Floudas CA, et al. Advances in protein structure prediction and de novo protein design: A review. Chemical Engineering Science, 2006, 61: 966-988.
Klepeis JL, Floudas CA. ASTRO-FOLD: a combinatorial and global optimization framework for ab initio prediction of three dimensional structures of proteins from the amino acid sequence. Biophysical Journal, 2003, 85: 2119-2146.
27
Ab Initio
Protein Loop Prediction
Protein loop
Protein loops are polypeptide segments connecting more rigid structural elements of proteins, such as helices and strands.
Challenge in Loop Structure Prediction
Loops are important to protein folding and protein function even though they are small, usually <20 residues
Loops exhibit greater structural variability than helices and strands
Loop prediction is often a limiting factor for fold recognition methods
28
Ab Initio
Protein Loop Prediction
Ab initio methods have recently received increased attention in the prediction of protein loops
Potential energy function
A molecular mechanics force field is usually better than a statistical potential in protein loop modeling.
Recent progress
Dihedral angle sampling
Clustering
Select representative structures from ensembles
29
Ab Initio
Loop Prediction Methods
Loopy
Random tweak
Colony energy
Fiser’s method
MM methods:
Physical energy function
Energy Minimization + MD + SA
Forrest & Woolf
Predict membrane protein loops
MM methods: MC + MD
Review:
Floudas C.A. et al., Advances in protein structure prediction and de novo protein design: A review, Chemical Engineering Science, 2006, vol. 61, 966-988.
30
CLOOP: Ab Initio Loop Modeling Method
CLOOP builds an all-atom ensemble of protein loop conformations (it is not a real protein loop prediction method)
Paper
Haiyan Jiang, Christian Blouin, Ab Initio Construction of All-atom Loop Conformations, Journal of Molecular Modeling, 2006, 12, 221-228.
CLOOP methods
Energy function: CHARMM
Dihedral sampling
Potential smoothing technique
The designed minimization (DM) strategy
Divided loop conformation construction
31
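The Energy Function of the CHARMM Forcefield
$$E_{\text{CHARMM}} = E_{\text{bonds}} + E_{\text{UB}} + E_{\text{angle}} + E_{\text{dihe}} + E_{\text{imp}} + E_{\text{vdw}} + E_{\text{elec}}$$
$$E_{\text{bonds}} = \sum_{\text{bonds}} k_b (b - b_0)^2$$
$$E_{\text{UB}} = \sum_{\text{UB}} k_{\text{UB}} (S - S_0)^2$$
$$E_{\text{angle}} = \sum_{\text{angle}} k_\theta (\theta - \theta_0)^2$$
$$E_{\text{dihe}} = \sum_{\text{dihe}} k_\phi \left( 1 + \cos(n\phi - \delta) \right)$$
$$E_{\text{imp}} = \sum_{\text{imp}} k_{\text{imp}} (\omega - \omega_0)^2$$
$$E_{\text{vdw}} = \sum_{\text{nonbond}} \varepsilon_{ij} \left[ \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{12} - 2 \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{6} \right]$$
$$E_{\text{elec}} = \sum_{\text{nonbond}} \frac{q_i q_j}{4 \pi \varepsilon_0\, r_{ij}}$$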
32
CLOOP
Dihedral sampling
Loop main-chain dihedrals φ and ψ are generated by sampling main-chain dihedral angles from a restrained φ/ψ set
The restrained dihedral range has 11 pairs of φ/ψ dihedral sub-ranges. It was obtained by adding a 100 degree variation to each state of the 11-state φ/ψ set developed by Moult and James for loop modeling.
Side chain conformations are built randomly.
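A rough sketch of sampling main-chain dihedrals from a set of restrained φ/ψ sub-ranges. The slides do not list the 11 states, so the state centers below are hypothetical placeholders, and the half-width of 50 degrees is only one possible reading of the "100 degree variation"; neither should be taken as the CLOOP parameters.

```python
import random

# Hypothetical (phi, psi) state centers in degrees, for illustration only.
# The actual 11-state set is not given in these slides.
EXAMPLE_STATES = [(-60.0, -40.0), (-120.0, 130.0), (60.0, 40.0)]


def wrap(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def sample_backbone(n_residues, states=EXAMPLE_STATES, half_width=50.0, rng=None):
    """Draw one (phi, psi) pair per residue from a randomly chosen sub-range."""
    rng = rng or random.Random()
    dihedrals = []
    for _ in range(n_residues):
        phi0, psi0 = rng.choice(states)
        phi = wrap(phi0 + rng.uniform(-half_width, half_width))
        psi = wrap(psi0 + rng.uniform(-half_width, half_width))
        dihedrals.append((phi, psi))
    return dihedrals


loop_dihedrals = sample_backbone(8, rng=random.Random(1))
```

Restricting the draw to sub-ranges around known states keeps the initial conformations in plausible backbone regions while still leaving room for variation, so fewer samples are wasted before minimization.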
33
CLOOP
Potential smoothing technique
A soft core potential provided in the CHARMM software package was applied to smooth non-bonded interactions
$$
E_{\text{nonbonded}} =
\begin{cases}
E^{\text{CHARMM}}_{\text{nonbonded}}, & r \ge r_{\text{soft}} \\
k\,(r_{\text{soft}} - r) + E^{\text{CHARMM}}_{\text{nonbonded}}(r_{\text{soft}}), & r < r_{\text{soft}}
\end{cases}
$$
r_soft is the switching distance for the soft core potential
r is the distance between the two interacting atoms
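A small sketch of the soft core idea for a single van der Waals pair: below the switching distance the steep repulsive wall is replaced by a straight line, so overlapping atoms in a freshly built loop give a large but finite energy. It assumes the linear form E = k (r_soft - r) + E(r_soft) for r < r_soft, and it picks k so that the slope is continuous at r_soft; that choice of k, the default r_soft and the reduced units are assumptions, not CHARMM defaults.

```python
def lj(r, eps=1.0, r_min=1.0):
    """CHARMM-style 12-6 term: eps * [(r_min/r)^12 - 2 (r_min/r)^6]."""
    x6 = (r_min / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)


def soft_core(r, r_soft=0.8, k=None, eps=1.0, r_min=1.0):
    """Unmodified 12-6 energy at or beyond r_soft, linear ramp below it."""
    if r >= r_soft:
        return lj(r, eps, r_min)
    if k is None:
        # Assumption: match the slope at r_soft, i.e. k = -dE/dr evaluated at r_soft.
        k = 12.0 * eps / r_soft * ((r_min / r_soft) ** 12 - (r_min / r_soft) ** 6)
    return k * (r_soft - r) + lj(r_soft, eps, r_min)
```

With these defaults `soft_core(0.01)` is of order 10^2, while the unmodified `lj(0.01)` is of order 10^24, so a minimizer can push clashing atoms apart instead of failing on a numerically huge gradient.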
34
CLOOP
The designed minimization (DM) strategy
Minimization methods: steepest descent, conjugate gradient, and adopted basis Newton-Raphson minimization
Two stages:
1. Minimize the internal energy terms of the loop conformations, including bond, angle, dihedral, and improper
2. Further minimize the candidates with the full CHARMM energy function, including the van der Waals and electrostatic energy terms
35
CLOOP
Divided loop conformation construction
Generate position of middle residue
Build initial conformation of main chain with dihedral sampling
Build side chain conformation
Run DM and produce closed loop conformation
36
CLOOP
Performance of CLOOP
CLOOP was applied to construct the conformations of 4-, 8-, and 12-residue long loops in Fiser’s loop test set. The average main-chain root mean square deviations (RMSD) obtained in 1000 trials for the 10 different loops of each size are 0.33, 1.27 and 2.77 Å, respectively.
The performance of CLOOP was investigated in two ways. One is to calculate the loop energy with a buffer region, and the other is loop only. The buffer region included a region extending up to 10 Å around the loop atoms. In energy minimization, only the loop atoms were allowed to move, and all non-loop atoms, including those in the buffer region, were fixed.
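For readers unfamiliar with the metric, a minimal RMSD sketch follows. It assumes the two coordinate sets list the same main-chain atoms in the same order and are already in a common reference frame; the slides do not say how the structures were superimposed, so no fitting step is included here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in native]
```

Here `rmsd(native, native)` is 0 and `rmsd(native, shifted)` is 1.0, since every atom is displaced by exactly 1 Å.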
37
Loop Conformations built by CLOOP
a. 1gpr_123-126
b. 135l_84-91
c. 1pmy_77-88
38
Performance of CLOOP
39
Conclusion
CLOOP can be applied to build a good all-atom conformation ensemble of loops with size up to 12 residues.
Good efficiency: CLOOP is faster than RAPPER.
The contribution of the protein to which a loop is attached (i.e. the ‘buffer region’) facilitates the discrimination of near-optimal loop structures.
The soft core potentials and the DM strategy are effective techniques in building loop conformations.
40
Thanks!