Basics of molecular dynamics simulations
Reduction of the quantum problem to a classical one
Parameterization of force fields
Boundaries, energy minimization, integration
algorithms
Analysis of trajectory data
Lecture notes are at :
www.physics.usyd.edu.au/~serdar/ws
1
Quantum, classical and stochastic description
of molecular systems
Quantum mechanics (
Schroedinger
equation)
The most fundamental approach but feasible only for few atoms (~10).
Approximate methods (e.g. density functional theory) allows treatment of
larger systems (~1000) and dynamic simulations for picoseconds.
Classical mechanics (Newton’s equation of motion)
Most atoms are heavy enough to justify a classical treatment (except H).
The main problem is finding accurate potential functions (force fields).
MD simulation of 100,000 atoms for many nanoseconds is now feasible.
Stochastic mechanics (
Langevin
equation)
Most biological processes occur in the range of
microsec
to
millisec
.
Thus to describe such processes, a simpler (coarse

grained)
representation of atomic system is essential (e.g. Brownian dynamics).
2
,
2
2
2
2
2
2
2
2
2
,
,
i
i
i
ne
e
j
i
j
i
j
i
i
i
i
n
i
i
ne
e
n
e
z
U
e
m
H
e
z
z
M
H
E
U
H
H
r
R
r
r
R
R
r
R
r
R
Many

body Schroedinger eq. for a molecular system
Here m and M
i
are the mass of the electrons and nuclei,
r
and R
i
denote the electronic and nuclear coordinates,
and
and
i
denote the respective gradients.
nuclear hamiltonian
electronic hamiltonian
elect

nucl. interact.
(1)
3
r
R
R
r
R
,
,
i
e
i
n
i
r
R
R
r
R
R
,
,
i
e
i
n
i
e
i
n
ne
e
n
E
U
H
H
Separation of the electronic wave function
Nuclei are much heavier and hence move much slower than electrons.
This allows decoupling of their motion from those of electrons.
Introduce the product wave function:
Substituting this ansatz in the Schroedinger eq. gives
For fixed nuclei, the electronic part gives
r
R
R
r
R
,
,
i
e
i
e
i
e
ne
e
E
U
H
(2)
4
r
R
R
r
R
R
R
,
,
i
e
i
n
i
e
i
n
i
e
n
E
E
H
Substitute the electronic part back in the Schroedinger eq.
Eqs. (2, 3) need to be solved simultaneously, which is a formidable
problem for most systems. For two nuclei, there is only one coordinate
for R (the distance), so it is feasible. But for three

nuclei, there are
4 coordinates (in general for N nuclei, 3N

5 coordinates are required),
which makes numerical solution very difficult.
Born

Oppenheimer (adiabatic) approximation consists of neglecting
the cross terms arising from
(which are of order m/M), so that the nuclear part becomes
r
R
,
i
e
n
H
i
n
i
n
i
e
n
E
E
H
R
R
R
(3)
5
i
e
j
i
j
i
j
i
i
i
i
i
i
E
e
z
z
U
N
i
U
dt
d
M
R
R
R
R
R
R
2
2
2
,
,
1
Classical approximation for nuclear motion
Nuclei are heavy so their motion can be described classically, that is,
instead of solving the Schroedinger Eq. (3), we solve the corresponding
Newton’s eq. of motion
At zero temperature, the potential can be minimized with respect to the
nuclear coordinates to find the equilibrium conformation of molecules.
At finite temperature, Eqs. (2) and (4) form the basis of ab initio MD
(ignores quantum effects in nuclear motion and electronic exc. at finite T)
(4)
6
Methods of solution for the electronic equation
Two basic methods of solution:
1.
Hartree

Fock (HF) based methods: HF is a mean field theory.
One finds the average, self

consistent potential in which electrons move.
Electron correlations are taken into account using various methods.
2. Density functional theory: Solves for the density of electrons.
Better scaling than HF (which is limited to ~10 atoms); 100’s of atoms.
Car

Parrinello MD (DFT+MD) has become popular in recent years.
(2)
r
R
R
r
R
r
R
r
r
,
,
2
,
2
2
2
2
i
e
i
e
i
e
i
i
i
i
E
e
z
e
m
Electronic part of the Schroedinger eq. (2) has the form
7
e
e
n
)
(
)
(
r
r
r
)]
(
[
'
'
)
'
(
)
(
2
)]
(
[
)
(
)
(
3
3
2
3
2
,
2
r
r
r
r
r
r
r
r
R
r
r
r
R
n
E
r
rd
d
n
n
e
n
T
H
r
d
n
e
z
e
z
U
U
H
E
xc
e
i
i
i
e
i
i
i
e
ne
e
ne
e
e
e
Density functional theory (DFT)
In DFT one uses the density function of electrons, n(r) instead of
their wavefunction
Expectation value of the electronic energy in the ground state becomes
8
N
i
dt
d
M
N
i
j
ij
i
i
i
,
,
1
,
2
2
F
F
r
Classical mechanics
Molecular dynamics (MD) is the most popular method for simulation
studies of biomolecules. It is based on Newton’s equation of motion.
For N interacting atoms, one needs to solve N coupled Diff. Eq’s:
Force fields are determined from experiments and ab initio methods.
Analytically this is an intractable problem for N>2,
but we can solve it easily on a computer using numerical methods.
Current computers can handle N ~ 10
5

6
particles, which is large
enough for description of most biomolecules.
Integration time, however is still a bottleneck (10
6
steps @ 1 fs = 1 ns)
9
i
i
i
i
i
i
i
m
dt
d
m
R
v
F
r
2
2
Stochastic mechanics
In order to deal with the time bottle

neck in MD, one has to simplify
the simulation system (coarse graining). This is achieved by
describing parts of the system as continuum with dielectric constants.
Examples:
•
transport of ions in electrolyte solutions (water
→
continuum)
•
protein folding (water
→
continuum)
•
function of transmembrane proteins (lipid, water
→
continuum)
To include the effect of the atoms in the continuum, modify Newton’s
eq. of motion by adding frictional and random forces:
Langevin equation
10
Molecular dynamics (MD) simulations
MD programs consist of over 100,000 lines of coding. To understand
what is going on in an MD simulation, we need to consider several
topics:
1.
Force fields (potential functions among atoms)
2.
Boundaries and computation of long

range interactions
3.
Molecular mechanics (T=0, energy minimization)
4.
MD simulations (integration algorithms, ensembles)
5.
Analysis of trajectory data to characterize structure and
function of proteins
11
Empirical force fields (potential functions)
Problem: Reduce the ab initio potential energy surface among the atoms
j
i
k
j
i
k
j
i
j
i
i
U
U
U
)
,
,
(
)
,
(
3
2
R
R
R
R
R
R
To a classical many

body interaction in the form
i
e
j
i
j
i
j
i
i
E
e
z
z
U
R
R
R
R
2
Such a program has been tried for water but so far it has failed to
produce a classical potential that works. In strongly interacting systems,
it is difficult to describe the collective effects by summing up the
many

body terms.
Practical solution: Truncate the series at the two

body level and assume
(hope!) that the effects of higher order terms can be absorbed in U
2
12
Non

bonded interactions
Interaction of two atoms at a distance R = 
R
i

R
j
 can be decomposed
into 4 pieces
1.
Coulomb potential
2.
Induced polarization
3.
Attractive dispersion (van
der
Waals) (1/R
6
)
4.
Short range repulsion (e
R/a
)
The last two terms are combined into a 6

12
Lennard

Jones potential
Combination rules for different atoms:
6
12
4
R
R
U
LJ
2
/
)
(
j
i
ij
j
i
ij
13
R
q
q
U
j
i
0
4
Coul
12

6 Lennard

Jones potential (U is in kT, r in Å)
3
3.5
4
4.5
5
0.5
0
0.5
1
1.5
2
2.5
r
U(r)
A
kT
3
,
4
1
14
Because the polarization interaction is many

body and requires iterations,
it has been neglected in current generations of force fields.
The non

bonded interactions are thus represented by the Coulomb and
12

6 LJ potentials.
Model
R
O

H
(Å)
q
HOH
q
H
(e)
(kT)
(Å)
m
(D)
eps (T=298)
SPC
1.0
109.5
0.410
0.262
3.166
2.27
65
±
5
TIP3P
0.957
104.5
0.417
0.257
3.151
2.35
97
±
7
Exp.
1.86
80
SPC: simple point charge
TIP3P: transferable intermolecular potential with 3 points
Popular water models (rigid)
15
Covalent bonds
In molecules, atoms are bonded together with covalent bonds, which
arise from sharing of electrons in partially occupied orbitals. If the
bonds are very strong, the molecule can be treated as rigid as in
water molecule. In most large molecules, however, the structure is
quite flexible and this must be taken into account for a proper
description of the molecules. This is literally done by replacing the
bonds by springs. The nearest neighbour interactions involving 2, 3
and 4 atoms are described by harmonic and periodic potentials
0
2
0
2
0
cos
1
2
2
q
q
q
ijkl
ijkl
ijkl
ijk
ijk
ijk
ij
ij
r
ij
bond
n
V
k
r
r
k
U
bond stretching
bending
torsion (not very good)
16
Interaction of two H atoms in the ground (1s) state can be described
using Linear Combinations of Atomic Orbitals (LCAO)
)
1
(
)
1
(
2
1
s
c
s
c
B
A
)
1
(
)
1
(
2
1
s
s
B
A
From symmetry, two solutions with lowest and highest energies are:
+ Symmetric

Anti

symmetric
17
The R dependence of the potential energy is approximately given by
the Morse potential
2
)
(
0
1
R
R
e
e
D
U
Morse
2
0
)
(
2
1
R
R
k
U
bond
where
D
e
: dissociation energy
R
0
: equilibrium bond distance
Controls the width of the potential
Classical representation:
e
D
k
2
/
k
18
An in

depth look at the force fields
Three force fields, constructed in the eighties, have come to dominate
the MD simulations
1.
CHARMM (Karplus @ Harvard)
Optimized for proteins, works well also for lipids and nucleic acids
2.
AMBER (Kollman & Case @ UCSF)
Optimized for nucleic acids, otherwise quite similar to CHARMM
3. GROMOS (Berendsen & van Gunsteren @ Groningen)
Optimized for lipids and does not work very well for proteins
(due to smaller partial charges in the carbonyl and amide groups)
The first two use the TIP3P water model and the last one, SPC model.
They all ignore the polarization interaction (polarizable versions are
under construction)
19
Charm parameters for alanine
Partial charge (e)
ATOM N

0.47

ATOM HN 0.31 HN
—
N
ATOM CA 0.07
 HB1
ATOM HA 0.09

/
GROUP
HA
—
CA
—
CB
—
HB2
ATOM CB

0.27

\
ATOM HB1 0.09
 HB3
ATOM HB2 0.09
O==C
ATOM HB3 0.09

ATOM C 0.51
ATOM O

0.51
Total charge: 0.00
20
Bond lengths :
k
r
(kcal/mol/Å
2
)
r
0
(Å)
N
CA
320.
1.430
(1)
CA
C
250.
1.490
(2)
C
N
370.
1.345
(2)
(peptide bond)
O
C
620.
1.230
(2)
(double bond)
N
H
440.
0.997
(2)
HA
CA
330.
1.080
(2)
CB
CA
222.
1.538
(3)
HB
CB
322.
1.111
(3)
1.
NMA (N methyl acetamide) vibrational spectra
2.
Alanine dipeptide ab initio calculations
3.
From alkanes
2
0
)
(
r
r
k
U
r
bond
21
Bond angles :
k
q
(kcal/mol/rad
2
)
q
0
(deg)
C
N CA
50.
120.
(1)
C
N H
34.
123.
(1)
H
N CA
35.
117.
(1)
N
CA C
50.
107.
(2)
N
CA CB
70.
113.5
(2)
N
CA HA
48.
108.
(2)
HA CA CB
35.
111.
(2)
HA CA C
50.
109.5
(2)
CB
CA C
52.
108.
(2)
N
C CA
80.
116.5
(1)
O
C CA
80.
121.
(2)
O
C N
80.
122.5
(1)
2
0
)
(
q
q
q
k
U
angle
Total 360 deg.
Total 360 deg.
22
Basic dihedral configurations
trans
cis
Definition of the dihedral
angle for 4 atoms A

B

C

D
)
cos(
1
2
0
n
V
U
n
dihed
Dihedrals:
23
Dihedral parameters:
V
n
(kcal/mol)
n
0
(deg)
Name
C N CA C
0.2
1
180.
N CA C N
0.6
1 0.
CA C N CA
1.6
1 0.
w
H N CA CB
0.0
1 0.
H N CA HA
0.0
1 0.
C N CA CB
1.8
1 0.
CA C N H
2.5
2
180.
O C N H
2.5
2
180.
O C N CA
2.5
2
180.
)
cos(
1
2
0
n
V
U
n
dihed
24
Boundaries
In macroscopic systems, the effect of boundaries on the
dynamics of biomolecules is minimal. In MD simulations,
however, the system size is much smaller and one has to
worry about the boundary effects.
•
Using nothing is not realistic for bulk simulations.
•
Minimum requirement: water beyond the simulation box must be
treated using a continuum representation
•
Most common solution: periodic boundary conditions.
The simulation box is replicated in all directions just like in a
crystal. The cube and rectangular prism are the obvious choices
for a box shape.
25
Periodic boundary conditions in two dimensions: 8 nearest neighbours
Particles in the box freely move to the next box, which means when
one moves out another appears from the opposite side of the same
box. In 3

D, there are 26 nearest neighbours.
26
Treatment of long

range interactions
Problem: the number of non

bonded interactions grows as N
2
where N is
the number of particles. This is the main computational bottle neck
that limits the system size.
Because the Coulomb potential between two unit charges, has long range
[U=560 kT/r (Å)], use of any cutoff is problematic and should be
avoided.
All the modern MD codes use the Ewald sum method to treat the long

range interactions without cutoffs. Here
one replaces the point
charges with Gaussian distributions, which leads to a much faster
convergence of the sum in the reciprocal (Fourier) space.
The remaining part (point particle
–
Gaussian) falls exponentially in real
space, hence can be computed easily.
27
Molecular mechanics
Molecular mechanics deals with the static features of
biomolecular systems at T=0 K, that is, the energy is given
solely by the potential energy of the system.
Two important applications of molecular mechanics are:
1.
Energy minimization (geometry optimization):
Find the coordinates for the configuration that minimizes the
potential energy.
2.
Normal mode analysis:
Find the vibrational excitations of the system built on the absolute
minimum using the harmonic approximation.
28
Molecular dynamics
In MD simulations, one follows the trajectories of N particles
according to Newton’s equation of motion:
where U(
r
1
,…,
r
N
) is the potential energy function
consisting of bonded and non

bonded interactions.
We need to consider:
•
Integration algorithms, e.g., Verlet, Leap

frog
•
Initial conditions and choice of the time step
•
MD simulations at constant temperature and pressure
•
Constrained dynamics for rigid molecules (SHAKE)
)
,
,
(
,
2
2
N
i
i
i
i
i
i
U
dt
d
m
r
r
F
F
r
29
2
2
)
(
)
(
)
(
)
(
t
m
t
t
t
t
t
t
i
i
i
i
i
F
v
r
r
2
2
)
(
)
(
)
(
)
(
)
(
)
(
)
(
t
m
t
t
t
t
t
t
t
m
t
t
t
t
i
i
i
i
i
i
i
i
i
F
v
r
r
F
v
v
Integration algorithms
Given the position and velocities of N particles at time t, integration
of Newton’s equation of motion yields at t+
t
In the popular Verlet algorithm, one eliminates velocities using the
positions at t

t,
Adding eq. (*) yields:
2
)
(
)
(
)
(
2
)
(
t
m
t
t
t
t
t
t
i
i
i
i
i
F
r
r
r
(*)
30
t
t
t
t
t
t
t
m
t
t
t
t
t
t
t
i
i
i
i
i
i
i
i
)
2
/
(
)
(
)
(
)
(
)
2
/
(
)
(
)
(
2
v
r
r
F
v
r
r
t
t
t
t
t
t
t
m
t
t
t
t
t
i
i
i
i
i
i
i
)
2
/
(
)
(
)
(
)
(
)
2
/
(
)
2
/
(
v
r
r
F
v
v
In the Leap

frog algorithm, the positions and velocities are
calculated at different times separated by
t/2
To show its equivalence to the Verlet algorithm, consider
Subtracting the two equations yields the Verlet result.
If required, velocity
at t is obtained from:
)]
2
/
(
)
2
/
(
[
2
1
)
(
t
t
t
t
t
i
i
i
v
v
v
31
To iterate these equations, we need to specify the initial conditions.
•
The initial configuration of biomolecules can be taken from the
Protein Data Bank (PDB) (if available).
•
In addition, membrane proteins need to be embedded in a lipid
bilayer.
•
All the MD codes have facilities to hydrate a biomolecule, i.e., fill
the void in the simulation box with water molecules at the correct
density.
•
Ions can be added at random positions.
After energy minimization, these coordinates provide the positions at
t=0.
Initial velocities are sampled from a Maxwell

Boltzmann distribution:
kT
v
m
kT
m
v
P
ix
i
i
ix
2
exp
2
)
(
2
2
/
1
32
MD simulations at constant temperature and pressure
MD simulations can be performed in the NVE ensemble, where all 3
quantities are constant. Due to truncation errors, keeping the energy
constant in long simulations can be problematic. To avoid this
problem, the alternative NVT and NPT ensembles are employed. The
temperature of the system can be obtained from the average K. E.
Thus an obvious way to keep the temperature constant at T
0
is to
scale the velocities as:
)
(
),
(
)
(
0
t
T
T
t
v
t
v
i
i
NkT
K
2
3
Because K. E. has considerable fluctuations, this is a rather crude
method.
33
A better method which achieves the same result more smoothly is
the Berendsen method, where the atoms are weakly coupled an
external heat bath with the desired temperature T
0
If T(t) > T
0
, the coupling term is negative, which invokes a viscous
force slowing the velocity, and vice

versa for T(t) < T
0
Similarly in the NPT ensemble, the pressure can be kept constant
by simply scaling the volume. Again a better method (Langevin
piston), is to weakly couple the pressure difference to atoms using a
force as in above, which will maintain the pressure at the desired
value (~1 atm).
dt
d
t
T
T
m
dt
d
m
i
i
i
i
i
i
r
F
r
1
)
(
0
2
2
34
Analysis of trajectory data
1. Fundamental quantities
Total energy, temperature, pressure, volume (or density)
2. Structural quantities
Root Mean Square Deviation (RMSD)
Distribution functions (e.g., pair and radial)
Conformational analysis (e.g., Ramachandran plots, rotamers)
3. Dynamical quantities
Time correlation functions & transport coefficients
Free energy calculations using perturbation, umbrella sampling and
steered MD methods
35
Fundamental quantities
•
Total energy
: strictly conserved but due to numerical errors it may
drift.
•
Temperature
: would be constant in a macroscopic system
(fluctuations are proportional to 1/
N, hence negligible). But in a
small system,
they will fluctuate around a base line. Temperature
coupling ensures that the base line does not drift from the set
temperature.
•
Pressure
: similar to temperature, though fluctuations are much
larger compared to the set value of 1 atm.
•
Volume
(or density): fixed in the NVT ensemble but varies in NPT.
Therefore, it needs to be monitored during the initial equilibration
to make sure that the system has converged to the correct volume.
Relatively quick for globular proteins but may take a long time (~1
ns) for systems involving lipids (membrane proteins).
36
Root Mean Square Deviation
For an N

atom system, variance and RMSD at time t are defined as
Where
r
i
(0) are the reference coordinates. Usually they are taken
from the first frame in an MD simulation, though they can also be taken
from the PDB or a target structure. Because side chains in proteins are
different, it is common to use a restricted set of atoms such as
backbone or C
. RMSD is very useful in monitoring the approach to
equilibrium (typically, signalled by the appearance of a broad shoulder).
It is a good practice to keep monitoring RMSD during production runs
to ensure that the system stays near equilibrium.
RMSD
,
)
0
(
)
(
1
)
(
1
2
N
i
i
i
t
N
t
r
r
37
Evolution of the RMSD for the backbone atoms of ubiquitin
38
r
r
r
N
r
g
water
ion
2
4
)
(
)
(
Distribution functions
Pair distribution function g
ij
(r) gives the probability of finding a pair
of atoms (i, j) at a distance r. It is obtained by sampling the distance
r
ij
in MD simulations and placing each distance in an appropriate bin,
[r, r+
r]. Pair distribution functions are used in characterizing
correlations between pair of atoms, e.g., hydrogen bonds, cation

carbonyl oxygen. The peak gives the average distance and the width,
the strength of the interaction.
When the distribution is isotropic (e.g., ion

water in an electrolyte),
one samples all the atoms in the spherical shell, [r, r+
r]. Thus to
obtain the radial distribution function (RDF), the sampled number
needs to be normalized by the volume ~4
r
2
r (as well as density)
39
Conformational analysis
In proteins, the bond lengths and bond angles are more or less fixed.
Thus we are mainly interested in conformations of the torsional
angles. As the shape of a protein is determined by the backbone
atoms, the torsional angles,
(N−C
) and
(C−C
), are of particular
interest. These are conveniently analyzed using the Ramachandran
plots. Conformational changes in a protein during MD simulations can
be most easily revealed by plotting these torsional angles as a
function of time.
Also of interest are the torsional angles of the side chains,
c
1
, c
2
,
etc.
Each side chain has several energy minima (called rotamers), which
are separated by low

energy barriers (~ kT). Thus transitions
between different rotamers is readily achievable, and they could play
functional roles, especially for charged side chains.
40
Time correlation functions & transport coefficients
At room temperature, all the atoms in the simulation system are in a
constant motion characterized by their average kinetic energy:
(3/2)kT.
Free atoms or molecules diffuse according to the equation
Dt
t
2
6
)
(
r
dt
t
D
0
)
(
)
0
(
3
1
v
v
While this relationship can be used to determine the diffusion
coefficient, more robust results can be obtained using correlation
functions, e.g., D is related to the velocity autocorrelation function as
Similarly, conductance of charged particles is given by the Kubo formula
i
i
i
q
dt
t
VkT
v
J
J
J
,
)
(
)
0
(
3
1
0
(current)
41
2
)
0
(
)
(
)
0
(
)
(
t
t
C
Bound atoms or groups of atoms fluctuate around a mean value.
Most of the high

frequency fluctuations (e.g. H atoms), do not directly
contribute to the protein function. Nevertheless they serve as
“lubricant” that enables large scale motions in proteins (e.g. domain
motions) that do play a significant role in their function. As mentioned
earlier, large scale motions occur in a ~ ms to s time scale, hence are
beyond the present MD simulations. (Normal mode analysis provides an
alternative.) But torsional fluctuations occur in the ns time domain, and
can be studied in MD simulations using time correlation functions, e.g.
These typically decay exponentially and such fluctuations can be
described using the Langevin equation.
42
Free energy calculations
Free energy is the most important quantity that characterizes a
dynamical process. Calculation of the absolute free energies is difficult
in MD simulations. However free energy differences can be estimated
more easily and several methods have been developed for this purpose.
The starting point for most approaches is Zwanzig’s perturbation
formula for the free energy difference between two states A and B:
If the two states are not similar enough, there is a large hysteresis
effect and the forward and backward results are not equal.
)
(
)
(
/
)]
(
exp[
ln
)
(
/
)]
(
exp[
ln
)
(
A
B
G
B
A
G
kT
H
H
kT
A
B
G
kT
H
H
kT
B
A
G
B
B
A
A
A
B
43
Alchemical transformation and free energy perturbation
To obtain accurate results with the perturbation formula, the energy
difference between the states should be < 2 kT, which is not satisfied
for most biomolecular processes. To deal with this problem, one
introduces a hybrid Hamiltonian
and performs the transformation from A to B gradually by changing
the parameter
from 0 to 1 in small steps. That is, one divides [0,1]
into n subintervals with {
i
,
i = 0
,
n}, and for each
i
value, calculates
the free energy difference from the ensemble average
B
A
H
H
H
)
1
(
)
(
i
kT
H
H
kT
G
i
i
i
i
/
))]
(
)
(
(
exp[
ln
)
(
1
1
44
1
0
1
)
(
n
i
i
i
G
G
The total free energy change is then obtained by summing the
contributions from each subinterval
The number of subintervals is chosen such that the free energy
change at each step is < 2 kT, otherwise the method may lose its
validity.
1
0
)
(
d
H
G
Thermodynamic integration
: Another way to obtain the free energy
difference is to integrate the derivative of the hybrid Hamiltonian:
This integral is evaluated most efficiently using a Gaussian quadrature.
45
Absolute free energies from umbrella sampling
Here one samples the densities along a reaction coordinate and
determines the potential of mean force (PMF) from the Boltzmann eq.
)
(
)
(
ln
)
(
)
(
)
(
)
(
0
0
/
)]
(
)
(
[
0
0
z
z
kT
z
W
z
W
e
z
z
kT
z
W
z
W
Here z
0
is a reference point, e.g. a point in bulk where W vanishes.
In general, a particle cannot be adequately sampled at high

energy
points. To counter that, one introduces harmonic potentials, which
restrain the particle at desired points, and then unbias its effect. For
convenience, one introduces umbrella potential at regular intervals along
the reaction coordinate (e.g. ~0.5
Å
). The PMF’s obtained in each
interval are unbiased and optimally combined using the Weighted
Histogram Analysis Method (WHAM).
46
Steered MD (SMD) simulations and Jarzynski’s equation
Steered MD is a more recent method where a harmonic force is
applied to an atom on a peptide and the reference point of this force
is pulled with a constant velocity. It has been used to study unfolding
of proteins and binding of ligands. The discovery of
Jarzynski’s
equation in 1997 enabled determination of PMF from SMD, which has
boosted its applications.
)]
(
[
,
.
0
/
/
t
k
W
e
e
f
i
kT
W
kT
F
v
r
r
F
ds
F
Jarzynski’s equation:
Work done by
the harmonic force
This method seems to work well in simple systems and when
F is large
but beware of its applications in complex systems!
47
A critical look at simulation methods
Molecular dynamics is the “standard model” of biomolecules.
•
A higher level (quantum) description is simply not feasible for any
biomolecular system.
•
A lower level description sacrifices the atomic detail and cannot
be expected to succeed without guidance from MD.
Nevertheless, MD simulations alone cannot provide a complete
description of biomolecules, and hence we need to appeal to both
higher and lower level theories to make progress.
•
Quantum description (in a mixed QM/MM scheme) is required for
1.
Development and testing of force fields
2.
Enzyme reactions that involve making or breaking of bonds
3.
Transport of light particles (e.g., protons and electrons)
48
•
Stochastic description and coarse

graining are required for any
process that is beyond the current capabilities of MD simulations,
that is, more than 10
5

6
atoms and longer than microseconds.
1.
Protein

protein interactions, protein aggregation (water in
continuum, protein may be rigid, coarse

grained, semi

flexible or
flexible)
2.
Aggregation of membrane proteins (water and lipid in continuum,
protein as above)
3.
Ion channels and transporters: modelling of ion permeation (water
and lipid in continuum, protein may be rigid or semi

flexible)
4.
Protein folding (water in continuum, protein coarse

grained)
5.
Self assembly of lipid bilayers and micelles (water and lipid
molecules coarse

grained)
6.
DNA condensation (water in continuum, DNA coarse

grained)
49
The guiding principle:
Occam’s razor
Use the simplest method that yields the answer, provided:
•
The method is justified (domain of validity)
•
The parameters employed can be derived from a more fundamental
theory.
Most of the people who use molecular dynamics or other simulation
methods treat them as black boxes and apply them to biomolecules
without worrying too much whether they are valid or relevant.
You also need to watch out for experts who publish only the good
results!
A cautionary tale
: Continuum description of ion channels using the
Poisson

Nernst

Planck (drift

diffusion) equations.
50
Here,
: potential, r: charge density,
J
: flux,
n
: concentration of ions
PNP equations are solved simultaneously, yielding potential and flux
Assumption
: Ions are represented by an average charge density,
z
v
en
v
which invokes a mean

field approximation.
This is alright in a bulk electrolyte, but in ion channels with radius
smaller than the Debye length (8 Å for 0.15 M), one should worry about
the validity of PNP equations.
Poisson

Nernst

Planck (PNP) Equations
kT
en
z
n
D
J
en
z
ext
0
Poisson equation
Nernst

Planck equation
51
+
+
+
+
+
+
+
= 2
= 80
Induced charges
at the water

protein interface
Image force on an ion
In application of PNP to Ca channels, diffusion coeff. is reduced by 10

4
!
The problem
: Ions induce like

charges at a water

protein interface, which
repel them. When ions are represented as a continuous charge density,
this image force (or dielectric self

energy) is severely underestimated.
water
protein
52
A simple test: PNP vs Brownian dynamics (BD)
Control study:
Set artificially
ε
= 80
in the protein. No induced charges on the
boundary, hence no discrepancy between the two methods
r = 4
Å
Na
+
Cl

53
Realistic case:
In the realistic case (
ε
= 2
in the protein), ions do not enter the
channel in BD due to the dielectric self

energy barrier.
Only in large pores (r > 10 Å), validity of PNP is restored.
r = 4
Å
PNP
BD
54
Σχόλια 0
Συνδεθείτε για να κοινοποιήσετε σχόλιο