CS612 - Algorithms in Bioinformatics
Spring 2012, Class 22
April 17, 2012
From a Rigid Ligand to a Flexible Ligand
Torsional (Dihedral) Degrees of Freedom (DOF)
Nurit Haspel
Robotics-inspired Approach to Protein Flexibility
Similarity between proteins and robots: exploration of a complex high-dimensional space
Similarity exploited to sample conformations with spatial constraints
Figure: articulated manipulator and protein extended backbone
Exploration of protein conformational space has parallels in robotics
0/1 collisions for robots versus an energy field for proteins
adapted from J.C. Latombe, Stanford
adapted from P. Smith, KSU
Dimensionality of configuration space
DOFs (rigid-body transformations and DOFs of the ligand)
Too many DOFs mean that the configuration space of the ligand is high-dimensional and difficult to search
Similar issue when planning motions for an articulated robotic chain in a cluttered environment
Geometric complexity of the free space
Difficult to determine whether a ligand conformation and a specific position and orientation result in a good fit
Similar issue for an articulated robot
To address this: plan motions in the configuration space but compute in the workspace (protein surface or cavity)!
Probabilistic Roadmap Motion Planning (PRM)
Configurations are sampled by picking coordinates at random
Sampled configurations are tested for collision (in workspace!)
The collision-free configurations are retained as "milestones"
Each milestone is linked by straight paths to its k nearest neighbors
The collision-free links are retained to form the PRM
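The PRM construction steps can be sketched in a toy setting. The sketch below assumes a point robot in the unit square with disc-shaped obstacles; the function names, the obstacle layout and the link-checking resolution `step` are illustrative assumptions, not part of the lecture.

```python
import math
import random

def collision_free(q, obstacles):
    """Workspace test: q must lie outside every disc (cx, cy, radius)."""
    return all(math.dist(q, (cx, cy)) > r for cx, cy, r in obstacles)

def link_is_free(a, b, obstacles, step=0.01):
    """Check a straight link by testing points spaced at most `step` apart."""
    n = max(1, math.ceil(math.dist(a, b) / step))
    for i in range(n + 1):
        t = i / n
        point = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        if not collision_free(point, obstacles):
            return False
    return True

def build_prm(n_samples, k, obstacles, rng):
    # Sample configurations by picking coordinates at random.
    samples = [(rng.random(), rng.random()) for _ in range(n_samples)]
    # Retain the collision-free samples as milestones.
    milestones = [q for q in samples if collision_free(q, obstacles)]
    # Link each milestone to its k nearest neighbors; keep collision-free links.
    edges = set()
    for i, q in enumerate(milestones):
        others = [j for j in range(len(milestones)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(q, milestones[j]))[:k]
        for j in nearest:
            if link_is_free(q, milestones[j], obstacles):
                edges.add((min(i, j), max(i, j)))
    return milestones, edges

obstacles = [(0.5, 0.5, 0.2)]  # one hypothetical disc obstacle
milestones, edges = build_prm(200, 5, obstacles, random.Random(0))
print(len(milestones), "milestones,", len(edges), "links")
```

The brute-force neighbor search is quadratic in the number of milestones; a spatial index would usually replace it in practice.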
Application of PRM to Protein-Ligand Docking
Protein is assumed to be rigid
A fixed coordinate system P is attached to the protein
Ligand is a small flexible molecule
A moving coordinate system L is defined using three bonded atoms in the ligand
A conformation of the ligand is defined by the position and orientation of L relative to P and the torsional angles of the ligand
A.P. Singh, J.C. Latombe, and D.L. Brutlag. A Motion Planning Approach to Flexible Ligand Binding. Proc. 7th ISMB, pp. 252-261, 1999
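One possible way to build the moving frame L from three bonded atoms is sketched below. The axis convention (origin at the middle atom, x-axis along the second bond) and the names `ligand_frame` and `LigandConformation` are assumptions made for illustration; the slide does not specify them.

```python
import math
from dataclasses import dataclass

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(sum(x * x for x in a))
    return tuple(x / norm for x in a)

def ligand_frame(a, b, c):
    """Right-handed orthonormal frame from bonded atoms a-b-c.

    Assumed convention: origin at b, x along b->c, z normal to the
    a-b-c plane. The three atoms must not be collinear.
    """
    x = unit(sub(c, b))
    z = unit(cross(x, sub(a, b)))
    y = cross(z, x)
    return b, (x, y, z)

@dataclass
class LigandConformation:
    position: tuple      # origin of L expressed in P
    orientation: tuple   # axes of L expressed in P (3 rotational DOFs)
    torsions: list       # one angle per rotatable bond, in radians

    def n_dofs(self):
        # 3 translational + 3 rotational + one per torsion
        return 6 + len(self.torsions)

origin, axes = ligand_frame((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
conf = LigandConformation(origin, axes, [0.3])
```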
Roadmap Construction: Node Generation
The nodes of the roadmap are generated by sampling conformations of the ligand uniformly at random in the parameter space (around the protein)
The energy of each sampled conformation is $E = E_{\text{interaction}}$ (electrostatic) $+\ E_{\text{internal}}$ (vdW)
A sampled conformation is retained with probability:
$$p = \begin{cases} 0 & \text{if } E > E_{\max} \\ \dfrac{E_{\max} - E}{E_{\max} - E_{\min}} & \text{if } E_{\min} \le E \le E_{\max} \\ 1 & \text{if } E < E_{\min} \end{cases}$$
Results in denser distribution of nodes in low-energy regions of conformational space
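The retention rule is a form of rejection sampling, and a minimal sketch of it follows. The sampler, the energy function and the `max_attempts` guard are placeholders introduced for the example; a real docking code would plug in its own conformation sampler and force field.

```python
import random

def retention_probability(e, e_min, e_max):
    """Probability of keeping a sampled conformation with energy e."""
    if e_min >= e_max:
        raise ValueError("e_min must be smaller than e_max")
    if e > e_max:
        return 0.0
    if e < e_min:
        return 1.0
    return (e_max - e) / (e_max - e_min)

def sample_nodes(sample_conformation, energy, n_nodes, e_min, e_max, rng,
                 max_attempts=100_000):
    """Draw conformations until n_nodes have been retained."""
    nodes = []
    for _ in range(max_attempts):
        q = sample_conformation(rng)
        if rng.random() < retention_probability(energy(q), e_min, e_max):
            nodes.append(q)
            if len(nodes) == n_nodes:
                break
    return nodes

# Toy usage: a 1-D "conformation" whose energy is its own value.
nodes = sample_nodes(lambda rng: rng.uniform(0.0, 10.0), lambda q: q,
                     n_nodes=50, e_min=2.0, e_max=8.0, rng=random.Random(1))
```

In the toy run no retained node can have energy above `e_max`, and low values are kept more often than high ones.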
Roadmap Construction: Edge Generation
Each node is connected to its closest neighbors by straight edges
Each edge is discretized so that between $q_i$ and $q_{i+1}$ no atom moves by more than some $\varepsilon = 1$ Å
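A sketch of one way to discretize an edge under the per-atom displacement bound: the straight segment is split into more and more equal pieces until no atom moves farther than the bound between consecutive configurations. The doubling strategy, the planar toy chain and the function names are assumptions for illustration only.

```python
import math

def interpolate(q0, q1, t):
    """Point at fraction t along the straight segment from q0 to q1."""
    return [a + t * (b - a) for a, b in zip(q0, q1)]

def max_atom_displacement(atoms_a, atoms_b):
    return max(math.dist(a, b) for a, b in zip(atoms_a, atoms_b))

def discretize_edge(q0, q1, atom_positions, eps=1.0, max_pieces=4096):
    """Return configurations from q0 to q1 such that no atom moves more
    than eps between consecutive entries. atom_positions maps a
    configuration to a list of atom coordinates."""
    pieces = 1
    while pieces <= max_pieces:
        path = [interpolate(q0, q1, i / pieces) for i in range(pieces + 1)]
        coords = [atom_positions(q) for q in path]
        if all(max_atom_displacement(coords[i], coords[i + 1]) <= eps
               for i in range(pieces)):
            return path
        pieces *= 2
    raise ValueError("edge could not be discretized within max_pieces")

def planar_chain(q, bond=1.5):
    """Toy forward kinematics: joint angles -> 2-D atom positions."""
    x = y = heading = 0.0
    atoms = [(x, y)]
    for angle in q:
        heading += angle
        x += bond * math.cos(heading)
        y += bond * math.sin(heading)
        atoms.append((x, y))
    return atoms

path = discretize_edge([0.0, 0.0], [math.pi / 2, 0.0], planar_chain, eps=1.0)
```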
Querying the Roadmap
For a given goal node $q_g$ (e.g., binding conformation), Dijkstra's single-source shortest-path algorithm computes the lowest-weight paths from $q_g$ to each node (in either direction) in O(N log N) time, where N = number of nodes
Various quantities can then be easily computed in O(N) time, e.g., average weights of all paths entering $q_g$ and of all paths leaving $q_g$ (binding and dissociation rates $K_{on}$ and $K_{off}$)
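The query can be sketched with a heap-based Dijkstra run twice: once on the roadmap for paths leaving the goal, and once on the reversed roadmap for paths entering it. The tiny directed graph and its weights are invented for the example, and the plain averages only illustrate the O(N) post-processing step; they are not the rate computation from the cited paper.

```python
import heapq

def dijkstra(adj, source):
    """Lowest path weight from source to every reachable node.
    adj maps a node to a list of (neighbor, weight) pairs."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def reverse_graph(adj):
    """Flip every directed edge so that paths *into* a node can be found."""
    rev = {u: [] for u in adj}
    for u, neighbors in adj.items():
        for v, w in neighbors:
            rev.setdefault(v, []).append((u, w))
    return rev

def mean_excluding(dist, node):
    values = [d for n, d in dist.items() if n != node]
    return sum(values) / len(values)

# Hypothetical roadmap with direction-dependent weights; "g" is the goal.
roadmap = {"a": [("b", 1.0), ("g", 5.0)], "b": [("g", 1.0)], "g": [("a", 2.0)]}
leaving = dijkstra(roadmap, "g")                  # paths from q_g
entering = dijkstra(reverse_graph(roadmap), "g")  # paths into q_g
avg_leaving = mean_excluding(leaving, "g")
avg_entering = mean_excluding(entering, "g")
```

The O(N log N) bound quoted above holds when each node has a bounded number of neighbors, as in a k-nearest-neighbor roadmap.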
Computing Binding Conformations
Sample many (several 1000s) ligand conformations at random around the protein
Repeat several times:
  Select lowest-energy conformations that are close to the protein surface
  Resample around them
Retain k (approx. 10) lowest-energy conformations whose centers of mass are at least 5 Å apart
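The final retention step can be sketched as a greedy filter: walk through the conformations in order of increasing energy and keep one only if its center of mass is far enough from every conformation already kept. The greedy order and the `(energy, center_of_mass)` tuple layout are assumptions; the slide states the criterion but not the selection procedure.

```python
import math

def select_binding_candidates(conformations, k=10, min_separation=5.0):
    """Pick up to k low-energy conformations with well-separated centers.

    conformations: iterable of (energy, center_of_mass) pairs, with the
    center given as an (x, y, z) tuple in angstroms.
    """
    chosen = []
    for energy, center in sorted(conformations, key=lambda c: c[0]):
        if all(math.dist(center, kept) >= min_separation for _, kept in chosen):
            chosen.append((energy, center))
            if len(chosen) == k:
                break
    return chosen

# Made-up energies and centers, for illustration only.
candidates = [
    (-3.0, (0.0, 0.0, 0.0)),
    (-2.5, (1.0, 0.0, 0.0)),   # too close to the first one
    (-2.0, (6.0, 0.0, 0.0)),
    (-1.0, (6.0, 4.0, 0.0)),   # too close to the third one
    (-0.5, (12.0, 0.0, 0.0)),
]
picked = select_binding_candidates(candidates, k=3)
```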
Figure: lactate dehydrogenase, active site
Testing on Three Complexes
PDB ID: 1ldm. Receptor: lactate dehydrogenase (2386 atoms, 309 residues). Ligand: oxamate (6 atoms, 7 DOFs)
PDB ID: 4ts1. Receptor: mutant of tyrosyl-transfer-RNA synthetase (2423 atoms, 319 residues). Ligand: L-leucyl-hydroxylamine (13 atoms, 9 DOFs)
PDB ID: 1stp. Receptor: streptavidin (901 atoms, 121 residues). Ligand: biotin (16 atoms, 11 DOFs)
From Flexible Ligand to Flexible Receptor?
Protein receptors are flexible and in water probably look like this!
Target receptor is big and has many DOFs (in the thousands). Need to somehow find and focus only on relevant motions.
The dimensionality of the protein conformational space is much larger than that of a small ligand
PRM-based methods that sample thousands of conformations to get a good view of the ligand conformational space are not sufficient
Challenge: from 7-10 DOFs to thousands of DOFs
Goal: model protein flexibility to capture relevant conformations of the flexible receptor
Flexibility limited to one or a few amino acids on or near the binding site of the receptor
Soft docking: docking ligand conformations to a single average receptor conformation
Ensemble docking: docking of ligand conformations to individual protein conformations
Modeling Limited Receptor Flexibility
Selection of specific degrees of freedom, such as on designated amino acids on the binding site
Shown here: acetylcholinesterase, where the flexible Phe330 acts as a swinging gate
Moving a larger number of amino acids (illustration on acetylcholinesterase)
Critical Assessment of PRediction of Interactions (CAPRI)
Protein-protein docking competition
The equivalent of CASP for protein-protein docking
Community-wide experiment that started in 2001
Interesting review of docking methods: S. Vajda & C.J. Camacho. Protein-protein docking: is the glass half-full or half-empty? Trends in Biotechnology, 22(3):110-116, 2004.
Finding Folding Pathways Using PRM
Degrees of freedom: the number of rotatable backbone dihedral angles (approx. 2N, where N is the number of amino acids)
Nodes are generated in a similar manner to the docking scheme above.
Sampling cannot be done uniformly at random due to the high dimensionality; sampling is done from a set of distributions around the native state.
Edges connect neighboring nodes in a similar manner to the one described above.
Can be used to discover folding pathways, intermediate structures and other folding events.
G. Song, N. Amato, RECOMB 2001
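A sketch of what sampling from a set of distributions around the native state could look like. It assumes Gaussian perturbations of the native backbone dihedrals with a list of increasing standard deviations; the choice of Gaussians, the degree units and all names are assumptions, since the slide does not say which distributions are used.

```python
import random

def wrap_degrees(angle):
    """Map an angle in degrees into the range around [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def sample_near_native(native_dihedrals, std_devs, per_level, rng):
    """Perturb every backbone dihedral of the native state.

    For each standard deviation in std_devs (degrees), draw per_level
    conformations; small deviations stay close to the native state and
    larger ones reach increasingly unfolded conformations.
    """
    samples = []
    for sigma in std_devs:
        for _ in range(per_level):
            conformation = [wrap_degrees(a + rng.gauss(0.0, sigma))
                            for a in native_dihedrals]
            samples.append(conformation)
    return samples

# Toy native state: (phi, psi) pairs for a 3-residue fragment, 2N = 6 angles.
native = [-60.0, -45.0, -60.0, -45.0, -120.0, 130.0]
samples = sample_near_native(native, std_devs=[5.0, 20.0, 60.0],
                             per_level=4, rng=random.Random(0))
```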
Modeling Loops Using Inverse Kinematics
Goal: model the ensemble of conformations of a protein.
It is known that proteins are not rigid but fluctuate about an ensemble of structures under equilibrium conditions.
Focus mostly on loop regions, as they are the most flexible ones.
Inverse kinematics: manipulate the degrees of freedom of an articulated chain to satisfy some end constraints.
In this case, manipulate the rotational degrees of freedom of a loop region to find possible loop conformations that attach to the rest of the protein.
Cyclic Coordinate Descent (CCD): solve for and rotate one dihedral at a time.
Canutescu A.A. and Dunbrack R.L., Protein Science 12, 2003
Cyclic Coordinate Descent: solve for and rotate one dihedral at a time
Given: atom at current position M, target position F
Goal: solve for dihedral $\theta$ s.t. $|F - M|^2 = S(\theta) < \varepsilon$ (threshold)
Time complexity: linear in the number of DOFs to solve for all dihedrals of a chain
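A simplified CCD sketch follows. It assumes a single moving end atom (the published method matches three anchor atoms), treats every bond except the last as a rotatable axis, and solves each rotation in closed form with `atan2`. The chain coordinates, the target and the helper names are invented for illustration.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def unit(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))

def rotate_about_axis(p, origin, u, theta):
    """Rodrigues rotation of p about the axis through origin along unit u."""
    v = sub(p, origin)
    c, s = math.cos(theta), math.sin(theta)
    rotated = add(add(scale(v, c), scale(cross(u, v), s)),
                  scale(u, dot(u, v) * (1.0 - c)))
    return add(origin, rotated)

def best_angle(moving, target, origin, u):
    """Rotation about the axis that brings `moving` closest to `target`."""
    m, f = sub(moving, origin), sub(target, origin)
    m_perp = sub(m, scale(u, dot(m, u)))  # part of m perpendicular to axis
    return math.atan2(dot(f, cross(u, m_perp)), dot(f, m_perp))

def ccd(atoms, target, tol=1e-3, max_sweeps=200):
    """Rotate one bond at a time until the last atom is within tol of target."""
    atoms = list(atoms)
    for _ in range(max_sweeps):
        for i in range(len(atoms) - 2):
            origin, u = atoms[i], unit(sub(atoms[i + 1], atoms[i]))
            theta = best_angle(atoms[-1], target, origin, u)
            # Atoms downstream of the bond move rigidly with the rotation.
            for j in range(i + 2, len(atoms)):
                atoms[j] = rotate_about_axis(atoms[j], origin, u, theta)
        if math.dist(atoms[-1], target) < tol:
            break
    return atoms

atoms0 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0),
          (3.4, 1.6, 0.5), (3.9, 2.9, 1.0), (5.2, 3.2, 1.6)]
target = (4.0, 1.0, 2.5)
closed = ccd(atoms0, target)
```

Each single rotation can only reduce the end-atom distance, and bond lengths and bond angles are preserved because downstream atoms move rigidly. For clarity the sketch updates every downstream atom after each rotation, which costs more per sweep than the angle solves alone; the target may also be unreachable for a given chain, in which case the loop stops after `max_sweeps`.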
Since there is redundancy, many solutions are feasible.
Find rotations to satisfy spatial constraints on atoms
Combine with energy minimization to obtain physical structures
Example: Chymotrypsin inhibitor 2
Equilibrium Fluctuations
More DOFs than spatial constraints can be exploited to generate fragment fluctuations
Example: Chymotrypsin inhibitor 2
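Sample equilibrium fluctuations:
Spatially constrained through Cyclic Coordinate Descent
Energetically constrained to be feasible
Local Fluctuations in α-Lactalbumin
Boltzmann ensemble average:
$$\langle \mathrm{RMSD} \rangle_x = \sum_{\text{Confs}} \mathrm{RMSD}(C, C_{\text{native}}) \, \frac{e^{-\beta \Delta E_c}}{Q}$$
$$\Delta E_c = E_c - E_{\text{native}}$$
$$Q = \sum_{\text{Confs}} e^{-\beta \Delta E_c}$$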
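The Boltzmann-weighted average can be computed directly from a list of sampled conformations. The sketch below assumes each conformation is summarized by its RMSD to the native structure and its energy, and that `beta` stands for the inverse temperature 1/(k_B T) in units matching the energies; the function name and the numbers are illustrative.

```python
import math

def boltzmann_average_rmsd(rmsds, energies, e_native, beta=1.0):
    """Boltzmann-weighted mean RMSD over sampled conformations.

    Each weight is exp(-beta * (E_c - E_native)) / Q, with Q the sum of
    the unnormalized weights.
    """
    deltas = [e - e_native for e in energies]
    # Subtracting the smallest delta avoids overflow; it cancels in the ratio.
    shift = min(deltas)
    weights = [math.exp(-beta * (d - shift)) for d in deltas]
    q = sum(weights)
    return sum(r * w for r, w in zip(rmsds, weights)) / q

# Two conformations with equal energy contribute equally.
equal = boltzmann_average_rmsd([1.0, 3.0], [0.0, 0.0], e_native=0.0)
# A much higher-energy conformation contributes almost nothing.
skewed = boltzmann_average_rmsd([1.0, 3.0], [0.0, 100.0], e_native=0.0)
```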
α-Lactalbumin (α-Lac)
123 residues
Hydrogen exchange protection factors available
Ubiquitin
76 residues
NMR information on fluctuations available