Discovery and Analysis of Novel Biochemical Transformations
Linda J. Broadbelt
Department of Chemical and Biological Engineering
Northwestern University
Evanston, IL 60208
How Can We Create Products from Natural Resources?
• Biochemical processes are being explored as alternatives to traditional chemical processes
• Concern over dwindling petroleum-based resources sparks exploration of alternative feedstocks
Overall reaction: biomass → polysaccharides → monosaccharides (glucose) → ethanol
Image sources: www.clemson.edu/edisto/ corn/corn.htm; www.timberland.com/.../ tim_product_detail.jsp?OID=18298
Reaction Networks of Novel Biochemical Transformations
• Reactants, intermediates and products
• Reactions
• Thermodynamic parameters
• Kinetic parameters
[Figure: reaction network schematic in which each reaction is labeled with a free energy change ΔG_i and a rate constant k_i, with indices up to 18]
Challenges for Reaction Network Development
Reactive intermediates have not been detected
Pathways have not been elucidated experimentally
Thermodynamic and kinetic parameters are unknown
Reaction networks are large
Construction is tedious and prone to the user's bias and errors
Computer generation of reaction networks
Elements of Computer-Generated Reaction Networks
• Graph Theory
• Reaction Matrix Operations
• Connectivity Scan
• Uniqueness Determination
• Property Calculation
• Termination Criteria
[Figure: reactants, reaction types and reaction rules feed the network generator; the resulting reaction network carries a rate constant k_i and a free energy change ΔG_i on each reaction]
Bond-Electron Representation Allows Implementation of Chemical Reaction
ij entries denote the bond order between atoms i and j
ii entries designate the number of nonbonded electrons associated with atom i

methane
C  0 1 1 1 1
H  1 0 0 0 0
H  1 0 0 0 0
H  1 0 0 0 0
H  1 0 0 0 0

methyl radical
C  1 1 1 1
H  1 0 0 0
H  1 0 0 0
H  1 0 0 0

ethylene
C  0 2 1 0 0 1
C  2 0 0 1 1 0
H  1 0 0 0 0 0
H  0 1 0 0 0 0
H  0 1 0 0 0 0
H  1 0 0 0 0 0
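The bond-electron matrices above can be written down directly as arrays. The following is a minimal sketch, assuming NumPy; the helper names `is_symmetric` and `valence_electrons` are hypothetical and not taken from the slides. It relies on one property of this representation: each row sum (bond orders plus nonbonded electrons) equals the number of valence electrons the atom contributes.

```python
import numpy as np

# Bond-electron (BE) matrices copied from the slide.
# Off-diagonal entry (i, j): bond order between atoms i and j.
# Diagonal entry (i, i): nonbonded electrons on atom i.
methane = np.array([          # atom order: C, H, H, H, H
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
])
methyl_radical = np.array([   # atom order: C, H, H, H
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
])
ethylene = np.array([         # atom order: C, C, H, H, H, H
    [0, 2, 1, 0, 0, 1],
    [2, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
])

def is_symmetric(be_matrix):
    """A bond between i and j must appear in both (i, j) and (j, i)."""
    return bool((be_matrix == be_matrix.T).all())

def valence_electrons(be_matrix):
    """Row sums: electrons each atom contributes (bonds plus nonbonded)."""
    return be_matrix.sum(axis=1).tolist()

for name, m in [("methane", methane),
                ("methyl radical", methyl_radical),
                ("ethylene", ethylene)]:
    print(name, is_symmetric(m), valence_electrons(m))
```

Every carbon row sums to 4 and every hydrogen row to 1, which gives a quick consistency check when matrices are entered by hand.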
Chemical Reaction as a Matrix Addition Operation
H• + CH4 → •CH3 + H2

Reactant Matrices
C   0 1 1 1 1
H   1 0 0 0 0
H   1 0 0 0 0
H   1 0 0 0 0
H   1 0 0 0 0

H•  1

Reactant Matrix
C   0 1 1 1 1 0
H   1 0 0 0 0 0
H   1 0 0 0 0 0
H   1 0 0 0 0 0
H   1 0 0 0 0 0
H•  0 0 0 0 0 1

Reordered Reactant Matrix
H   0 1 0 0 0 0
C   1 0 0 1 1 1
H•  0 0 1 0 0 0
H   0 1 0 0 0 0
H   0 1 0 0 0 0
H   0 1 0 0 0 0

Reaction Operation (on the three reacting atoms)
H   0 1 0        0 -1  1       H   0 0 1
C   1 0 0   +   -1  1  0   =   C•  0 1 0
H•  0 0 1        1  0 -1       H   1 0 0

Product Matrix
H   0 0 1 0 0 0
C•  0 1 0 1 1 1
H   1 0 0 0 0 0
H   0 1 0 0 0 0
H   0 1 0 0 0 0
H   0 1 0 0 0 0
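The hydrogen abstraction H• + CH4 → •CH3 + H2 can be reproduced numerically by adding the 3×3 reaction operation to the leading block of the reordered reactant matrix. This is a small sketch assuming NumPy; `apply_operator` is a hypothetical helper name, and it assumes the reacting atoms have already been moved to the first rows and columns.

```python
import numpy as np

# Reordered reactant matrix from the slide.
# Atom order: H (to be abstracted), C, H• (attacking radical), H, H, H.
reordered_reactant = np.array([
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
])

# Reaction operation for hydrogen abstraction, acting on the first three atoms.
operator = np.array([
    [ 0, -1,  1],
    [-1,  1,  0],
    [ 1,  0, -1],
])

def apply_operator(be_matrix, op):
    """Add `op` to the leading block of `be_matrix` (hypothetical helper).

    Assumes the reacting atoms occupy the first rows/columns. A negative
    entry in the result means the operator does not fit this reactive site.
    """
    n = op.shape[0]
    product = be_matrix.copy()
    product[:n, :n] += op
    if (product < 0).any():
        raise ValueError("operator does not match this reactive site")
    return product

product = apply_operator(reordered_reactant, operator)
print(product)
```

The operator's entries sum to zero, so the total electron count is unchanged; only the C-H bond is broken, the H-H bond is formed, and the unpaired electron moves from hydrogen to carbon.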
Unique Patterns of Observed Enzyme Chemistry
EC i.j.k.l → unique enzyme
i → main class
j → functional group
k → cofactor / cosubstrate
4th level is specific to substrate
Tipton, S.B. and Boyce, S. Bioinformatics. 16 (2000), 34-40
Kanehisa, M. and Goto, S. Nucleic Acids Research. 28 (2000), 27-30
Formulation of Reaction Matrices Using Enzyme Classification System
Enzyme commission (EC) code number provides systematic names for enzymes
EC i.j.k.l: unique enzyme
i: the main class
j: the specific functional groups
k: cofactors
l: specific to the substrates
Generalized Enzyme Function Examined at the i.j.k Level
• More than 5,000 specific enzyme functions (i.j.k.l)
• Fewer than 250 generalized enzyme functions (i.j.k)
• Novel enzyme functions should be expected through genomic sequencing, proteomics and protein engineering
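Collapsing specific enzymes (i.j.k.l) into generalized functions (i.j.k) amounts to dropping the fourth field of the EC number. The sketch below illustrates this on EC numbers that appear elsewhere in this talk; the function names are hypothetical and this is not the author's implementation.

```python
from collections import defaultdict

def generalized_class(ec_number):
    """Return the third-level class i.j.k of a four-field EC number i.j.k.l."""
    fields = ec_number.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four fields in EC number, got {ec_number!r}")
    return ".".join(fields[:3])

def group_by_generalized_class(ec_numbers):
    """Map each i.j.k class to the specific enzymes (i.j.k.l) it contains."""
    groups = defaultdict(list)
    for ec in ec_numbers:
        groups[generalized_class(ec)].append(ec)
    return dict(groups)

# EC numbers mentioned in the slides.
ec_numbers = ["4.2.1.2", "4.2.1.3", "4.2.1.51", "2.6.1.57", "1.3.1.12", "5.4.99.5"]
groups = group_by_generalized_class(ec_numbers)
for ec_class, members in groups.items():
    print(ec_class, members)
```

Six specific enzymes collapse to four generalized classes here, the same kind of reduction as the slide's 5,000 to 250.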
Example of a Generalized Enzyme Reaction
• EC 4.2.1.2 (fumarate hydratase)
• EC 4.2.1.3 (aconitate hydratase)
Generalized enzyme reaction (EC 4.2.1): H-C-C-O-H → C=C + H2O
[Figure: structural drawings of the fumarate hydratase and aconitate hydratase reactions, each showing a hydroxy acid, the corresponding unsaturated acid, and H2O]
Matrix Representation of Generalized Enzyme Function (i.j.k)
Generalized enzyme reaction EC 4.2.1: H-C-C-O-H → C=C + H2O
Reactant + Reaction operator = Products

Reactant
    O  C  C  H
O   4  1  0  0
C   1  0  1  0
C   0  1  0  1
H   0  0  1  0

Reaction operator
    O  C  C  H
O   0 -1  0  1
C  -1  0  1  0
C   0  1  0 -1
H   1  0 -1  0

Products
    O  C  C  H
O   4  0  0  1
C   0  0  2  0
C   0  2  0  0
H   1  0  0  0

[Figure: the same operator applied to the hydroxy acid substrates of the two hydratase examples, giving the unsaturated acid and H2O in each case]
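Because the generalized EC 4.2.1 function is a matrix added to the reactive site, the reverse reaction (hydration of C=C) is the negated operator. A minimal sketch assuming NumPy follows; the variable names are illustrative only.

```python
import numpy as np

# Reactive-site BE matrix for H-C-C-O-H, atom order O, C, C, H (from the slide).
reactant_site = np.array([
    [4, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Generalized EC 4.2.1 operator: break C-O and C-H, form C=C and O-H.
dehydration = np.array([
    [ 0, -1,  0,  1],
    [-1,  0,  1,  0],
    [ 0,  1,  0, -1],
    [ 1,  0, -1,  0],
])

# Forward direction: reactant + operator = products (C=C plus the O-H of water).
product_site = reactant_site + dehydration

# Reverse direction: the negated operator recovers the reactant.
hydration = -dehydration
recovered_site = product_site + hydration

print(product_site)
print(np.array_equal(recovered_site, reactant_site))
```

The operator is symmetric and its entries sum to zero, so bonds stay paired and electrons are conserved in both directions.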
Discovery of Novel Biosynthetic Routes
[Figure: iterative network generation. Generation 0 holds the starting compounds A + B. In each generation the generalized enzyme operators I.J.K, L.M.N and Q.R.S are applied to the compounds available so far, producing new compounds C, D and E over Generations 1, 2 and 3.]
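The generation-by-generation expansion can be sketched as a loop that applies every rule to the current pool of compounds and stops when nothing new appears. Everything concrete here is a toy assumption: the three rules are placeholders named after the operator labels I.J.K, L.M.N and Q.R.S, and which compounds each one produces is invented for illustration. A real generator would apply reaction matrices to molecular graphs and test product uniqueness.

```python
# Toy rules (assumed for illustration): each takes the current pool of
# compound labels and returns the compounds it can produce from that pool.
RULES = {
    "I.J.K": lambda pool: {"C"} if {"A", "B"} <= pool else set(),
    "L.M.N": lambda pool: {"D"} if "C" in pool else set(),
    "Q.R.S": lambda pool: {"E"} if "D" in pool else set(),
}

def generate_network(seeds, rules, max_generations):
    """Return {compound: generation in which it first appeared}."""
    pool = set(seeds)
    first_seen = {compound: 0 for compound in pool}
    for generation in range(1, max_generations + 1):
        new_compounds = set()
        for rule in rules.values():
            # Rules only see compounds from earlier generations.
            new_compounds |= rule(pool) - pool
        if not new_compounds:
            break  # termination criterion: no new species generated
        for compound in new_compounds:
            first_seen[compound] = generation
        pool |= new_compounds
    return first_seen

first_seen = generate_network({"A", "B"}, RULES, max_generations=10)
print(sorted(first_seen.items()))
```

The `max_generations` cap is one possible termination criterion; the loop also stops early once a generation adds nothing new.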
Implications for Novel Pathway Development
Given a novel reaction (reactant/product), can we identify enzymes (catalysts) that could be engineered (evolved) to carry out this novel biotransformation?
If A gives B under 2.4.1 action, then target enzymes within the 2.4.1 class
Application of Reaction Matrix Approach
Step 1: Enumerate all enzymes in the EC system
Step 2: Choose a specific pathway to explore its synthetic ability
Example: Aromatic amino acid biosynthesis
• Exists in higher plants and microorganisms
• Pathway does not exist in mammals
Aromatic Amino Acid Biosynthesis: Phenylalanine and Tyrosine
chorismate → prephenate: chorismate mutase (5.4.99.5)
prephenate → phenylpyruvate: prephenate dehydratase (4.2.1.51)
phenylpyruvate → phenylalanine: aromatic aminotransferase (2.6.1.57), with glutamate
prephenate → 4-hydroxyphenylpyruvate: prephenate dehydrogenase (1.3.1.12)
4-hydroxyphenylpyruvate → tyrosine: aromatic aminotransferase (2.6.1.57), with glutamate
Reaction Misclassification (?)
Some reactions within classes are not general
General 4.2.1 reaction (4. = lyase): loses water (4.2.1 = hydro-lyase) AND forms a double bond. However…
prephenate dehydratase (4.2.1.51): prephenate → phenylpyruvate + H2O + CO2
It is both a carboxy-lyase (4.1.1) and a hydro-lyase (4.2.1)

Reaction Decomposition
prephenate dehydratase (4.2.1.51): prephenate → phenylpyruvate + H2O + CO2
4.2.1.51 can be broken down into 3 general reactions:
• 4.1.1 will decarboxylate (4.1.1 is a carboxy-lyase)
• 5.3.3 will rearrange the double bond (5.3.3 transposes C=C bonds)
• 4.2.1 will lose H2O and form a double bond (4.2.1 is a hydro-lyase)
Mapping Results
• Although only 2500 reactions in KEGG and 269 reactions in the iJR904 model were contained in the curated EC classes, 3267 (50%) of the KEGG reactions and 430 (46%) of the iJR904 reactions were reproduced using the 86 reaction rules
• The reproduced reactions are involved in 129 different third-level enzyme classes in KEGG and iJR904
• 100% of the reactions contained in 25 of the uncurated EC classes in KEGG were mapped to the 86 existing reaction rules
Tryptophan Biosynthesis Pathway
Input Molecules: phosphoenolpyruvate (PEP), erythrose-4-phosphate (E4P), glutamine, serine, ribose-5-phosphate (R5P)
Cofactors: ATP, NADPH
Specific Enzyme Actions: 12
The Evolution and Wealth of the Aromatic Amino Acid Biochemistry
[Figure: number of products (log scale, 1 to 10^5) versus generation number (0 to 15) for the TYR, TRP and PHE networks; annotations: "Convergence" and 7-carboxyindole]
Specialty Organic Chemicals
• 3-Hydroxypropanoate (3HP) is a useful chemical with known biochemical production routes
3-Hydroxypropanoate from Pyruvate
• Generate all of the possible compounds and reactions from pyruvate using only the reaction rules involved in the known biosynthetic routes to 3HP
• Generate all of the possible compounds and reactions from pyruvate using all of the 86 current reaction rules
Novel Biosynthetic Pathways Discovered: Pyruvate to 3HP
[Figure: distribution of lengths of pathways. Number of pathways (log scale, 1 to 10^6) versus pathway length (2 to 10) for two series: pathways discovered using all reaction rules, and pathways discovered using only the reaction rules involved in the known pathways to 3HP]
A pathway of length two and a pathway of length three were both discovered using the additional reaction rules
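A pathway-length distribution of this kind can be obtained by enumerating simple paths from pyruvate to 3HP in the generated network and counting them by number of steps. The sketch below uses a small hand-written graph: the compound names come from the slides, but the edges are an assumption chosen for illustration and do not reproduce the generated network.

```python
from collections import Counter

# Toy directed network (edges assumed for illustration only).
EDGES = {
    "pyruvate": ["lactate", "alanine", "oxaloacetate"],
    "lactate": ["lactoyl-CoA"],
    "lactoyl-CoA": ["acryloyl-CoA"],
    "acryloyl-CoA": ["3HP-CoA"],
    "3HP-CoA": ["3HP"],
    "alanine": ["beta-alanine"],
    "oxaloacetate": ["aspartate"],
    "aspartate": ["beta-alanine"],
    "beta-alanine": ["malonate semialdehyde"],
    "malonate semialdehyde": ["3HP"],
}

def enumerate_pathways(graph, source, target, max_length):
    """Depth-first enumeration of simple paths with at most max_length steps."""
    pathways = []

    def extend(path):
        current = path[-1]
        if current == target:
            pathways.append(list(path))
            return
        if len(path) - 1 >= max_length:
            return
        for neighbor in graph.get(current, []):
            if neighbor not in path:  # no compound visited twice
                path.append(neighbor)
                extend(path)
                path.pop()

    extend([source])
    return pathways

pathways = enumerate_pathways(EDGES, "pyruvate", "3HP", max_length=6)
length_histogram = Counter(len(p) - 1 for p in pathways)
for length in sorted(length_histogram):
    print(length, length_histogram[length])
```

On a full generated network the number of pathways grows quickly with length, so the `max_length` bound is what keeps the enumeration tractable.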
[Figure: compound network generated from pyruvate. Compounds shown: Pyruvate, Oxaloacetate, Alanine, Lactate, Propenoate, Ethylamine, Hydroxyacetone, Homoserine, 2-hydroxy-2,4-pentadienoate, Propan-2-ol, Propene, Propane-1-ol, Propanoate, Threonine, Lactaldehyde, Acrolein, Beta-alanine, 3-hydroxypropanol, 3-HP, Propane-1,3-diol, Aspartate, Fumarate, Malate, Malonate semialdehyde, Ethanol, Allyl alcohol, Propane-1,2-diol, Ethylene, Acetaldehyde, Propanoyl-CoA, 3HP-CoA, Lactoyl-CoA, Acryloyl-CoA, Beta-alanyl-CoA, 3-Oxopropionyl-CoA]
[Figure: reduced compound network. Compounds shown: Pyruvate, Oxaloacetate, Lactate, Beta-alanine, 3-HP, Aspartate, Malonate semialdehyde, 3HP-CoA, Lactoyl-CoA, Acryloyl-CoA, Beta-alanyl-CoA]
What Screening Methods Can We Use to Identify the Most Attractive Pathways?
• Pathway length
• Fewest novel intermediates
• Thermodynamic feasibility
• Maximum achievable yield to 3HP from glucose during anaerobic growth
• Maximum achievable intracellular activity at which 3HP can be produced
• Protein docking calculations
• Quantum chemical investigations
Shortest Novel Pathways to 3HP
[Figure: network of the shortest pathways from pyruvate to 3-HP, with two novel pathways marked Pathway N1 and Pathway N2.
Legend: Part of the Patented Pathway; KEGG Reaction not in Patented Pathways; Not found in KEGG or Patented Pathways.
Compounds: Pyruvate, Oxaloacetate, Aspartate, Lactate, β-alanine, β-alanyl-CoA, Lactoyl-CoA, Acryloyl-CoA, 3-HP-CoA, 3-HP, Malonate semialdehyde, Propenoate, Alanine, Ethylamine, Acetaldehyde, 3-oxopropionyl-CoA.
Generalized enzyme operators on the reactions: 1.1.1, 2.3.1, 2.3.1 Rev, 2.6.1, 4.1.1, 4.1.1 Rev, 4.2.1, 4.2.1 Rev, 4.3.1.
Cosubstrates and coproducts on the reactions: CO2, H2O, NH3, CoA, glu / 2-oxo, nad / nadh + H+]
Attractive Novel Pathways Successfully Identified
• Two-step pathway identified with only one novel reaction
• Maximum achievable yield to 3HP from glucose during anaerobic growth matches commercial pathway
• Slightly reduced maximum achievable intracellular activity at which 3HP can be produced
• Numerous other attractive candidates
Can the enzyme that catalyzes decarboxylation of pyruvate perform catalysis of different substrates?
Decarboxylation reaction of ketoacids
PFOR (1.2.7.a): pyruvate + CoA + Fd(ox) → CO2 + acetyl-CoA + Fd(red)
Generalized enzyme operators can act on all of the above keto acids to give their corresponding products
Explore Novel Reactions Using Molecular Modeling
Are These Novel Reactions Feasible?
• Substrate binding
Docking analysis
• Ability to form initial enzyme-substrate bound species with no distortion to the active site of the enzyme or the cofactor
QM/MM structural studies
• Follow the reaction pattern of the native substrate
Study of reaction mechanism using QM methods
Enzyme Docking Results
Substrate                          PFOR (1.2.7)
pyruvic acid                       -10.7
2-ketobutyric acid                 -11.63
2-ketoisovaleric acid              -11.56
2-ketovaleric acid                 -11.31
2-keto-3-methylvaleric acid        -11.27
2-keto-4-methylpentanoic acid      -11.01
phenylpyruvic acid                 X
Scored using GLIDE
Enzyme Docking Poses
[Figure: six docking poses, panels numbered 1 to 6, for pyruvic acid, 2-ketovaleric acid, 2-keto-4-methylpentanoic acid, 2-ketoisovaleric acid, 2-ketobutyric acid and 2-keto-3-methylvaleric acid]
Binding Using Quantum Mechanics/Molecular Mechanics
MM part: 50 Å of active site and solvent molecules, ~20,000 atoms
QM part: 63 atoms
Geometry: B3LYP/6-31G*
Comparison of Bound Structures of Different Acids: QM/MM
[Figure: bound structures of 2-keto-4-methylpentanoic acid, pyruvic acid, 2-keto-3-methylvaleric acid and 2-ketoisovaleric acid]
QM/MM structural studies suggest that the binding of the substrates does not cause distortions to the active site
Kinetics of Enzyme-Catalyzed Decarboxylation: Quantum Mechanics
[Figure: reaction scheme with the species ThDP ylide + KA, LThDP and HEThDP enamine, and the transition states TS1 and TS2]
Ville et al., Nature Chemical Biology, 2(6), 2006, 324
Free Energy Surface of Thiamine-Catalyzed Decarboxylation: Pyruvic Acid
[Figure: free energy surface with the points ThDP + pyruvic acid, TS 1, LThDP, TS 2 and enamine + CO2]
Comparison of Thiamine-Catalyzed Decarboxylation
[Figure: comparison of free energy barriers (∆G of activation, 298 K, DCE)]
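A computed free energy barrier can be turned into a rate constant estimate through transition state theory. The Eyring equation below is a standard textbook relation and is an assumption here, since the slides do not state how barriers were converted to kinetics; the transmission coefficient is taken as 1.

```latex
k(T) = \frac{k_\mathrm{B} T}{h}\,
       \exp\!\left(-\frac{\Delta G^{\ddagger}}{R T}\right),
\qquad
\frac{k_1}{k_2} = \exp\!\left(-\frac{\Delta G^{\ddagger}_1 - \Delta G^{\ddagger}_2}{R T}\right)
```

At 298 K, RT is about 0.593 kcal/mol, so a difference of roughly 1.4 kcal/mol between two substrates' barriers corresponds to about a tenfold difference in rate constant.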
Exploring Novel Pathways and Molecules
Migration to Biocatalytic Processes
• New routes to bioavailable species: present in KEGG (Kyoto Encyclopedia of Genes and Genomes)
• New biochemical routes to existing chemicals: NOT present in KEGG, present in CAS REGISTRY (1,3,5-Trihydroxy-4-oxo-cyclohexanecarboxylic acid)
• New molecules: NOT present in KEGG, NOT present in CAS REGISTRY
[Figure: structures of the example compounds, including 1,3,4,5-Tetrahydroxy-cyclohexanecarboxylic acid and 3-[1-Carboxy-2-(1,4-dihydro-pyridin-3-yl)-ethoxy]-4-hydroxy-cyclohexa-1,5-dienecarboxylic acid]
Acknowledgments
Funding
• Department of Energy
• National Science Foundation Cyber-enabled Discovery and Innovation
Collaborators
• Vassily Hatzimanikatis
• Chunhui Li
• Chris Henry
• Goran Krilov
• Raj Assary