Oct 28, 2013 (3 years and 8 months ago)


Discovery and Analysis of Novel Biochemical Transformations

Linda J. Broadbelt

Department of Chemical and Biological Engineering

Northwestern University

Evanston, IL 60208

How Can We Create Products from Natural Resources?

•Biochemical processes are being explored as alternatives to traditional chemical processes

•Concern over dwindling petroleum-based resources sparks exploration of alternative feedstocks

Overall reaction: biomass → polysaccharides → monosaccharides → glucose → ethanol

Image sources: www.clemson.edu/edisto/ corn/corn.htm; www.timberland.com/.../ tim_product_detail.jsp?OID=18298

Reaction Networks of Novel Biochemical Transformations

•Reactants, intermediates and products

•Reactions

•Thermodynamic parameters

•Kinetic parameters

[Figure: reaction network whose steps are labeled with kinetic parameters k1 to k18 and thermodynamic parameters ΔG1, ΔG3 to ΔG18]

Challenges for Reaction Network Development

•Reactive intermediates have not been detected

•Pathways have not been elucidated experimentally

•Thermodynamic and kinetic parameters are unknown

•Reaction networks are large

•Construction is tedious and prone to user's bias and errors

Computer generation of reaction networks

Elements of Computer Generated Reaction Networks

• Graph Theory

• Reaction Matrix Operations

• Connectivity Scan

• Uniqueness Determination

• Property Calculation

• Termination Criteria

[Figure: Reactants, Reaction Types and Reaction Rules shown alongside a generated network labeled with k1 to k18 and ΔG1, ΔG3 to ΔG18]

Bond-Electron Representation Allows Implementation of Chemical Reaction

•ij entries denote the bond order between atoms i and j

•ii entries designate the number of nonbonded electrons associated with atom i

methane

C 0 1 1 1 1
H 1 0 0 0 0
H 1 0 0 0 0
H 1 0 0 0 0
H 1 0 0 0 0

methyl radical

C 1 1 1 1
H 1 0 0 0
H 1 0 0 0
H 1 0 0 0

ethylene

C 0 2 1 0 0 1
C 2 0 0 1 1 0
H 1 0 0 0 0 0
H 0 1 0 0 0 0
H 0 1 0 0 0 0
H 1 0 0 0 0 0
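The bond-electron convention on this slide can be sketched in a few lines of Python. The helper name `be_matrix` and its input format (a bond dictionary keyed by atom index pairs) are assumptions made for illustration; they are not taken from the presentation.

```python
def be_matrix(atoms, bonds, nonbonded=None):
    """Build a bond-electron (BE) matrix as a list of rows.

    atoms:     list of element labels, e.g. ["C", "H", "H", "H", "H"]
    bonds:     dict mapping (i, j) atom-index pairs to a bond order
    nonbonded: dict mapping an atom index to its nonbonded electron count
    """
    n = len(atoms)
    matrix = [[0] * n for _ in range(n)]
    for (i, j), order in bonds.items():
        matrix[i][j] = order  # off-diagonal ij entry: bond order
        matrix[j][i] = order  # BE matrices are symmetric
    for i, electrons in (nonbonded or {}).items():
        matrix[i][i] = electrons  # diagonal ii entry: nonbonded electrons
    return matrix


# The three molecules shown on the slide, with the same atom ordering.
methane = be_matrix(["C", "H", "H", "H", "H"],
                    {(0, k): 1 for k in range(1, 5)})
methyl_radical = be_matrix(["C", "H", "H", "H"],
                           {(0, k): 1 for k in range(1, 4)},
                           nonbonded={0: 1})
ethylene = be_matrix(["C", "C", "H", "H", "H", "H"],
                     {(0, 1): 2, (0, 2): 1, (0, 5): 1, (1, 3): 1, (1, 4): 1})
```

Each row of the result corresponds to one row of the matrices printed above, so the carbon row of the methyl radical carries a 1 on the diagonal for its unpaired electron.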

Chemical Reaction as a Matrix Addition Operation

H• + CH4 → •CH3 + H2

Reactant Matrices

C 0 1 1 1 1
H 1 0 0 0 0
H 1 0 0 0 0
H 1 0 0 0 0
H 1 0 0 0 0

H• 1

Reactant Matrix

C 0 1 1 1 1 0
H 1 0 0 0 0 0
H 1 0 0 0 0 0
H 1 0 0 0 0 0
H 1 0 0 0 0 0
H• 0 0 0 0 0 1

Reordered Reactant Matrix

H 0 1 0 0 0 0
C 1 0 0 1 1 1
H• 0 0 1 0 0 0
H 0 1 0 0 0 0
H 0 1 0 0 0 0
H 0 1 0 0 0 0

Reaction Operation (acts on the first three atoms)

H 0 1 0
C 1 0 0
H• 0 0 1

+

0 -1 1
-1 1 0
1 0 -1

=

H 0 0 1
C• 0 1 0
H 1 0 0

Product Matrix

H 0 0 1 0 0 0
C• 0 1 0 1 1 1
H 1 0 0 0 0 0
H 0 1 0 0 0 0
H 0 1 0 0 0 0
H 0 1 0 0 0 0
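As a minimal sketch of the matrix addition on this slide, the Python below pads the 3×3 reaction operation with zeros to the size of the reordered reactant matrix and adds the two. The variable and function names are illustrative assumptions; the numbers are the ones shown for H• + CH4.

```python
def add_matrices(a, b):
    """Element-wise sum of two equally sized square matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Reordered reactant matrix; atom order is H, C, H•, H, H, H.
reactant = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
]

# 3x3 reaction operation for hydrogen abstraction (first three atoms only).
operation = [
    [0, -1, 1],
    [-1, 1, 0],
    [1, 0, -1],
]

# Pad the operation with zeros so that spectator atoms are left unchanged.
n = len(reactant)
padded = [[operation[i][j] if i < 3 and j < 3 else 0 for j in range(n)]
          for i in range(n)]

product = add_matrices(reactant, padded)
for row in product:
    print(row)
```

Reordering the atoms so that the reacting ones come first is what lets a small, reusable operation matrix be applied to a molecule of any size.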

Unique Patterns of Observed Enzyme Chemistry

EC i.j.k.l: unique enzyme

i: main class

j: functional group

k: cofactor / cosubstrate

4th level is specific to substrate

Tipton, S.B. and Boyce, S. Bioinformatics. 16 (2000), 34-40

Kanehisa, M. and Goto, S. Nucleic Acids Research. 28 (2000), 27-30

Formulation of Reaction Matrices Using Enzyme Classification System

Enzyme commission (EC) code number provides systematic names for enzymes

EC i.j.k.l: unique enzyme

i: the main class

j: the specific functional groups

k: cofactors

l: specific to the substrates



Generalized Enzyme Function Examined at the i.j.k Level

•More than 5,000 specific enzyme functions (i.j.k.l)

•Fewer than 250 generalized enzyme functions (i.j.k)

•Novel enzyme functions should be expected through genomic sequencing, proteomics and protein engineering
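To make the step from the i.j.k.l level to the i.j.k level concrete, here is a small Python sketch that truncates EC numbers to their third level and groups them. The function names are illustrative assumptions; the EC numbers are ones that appear elsewhere in this presentation.

```python
from collections import defaultdict


def generalized_class(ec_number):
    """Drop the substrate-specific fourth level: 'i.j.k.l' -> 'i.j.k'."""
    i, j, k, _l = ec_number.split(".")
    return f"{i}.{j}.{k}"


def group_by_third_level(ec_numbers):
    """Collect specific enzymes under their generalized i.j.k function."""
    groups = defaultdict(list)
    for ec in ec_numbers:
        groups[generalized_class(ec)].append(ec)
    return dict(groups)


examples = ["4.2.1.2", "4.2.1.3", "4.2.1.51", "2.6.1.57", "5.4.99.5", "1.3.1.12"]
groups = group_by_third_level(examples)
for third_level, members in groups.items():
    print(third_level, members)
```

Grouping in this way is how thousands of specific enzyme entries can collapse into a few hundred generalized functions.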

Example of a Generalized Enzyme Reaction

EC 4.2.1.2 (fumarate hydratase)

EC 4.2.1.3 (aconitate hydratase)

Generalized enzyme reaction (EC 4.2.1): H-C-C-O-H → C=C + H2O

[Figure: structural drawings of the two example reactions; only atom labels (HO2C, CO2H, OH, H) survive]

Matrix Representation of Generalized Enzyme Function (i.j.k)

Generalized enzyme reaction EC 4.2.1: H-C-C-O-H → C=C + HOH

Reactant + Reaction operator = Products

Reactant

  H C C O
H 0 1 0 0
C 1 0 1 0
C 0 1 0 1
O 0 0 1 4

Reaction operator

  H C C O
H 0 -1 0 1
C -1 0 1 0
C 0 1 0 -1
O 1 0 -1 0

Products

  H C C O
H 0 0 0 1
C 0 0 2 0
C 0 2 0 0
O 1 0 0 4

[Figure: structural drawings of reactant and products; only atom labels (HO2C, CO2H, OH, H) survive]
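The EC 4.2.1 matrices on this slide can be checked with a short Python sketch, given under the assumption that the atom order is H, C, C, O. It adds the operator to the reactant matrix and tests two properties a reaction operator should have: symmetry, and entries that sum to zero, so that the total valence-electron count (the sum of all BE-matrix entries) is unchanged. Variable names are illustrative.

```python
ATOMS = ["H", "C", "C", "O"]  # assumed atom order for the H-C-C-O-H pattern

reactant = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 4],  # oxygen carries four nonbonded electrons
]

operator = [
    [0, -1, 0, 1],   # break H-C, form H-O
    [-1, 0, 1, 0],   # the C-C bond order increases by one
    [0, 1, 0, -1],   # break C-O
    [1, 0, -1, 0],
]

size = len(ATOMS)
product = [[reactant[i][j] + operator[i][j] for j in range(size)]
           for i in range(size)]

is_symmetric = all(operator[i][j] == operator[j][i]
                   for i in range(size) for j in range(size))
electrons_conserved = sum(map(sum, operator)) == 0

for label, row in zip(ATOMS, product):
    print(label, row)
print("symmetric:", is_symmetric, "electrons conserved:", electrons_conserved)
```

The product rows show the C=C double bond (bond order 2) and the new H-O bond of the released water.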

Discovery of Novel Biosynthetic Routes

[Figure: iterative network generation. Generation 0: A + B. The generalized operators I.J.K, L.M.N and Q.R.S are applied in each generation, giving C (Generation 1), D (Generation 2) and E (Generation 3)]
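The generation-by-generation expansion can be sketched as a loop that applies every rule to the newest species, keeps only products not seen before, and stops at a generation limit or when nothing new appears. This is a minimal sketch, not the authors' implementation: species are plain strings, the rules are hypothetical single-reactant toy functions, and the labels I.J.K, L.M.N and Q.R.S are reused from the slide only as names.

```python
def generate_network(seeds, rules, max_generations):
    """Expand a reaction network one generation at a time.

    seeds: iterable of starting species (generation 0)
    rules: dict mapping a rule name to a function species -> set of products
    """
    species = set(seeds)
    reactions = set()
    frontier = set(seeds)
    for _generation in range(1, max_generations + 1):
        new_species = set()
        for reactant in sorted(frontier):
            for rule_name, rule in rules.items():
                for product in rule(reactant):
                    reactions.add((reactant, rule_name, product))
                    if product not in species:  # uniqueness determination
                        new_species.add(product)
        if not new_species:  # termination: the network has converged
            break
        species |= new_species
        frontier = new_species
    return species, reactions


# Hypothetical toy rules standing in for generalized enzyme operators.
toy_rules = {
    "I.J.K": lambda s: {"C"} if s == "A" else set(),
    "L.M.N": lambda s: {"D"} if s == "C" else set(),
    "Q.R.S": lambda s: {"E"} if s == "D" else set(),
}

species, reactions = generate_network({"A", "B"}, toy_rules, max_generations=3)
print(sorted(species))
print(sorted(reactions))
```

In a real generator the string comparison would be replaced by a structure-based uniqueness test, and each rule would be a reaction operator applied to matching atom patterns.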

Implications for Novel Pathway Development

Given a novel reaction (reactant/product), can we identify enzymes (catalysts) that could be engineered (evolved) to carry out this novel biotransformation?

If A gives B under 2.4.1 action, then target enzymes within the 2.4.1 class


Application of Reaction Matrix Approach

Step 1: Enumerate all enzymes in the EC system

Step 2: Choose a specific pathway to explore its synthetic ability

Example: Aromatic amino acid biosynthesis

•Exists in higher plants and microorganisms

•Pathway does not exist in mammals

Aromatic Amino Acid Biosynthesis: Phenylalanine and Tyrosine

chorismate → prephenate: chorismate mutase (5.4.99.5)

prephenate → phenylpyruvate: prephenate dehydratase (4.2.1.51)

prephenate → 4-hydroxyphenylpyruvate: prephenate dehydrogenase (1.3.1.12)

phenylpyruvate → phenylalanine: aromatic aminotransferase (2.6.1.57), with glutamate

4-hydroxyphenylpyruvate → tyrosine: aromatic aminotransferase (2.6.1.57), with glutamate

Reaction Misclassification (?)

Some reactions within classes are not general

General 4.2.1 reaction (4. = lyase): loses water (4.2.1 = hydro-lyase) AND forms a double bond. However…

4.2.1.51: prephenate → phenylpyruvate + H2O + CO2

It is both a carboxy-lyase (4.1.1) and a hydro-lyase (4.2.1)

Reaction Decomposition

prephenate dehydratase (4.2.1.51): releases H2O + CO2

4.2.1.51 can be broken down into 3 general reactions:

•4.1.1 will decarboxylate (4.1.1 is a carboxy-lyase)

•5.3.3 will rearrange the double bond (5.3.3 transposes C=C bonds)

•4.2.1 will lose H2O and form a double bond (4.2.1 is a hydro-lyase)

Mapping Results

•Although only 2500 reactions in the KEGG and 269 reactions in the iJR904 model were contained in the curated EC classes, 3267 (50%) of the KEGG reactions and 430 (46%) of the iJR904 reactions were reproduced using the 86 reaction rules

•The reproduced reactions are involved in 129 different third-level enzyme classes in the KEGG and iJR904

•100% of the reactions contained in 25 of the uncurated EC classes in the KEGG were mapped to the 86 existing reaction rules

Tryptophan Biosynthesis Pathway

Input Molecules: phosphoenolpyruvate (PEP), erythrose-4-phosphate (E4P), glutamine, serine, ribose-5-phosphate (R5P)

Cofactors: ATP, NADPH

Specific Enzyme Actions: 12





The Evolution and Wealth of the Aromatic Amino Acid Biochemistry

[Figure: Number of Products (log scale, 1 to 10^5) versus Generation Number (0 to 15) for TYR, TRP and PHE; annotations: Convergence, 7-carboxyindole]

Specialty Organic Chemicals

3-Hydroxypropanoate from Pyruvate

•3-Hydroxypropanoate is a useful chemical with known biochemical production routes

•Generate all of the possible compounds and reactions from pyruvate using only the reaction rules involved in the known biosynthetic routes to 3HP

•Generate all of the possible compounds and reactions from pyruvate using all of the 86 current reaction rules

Novel Biosynthetic Pathways Discovered: Pyruvate to 3HP

[Figure: Distribution of lengths of pathways. x-axis: Pathway length (2 to 10); y-axis: Number of pathways (log scale, 1 to 10^6). Series: pathways discovered using all reaction rules; pathways discovered using only the reaction rules involved in the known pathways to 3HP]

A pathway of length two and a pathway of length three were both discovered using the additional reaction rules
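A distribution of pathway lengths like the one plotted here can be obtained by enumerating simple (non-revisiting) paths between a source and a target in the generated network and counting them by length. The sketch below assumes the network is available as an adjacency dictionary; the toy graph uses placeholder labels S, X, Y and T rather than real compounds, and the function name is illustrative.

```python
from collections import Counter


def pathway_length_distribution(graph, source, target, max_length):
    """Count simple paths from source to target, keyed by number of steps."""
    counts = Counter()

    def extend(node, visited, length):
        if node == target:
            counts[length] += 1
            return
        if length == max_length:  # do not search beyond the length limit
            return
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:  # never revisit a compound
                extend(neighbor, visited | {neighbor}, length + 1)

    extend(source, {source}, 0)
    return dict(counts)


# Placeholder network: S is the starting compound, T the target.
toy_graph = {"S": ["X", "Y"], "X": ["T", "Y"], "Y": ["T"]}
distribution = pathway_length_distribution(toy_graph, "S", "T", max_length=10)
print(distribution)
```

Running the same count on networks built from different rule sets would give the two series compared in the figure.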

[Figure: compound network generated from pyruvate, first view. Compounds: Pyruvate, Oxaloacetate, Alanine, Lactate, Propenoate, Ethylamine, Hydroxyacetone, Homoserine, 2-hydroxy-2,4-pentadienoate, Propan-2-ol, Propene, Propan-1-ol, Propanoate, Threonine, Lactaldehyde, Acrolein, Beta-alanine, 3-hydroxypropanol, 3-HP, Propane-1,3-diol, Aspartate, Fumarate, Malate, Malonate semialdehyde, Ethanol, Allyl alcohol, Propane-1,2-diol, Ethylene, Acetaldehyde, Propanoyl-CoA, 3HP-CoA, Lactoyl-CoA, Acryloyl-CoA, Beta-alanyl-CoA, 3-Oxopropionyl-CoA]

[Figure: compound network generated from pyruvate, second view. Compounds: Pyruvate, Oxaloacetate, Lactate, Beta-alanine, 3-HP, Aspartate, Malonate semialdehyde, 3HP-CoA, Lactoyl-CoA, Acryloyl-CoA, Beta-alanyl-CoA]


What Screening Methods Can We Use to Identify the Most Attractive Pathways?

•Pathway length

•Fewest novel intermediates

•Thermodynamic feasibility

•Maximum achievable yield to 3HP from glucose during anaerobic growth

•Maximum achievable intracellular activity at which 3HP can be produced

•Protein docking calculations

•Quantum chemical investigations
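The first three screening criteria can be combined into a simple filter-and-rank step, sketched below in Python. Everything here is an assumption for illustration: the `Pathway` class, the rule that a pathway is feasible only if every step has an estimated ΔG at or below a threshold, and all numerical values, which are hypothetical and not results from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Pathway:
    name: str
    step_delta_g: list        # estimated ΔG per step in kJ/mol (hypothetical)
    novel_intermediates: int  # intermediates not found in known databases


def is_feasible(pathway, threshold=0.0):
    """Simplified feasibility test: every step must have ΔG <= threshold."""
    return all(delta_g <= threshold for delta_g in pathway.step_delta_g)


def screen(pathways, threshold=0.0):
    """Keep feasible pathways; rank by length, then by novel intermediates."""
    feasible = [p for p in pathways if is_feasible(p, threshold)]
    return sorted(feasible,
                  key=lambda p: (len(p.step_delta_g), p.novel_intermediates))


# Hypothetical candidates; the numbers are placeholders, not data.
candidates = [
    Pathway("P1", [-5.0, -2.0], novel_intermediates=1),
    Pathway("P2", [-1.0, 3.0, -8.0], novel_intermediates=0),
    Pathway("P3", [-4.0, -4.0, -1.0], novel_intermediates=0),
]

ranked = [p.name for p in screen(candidates)]
print(ranked)
```

The yield, activity, docking and quantum chemical criteria listed on the slide need separate models and would be applied to the survivors of a cheap filter such as this one.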

Shortest Novel Pathways to 3HP

[Figure: pathway map from pyruvate to 3HP, including Pathway N1 and Pathway N2]

Legend: Part of the Patented Pathway; KEGG Reaction not in Patented Pathways; Not found in KEGG or Patented Pathways

Compounds: Pyruvate, Oxaloacetate, Aspartate, Lactate, β-alanine, β-alanyl CoA, Lactoyl CoA, Acryloyl CoA, 3-HP-CoA, 3-HP, Malonate Semialdehyde, Propenoate, Alanine, Ethylamine, Acetaldehyde, 3-oxopropionyl CoA

Generalized enzyme operators: 1.1.1, 2.3.1, 2.3.1 Rev, 2.6.1, 4.1.1, 4.1.1 Rev, 4.2.1, 4.2.1 Rev, 4.3.1

Cosubstrates and byproducts: CO2, glu, 2-oxo, nad, nadh, H+, H2O, CoA, NH3


Attractive Novel Pathways Successfully Identified

•Two-step pathway identified with only one novel reaction

•Maximum achievable yield to 3HP from glucose during anaerobic growth matches commercial pathway

•Slightly reduced maximum achievable intracellular activity at which 3HP can be produced

•Numerous other attractive candidates

Are These Novel Reactions Feasible?

Can the enzyme that catalyzes decarboxylation of pyruvate perform catalysis of different substrates?

Decarboxylation reaction of ketoacids

PFOR (1.2.7.a): pyruvate + CoA + Fd (ox) → CO2 + acetyl-CoA + Fd (red)

Generalized enzyme operators can act on all of the above keto acids to give their corresponding products


Explore Novel Reactions Using Molecular Modeling

•Substrate binding
  - Docking analysis

•Ability to form initial enzyme-substrate bound species with no distortion to the active site of the enzyme or the cofactor
  - QM/MM structural studies

•Follow the reaction pattern of the native substrate
  - Study of reaction mechanism using QM methods

Enzyme Docking Results

Substrate                        PFOR 1.2.7
pyruvic acid                     -10.7
2-ketobutyric acid               -11.63
2-ketoisovaleric acid            -11.56
2-ketovaleric acid               -11.31
2-keto-3-methylvaleric acid      -11.27
2-keto-4-methylpentanoic acid    -11.01
phenylpyruvic acid               X

Scored using GLIDE
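The docking scores in this table can be compared against the native substrate with a few lines of Python. The sketch assumes that a more negative score indicates more favorable predicted binding and that the "X" for phenylpyruvic acid means no score was obtained; both are interpretations, not statements from the slide.

```python
# Scores copied from the table; None stands in for the "X" entry.
docking_scores = {
    "pyruvic acid": -10.7,
    "2-ketobutyric acid": -11.63,
    "2-ketoisovaleric acid": -11.56,
    "2-ketovaleric acid": -11.31,
    "2-keto-3-methylvaleric acid": -11.27,
    "2-keto-4-methylpentanoic acid": -11.01,
    "phenylpyruvic acid": None,
}

native = "pyruvic acid"
native_score = docking_scores[native]

# Substrates with a score, ranked from most to least negative.
scored = {name: s for name, s in docking_scores.items() if s is not None}
ranking = sorted(scored, key=scored.get)

# Non-native substrates whose score is at least as negative as the native one.
comparable = [name for name in ranking
              if name != native and scored[name] <= native_score]

print(ranking)
print(comparable)
```

Under these assumptions, every scored non-native keto acid ranks ahead of pyruvic acid.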

Enzyme Docking Poses

[Figure: six docking poses (panels 1 to 6) for pyruvic acid, 2-ketovaleric acid, 2-keto-4-methylpentanoic acid, 2-ketoisovaleric acid, 2-ketobutyric acid and 2-keto-3-methylvaleric acid]

Binding Using Quantum Mechanics/Molecular Mechanics

MM part: 50 Å of active site and solvent molecules, ~20,000 atoms

QM part: 63 atoms

Geometry: B3LYP/6-31G*

Comparison of Bound Structures of Different Acids: QM/MM

[Figure: bound structures of 2-keto-4-methylpentanoic acid, pyruvic acid, 2-keto-3-methylvaleric acid and 2-ketoisovaleric acid]

QM/MM structural studies suggest that the binding of the substrates does not cause distortions to the active site

Kinetics of Enzyme-Catalyzed Decarboxylation: Quantum Mechanics

[Figure: reaction scheme. ThDP ylide + KA → TS1 → LThDP → TS2 → HEThDP enamine]

Ville et al., Nature Chemical Biology, 2(6), 2006, 324

Free Energy Surface of Thiamine-Catalyzed Decarboxylation: Pyruvic Acid

[Figure: free energy surface. ThDP + pyruvic acid → TS 1 → LThDP → TS 2 → enamine + CO2]

Comparison of Thiamine-Catalyzed Decarboxylation

[Figure: Free energy barrier (ΔG activation, 298 K, DCE)]
Exploring Novel Pathways and Molecules

•New routes to bioavailable species
  Present in KEGG (Kyoto Encyclopedia of Genes and Genomes)
  1,3,4,5-Tetrahydroxy-cyclohexanecarboxylic acid

•New biochemical routes to existing chemicals
  NOT present in KEGG; Present in CAS REGISTRY
  1,3,5-Trihydroxy-4-oxo-cyclohexanecarboxylic acid

•New molecules
  NOT present in KEGG; NOT present in CAS REGISTRY
  3-[1-Carboxy-2-(1,4-dihydro-pyridin-3-yl)-ethoxy]-4-hydroxy-cyclohexa-1,5-dienecarboxylic acid

Migration to Biocatalytic Processes

Acknowledgments

Funding

•Department of Energy

•National Science Foundation Cyber-enabled Discovery and Innovation

Collaborators

•Vassily Hatzimanikatis

•Chunhui Li

•Chris Henry

•Goran Krilov

•Raj Assary