
Applying Bayesian networks to modeling of cell signaling pathways

Kathryn Armstrong and Reshma Shetty

Outline

- Biological model system (MAPK)
- Overview of Bayesian networks
- Design and development
- Verification
- Correlation with experimental data
- Issues
- Future work

MAPK Pathway

[Pathway diagram: the three-tier MAPK cascade with species E1, E2, KKK, KKK*, KK, KK-P, KK-PP, KK'ase, K, K-P, K-PP, and K'ase.]

Overview of Bayesian Networks

[Network diagram: Burglary and Earthquake are parent nodes of Alarm.]

Query: P(Burglary | Alarm) =
    [P(B, E, A) + P(B, ¬E, A)] / [P(B, E, A) + P(B, ¬E, A) + P(¬B, E, A) + P(¬B, ¬E, A)]

Givens:

P(Burglary) = 0.01
P(Earthquake) = 0.01

B     E     P(A)    P(¬A)
No    No    0.01    0.99
Yes   No    0.80    0.20
No    Yes   0.10    0.90
Yes   Yes   0.90    0.10
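The query above can be answered by summing the joint distribution over the unobserved Earthquake node. The Python sketch below works that example through with the givens from this slide; it is illustrative only and not part of the original presentation.

```python
# Worked example of the query above by enumerating the joint distribution.
# The numbers are the givens from this slide; the code is illustrative only.

P_B = 0.01  # P(Burglary)
P_E = 0.01  # P(Earthquake)

# P(Alarm = True | Burglary, Earthquake) from the conditional probability table
P_A_GIVEN = {
    (False, False): 0.01,
    (True, False): 0.80,
    (False, True): 0.10,
    (True, True): 0.90,
}

def joint(b, e, a):
    """P(B=b, E=e, A=a) = P(B=b) * P(E=e) * P(A=a | B=b, E=e)."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p_alarm = P_A_GIVEN[(b, e)]
    return p * (p_alarm if a else 1 - p_alarm)

# P(Burglary | Alarm) = sum over E of P(B, E, A) / sum over B, E of P(B, E, A)
numerator = sum(joint(True, e, True) for e in (False, True))
denominator = sum(joint(b, e, True) for b in (False, True) for e in (False, True))
print(f"P(Burglary | Alarm) = {numerator / denominator:.3f}")  # about 0.426
```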

Bayesian network model

[Network diagram: the MAPK pathway species (E1, E2, KKK, KKK*, KK, KK-P, KK-PP, KK'ase, K, K-P, K-PP, K'ase) as nodes of a Bayesian network.]

Simplifying Assumptions

- Normalized concentrations of all species
- Discretized continuous concentration curves at 20 states (see the sketch below)
- Considered steady-state behavior
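As a concrete reading of the discretization assumption, here is a minimal Python sketch that normalizes a concentration trace and bins it into 20 states. Uniform binning, the function name, and the toy trace are assumptions for illustration; the slides only state that 20 states were used.

```python
import numpy as np

def discretize(concentrations, n_states=20):
    """Normalize a concentration trace to [0, 1] and bin it into n_states
    discrete levels (0 .. n_states - 1). Uniform bins are an assumption;
    the slides do not specify the binning scheme."""
    c = np.asarray(concentrations, dtype=float)
    c = c / c.max()                               # normalized concentration
    states = np.floor(c * n_states).astype(int)   # raw bin index, 0 .. n_states
    return np.clip(states, 0, n_states - 1)       # fold the top edge into bin 19

# Example: a toy saturating response curve sampled over time
t = np.linspace(0.0, 10.0, 50)
trace = 1.0 - np.exp(-t)
print(discretize(trace)[:10])
```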


The key factor in determining the performance of a Bayesian network is the data used to train the network.

Training data → Probability tables → Bayesian network
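The "probability tables" step amounts to estimating each node's conditional probability table from the training data. Below is a minimal maximum-likelihood sketch based on counting; the column names echo the final training data set in the backup slides, but the code is an illustration, not the BNT/MATLAB workflow used for the actual model.

```python
from collections import Counter

# Toy training rows: (E1, E2, MAPKKPase, MAPKPase, MAPK-PP), all binary.
# In the real workflow these rows come from the ODE-generated training set.
rows = [
    (1, 0, 0, 0, 1), (1, 0, 0, 1, 1), (1, 0, 1, 1, 0),
    (0, 0, 0, 0, 0), (0, 1, 1, 0, 0), (1, 1, 0, 0, 1),
]

def cpt_mapk_pp(data):
    """Maximum-likelihood P(MAPK-PP = 1 | E1, E2, MAPKKPase, MAPKPase):
    the relative frequency of MAPK-PP = 1 for each observed parent setting."""
    hits, totals = Counter(), Counter()
    for e1, e2, kkpase, kpase, kpp in data:
        parents = (e1, e2, kkpase, kpase)
        totals[parents] += 1
        hits[parents] += kpp
    return {parents: hits[parents] / totals[parents] for parents in totals}

for parents, p in sorted(cpt_mapk_pp(rows).items()):
    print(parents, "->", p)
```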

Network training I: Data source

- Current experimental data sets did not provide enough information to train the network
- Relied on an ODE model (Huang et al.) to generate the training set (see the sketch after this list)
- Captured the essential steady-state behavior of the MAPK signaling pathway
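A toy sketch of generating one kind of training point from an ODE model: integrate a single activation/deactivation cycle to steady state and record the normalized active fraction. This stands in for the full Huang-Ferrell cascade; the rate constants, function name, and use of SciPy are illustrative assumptions, not the original implementation.

```python
import numpy as np
from scipy.integrate import solve_ivp

def steady_state_active_fraction(e1, pase, k_total=1.0, kf=1.0, kr=1.0):
    """Integrate dK*/dt = kf*E1*(Ktot - K*) - kr*Pase*K* long enough to reach
    steady state and return the normalized active fraction K*/Ktot. A single
    activation/deactivation cycle standing in for the full cascade."""
    def rhs(t, y):
        kstar = y[0]
        return [kf * e1 * (k_total - kstar) - kr * pase * kstar]

    sol = solve_ivp(rhs, (0.0, 1000.0), [0.0], rtol=1e-8)
    return sol.y[0, -1] / k_total

# Sweep the input enzyme E1 at a fixed phosphatase level to build training rows
for e1 in np.linspace(0.0, 1.0, 5):
    print(f"E1 = {e1:.2f} -> normalized K* = {steady_state_active_fraction(e1, pase=0.5):.3f}")
```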


Network training II: Poor data variation

Network training III: incomplete versus complete data sets

[Figure: a complete data set varies all four inputs (E1, E2, MAPKKPase, MAPKPase) jointly (4D), so generation time scales as (# samples)^4; an incomplete data set varies each input separately (1D x 4), so time scales as (# samples) x 4.]
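The cost difference comes from the size of the input grids: a full factorial over four inputs has n^4 points, while four one-at-a-time sweeps have only 4n. A small illustration (grid size and the hold-at-zero convention are assumed):

```python
from itertools import product

import numpy as np

n = 5                                    # samples per input dimension
grid = np.linspace(0.0, 1.0, n)
inputs = ["E1", "E2", "MAPKKPase", "MAPKPase"]

# Complete (4D) data set: every combination of the four inputs
complete = list(product(grid, repeat=len(inputs)))       # n**4 rows

# Incomplete (1D x 4) data set: sweep one input at a time, others held at zero
incomplete = []
for i in range(len(inputs)):
    for value in grid:
        row = [0.0] * len(inputs)
        row[i] = value
        incomplete.append(tuple(row))                    # n * 4 rows in total

print(len(complete), "rows in the complete set")         # 625 for n = 5
print(len(incomplete), "rows in the incomplete set")     # 20 for n = 5
```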

Verification: P(Kinase | E1, P'ases)

[Figure: side-by-side comparison of the Huang et al. ODE-model curves and the Bayesian network predictions.]

Verification: P(E1 | MAPK-PP, P'ases)

C.F. Huang and J.E. Ferrell, Proc. Natl. Acad. Sci. USA 93, 10078 (1996).
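The verification plots above are conditional-probability queries posed to the trained network. The original work used Kevin Murphy's BNT for MATLAB; purely as an illustration, a comparable query on a toy binary version of the model could be posed in Python with pgmpy (the library choice, network structure, and numbers are all assumptions):

```python
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination
from pgmpy.models import BayesianNetwork

# Toy two-parent network: E1 and a phosphatase jointly determine active K-PP.
model = BayesianNetwork([("E1", "KPP"), ("Pase", "KPP")])

cpd_e1 = TabularCPD("E1", 2, [[0.5], [0.5]])
cpd_pase = TabularCPD("Pase", 2, [[0.5], [0.5]])
# Rows are KPP = 0 and KPP = 1; columns enumerate the (E1, Pase) combinations.
cpd_kpp = TabularCPD(
    "KPP", 2,
    [[0.95, 0.60, 0.30, 0.10],   # P(KPP = 0 | parents), made-up numbers
     [0.05, 0.40, 0.70, 0.90]],  # P(KPP = 1 | parents)
    evidence=["E1", "Pase"], evidence_card=[2, 2],
)
model.add_cpds(cpd_e1, cpd_pase, cpd_kpp)
assert model.check_model()

# A query analogous to the verification slides: P(KPP | E1 high, phosphatase low)
infer = VariableElimination(model)
print(infer.query(variables=["KPP"], evidence={"E1": 1, "Pase": 0}))
```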

Correlation with experimental data

J.E. Ferrell and E.M. Machleder, Science 280, 895 (1998).

Where does our Bayesian network fail?

Inference from incomplete data

[Network diagram: the MAPK pathway Bayesian network as above, with nodes E1, E2, KKK, KKK*, KK, KK-P, KK-PP, KK'ase, K, K-P, K-PP, K'ase.]

Future work

- Time incorporation to represent signaling dynamics
- Continuous or more finely discretized sampling and modeling of node values
- Priors
- Bayesian posterior (see the sketch after this list)
- Structure learning
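For the priors / Bayesian posterior items: with a Beta prior (the binary case of the Dirichlet) on a conditional probability table entry, the posterior estimate is a smoothed count ratio rather than the raw maximum-likelihood frequency. A minimal sketch with made-up counts:

```python
def posterior_mean(successes, trials, alpha=1.0, beta=1.0):
    """Posterior mean of P(node = 1 | parent setting) under a Beta(alpha, beta)
    prior: (successes + alpha) / (trials + alpha + beta). With alpha = beta = 1
    this is Laplace smoothing; the maximum-likelihood estimate is successes/trials."""
    return (successes + alpha) / (trials + alpha + beta)

# Example: 3 of 4 training rows with a given parent setting had MAPK-PP active
print("ML estimate:       ", 3 / 4)                 # 0.75
print("Posterior estimate:", posterior_mean(3, 4))  # 4/6, about 0.667
```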

Open areas of research

- Should steady-state behavior be modeled with a directed acyclic graph? Hard, but doable.
- Cyclic networks: theoretically impossible. Need an alternate way to represent feedback loops.

Why use a Bayesian network?

- ODEs require detailed kinetic and mechanistic information on the pathway.
- Bayesian networks can model pathways well when large amounts of data are available, regardless of how well the pathway is understood.

Acknowledgments

- Kevin Murphy
- Doug Lauffenburger
- Paul Matsudaira
- Ali Khademhosseini
- BE400 students

References

- http://www.cs.berkeley.edu/~murphyk/Bayes/bayes.html
- http://www.ai.mit.edu/~murphyk/Software/BNT/usage.html
- A.R. Asthagiri and D.A. Lauffenburger, Biotechnol. Prog. 17, 227 (2001).
- A.R. Asthagiri, C.M. Nelson, A.F. Horowitz and D.A. Lauffenburger, J. Biol. Chem. 274, 27119 (1999).
- J.E. Ferrell and R.R. Bhatt, J. Biol. Chem. 272, 19008 (1997).
- J.E. Ferrell and E.M. Machleder, Science 280, 895 (1998).
- C.F. Huang and J.E. Ferrell, Proc. Natl. Acad. Sci. USA 93, 10078 (1996).
- F.V. Jensen, Bayesian Networks and Decision Graphs. Springer: New York, 2001.
- K.A. Gallo and G.L. Johnson, Nat. Rev. Mol. Cell Biol. 3, 663 (2002).
- K.P. Murphy, Computing Science and Statistics (2001).
- S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Prentice Hall: New York, 1995.
- K. Sachs, D. Gifford, T. Jaakkola, P. Sorger and D.A. Lauffenburger, Science STKE 148, 38 (2002).



Network training IV: final data set

E1   E2 (P'ase)   MAPKKPase   MAPKPase   MAPK-PP
0    0            0           0          0
0    0            0           1          0
0    0            1           0          0
0    0            1           1          0
0    1            0           0          0
0    1            0           1          0
0    1            1           0          0
0    1            1           1          0
1    0            0           0          1
1    0            0           1          1
1    0            1           0          1
1    0            1           1          0
1    1            0           0          1
1    1            0           1          0
1    1            1           0          0
1    1            1           1          0

Network training V: Final concentration ranges

Network training III: Observation of all input combinations

[Figure: 1D, 2D, 3D, and 4D visualizations of the input space spanned by E1, E2, MAPKKPase, and MAPKPase; observing all input combinations requires time = (# samples)^4.]