# Applying Bayesian networks to modeling of cell signaling ... - MIT

Nov 7, 2013

Applying Bayesian networks to modeling of cell signaling pathways

Kathryn Armstrong and Reshma Shetty

Outline

Biological model system (MAPK)

Overview of Bayesian networks

Design and development

Verification

Correlation with experimental data

Issues

Future work

MAPK Pathway

[Figure: MAPK cascade diagram. Labeled species: E1, E2, KKK, KKK*, KK, KK-P, KK-PP, K, K-P, K-PP, KK’ase, K’ase.]

Overview of Bayesian Networks

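[Figure: Bayesian network with nodes Burglary (B), Earthquake (E), and Alarm (A).]

Givens:

$P(\text{Burglary}) = 0.01$, $P(\text{Earthquake}) = 0.01$

| P(A) | P(¬A) | B | E |
|------|-------|-----|-----|
| 0.01 | 0.99 | No | No |
| 0.80 | 0.20 | Yes | No |
| 0.10 | 0.90 | No | Yes |
| 0.90 | 0.10 | Yes | Yes |

$$P(\text{Burglary} \mid \text{Alarm}) = \frac{P(B, E, A) + P(B, \neg E, A)}{P(B, E, A) + P(B, \neg E, A) + P(\neg B, E, A) + P(\neg B, \neg E, A)}$$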
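The query P(Burglary | Alarm) can be evaluated by summing joint probabilities over the unobserved Earthquake node. The Python sketch below does this by brute-force enumeration with the givens from the slide. It assumes that Burglary and Earthquake are independent root nodes with Alarm as their common child; the function and variable names are illustrative and do not come from the original slides.

```python
from itertools import product

# Priors and conditional probabilities taken from the slide.
p_b = 0.01  # P(Burglary)
p_e = 0.01  # P(Earthquake)
# P(Alarm = yes | Burglary, Earthquake), keyed by (burglary, earthquake).
p_a_given = {
    (False, False): 0.01,
    (True, False): 0.80,
    (False, True): 0.10,
    (True, True): 0.90,
}


def joint(b, e, a):
    """P(B=b, E=e, A=a) = P(b) * P(e) * P(a | b, e) for this network."""
    pb = p_b if b else 1 - p_b
    pe = p_e if e else 1 - p_e
    pa = p_a_given[(b, e)] if a else 1 - p_a_given[(b, e)]
    return pb * pe * pa


# Numerator: burglary and alarm both true, earthquake summed out.
numerator = sum(joint(True, e, True) for e in (False, True))
# Denominator: alarm true, burglary and earthquake both summed out.
denominator = sum(joint(b, e, True) for b, e in product((False, True), repeat=2))

p_burglary_given_alarm = numerator / denominator
print(round(p_burglary_given_alarm, 3))
```

With these numbers the posterior is roughly 0.43. An alarm raises the probability of a burglary well above its 0.01 prior but does not make it certain, because false alarms are about as likely as burglaries.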

Bayesian network model

[Figure: network over the MAPK cascade species E1, E2, KKK, KKK*, KK, KK-P, KK-PP, K, K-P, K-PP, KK’ase, K’ase.]

Simplifying Assumptions

Normalized concentrations of all species

Discretized continuous concentration curves at 20 states

Steady-state behavior
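As an illustration of the discretization assumption, the sketch below maps normalized concentrations to 20 integer states. The slides do not say how the bin edges were chosen, so the equal-width bins on [0, 1] are an assumption, and the function name `discretize` is hypothetical.

```python
import numpy as np

N_STATES = 20  # number of discrete states stated on the slide


def discretize(conc, n_states=N_STATES):
    """Map normalized concentrations in [0, 1] to integer states 0..n_states-1.

    Equal-width bins are an assumption; the slides do not specify the binning.
    """
    conc = np.clip(np.asarray(conc, dtype=float), 0.0, 1.0)
    states = np.floor(conc * n_states).astype(int)
    # A concentration of exactly 1.0 would land in bin n_states; fold it into the top bin.
    return np.minimum(states, n_states - 1)


print(discretize([0.0, 0.07, 0.5, 1.0]))
```

Each node of the network then takes one of 20 values, so the size of its probability table grows as a power of 20 with the number of parents.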

The key factor in determining the performance of a Bayesian network is the data used to train the network.

Training data → Probability tables → Bayesian network
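The step from training data to probability tables can be sketched as maximum-likelihood counting over fully observed samples. This is a minimal illustration and not the authors' training code; the function `learn_cpt`, the node names, and the four toy samples are hypothetical.

```python
from collections import Counter, defaultdict


def learn_cpt(samples, child, parents):
    """Estimate P(child | parents) by counting fully observed samples.

    samples: list of dicts mapping node name -> discrete state.
    Returns {parent_state_tuple: {child_state: probability}}.
    """
    counts = defaultdict(Counter)
    for sample in samples:
        key = tuple(sample[p] for p in parents)
        counts[key][sample[child]] += 1

    cpt = {}
    for key, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[key] = {state: n / total for state, n in child_counts.items()}
    return cpt


# Hypothetical toy training set (not data from the slides).
samples = [
    {"E1": 0, "K_PP": 0},
    {"E1": 0, "K_PP": 0},
    {"E1": 1, "K_PP": 1},
    {"E1": 1, "K_PP": 0},
]
cpt = learn_cpt(samples, child="K_PP", parents=["E1"])
print(cpt)
```

Parent states that never occur in the training set get no row at all, which is one way to see why the quality and coverage of the training data dominate the behavior of the trained network.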

Network training I: Data source

Current experimental data sets did not provide enough information to train the network

Relied on ODE model to generate training set (Huang et al.)

Steady-state behavior of the MAPK signaling pathway

Network training II: Poor data variation

Network training III: incomplete versus complete data sets

[Figure: sampling of the four inputs E1, E2, MAPKPase, and MAPKKPase.]

4D: $\text{Time} = (\#\,\text{samples})^4$

1D x 4: $\text{Time} = (\#\,\text{samples}) \times 4$
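The difference in cost between the two sampling schemes can be made concrete with a short sketch. It builds a full 4D grid over the four inputs and a one-at-a-time sweep in which each input is varied while the others are held fixed. The five levels per input and the baseline level of 0 are hypothetical choices for illustration.

```python
from itertools import product

inputs = ["E1", "E2", "MAPKKPase", "MAPKPase"]
n_samples = 5  # hypothetical number of levels per input
levels = range(n_samples)

# Complete data set: every combination of levels (the 4D grid).
complete = list(product(levels, repeat=len(inputs)))

# Incomplete data set: vary one input at a time, holding the others
# at a baseline level (assumed here to be level 0).
baseline = 0
one_at_a_time = []
for i in range(len(inputs)):
    for level in levels:
        point = [baseline] * len(inputs)
        point[i] = level
        one_at_a_time.append(tuple(point))

print(len(complete))       # n_samples ** 4
print(len(one_at_a_time))  # n_samples * 4
```

With five levels per input the grid needs 625 simulations against 20 for the sweep, but only the grid observes every combination of inputs.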

Verification: P(Kinase | E1, P’ases)

[Figure: side-by-side panels, Huang et al. and Bayesian network.]

Verification: P(E1 | MAPK-PP, P’ases)

C.F. Huang and J.E. Ferrell, 93, 10078 (1996).

Correlation with experimental data

J.E. Ferrell and E.M. Machleder, Science 280, 895 (1998).

Where does our Bayesian network fail?

Inference from incomplete data

[Figure: MAPK cascade network. Labeled species: E1, E2, KKK, KKK*, KK, KK-P, KK-PP, K, K-P, K-PP, KK’ase, K’ase.]

Future work

Time incorporation to represent signaling dynamics

Continuous or more finely discretized sampling and modeling of node values

Priors

Bayesian posterior

Structure learning
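One common way to bring priors into a discrete Bayesian network is to place a Dirichlet prior on each row of a probability table, so that the posterior mean is the observed counts plus pseudo-counts, renormalized. The sketch below shows that calculation for a single row. Whether this is the kind of prior the authors had in mind is an assumption, and the function name and the value of `alpha` are hypothetical.

```python
def posterior_mean_cpt_row(counts, n_states, alpha=1.0):
    """Posterior mean of one CPT row under a symmetric Dirichlet(alpha) prior.

    counts: dict mapping child state -> number of observations for one
            fixed parent configuration.
    """
    total = sum(counts.get(s, 0) for s in range(n_states)) + alpha * n_states
    return [(counts.get(s, 0) + alpha) / total for s in range(n_states)]


# Four child states, two of them never observed in training.
row = posterior_mean_cpt_row({0: 3, 1: 1}, n_states=4, alpha=1.0)
print(row)

# With no observations at all, the row falls back to the uniform prior.
empty_row = posterior_mean_cpt_row({}, n_states=4, alpha=1.0)
print(empty_row)
```

Unobserved states keep a small nonzero probability instead of zero, which softens the effect of training data that does not cover every state.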

Open areas of research

modeled with a directed acyclic graph?

Cyclic networks

Hard, but doable

Theoretically impossible

Need an alternate way to represent feedback loops
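The acyclicity constraint can be checked mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to test whether a set of directed edges forms a directed acyclic graph. The edge list is a simplified, hypothetical chain through the cascade, and the feedback edge is added purely for illustration; neither is taken from the authors' network structure.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(edges):
    """Return True if the directed graph given as (parent, child) pairs has no cycle."""
    sorter = TopologicalSorter()
    for parent, child in edges:
        sorter.add(child, parent)  # parent must come before child
    try:
        sorter.prepare()  # raises CycleError if the graph contains a cycle
    except CycleError:
        return False
    return True


# Hypothetical simplified cascade: E1 -> KKK* -> KK-PP -> K-PP.
cascade = [("E1", "KKK*"), ("KKK*", "KK-PP"), ("KK-PP", "K-PP")]
# Hypothetical feedback loop from the output back to the top kinase.
feedback = [("K-PP", "KKK*")]

print(is_dag(cascade))             # True
print(is_dag(cascade + feedback))  # False
```

A single feedback edge is enough to break the structure a standard Bayesian network requires, which is why feedback loops need some other representation.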

Why use a Bayesian network?

ODEs require detailed kinetic and mechanistic information about the pathway.

Bayesian networks can model pathways well when large amounts of data are available, regardless of how well the pathway is understood.

Acknowledgments

Kevin Murphy

Doug Lauffenburger

Paul Matsudaira

BE400 students

References

http://www.cs.berkeley.edu/~murphyk/Bayes/bayes.html

http://www.ai.mit.edu/~murphyk/Software/BNT/usage.html

A.R. Asthagiri and D.A. Lauffenburger, Biotechnol. Prog. 17, 227 (2001).

A.R. Asthagiri, C.M. Nelson, A.F. Horowitz and D.A. Lauffenburger, J. Biol. Chem. 274, 27119 (1999).

J.E. Ferrell and R.R. Bhatt, J. Biol. Chem. 272, 19008 (1997).

J.E. Ferrell and E.M. Machleder, Science 280, 895 (1998).

C.F. Huang and J.E. Ferrell, 93, 10078 (1996).

F.V. Jensen, Bayesian Networks and Decision Graphs. Springer: New York, 2001.

K.A. Gallo and G.L. Johnson, Nat. Rev. Mol. Cell Biol. 3, 663 (2002).

K.P. Murphy, Computing Science and Statistics (2001).

S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Prentice Hall: New York, 1995.

K. Sachs, D. Gifford, T. Jaakkola, P. Sorger and D.A. Lauffenburger, Science STKE 148, 38 (2002).

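Network training IV: final data set

| E1 | E2 (P’ase) | MAPKKPase | MAPKPase | MAPK-PP |
|----|------------|-----------|----------|---------|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 1 | 0 |
| 0 | 0 | 1 | 0 | 0 |
| 0 | 0 | 1 | 1 | 0 |
| 0 | 1 | 0 | 0 | 0 |
| 0 | 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 |
| 1 | 0 | 0 | 1 | 1 |
| 1 | 0 | 1 | 0 | 1 |
| 1 | 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 0 | 1 |
| 1 | 1 | 0 | 1 | 1 |
| 1 | 1 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 | 0 |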

Network training V: Final concentration ranges

Network training III: Observation of all input combinations

[Figure: 1D, 2D, 3D, and 4D visualizations of the sampled input space, with axes E1, E2, MAPKKPase, and MAPKPase. Annotation: $\text{Time} = (\#\,\text{samples})^4$.]