# Using Bayesian Networks to Analyze Expression Data

Nov 7, 2013

Presented by Jeong, Jong Cheol
Dept. of Electrical Engineering & Computer Science, University of Kansas

Original authors: Nir Friedman, Michal Linial, Iftach Nachman & Dana Pe'er

## Bayesian Networks

A Bayesian network is a representation of a joint probability distribution. It consists of:

- A directed acyclic graph whose nodes are random variables
- A conditional distribution for each variable given its parents

Conditional independence assumption:

- Each variable is independent of its non-descendants given its parents

Under this assumption, any joint distribution can be decomposed into product form:

$$P(X_1,\ldots,X_n) = \prod_{i=1}^{n} P(X_i \mid \mathbf{Pa}^G(X_i))$$

where $\mathbf{Pa}^G(X_i)$ is the set of parents of $X_i$ in $G$.

## Bayesian Networks: Example

Conditional independencies encoded by the example network:

I(A; E), I(B; D | A, E), I(C; A, D, E | B), I(D; B, C, E | A), I(E; A, D)

P(A,B,C,D,E) = P(A)·P(B|A,E)·P(C|B)·P(D|A)·P(E)
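As a sketch of what this factorization buys us, the joint below is evaluated as a product of five small conditional tables. All CPT values are made-up illustrations, not numbers from the paper:

```python
# Evaluating the joint via the factorization
# P(A,B,C,D,E) = P(A)*P(B|A,E)*P(C|B)*P(D|A)*P(E).
# All variables are binary; every CPT number below is a made-up illustration.
from itertools import product

p_a1 = 0.4                                                   # P(A=1)
p_e1 = 0.3                                                   # P(E=1)
p_b1 = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}  # P(B=1 | A, E)
p_c1 = {0: 0.2, 1: 0.8}                                      # P(C=1 | B)
p_d1 = {0: 0.3, 1: 0.7}                                      # P(D=1 | A)

def bern(p1, x):
    """Probability of outcome x under a Bernoulli with P(1) = p1."""
    return p1 if x == 1 else 1.0 - p1

def joint(a, b, c, d, e):
    """P(A,B,C,D,E) computed as the product of local conditionals."""
    return (bern(p_a1, a) * bern(p_b1[(a, e)], b) * bern(p_c1[b], c)
            * bern(p_d1[a], d) * bern(p_e1, e))

# The network needs only 10 free parameters instead of 2**5 - 1 = 31,
# and the product still defines a proper distribution: it sums to 1.
total = sum(joint(*xs) for xs in product([0, 1], repeat=5))
print(round(total, 10))  # 1.0
```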

## Specifying Conditional Distributions

Discrete variables:

- Use a table that specifies the probability of each value of the variable for every combination of its parents' values
- For a binary variable with $k$ parents, the table specifies $2^k$ distributions

Continuous variables:

- Use a linear Gaussian distribution:

$$P(X \mid u_1,\ldots,u_k) \sim N\!\left(a_0 + \sum_i a_i u_i,\; \sigma^2\right)$$

where $\{U_1,\ldots,U_k\}$ are the parents of $X$.
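The linear Gaussian density can be sketched in a few lines; the coefficients and the query point below are illustrative assumptions, not values from the paper:

```python
import math

def linear_gaussian_pdf(x, u, a0, a, sigma):
    """Density of X at x given parent values u: N(a0 + sum_i a_i*u_i, sigma^2)."""
    mean = a0 + sum(ai * ui for ai, ui in zip(a, u))  # linear function of parents
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Two parents: X | u1, u2 ~ N(0.5 + 1.0*u1 - 0.3*u2, 0.8**2)  (made-up numbers)
density = linear_gaussian_pdf(1.2, [0.7, 0.1], a0=0.5, a=[1.0, -0.3], sigma=0.8)
print(density)
```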

## Equivalence Classes of Bayesian Networks

Two directed acyclic graphs are equivalent if and only if they have the same underlying undirected graph and the same v-structures (Pearl & Verma 1991).

- A v-structure is a pair of directed edges converging into the same node: $a \rightarrow b \leftarrow c$
- Ind(G): the set of independence statements implied by G
- More than one graph can imply exactly the same set of independencies: for example, $G: X \rightarrow Y$ and $G': Y \rightarrow X$ satisfy Ind(G) = Ind(G')
- How can we distinguish between equivalent graphs?
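The Pearl & Verma criterion is mechanical to check. A minimal sketch, with illustrative edge lists:

```python
# Two DAGs are equivalent iff they share the same skeleton (underlying
# undirected graph) and the same v-structures (Pearl & Verma 1991).

def skeleton(edges):
    """Undirected version of a directed edge list."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Triples (a, c, b) with a -> c <- b where a and b are not adjacent."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    skel = skeleton(edges)
    vs = set()
    for c, ps in parents.items():
        for a in ps:
            for b in ps:
                if a < b and frozenset((a, b)) not in skel:
                    vs.add((a, c, b))
    return vs

def equivalent(e1, e2):
    return skeleton(e1) == skeleton(e2) and v_structures(e1) == v_structures(e2)

# X -> Y and Y -> X are equivalent (same skeleton, no v-structures) ...
print(equivalent([("X", "Y")], [("Y", "X")]))                          # True
# ... but A -> C <- B cannot have an edge reversed: the v-structure changes.
print(equivalent([("A", "C"), ("B", "C")], [("C", "A"), ("B", "C")]))  # False
```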
## Learning Bayesian Networks

Training set $D = \{\mathbf{x}^1,\ldots,\mathbf{x}^N\}$:

- Goal: find a network $B = \langle G, \Theta \rangle$, i.e., a structure $G$ together with conditional distributions $P(X_i \mid \mathbf{Pa}^G(X_i))$, which best matches D

Score function: evaluate the posterior probability of a graph given the data:

$$S(G : D) = \log P(G \mid D) = \log P(D \mid G) + \log P(G) + C$$

$$P(D \mid G) = \int P(D \mid G, \Theta)\, P(\Theta \mid G)\, d\Theta$$

- $P(D \mid G)$ is the marginal likelihood

Properties of the priors:

- Structure equivalence: if graphs G and G' are equivalent, they are guaranteed to have the same posterior score.
- Decomposability: the contribution of each variable $X_i$ to the total network score depends only on its own value and the values of its parents in G:

$$S(G : D) = \sum_i \mathrm{ScoreContribution}(X_i, \mathbf{Pa}^G(X_i) : D)$$

- The local contribution for each variable can be computed using a closed-form equation.
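Decomposability can be made concrete with any per-family local score. As a sketch, the snippet below uses the family log-likelihood with a BIC penalty as one closed-form choice (not necessarily the score used in the paper); the binary data set and structures are made up:

```python
# A decomposable score sums one closed-form contribution per family
# (X_i, Pa(X_i)). Here: family log-likelihood minus a BIC penalty.
import math
from collections import Counter

def family_contribution(data, child, parents):
    """Log-likelihood of `child` given `parents`, minus a BIC penalty."""
    n = len(data)
    joint = Counter(tuple(row[p] for p in parents) + (row[child],) for row in data)
    par = Counter(tuple(row[p] for p in parents) for row in data)
    ll = sum(c * math.log(c / par[key[:-1]]) for key, c in joint.items())
    n_params = len(par)  # one free parameter per parent configuration (binary child)
    return ll - 0.5 * math.log(n) * n_params

def score(data, structure):
    """Decomposable score: S(G:D) = sum_i contribution(X_i, Pa(X_i) : D)."""
    return sum(family_contribution(data, x, pa) for x, pa in structure.items())

# Toy data: B copies A, C is the complement of A. Rows are dicts of values.
data = [{"A": a, "B": a, "C": 1 - a} for a in (0, 1)] * 10

# The structure A -> B, A -> C explains the data better than the empty graph.
with_edges = score(data, {"A": (), "B": ("A",), "C": ("A",)})
no_edges = score(data, {"A": (), "B": (), "C": ()})
print(with_edges > no_edges)  # True
```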
## Learning Causal Patterns

A Bayesian network is a model of dependencies between multiple measurements.

A causal network has a stricter interpretation of the meaning of edges: the parents of a variable are its immediate causes. In a causal network, $X \rightarrow Y$ asserts that X causes Y, whereas as Bayesian networks $X \rightarrow Y$ and $Y \rightarrow X$ are equivalent.
## Using Bayesian Networks to Analyze Expression Data (Friedman et al.)

Goal:

- Build Bayesian networks that can be applied to model interactions among genes
- Examine two kinds of features:
  1. Markov relations
  2. Order relations
## Estimating Statistical Confidence in Features

Using the bootstrap method:

For $i = 1,\ldots,m$:

- $D_i$: sample N instances from D with replacement
- Apply the learning procedure on $D_i$ to induce a network structure $G_i$

For each feature $f$, calculate:

$$\mathrm{conf}(f) = \frac{1}{m} \sum_{i=1}^{m} f(G_i)$$

where

$$f(G_i) = \begin{cases} 1 & \text{if } f \text{ is a feature in } G_i \\ 0 & \text{otherwise} \end{cases}$$
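A minimal sketch of this estimate, where `learn_structure` is a hypothetical placeholder standing in for the actual structure-learning procedure (here it trivially "learns" an edge whenever two columns always agree); data and names are illustrative:

```python
# Bootstrap confidence: learn a structure from each resampled data set and
# report the fraction of learned structures that contain the feature.
import random

def learn_structure(data):
    """Placeholder learner: edge (X, Y) iff the two columns are identical."""
    cols = list(data[0])
    return {(x, y) for x in cols for y in cols
            if x < y and all(row[x] == row[y] for row in data)}

def bootstrap_confidence(data, feature, m=200, seed=0):
    """conf(f) = (1/m) * sum_i 1[f is in G_i], with G_i learned from D_i."""
    rng = random.Random(seed)
    n = len(data)
    hits = 0
    for _ in range(m):
        d_i = [rng.choice(data) for _ in range(n)]  # N instances, with replacement
        hits += feature in learn_structure(d_i)
    return hits / m

# Toy data: B always copies A, C is arbitrary noise.
data = [{"A": a, "B": a, "C": random.Random(i).randint(0, 1)}
        for i, a in enumerate([0, 1] * 10)]
print(bootstrap_confidence(data, ("A", "B")))  # 1.0: the edge survives every resample
```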

## Sparse Candidate Algorithm

- Choose the most promising candidate parents $C_i^n = \{Y_1,\ldots,Y_k\}$ for each variable $X_i$
- Search for a high-scoring network $G_n$ in which $\mathbf{Pa}^{G_n}(X_i) \subseteq C_i^n$
- Repeat:
  - if $\mathrm{Score}(G_{n+1}) > \mathrm{Score}(G_n)$, set $G_n \leftarrow G_{n+1}$
- Until $C_i^n$ no longer changes

The relevance of a potential parent $X_j$ to $X_i$ is measured by the score gain:

$$\mathrm{ScoreContribution}(X_i, \mathbf{Pa}^{G_n}(X_i) \cup \{X_j\} : D) - \mathrm{ScoreContribution}(X_i, \mathbf{Pa}^{G_n}(X_i) : D)$$
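This relevance measure can be sketched with the family log-likelihood standing in for the closed-form local score (any decomposable score would do); the toy data and variable names are illustrative:

```python
# Relevance of a candidate parent X_j for X_i: the gain in X_i's local score
# contribution from adding X_j to the current parent set.
import math
from collections import Counter

def family_contribution(data, child, parents):
    """Family log-likelihood of `child` given `parents` (local score stand-in)."""
    joint = Counter(tuple(row[p] for p in parents) + (row[child],) for row in data)
    par = Counter(tuple(row[p] for p in parents) for row in data)
    return sum(c * math.log(c / par[key[:-1]]) for key, c in joint.items())

def relevance(data, x_i, x_j, current_parents):
    """ScoreContribution(X_i, Pa ∪ {X_j} : D) - ScoreContribution(X_i, Pa : D)."""
    return (family_contribution(data, x_i, current_parents + (x_j,))
            - family_contribution(data, x_i, current_parents))

# B copies A while C varies independently: A should be the more relevant parent.
data = [{"A": a, "B": a, "C": c} for a in (0, 1) for c in (0, 1)] * 5
print(relevance(data, "B", "A", ()) > relevance(data, "B", "C", ()))  # True
```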

## Application to Cell Cycle Expression Patterns

Data set: S. cerevisiae ORFs

- 76 gene expression measurements of the mRNA levels of 6177 ORFs
- Sparse candidate algorithm with a 200-fold bootstrap

Experiments:

- Discrete multinomial distribution
- Linear Gaussian distribution

Markov features: (Figure: local map for the gene SVS1, under the multinomial and linear Gaussian models.)

## References

Friedman, N., Linial, M., Nachman, I., & Pe'er, D. (2000). Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7(3-4), 601-620.