Protein-Protein Interaction


06/05/2008


Jae Hyun Kim

Chapter 1

Probability Theory (i): One Random Variable

Bioinformatics Tea Seminar: Statistical Methods in Bioinformatics


Content

- Discrete Random Variable
- Discrete Probability Distributions
- Probability Generating Functions
- Continuous Random Variable
- Probability Density Functions
- Moment Generating Functions

jaekim@ku.edu


Discrete Random Variable

- Discrete random variable: a numerical quantity that, in some experiment (sample space) involving some degree of randomness, takes one value from some discrete set of possible values (event)
- Sample space: the set of all outcomes of an experiment (or observation). For example:
  - Flip a coin: {H, T}
  - Toss a die: {1, 2, 3, 4, 5, 6}
  - Sum of two dice: {2, 3, …, 12}
- Event: any subset of the outcomes


Discrete Probability Distributions

- The probability distribution: the set of values that a random variable can take, together with their associated probabilities
- Example: Y = total number of heads when a coin is flipped twice
  - Probability distribution function: $P_Y(y) = \mathrm{Prob}(Y = y)$
  - Cumulative distribution function: $F_Y(y) = \mathrm{Prob}(Y \le y)$
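The two-flip example can be worked out by enumerating the sample space. The sketch below assumes a fair coin (the transcript does not state this); the names `outcomes`, `pmf`, and `cdf` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumption: the coin is fair, so each of the 4 outcomes has probability 1/4.
outcomes = list(product("HT", repeat=2))
prob_each = Fraction(1, len(outcomes))

# Probability distribution function: P_Y(y) = Prob(Y = y)
pmf = {}
for outcome in outcomes:
    y = outcome.count("H")  # Y = total number of heads
    pmf[y] = pmf.get(y, Fraction(0)) + prob_each

# Cumulative distribution function: F_Y(y) = Prob(Y <= y)
cdf = {}
running = Fraction(0)
for y in sorted(pmf):
    running += pmf[y]
    cdf[y] = running

for y in sorted(pmf):
    print(y, pmf[y], cdf[y])
```

Using `Fraction` keeps the probabilities exact, so the values can be compared directly with a hand calculation.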


One Bernoulli Trial

- A Bernoulli trial: a single trial with two possible outcomes, “success” or “failure”
- Probability of success = p


The Binomial Distribution

- The binomial random variable: the number of successes in a fixed number n of independent Bernoulli trials with the same probability of success for each trial
- Requirements
  - Each trial must result in one of two possible outcomes
  - The various trials must be independent
  - The probability of success must be the same on all trials
  - The number n of trials must be fixed in advance


Bernoulli Trial and Binomial Distribution

- Comments
  - A single Bernoulli trial is the special case (n = 1) of the binomial distribution
  - The probability p is often an unknown parameter
  - There is no simple formula for the cumulative distribution function of the binomial distribution
  - There is no unique “binomial distribution,” but rather a family of distributions indexed by n and p
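As an illustration of these comments, here is a minimal sketch of the standard binomial formula $P_Y(y) = \binom{n}{y} p^y (1-p)^{n-y}$. The function names and the values n = 10, p = 0.3 are chosen for illustration and are not from the slides. Since there is no simple closed form for the cumulative distribution function, it is computed by direct summation.

```python
from math import comb

def binomial_pmf(y, n, p):
    """Prob(Y = y) for the number of successes Y in n independent Bernoulli trials."""
    if y < 0 or y > n:
        return 0.0
    return comb(n, y) * p**y * (1 - p)**(n - y)

def binomial_cdf(y, n, p):
    """Prob(Y <= y), obtained by summing the pmf term by term."""
    return sum(binomial_pmf(k, n, p) for k in range(0, y + 1))

# Illustrative parameters (assumed, not from the slides).
n, p = 10, 0.3
total = sum(binomial_pmf(y, n, p) for y in range(n + 1))
print(total)                   # the probabilities sum to 1 (up to rounding)
print(binomial_pmf(1, 1, p))   # n = 1 reduces to a single Bernoulli trial
print(binomial_cdf(3, n, p))   # Prob(Y <= 3)
```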


The Hypergeometric Distribution

- Hypergeometric distribution
  - N objects (n red, N - n white)
  - m objects are taken at random, without replacement
  - Y = number of red objects taken
- Biological example
  - N lab mice (n male, N - n female)
  - m mutations
  - The number Y of mutant males has a hypergeometric distribution
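A minimal sketch of the standard hypergeometric formula $P_Y(y) = \binom{n}{y}\binom{N-n}{m-y} / \binom{N}{m}$, using the slide's symbols N, n, and m. The numbers in the mouse example (20 mice, 8 male, 5 mutations) are hypothetical and chosen only to make the code concrete.

```python
from math import comb

def hypergeom_pmf(y, N, n, m):
    """Prob(Y = y): y red objects among m drawn without replacement
    from N objects, of which n are red and N - n are white."""
    if y < max(0, m - (N - n)) or y > min(n, m):
        return 0.0
    return comb(n, y) * comb(N - n, m - y) / comb(N, m)

# Hypothetical numbers: N = 20 mice, n = 8 male, m = 5 mutations.
N, n, m = 20, 8, 5
pmf = [hypergeom_pmf(y, N, n, m) for y in range(m + 1)]
mean = sum(y * q for y, q in enumerate(pmf))

print(sum(pmf))  # the probabilities sum to 1 (up to rounding)
print(mean)      # agrees with m * n / N
```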


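The Uniform/Geometric Distribution

- The uniform distribution: every value in the range has the same probability
- The geometric distribution: the number Y of Bernoulli trials before, but not including, the first failure
  - Probability distribution function: $P_Y(y) = p^y (1 - p)$, $y = 0, 1, 2, \ldots$
  - Cumulative distribution function: $F_Y(y) = 1 - p^{y+1}$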
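With the slide's definition (Y counts the trials before the first failure, and p is the probability of success), the geometric probabilities and their cumulative sum can be checked numerically. This is a sketch with an arbitrary illustrative value p = 0.6.

```python
def geometric_pmf(y, p):
    """Prob(Y = y): y successes in a row, then the first failure."""
    return p**y * (1 - p)

def geometric_cdf(y, p):
    """Prob(Y <= y) = 1 - p**(y + 1): the complement of y + 1 straight successes."""
    return 1 - p**(y + 1)

p = 0.6  # illustrative success probability (assumed)
partial = sum(geometric_pmf(k, p) for k in range(0, 5))

print(partial)              # Prob(Y <= 4) by direct summation
print(geometric_cdf(4, p))  # the same value from the closed form
```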


The Poisson Distribution

- The Poisson distribution: counts events that occur randomly in time/space
  - For example, the number of phone calls arriving in a fixed time interval
- Approximation of the binomial distribution
  - When
    - n is large
    - p is small
    - np is moderate
  - $\mathrm{Binomial}(n, p, x) \approx \mathrm{Poisson}(\lambda, x)$, with $\lambda = np$
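The quality of the Poisson approximation can be checked numerically. The sketch below uses the standard Poisson formula $e^{-\lambda}\lambda^x/x!$ and illustrative values n = 1000, p = 0.003 (assumed, not from the slides), which give $\lambda = np = 3$.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability Prob(Y = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability exp(-lam) * lam**x / x!."""
    return exp(-lam) * lam**x / factorial(x)

# Illustrative regime: n large, p small, np moderate.
n, p = 1000, 0.003
lam = n * p

max_diff = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(21)
)
for x in range(6):
    print(x, binomial_pmf(x, n, p), poisson_pmf(x, lam))
print("largest pointwise difference:", max_diff)
```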


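Mean

- Mean / expected value: $\mu = E(Y) = \sum_y y\, P_Y(y)$
- Expected value of g(y): $E(g(Y)) = \sum_y g(y)\, P_Y(y)$
- Example
- Linearity property: $E(\alpha + \beta Y) = \alpha + \beta\, E(Y)$
- In general,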


Variance

- Definition: $\sigma^2 = \mathrm{Var}(Y) = E[(Y - \mu)^2] = \sum_y (y - \mu)^2\, P_Y(y)$
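The definitions of mean and variance translate directly into sums over a probability distribution. The sketch below uses a fair six-sided die as an illustrative distribution (an assumption, not taken from the slides) and checks the linearity property for the arbitrary constants 2 and 3.

```python
from fractions import Fraction

def expected(g, pmf):
    """E(g(Y)) = sum over y of g(y) * P_Y(y)."""
    return sum(g(y) * q for y, q in pmf.items())

def mean(pmf):
    """E(Y), the special case g(y) = y."""
    return expected(lambda y: y, pmf)

def variance(pmf):
    """E[(Y - mu)^2]."""
    mu = mean(pmf)
    return expected(lambda y: (y - mu) ** 2, pmf)

# Illustrative distribution: a fair die.
die = {y: Fraction(1, 6) for y in range(1, 7)}

mu = mean(die)
var = variance(die)
linear = expected(lambda y: 2 + 3 * y, die)   # equals 2 + 3 * mu
square = expected(lambda y: y * y, die)       # differs from mu ** 2

print(mu, var, linear, square)
```

The last value shows that E(g(Y)) and g(E(Y)) can differ when g is not linear: here E(Y^2) is 91/6 while (E(Y))^2 is 49/4.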

Summary



General Moments

- Moment
  - r-th moment of the probability distribution about zero: $E(Y^r) = \sum_y y^r\, P_Y(y)$
  - Mean: first moment (r = 1)
  - r-th moment about the mean: $E[(Y - \mu)^r] = \sum_y (y - \mu)^r\, P_Y(y)$
  - Variance: r = 2


Probability-Generating Function

- PGF: $P(t) = E(t^Y) = \sum_y P_Y(y)\, t^y$
- Used to derive moments
  - Mean: $\mu = P'(1)$
  - Variance: $\sigma^2 = P''(1) + \mu - \mu^2$
- If two random variables X and Y have identical probability generating functions, they are identically distributed
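For a distribution on finitely many non-negative integers, the generating function $\sum_y P_Y(y)\, t^y$ is a polynomial, so its derivatives at t = 1 can be computed exactly. The sketch below recovers the mean as the first derivative at 1 and the variance as the second derivative at 1 plus the mean minus the mean squared. The Binomial(4, 1/3) example and the helper name `pgf_derivative_at_1` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def pgf_derivative_at_1(pmf, order):
    """order-th derivative at t = 1 of the polynomial sum_y pmf[y] * t**y.

    Differentiating t**y `order` times and setting t = 1 leaves the
    falling factorial y * (y - 1) * ... * (y - order + 1).
    """
    total = Fraction(0)
    for y, q in pmf.items():
        falling = 1
        for j in range(order):
            falling *= (y - j)
        total += falling * q
    return total

# Illustrative distribution: Binomial(n = 4, p = 1/3).
n, p = 4, Fraction(1, 3)
pmf = {y: comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)}

d1 = pgf_derivative_at_1(pmf, 1)
d2 = pgf_derivative_at_1(pmf, 2)
mean = d1
variance = d2 + d1 - d1**2

print(mean, variance)  # agree with n*p and n*p*(1-p)
```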


Continuous Random Variable

- Probability density function f(x)
- Probability: $\mathrm{Prob}(a < X < b) = \int_a^b f(x)\,dx$
- Cumulative distribution function: $F(x) = \int_{-\infty}^{x} f(u)\,du$


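Mean and Variance

- Mean: $\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
- Variance: $\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
- Mean value of the function g(X): $E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$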


Chebyshev’s Inequality

- Chebyshev’s inequality: $\mathrm{Prob}(|X - \mu| \ge d\sigma) \le 1/d^2$ for any $d > 0$
- Proof
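The proof on the original slide is not preserved in this transcript. One standard argument for a continuous random variable with density f(x), mean $\mu$, and variance $\sigma^2 > 0$ is sketched below; it may differ from the one presented in the seminar. The constant d > 0 is the multiple of $\sigma$ that sets the width of the interval around the mean.

```latex
\begin{align*}
\sigma^2 &= \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx \\
         &\ge \int_{|x-\mu| \ge d\sigma} (x-\mu)^2 f(x)\,dx \\
         &\ge d^2\sigma^2 \int_{|x-\mu| \ge d\sigma} f(x)\,dx \\
         &= d^2\sigma^2 \,\mathrm{Prob}\bigl(|X-\mu| \ge d\sigma\bigr).
\end{align*}
```

Dividing both sides by $d^2\sigma^2$ gives $\mathrm{Prob}(|X-\mu| \ge d\sigma) \le 1/d^2$. The first inequality drops a non-negative part of the integral, and the second uses $(x-\mu)^2 \ge d^2\sigma^2$ on the remaining region.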


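The Uniform Distribution

- Pdf: $f(x) = \frac{1}{b - a}$ for $a \le x \le b$, and 0 otherwise
- Mean & variance: $\mu = \frac{a + b}{2}$, $\sigma^2 = \frac{(b - a)^2}{12}$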


The Normal Distribution

- Pdf: $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2 / (2\sigma^2)}$, $-\infty < x < \infty$
- Mean $\mu$, variance $\sigma^2$


Approximation

- Normal approximation to the binomial
  - Condition: n is large
  - $\mathrm{Binomial}(n, p, x) \approx \mathrm{Normal}(\mu = np,\ \sigma^2 = np(1 - p),\ x)$
  - Continuity correction
- Normal approximation to the Poisson
  - Condition: $\lambda$ is large
  - $\mathrm{Poisson}(\lambda, x) \approx \mathrm{Normal}(\mu = \lambda,\ \sigma^2 = \lambda,\ x)$
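The effect of the continuity correction can be seen numerically. The sketch below assumes the usual form of the correction, in which Prob(Y <= y) for the discrete variable is approximated by the normal cumulative distribution function evaluated at y + 1/2. The values n = 100, p = 0.4, y = 45 are illustrative and not from the slides.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def binomial_cdf(y, n, p):
    """Exact Prob(Y <= y) by direct summation."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

# Illustrative parameters (assumed).
n, p, y = 100, 0.4, 45
mu = n * p
sigma = sqrt(n * p * (1 - p))

exact = binomial_cdf(y, n, p)
plain = normal_cdf(y, mu, sigma)            # no correction
corrected = normal_cdf(y + 0.5, mu, sigma)  # with continuity correction

print(exact, plain, corrected)
```

The half-unit shift accounts for the fact that the discrete value y corresponds to the interval from y - 1/2 to y + 1/2 on the continuous scale.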


The Exponential Distribution

- Pdf: $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
- Cdf: $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$
- Mean $1/\lambda$, variance $1/\lambda^2$


The Gamma Distribution

- Pdf
- Mean and variance


The Moment-Generating Function

- Definition: $m(t) = E(e^{tX})$
- Useful to derive moments: $m'(0) = E[X]$, $m''(0) = E[X^2]$, $m^{(n)}(0) = E[X^n]$
- mgf $m(t)$ = pgf $P(e^t)$
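The derivative relations can be checked numerically. The sketch below takes the exponential distribution as an example, whose moment-generating function is $\lambda / (\lambda - t)$ for $t < \lambda$ (a standard result, not shown in this transcript), and estimates the derivatives at 0 by central finite differences. The rate 2.0 and step size h are arbitrary illustrative choices.

```python
def mgf_exponential(t, lam):
    """m(t) = lam / (lam - t), valid for t < lam."""
    return lam / (lam - t)

def first_derivative(f, x, h=1e-4):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

lam = 2.0  # illustrative rate (assumed)

def m(t):
    return mgf_exponential(t, lam)

first_moment = first_derivative(m, 0.0)     # estimates E[X] = 1/lam
second_moment = second_derivative(m, 0.0)   # estimates E[X^2] = 2/lam^2
variance = second_moment - first_moment**2  # estimates 1/lam^2

print(first_moment, second_moment, variance)
```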


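Conditional Probability

- Conditional probability: $\mathrm{Prob}(A \mid B) = \frac{\mathrm{Prob}(A \cap B)}{\mathrm{Prob}(B)}$, for $\mathrm{Prob}(B) > 0$
- Bayes’ formula: $\mathrm{Prob}(B \mid A) = \frac{\mathrm{Prob}(A \mid B)\,\mathrm{Prob}(B)}{\mathrm{Prob}(A)}$
- Independence: $\mathrm{Prob}(A \cap B) = \mathrm{Prob}(A)\,\mathrm{Prob}(B)$
- Memoryless property: $\mathrm{Prob}(X > s + t \mid X > s) = \mathrm{Prob}(X > t)$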
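The memoryless property can be illustrated with the exponential distribution, for which Prob(X > x) = exp(-lambda x). The sketch below applies the definition of conditional probability directly; the rate 0.5 and the times s = 2, t = 3 are arbitrary illustrative values, and the slides may have demonstrated the property with a different distribution.

```python
from math import exp, isclose

def exp_survival(x, lam):
    """Prob(X > x) = 1 - F(x) = exp(-lam * x) for the exponential distribution."""
    return exp(-lam * x)

lam, s, t = 0.5, 2.0, 3.0  # illustrative values (assumed)

# The event {X > s + t} is contained in {X > s}, so the conditional
# probability is the ratio Prob(X > s + t) / Prob(X > s).
conditional = exp_survival(s + t, lam) / exp_survival(s, lam)
unconditional = exp_survival(t, lam)

print(conditional, unconditional)
print(isclose(conditional, unconditional))
```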


Entropy

- Definition: $H = -\sum_y P_Y(y) \log P_Y(y)$, which can be considered as a function of $P_Y(y)$
  - A measure of how close to uniform the distribution is, and thus, in a sense, of the unpredictability of any observed value of a random variable having that distribution
- Entropy vs variance
  - Both measure, in some sense, the uncertainty of the value of a random variable having that distribution
  - Entropy: a function of the probabilities only (the pdf)
  - Variance: depends on the sample values
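The contrast between entropy and variance can be made concrete with a small sketch. It assumes base-2 logarithms (the base used on the slides is not preserved), and the two probability vectors and two sets of values are made up for illustration.

```python
from math import isclose, log2

def entropy(probs):
    """-sum of q * log2(q); terms with q = 0 contribute nothing."""
    return -sum(q * log2(q) for q in probs if q > 0)

def variance(values, probs):
    """E[(Y - mu)^2] for values taken with the given probabilities."""
    mu = sum(v * q for v, q in zip(values, probs))
    return sum((v - mu) ** 2 * q for v, q in zip(values, probs))

uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

h_uniform = entropy(uniform)  # largest possible for four outcomes
h_skewed = entropy(skewed)    # smaller: the distribution is further from uniform

# Same probabilities, different values: entropy is unchanged, variance is not.
values_a = [0, 1, 2, 3]
values_b = [0, 10, 20, 30]
var_a = variance(values_a, uniform)
var_b = variance(values_b, uniform)

print(h_uniform, h_skewed)
print(var_a, var_b)
```

Rescaling the values by 10 multiplies the variance by 100 but leaves the entropy untouched, since entropy never looks at the values themselves.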