Unit 1
Probability Theory
1.1 Set Theory
Definition $S$: sample space, the set of all possible outcomes.
Example: tossing a coin, $S = \{H, T\}$.
Example: reaction time to a certain stimulus, $S = (0, \infty)$.
Sample space: may be countable or uncountable.
Countable: the elements can be put in 1-1 correspondence with a subset of the integers.
Finite number of elements $\Rightarrow$ countable.
Infinite number of elements $\Rightarrow$ countable or uncountable.
Fact: In practice only countable sample spaces arise, since measurements cannot be made with infinite accuracy.
Definition event: any measurable collection of possible outcomes, i.e., a subset of $S$.
If $A \subset S$, $A$ occurs if the outcome is in the set $A$.
$P(A)$: probability of an event (rather than a set).
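Theorem $A$, $B$, $C$: events on $S$.
(1). Commutativity: $A \cup B = B \cup A$, $A \cap B = B \cap A$.
(2). Associativity: $A \cup (B \cup C) = (A \cup B) \cup C$, $A \cap (B \cap C) = (A \cap B) \cap C$.
(3). Distributive Laws: $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
(4). DeMorgan's Law: $(A \cup B)^c = A^c \cap B^c$, $(A \cap B)^c = A^c \cup B^c$.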
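The identities in the theorem can be checked on a small finite example. The sketch below uses Python's built-in `set` type; the sample space and the events `A`, `B`, `C` are hypothetical values chosen only for illustration, and a check on one example is of course not a proof.

```python
# Hypothetical finite sample space and events, chosen only for illustration.
S = set(range(1, 11))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {4, 6, 8, 10}

def complement(E, space=S):
    """Complement of the event E relative to the sample space."""
    return space - E

# (3) Distributive laws
distributive_1 = A & (B | C) == (A & B) | (A & C)
distributive_2 = A | (B & C) == (A | B) & (A | C)

# (4) DeMorgan's laws
demorgan_1 = complement(A | B) == complement(A) & complement(B)
demorgan_2 = complement(A & B) == complement(A) | complement(B)

print(distributive_1, distributive_2, demorgan_1, demorgan_2)
```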
Example:
(a) If
,
,
.
(b) If
,
.
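Definition
(a) $A$ and $B$ are disjoint if $A \cap B = \emptyset$.
(b) $A_1, A_2, \ldots$ are pairwise disjoint if $A_i \cap A_j = \emptyset$, $i \neq j$.
Definition If $A_1, A_2, \ldots$ are pairwise disjoint and $\bigcup_{i=1}^{\infty} A_i = S$, then $A_1, A_2, \ldots$ form a partition of $S$.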
1.2 Basic Probability Theory
$A$: event in $S$; $P(A)$: probability of $A$, $0 \le P(A) \le 1$.
Domain of $P$: all measurable subsets of $S$.
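Definition (sigma algebra, $\sigma$-algebra, Borel field): a collection $\mathcal{B}$ of subsets of $S$ that satisfies
(a) $\emptyset \in \mathcal{B}$
(b) if $A \in \mathcal{B}$ then $A^c \in \mathcal{B}$ (closed under complementation)
(c) if $A_1, A_2, \ldots \in \mathcal{B}$ then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{B}$ (closed under countable unions)
Properties
(a) $S \in \mathcal{B}$
(b) if $A_1, A_2, \ldots \in \mathcal{B}$ then $\bigcap_{i=1}^{\infty} A_i \in \mathcal{B}$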
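Example: (a) $\{\emptyset, S\}$: trivial $\sigma$-algebra
(b) smallest $\sigma$-algebra that contains all of the open sets in $S$: $\mathcal{B} = \sigma(\{\text{all open sets in } S\}) = \bigcap_{\alpha} \mathcal{B}_{\alpha}$ (intersection over all $\sigma$-algebras $\mathcal{B}_{\alpha}$ that contain the open sets)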
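Definition Kolmogorov Axioms (Axioms of Probability)
Given $(S, \mathcal{B})$, a probability function is a function $P$ with domain $\mathcal{B}$ that satisfies
(a) $P(A) \ge 0$, $\forall A \in \mathcal{B}$
(b) $P(S) = 1$
(c) If $A_1, A_2, \ldots \in \mathcal{B}$, pairwise disjoint $\Rightarrow P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$ (Axiom of countable additivity)
Exercise: axiom of finite additivity + continuity of $P$ (if $A_n \downarrow \emptyset$ then $P(A_n) \to 0$) $\Rightarrow$ axiom of countable additivity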
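Theorem If $A \in \mathcal{B}$, $P$: probability
(a) $P(\emptyset) = 0$
(b) $P(A) \le 1$
(c) $P(A^c) = 1 - P(A)$
Theorem
(a) $P(B \cap A^c) = P(B) - P(A \cap B)$
(b) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
(c) If $A \subset B$, then $P(A) \le P(B)$
Bonferroni's Inequality: $P(A \cap B) \ge P(A) + P(B) - 1$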
Example: (a)
,
(b)
,
,
, useless but correct
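Bonferroni's inequality gives a lower bound on the probability of an intersection using only the individual probabilities. The sketch below evaluates the general form $P(\bigcap_{i=1}^{n} A_i) \ge \sum_{i=1}^{n} P(A_i) - (n-1)$ with exact fractions. The function name and the input values are hypothetical, chosen so that one bound is informative and the other is negative, hence correct but useless.

```python
from fractions import Fraction

def bonferroni_lower_bound(probabilities):
    """General Bonferroni bound: P(A_1 and ... and A_n) >= sum P(A_i) - (n - 1)."""
    probabilities = list(probabilities)
    return sum(probabilities) - (len(probabilities) - 1)

# Hypothetical values chosen for illustration only.
informative = bonferroni_lower_bound([Fraction(95, 100), Fraction(95, 100)])
vacuous = bonferroni_lower_bound([Fraction(3, 10), Fraction(5, 10)])

print(informative, vacuous)
```

A negative bound carries no information, because every probability is already at least 0.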
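Theorem
(a) $P(A) = \sum_{i=1}^{\infty} P(A \cap C_i)$, for any partition $C_1, C_2, \ldots$
(b) $P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)$ for any $A_1, A_2, \ldots$ (Boole's inequality)
General version of Bonferroni inequality: $P\left(\bigcap_{i=1}^{n} A_i\right) \ge \sum_{i=1}^{n} P(A_i) - (n - 1)$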
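Counting: number of possible arrangements of size $r$ from $n$ objects

             without replacement      with replacement
Ordered      $\frac{n!}{(n-r)!}$      $n^r$
Unordered    $\binom{n}{r}$           $\binom{n+r-1}{r}$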
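The standard counts for choosing $r$ objects from $n$ are $n!/(n-r)!$ (ordered, without replacement), $n^r$ (ordered, with replacement), $\binom{n}{r}$ (unordered, without replacement), and $\binom{n+r-1}{r}$ (unordered, with replacement). As a rough check, the sketch below compares these closed forms with brute-force enumeration from `itertools`; the function names are illustrative only.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm

def counts_by_formula(n, r):
    """Closed-form counts for the four sampling schemes."""
    return {
        ("ordered", "without replacement"): perm(n, r),
        ("ordered", "with replacement"): n ** r,
        ("unordered", "without replacement"): comb(n, r),
        ("unordered", "with replacement"): comb(n + r - 1, r),
    }

def counts_by_enumeration(n, r):
    """The same counts obtained by listing every arrangement explicitly."""
    items = range(n)
    return {
        ("ordered", "without replacement"): len(list(permutations(items, r))),
        ("ordered", "with replacement"): len(list(product(items, repeat=r))),
        ("unordered", "without replacement"): len(list(combinations(items, r))),
        ("unordered", "with replacement"):
            len(list(combinations_with_replacement(items, r))),
    }

print(counts_by_formula(3, 2))
```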
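Let $\{A_n\}$ be a sequence of sets. The set of all points $\omega \in S$ that belong to $A_n$ for infinitely many values of $n$ is known as the limit superior of the sequence and is denoted by $\limsup_{n \to \infty} A_n$ or $\overline{\lim}_{n \to \infty} A_n$. The set of all points that belong to $A_n$ for all but a finite number of values of $n$ is known as the limit inferior of the sequence and is denoted by $\liminf_{n \to \infty} A_n$ or $\underline{\lim}_{n \to \infty} A_n$. If $\liminf_{n \to \infty} A_n = \limsup_{n \to \infty} A_n$, we say that the limit exists and write $\lim_{n \to \infty} A_n$ for the common set and call it the limit set.
We have $\liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k \subseteq \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k = \limsup_{n \to \infty} A_n$.
If the sequence $\{A_n\}$ is such that $A_n \subseteq A_{n+1}$, for $n = 1, 2, \ldots$, it is called nondecreasing; if $A_n \supseteq A_{n+1}$, $n = 1, 2, \ldots$, it is called nonincreasing. If the sequence $\{A_n\}$ is nondecreasing, or nonincreasing, the limit exists and we have $\lim_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} A_n$ if $\{A_n\}$ is nondecreasing and $\lim_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} A_n$ if $\{A_n\}$ is nonincreasing.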
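Theorem Let $\{A_n\}$ be a nondecreasing sequence of events in $\mathcal{B}$; that is, $A_n \in \mathcal{B}$, $n = 1, 2, \ldots$, and $A_n \subseteq A_{n+1}$, $n = 1, 2, \ldots$ Then $\lim_{n \to \infty} P(A_n) = P\left(\lim_{n \to \infty} A_n\right) = P\left(\bigcup_{n=1}^{\infty} A_n\right)$.
Proof. Let $A = \bigcup_{j=1}^{\infty} A_j$. Then $A = A_n \cup \bigcup_{j=n}^{\infty} (A_{j+1} - A_j)$, a union of pairwise disjoint sets. By countable additivity we have $P(A) = P(A_n) + \sum_{j=n}^{\infty} P(A_{j+1} - A_j)$, and letting $n \to \infty$, we see that $P(A) = \lim_{n \to \infty} P(A_n) + \lim_{n \to \infty} \sum_{j=n}^{\infty} P(A_{j+1} - A_j)$. The second term on the right tends to zero as $n \to \infty$ since the sum $\sum_{j=1}^{\infty} P(A_{j+1} - A_j) \le 1$ and each summand is nonnegative. The result follows.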
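Corollary Let $\{A_n\}$ be a nonincreasing sequence of events in $\mathcal{B}$. Then $\lim_{n \to \infty} P(A_n) = P\left(\lim_{n \to \infty} A_n\right) = P\left(\bigcap_{n=1}^{\infty} A_n\right)$.
Proof. Consider the nondecreasing sequence of events $\{A_n^c\}$. Then $\lim_{n \to \infty} A_n^c = \bigcup_{j=1}^{\infty} A_j^c = A^c$, where $A = \bigcap_{j=1}^{\infty} A_j$. It follows from the above Theorem that $\lim_{n \to \infty} P(A_n^c) = P\left(\bigcup_{j=1}^{\infty} A_j^c\right) = P(A^c)$. Hence, $\lim_{n \to \infty} (1 - P(A_n)) = 1 - P(A)$, that is, $\lim_{n \to \infty} P(A_n) = P(A)$.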
Example (Bertrand's Paradox) A chord is drawn at random in the unit circle. What is the probability that the chord is longer than the side of the equilateral triangle inscribed in the circle? (The side of this triangle has length $\sqrt{3}$.)
Solution 1. Since the length of a chord is uniquely determined by the position of its midpoint, choose a point $C$ at random in the circle and draw a line through $C$ and $O$, the center of the circle. Draw the chord through $C$ perpendicular to the line $OC$. If $l$ is the length of the chord with $C$ as midpoint, $l > \sqrt{3}$ if and only if $C$ lies inside the circle with center $O$ and radius $1/2$. Thus the required probability is $\pi (1/2)^2 / \pi = 1/4$.
Solution 2. Because of symmetry, we may fix one endpoint of the chord at some point $Q$ and then choose the other endpoint $Q_1$ at random. Let the probability that $Q_1$ lies on an arbitrary arc of the circle be proportional to the length of this arc. Now the inscribed equilateral triangle having $Q$ as one of its vertices divides the circumference into three equal parts. A chord drawn through $Q$ will be longer than the side of the triangle if and only if the other endpoint $Q_1$ of the chord lies on that one-third of the circumference that is opposite $Q$. It follows that the required probability is $1/3$.
Solution 3. Note that the length of a chord is determined uniquely by the distance of its midpoint from the center of the circle. Due to the symmetry of the circle, we assume that the midpoint of the chord lies on a fixed radius, $OC$, of the circle. The probability that the midpoint $M$ lies in a given segment of the radius through $M$ is then proportional to the length of this segment. Clearly, the length of the chord will be longer than the side of the inscribed equilateral triangle if the length of $OM$ is less than $1/2$. It follows that the required probability is $1/2$.
Question: What happened? Which answer(s) is (are) right?
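Each solution corresponds to a different way of choosing a chord "at random", and the three mechanisms can be simulated separately. The Monte Carlo sketch below is only an illustration: the function names, the seed, and the sample size are assumptions, and the estimates are approximate.

```python
import math
import random

SIDE = math.sqrt(3)  # side of the equilateral triangle inscribed in the unit circle

def chord_from_midpoint_distance(d):
    """Length of the chord whose midpoint lies at distance d from the center."""
    return 2 * math.sqrt(max(0.0, 1 - d * d))

def solution_1(rng):
    # Midpoint uniform over the area of the disk (rejection sampling from a square).
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return chord_from_midpoint_distance(math.hypot(x, y))

def solution_2(rng):
    # One endpoint fixed, the other uniform on the circumference;
    # theta is the central angle between the two endpoints.
    theta = rng.uniform(0, 2 * math.pi)
    return 2 * math.sin(theta / 2)

def solution_3(rng):
    # Midpoint uniform along a fixed radius.
    return chord_from_midpoint_distance(rng.uniform(0, 1))

def estimate(method, n=100_000, seed=0):
    """Fraction of simulated chords that are longer than the triangle side."""
    rng = random.Random(seed)
    return sum(method(rng) > SIDE for _ in range(n)) / n
```

Calling `estimate(solution_1)`, `estimate(solution_2)`, and `estimate(solution_3)` should give values near 1/4, 1/3, and 1/2 respectively, which suggests that the three answers differ because the three sampling mechanisms differ.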
Example: Consider sampling $r = 2$ items from $n = 3$ items, with replacement. The outcomes in the ordered and unordered sample spaces are these.

Unordered     {1,1}    {2,2}    {3,3}    {1,2}          {1,3}          {2,3}
Probability   1/6      1/6      1/6      1/6            1/6            1/6
Ordered       (1,1)    (2,2)    (3,3)    (1,2), (2,1)   (1,3), (3,1)   (2,3), (3,2)
Probability   1/9      1/9      1/9      2/9            2/9            2/9

Which one is correct?
Hint: The confusion arises because the phrase "with replacement" will typically be interpreted as the sequential kind of sampling, which leads to assigning a probability of 2/9 to the event {1, 3}.
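Under the sequential interpretation, the $3^2 = 9$ ordered draws are equally likely, and the probability of an unordered outcome is obtained by counting the ordered draws that produce it. The sketch below does this by enumeration; the function name is illustrative, and the equal likelihood of ordered draws is the assumption being made.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def unordered_distribution(n=3, r=2):
    """Probabilities of unordered outcomes when the n**r ordered draws
    (sampling with replacement) are assumed equally likely."""
    draws = list(product(range(1, n + 1), repeat=r))
    counts = Counter(tuple(sorted(draw)) for draw in draws)
    return {outcome: Fraction(c, len(draws)) for outcome, c in counts.items()}

for outcome, prob in sorted(unordered_distribution().items()):
    print(outcome, prob)
```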
1.3 Conditional Probability and Independence
Definition The conditional probability of $A$ given $B$ is $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$.
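Remark: (a) In the above definition, $B$ becomes the sample space and $P(B \mid B) = 1$. All events are calibrated with respect to $B$.
(b) If $A \cap B = \emptyset$ then $P(A \cap B) = 0$ and $P(A \mid B) = P(B \mid A) = 0$. Disjoint is not the same as independent.
Definition $A$ and $B$ are independent if $P(A \mid B) = P(A)$ (or $P(A \cap B) = P(A) P(B)$).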
Example: Three prisoners, $A$, $B$, and $C$, are on death row. The governor decides to pardon one of the three and chooses at random the prisoner to pardon. He informs the warden of his choice but requests that the name be kept secret for a few days.
The next day, $A$ tries to get the warden to tell him who had been pardoned. The warden refuses. $A$ then asks which of $B$ or $C$ will be executed. The warden thinks for a while, then tells $A$ that $B$ is to be executed.
Warden's reasoning: Each prisoner has a 1/3 chance of being pardoned. Clearly, either $B$ or $C$ must be executed, so I have given $A$ no information about whether $A$ will be pardoned.
$A$'s reasoning: Given that $B$ will be executed, then either $A$ or $C$ will be pardoned. My chance of being pardoned has risen to 1/2.
Which one is correct?
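The two lines of reasoning can be compared by writing out the joint distribution of (prisoner pardoned, prisoner named by the warden). The sketch below makes one assumption that the story does not state: when $A$ is the one pardoned, the warden names $B$ or $C$ with probability 1/2 each. The function name is illustrative.

```python
from fractions import Fraction

def posterior_given_warden_says_b():
    """P(each prisoner is pardoned | warden says B will be executed).

    Assumption (not stated in the story): if A is pardoned, the warden
    names B or C with probability 1/2 each. The warden never names A
    and never names the pardoned prisoner.
    """
    third, half = Fraction(1, 3), Fraction(1, 2)
    # joint[(pardoned, named)] = P(pardoned) * P(named | pardoned)
    joint = {
        ("A", "B"): third * half,
        ("A", "C"): third * half,
        ("B", "C"): third,
        ("C", "B"): third,
    }
    p_says_b = sum(p for (_, named), p in joint.items() if named == "B")
    return {who: joint.get((who, "B"), Fraction(0)) / p_says_b for who in "ABC"}

print(posterior_given_warden_says_b())
```

Under this assumption the conditional probability that $A$ is pardoned stays at 1/3, while that of $C$ rises to 2/3; a different assumption about the warden's choice would change these numbers.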
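Bayes' Rule
$A_1, A_2, \ldots$: partition of the sample space, $B$: any set,
$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{\sum_{j=1}^{\infty} P(B \mid A_j) P(A_j)}$.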
Example: When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses "dots" and "dashes", which are known to occur in the proportion of 3:4. This means that for any given symbol, $P(\text{dot sent}) = 3/7$ and $P(\text{dash sent}) = 4/7$.
Suppose there is interference on the transmission line, and with probability 1/8 a dot is mistakenly received as a dash, and vice versa. If we receive a dot, can we be sure that a dot was sent?
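The question can be answered with Bayes' rule, using the partition {dot sent, dash sent}. The sketch below uses the priors 3/7 and 4/7 implied by the 3:4 proportion and the error probability 1/8 from the example; the function and variable names are illustrative only.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Bayes' rule over a finite partition.

    prior[h] = P(h) and likelihood[h] = P(B | h) for each hypothesis h.
    """
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

prior = {"dot": Fraction(3, 7), "dash": Fraction(4, 7)}
# P(dot received | symbol sent): correct with prob. 7/8, flipped with prob. 1/8
dot_received = {"dot": Fraction(7, 8), "dash": Fraction(1, 8)}

result = posterior(prior, dot_received)
print(result["dot"])  # P(dot sent | dot received)
```

The result is 21/25, so receiving a dot makes it likely, but not certain, that a dot was sent.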
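Theorem If $A$ and $B$ are independent, then the following pairs are also independent: (a) $A$ and $B^c$, (b) $A^c$ and $B$, (c) $A^c$ and $B^c$.
Definition $A_1, \ldots, A_n$: mutually independent if for any subcollection $A_{i_1}, \ldots, A_{i_k}$, $P\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} P(A_{i_j})$.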
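1.4 Random Variable
Definition Define $X$: $S \to$ new sample space $\mathcal{X}$.
$X$: random variable, $X: S \to \mathcal{X}$, $(S, P) \to (\mathcal{X}, P_X)$, where $P_X$: induced probability function on $\mathcal{X}$ in terms of the original $P$ by
$P_X(X = x_i) = P(\{s_j \in S : X(s_j) = x_i\})$,
and $P_X$ satisfies the Kolmogorov Axioms.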
Example: Tossing three fair coins, $X$: # of heads

S =   {HHH,  HHT,  HTH,  THH,  TTH,  THT,  HTT,  TTT}
X :    3     2     2     2     1     1     1     0

Therefore, $\mathcal{X} = \{0, 1, 2, 3\}$, and $P_X(X = 1) = P(\{HTT, THT, TTH\}) = 3/8$.
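The induced probability function can be computed by listing the sample space and collecting the outcomes that map to each value of the random variable. The sketch below assumes fair coins, so that the 8 outcomes are equally likely; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def induced_pmf(n_coins=3):
    """P_X(X = x) for X = number of heads in n_coins tosses.

    Assumes fair coins, i.e. all 2**n_coins outcomes are equally likely.
    """
    outcomes = list(product("HT", repeat=n_coins))   # the sample space S
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

print(induced_pmf())
```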
1.5 Distribution Functions
With every random variable $X$, we associate a function called the cumulative distribution function of $X$.
Definition The cumulative distribution function or cdf of a random variable $X$, denoted by $F_X(x)$, is defined by $F_X(x) = P_X(X \le x)$, for all $x$.
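Example: Tossing three fair coins, $X$: # of heads, the corresponding c.d.f. is
$F_X(x) = \begin{cases} 0 & -\infty < x < 0 \\ 1/8 & 0 \le x < 1 \\ 1/2 & 1 \le x < 2 \\ 7/8 & 2 \le x < 3 \\ 1 & 3 \le x < \infty \end{cases}$,
where $F_X$:
(a) is a step function
(b) is defined for all $x$, not just in $\mathcal{X} = \{0, 1, 2, 3\}$
(c) jumps at $x_i \in \mathcal{X}$, size of jump $= P(X = x_i)$
(d) $F_X(x) = 0$ for $x < 0$; $F_X(x) = 1$ for $x \ge 3$
(e) is right-continuous (is left-continuous if $F_X(x) = P_X(X < x)$)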
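The step-function behaviour of a discrete c.d.f. can be seen by building $F_X$ directly from the p.m.f. The sketch below uses the p.m.f. of the number of heads in three tosses of a fair coin (1/8, 3/8, 3/8, 1/8); the names `PMF` and `cdf` are illustrative.

```python
from fractions import Fraction

# p.m.f. of X = number of heads in three fair coin tosses
PMF = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x, pmf=PMF):
    """F_X(x) = P(X <= x), defined for every real x."""
    return sum(p for k, p in pmf.items() if k <= x)

# Evaluate just below, at, and just above a jump point.
print(cdf(0.999), cdf(1), cdf(1.001))
```

The value at a jump point equals the value just to the right of it, which is the right-continuity in (e), and the size of the jump at 1 is $F_X(1) - F_X(0.999) = 3/8 = P(X = 1)$.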
Theorem $F(x)$ is a c.d.f. $\Leftrightarrow$
(a) $\lim_{x \to -\infty} F(x) = 0$, $\lim_{x \to \infty} F(x) = 1$.
(b) $F(x)$: non-decreasing
(c) $F(x)$: right-continuous
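Example: Tossing a coin until a head appears. Let $p$ be the probability of a head on each toss, and define a random variable $X$: # of tosses required to get a head. Then $P(X = x) = (1-p)^{x-1} p$, $x = 1, 2, \ldots$, $0 < p < 1$.
The c.d.f. of the random variable $X$ is $F_X(x) = P(X \le x) = \sum_{i=1}^{x} (1-p)^{i-1} p = 1 - (1-p)^x$, $x = 1, 2, \ldots$
It is easy to check that $F_X(x)$ satisfies the three conditions of a c.d.f.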
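For the number of tosses until the first head, with head probability $p$, the p.m.f. is $(1-p)^{x-1} p$ and the closed-form c.d.f. is $1 - (1-p)^x$ for $x = 1, 2, \ldots$ The sketch below compares the closed form with partial sums of the p.m.f., using exact fractions; the function names and the value $p = 1/3$ are illustrative assumptions.

```python
import math
from fractions import Fraction

def geometric_pmf(x, p):
    """P(X = x) = (1 - p)**(x - 1) * p for x = 1, 2, ...; zero elsewhere."""
    if x < 1 or x != int(x):
        return 0
    return (1 - p) ** (int(x) - 1) * p

def geometric_cdf(x, p):
    """F_X(x) = 1 - (1 - p)**floor(x) for x >= 1; zero for x < 1."""
    k = math.floor(x)
    return 1 - (1 - p) ** k if k >= 1 else 0

p = Fraction(1, 3)  # hypothetical head probability
partial_sum = sum(geometric_pmf(k, p) for k in range(1, 5))
print(partial_sum, geometric_cdf(4, p))
```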
Example: A continuous c.d.f. (of the logistic distribution) is $F_X(x) = \frac{1}{1 + e^{-x}}$, which satisfies the three conditions of a c.d.f.
Definition
(a) $X$ is continuous if $F_X(x)$ is continuous.
(b) $X$ is discrete if $F_X(x)$ is a step function.
Definition $X$ and $Y$ are identically distributed if, for every measurable set $A$, $P(X \in A) = P(Y \in A)$.
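Example: Tossing a fair coin three times. Let $X$: # of heads and $Y$: # of tails. Then $P(X = k) = P(Y = k)$, $k = 0, 1, 2, 3$. But for each sample point $s$, $X(s) \ne Y(s)$.
Theorem $X$ and $Y$ are identically distributed $\Leftrightarrow$ $F_X(x) = F_Y(x)$, $\forall x$.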
1.6 Density and Mass Function
Definition The probability mass function (p.m.f.) of a discrete random variable $X$ is $f_X(x) = P(X = x)$ for all $x$.
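Example: For the geometric distribution, we have the p.m.f. $f_X(x) = P(X = x) = (1-p)^{x-1} p$ for $x = 1, 2, \ldots$, and $f_X(x) = 0$ otherwise.
And $P(X = x) =$ size of jump in c.d.f. at $x$, $P(a \le X \le b) = \sum_{k=a}^{b} f_X(k)$, $P(X \le b) = \sum_{k=1}^{b} f_X(k) = F_X(b)$.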
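Fact: For a continuous random variable (a) $P(X = x) = 0$, $\forall x$.
(b) $P(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$. Using the Fundamental Theorem of Calculus, if $f_X(x)$ is continuous, we have $\frac{d}{dx} F_X(x) = f_X(x)$.
Definition The probability density function or pdf, $f_X(x)$, of a continuous random variable $X$ is the function that satisfies $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$ for all $x$.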
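Notation: (a) $X \sim F_X(x)$, $X$ is distributed as $F_X(x)$.
(b) $X \sim Y$, $X$ and $Y$ have the same distribution.
Fact: For continuous $X$, $P(a < X < b) = P(a < X \le b) = P(a \le X < b) = P(a \le X \le b)$.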
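Example: For the logistic distribution $F_X(x) = \frac{1}{1 + e^{-x}}$, we have $f_X(x) = F_X'(x) = \frac{e^{-x}}{(1 + e^{-x})^2}$, and $P(a < X < b) = F_X(b) - F_X(a) = \int_{a}^{b} f_X(x)\,dx$.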
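The relation between the logistic c.d.f. $F_X(x) = 1/(1 + e^{-x})$ and its density $f_X(x) = e^{-x}/(1 + e^{-x})^2$ can be checked numerically. The sketch below compares a central-difference derivative of the c.d.f. with the density, and a midpoint-rule integral of the density with a difference of c.d.f. values. The helper names, step sizes, and evaluation points are illustrative assumptions, and the agreement is only up to numerical error.

```python
import math

def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))

def logistic_pdf(x):
    e = math.exp(-x)
    return e / (1.0 + e) ** 2

def central_difference(f, x, h=1e-5):
    """Numerical derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# d/dx F_X(x) versus f_X(x) at an arbitrary point
print(central_difference(logistic_cdf, 0.7), logistic_pdf(0.7))
# P(-1 < X < 2) computed two ways
print(midpoint_integral(logistic_pdf, -1, 2), logistic_cdf(2) - logistic_cdf(-1))
```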
Theorem $f_X(x)$: pdf (or pmf) of a random variable if and only if
(a) $f_X(x) \ge 0$, $\forall x$.
(b) $\sum_{x} f_X(x) = 1$ (pmf) or $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ (pdf).
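Fact: Any nonnegative function with a finite positive integral (or sum) can be turned into a pdf (or pmf): if $h(x) \ge 0$ and $\int_{-\infty}^{\infty} h(x)\,dx = K$ with $0 < K < \infty$, then $f_X(x) = h(x)/K$ is a pdf.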
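In the discrete case the normalization amounts to dividing each weight by the total. The sketch below is a minimal illustration with a hypothetical function name and hypothetical weights; it uses exact fractions so that the resulting p.m.f. sums to exactly 1.

```python
from fractions import Fraction

def normalize_to_pmf(weights):
    """Turn nonnegative weights with finite positive sum K into a pmf
    by dividing each weight by K."""
    total = sum(Fraction(w) for w in weights.values())
    if any(w < 0 for w in weights.values()) or total <= 0:
        raise ValueError("weights must be nonnegative with a positive sum")
    return {x: Fraction(w) / total for x, w in weights.items()}

# Hypothetical weights h(x) on the points 1, 2, 3 (K = 10).
pmf = normalize_to_pmf({1: 2, 2: 3, 3: 5})
print(pmf)
```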