Probably Approximately Correct Learning
Holger Wunsch
Seminar für Sprachwissenschaft
Universität Tübingen
26th April 2005
Outline
1. Introduction
   Computational Learning Theory
   Concepts and Hypotheses
2. PAC Learning
   The Problem Setting
   Error of a Hypothesis
   PAC Learnability
3. Summary
Computational Learning Theory – Why?
Questions we would like to have answers for:
How complex is my learning problem?
Can the learner I am considering solve this problem?
How much data about the problem does the learner need to be able to learn?
How well will the learner do?
How long will it take?
⇒ We would like to have the answers before we start learning!
⇒ A computational theory of learning can help us to find answers.
Concepts and Hypotheses
Target concept: the hidden law, which assigns a given instance to the correct class.
Hypothesis: the learner's current "guess" of what the target concept might be.
After having finished learning, the learner should output a hypothesis that is "right" (sufficiently close to the target concept).
Concepts and Hypotheses – a Little More Formal...
Definition
A target concept is a function c : X → C that maps an instance x from a set of instances X to a class y from a set of possible classes C.
Definition
A hypothesis is a function h : X → C that maps an instance x from a set of instances X to a class y from a set of possible classes C.
Definition
The hypothesis space H is the set of all hypotheses h that a learner considers while learning.
⇒ Multiple hypotheses, but only one target concept!
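As a toy illustration (the even-number concept and the threshold hypotheses below are assumed examples, not from the slides), the three definitions can be written down directly in Python:

```python
# Toy illustration (assumed example, not from the slides): instances X are
# integers, classes C = {0, 1}. The target concept and each hypothesis are
# both functions from X to C.

def c(x: int) -> int:
    """Target concept: the hidden law, here 'x is even'."""
    return 1 if x % 2 == 0 else 0

def h(x: int) -> int:
    """One hypothesis: the learner's current guess, here 'x < 10'."""
    return 1 if x < 10 else 0

# The hypothesis space H: every hypothesis the learner considers,
# here all threshold functions 'x < t' for t = 0..19.
H = [lambda x, t=t: 1 if x < t else 0 for t in range(20)]
```

Note that H contains many hypotheses, while there is exactly one target concept c.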
The Learning Process
The learner L is presented a sequence of training instances x.
The training instances are drawn from X according to some probability distribution D.
D must be stationary: it may not change over time!
L considers a set of possible hypotheses H while learning.
As its result, L finally outputs one hypothesis h ∈ H.
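The steps above can be sketched in Python. Everything here is a hypothetical toy setup (the threshold concept, the uniform distribution, and the trivial learner are all assumptions for illustration):

```python
import random

# Sketch of the learning setup (toy, assumed): training instances are drawn
# from X according to a fixed (stationary) distribution D; the learner sees
# labeled examples (x, c(x)) and finally outputs one hypothesis h from H.

random.seed(0)
X = list(range(20))

def c(x):
    # Target concept (hidden from the learner): a threshold at 7.
    return 1 if x < 7 else 0

D = [1 / len(X)] * len(X)   # stationary distribution over X (here: uniform)

def draw(m):
    # m training instances drawn i.i.d. according to D.
    return random.choices(X, weights=D, k=m)

sample = [(x, c(x)) for x in draw(10)]

def learn(sample):
    # A trivial learner over H = {x < t : t = 0..20}: output the
    # threshold hypothesis consistent with the most training examples.
    best_t = max(range(21),
                 key=lambda t: sum((x < t) == (y == 1) for x, y in sample))
    return lambda x: 1 if x < best_t else 0

h = learn(sample)
```

Because the target concept is itself a threshold, this learner finds a hypothesis consistent with the whole training sample.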
Learner's Performance
The more often h(x) and c(x) are equal, the smaller the error of h.
⇒ The error of a hypothesis is a measure of the learner's performance.
⇒ The smaller the error, the better the learner.
The True Error of a Hypothesis
The true error of a hypothesis h measures how closely the hypothesis matches the target concept.
[Figure: the instance space with the regions covered by hypothesis h and target concept c. Areas where target concept and hypothesis match are green; areas where the two do not match are red.]
Formal Definition of True Error
Definition
The true error (denoted error_D(h)) of hypothesis h with respect to target concept c and distribution D is the probability that h will misclassify an instance drawn at random according to D.

error_D(h) ≡ Pr_{x∈D}[c(x) ≠ h(x)]

The true error applies to all instances, not only the training instances!
The learner itself can only observe the performance of h over the training instances. This error is called the training error.
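The distinction can be made concrete with a Monte Carlo sketch. The concept, hypothesis, and uniform distribution below are assumed toy choices, not from the slides; the true error is approximated by drawing many fresh instances from D, while the training error is measured only on the small training sample:

```python
import random

# Monte Carlo illustration (toy example, assumed): approximate the true
# error error_D(h) = Pr_{x in D}[c(x) != h(x)] with many fresh draws from D;
# the training error uses only the (small) training sample.

random.seed(1)

def c(x):
    return 1 if x < 0.5 else 0   # target concept

def h(x):
    return 1 if x < 0.6 else 0   # learned hypothesis

def error_rate(points):
    # Fraction of points on which h disagrees with c.
    return sum(c(x) != h(x) for x in points) / len(points)

train = [random.random() for _ in range(10)]       # training instances
fresh = [random.random() for _ in range(100_000)]  # fresh draws from D

training_error = error_rate(train)
true_error = error_rate(fresh)   # close to 0.1: c and h disagree on [0.5, 0.6)
```

With D uniform on [0, 1), the disagreement region [0.5, 0.6) has probability mass 0.1, so the estimate converges to that value, whatever the training error happened to be.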
Learnability (1)
What classes of target concepts can be reliably learned...
...from a reasonable number of randomly drawn training examples?
...requiring a reasonable amount of computation?
One possible characterization of learnability: the number of training examples needed to learn a hypothesis h such that error_D(h) = 0.
⇒ The more training examples are needed, the harder the target concept is to learn.
Learnability (2)
Learnability: the number of training examples needed to learn a hypothesis h such that error_D(h) = 0.
Problems with this characterization:
In order to achieve zero error, examples corresponding to every possible instance must be presented.
⇒ Unrealistic (actually, impossible)
Randomly drawn training examples may be misleading.
⇒ The requirement of zero error is too strong.
PAC Learnability
Weakened demands on learnability:
The learner L doesn't need to output a zero-error hypothesis.
Instead, a hypothesis is okay if its error is not greater than a small value ε.
The learner L doesn't need to succeed for every sequence of instances.
Instead, we accept the learner even if it fails sometimes, with a probability not higher than a small value δ.
⇒ We only require a learner that probably learns an approximately correct hypothesis
⇒ Probably Approximately Correct (PAC) learner
PAC Learnability Formalized
Definition (Mitchell, p. 206)
Consider a concept class C defined over a set of instances X of length n and a learner L using hypothesis space H. C is PAC-learnable by L using H if for all c ∈ C, distributions D over X, ε such that 0 < ε < 1/2, and δ such that 0 < δ < 1/2, learner L will with probability at least (1 − δ) output a hypothesis h ∈ H such that error_D(h) ≤ ε, in time that is polynomial in 1/ε, 1/δ, n, and size(c).

n is the size of instances in X.
size(c) is the encoding length of c.
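The definition itself does not say how many examples are enough. For a finite hypothesis space, a standard result for consistent learners (from Mitchell's chapter 7, not stated on this slide) bounds the number of training examples by m ≥ (1/ε)(ln |H| + ln(1/δ)). A small calculation sketch, with illustrative numbers:

```python
import math

# Standard sample-complexity bound for a consistent learner over a finite
# hypothesis space H (Mitchell, ch. 7; this bound is not on the slide):
#   m >= (1/epsilon) * (ln|H| + ln(1/delta))
# training examples suffice for the learner to be probably (with
# probability >= 1 - delta) approximately (error_D(h) <= epsilon) correct.

def sample_complexity(h_size: int, epsilon: float, delta: float) -> int:
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

# Example: |H| = 2**10 hypotheses, epsilon = 0.1, delta = 0.05
m = sample_complexity(2 ** 10, 0.1, 0.05)   # → 100
```

The bound grows only logarithmically in |H| and 1/δ, but linearly in 1/ε: demanding a smaller error is what drives up the number of examples.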
Wrapping Up
Computational Learning Theory guarantees that if a problem is PAC-learnable:
a learner will be able to learn a hypothesis in most cases
sometimes the learner might fail, but we know in advance how likely that is
the hypothesis might have a small error, but we know that it will never be greater than a certain bound we can estimate in advance
a solution will be found in reasonable (polynomial) time
For each problem, we must prove that it is PAC-learnable by the learner. This can be a challenge!